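7.2 · REST API

Human review and feedback data · A look at this section's module relationships, source-code basis, and implementation notes.

Project: Argilla · Section: 7.2 · Status: full-text translation · Modules: interfaces and service contracts; evaluation, feedback, and human review; UI and interaction; configuration governance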
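Module tags
  • Interfaces and service contracts
  • Evaluation, feedback, and human review
  • UI and interaction
  • Configuration governance
  • Document objects and metadata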

Section body

REST API

Relevant source files

Main source files referenced in this section:

  • argilla-frontend/CHANGELOG.md
  • argilla-frontend/components/features/annotation/container/questions/form/span/EntityLabelSelection.component.vue
  • argilla-frontend/components/features/annotation/settings/Validation.vue
  • argilla-frontend/components/features/dataset-creation/configuration/DatasetConfigurationForm.vue
  • argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationFieldSelector.vue
  • argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationLabels.vue
  • argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationQuestion.vue
  • argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationRating.vue
  • argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationSpan.vue
  • argilla-frontend/package.json
  • argilla-frontend/translation/de.js
  • argilla-frontend/translation/en.js
  • argilla-frontend/translation/es.js
  • argilla-frontend/v1/domain/entities/hub/DatasetCreation.test.ts
  • argilla-frontend/v1/domain/entities/hub/QuestionCreation.ts
  • argilla-frontend/v1/domain/entities/hub/Subset.ts
  • argilla-server/CHANGELOG.md
  • argilla-server/src/argilla_server/_version.py
  • argilla-server/src/argilla_server/alembic/versions/580a6553186f_add_datasets_users_table.py
  • argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py
  • argilla-server/src/argilla_server/api/schemas/v1/datasets.py
  • argilla-server/src/argilla_server/bulk/records_bulk.py
  • argilla-server/src/argilla_server/contexts/datasets.py
  • argilla-server/src/argilla_server/database.py
  • argilla-server/src/argilla_server/models/database.py
  • argilla-server/tests/factories.py
  • argilla-server/tests/unit/api/handlers/v1/datasets/records/records_bulk/test_create_dataset_records_bulk.py
  • argilla-server/tests/unit/api/handlers/v1/datasets/records/records_bulk/test_dataset_records_bulk_with_responses.py
  • argilla-server/tests/unit/api/handlers/v1/datasets/test_get_dataset_progress.py
  • argilla-server/tests/unit/api/handlers/v1/responses/test_create_current_user_responses_bulk.py
  • argilla-server/tests/unit/api/handlers/v1/test_datasets.py
  • argilla-server/tests/unit/api/handlers/v1/test_records.py
  • argilla-server/tests/unit/database/models/test_dataset_user_model.py
  • argilla-server/tests/unit/test_database.py
  • argilla-v1/src/argilla_v1/_version.py
  • argilla/CHANGELOG.md
  • argilla/src/argilla/__init__.py
  • argilla/src/argilla/_version.py

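This document describes the Argilla REST API, which lets you interact with the Argilla server programmatically. Most users work with Argilla through the Python SDK (see Python API), but understanding the REST API helps with advanced use cases, custom integrations, and troubleshooting.

Overview

Argilla provides a comprehensive REST API that follows RESTful principles. All API endpoints are prefixed with /api/v1/ to support versioning. The API uses standard HTTP methods and returns JSON responses.

Argilla · Overview · Figure 1

Sources: argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py argilla-server/src/argilla_server/contexts/datasets.py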

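Authentication

Every API request must be authenticated with an API key. Provide the key in a request header:

X-Argilla-Api-Key: YOUR_API_KEY

You can generate an API key from the user settings page in the Argilla UI or through the /api/v1/token endpoint.

Sources: argilla-server/tests/unit/api/handlers/v1/test_datasets.py:177-178 argilla-server/src/argilla_server/constants.py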
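
As a minimal sketch of the header described above, the following Python snippet builds an authenticated request with the standard library without sending it. The helper name build_request, the server address http://localhost:6900, and the key value are assumptions made for illustration; only the header name and the /api/v1/me route come from this page.

```python
import json
from typing import Optional
from urllib.request import Request

API_KEY_HEADER = "X-Argilla-Api-Key"  # header name taken from the text above


def build_request(base_url: str, path: str, api_key: str,
                  method: str = "GET", body: Optional[dict] = None) -> Request:
    """Build (but do not send) a request that carries the API key header."""
    headers = {API_KEY_HEADER: api_key, "Accept": "application/json"}
    data = None
    if body is not None:
        # JSON bodies are sent as UTF-8 bytes with a matching Content-Type.
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    url = base_url.rstrip("/") + path
    return Request(url, data=data, headers=headers, method=method)


# Hypothetical server address and key, for illustration only.
req = build_request("http://localhost:6900", "/api/v1/me", "YOUR_API_KEY")
print(req.get_method(), req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send it. Keeping construction separate from sending makes the header logic easy to inspect and test.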

API resources and data model

The API is organized around the following main resources:

Argilla · API resources and data model · Figure 2

Sources: argilla-server/src/argilla_server/models/database.py:46-66 argilla-server/src/argilla_server/api/schemas/v1/datasets.py

API endpoints

Dataset endpoints
Method  Endpoint                                Description
GET     /api/v1/me/datasets                     List the datasets accessible to the current user
GET     /api/v1/datasets/{dataset_id}           Get a specific dataset
POST    /api/v1/datasets                        Create a new dataset
PATCH   /api/v1/datasets/{dataset_id}           Update a dataset
DELETE  /api/v1/datasets/{dataset_id}           Delete a dataset
GET     /api/v1/datasets/{dataset_id}/fields    List a dataset's fields
GET     /api/v1/datasets/{dataset_id}/progress  Get a dataset's progress metrics
POST    /api/v1/datasets/{dataset_id}/export    Export a dataset to the Hugging Face Hub
POST    /api/v1/datasets/{dataset_id}/import    Import a dataset from the Hugging Face Hub

Sources: argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:75-97 argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:99-107 argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:136-144 argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:164-176

Record endpoints
Method  Endpoint                                         Description
GET     /api/v1/records/{record_id}                      Get a specific record
PATCH   /api/v1/records/{record_id}                      Update a specific record
DELETE  /api/v1/records/{record_id}                      Delete a record
GET     /api/v1/datasets/{dataset_id}/records            List the records in a dataset
POST    /api/v1/datasets/{dataset_id}/records/bulk       Create multiple records in bulk
PUT     /api/v1/datasets/{dataset_id}/records/bulk       Update multiple records in bulk
POST    /api/v1/datasets/{dataset_id}/records/search     Search records without user context
POST    /api/v1/me/datasets/{dataset_id}/records/search  Search records in the current user's context

Sources: argilla-server/src/argilla_server/bulk/records_bulk.py:46-155 argilla-server/src/argilla_server/bulk/records_bulk.py:158-226 argilla-server/tests/unit/api/handlers/v1/datasets/records/records_bulk/test_create_dataset_records_bulk.py:59-301

Response endpoints
Method  Endpoint                         Description
POST    /api/v1/me/responses/bulk        Create responses in bulk for the current user
GET     /api/v1/responses/{response_id}  Get a specific response
PATCH   /api/v1/responses/{response_id}  Update a response
DELETE  /api/v1/responses/{response_id}  Delete a response

Sources: argilla-server/src/argilla_server/contexts/datasets.py:480-542 argilla-server/src/argilla_server/contexts/datasets.py:544-573

Metadata property endpoints
Method  Endpoint                                                    Description
GET     /api/v1/datasets/{dataset_id}/metadata-properties           List a dataset's metadata properties
POST    /api/v1/datasets/{dataset_id}/metadata-properties           Create a metadata property for a dataset
PATCH   /api/v1/metadata-properties/{metadata_property_id}          Update a metadata property
DELETE  /api/v1/metadata-properties/{metadata_property_id}          Delete a metadata property
GET     /api/v1/metadata-properties/{metadata_property_id}/metrics  Get metrics for a metadata property

Sources: argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:121-133 argilla-server/src/argilla_server/contexts/datasets.py:246-270 argilla-server/src/argilla_server/contexts/datasets.py:273-282

Vector settings endpoints
Method  Endpoint                                        Description
GET     /api/v1/datasets/{dataset_id}/vectors-settings  List a dataset's vector settings
POST    /api/v1/datasets/{dataset_id}/vectors-settings  Create vector settings for a dataset
PATCH   /api/v1/vectors-settings/{vector_settings_id}   Update vector settings
DELETE  /api/v1/vectors-settings/{vector_settings_id}   Delete vector settings

Sources: argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:110-118 argilla-server/src/argilla_server/contexts/datasets.py:289-326

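User and authentication endpoints
Method  Endpoint                 Description
POST    /api/v1/token            Generate a new API token for a user
GET     /api/v1/me               Get information about the current user
GET     /api/v1/users            List all users
GET     /api/v1/users/{user_id}  Get a specific user
POST    /api/v1/users            Create a new user
DELETE  /api/v1/users/{user_id}  Delete a user

Sources: argilla-server/CHANGELOG.md:204-215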

Workspace endpoints
Method  Endpoint                                           Description
POST    /api/v1/workspaces                                 Create a new workspace
GET     /api/v1/workspaces/{workspace_id}/users            Get the users in a workspace
POST    /api/v1/workspaces/{workspace_id}/users            Add a user to a workspace
DELETE  /api/v1/workspaces/{workspace_id}/users/{user_id}  Remove a user from a workspace

Sources: argilla-server/CHANGELOG.md:210-213

Webhook endpoints
Method  Endpoint                            Description
GET     /api/v1/webhooks                    List webhooks
POST    /api/v1/webhooks                    Create a webhook
GET     /api/v1/webhooks/{webhook_id}       Get a webhook
POST    /api/v1/webhooks/{webhook_id}/ping  Test a webhook
DELETE  /api/v1/webhooks/{webhook_id}       Delete a webhook
PATCH   /api/v1/webhooks/{webhook_id}       Update a webhook

Sources: argilla-server/CHANGELOG.md:57-60

System information endpoints
Method  Endpoint          Description
GET     /api/v1/version   Get the current Argilla version
GET     /api/v1/status    Get the Argilla service status
GET     /api/v1/settings  Get the Argilla and Hugging Face settings

Sources: argilla-server/CHANGELOG.md:214-215 argilla-server/CHANGELOG.md:226-227

Common request/response patterns

Creating a dataset

Request:

POST /api/v1/datasets
Content-Type: application/json
X-Argilla-Api-Key: YOUR_API_KEY

{
  "name": "my-dataset",
  "guidelines": "Annotation guidelines for this dataset",
  "allow_extra_metadata": true,
  "distribution": {
    "strategy": "overlap",
    "min_submitted": 1
  },
  "workspace_id": "workspace-uuid"
}

Response:

{
  "id": "dataset-uuid",
  "name": "my-dataset",
  "guidelines": "Annotation guidelines for this dataset",
  "allow_extra_metadata": true,
  "status": "draft",
  "distribution": {
    "strategy": "overlap",
    "min_submitted": 1
  },
  "metadata": null,
  "workspace_id": "workspace-uuid",
  "last_activity_at": "2023-06-01T12:00:00Z",
  "inserted_at": "2023-06-01T12:00:00Z",
  "updated_at": "2023-06-01T12:00:00Z"
}

Sources: argilla-server/tests/unit/api/handlers/v1/test_datasets.py:100-159
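
The request body above can be assembled programmatically. The sketch below is one possible way to do it in Python. The function name dataset_payload is hypothetical, and the check that min_submitted is at least 1 is an assumption about what the server would accept, not a documented rule.

```python
import json
from typing import Optional


def dataset_payload(name: str, workspace_id: str,
                    guidelines: Optional[str] = None,
                    allow_extra_metadata: bool = True,
                    min_submitted: int = 1) -> dict:
    """Assemble a body shaped like the create-dataset request shown above."""
    if min_submitted < 1:
        # Assumption: a record needs at least one submitted response.
        raise ValueError("min_submitted must be >= 1")
    return {
        "name": name,
        "guidelines": guidelines,
        "allow_extra_metadata": allow_extra_metadata,
        "distribution": {"strategy": "overlap", "min_submitted": min_submitted},
        "workspace_id": workspace_id,
    }


body = dataset_payload(
    "my-dataset",
    "workspace-uuid",
    guidelines="Annotation guidelines for this dataset",
)
print(json.dumps(body, indent=2))
```

The printed JSON has the same keys as the sample request, so it could be sent as the body of POST /api/v1/datasets.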

Adding records in bulk

Request:

POST /api/v1/datasets/{dataset_id}/records/bulk
Content-Type: application/json
X-Argilla-Api-Key: YOUR_API_KEY

{
  "items": [
    {
      "fields": {
        "prompt": "What is machine learning?",
        "response": "Machine learning is a subfield of artificial intelligence..."
      },
      "metadata": {
        "source": "wikipedia",
        "difficulty": 2
      },
      "external_id": "record-1"
    },
    {
      "fields": {
        "prompt": "Explain deep learning",
        "response": "Deep learning is a subset of machine learning..."
      },
      "metadata": {
        "source": "textbook",
        "difficulty": 3
      },
      "external_id": "record-2"
    }
  ]
}

Response:

{
  "items": [
    {
      "id": "record-1-uuid",
      "fields": {
        "prompt": "What is machine learning?",
        "response": "Machine learning is a subfield of artificial intelligence..."
      },
      "metadata": {
        "source": "wikipedia",
        "difficulty": 2
      },
      "external_id": "record-1",
      "status": "pending",
      "dataset_id": "dataset-uuid",
      "inserted_at": "2023-06-01T12:01:00Z",
      "updated_at": "2023-06-01T12:01:00Z"
    },
    {
      "id": "record-2-uuid",
      "fields": {
        "prompt": "Explain deep learning",
        "response": "Deep learning is a subset of machine learning..."
      },
      "metadata": {
        "source": "textbook",
        "difficulty": 3
      },
      "external_id": "record-2",
      "status": "pending",
      "dataset_id": "dataset-uuid",
      "inserted_at": "2023-06-01T12:01:00Z",
      "updated_at": "2023-06-01T12:01:00Z"
    }
  ]
}

Sources: argilla-server/tests/unit/api/handlers/v1/datasets/records/records_bulk/test_create_dataset_records_bulk.py:59-301 argilla-server/src/argilla_server/bulk/records_bulk.py:46-155
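
When a dataset has many records, a client will usually split them across several bulk requests. The following sketch groups records into bodies of the {"items": [...]} shape shown above. The helper name iter_bulk_payloads and the default batch size of 100 are assumptions. This page does not state the server's actual per-request limit.

```python
from typing import Iterable, Iterator, List


def iter_bulk_payloads(records: Iterable[dict],
                       batch_size: int = 100) -> Iterator[dict]:
    """Yield {"items": [...]} bodies holding at most batch_size records each."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield {"items": batch}
            batch = []
    if batch:  # flush the final, possibly shorter, batch
        yield {"items": batch}


# Five illustrative records, split into batches of two.
records = [
    {"fields": {"prompt": f"question {i}", "response": f"answer {i}"},
     "external_id": f"record-{i}"}
    for i in range(1, 6)
]
payloads = list(iter_bulk_payloads(records, batch_size=2))
print([len(p["items"]) for p in payloads])  # [2, 2, 1]
```

Each yielded body could be posted to /api/v1/datasets/{dataset_id}/records/bulk in turn.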

Searching records

Request:

POST /api/v1/me/datasets/{dataset_id}/records/search
Content-Type: application/json
X-Argilla-Api-Key: YOUR_API_KEY

{
  "query": "machine learning",
  "filter": {
    "metadata": {
      "difficulty": {
        "gt": 2
      }
    },
    "response_status": "pending"
  },
  "sort": [
    {
      "field": "inserted_at",
      "order": "desc"
    }
  ],
  "limit": 10,
  "offset": 0
}

Response:

{
  "items": [
    {
      "id": "record-2-uuid",
      "fields": {
        "prompt": "Explain deep learning",
        "response": "Deep learning is a subset of machine learning..."
      },
      "metadata": {
        "source": "textbook",
        "difficulty": 3
      },
      "external_id": "record-2",
      "status": "pending",
      "dataset_id": "dataset-uuid",
      "inserted_at": "2023-06-01T12:01:00Z",
      "updated_at": "2023-06-01T12:01:00Z"
    }
  ],
  "total": 1,
  "offset": 0,
  "limit": 10
}

Sources: argilla-server/CHANGELOG.md:417-426
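
The two search routes in the record endpoints table differ only in the /me prefix. The sketch below shows one way to pick the route and assemble a body like the request above. The helper names search_path and search_body are hypothetical, and the body simply mirrors the sample on this page rather than a verified schema.

```python
from typing import List, Optional


def search_path(dataset_id: str, user_scoped: bool = True) -> str:
    """Return the search route, with or without the current-user prefix."""
    prefix = "/api/v1/me" if user_scoped else "/api/v1"
    return f"{prefix}/datasets/{dataset_id}/records/search"


def search_body(query: str, filters: Optional[dict] = None,
                sort: Optional[List[dict]] = None,
                limit: int = 10, offset: int = 0) -> dict:
    """Assemble a search body; optional parts are left out when not given."""
    body: dict = {"query": query, "limit": limit, "offset": offset}
    if filters:
        body["filter"] = filters
    if sort:
        body["sort"] = sort
    return body


path = search_path("dataset-uuid")
body = search_body(
    "machine learning",
    filters={"metadata": {"difficulty": {"gt": 2}}, "response_status": "pending"},
    sort=[{"field": "inserted_at", "order": "desc"}],
)
print(path)
print(body)
```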

Common workflow patterns

The following diagram shows a typical workflow for creating and annotating a dataset with the Argilla REST API:

Argilla · Common workflow patterns · Figure 3

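Error handling

The API uses standard HTTP status codes to indicate whether a request succeeded or failed:

Status code                Description
200 OK                     The request succeeded
201 Created                The resource was created
400 Bad Request            The request is invalid or cannot be understood
401 Unauthorized           Authentication is required or has failed
403 Forbidden              The authenticated user lacks permission to access the resource
404 Not Found              The requested resource does not exist
422 Unprocessable Entity   The request is well-formed but cannot be processed because of semantic errors
429 Too Many Requests      The client sent too many requests in a given period
500 Internal Server Error  The server encountered an error

Error responses have the following structure: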

{
  "detail": {
    "code": "error_code",
    "message": "Human-readable error message"
  }
}
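
A client can turn this structure into an exception. The following sketch assumes the error body has the detail.code and detail.message shape shown above and falls back to the raw body otherwise. The exception class ArgillaApiError and the function raise_for_error are hypothetical names, not part of any Argilla package.

```python
import json


class ArgillaApiError(Exception):
    """Hypothetical exception type carrying the parsed error payload."""

    def __init__(self, status: int, code: str, message: str) -> None:
        super().__init__(f"{status} {code}: {message}")
        self.status = status
        self.code = code
        self.message = message


def raise_for_error(status: int, body: str) -> None:
    """Raise ArgillaApiError for any 4xx/5xx status; do nothing otherwise."""
    if status < 400:
        return
    code, message = "unknown", body
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: keep the raw body.
        detail = None
    if isinstance(detail, dict):
        code = detail.get("code", code)
        message = detail.get("message", message)
    raise ArgillaApiError(status, code, message)


try:
    raise_for_error(404, '{"detail": {"code": "not_found", "message": "Dataset missing"}}')
except ArgillaApiError as err:
    print(err.status, err.code, err.message)
```

The "not_found" code in the usage example is made up for the demonstration. Real codes depend on the server.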

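Rate limiting

The API applies rate limiting to prevent abuse. If you send too many requests in a short period, you may receive a 429 Too Many Requests response. When that happens, wait for a while before retrying.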
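
One common way to wait before retrying is exponential backoff. The sketch below retries a call while it returns 429, doubling the delay each time. The function send_with_retry, its defaults, and the (status, body) tuple convention are assumptions chosen for illustration. This page does not say which delays the server expects.

```python
import time
from typing import Callable, Tuple

Response = Tuple[int, str]  # (status code, body)


def send_with_retry(send: Callable[[], Response],
                    max_attempts: int = 5,
                    base_delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send() until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be positive")
    attempt = 0
    while True:
        status, body = send()
        attempt += 1
        if status != 429 or attempt >= max_attempts:
            return status, body
        # Back off: base_delay, then 2x, 4x, ... before the next attempt.
        sleep(base_delay * 2 ** (attempt - 1))


# Simulated server: rate-limited twice, then successful.
responses = iter([(429, ""), (429, ""), (200, "{}")])
delays = []
result = send_with_retry(lambda: next(responses), sleep=delays.append)
print(result, delays)  # (200, '{}') [1.0, 2.0]
```

The sleep function is injectable so that the behaviour can be checked without actually waiting.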

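Pagination

List endpoints support pagination through the limit and offset parameters:

  • limit: maximum number of items to return (default: 100, maximum: 1000)
  • offset: number of items to skip before starting to return results (default: 0)

Responses include the total, limit, and offset fields to support pagination: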

{
  "items": [...],
  "total": 1500,
  "limit": 100,
  "offset": 0
}

Sources: argilla-server/CHANGELOG.md:456-458
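
The limit, offset, and total fields are enough to walk through every page. The following sketch is a generic loop under that assumption. The fetch_page callable stands in for a real HTTP call, and fake_fetch is an in-memory substitute so that the example runs without a server.

```python
from typing import Callable, Iterator


def iter_items(fetch_page: Callable[[int, int], dict],
               limit: int = 100) -> Iterator[dict]:
    """Yield every item by advancing offset until total is reached."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Stop on an empty page too, in case total changes between calls.
        if not items or offset >= page["total"]:
            break


# Stand-in for a real HTTP call: serves slices of an in-memory list.
DATA = [{"id": i} for i in range(250)]


def fake_fetch(limit: int, offset: int) -> dict:
    return {"items": DATA[offset:offset + limit], "total": len(DATA),
            "limit": limit, "offset": offset}


print(sum(1 for _ in iter_items(fake_fetch)))  # 250
```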

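Filtering and sorting

Many endpoints support filtering and sorting through query parameters or request-body fields:

  • Filtering: filter records by field, metadata, or response status
  • Sorting: sort results by fields such as inserted_at, updated_at, or metadata properties

Example of filtering and sorting in a search request: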

{
  "filter": {
    "metadata": {
      "difficulty": {
        "gt": 2
      }
    },
    "response_status": "pending"
  },
  "sort": [
    {
      "field": "inserted_at",
      "order": "desc"
    }
  ]
}
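
To make the meaning of this body concrete, the sketch below applies the same metadata filter and sort to a small in-memory list. It is a local emulation only: it assumes "gt" means strictly greater than and "desc" means descending order, and it ignores response_status because that depends on server-side response data. The function name apply_filter_and_sort and the sample records are made up for the example.

```python
from typing import List


def apply_filter_and_sort(records: List[dict], body: dict) -> List[dict]:
    """Emulate the metadata "gt" filter and the sort rules of a search body."""
    conditions = body.get("filter", {}).get("metadata", {})
    kept = [
        r for r in records
        if all(name in r["metadata"] and r["metadata"][name] > cond["gt"]
               for name, cond in conditions.items())
    ]
    # Apply sort rules last-to-first so the first rule has highest priority
    # (Python's sort is stable).
    for rule in reversed(body.get("sort", [])):
        kept.sort(key=lambda r: r[rule["field"]],
                  reverse=rule["order"] == "desc")
    return kept


records = [
    {"external_id": "record-1", "metadata": {"difficulty": 2},
     "inserted_at": "2023-06-01T12:01:00Z"},
    {"external_id": "record-2", "metadata": {"difficulty": 3},
     "inserted_at": "2023-06-01T12:02:00Z"},
    {"external_id": "record-3", "metadata": {"difficulty": 4},
     "inserted_at": "2023-06-01T12:03:00Z"},
]
body = {
    "filter": {"metadata": {"difficulty": {"gt": 2}}},
    "sort": [{"field": "inserted_at", "order": "desc"}],
}
result = apply_filter_and_sort(records, body)
print([r["external_id"] for r in result])  # ['record-3', 'record-2']
```

The record with difficulty 2 is dropped because 2 is not greater than 2, which matches the search response earlier on this page, where only the record with difficulty 3 was returned.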

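Summary

The Argilla REST API provides a comprehensive set of endpoints for interacting with the platform programmatically. Whether you are building custom integrations, automating workflows, or troubleshooting, understanding the API helps you get more out of Argilla.

For most use cases we recommend the Python SDK, which offers a higher-level, friendlier interface. The REST API, however, gives finer-grained control and can be used from any programming language or environment.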