REST API
Relevant source files
The main source files referenced in this chapter:
- argilla-frontend/CHANGELOG.md
- argilla-frontend/components/features/annotation/container/questions/form/span/EntityLabelSelection.component.vue
- argilla-frontend/components/features/annotation/settings/Validation.vue
- argilla-frontend/components/features/dataset-creation/configuration/DatasetConfigurationForm.vue
- argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationFieldSelector.vue
- argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationLabels.vue
- argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationQuestion.vue
- argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationRating.vue
- argilla-frontend/components/features/dataset-creation/configuration/questions/DatasetConfigurationSpan.vue
- argilla-frontend/package.json
- argilla-frontend/translation/de.js
- argilla-frontend/translation/en.js
- argilla-frontend/translation/es.js
- argilla-frontend/v1/domain/entities/hub/DatasetCreation.test.ts
- argilla-frontend/v1/domain/entities/hub/QuestionCreation.ts
- argilla-frontend/v1/domain/entities/hub/Subset.ts
- argilla-server/CHANGELOG.md
- argilla-server/src/argilla_server/_version.py
- argilla-server/src/argilla_server/alembic/versions/580a6553186f_add_datasets_users_table.py
- argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py
- argilla-server/src/argilla_server/api/schemas/v1/datasets.py
- argilla-server/src/argilla_server/bulk/records_bulk.py
- argilla-server/src/argilla_server/contexts/datasets.py
- argilla-server/src/argilla_server/database.py
- argilla-server/src/argilla_server/models/database.py
- argilla-server/tests/factories.py
- argilla-server/tests/unit/api/handlers/v1/datasets/records/records_bulk/test_create_dataset_records_bulk.py
- argilla-server/tests/unit/api/handlers/v1/datasets/records/records_bulk/test_dataset_records_bulk_with_responses.py
- argilla-server/tests/unit/api/handlers/v1/datasets/test_get_dataset_progress.py
- argilla-server/tests/unit/api/handlers/v1/responses/test_create_current_user_responses_bulk.py
- argilla-server/tests/unit/api/handlers/v1/test_datasets.py
- argilla-server/tests/unit/api/handlers/v1/test_records.py
- argilla-server/tests/unit/database/models/test_dataset_user_model.py
- argilla-server/tests/unit/test_database.py
- argilla-v1/src/argilla_v1/_version.py
- argilla/CHANGELOG.md
- argilla/src/argilla/__init__.py
- argilla/src/argilla/_version.py
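This document describes the Argilla REST API, which lets you interact with an Argilla server programmatically. Most users work with Argilla through the Python SDK (see Python API), but knowing the REST API helps with advanced use cases, custom integrations, and troubleshooting.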
Overview
Argilla exposes a REST API that follows RESTful conventions. All endpoints are prefixed with /api/v1/ for versioning. The API uses standard HTTP methods and returns JSON responses.
Sources: argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py argilla-server/src/argilla_server/contexts/datasets.py
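As a small illustration of the /api/v1/ prefix, the helper below joins a server base URL with a resource path. The base URL http://localhost:6900 and the function name api_url are assumptions made for this example; they are not part of Argilla.

```python
API_PREFIX = "/api/v1"


def api_url(base_url: str, path: str) -> str:
    """Join a server base URL and a resource path under the /api/v1 prefix."""
    # Strip redundant slashes so "host/" + "/datasets" does not produce "//".
    return f"{base_url.rstrip('/')}{API_PREFIX}/{path.lstrip('/')}"


# Hypothetical local server address, used only for illustration.
print(api_url("http://localhost:6900/", "/datasets"))
```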
Authentication
All API requests must be authenticated with an API key, supplied in a request header:
X-Argilla-Api-Key: YOUR_API_KEY
You can generate an API key from the user settings page in the Argilla UI or through the /api/v1/token endpoint.
Sources: argilla-server/tests/unit/api/handlers/v1/test_datasets.py:177-178 argilla-server/src/argilla_server/constants.py
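A minimal sketch of attaching the API key header with Python's standard library follows. It only builds the request and does not send it. The server address and the helper name authenticated_request are assumptions for illustration.

```python
import urllib.request


def authenticated_request(url: str, api_key: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request carrying the Argilla API key header."""
    return urllib.request.Request(
        url,
        method=method,
        headers={"X-Argilla-Api-Key": api_key, "Accept": "application/json"},
    )


# Hypothetical server address; replace YOUR_API_KEY with a real key before sending.
req = authenticated_request("http://localhost:6900/api/v1/me", "YOUR_API_KEY")
# urllib stores header names via str.capitalize(), so look the header up in that form.
print(req.get_method(), req.get_header("X-argilla-api-key"))
```

Passing req to urllib.request.urlopen would send it. HTTP header names are case-insensitive, so the capitalization urllib applies does not affect how the server reads the header.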
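API resources and data model
The API is organized around these main resources, each covered by an endpoint group below: datasets, fields, records, responses, metadata properties, vector settings, users, workspaces, and webhooks.
Sources: argilla-server/src/argilla_server/models/database.py:46-66 argilla-server/src/argilla_server/api/schemas/v1/datasets.py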
API endpoints
Dataset endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/me/datasets | List the datasets accessible to the current user |
| GET | /api/v1/datasets/{dataset_id} | Get a specific dataset |
| POST | /api/v1/datasets | Create a new dataset |
| PATCH | /api/v1/datasets/{dataset_id} | Update a dataset |
| DELETE | /api/v1/datasets/{dataset_id} | Delete a dataset |
| GET | /api/v1/datasets/{dataset_id}/fields | List a dataset's fields |
| GET | /api/v1/datasets/{dataset_id}/progress | Get a dataset's progress metrics |
| POST | /api/v1/datasets/{dataset_id}/export | Export a dataset to the Hugging Face Hub |
| POST | /api/v1/datasets/{dataset_id}/import | Import a dataset from the Hugging Face Hub |
Sources: argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:75-97 argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:99-107 argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:136-144 argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:164-176
Record endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/records/{record_id} | Get a specific record |
| PATCH | /api/v1/records/{record_id} | Update a specific record |
| DELETE | /api/v1/records/{record_id} | Delete a record |
| GET | /api/v1/datasets/{dataset_id}/records | List the records in a dataset |
| POST | /api/v1/datasets/{dataset_id}/records/bulk | Create multiple records in bulk |
| PUT | /api/v1/datasets/{dataset_id}/records/bulk | Update multiple records in bulk |
| POST | /api/v1/datasets/{dataset_id}/records/search | Search records without user context |
| POST | /api/v1/me/datasets/{dataset_id}/records/search | Search records in the current user's context |
Sources: argilla-server/src/argilla_server/bulk/records_bulk.py:46-155 argilla-server/src/argilla_server/bulk/records_bulk.py:158-226 argilla-server/tests/unit/api/handlers/v1/datasets/records/records_bulk/test_create_dataset_records_bulk.py:59-301
Response endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/me/responses/bulk | Create responses in bulk for the current user |
| GET | /api/v1/responses/{response_id} | Get a specific response |
| PATCH | /api/v1/responses/{response_id} | Update a response |
| DELETE | /api/v1/responses/{response_id} | Delete a response |
Sources: argilla-server/src/argilla_server/contexts/datasets.py:480-542 argilla-server/src/argilla_server/contexts/datasets.py:544-573
Metadata property endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/datasets/{dataset_id}/metadata-properties | List a dataset's metadata properties |
| POST | /api/v1/datasets/{dataset_id}/metadata-properties | Create a metadata property for a dataset |
| PATCH | /api/v1/metadata-properties/{metadata_property_id} | Update a metadata property |
| DELETE | /api/v1/metadata-properties/{metadata_property_id} | Delete a metadata property |
| GET | /api/v1/metadata-properties/{metadata_property_id}/metrics | Get metrics for a metadata property |
Sources: argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:121-133 argilla-server/src/argilla_server/contexts/datasets.py:246-270 argilla-server/src/argilla_server/contexts/datasets.py:273-282
Vector settings endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/datasets/{dataset_id}/vectors-settings | List a dataset's vector settings |
| POST | /api/v1/datasets/{dataset_id}/vectors-settings | Create vector settings for a dataset |
| PATCH | /api/v1/vectors-settings/{vector_settings_id} | Update vector settings |
| DELETE | /api/v1/vectors-settings/{vector_settings_id} | Delete vector settings |
Sources: argilla-server/src/argilla_server/api/handlers/v1/datasets/datasets.py:110-118 argilla-server/src/argilla_server/contexts/datasets.py:289-326
User and authentication endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/token | Generate a new API token for a user |
| GET | /api/v1/me | Get the current user's information |
| GET | /api/v1/users | List all users |
| GET | /api/v1/users/{user_id} | Get a specific user |
| POST | /api/v1/users | Create a new user |
| DELETE | /api/v1/users/{user_id} | Delete a user |
Sources: argilla-server/CHANGELOG.md:204-215
Workspace endpoints
| Method | Endpoint | Description |
|---|---|---|
| POST | /api/v1/workspaces | Create a new workspace |
| GET | /api/v1/workspaces/{workspace_id}/users | List the users in a workspace |
| POST | /api/v1/workspaces/{workspace_id}/users | Add a user to a workspace |
| DELETE | /api/v1/workspaces/{workspace_id}/users/{user_id} | Remove a user from a workspace |
Sources: argilla-server/CHANGELOG.md:210-213
Webhook endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/webhooks | List webhooks |
| POST | /api/v1/webhooks | Create a webhook |
| GET | /api/v1/webhooks/{webhook_id} | Get a webhook |
| POST | /api/v1/webhooks/{webhook_id}/ping | Test a webhook |
| DELETE | /api/v1/webhooks/{webhook_id} | Delete a webhook |
| PATCH | /api/v1/webhooks/{webhook_id} | Update a webhook |
Sources: argilla-server/CHANGELOG.md:57-60
System information endpoints
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/version | Get the current Argilla version |
| GET | /api/v1/status | Get the Argilla service status |
| GET | /api/v1/settings | Get Argilla and Hugging Face settings |
Sources: argilla-server/CHANGELOG.md:214-215 argilla-server/CHANGELOG.md:226-227
Common request/response patterns
Creating a dataset
Request:
POST /api/v1/datasets
Content-Type: application/json
X-Argilla-Api-Key: YOUR_API_KEY

{
  "name": "my-dataset",
  "guidelines": "Annotation guidelines for this dataset",
  "allow_extra_metadata": true,
  "distribution": {
    "strategy": "overlap",
    "min_submitted": 1
  },
  "workspace_id": "workspace-uuid"
}
Response:
{
  "id": "dataset-uuid",
  "name": "my-dataset",
  "guidelines": "Annotation guidelines for this dataset",
  "allow_extra_metadata": true,
  "status": "draft",
  "distribution": {
    "strategy": "overlap",
    "min_submitted": 1
  },
  "metadata": null,
  "workspace_id": "workspace-uuid",
  "last_activity_at": "2023-06-01T12:00:00Z",
  "inserted_at": "2023-06-01T12:00:00Z",
  "updated_at": "2023-06-01T12:00:00Z"
}
Sources: argilla-server/tests/unit/api/handlers/v1/test_datasets.py:100-159
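The request above could be assembled in Python roughly as follows. This is a sketch using only the standard library: the server address and the helper name create_dataset_request are assumptions, and the payload simply mirrors the fields shown in the example request.

```python
import json
import urllib.request


def create_dataset_request(base_url, api_key, name, workspace_id,
                           guidelines=None, min_submitted=1):
    """Build (but do not send) a POST /api/v1/datasets request."""
    payload = {
        "name": name,
        "guidelines": guidelines,
        "allow_extra_metadata": True,
        "distribution": {"strategy": "overlap", "min_submitted": min_submitted},
        "workspace_id": workspace_id,
    }
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/api/v1/datasets",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "X-Argilla-Api-Key": api_key},
    )


# Hypothetical server address and placeholder identifiers.
req = create_dataset_request(
    "http://localhost:6900", "YOUR_API_KEY", "my-dataset", "workspace-uuid"
)
print(req.get_method(), req.full_url)
print(json.loads(req.data)["distribution"])
```

Sending the request with urllib.request.urlopen(req) and decoding the body with json.load would yield a dataset object like the response shown above.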
Adding records in bulk
Request:
POST /api/v1/datasets/{dataset_id}/records/bulk
Content-Type: application/json
X-Argilla-Api-Key: YOUR_API_KEY

{
  "items": [
    {
      "fields": {
        "prompt": "What is machine learning?",
        "response": "Machine learning is a subfield of artificial intelligence..."
      },
      "metadata": {
        "source": "wikipedia",
        "difficulty": 2
      },
      "external_id": "record-1"
    },
    {
      "fields": {
        "prompt": "Explain deep learning",
        "response": "Deep learning is a subset of machine learning..."
      },
      "metadata": {
        "source": "textbook",
        "difficulty": 3
      },
      "external_id": "record-2"
    }
  ]
}
Response:
{
  "items": [
    {
      "id": "record-1-uuid",
      "fields": {
        "prompt": "What is machine learning?",
        "response": "Machine learning is a subfield of artificial intelligence..."
      },
      "metadata": {
        "source": "wikipedia",
        "difficulty": 2
      },
      "external_id": "record-1",
      "status": "pending",
      "dataset_id": "dataset-uuid",
      "inserted_at": "2023-06-01T12:01:00Z",
      "updated_at": "2023-06-01T12:01:00Z"
    },
    {
      "id": "record-2-uuid",
      "fields": {
        "prompt": "Explain deep learning",
        "response": "Deep learning is a subset of machine learning..."
      },
      "metadata": {
        "source": "textbook",
        "difficulty": 3
      },
      "external_id": "record-2",
      "status": "pending",
      "dataset_id": "dataset-uuid",
      "inserted_at": "2023-06-01T12:01:00Z",
      "updated_at": "2023-06-01T12:01:00Z"
    }
  ]
}
Sources: argilla-server/tests/unit/api/handlers/v1/datasets/records/records_bulk/test_create_dataset_records_bulk.py:59-301 argilla-server/src/argilla_server/bulk/records_bulk.py:46-155
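When uploading many records, a client would typically split them into several bulk requests. The sketch below shows one way to produce the {"items": [...]} bodies. The default batch size of 500 is an arbitrary assumption, since this document does not state the server's per-request limit, and bulk_payloads is a hypothetical helper name.

```python
from typing import Any, Dict, Iterator, List


def bulk_payloads(records: List[Dict[str, Any]],
                  batch_size: int = 500) -> Iterator[Dict[str, Any]]:
    """Yield {"items": [...]} bodies, each holding at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(records), batch_size):
        yield {"items": records[start:start + batch_size]}


# Five illustrative records split into batches of two.
records = [
    {"fields": {"prompt": f"q{i}", "response": f"a{i}"}, "external_id": f"record-{i}"}
    for i in range(5)
]
sizes = [len(body["items"]) for body in bulk_payloads(records, batch_size=2)]
print(sizes)  # [2, 2, 1]
```

Each yielded body can be serialized with json.dumps and sent as a separate POST to /api/v1/datasets/{dataset_id}/records/bulk.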
Searching records
Request:
POST /api/v1/me/datasets/{dataset_id}/records/search
Content-Type: application/json
X-Argilla-Api-Key: YOUR_API_KEY

{
  "query": "machine learning",
  "filter": {
    "metadata": {
      "difficulty": {
        "gt": 2
      }
    },
    "response_status": "pending"
  },
  "sort": [
    {
      "field": "inserted_at",
      "order": "desc"
    }
  ],
  "limit": 10,
  "offset": 0
}
Response:
{
  "items": [
    {
      "id": "record-2-uuid",
      "fields": {
        "prompt": "Explain deep learning",
        "response": "Deep learning is a subset of machine learning..."
      },
      "metadata": {
        "source": "textbook",
        "difficulty": 3
      },
      "external_id": "record-2",
      "status": "pending",
      "dataset_id": "dataset-uuid",
      "inserted_at": "2023-06-01T12:01:00Z",
      "updated_at": "2023-06-01T12:01:00Z"
    }
  ],
  "total": 1,
  "offset": 0,
  "limit": 10
}
Sources: argilla-server/CHANGELOG.md:417-426
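A search body with the same shape as the example request can be built programmatically. The sketch below copies the field names (query, filter, sort, limit, offset) from the example above; they have not been checked against the server's schema, and search_body is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def search_body(query: str,
                min_difficulty: Optional[int] = None,
                response_status: Optional[str] = None,
                limit: int = 10,
                offset: int = 0) -> Dict[str, Any]:
    """Assemble a search request body shaped like the example in this section."""
    filters: Dict[str, Any] = {}
    if min_difficulty is not None:
        # "gt" means strictly greater than, as in the example request.
        filters["metadata"] = {"difficulty": {"gt": min_difficulty}}
    if response_status is not None:
        filters["response_status"] = response_status

    body: Dict[str, Any] = {
        "query": query,
        "sort": [{"field": "inserted_at", "order": "desc"}],
        "limit": limit,
        "offset": offset,
    }
    if filters:
        # Omit the filter key entirely when no filter was requested.
        body["filter"] = filters
    return body


body = search_body("machine learning", min_difficulty=2, response_status="pending")
print(body["filter"])
```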
Common workflow patterns
The following diagram shows a typical workflow for creating and annotating a dataset with the Argilla REST API:
Sources:
- argilla-server/src/argilla_server/contexts/datasets.py:102-146
- argilla-server/src/argilla_server/contexts/datasets.py:480-542
- argilla-server/src/argilla_server/contexts/datasets.py:387-407
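One plausible ordering of the calls in such a workflow is sketched below, using only endpoints listed in the tables of this document. The ordering is an assumption, and a real workflow may include further steps (for example, configuring the dataset before records are added) that are not shown here. The name workflow_steps is hypothetical.

```python
from typing import List, Tuple


def workflow_steps(dataset_id: str) -> List[Tuple[str, str, str]]:
    """Return (method, path, purpose) tuples for an assumed annotation workflow."""
    return [
        ("POST", "/api/v1/datasets", "create the dataset"),
        ("POST", f"/api/v1/datasets/{dataset_id}/records/bulk", "add records in bulk"),
        ("POST", f"/api/v1/me/datasets/{dataset_id}/records/search", "find records to annotate"),
        ("POST", "/api/v1/me/responses/bulk", "submit responses for the current user"),
        ("GET", f"/api/v1/datasets/{dataset_id}/progress", "check annotation progress"),
    ]


# "dataset-uuid" is a placeholder for the id returned when the dataset is created.
for method, path, purpose in workflow_steps("dataset-uuid"):
    print(f"{method:5} {path}  # {purpose}")
```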
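Error handling
The API uses standard HTTP status codes to indicate whether a request succeeded or failed:
| Status code | Description |
|---|---|
| 200 OK | The request succeeded |
| 201 Created | The resource was created |
| 400 Bad Request | The request is invalid or could not be understood |
| 401 Unauthorized | Authentication is required or has failed |
| 403 Forbidden | The authenticated user is not permitted to access the resource |
| 404 Not Found | The requested resource does not exist |
| 422 Unprocessable Entity | The request is well-formed but cannot be processed because of semantic errors |
| 429 Too Many Requests | The client sent too many requests in a given period |
| 500 Internal Server Error | An error occurred on the server |
Error responses have the following structure:
{
  "detail": {
    "code": "error_code",
    "message": "Human-readable error message"
  }
}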
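A client can turn the error structure above into an exception. The sketch below assumes the body follows the "detail" shape shown here and falls back to the raw text otherwise. The class name ArgillaAPIError and the "not_found" code in the usage lines are hypothetical.

```python
import json


class ArgillaAPIError(Exception):
    """Carries the HTTP status plus the code and message from the error body."""

    def __init__(self, status: int, code: str, message: str):
        super().__init__(f"{status} {code}: {message}")
        self.status, self.code, self.message = status, code, message


def parse_error(status: int, body: str) -> ArgillaAPIError:
    """Build an exception from an error response, tolerating unexpected bodies."""
    try:
        detail = json.loads(body).get("detail")
    except (ValueError, AttributeError):
        # Not JSON, or JSON that is not an object: keep the raw body as the message.
        detail = None
    if isinstance(detail, dict):
        return ArgillaAPIError(status, detail.get("code", "unknown"), detail.get("message", ""))
    return ArgillaAPIError(status, "unknown", body if detail is None else str(detail))


# Illustrative body; the code value is made up for the example.
err = parse_error(404, '{"detail": {"code": "not_found", "message": "Dataset not found"}}')
print(err)
```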
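Rate limiting
The API applies rate limiting to prevent abuse. If you send too many requests in a short period, you may receive a 429 Too Many Requests response. When that happens, wait before retrying.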
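Waiting before a retry is commonly implemented as exponential backoff. The sketch below is one such approach, not something this document prescribes: the attempt count and delays are assumptions, and send stands for any caller-supplied function that performs the request and returns (status, body).

```python
import time
from typing import Callable, Tuple


def call_with_backoff(send: Callable[[], Tuple[int, str]],
                      max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[int, str]:
    """Call send(), retrying on HTTP 429 with delays of base_delay * 2**attempt."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        # Return on any non-429 status, or when no attempts remain.
        if status != 429 or attempt == max_attempts - 1:
            return status, body
        sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Simulated server: two 429 responses, then success. Delays are recorded, not slept.
responses = iter([(429, ""), (429, ""), (200, '{"items": []}')])
delays = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result[0], delays)  # 200 [1.0, 2.0]
```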
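Pagination
List endpoints support pagination through the limit and offset parameters:
- limit: maximum number of items to return (default: 100, maximum: 1000)
- offset: number of items to skip before returning results (default: 0)
Responses include total, limit, and offset fields to help with pagination:
{
  "items": [...],
  "total": 1500,
  "limit": 100,
  "offset": 0
}
Sources: argilla-server/CHANGELOG.md:456-458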
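The limit, offset, and total fields are enough to walk through every page. In the sketch below, fetch_page is a caller-supplied function (an assumption of this example) that requests one page and returns the decoded JSON. A fake in-memory version is used so the loop can be run without a server.

```python
from typing import Any, Callable, Dict, Iterator


def iter_items(fetch_page: Callable[[int, int], Dict[str, Any]],
               limit: int = 100) -> Iterator[Any]:
    """Yield every item from a paginated list endpoint, one page at a time."""
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["items"]
        offset += limit
        # Stop on an empty page or once the offset passes the reported total.
        if not page["items"] or offset >= page["total"]:
            break


# Fake endpoint backed by a list of 250 integers.
data = list(range(250))


def fake_fetch(limit: int, offset: int) -> Dict[str, Any]:
    return {"items": data[offset:offset + limit], "total": len(data),
            "limit": limit, "offset": offset}


print(len(list(iter_items(fake_fetch, limit=100))))  # 250
```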
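Filtering and sorting
Many endpoints support filtering and sorting through query parameters or request-body fields:
- Filtering: filter records by field, metadata, or response status
- Sorting: sort results by fields such as inserted_at, updated_at, or a metadata property
Example of filtering and sorting in a search request:
{
  "filter": {
    "metadata": {
      "difficulty": {
        "gt": 2
      }
    },
    "response_status": "pending"
  },
  "sort": [
    {
      "field": "inserted_at",
      "order": "desc"
    }
  ]
}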
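Summary
The Argilla REST API provides a comprehensive set of endpoints for interacting with the platform programmatically. Whether you are building a custom integration, automating a workflow, or troubleshooting, knowing the API helps you get more out of Argilla.
For most use cases we recommend the Python SDK, which offers a higher-level, friendlier interface. The REST API, however, gives finer-grained control and can be called from any programming language or environment.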