6.4 · LLM 适配器架构（LLM Adapter Architecture）

记忆管道与知识图谱构建 · 聚焦本章的模块关系、源码依据与实现要点。

项目Cognee 章节6.4 状态全文译文模块模型调用与提供方适配、界面与交互、系统架构、测试、发布与运维

项目要点页2.5 参考项目项目章节目录Cognee DeepWiki 原始章节LLM Adapter Architecture 上一章6.3 下一章6.5

源码线索

cognee-mcp/src/strip_vectors.py
cognee/infrastructure/llm/LLMGateway.py
cognee/infrastructure/llm/structured_output_framework/baml/baml_src/extraction/acreate_structured_output.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/anthropic/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/azure_openai/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/gemini/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py
cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py

模块标签

模型调用与提供方适配
界面与交互
系统架构
测试、发布与运维
记忆与上下文

章节正文

LLM 适配器架构

原始 DeepWiki 页面https://deepwiki.com/topoteretes/cognee/6.4-llm-adapter-architecture

大语言模型（LLM）适配器架构

架构总览

大语言模型适配器系统结合了基于协议（Protocol）的接口和类继承，在保持类型安全的同时提供了可扩展性。该架构由四层组成：

网关层：LLMGateway 负责高层路由、上下文注入和使用量跟踪 cognee/infrastructure/llm/LLMGateway.py:52-111。
协议层：LLMInterface 定义了所有适配器的契约 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py:8-34。
基础适配器层：GenericAPIAdapter 使用 litellm 和 instructor 为兼容 OpenAI 的 API 提供通用功能 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:58-114。
提供商适配器：针对每个大语言模型提供商的专门实现。

适配器类层次结构

Cognee · 适配器类层次结构 · 图 1

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:58-114、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py:37-56、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/azure_openai/adapter.py:38-44、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py:32-39、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py:34-57

LLMGateway 与上下文注入

LLMGateway 是所有大语言模型请求的入口点。它负责选择结构化输出框架（Instructor 或 BAML）、注入持久化内存上下文以及记录会话使用量。

内存注入逻辑

在请求发送到适配器之前，LLMGateway 会调用 _inject_agent_memory cognee/infrastructure/llm/LLMGateway.py:65。此函数使用 get_current_agent_memory_context() 检索活动的 AgentMemoryContext cognee/infrastructure/llm/LLMGateway.py:15-17。如果内存上下文处于活动状态且包含检索到的数据，则会将其前置到原始用户输入之前 cognee/infrastructure/llm/LLMGateway.py:18-21。

def _inject_agent_memory(text_input: str) -> str:
    from cognee.modules.agent_memory import get_current_agent_memory_context

    context = get_current_agent_memory_context()
    if context is None or not context.memory_context:
        return text_input

    return f"附加内存上下文：\n{context.memory_context}\n\n原始输入：\n{text_input}"

来源：cognee/infrastructure/llm/LLMGateway.py:14-21、cognee/modules/agent_memory/runtime.py:79-81

使用量跟踪

网关将大语言模型调用包装在 _record_session_usage_after 中 cognee/infrastructure/llm/LLMGateway.py:92。此工具使用 record_llm_call 将输入文本、模型名称和序列化响应（对于 Pydantic 模型使用 model_dump_json()）记录到活动会话跟踪器中 cognee/infrastructure/llm/LLMGateway.py:35-46。

来源：cognee/infrastructure/llm/LLMGateway.py:24-49

LLMInterface 协议

LLMInterface 协议定义了所有大语言模型适配器必须实现的最小契约。它使用 Python 的 Protocol 类型实现结构子类型化。

核心方法

方法	用途	返回类型
`acreate_structured_output()`	使用 Pydantic 模型进行异步结构化输出生成。	`BaseModel`
`create_transcript()`	处理音频到文本的转录。	`TranscriptionReturnType`
`transcribe_image()`	处理图像到文本/描述的任务。	`str`

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llm_interface.py:8-34、cognee/infrastructure/llm/LLMGateway.py:59-111

大语言模型适配器实现

GenericAPIAdapter

这是使用 litellm 和 instructor 组合的提供商的基类。它使用 tenacity 实现了标准重试逻辑 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:118-126，并在主模型因内容策略违规失败时支持回退模型 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:187-210。

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/generic_llm_api/adapter.py:58-210

OpenAI 和 AzureOpenAI

OpenAIAdapter 处理标准 GPT 模型，并默认对新模型使用 json_schema_mode cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py:58-104。AzureOpenAIAdapter 扩展了此功能以支持 Azure 特定的端点和通过 DefaultAzureCredential 实现的托管标识认证 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/azure_openai/adapter.py:108-171。

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/openai/adapter.py:37-106、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/azure_openai/adapter.py:38-171

LlamaCppAPIAdapter

通过 llama-cpp-python 支持本地大语言模型执行。它提供两种模式：

服务器模式：连接到兼容 OpenAI 的 HTTP 服务器 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py:127-141。
本地模式：直接在进程中加载模型文件，并使用 instructor 修补 Llama 对象 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py:96-125。

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/llama_cpp/adapter.py:34-125

多模态能力

适配器实现了 create_transcript 和 transcribe_image 以处理非文本输入。

转录：MistralAdapter 使用原生 Mistral 客户端的音频 API cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py:168-174。OllamaAPIAdapter 使用兼容 OpenAI 的 whisper 端点 cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py:161-166。
视觉：OllamaAPIAdapter 通过将图像进行 base64 编码，并使用 "这张图片里有什么？" 提示发送给模型来实现 transcribe_image cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py:203-219。

来源：cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/mistral/adapter.py:148-176、cognee/infrastructure/llm/structured_output_framework/litellm_instructor/llm/ollama/adapter.py:142-219

代理内存集成

大语言模型适配器架构与 agent_memory 模块紧密耦合，该模块使用 contextvars 在异步调用之间维护状态。

执行上下文数据流

Cognee · 执行上下文数据流 · 图 2

来源：cognee/infrastructure/llm/LLMGateway.py:59-92、cognee/modules/agent_memory/runtime.py:59-95、examples/guides/agent_memory_quickstart.py:43-52

结果消毒

对于 MCP 和其他上下文敏感的客户端，strip_vectors cognee-mcp/src/strip_vectors.py:15-27 用于在将搜索结果传递给大语言模型或客户端之前递归移除大型 text_vector 字段，以防止上下文窗口耗尽 cognee-mcp/src/strip_vectors.py:4-9。

来源：cognee-mcp/src/strip_vectors.py:1-27