wow-agent-day04: Creating an Agent with LlamaIndex

Building the LLM

from openai import OpenAI
from pydantic import Field  # Field declares metadata (such as defaults) for Pydantic model fields
from llama_index.core.llms import (
    CustomLLM,
    CompletionResponse,
    LLMMetadata,
)
from llama_index.core.embeddings import BaseEmbedding
from llama_index.core.llms.callbacks import llm_completion_callback
from typing import List, Any, Generator

# NOTE: api_key, base_url, and chat_model are not defined in this section;
# they are assumed to have been defined before this point.

# Define the OurLLM class, which inherits from the CustomLLM base class
class OurLLM(CustomLLM):
    api_key: str = Field(default=api_key)
    base_url: str = Field(default=base_url)
    model_name: str = Field(default=chat_model)
    client: OpenAI = Field(default=None, exclude=True)  # Declare the client field explicitly; excluded from serialization

    def __init__(self, api_key: str, base_url: str, model_name: str = chat_model, **data: Any):
        super().__init__(**data)
        self.api_key = api_key
        self.base_url = base_url
        self.model_name = model_name
        # Create the client instance from the api_key and base_url passed in
        self.client = OpenAI(api_key=self.api_key, base_url=self.base_url)

    @property
    def metadata(self) -> LLMMetadata:
        """Get LLM metadata."""
        return LLMMetadata(
            model_name=self.model_name,
        )

    @llm_completion_callback()
    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        response = self.client.chat.completions.create(
            model=self.model_name,
            messages=[{"role": "user", "content": prompt}],
        )
        # Return the content of the first choice, if there is one
        if hasattr(response, "choices") and len(response.choices) > 0:
            response_text = response.choices[0].message.content
            return CompletionResponse(text=response_text)
        else:
            raise Exception(f"Unexpected response format: {response}")

    @llm_completion_callback()
    def stream_complete(
        self, prompt: str, **kwargs: Any
    ) -> Generator[CompletionResponse, None, None]:
        response = self.client.chat.completions.create(
            model=self.model_name,
            messages=[{"role": "user", "content": prompt}],
            stream=True,
        )

        try:
            for chunk in response:
                # Skip chunks that carry no choices or no text
                if not chunk.choices:
                    continue
                chunk_message = chunk.choices[0].delta
                if not chunk_message.content:
                    continue
                content = chunk_message.content
                yield CompletionResponse(text=content, delta=content)

        except Exception as e:
            raise Exception(f"Unexpected response format: {e}") from e

llm = OurLLM(api_key=api_key, base_url=base_url, model_name=chat_model)

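Code walkthrough:
The code defines an OurLLM class that inherits from CustomLLM. It wraps an OpenAI-compatible chat API so that LlamaIndex can use it as a custom LLM.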

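Imports:

OpenAI: the client used to talk to the OpenAI API (or a compatible endpoint).
Field: from pydantic, used to define metadata for data-model fields.
CustomLLM, CompletionResponse, LLMMetadata: from llama_index.core.llms, used to define a custom language model together with its response and metadata types.
BaseEmbedding: the base class for embedding models (imported here but not used in this snippet).
llm_completion_callback: a decorator that handles the LLM completion callbacks.
List, Any, Generator: from typing, used for type annotations.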

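The OurLLM class:

Fields:
- api_key, base_url, model_name: declared with Field, which supplies their default values.
- client: a field of type OpenAI, that is, an instance of the OpenAI class. Its default value is None, and because of exclude=True it is left out when the model object is serialized to JSON or another format. This is useful for protecting sensitive information and for reducing the amount of data.
Methods:
- __init__: initializes api_key, base_url, and model_name, and creates the OpenAI client instance.
- metadata: returns the model's metadata (the model name) as an LLMMetadata object. Because the method is decorated with @property, its value is read as llm.metadata rather than llm.metadata().
- complete: decorated with llm_completion_callback, which wraps the completion method so that LlamaIndex's callback machinery is notified around each call; this is typically used for monitoring and logging, resource management, and post-processing. The method calls the API through self.client.chat.completions.create, checks that the response has a choices attribute with at least one element, and, if so, returns the content of the first choice's message as the text of a CompletionResponse. Otherwise it raises an exception.
- stream_complete: iterates over the chunks of the streamed response with a for loop. chunk.choices[0].delta holds the message content of the current chunk; if chunk_message.content is empty, the chunk is skipped. Each remaining chunk is yielded as a CompletionResponse that carries the chunk's text in both text and delta.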
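To make the role of a completion callback more concrete, here is a minimal pure-Python sketch of a decorator that records an event before and after a completion call. It is not how llama_index implements llm_completion_callback; log_completion, EVENTS, and fake_complete are hypothetical names used only for illustration.

```python
import functools
import time
from typing import Any, Callable, Dict, List

# Hypothetical in-memory event log; a real callback manager would dispatch to handlers.
EVENTS: List[Dict[str, Any]] = []


def log_completion(func: Callable[..., str]) -> Callable[..., str]:
    """Wrap a completion function and record start/end events around each call."""

    @functools.wraps(func)
    def wrapper(prompt: str, **kwargs: Any) -> str:
        EVENTS.append({"event": "start", "prompt": prompt})
        started = time.perf_counter()
        result = func(prompt, **kwargs)
        EVENTS.append(
            {
                "event": "end",
                "response": result,
                "seconds": time.perf_counter() - started,
            }
        )
        return result

    return wrapper


@log_completion
def fake_complete(prompt: str, **kwargs: Any) -> str:
    """Stand-in for an LLM call (an assumption for this sketch); it echoes the prompt."""
    return f"echo: {prompt}"


if __name__ == "__main__":
    print(fake_complete("hello"))
    print([event["event"] for event in EVENTS])
```

The decorated function keeps its original signature and return value; only the bookkeeping around the call is added, which is why such a decorator suits logging and timing.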

Checking that the custom LLM class works

response = llm.stream_complete("你是谁？")  # prompt: "Who are you?"
for chunk in response:
    print(chunk, end="", flush=True)

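Result (translated from the Chinese output):
I am an AI assistant named ChatGLM, developed on the basis of a language model jointly trained in 2024 by the KEG Lab of Tsinghua University and Zhipu AI. My task is to provide appropriate answers and support for users' questions and requests.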
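If the full reply is needed as one string rather than printed piece by piece, the streamed deltas can be joined. The following is a minimal sketch under stated assumptions: fake_stream stands in for llm.stream_complete, and Chunk is a hypothetical type that mimics only the delta attribute of CompletionResponse.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in exposing only the `delta` attribute."""

    delta: Optional[str]


def fake_stream(pieces: Iterable[Optional[str]]) -> Iterator[Chunk]:
    """Yield one chunk per piece, the way a streaming API yields partial text."""
    for piece in pieces:
        yield Chunk(delta=piece)


def collect_text(stream: Iterable[Chunk]) -> str:
    """Join all non-empty deltas into the complete reply."""
    return "".join(chunk.delta for chunk in stream if chunk.delta)


if __name__ == "__main__":
    # Empty and None deltas are skipped, mirroring the check in stream_complete.
    print(collect_text(fake_stream(["I am ", None, "an ", "", "assistant."])))
```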

Customizing some of the agent's processing steps

import sys
import os

sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..")))

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and returns the product"""
    return a * b


def add(a: float, b: float) -> float:
    """Add two numbers and returns the sum"""
    return a + b


def main():

    multiply_tool = FunctionTool.from_defaults(fn=multiply)
    add_tool = FunctionTool.from_defaults(fn=add)

    # Create the ReActAgent instance
    agent = ReActAgent.from_tools([multiply_tool, add_tool], llm=llm, verbose=True)

    # Prompt: "What is 20+(2*4)? Use the tools to calculate each step"
    response = agent.chat("20+（2*4）等于多少？使用工具计算每一步")

    print(response)


if __name__ == "__main__":
    main()

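Code walkthrough:

The add and multiply functions are defined first.
FunctionTool.from_defaults wraps multiply and add as tools. A ReActAgent instance is then created from these tools and the language model llm. verbose=True makes the agent print detailed information while it runs.
agent.chat(...) starts a conversation with the agent; here it is given a math question and asked to compute each step with the tools.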
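The function name, signature, and docstring matter because they are what the agent can show the model when describing a tool. As a rough conceptual sketch (not the FunctionTool implementation), a function can be wrapped like this; SimpleTool and tool_from_function are hypothetical names introduced for illustration.

```python
import inspect
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class SimpleTool:
    """Hypothetical, simplified stand-in for a function-backed tool."""

    name: str
    description: str
    fn: Callable[..., Any]

    def call(self, **kwargs: Any) -> Any:
        """Invoke the wrapped function with keyword arguments."""
        return self.fn(**kwargs)


def tool_from_function(fn: Callable[..., Any]) -> SimpleTool:
    """Build a tool whose name and description come from the function itself."""
    signature = inspect.signature(fn)
    doc = inspect.getdoc(fn) or ""
    return SimpleTool(
        name=fn.__name__,
        description=f"{fn.__name__}{signature}\n{doc}",
        fn=fn,
    )


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and returns the product"""
    return a * b


if __name__ == "__main__":
    tool = tool_from_function(multiply)
    print(tool.description)
    print(tool.call(a=2, b=4))
```

A clear docstring and typed parameters therefore serve two readers at once: the developer and the model that has to decide which tool to call.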

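ReActAgent is a framework for building dynamic LLM agents by combining reasoning and acting. The approach lets an LLM carry out tasks more effectively by alternating between reasoning steps and action steps in a complex environment. Because reasoning and acting form a closed loop, the Agent can complete a given task on its own.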

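A typical ReActAgent follows this loop:
- Initial reasoning: the agent first reasons about the task to understand it, gather relevant information, and decide what to do next.
- Action: based on that reasoning, the agent takes an action, for example querying an API, retrieving data, or executing a command.
- Observation: the agent observes the result of the action and collects any new information.
- Refined reasoning: using the new information, the agent reasons again and updates its understanding, plan, or hypotheses.
- Repeat: the agent repeats the loop, alternating between reasoning and acting, until it reaches a satisfactory conclusion or completes the task.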
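The loop above can be sketched in a few lines of plain Python. This is a toy illustration, not the ReActAgent implementation: scripted_llm is a hypothetical stand-in that returns canned Thought/Action text instead of calling a model, and the text format is simplified.

```python
import json
import re
from typing import Callable, Dict, List

# Tools the agent may call, keyed by name.
TOOLS: Dict[str, Callable[..., float]] = {
    "multiply": lambda a, b: a * b,
    "add": lambda a, b: a + b,
}


def scripted_llm(history: List[str]) -> str:
    """Hypothetical stand-in for an LLM: chooses the next step from the observations so far."""
    observations = [line for line in history if line.startswith("Observation:")]
    if len(observations) == 0:
        payload = json.dumps({"a": 2, "b": 4})
        return f"Thought: multiply first.\nAction: multiply\nAction Input: {payload}"
    if len(observations) == 1:
        product = json.loads(observations[0].split(":", 1)[1])
        payload = json.dumps({"a": 20, "b": product})
        return f"Thought: now add 20.\nAction: add\nAction Input: {payload}"
    total = observations[-1].split(":", 1)[1].strip()
    return f"Thought: I can answer now.\nAnswer: {total}"


def react_loop(question: str, max_steps: int = 5) -> str:
    """Alternate reasoning and acting until the model emits an Answer."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        step = scripted_llm(history)  # reasoning
        history.append(step)
        answer = re.search(r"^Answer:\s*(.+)$", step, re.MULTILINE)
        if answer:
            return answer.group(1)
        action = re.search(r"^Action:\s*(\w+)$", step, re.MULTILINE).group(1)
        raw_args = re.search(r"^Action Input:\s*(.+)$", step, re.MULTILINE).group(1)
        observation = TOOLS[action](**json.loads(raw_args))  # acting
        history.append(f"Observation: {observation}")  # observing
    raise RuntimeError("no answer within max_steps")


if __name__ == "__main__":
    print(react_loop("20+(2*4)"))
```

The max_steps bound is a design choice of this sketch: it keeps a model that never produces an Answer from looping forever.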

Running the code produces:

Running step 8b8dc223-de0e-41ac-836e-87917ab466a3. Step input: 20+（2*4）等于多少？使用工具计算每一步
Thought: The user is asking for a calculation involving addition and multiplication. I need to use the tools to help me answer the question.
Action: multiply
Action Input: {'a': 2, 'b': 4}
Observation: 8
Running step 13110be6-d7f8-4cad-85ed-60148842b327. Step input: None
Thought: The user wants to calculate 20 + (2 * 4). I have already calculated the multiplication part. Now I need to add 20 to the result of the multiplication.
Action: add
Action Input: {'a': 20, 'b': 8}
Observation: 28
Running step efe67457-e8f7-4f3d-9a27-c48d3c883f18. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: 28
28

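However, if you run it several more times, you will see that the answers are not stable; this is something that may still need improvement.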
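One simple way to quantify this instability is to ask the same question several times and tally the answers. The sketch below assumes a hypothetical flaky_agent that replays canned answers; with a real agent, the ask argument could be something like lambda q: str(agent.chat(q)).

```python
from collections import Counter
from itertools import cycle
from typing import Callable, Tuple

# Hypothetical canned answers simulating an agent that is occasionally wrong.
_canned_answers = cycle(["28", "28", "30", "28"])


def flaky_agent(question: str) -> str:
    """Stand-in for an agent call; ignores the question and replays canned answers."""
    return next(_canned_answers)


def answer_stability(
    ask: Callable[[str], str], question: str, runs: int = 8
) -> Tuple[str, float]:
    """Return the most common answer and the fraction of runs that agreed with it."""
    counts = Counter(ask(question) for _ in range(runs))
    answer, votes = counts.most_common(1)[0]
    return answer, votes / runs


if __name__ == "__main__":
    answer, agreement = answer_stability(flaky_agent, "20+(2*4)")
    print(answer, agreement)
```

A low agreement ratio signals that the prompt, the tool descriptions, or the sampling settings deserve a closer look.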

Adding a weather-query function

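When we ask a large model about the weather without giving it any tools, it replies that, as a large language model, it does not know the current weather, and it suggests where the weather can be looked up. Now we add a weather-query function to our Agent that returns fake data for testing.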

def get_weather(city: str) -> int:
    """
    Gets the weather temperature of a specified city.

    Args:
    city (str): The name or abbreviation of the city.

    Returns:
    int: The temperature of the city. Returns 20 for 'NY' (New York),
         30 for 'BJ' (Beijing), and -1 for unknown cities.
    """

    # Convert the input city to uppercase to handle case-insensitive comparisons
    city = city.upper()

    # Check if the city is New York ('NY')
    if city == "NY":
        return 20  # Return 20°C for New York

    # Check if the city is Beijing ('BJ')
    elif city == "BJ":
        return 30  # Return 30°C for Beijing

    # If the city is neither 'NY' nor 'BJ', return -1 to indicate unknown city
    else:
        return -1

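# multiply_tool and add_tool were local to main() above, so recreate them here
multiply_tool = FunctionTool.from_defaults(fn=multiply)
add_tool = FunctionTool.from_defaults(fn=add)
weather_tool = FunctionTool.from_defaults(fn=get_weather)

agent = ReActAgent.from_tools([multiply_tool, add_tool, weather_tool], llm=llm, verbose=True)

response = agent.chat("纽约天气怎么样?")  # prompt: "What's the weather like in New York?"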

Running step 12923071-c8af-4154-9922-242a1fe96dd0. Step input: 纽约天气怎么样？
Thought: The user is asking for the weather in New York. I need to use a tool to help me answer the question.
Action: get_weather
Action Input: {'city': 'NY'}
Observation: 20
Running step 7a5ab665-2f32-498e-aaa0-4caa5a7d6913. Step input: None
Thought: I have obtained the weather information for New York. Now, I need to provide the answer to the user.
Answer: 纽约现在的天气是 20 度。

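The model's reasoning is strong: it converted 纽约 (New York) to NY on its own, and its final answer reads "The current weather in New York is 20 degrees." ReActAgent makes it possible to turn business requests into code calls automatically: as long as an API exists, the model can call it, which suits many business scenarios. LlamaIndex also provides a number of open-source tool implementations, which can be found on its official website.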
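Relying on the model to produce exactly 'NY' can be fragile, and -1 is itself a plausible temperature. As one possible hardening, sketched here under assumptions, the lookup could accept a few aliases and return None for unknown cities. FAKE_TEMPERATURES, CITY_ALIASES, and get_weather_or_none are hypothetical names, and the data is still fake test data.

```python
from typing import Dict, Optional

# Hypothetical fake data, mirroring the test values used above.
FAKE_TEMPERATURES: Dict[str, int] = {"NY": 20, "BJ": 30}

# Hypothetical aliases so that full names map to the same abbreviations.
CITY_ALIASES: Dict[str, str] = {
    "NEW YORK": "NY",
    "纽约": "NY",
    "BEIJING": "BJ",
    "北京": "BJ",
}


def get_weather_or_none(city: str) -> Optional[int]:
    """Return the fake temperature in °C, or None when the city is unknown."""
    key = city.strip().upper()
    key = CITY_ALIASES.get(key, key)
    return FAKE_TEMPERATURES.get(key)


if __name__ == "__main__":
    for name in ["ny", "New York", "纽约", "Paris"]:
        print(name, get_weather_or_none(name))
```

Returning None instead of a sentinel number lets the caller distinguish "unknown city" from a real reading of -1°C.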

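Although an Agent can implement business functionality, a single Agent cannot do everything, which is consistent with the decoupling principle in software design. Different Agents can handle different tasks, each with its own responsibility, and they can interact and communicate with one another, much like microservices.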