Hanah

Create Agent with Llama-index on wow-agent-day04

Reference: Datawhale wow agent day04

Build LLM#

from openai import OpenAI
from pydantic import Field  # Import Field to define metadata for fields in Pydantic models
from llama_index.core.llms import (
   CustomLLM,
   CompletionResponse,
   LLMMetadata,
)
from llama_index.core.embeddings import BaseEmbedding
from llama_index.core.llms.callbacks import llm_completion_callback
from typing import List, Any, Generator

# NOTE: api_key, base_url, and chat_model are assumed to be defined earlier,
# e.g. loaded from environment variables or a .env file.

# Define OurLLM class, inheriting from CustomLLM base class
class OurLLM(CustomLLM):
   api_key: str = Field(default=api_key)
   base_url: str = Field(default=base_url)
   model_name: str = Field(default=chat_model)
   client: OpenAI = Field(default=None, exclude=True)  # Explicitly declare client field

   def __init__(self, api_key: str, base_url: str, model_name: str = chat_model, **data: Any):
       super().__init__(**data)
       self.api_key = api_key
       self.base_url = base_url
       self.model_name = model_name
       self.client = OpenAI(api_key=self.api_key, base_url=self.base_url)  # Initialize client instance with provided api_key and base_url

   @property
   def metadata(self) -> LLMMetadata:
       """Get LLM metadata."""
       return LLMMetadata(
           model_name=self.model_name,
       )

   @llm_completion_callback()
   def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
       response = self.client.chat.completions.create(model=self.model_name, messages=[{"role": "user", "content": prompt}])
       if hasattr(response, 'choices') and len(response.choices) > 0:
           response_text = response.choices[0].message.content
           return CompletionResponse(text=response_text)
       else:
           raise Exception(f"Unexpected response format: {response}")

   @llm_completion_callback()
   def stream_complete(
       self, prompt: str, **kwargs: Any
   ) -> Generator[CompletionResponse, None, None]:
       response = self.client.chat.completions.create(
           model=self.model_name,
           messages=[{"role": "user", "content": prompt}],
           stream=True
       )

       try:
           for chunk in response:
               chunk_message = chunk.choices[0].delta
               if not chunk_message.content:
                   continue
               content = chunk_message.content
               yield CompletionResponse(text=content, delta=content)

        except Exception as e:
            raise RuntimeError(f"Error while streaming response: {e}") from e

llm = OurLLM(api_key=api_key, base_url=base_url, model_name=chat_model)

Code Analysis:
The code above defines a class OurLLM that inherits from CustomLLM; it is used to interact with a custom, OpenAI-compatible LLM endpoint.

  1. Module Imports:
  • OpenAI: Used to interact with the OpenAI API.
  • Field: From pydantic, used to define metadata for data model fields.
  • CustomLLM, CompletionResponse, LLMMetadata: From llama_index.core.llms, used to define custom language models and their responses and metadata.
  • BaseEmbedding: Used for embedding models.
  • llm_completion_callback: Used as a decorator to handle LLM completion callbacks.
  • List, Any, Generator: From typing, used for type annotations.
  2. OurLLM Class:
  • Fields:
    - api_key, base_url, model_name: Defined with Field to set default values.
    - client: An OpenAI client instance, declared explicitly with a default of None. Because it is declared with exclude=True, it is omitted when the model object is serialized to JSON or other formats, which helps protect sensitive information and reduce data size.
  • Methods:
    - __init__: Initializes api_key, base_url, and model_name, then creates an OpenAI client instance with the provided api_key and base_url.
    - metadata: Returns the model's metadata (including the model name) as an LLMMetadata object. Because it is decorated with @property, it is accessed as llm.metadata rather than llm.metadata().
    - complete: Decorated with @llm_completion_callback, a callback hook invoked when the LLM finishes a text-generation task; such hooks are useful for monitoring and logging, resource management, and post-processing. The method calls the API via self.client.chat.completions.create, checks that the response has a non-empty choices list, and if so returns the content of the first message as the text of a CompletionResponse; otherwise it raises an exception.
    - stream_complete: Calls the API with stream=True and iterates over each chunk of the response, reading the incremental content from chunk.choices[0].delta. Chunks with empty content are skipped; for each non-empty chunk, yield returns a CompletionResponse carrying the chunk's text and delta.
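The effect of exclude=True on the client field can be demonstrated with a minimal standalone Pydantic model (a sketch assuming Pydantic v2; the model and field values here are made up for illustration):

```python
from typing import Any
from pydantic import BaseModel, ConfigDict, Field

class LLMConfig(BaseModel):
    # Silence Pydantic's warning about field names starting with "model_"
    model_config = ConfigDict(protected_namespaces=())

    model_name: str = "glm-4"
    client: Any = Field(default=None, exclude=True)  # omitted from serialization

cfg = LLMConfig(client="fake-client-instance")
print(cfg.model_dump())  # {'model_name': 'glm-4'} -- no 'client' key
```

The client object is still fully usable on the instance; it is only hidden when the model is dumped to a dict or JSON.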

Check that the custom LLM class works#

response = llm.stream_complete("Who are you?")
for chunk in response:
    print(chunk, end="", flush=True)

result:
I am an AI assistant named ChatGLM, developed from a language model jointly trained by Tsinghua University's KEG Laboratory and Zhipu AI in 2024. My task is to provide appropriate responses and support for users' questions and requests.
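The incremental printing above works because stream_complete is a generator: each yield hands one chunk back to the caller as soon as it arrives. A pure-Python analogue of that chunking pattern (the function name and chunk size are made up for illustration):

```python
def fake_stream(text: str, size: int = 4):
    """Yield `text` in small chunks, mimicking a streaming completion."""
    for i in range(0, len(text), size):
        yield text[i:i + size]

# The caller consumes chunks as they are produced, just like the loop above
pieces = list(fake_stream("I am an AI assistant."))
print("".join(pieces))  # I am an AI assistant.
```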

Processing flow in a custom agent#

import sys
import os

sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), "..", "..")))

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool


def multiply(a: float, b: float) -> float:
    """Multiply two numbers and returns the product"""
    return a * b


def add(a: float, b: float) -> float:
    """Add two numbers and returns the sum"""
    return a + b


def main():

    multiply_tool = FunctionTool.from_defaults(fn=multiply)
    add_tool = FunctionTool.from_defaults(fn=add)

    # Create ReActAgent instance
    agent = ReActAgent.from_tools([multiply_tool, add_tool], llm=llm, verbose=True)

    response = agent.chat("What is 20 + (2 * 4)? Use tools to calculate each step")

    print(response)


if __name__ == "__main__":
    main()

Code Analysis:

  1. Define the add and multiply functions.
  2. Use the FunctionTool.from_defaults method to wrap the multiply and add functions into tools, then create a ReActAgent instance, passing in these tools and the language model llm. verbose=True means detailed information is printed during execution.
  3. The agent.chat(...) method converses with the agent: a math problem is passed in, asking the agent to use the tools to calculate step by step.

ReActAgent provides a framework for dynamic LLM agents by combining reasoning and acting. Alternating between reasoning steps and action steps lets the model handle complex environments more effectively; this closed loop of reasoning and action allows the agent to complete a given task on its own.

A typical ReActAgent follows this cycle:

  1. Initial Reasoning: The agent first reasons about the task to understand it, gather relevant information, and decide on the next action.
  2. Action: The agent acts based on its reasoning, such as querying an API, retrieving data, or executing a command.
  3. Observation: The agent observes the result of the action and gathers any new information.
  4. Refined Reasoning: Using the new information, the agent reasons again, updating its understanding, plans, or hypotheses.
  5. Repeat: The agent alternates between reasoning and action until it reaches a satisfactory conclusion or the task is completed.
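The cycle above can be sketched as a toy loop in plain Python. Here the "reasoning" step is a hard-coded plan rather than an LLM call, so this only illustrates the control flow of a ReAct loop, not a real agent:

```python
def multiply(a, b):
    return a * b

def add(a, b):
    return a + b

tools = {"multiply": multiply, "add": add}

# A hard-coded "plan" standing in for the LLM's reasoning steps:
# first compute 2 * 4, then add 20 to the previous observation.
plan = [("multiply", {"a": 2, "b": 4}), ("add", {"a": 20, "b": None})]

observation = None
for action, args in plan:
    # Reasoning step (faked): plug the previous observation into missing args
    args = {k: (observation if v is None else v) for k, v in args.items()}
    # Action step: call the chosen tool, then observe the result
    observation = tools[action](**args)
    print(f"Action: {action}, Input: {args}, Observation: {observation}")

print("Answer:", observation)  # Answer: 28
```

In a real ReActAgent the plan is not fixed in advance: the LLM produces each Thought/Action pair one at a time, conditioned on the observations so far.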

The result of running the code is:

Running step 8b8dc223-de0e-41ac-836e-87917ab466a3. Step input: What is 20 + (2 * 4)? Use tools to calculate each step
Thought: The user is asking for a calculation involving addition and multiplication. I need to use the tools to help me answer the question.
Action: multiply
Action Input: {'a': 2, 'b': 4}
Observation: 8
Running step 13110be6-d7f8-4cad-85ed-60148842b327. Step input: None
Thought: The user wants to calculate 20 + (2 * 4). I have already calculated the multiplication part. Now I need to add 20 to the result of the multiplication.
Action: add
Action Input: {'a': 20, 'b': 8}
Observation: 28
Running step efe67457-e8f7-4f3d-9a27-c48d3c883f18. Step input: None
Thought: I can answer without using any more tools. I'll use the user's language to answer
Answer: 28
28

Note, however, that repeated runs may not produce stable answers; the prompting or the model may need to be strengthened to make the behavior more reliable.

Add a method to query the weather#

When we ask the large model a weather question without any tools, it responds that it does not know the current weather and suggests where to check it. Let's add a method to our Agent that queries the weather, returning fake data for testing.

def get_weather(city: str) -> int:
    """
    Gets the weather temperature of a specified city.

    Args:
    city (str): The name or abbreviation of the city.

    Returns:
    int: The temperature of the city. Returns 20 for 'NY' (New York),
         30 for 'BJ' (Beijing), and -1 for unknown cities.
    """

    # Convert the input city to uppercase to handle case-insensitive comparisons
    city = city.upper()

    # Check if the city is New York ('NY')
    if city == "NY":
        return 20  # Return 20°C for New York

    # Check if the city is Beijing ('BJ')
    elif city == "BJ":
        return 30  # Return 30°C for Beijing

    # If the city is neither 'NY' nor 'BJ', return -1 to indicate unknown city
    else:
        return -1
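As a side note, the if/elif chain above can be expressed as a single dictionary lookup. The version below is behaviorally equivalent (a condensed rewrite, not the original code) and includes a quick sanity check:

```python
def get_weather(city: str) -> int:
    """Return a fake temperature for a city: 20 for NY, 30 for BJ, -1 otherwise."""
    return {"NY": 20, "BJ": 30}.get(city.upper(), -1)

print(get_weather("ny"))     # 20 -- case-insensitive
print(get_weather("BJ"))     # 30
print(get_weather("Paris"))  # -1 -- unknown city
```

The dict form makes it easy to add more cities without growing a branch per city.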

weather_tool = FunctionTool.from_defaults(fn=get_weather)

agent = ReActAgent.from_tools([multiply_tool, add_tool, weather_tool], llm=llm, verbose=True)

response = agent.chat("How is the weather in New York?")

Running step 12923071-c8af-4154-9922-242a1fe96dd0. Step input: How is the weather in New York?
Thought: The user is asking for the weather in New York. I need to use a tool to help me answer the question.
Action: get_weather
Action Input: {'city': 'NY'}
Observation: 20
Running step 7a5ab665-2f32-498e-aaa0-4caa5a7d6913. Step input: None
Thought: I have obtained the weather information for New York. Now, I need to provide the answer to the user.
Answer: The current weather in New York is 20 degrees.

The model's reasoning is strong enough to map "New York" to the abbreviation "NY". ReActAgent makes it possible to expose business logic to the model automatically: any function with a clear signature can be wrapped as a callable tool, which fits many business scenarios. LlamaIndex also provides a set of open-source tools, listed on its official website.

Although an Agent can implement business functions, a single Agent cannot cover every function, which aligns with the software design principle of decoupling. Different Agents can handle different tasks, each with its own role, and Agents can interact and communicate with one another, much like microservices.
