
Leverage OpenAI Tool Calling: Building a Reliable AI Agent from Scratch | by Lukasz Kowejsza | Mar, 2024


Created with DALL·E

Step-by-Step Workflow for developing and refining an AI Agent while dealing with errors

When we think about the future of AI, we envision intuitive everyday helpers seamlessly integrating into our workflows and taking on complex, routine tasks. All of us have found touchpoints that relieve us from the tedium of mental routine work. Yet, the main tasks currently tackled involve text creation, correction, and brainstorming, underlined by the significant role RAG (Retrieval-Augmented Generation) pipelines play in ongoing development. We aim to provide Large Language Models with better context to generate more valuable content.

Thinking about the future of AI conjures images of Jarvis from Iron Man or Rasputin from Destiny (the game) for me. In both examples, the AI acts as a voice-controlled interface to a complex system, offering high-level abstractions. For instance, Tony Stark uses it to manage his research, conduct calculations, and run simulations. Even R2D2 can respond to voice commands to interface with unfamiliar computer systems and extract data or interact with building systems.

In these scenarios, AI enables interaction with complex systems without requiring the end user to have a deep understanding of them. This could be likened to an ERP system in a large corporation today. It is rare to find someone in a large corporation who fully knows and understands every aspect of the in-house ERP system. It is not far-fetched to imagine that, in the near future, AI could assist with nearly every interaction with an ERP system. From the end user managing customer data or logging orders to the software developer fixing bugs or implementing new features, these interactions could soon be facilitated by AI assistants familiar with all aspects and processes of the ERP system. Such an AI assistant would know which database to enter customer data into and which processes and code might be relevant to a bug.

To achieve this, several challenges and innovations lie ahead. We need to rethink processes and their documentation. Today's ERP processes are designed for human use, with specific roles for different users, documentation for humans, input masks for humans, and user interactions designed to be intuitive and error-free. The design of these aspects will look different for AI interactions. We need specific roles for AI interactions and different process designs to enable intuitive and error-free AI interaction. This is already evident in our work with prompts. What we consider a clear task often turns out not to be so simple.

However, let's first take a step back to the concept of agents. Agents, or AI assistants that can perform tasks using the tools provided and make decisions on how to use these tools, are the building blocks that could eventually enable such a system. They are the process components we would want to integrate into every aspect of a complex system. But as highlighted in a previous article, they are challenging to deploy reliably. In this article, I will demonstrate how we can design and optimize an agent capable of reliably interacting with a database.

While the grand vision of AI's future is inspiring, it is essential to take practical steps towards realizing it. To demonstrate how we can start building the foundation for such advanced AI systems, let's focus on creating a prototype agent for a common task: expense tracking. This prototype will serve as a tangible example of how AI can assist in managing financial transactions efficiently, showcasing the potential of AI in automating routine tasks and highlighting the challenges and considerations involved in designing an AI system that interacts seamlessly with databases. By starting with a specific and relatable use case, we can gain valuable insights that will inform the development of more complex AI agents in the future.

This article lays the groundwork for a series aimed at developing a chatbot that can serve as a single point of interaction for a small business to support and execute business processes, or a chatbot for your personal life that organizes everything you need to keep track of. From data, routines, and files to pictures, we want to simply chat with our assistant, allowing it to decide where to store and retrieve your information.

Transitioning from the grand vision of AI's future to practical applications, let's zoom in on creating a prototype agent. This agent will serve as a foundational step towards realizing the ambitious goals discussed earlier. We will develop an "Expense Tracking" agent, a straightforward yet essential task, demonstrating how AI can assist in managing financial transactions efficiently.

This "Expense Tracking" prototype will not only showcase the potential of AI in automating routine tasks but also illuminate the challenges involved in designing an AI system that interacts seamlessly with databases. By focusing on this example, we can explore the intricacies of agent design, input validation, and the integration of AI with existing systems, laying a solid foundation for more complex applications in the future.

To bring our prototype agent to life and identify potential bottlenecks, we start by testing OpenAI's tool call functionality. Starting with a basic example of expense tracking, we lay down a foundational piece that mimics a real-world application. This stage involves creating a base model and transforming it into the OpenAI tool schema using the langchain library's convert_to_openai_tool function. Additionally, a report_tool enables our future agent to communicate results or highlight missing information or issues:

from pydantic.v1 import BaseModel
from datetime import datetime
from langchain_core.utils.function_calling import convert_to_openai_tool

class Expense(BaseModel):
    description: str
    net_amount: float
    gross_amount: float
    tax_rate: float
    date: datetime

class Report(BaseModel):
    report: str

add_expense_tool = convert_to_openai_tool(Expense)
report_tool = convert_to_openai_tool(Report)

With the data model and tools set up, the next step is to use the OpenAI client SDK to initiate a simple tool call. In this first test, we deliberately provide the model with insufficient information to see whether it correctly indicates what is missing. This approach checks not only the functional capability of the agent but also its interactive and error-handling capacities.

from openai import OpenAI

SYSTEM_MESSAGE = """You are tasked with completing specific objectives and
must report the outcomes. At your disposal, you have a variety of tools,
each specialized in performing a distinct type of task.

For successful task completion:
Thought: Consider the task at hand and determine which tool is best suited
based on its capabilities and the nature of the work.

Use the report_tool with an instruction detailing the results of your work.
If you encounter an issue and cannot complete the task:

Use the report_tool to communicate the issue or reason for the
task's incompletion.
You will receive feedback based on the outcomes of
each tool's task execution or explanations for any tasks that
could not be completed. This feedback loop is crucial for addressing
and resolving any issues by strategically deploying the available tools.
"""

user_message = "I have spent 5$ on a coffee today, please track my expense. The tax rate is 0.2."

client = OpenAI()
model_name = "gpt-3.5-turbo-0125"

messages = [
    {"role": "system", "content": SYSTEM_MESSAGE},
    {"role": "user", "content": user_message}
]

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    tools=[
        convert_to_openai_tool(Expense),
        convert_to_openai_tool(Report)
    ]
)

Next, we need a new function to read the arguments of the function call from the response:

import json

def parse_function_args(response):
    message = response.choices[0].message
    return json.loads(message.tool_calls[0].function.arguments)

print(parse_function_args(response))

{'description': 'Coffee',
 'net_amount': 5,
 'gross_amount': None,
 'tax_rate': 0.2,
 'date': '2023-10-06T12:00:00Z'}

As we can observe, we have encountered two issues in the execution:

  1. The gross_amount is not calculated.
  2. The date is hallucinated.
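For reference, the missing gross amount follows directly from the net amount and the tax rate; a quick sanity check in plain Python (not part of the agent code):

```python
net_amount = 5.0
tax_rate = 0.2

# gross = net plus tax on the net amount
gross_amount = net_amount * (1 + tax_rate)
print(gross_amount)  # 6.0
```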

With that in mind, let's try to resolve these issues and optimize our agent workflow.

To optimize the agent workflow, I find it crucial to prioritize workflow over prompt engineering. While it might be tempting to fine-tune the prompt so that the agent learns to use the provided tools perfectly and makes no mistakes, it is more advisable to first adjust the tools and processes. When a typical error occurs, the first consideration should be how to fix it in code.

Handling missing information effectively is an essential topic for creating robust and reliable agents. In the previous example, providing the agent with a tool like "get_current_date" is a workaround for a specific scenario. However, we must assume that missing information will occur in various contexts, and we cannot rely solely on prompt engineering and adding more tools to prevent the model from hallucinating missing information.

A simple workaround for this scenario is to modify the tool schema to treat all parameters as optional. This approach ensures that the agent only submits the parameters it knows, preventing unnecessary hallucination.

Therefore, let's take a look at the OpenAI tool schema:

add_expense_tool = convert_to_openai_tool(Expense)
print(add_expense_tool)

{'type': 'function',
 'function': {'name': 'Expense',
  'description': '',
  'parameters': {'type': 'object',
   'properties': {'description': {'type': 'string'},
    'net_amount': {'type': 'number'},
    'gross_amount': {'type': 'number'},
    'tax_rate': {'type': 'number'},
    'date': {'type': 'string', 'format': 'date-time'}},
   'required': ['description',
    'net_amount',
    'gross_amount',
    'tax_rate',
    'date']}}}

As we can see, the schema contains the key required, which we need to remove. Here is how to modify the add_expense_tool schema to make all parameters optional by removing the required key:

del add_expense_tool["function"]["parameters"]["required"]

Next, we can design a Tool class that checks the input parameters for missing values before execution. We create the Tool class with two methods, .run() and .validate_input(), plus a property openai_tool_schema, where we manipulate the tool schema by removing required parameters. Additionally, we define a ToolResult BaseModel with the fields content and success to serve as the output object for each tool run.

from pydantic import BaseModel
from typing import Type, Callable, Dict, Any, List

class ToolResult(BaseModel):
    content: str
    success: bool

class Tool(BaseModel):
    name: str
    model: Type[BaseModel]
    function: Callable
    validate_missing: bool = False

    class Config:
        arbitrary_types_allowed = True

    def run(self, **kwargs) -> ToolResult:
        if self.validate_missing:
            missing_values = self.validate_input(**kwargs)
            if missing_values:
                content = f"Missing values: {', '.join(missing_values)}"
                return ToolResult(content=content, success=False)
        result = self.function(**kwargs)
        return ToolResult(content=str(result), success=True)

    def validate_input(self, **kwargs) -> List[str]:
        missing_values = []
        for key in self.model.__fields__.keys():
            if key not in kwargs:
                missing_values.append(key)
        return missing_values

    @property
    def openai_tool_schema(self) -> Dict[str, Any]:
        schema = convert_to_openai_tool(self.model)
        if "required" in schema["function"]["parameters"]:
            del schema["function"]["parameters"]["required"]
        return schema

The Tool class is a crucial component of the AI agent's workflow, serving as a blueprint for creating and managing the various tools the agent can use to perform specific tasks. It is designed to handle input validation, execute the tool's function, and return the result in a standardized format.

The Tool class has the following key components:

  1. name: The name of the tool.
  2. model: The Pydantic BaseModel that defines the input schema for the tool.
  3. function: The callable that the tool executes.
  4. validate_missing: A boolean flag indicating whether to validate missing input values (defaults to False).

The Tool class has two main methods:

  1. run(self, **kwargs) -> ToolResult: This method executes the tool's function with the provided input arguments. It first checks whether validate_missing is set to True. If so, it calls the validate_input() method to check for missing input values. If any missing values are found, it returns a ToolResult object with an error message and success set to False. If all required input values are present, it proceeds to execute the tool's function with the provided arguments and returns a ToolResult object with the result and success set to True.
  2. validate_input(self, **kwargs) -> List[str]: This method compares the input arguments passed to the tool with the expected input schema defined in the model. It iterates over the fields defined in the model and checks whether each field is present in the input arguments. If any field is missing, it appends the field name to a list of missing values. Finally, it returns the list of missing values.
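The missing-field check itself can be sketched independently of pydantic. Here is a minimal stdlib-only version using an illustrative dataclass (the names are mine, not from the article's code), which mirrors the same iterate-and-compare logic:

```python
from dataclasses import dataclass, fields

@dataclass
class Expense:
    description: str
    net_amount: float
    gross_amount: float
    tax_rate: float

def validate_input(model, **kwargs):
    # Return the names of model fields that were not supplied as arguments
    return [f.name for f in fields(model) if f.name not in kwargs]

missing = validate_input(Expense, description="Coffee", net_amount=5.0)
print(missing)  # ['gross_amount', 'tax_rate']
```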

The Tool class also has a property called openai_tool_schema, which returns the OpenAI tool schema for the tool. It uses the convert_to_openai_tool() function to convert the model to the OpenAI tool schema format and then removes the "required" key from the schema, making all input parameters optional. This allows the agent to provide only the available information without needing to hallucinate missing values.

By encapsulating the tool's functionality, input validation, and schema generation, the Tool class provides a clean and reusable interface for creating and managing tools in the AI agent's workflow. It abstracts away the complexities of handling missing values and ensures that the agent can gracefully handle incomplete information while executing the appropriate tools based on the available input.

Next, we extend our OpenAI API call. We want the client to utilize our tool, and our response object to directly trigger a tool.run(). For this, we need to initialize our tools with our newly created Tool class. We define two dummy functions which return a success message string.

def add_expense_func(**kwargs):
    return f"Added expense: {kwargs} to the database."

add_expense_tool = Tool(
    name="add_expense_tool",
    model=Expense,
    function=add_expense_func,
    validate_missing=True  # report missing inputs instead of running with gaps
)

def report_func(report: str = None):
    return f"Reported: {report}"

report_tool = Tool(
    name="report_tool",
    model=Report,
    function=report_func
)

tools = [add_expense_tool, report_tool]

Next, we define our helper functions, each of which takes the client response as input and helps us interact with our tools.

def get_tool_from_response(response, tools=tools):
    tool_name = response.choices[0].message.tool_calls[0].function.name
    for t in tools:
        if t.name == tool_name:
            return t
    raise ValueError(f"Tool {tool_name} not found in tools list.")

def parse_function_args(response):
    message = response.choices[0].message
    return json.loads(message.tool_calls[0].function.arguments)

def run_tool_from_response(response, tools=tools):
    tool = get_tool_from_response(response, tools)
    tool_kwargs = parse_function_args(response)
    return tool.run(**tool_kwargs)

Now, we can execute our client call with our new tools and use the run_tool_from_response function.

response = client.chat.completions.create(
    model=model_name,
    messages=messages,
    tools=[tool.openai_tool_schema for tool in tools]
)

tool_result = run_tool_from_response(response, tools=tools)
print(tool_result)

content='Missing values: gross_amount, date' success=False

Perfect, our tool now indicates that values are missing. Thanks to the trick of marking all parameters as optional, we avoid hallucinated parameters.

Our process, as it stands, does not yet represent a true agent. So far, we have only executed a single API tool call. To transform this into an agent workflow, we need to introduce an iterative process that feeds the results of tool execution back to the client. The basic process should look like this:

Image by author

Let's get started by creating a new OpenAIAgent class:

from colorama import Fore, Style

class StepResult(BaseModel):
    event: str
    content: str
    success: bool

class OpenAIAgent:

    def __init__(
        self,
        tools: list[Tool],
        client: OpenAI,
        system_message: str = SYSTEM_MESSAGE,
        model_name: str = "gpt-3.5-turbo-0125",
        max_steps: int = 5,
        verbose: bool = True
    ):
        self.tools = tools
        self.client = client
        self.model_name = model_name
        self.system_message = system_message
        self.step_history = []
        self.max_steps = max_steps
        self.verbose = verbose

    def to_console(self, tag: str, message: str, color: str = "green"):
        if self.verbose:
            color_prefix = Fore.__dict__[color.upper()]
            print(color_prefix + f"{tag}: {message}{Style.RESET_ALL}")
Like our ToolResult object, we have defined a StepResult as the output object for each agent step. We then defined the __init__ method of the OpenAIAgent class and a to_console() method that prints intermediate steps and tool calls to the console, using colorama for colored printouts. Next, we define the heart of the agent: the run() and run_step() methods.

class OpenAIAgent:

    # ... __init__ ...

    # ... to_console ...

    def run(self, user_input: str):

        openai_tools = [tool.openai_tool_schema for tool in self.tools]
        self.step_history = [
            {"role": "system", "content": self.system_message},
            {"role": "user", "content": user_input}
        ]

        step_result = None
        i = 0

        self.to_console("START", f"Starting Agent with Input: {user_input}")

        while i < self.max_steps:
            step_result = self.run_step(self.step_history, openai_tools)

            if step_result.event == "finish":
                break
            elif step_result.event == "error":
                self.to_console(step_result.event, step_result.content, "red")
            else:
                self.to_console(step_result.event, step_result.content, "yellow")
            i += 1

        self.to_console("Final Result", step_result.content, "green")
        return step_result.content

In the run() method, we start by initializing the step_history, which will serve as our message memory, with the predefined system_message and the user_input. Then we start our while loop, calling run_step in each iteration, which returns a StepResult object. We identify whether the agent finished its task or whether an error occurred, which is passed to the console as well.

class OpenAIAgent:

    # ... __init__ ...

    # ... to_console ...
    # ... run ...

    def run_step(self, messages: list[dict], tools):

        # plan the next step
        response = self.client.chat.completions.create(
            model=self.model_name,
            messages=messages,
            tools=tools
        )

        # add message to history
        self.step_history.append(response.choices[0].message)

        # check if a tool call is present
        if not response.choices[0].message.tool_calls:
            return StepResult(
                event="error",
                content="No tool calls were returned.",
                success=False
            )

        tool_name = response.choices[0].message.tool_calls[0].function.name
        tool_kwargs = parse_function_args(response)

        # execute the tool call
        self.to_console(
            "Tool Call", f"Name: {tool_name}\nArgs: {tool_kwargs}", "magenta"
        )
        tool_result = run_tool_from_response(response, tools=self.tools)
        tool_result_msg = self.tool_call_message(response, tool_result)
        self.step_history.append(tool_result_msg)

        if tool_result.success:
            step_result = StepResult(
                event="tool_result",
                content=tool_result.content,
                success=True
            )
        else:
            step_result = StepResult(
                event="error",
                content=tool_result.content,
                success=False
            )

        return step_result

    def tool_call_message(self, response, tool_result: ToolResult):
        tool_call = response.choices[0].message.tool_calls[0]
        return {
            "tool_call_id": tool_call.id,
            "role": "tool",
            "name": tool_call.function.name,
            "content": tool_result.content,
        }

Now we have defined the logic for each step. We first obtain a response object via our previously tested client API call with tools. We append the response message object to our step_history. We then verify whether a tool call is included in the response object; otherwise, we return an error in our StepResult. Then we log our tool call to the console and run the selected tool with our previously defined function run_tool_from_response(). We also need to append the tool result to our message history. OpenAI has defined a specific format for this purpose, so that the model knows which tool call refers to which output, by passing a tool_call_id into the message dict. This is done by our method tool_call_message(), which takes the response object and the tool_result as input arguments. At the end of each step, we assign the tool result to a StepResult object, which also indicates whether the step was successful, and return it to our loop in run().
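For clarity, the tool message appended to the history has the following shape. The literal values below are made up for illustration; in practice, the id comes from the assistant's tool call and the content from the ToolResult:

```python
# Shape of the message OpenAI expects after a tool call:
# role "tool" links the result back to the call via tool_call_id.
tool_message = {
    "tool_call_id": "call_abc123",            # id of the assistant's tool_call
    "role": "tool",
    "name": "add_expense_tool",               # which tool produced this output
    "content": "Missing values: gross_amount, date",
}
print(tool_message["role"])  # tool
```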

Now we can test our agent with the previous example, this time also equipping it with a get_current_date_tool. Here, we set the previously defined validate_missing attribute to False, since the tool does not need any input arguments.

class DateTool(BaseModel):
    x: str = None

get_date_tool = Tool(
    name="get_current_date",
    model=DateTool,
    # accept (and ignore) any arguments the model might pass
    function=lambda **kwargs: datetime.now().strftime("%Y-%m-%d"),
    validate_missing=False
)

tools = [
    add_expense_tool,
    report_tool,
    get_date_tool
]

agent = OpenAIAgent(tools, client)
agent.run("I have spent 5$ on a coffee today, please track my expense. The tax rate is 0.2.")

START: Starting Agent with Input: 
"I have spent 5$ on a coffee today, please track my expense. The tax rate is 0.2."

Tool Call: Name: get_current_date
Args: {}
tool_result: 2024-03-15

Tool Call: Name: add_expense_tool
Args: {'description': 'Coffee expense', 'net_amount': 5, 'tax_rate': 0.2, 'date': '2024-03-15'}
error: Missing values: gross_amount

Tool Call: Name: add_expense_tool
Args: {'description': 'Coffee expense', 'net_amount': 5, 'tax_rate': 0.2, 'date': '2024-03-15', 'gross_amount': 6}
tool_result: Added expense: {'description': 'Coffee expense', 'net_amount': 5, 'tax_rate': 0.2, 'date': '2024-03-15', 'gross_amount': 6} to the database.
error: No tool calls were returned.

Tool Call: Name: report_tool
Args: {'report': 'Expense successfully tracked for coffee purchase.'}
tool_result: Reported: Expense successfully tracked for coffee purchase.

Final Result: Reported: Expense successfully tracked for coffee purchase.

Following the successful execution of our prototype agent, it is worth emphasizing how effectively the agent used the designated tools according to plan. First, it invoked the get_current_date_tool, establishing a timestamp for the expense entry. Then, when attempting to log the expense via the add_expense_tool, our Tool class identified a missing gross_amount, a crucial piece of information for accurate financial tracking. Notably, the agent autonomously resolved this by calculating the gross_amount from the provided tax_rate.

It is important to mention that in our test run, the nature of the input expense, whether the $5 spent on coffee was net or gross, was not explicitly specified. At this point, such specificity was not required for the agent to perform its task successfully. However, this brings to light a valuable insight for refining the agent's understanding and interaction capabilities: incorporating such detailed information into the initial system prompt could significantly enhance the agent's accuracy and efficiency in processing expense entries. This adjustment would ensure a more comprehensive grasp of financial data right from the outset.
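As a sketch of that adjustment, the system prompt could be extended with an explicit rule for unlabeled amounts. The wording below is my own suggestion, not taken from the tested prompt:

```python
# Suggested policy for ambiguous amounts (illustrative wording):
AMOUNT_POLICY = (
    "If the user does not state whether an amount is net or gross, assume it "
    "is the gross amount and derive net_amount = gross_amount / (1 + tax_rate)."
)

# Base prompt shortened here; in the real code this is the full SYSTEM_MESSAGE.
SYSTEM_MESSAGE = "You are tasked with completing specific objectives ..."
SYSTEM_MESSAGE = SYSTEM_MESSAGE + "\n\n" + AMOUNT_POLICY
```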

Key takeaways:

  1. Iterative Development: The project underscores the critical nature of an iterative development cycle, fostering continuous improvement through feedback. This approach is paramount in AI, where variability is the norm, necessitating an adaptable and responsive development strategy.
  2. Handling Uncertainty: Our journey highlighted the significance of elegantly managing ambiguities and errors. Innovations such as optional parameters and rigorous input validation have proven instrumental in enhancing both the reliability and the user experience of the agent.
  3. Customized Agent Workflows for Specific Tasks: A key insight from this work is the importance of customizing agent workflows to suit particular use cases. Beyond assembling a set of tools, the strategic design of tool interactions and responses is vital. This customization ensures the agent effectively addresses specific challenges, leading to a more focused and efficient problem-solving approach.

The journey we have embarked upon is just the beginning of a larger exploration into the world of AI agents and their applications in various domains. As we continue to push the boundaries of what is possible with AI, we invite you to join us on this exciting journey. By building upon the foundation laid in this article and staying tuned for the upcoming enhancements, you will witness firsthand how AI agents can revolutionize the way businesses and individuals handle their data and automate complex tasks.

Together, let us embrace the power of AI and unlock its potential to transform the way we work and interact with technology. The future of AI is bright, and we are at the forefront of shaping it, one reliable agent at a time.

As we continue exploring the potential of AI agents, the upcoming articles will focus on expanding the capabilities of our prototype and integrating it with real-world systems. In the next article, we will dive into designing a robust project structure that allows our agent to interact seamlessly with SQL databases. Leveraging the agent developed in this article, we will demonstrate how AI can efficiently manage and manipulate data stored in databases, opening up possibilities for automating data-related tasks.

Building upon this foundation, the third article in the series will introduce advanced query features, enabling our agent to handle more complex data retrieval and manipulation tasks. We will also explore the concept of a routing agent, which will act as a central hub managing multiple subagents, each responsible for interacting with specific database tables. This hierarchical structure will allow users to make requests in natural language, which the routing agent will then interpret and direct to the appropriate subagent for execution.

To further enhance the practicality and security of our AI-powered system, we will introduce role-based access control. This ensures that users have the appropriate permissions to access and modify data based on their assigned roles. By implementing this feature, we can demonstrate how AI agents can be deployed in real-world scenarios while maintaining data integrity and security.

Through these upcoming enhancements, we aim to showcase the true potential of AI agents in streamlining data management processes and providing a more intuitive and efficient way for users to interact with databases. By combining natural language processing, database management, and role-based access control, we will be laying the groundwork for sophisticated AI assistants that can revolutionize the way businesses and individuals handle their data.

Stay tuned for these exciting developments as we continue to push the boundaries of what is possible with AI agents in data management and beyond.

Source Code

Additionally, the complete source code for the projects covered is available on GitHub. You can access it at https://github.com/elokus/AgentDemo.


