
The Agent Execution Loop: Building an Agent From Scratch

How AI agents work under the hood - the execution loop that enables reasoning, tool calling, and iterative problem solving.

2025 has been dubbed the year of agentic AI. Tools like Google Gemini CLI, Claude Code, GitHub Copilot agent mode, and Cursor are all examples of agents—autonomous entities that can take an open-ended task, plan, take action, reflect on the results, and loop until the task is done. And they're creating real value.

But how do agents actually work? How can you build one?

In this post, I'll walk through the core of how agents function: the agent execution loop that powers these complex behaviors.

What is an Agent?

An agent is an entity that can reason, act, communicate, and adapt to solve problems.

Consider two questions you might ask an LLM:

  1. "What is the capital of France?" — Can retrieve from training data
  2. "What is the stock price of NVIDIA today?" — Needs to fetch current data from an API

The first question can be answered by a model directly. The second cannot - the model will hallucinate a plausible-sounding but incorrect answer because it doesn't have access to real-time data.

An agent solves this by recognizing it needs current data, calling a financial API, and returning the actual price. This requires action, not just text generation.

[Figure: Task complexity - LLMs vs Agents vs Multiple Agents]

Agent Components

An agent has three core components:

[Figure: Agent anatomy - model, tools, and memory]
  • Model: The reasoning engine (typically an LLM like GPT-5) that processes context and decides what to do
  • Tools: Functions the agent can call to take action - APIs, databases, code execution, web search
  • Memory: Short-term (conversation history) and long-term (persistent storage across sessions)
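
As a rough sketch (the names here are illustrative, not any particular framework's API), you can picture this anatomy as a simple data structure:

python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class AgentAnatomy:
    model: str                                                    # reasoning engine, e.g. "gpt-5"
    tools: dict[str, Callable] = field(default_factory=dict)     # callable actions: APIs, search, code
    short_term_memory: list[dict] = field(default_factory=list)  # conversation history
    long_term_memory: Any = None                                  # persistent store across sessions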

Calling an LLM

Before building an agent, you need to understand how to call a language model. Here's the basic pattern using the OpenAI API:

python
from openai import AsyncOpenAI

client = AsyncOpenAI(api_key="your-api-key")

response = await client.chat.completions.create(
    model="gpt-5",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is 2 + 2?"}
    ]
)

print(response.choices[0].message.content)
# Output: "4"

The API takes a list of messages (system instructions, user input, previous assistant responses) and returns a completion. This is a single request-response cycle.

To enable tool calling, you also pass tool definitions:

python
response = await client.chat.completions.create(
    model="gpt-5",
    messages=messages,
    tools=[{
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get current stock price for a symbol",
            "parameters": {
                "type": "object",
                "properties": {
                    "symbol": {"type": "string", "description": "Stock symbol like NVDA"}
                },
                "required": ["symbol"]
            }
        }
    }]
)

When the model decides it needs to use a tool, instead of returning text content, it returns a tool_calls array with the function name and arguments.
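
Continuing from the response above, here's a minimal sketch of reading that tool call back off the response (assuming the model picked get_stock_price for the stock question):

python
import json

message = response.choices[0].message

if message.tool_calls:
    call = message.tool_calls[0]
    print(call.function.name)                   # e.g. "get_stock_price"
    print(json.loads(call.function.arguments))  # e.g. {"symbol": "NVDA"}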

The Agent Execution Loop

Here's the core pattern that every agent follows:

1. Prepare Context → Combine task + instructions + memory + history
2. Call Model → Send context to LLM, get response
3. Handle Response → If text, we're done. If tool calls, execute them.
4. Iterate → Add tool results to context, go back to step 2
5. Return → Final response ready

In code:

python
async def run(self, task: str) -> str:
    # 1. Prepare context
    messages = [
        {"role": "system", "content": self.instructions},
        {"role": "user", "content": task}
    ]
    while True:
        # 2. Call model
        response = await self.client.chat.completions.create(
            model="gpt-5",
            messages=messages,
            tools=self.tool_schemas
        )
        assistant_message = response.choices[0].message
        messages.append(assistant_message)

        # 3. Handle response
        if not assistant_message.tool_calls:
            # No tool calls - we're done
            return assistant_message.content

        # 4. Execute tools and iterate
        for tool_call in assistant_message.tool_calls:
            result = await self.execute_tool(
                tool_call.function.name,
                json.loads(tool_call.function.arguments)
            )
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })
        # Loop continues - model will process tool results

The key insight: an agent takes multiple steps (model call → tool execution → model call) within a single run. The loop continues until the model returns a text response instead of tool calls.

[Figure: The agent action-perception loop]

Tool Execution

When the model returns a tool call, you need to actually execute it:

python
async def execute_tool(self, name: str, arguments: dict) -> str:
    tool = self.tools[name]
    try:
        result = await tool(**arguments)
        return str(result)
    except Exception as e:
        return f"Error: {e}"

The result gets added back to the message history as a tool message, and the loop continues. The model sees what the tool returned and can either call more tools or generate a final response.

A Complete Example

Putting it together:

python
import json
from openai import AsyncOpenAI

class Agent:
    def __init__(self, instructions: str, tools: list):
        self.client = AsyncOpenAI()
        self.instructions = instructions
        self.tools = {t.__name__: t for t in tools}
        # _make_schema builds the OpenAI tool schema from each function's
        # signature and docstring (omitted here for brevity)
        self.tool_schemas = [self._make_schema(t) for t in tools]

    # execute_tool is defined as shown in the previous section

    async def run(self, task: str) -> str:
        messages = [
            {"role": "system", "content": self.instructions},
            {"role": "user", "content": task}
        ]
        while True:
            response = await self.client.chat.completions.create(
                model="gpt-5",
                messages=messages,
                tools=self.tool_schemas
            )
            msg = response.choices[0].message
            messages.append(msg)
            if not msg.tool_calls:
                return msg.content
            for tc in msg.tool_calls:
                result = await self.execute_tool(
                    tc.function.name,
                    json.loads(tc.function.arguments)
                )
                messages.append({
                    "role": "tool",
                    "tool_call_id": tc.id,
                    "content": result
                })

# Usage
async def get_stock_price(symbol: str) -> str:
    # In reality, call an API here
    return f"{symbol}: $142.50"

agent = Agent(
    instructions="You help users get stock information.",
    tools=[get_stock_price]
)

result = await agent.run("What's NVIDIA trading at?")
print(result)
# "NVIDIA (NVDA) is currently trading at $142.50."

The agent:

  1. Receives "What's NVIDIA trading at?"
  2. Calls the model, which decides to use get_stock_price
  3. Executes get_stock_price("NVDA") → returns "NVDA: $142.50"
  4. Adds the result to messages, calls the model again
  5. Model generates a natural language response incorporating the data
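
For intuition, the message history at the end of that run looks roughly like this (a simplified sketch; the real assistant entries are SDK objects with more fields):

python
# Illustrative final message history (values simplified)
messages = [
    {"role": "system", "content": "You help users get stock information."},
    {"role": "user", "content": "What's NVIDIA trading at?"},
    {"role": "assistant", "tool_calls": [{"id": "call_1", "type": "function",
        "function": {"name": "get_stock_price",
                     "arguments": '{"symbol": "NVDA"}'}}]},
    {"role": "tool", "tool_call_id": "call_1", "content": "NVDA: $142.50"},
    {"role": "assistant",
     "content": "NVIDIA (NVDA) is currently trading at $142.50."},
]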

Same Pattern, Different Frameworks

The execution loop we built is the same pattern used by production agent frameworks. The syntax differs, but the core architecture is identical: define tools, create an agent with instructions, run it on a task. The code snippets below show how each of these ideas is implemented in frameworks like Microsoft Agent Framework, Google ADK, and LangGraph.

Microsoft Agent Framework:

python
from agent_framework import ai_function
from agent_framework.azure import AzureOpenAIChatClient

@ai_function
def get_weather(location: str) -> str:
    """Get current weather for a given location."""
    return f"The weather in {location} is sunny, 75°F"

client = AzureOpenAIChatClient(deployment_name="gpt-4.1-mini")
agent = client.create_agent(
    name="assistant",
    instructions="You are a helpful assistant.",
    tools=[get_weather],
)

result = await agent.run("What's the weather in Paris?")

Google ADK:

python
from google.adk import Agent

def get_weather(location: str) -> str:
    """Get current weather for a given location."""
    return f"The weather in {location} is sunny, 75°F"

agent = Agent(
    name="assistant",
    model="gemini-flash-latest",
    instruction="You are a helpful assistant.",
    tools=[get_weather],
)

# Run via InMemoryRunner

LangGraph:

python
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent

@tool
def get_weather(location: str) -> str:
    """Get current weather for a given location."""
    return f"The weather in {location} is sunny, 75°F"

# llm: a previously initialized chat model instance
agent = create_react_agent(
    model=llm,
    tools=[get_weather],
)

result = agent.invoke({"messages": [("user", "What's the weather in Paris?")]})

All three frameworks follow the same shape: define a function, wrap it as a tool, pass it to an agent, and call run. The execution loop underneath handles the model calls, tool execution, and iteration.

What's Missing

This basic loop works, but production agents need more:

  • Streaming: Long tasks need progress updates, not just a final response
  • Memory: Persisting context across sessions
  • Middleware: Logging, rate limiting, safety checks
  • Error handling: Retries, graceful degradation
  • Context management: Summarizing/compacting as context grows
  • End-user interfaces: Integrating agents into web applications
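
As a small taste of that hardening, here's one way you might bound the loop and retry transient API failures (a sketch; the constants and helper name are illustrative, not from the Agent class above):

python
import asyncio

MAX_ITERATIONS = 10   # stop runaway tool-call loops
MAX_RETRIES = 3       # retry transient API failures

async def call_model_with_retry(client, **kwargs):
    """Call the chat completions API with simple exponential backoff."""
    for attempt in range(MAX_RETRIES):
        try:
            return await client.chat.completions.create(**kwargs)
        except Exception:
            if attempt == MAX_RETRIES - 1:
                raise
            await asyncio.sleep(2 ** attempt)

# In Agent.run, swap `while True:` for a bounded loop:
#
#     for _ in range(MAX_ITERATIONS):
#         response = await call_model_with_retry(
#             self.client, model="gpt-5",
#             messages=messages, tools=self.tool_schemas
#         )
#         ...
#     raise RuntimeError("Agent exceeded max iterations")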

These are covered in depth in my book Designing Multi-Agent Systems, which builds a complete agent framework (picoagents) from scratch with all of these features.

Interested in more articles like this? Subscribe to get a monthly roundup of new posts and other interesting ideas at the intersection of Applied AI and HCI.
