From Push to Pull: Giving My AI Agents Agency
Standard AI prompts often 'push' data into the model. Here's why I moved to a 'pull' model, giving my agents the tools to fetch their own context.
When I first started building my AI meal planner, the architecture was straightforward. The orchestrator (the "brain") would look at the user's request, search the database for a few relevant recipes, and then push all that data into the AI's prompt.
It worked... until it didn't.
I soon realized that by "pushing" the context, I was limiting the AI's intelligence to my own orchestrator's initial search. If the user asked for something specific that my first search missed, the agent was stuck.
Here is why I am moving to a Pull model, and how it transforms my agents from "prompt-fillers" into true autonomous workers.
My Personal Lab: Learning to Build Agents
Beyond just the technical friction, there was a deeper driver for this change: I wanted to learn.
This entire meal planner project is a "lab" for me. It's where I experiment with how to actually build autonomous agents, how to design tool definitions that an LLM can understand, and how to manage the multi-turn loops that make these agents "think." Moving to a Pull model was the perfect excuse to dive deep into these techniques.
The Problem with the "Push" Model
In a Push model, the orchestrator has to be perfect. It has to anticipate everything the AI might need.
```go
// The old "Push" way
func (p *Planner) GeneratePlan(ctx context.Context, request string) (string, error) {
	// 1. The orchestrator guesses what the AI needs
	recipes, err := p.searchRecipes(request)
	if err != nil {
		return "", err
	}
	// 2. The data is "pushed" into the prompt
	prompt := fmt.Sprintf("Here are recipes: %v. Plan a meal.", recipes)
	return p.llm.GenerateText(ctx, prompt)
}
```
This creates several frictions:
- Context Bloat: You often push too much data "just in case," wasting tokens.
- Brittle Search: If the initial search for "chicken" misses a "poultry" recipe the user would have loved, the AI can't fix it.
- Low IQ: The agent isn't "thinking"; it's just summarizing the data you gave it.
Enabling the Conversation: Stateless to Stateful
Before I could implement the "Pull" loop, I had to fix a fundamental flaw in my Go LLM client. My initial implementation was too simple: it was essentially a "stateless" function that took a string and returned a string.
```go
// The original, limited interface
type TextGenerator interface {
	GenerateText(ctx context.Context, prompt string) (string, error)
}
```
To support tool calling, the client needed to understand State. It needed to handle a sequence of messages where the agent can say, "I want to call this tool," and the system can reply with, "Here is the result."
I refactored the client around a unified Conversation type (a slice of Message structs) and replaced the old method with a richer GenerateContent signature.
```go
// The refactored, stateful interface
type Message struct {
	Role    string
	Content string
	// Support for tool-calling state
	ToolCalls []ToolCall
}

type Conversation []Message

type TextGenerator interface {
	GenerateContent(
		ctx context.Context,
		conversation Conversation,
		tools []Tool,
	) (ContentResponse, error)
}
```
This shift from "stateless text" to "stateful conversation" was the structural foundation that made everything else possible.
The Solution: The "Pull" Model (Tooling)
In a Pull model, you give the AI agency. Instead of giving it the data, you give it the tool to find the data.
I defined a `search_recipes` tool and updated the `Analyst` agent to use it. Now the agent starts with minimal context and a clear instruction: "If you need more recipes to satisfy the user's request, use the `search_recipes` tool."
The "Hybrid" Compromise
Pure "Pull" can be slow, since every tool call costs an extra LLM round-trip. To keep the flow responsive, I implemented a hybrid approach:
- Push a small, high-confidence set of recipes (the "best guesses").
- Pull more recipes autonomously if those aren't enough.
```go
// The new "Pull" tool definition
var searchRecipesTool = llm.Tool{
	Name:        "search_recipes",
	Description: "Search for recipes based on a query to find meals that fit the user's requirements.",
	Parameters: llm.ToolParameters{
		Type: llm.ParameterTypeObject,
		Properties: map[string]llm.Property{
			"query": {Type: llm.PropertyTypeString},
		},
		Required: []string{"query"},
	},
}
```
The "How": The Autonomous Loop
With the stateful client in place, the agent can now run in a loop. Instead of one single trip to the LLM, we iterate until the agent is satisfied.
```go
// The engine of our "Pull" model
func (a *Analyst) executeAnalystLoop(ctx context.Context, chat llm.Conversation) error {
	for {
		// 1. Ask the LLM (providing the available tools)
		resp, err := a.llm.GenerateContent(ctx, chat, tools)
		if err != nil {
			return err
		}
		// 2. Add the response to the conversation history
		chat = append(chat, resp.Message)
		// 3. Stop if it's a final answer, not a tool call
		if !resp.Message.IsAToolCall() {
			return nil
		}
		// 4. If it IS a tool call, execute the search
		toolCall := resp.Message.ToolCalls[0]
		_, toolMsg, err := a.handleSearchTool(ctx, toolCall)
		if err != nil {
			return err
		}
		// 5. Add the tool result back to the chat and loop!
		// This is where the "Pull" happens: the next iteration
		// now has the new recipes in its context.
		chat = append(chat, toolMsg)
	}
}
```
Testing and Evals: Confirming the Theory
The hope is that this model will significantly improve the agent's ability to handle complex, nuanced requests. For instance, if a user asks for "high-protein meals, but no spicy food," a push model might fail if the initial search is overwhelmed by popular spicy high-protein recipes. With a tool, the agent can recognize the conflict and refine its own search.
To prove this, I'm currently adding eval tests. I'm building a suite of "hard" scenarios to quantitatively measure how often the agent successfully uses its tools to resolve constraints.
Measuring Agency with Metrics
You can't manage what you don't measure. Giving an agent the "keys to the kingdom" (the ability to call tools) means you need to watch its costs and performance very closely.
As part of this refactor, I've added metrics to track exactly what happens during these autonomous loops. I defined a shared metadata structure to capture the operation of every agent execution:
```go
// Shared telemetry for our agents
type ToolCallMeta struct {
	ToolName string        `json:"tool_name"`
	Latency  time.Duration `json:"latency"`
	Input    any           `json:"input"`
}

type AgentMeta struct {
	AgentName string         // Which agent ran
	Usage     TokenUsage     // Tokens, model, etc.
	Latency   time.Duration  // Total execution time
	ToolCalls []ToolCallMeta // Detailed tool usage
}
```
By recording these in my SQLite database, I can look back at real usage and answer critical questions: How many times does the agent actually feel it needs more data? Is the latency hit worth it? Is it getting stuck in loops?
Reflections on Agency
Moving from Push to Pull is more than just a technical refactor; it's a shift in how you view AI. You aren't just "programming" a model with a prompt; you are designing a worker and giving it the tools it needs to succeed.
This transition is the first step toward true autonomy in my system. But as I'm discovering, giving an agent more agency also means you need better ways to test and decouple that agency—a story for the next post.