LLM Agent

The LlmAgent (often aliased simply as Agent) is a core component in ADK, acting as the "thinking" part of your application. It leverages the power of a Large Language Model (LLM) for reasoning, understanding natural language, making decisions, generating responses, and interacting with tools.

Unlike deterministic Workflow Agents that follow predefined execution paths, LlmAgent behavior is non-deterministic. It uses the LLM to interpret instructions and context, deciding dynamically how to proceed, which tools to use (if any), or whether to transfer control to another agent.

Building an effective LlmAgent involves defining its identity, clearly guiding its behavior through instructions, and equipping it with the necessary tools and capabilities.

Defining the Agent's Identity and Purpose

First, you need to establish what the agent is and what it's for.

  • name (Required): Every agent needs a unique string identifier. This name is crucial for internal operations, especially in multi-agent systems where agents need to refer to or delegate tasks to each other. Choose a descriptive name that reflects the agent's function (e.g., customer_support_router, billing_inquiry_agent). Avoid reserved names like user.

  • description (Optional, Recommended for Multi-Agent): Provide a concise summary of the agent's capabilities. This description is primarily used by other LLM agents to determine if they should route a task to this agent. Make it specific enough to differentiate it from peers (e.g., "Handles inquiries about current billing statements," not just "Billing agent").

  • model (Required): Specify the underlying LLM that will power this agent's reasoning. This is a string identifier like "gemini-2.0-flash". The choice of model impacts the agent's capabilities, cost, and performance. See the Models page for available options and considerations.

# Example: Defining the basic identity
from google.adk.agents import LlmAgent

capital_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="capital_agent",
    description="Answers user questions about the capital city of a given country."
    # instruction and tools will be added next
)

Guiding the Agent: Instructions (instruction)

The instruction parameter is arguably the most critical for shaping an LlmAgent's behavior. It's a string (or a function returning a string) that tells the agent:

  • Its core task or goal.
  • Its personality or persona (e.g., "You are a helpful assistant," "You are a witty pirate").
  • Constraints on its behavior (e.g., "Only answer questions about X," "Never reveal Y").
  • How and when to use its tools. You should explain the purpose of each tool and the circumstances under which it should be called, supplementing any descriptions within the tool itself.
  • The desired format for its output (e.g., "Respond in JSON," "Provide a bulleted list").

Tips for Effective Instructions:

  • Be Clear and Specific: Avoid ambiguity. Clearly state the desired actions and outcomes.
  • Use Markdown: Improve readability for complex instructions using headings, lists, etc.
  • Provide Examples (Few-Shot): For complex tasks or specific output formats, include examples directly in the instruction.
  • Guide Tool Use: Don't just list tools; explain when and why the agent should use them.

# Example: Adding instructions
capital_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="capital_agent",
    description="Answers user questions about the capital city of a given country.",
    instruction="""You are an agent that provides the capital city of a country.
When a user asks for the capital of a country:
1. Identify the country name from the user's query.
2. Use the `get_capital_city` tool to find the capital.
3. Respond clearly to the user, stating the capital city.
Example Query: "What's the capital of France?"
Example Response: "The capital of France is Paris."
""",
    # tools will be added next
)
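
As noted above, instruction can also be a function that returns the instruction string, which lets you build the prompt dynamically at call time. A minimal sketch, assuming the provider receives a ReadonlyContext as in recent ADK releases:

from google.adk.agents.readonly_context import ReadonlyContext  # assumed import path

def capital_instruction(ctx: ReadonlyContext) -> str:
    # Built at call time; ctx.state could be inspected here to tailor
    # the prompt to the current session.
    return "You are an agent that provides the capital city of a country."

dynamic_capital_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="dynamic_capital_agent",
    instruction=capital_instruction,  # a callable instead of a string
)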

(Note: For instructions that apply to all agents in a system, consider using global_instruction on the root agent, detailed further in the Multi-Agents section.)

Equipping the Agent: Tools (tools)

Tools give your LlmAgent capabilities beyond the LLM's built-in knowledge or reasoning. They allow the agent to interact with the outside world, perform calculations, fetch real-time data, or execute specific actions.

  • tools (Optional): Provide a list of tools the agent can use. Each item in the list can be:
    • A Python function (automatically wrapped as a FunctionTool).
    • An instance of a class inheriting from BaseTool.
    • An instance of another agent (AgentTool, enabling agent-to-agent delegation; see Multi-Agents). These wrapper forms are sketched after the example below.

The LLM uses the function/tool names, descriptions (from docstrings or the description field), and parameter schemas to decide which tool to call based on the conversation and its instructions.

# Define a tool function
def get_capital_city(country: str) -> str:
    """Retrieves the capital city for a given country."""
    # Replace with actual logic (e.g., API call, database lookup)
    capitals = {"france": "Paris", "japan": "Tokyo", "canada": "Ottawa"}
    return capitals.get(country.lower(), f"Sorry, I don't know the capital of {country}.")

# Add the tool to the agent
capital_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="capital_agent",
    description="Answers user questions about the capital city of a given country.",
    instruction="""You are an agent that provides the capital city of a country... (previous instruction text)""",
    tools=[get_capital_city] # Provide the function directly
)
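
The example above passes a plain function. For reference, the other two forms from the list are sketched below, reusing get_capital_city from above; the FunctionTool and AgentTool import paths and keywords are assumptions based on the current ADK layout:

from google.adk.tools import FunctionTool
from google.adk.tools.agent_tool import AgentTool  # assumed import path

# A peer agent to delegate to (illustrative only)
billing_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="billing_agent",
    description="Handles inquiries about current billing statements.",
)

explicit_tool = FunctionTool(func=get_capital_city)  # equivalent to passing the function directly
delegate_tool = AgentTool(agent=billing_agent)       # wraps another agent as a callable tool

agent_with_wrapped_tools = LlmAgent(
    model="gemini-2.0-flash",
    name="capital_agent_wrapped",
    tools=[explicit_tool, delegate_tool],
)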

Learn more about Tools in the Tools section.

Advanced Configuration & Control

Beyond the core parameters, LlmAgent offers several options for finer control:

Fine-Tuning LLM Generation (generate_content_config)

You can adjust how the underlying LLM generates responses using generate_content_config.

  • generate_content_config (Optional): Pass an instance of google.genai.types.GenerateContentConfig to control parameters like temperature (randomness), max_output_tokens (response length), top_p, top_k, and safety settings.

    from google.genai import types
    
    agent = LlmAgent(
        # ... other params
        generate_content_config=types.GenerateContentConfig(
            temperature=0.2, # More deterministic output
            max_output_tokens=250
        )
    )
    

Structuring Data (input_schema, output_schema, output_key)

For scenarios requiring structured data exchange, you can use Pydantic models.

  • input_schema (Optional): Define a Pydantic BaseModel class representing the expected input structure. If set, the user message content passed to this agent must be a JSON string conforming to this schema. Your instructions should guide the user or preceding agent accordingly. (A sketch appears after the example below.)

  • output_schema (Optional): Define a Pydantic BaseModel class representing the desired output structure. If set, the agent's final response must be a JSON string conforming to this schema.

    • Constraint: Using output_schema enables controlled generation within the LLM but disables the agent's ability to use tools or transfer control to other agents. Your instructions must guide the LLM to produce JSON matching the schema directly.
  • output_key (Optional): Provide a string key. If set, the text content of the agent's final response will be automatically saved to the session's state dictionary under this key (e.g., session.state[output_key] = agent_response_text). This is useful for passing results between agents or steps in a workflow.

from pydantic import BaseModel, Field

class CapitalOutput(BaseModel):
    capital: str = Field(description="The capital of the country.")

structured_capital_agent = LlmAgent(
    # ... name, model, description
    instruction="""You are a Capital Information Agent. Given a country, respond ONLY with a JSON object containing the capital. Format: {"capital": "capital_name"}""",
    output_schema=CapitalOutput, # Enforce JSON output
    output_key="found_capital"  # Store result in state['found_capital']
    # Cannot use tools=[get_capital_city] effectively here
)
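
For the input side, a hedged sketch of input_schema, assuming the caller sends a JSON string such as {"country": "France"}:

class CountryInput(BaseModel):
    country: str = Field(description="The country to look up.")

structured_input_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="structured_input_agent",
    instruction="Read the country from the input JSON and state its capital.",
    input_schema=CountryInput,  # incoming message must be JSON matching this model
)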

Managing Context (include_contents)

Control whether the agent receives the prior conversation history.

  • include_contents (Optional, Default: 'default'): Determines if the contents (history) are sent to the LLM.

    • 'default': The agent receives the relevant conversation history.
    • 'none': The agent receives no prior contents. It operates based solely on its current instruction and any input provided in the current turn (useful for stateless tasks or enforcing specific contexts).

    stateless_agent = LlmAgent(
        # ... other params
        include_contents='none'
    )
    

Planning & Code Execution

For more complex reasoning that involves multiple steps or code execution:

  • planner (Optional): Assign a BasePlanner instance to enable multi-step reasoning and planning before execution. (See Multi-Agents patterns).
  • code_executor (Optional): Provide a BaseCodeExecutor instance to allow the agent to execute code blocks (e.g., Python) found in the LLM's response. (See Tools/Built-in tools).
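
A minimal sketch of wiring these in; the PlanReActPlanner and BuiltInCodeExecutor names and import paths are assumptions based on recent ADK releases, so check your installed version:

from google.adk.planners import PlanReActPlanner           # assumed import path
from google.adk.code_executors import BuiltInCodeExecutor  # assumed import path

reasoning_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="reasoning_agent",
    instruction="Plan your approach first, then answer. Write Python code when calculation helps.",
    planner=PlanReActPlanner(),           # reason/plan before acting
    code_executor=BuiltInCodeExecutor(),  # run code blocks found in the model's response
)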

Putting It Together: Example

Here's the complete basic capital_agent:

# Full example code for the basic capital agent
from google.adk.agents import LlmAgent

# Tool: look up the capital for a given country
def get_capital_city(country: str) -> str:
    """Retrieves the capital city for a given country."""
    # Replace with actual logic (e.g., API call, database lookup)
    capitals = {"france": "Paris", "japan": "Tokyo", "canada": "Ottawa"}
    return capitals.get(country.lower(), f"Sorry, I don't know the capital of {country}.")

capital_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="capital_agent",
    description="Answers user questions about the capital city of a given country.",
    instruction="""You are an agent that provides the capital city of a country.
When a user asks for the capital of a country:
1. Identify the country name from the user's query.
2. Use the `get_capital_city` tool to find the capital.
3. Respond clearly to the user, stating the capital city.
Example Query: "What's the capital of France?"
Example Response: "The capital of France is Paris."
""",
    tools=[get_capital_city],
)
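
To actually run the agent you would pair it with a Runner and a session service. A hedged sketch, assuming the Runner and InMemorySessionService APIs of recent ADK releases (create_session is async there; older versions may differ):

import asyncio
from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

async def main():
    session_service = InMemorySessionService()
    await session_service.create_session(
        app_name="capital_app", user_id="user_1", session_id="session_1"
    )
    runner = Runner(agent=capital_agent, app_name="capital_app", session_service=session_service)
    message = types.Content(role="user", parts=[types.Part(text="What is the capital of France?")])
    async for event in runner.run_async(user_id="user_1", session_id="session_1", new_message=message):
        if event.is_final_response():
            print(event.content.parts[0].text)

asyncio.run(main())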

(This example demonstrates the core concepts. More complex agents might incorporate schemas, context control, planning, etc.)

While this page covers the core configuration of LlmAgent, several related concepts provide more advanced control and are detailed elsewhere:

  • Callbacks: Intercepting execution points (before/after model calls, before/after tool calls) using before_model_callback, after_model_callback, etc. See Callbacks.
  • Multi-Agent Control: Advanced strategies for agent interaction, including planning (planner), controlling agent transfer (disallow_transfer_to_parent, disallow_transfer_to_peers), and system-wide instructions (global_instruction). See Multi-Agents.
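
As a taste of the callback mechanism, here is a hedged sketch of a before_model_callback; the signature and import paths are assumptions based on recent ADK releases, and returning None simply lets the model call proceed:

from typing import Optional
from google.adk.agents.callback_context import CallbackContext  # assumed import path
from google.adk.models import LlmRequest, LlmResponse           # assumed import path

def log_before_model(
    callback_context: CallbackContext, llm_request: LlmRequest
) -> Optional[LlmResponse]:
    # Inspect or log the outgoing request; return an LlmResponse to
    # short-circuit the model call, or None to continue normally.
    print(f"About to call the model for agent: {callback_context.agent_name}")
    return None

guarded_agent = LlmAgent(
    model="gemini-2.0-flash",
    name="guarded_capital_agent",
    before_model_callback=log_before_model,
)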