Building Your First Intelligent Agent 🧑‍🎓
Welcome back! In the Quick Start you launched an agent in five minutes. This guide slows down and explains every line so you truly grok how MCP-Use works under the hood.
After completing this tutorial you will be able to customise prompts, change models, and reason about agent behaviour.
1. The Big Picture
```
┌───────────────────┐   "Which tool should I use?"   ┌──────────────┐
│  Your LLM (GPT-4) │ ─────────────────────────────► │   MCPAgent   │
└───────────────────┘                                │   (router)   │
      ▲      ▲                                       └──────┬───────┘
      │      │                                              │ executes tool
      │      │  JSON tool calls (function-calling)          ▼
      │      └──────────────────────────────────── ┌──────────────┐
      │             observation/results            │  MCPClient   │
      └─────────────────────────────────────────── └──────────────┘
```
- LLM decides what to do next.
- MCPAgent feeds the LLM with tool schemas and keeps track of steps.
- MCPClient is the async glue that talks to one or more MCP servers.
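Conceptually the three pieces cooperate in a decide → execute → observe loop. Here is a stdlib-only sketch of that control flow; the names (`run_agent`, `llm_decide`) are invented for illustration and are not MCP-Use's actual API:

```python
def run_agent(llm_decide, tools, task, max_steps=5):
    """Minimal decide -> execute -> observe loop, mirroring what MCPAgent automates."""
    history = [("task", task)]
    for _ in range(max_steps):
        action = llm_decide(history)          # the LLM picks a tool call or a final answer
        if action["type"] == "final":
            return action["text"]
        tool = tools[action["tool"]]          # MCPClient would route this to an MCP server
        observation = tool(**action["args"])
        history.append(("observation", observation))
    return "max_steps reached"

# Toy demo: one fake "tool" and a scripted "LLM"
tools = {"get_ip": lambda: "203.0.113.7"}
script = iter([
    {"type": "tool", "tool": "get_ip", "args": {}},
    {"type": "final", "text": "Your IP is 203.0.113.7"},
])
print(run_agent(lambda history: next(script), tools, "What is my IP?"))
# → Your IP is 203.0.113.7
```

In the real library the LLM's decision arrives as a function-calling JSON payload and the observation is the tool's result, but the shape of the loop is the same.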
2. Set Up the Playground
We'll reuse the Playwright browser server because everyone loves web automation. Create `playground.py`:

```python
import asyncio

from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

from mcp_use import MCPAgent, MCPClient

load_dotenv()

CONFIG = {
    "mcpServers": {
        "playwright": {
            "command": "npx",
            "args": ["@playwright/mcp@latest"],
            "env": {"DISPLAY": ":1"},
        }
    }
}


async def main():
    client = MCPClient.from_dict(CONFIG)
    llm = ChatOpenAI(model="gpt-4o", temperature=0)
    agent = MCPAgent(llm=llm, client=client, max_steps=5, verbose=True)
    result = await agent.run("Open https://httpbin.org/ip and tell me my IP address")
    print("Agent final answer:\n", result)
    await client.close_all_sessions()


if __name__ == "__main__":
    asyncio.run(main())
```
Run it with `python playground.py`. Set `DEBUG=1` to see logs.
3. Understanding the Code
3.1 MCPClient.from_dict
We pass a pure-Python dict instead of a JSON file. Under the hood, MCP-Use spawns the server as a child process and talks to it over the MCP transport (stdio for local commands like this one).
Tip: for big configs, load from a file instead: `client = MCPClient.from_config_file("browser_mcp.json")`.
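The file mirrors the Python dict from earlier; a sketch of what `browser_mcp.json` could contain, assuming the same Playwright server:

```json
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": ["@playwright/mcp@latest"],
      "env": { "DISPLAY": ":1" }
    }
  }
}
```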
3.2 LLM Choice
Any model supporting function calling works. For Anthropic you'd use `ChatAnthropic`; everything else is identical.
3.3 Agent parameters
- `max_steps`: a safety valve so the agent doesn't loop forever.
- `verbose=True`: prints every tool decision and LLM thought.
3.4 The Human Loop
Let's map the code to a real task. Imagine you're planning a weekend in Berlin:
- Ask the agent for the cheapest direct flight from your city.
- Open the booking page and screenshot the checkout price.
- Save the PNG locally.
All you'd change is the prompt:
```python
result = await agent.run(
    "Search Skyscanner for a direct flight from Rome to Berlin on the first Friday next month, "
    "open the cheapest option and screenshot the price. Save the file as flight_price.png"
)
```
The LLM will decide to use tools from both the `playwright` browser server and the `filesystem` server (if configured). You get the file on disk with no manual browsing!
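Adding that `filesystem` server is just another entry in the config dict. A sketch: the package name `@modelcontextprotocol/server-filesystem` (the reference MCP filesystem server) and the `./downloads` root are assumptions for illustration, not something this guide prescribes:

```python
CONFIG = {
    "mcpServers": {
        "playwright": {
            "command": "npx",
            "args": ["@playwright/mcp@latest"],
        },
        # Assumed: the reference MCP filesystem server, rooted at ./downloads
        "filesystem": {
            "command": "npx",
            "args": ["-y", "@modelcontextprotocol/server-filesystem", "./downloads"],
        },
    }
}

# Tools from both servers become visible to the agent:
print(sorted(CONFIG["mcpServers"]))  # ['filesystem', 'playwright']
```

`MCPClient.from_dict(CONFIG)` would then spawn both processes, and the agent picks whichever server's tools fit each step.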
4. Customising Behaviour
4.1 Change Temperature
Higher `temperature` increases creativity but may trigger unexpected tool use. Start with 0-0.3 for deterministic automation.
4.2 System Prompt Overrides
```python
agent = MCPAgent(llm=llm, client=client, system_prompt="You are an expert OSINT researcher…")
```
You can inject domain-specific context! The prompt sits before tool schemas.
4.3 Stop Early
```python
result = await agent.run("…", stop_on_final_answer=True)
```

This returns as soon as the LLM emits a normal text answer, even if `max_steps` isn't reached.
5. Cleaning Up
Never forget `await client.close_all_sessions()`: each MCP server is a real process (sometimes a full browser) that must be killed.
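If the run raises partway through, a bare `close_all_sessions()` at the end of `main()` never executes. Wrapping the run in `try`/`finally` guarantees cleanup; a stdlib-only sketch, with a `DummyClient` standing in for `MCPClient` so the pattern runs anywhere:

```python
import asyncio


class DummyClient:
    """Stand-in for MCPClient, just to demonstrate the cleanup pattern."""

    def __init__(self):
        self.closed = False

    async def close_all_sessions(self):
        self.closed = True


async def run_task(client, should_fail=False):
    try:
        if should_fail:
            raise RuntimeError("agent step blew up")
        return "ok"
    finally:
        # Runs whether or not the body raised, so server processes always die.
        await client.close_all_sessions()


client = DummyClient()
print(asyncio.run(run_task(client)))  # ok
print(client.closed)                  # True
```

In your real script, put `agent.run(...)` in the `try` body and `await client.close_all_sessions()` in the `finally` block.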
6. Challenge Yourself 🔥
- Change the code so the agent screenshots the page and saves it locally.
- Switch to Claude 3 Sonnet. What differences do you notice?
- Add a second server (filesystem) and ask the agent to download an image and save it.
Share your creations with the community and give the repo a ⭐ if you enjoyed this deep dive!