Meet MCP: Your LLM’s Super-Helpful Assistant!

Imagine your favorite Large Language Model (LLM) like Gemini is a super-smart brain in a jar. It knows a lot from all the books and websites it read. But what if you ask it about today’s weather, or want it to check your latest emails, or even access a special tool you built? The brain-in-a-jar can’t reach outside on its own!
That’s where something called the Model Context Protocol (MCP) comes in. Think of MCP as a super-polite and efficient messenger service or a universal translator that lets the LLM brain talk to the outside world and use different tools or access fresh information.
For those who'd rather read code than a blob of text, check out the GitHub repository for my project here:
https://github.com/kkrishnan90/Gemini-MCP-CLI
Spoiler Alert!
What is MCP?
MCP, or Model Context Protocol, is like a special set of rules or a secret handshake that different computer programs can use to share information and tools with an LLM. It was developed to create a standard way for AI models (like Gemini or Claude) to connect with external systems — things outside their own training data — while they are running. This means the LLM can get help from other programs (called “servers”) without needing to be completely retrained.
What’s the Point of MCP? (Why a Secret Handshake?)
Why do we need these special rules?
1. Everyone Speaks the Same Language: MCP creates a standard format so the LLM (the “client”) and the helper programs (the “servers”) always understand each other. No more confusion!
2. Playing Nicely Together: It helps different systems integrate smoothly. An LLM using MCP can potentially connect to many different tools built by different people, as long as they all know the MCP handshake.
3. Growing Big and Strong: It’s designed to work for small projects as well as massive systems handling thousands of requests. The community around MCP servers is also growing rapidly, with new tools being added all the time (see, for example, mcp.so and glama.ai).
4. Being Efficient: It helps avoid doing the same work over and over by allowing context (like the history of your chat) to be shared and reused.
5. Plug and Play: Makes it easier to add new tools or data sources without messing up the whole system.
6. Getting Real-Time Info: Allows the LLM to ask for up-to-the-minute data, making its answers more accurate and less dependent on potentially old training data.
How Does MCP Work? (The Messenger Service in Action)
Think of it like this:
1. The LLM Needs Help: You ask the LLM a question it can’t answer alone (e.g., “What’s the weather like in Sacramento?”).
2. Checking the Tool Shed: The LLM, using its MCP client powers, knows about available tools provided by MCP servers (like a “Weather Forecaster” server).
3. Sending a Request: The MCP client sends a polite message (using the MCP protocol) to the right MCP server, asking it to use its tool (e.g., “Please get the forecast for Sacramento”).
4. The Server Does the Work: The Weather Forecaster server gets the request, uses its own tools (like contacting a real weather service API), and figures out the answer.
5. Sending the Answer Back: The server sends the result back to the LLM’s MCP client, again using the standard MCP format.
6. The LLM Responds: The LLM takes the information from the server and gives you a nice, natural language answer (e.g., “The weather in Sacramento is…”).
All this happens smoothly behind the scenes because both sides agreed on the MCP rules!
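If you peek under the hood, those polite messages are JSON-RPC 2.0 payloads. Here is a rough sketch of what the two key messages in the weather example might look like, written out as Python dicts; the method and field names ("tools/call", "content") follow the MCP spec, but the actual values are made up for illustration.

# A simplified sketch of the MCP request/response pair for the weather example.
# Method and field names follow the MCP spec; the payload values are illustrative.
tool_call_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_forecast",
        "arguments": {"latitude": 38.58, "longitude": -121.49},  # roughly Sacramento
    },
}

tool_call_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text", "text": "Tonight: Clear, with a low around 48."}
        ]
    },
}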
MCP vs. Function Calling vs. Tool Calling: What’s the Difference?
You might hear the terms “Function Calling” and “Tool Calling.” Often, these are used interchangeably. They both refer to the general ability of an LLM to realize it needs to use an external tool or API and then generate the necessary information (like the function name and arguments) to call it. The LLM essentially says, “Hey, programmer, I need you to run the `get_weather` function for ‘Boston’!”
Function/Tool Calling (General Concept):
- The LLM generates structured output (often JSON) suggesting a function to be called.
- The developer’s application code receives this, runs the actual function/API call, and sends the result back to the LLM.
- Some LLMs (like newer GPT models and Gemini) have built-in support, making this easier. Others might need clever prompting or fine-tuning.
- Think of it as the LLM asking for a specific tool from the toolbox.
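To make that concrete, here is a minimal sketch of the two halves of that exchange, written as plain Python data structures rather than any particular SDK's classes (the exact types and field names differ between Gemini, OpenAI, and others, so treat these shapes as illustrative):

# 1. The declaration the developer advertises to the LLM (names here are illustrative).
get_weather_declaration = {
    "name": "get_weather",
    "description": "Get the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# 2. The structured output the LLM emits when it decides the tool is needed.
model_tool_request = {"name": "get_weather", "args": {"city": "Boston"}}

# 3. The developer's code actually runs it and feeds the result back to the model.
def run_requested_tool(request: dict, registry: dict) -> str:
    """Dispatch the model's requested call to real application code."""
    return registry[request["name"]](**request["args"])

print(run_requested_tool(model_tool_request, {"get_weather": lambda city: f"Sunny in {city}."}))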
Model Context Protocol (MCP):
- MCP is a specific set of rules (a protocol) for how the LLM (client) and the tool provider (server) communicate.
- It standardizes the entire interaction, including how tools are discovered, how requests are made, how context is managed, and how responses are formatted.
- It aims to make the connection between any MCP-compatible client and any MCP-compatible server seamless.
- Think of it as not just asking for a tool, but having a standardized form to fill out for the request, a specific way the tool reports back, and a system for keeping track of all the requests.
While you might use Gemini’s function calling within an MCP client application, MCP itself is the broader communication framework between that client and the external MCP server providing the tools.
Let’s Build! An MCP Example with Gemini
The MCP Server (e.g., `weather.py`)
We’ll use Python and the `mcp` library. This example provides weather tools.
from typing import Any
import sys
import httpx
from mcp.server.fastmcp import FastMCP

# Initialize FastMCP server
mcp = FastMCP("weather")  # Give our server a name

# Helper function to call a real weather API (details omitted)
async def make_nws_request(url: str) -> dict[str, Any] | None:
    # ... (code to fetch data from api.weather.gov) ...
    pass

# Define a tool the LLM can use
@mcp.tool()
async def get_forecast(latitude: float, longitude: float) -> str:
    """Get weather forecast for a location."""
    points_url = f"https://api.weather.gov/points/{latitude},{longitude}"
    points_data = await make_nws_request(points_url)
    if not points_data:
        return "Unable to fetch forecast data."
    # ... (code to get and format forecast) ...
    forecast_url = points_data["properties"]["forecast"]
    forecast_data = await make_nws_request(forecast_url)
    if not forecast_data:
        return "Unable to fetch detailed forecast."
    # Format the first few periods nicely
    periods = forecast_data["properties"]["periods"][:2]  # Just get a couple
    forecasts = [f"{p['name']}: {p['detailedForecast']}" for p in periods]
    return "\n".join(forecasts)

# Define another tool
@mcp.tool()
async def get_alerts(state: str) -> str:
    """Get weather alerts for a US state (e.g., CA, NY)."""
    # ... (code to fetch and format alerts using make_nws_request) ...
    return "Fetched alerts (details omitted for brevity)"

if __name__ == "__main__":
    # Log to stderr: stdout is reserved for the stdio transport's JSON-RPC messages
    print("Weather MCP Server starting...", file=sys.stderr)
    # Run the server, listening via standard input/output
    mcp.run(transport='stdio')
    print("Weather MCP Server stopped.", file=sys.stderr)
The MCP Client (e.g., `client.py`)
This program connects to the MCP server and uses an LLM (like Gemini) to decide when to call the server’s tools. The client here is a simple CLI (command-line interface).
import asyncio
import sys
from contextlib import AsyncExitStack

from google import genai  # Assuming the Gemini library is installed
from google.genai import types as genai_types

# MCP imports
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Configure the Gemini client (replace with your API key setup)
# client = genai.Client(api_key='YOUR_GEMINI_API_KEY')
# For this example, we'll simulate the client for structure

class MockGeminiClient:
    def generate_content(self, contents, tools):
        print(f"SIMULATE GEMINI: Processing '{contents}' with tools: {[t['function']['name'] for t in tools]}")
        # In a real scenario, Gemini might respond with a function call request
        if "weather" in contents and "Boston" in contents:
            # Simulate Gemini deciding to call get_forecast
            return MockGeminiResponse(is_function_call=True, name="get_forecast", args={'latitude': 42.36, 'longitude': -71.05})
        else:
            return MockGeminiResponse(text=f"I received '{contents}'.")

class MockGeminiResponse:
    def __init__(self, text=None, is_function_call=False, name=None, args=None):
        self.text = text
        self._is_function_call = is_function_call
        self._name = name
        self._args = args
        # Simulate the structure the Gemini SDK might provide
        self.candidates = [self] if text else []
        self.parts = [self] if text else []
        self.function_calls = [self] if is_function_call else []

    @property
    def function_call(self):  # Simulate Gemini's function call structure
        if self._is_function_call:
            return self
        return None

    @property
    def name(self):
        return self._name

    @property
    def args(self):
        return self._args

gemini_client = MockGeminiClient()  # Use the mock for this example

class MCPClient:
    def __init__(self):
        self.session: ClientSession | None = None
        self.exit_stack = AsyncExitStack()
        self.mcp_tools = []  # Store MCP tool definitions

    async def connect_to_server(self, server_script_path: str):
        server_params = StdioServerParameters(command="python", args=[server_script_path])
        stdio_transport = await self.exit_stack.enter_async_context(stdio_client(server_params))
        self.stdio, self.write = stdio_transport
        self.session = await self.exit_stack.enter_async_context(ClientSession(self.stdio, self.write))
        await self.session.initialize()
        response = await self.session.list_tools()  # Ask the MCP server for its tools
        self.mcp_tools = response.tools
        print(f"\nConnected to MCP server with tools: {[tool.name for tool in self.mcp_tools]}")

    async def process_query_with_gemini(self, query: str) -> str:
        if not self.session:
            return "Not connected to MCP server."
        # Prepare tool definitions for Gemini
        gemini_tool_defs = []
        for tool in self.mcp_tools:
            # Convert the MCP tool schema to Gemini's FunctionDeclaration format (simplified)
            parameters = {
                "type": "OBJECT",
                "properties": tool.inputSchema.get("properties", {}),
                "required": tool.inputSchema.get("required", []),
            }
            gemini_tool_defs.append(
                # Using a dict structure similar to genai.types.FunctionDeclaration
                {
                    "type": "function",  # Mimic a common function-calling wrapper structure
                    "function": {
                        "name": tool.name,
                        "description": tool.description,
                        "parameters": parameters,
                    },
                }
            )

        # === Interaction 1: User query to Gemini ===
        print("\n>>> Sending to Gemini...")
        response = gemini_client.generate_content(contents=query, tools=gemini_tool_defs)

        if response.function_calls:  # Check if Gemini wants to call a function
            func_call = response.function_calls[0].function_call  # Get the first requested call
            tool_name = func_call.name
            tool_args = func_call.args
            print(f"\n<<< Gemini requested tool: {tool_name} with args: {tool_args}")

            # === Interaction 2: Call the MCP server ===
            print(f">>> Calling MCP server tool: {tool_name}...")
            try:
                mcp_result = await self.session.call_tool(tool_name, tool_args)
                tool_result_content = mcp_result.content
                print(f"<<< MCP server response: {str(tool_result_content)[:100]}...")  # Print truncated result

                # === Interaction 3: Send the tool result back to Gemini ===
                print(">>> Sending MCP result back to Gemini...")
                # We need to construct the conversation history for Gemini, including the tool response.
                # This requires careful handling of message history and roles as per the Gemini docs.
                # Simplified: just send the result back in a new call (a real app needs history).
                # Prepare a function response part for Gemini; this must follow Gemini's exact format:
                # func_response_part = genai_types.Part.from_function_response(name=tool_name, response={'result': tool_result_content})
                # A full implementation would build a proper Content list including the user query,
                # the initial assistant response (with the tool call), and the tool response part.
                # Simulate sending the result back; Gemini would generate the final text.
                final_response = gemini_client.generate_content(
                    contents=f"Result for {tool_name}: {tool_result_content}",  # Simplified context
                    tools=gemini_tool_defs,
                )
                return final_response.text
            except Exception as e:
                print(f"Error calling MCP tool: {e}")
                return f"Sorry, I couldn't use the tool {tool_name}. Error: {e}"
        else:
            # Gemini responded directly
            print("<<< Gemini responded directly.")
            return response.text

    async def chat_loop(self):
        print("\nMCP Client with Gemini Started! Type 'quit' to exit.")
        while True:
            query = input("\nQuery: ").strip()
            if query.lower() == 'quit':
                break
            response = await self.process_query_with_gemini(query)
            print(f"\nFinal Answer: {response}")

    async def cleanup(self):
        await self.exit_stack.aclose()

async def main():
    if len(sys.argv) < 2:
        print("Usage: python client.py <path_to_mcp_server_script.py>")
        sys.exit(1)
    client = MCPClient()
    try:
        await client.connect_to_server(sys.argv[1])
        await client.chat_loop()
    finally:
        await client.cleanup()

if __name__ == "__main__":
    # Make sure to install the necessary libraries: pip install google-genai mcp httpx
    # You'll also need to set up your Gemini API key
    asyncio.run(main())
Note: The Gemini integration part in the client is simplified. A real application would need to manage the conversation history correctly according to Gemini’s API requirements for multi-turn chat with function calling.
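For the curious, here is roughly what that multi-turn wiring could look like with the real `google-genai` client instead of the mock. This is an assumption-heavy outline, not tested code; in particular, double-check the role used for the tool-result turn and the model name against the current Gemini documentation.

# Outline only: threading the MCP tool result back into a real Gemini conversation.
# Verify role names, helpers, and the model name against the current google-genai docs.
from google import genai
from google.genai import types as genai_types

client = genai.Client()  # Assumes an API key is configured in your environment

first_response = ...       # The response from the first generate_content call (contained the function call)
tool_result_content = ...  # Whatever session.call_tool(...) returned

contents = [
    # Turn 1: the user's question
    genai_types.Content(role="user", parts=[genai_types.Part.from_text(text="What's the weather like in Boston?")]),
    # Turn 2: the model's reply that requested the function call
    first_response.candidates[0].content,
    # Turn 3: the tool result, wrapped as a function response part
    genai_types.Content(
        role="user",  # Some examples use a dedicated tool role here; check the docs
        parts=[genai_types.Part.from_function_response(name="get_forecast", response={"result": tool_result_content})],
    ),
]

final = client.models.generate_content(model="gemini-2.0-flash", contents=contents)  # Example model name
print(final.text)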
Why Use MCP? (The Benefits)
So, why go through the trouble of this MCP protocol?
- Standardization: It creates a common ground for different tools and AIs to talk.
- Interoperability: Build a tool server once, and potentially many different AI clients can use it. Build an AI client once, and it can potentially use many different tool servers.
- Modularity: Keep your tools separate from your AI model logic. Update one without breaking the other.
- Capability: Allows LLMs to access real-time data and perform actions beyond their internal knowledge.
Components of MCP (The Building Blocks)
MCP has a few key concepts (a small server sketch using a few of them follows this list):
- Resources: Think of these as files or data blobs the LLM can read (like the content of a webpage or data from an API call).
- Tools: These are like functions the LLM can ask the server to run (like `get_forecast` or `send_email`).
- Prompts: Pre-written templates that can help guide the LLM or the user to achieve specific tasks using the available tools/resources.
- Transports: The communication channel used, like `stdio` (standard input/output, good for local testing) or potentially HTTP for network communication.
- Roots: URIs the client shares with a server to tell it which locations (for example, project folders) it should focus on and operate within.
- Sampling: A way for a server to ask the client’s LLM for a completion on its behalf, so servers can use the model while the user and client stay in control.
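Tools tend to get all the attention, but the same FastMCP server can expose resources and prompts too. A small sketch, using the `@mcp.resource` and `@mcp.prompt` decorators from the Python MCP SDK (the URI and the prompt wording are made up for illustration):

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("weather")

# A resource: read-only data the client can pull into the LLM's context
@mcp.resource("weather://supported-states")
def supported_states() -> str:
    """States this server has alert data for."""
    return "CA, NY, TX, WA"

# A prompt: a reusable template that nudges the LLM toward the right tools
@mcp.prompt()
def forecast_briefing(city: str) -> str:
    """Template for a short weather briefing."""
    return f"Give me a two-sentence weather briefing for {city}, using the get_forecast tool."

# Tools like get_forecast and get_alerts from earlier round out the picture
if __name__ == "__main__":
    mcp.run(transport="stdio")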
The Untold Gotchas (Watch Out!)
While powerful, using protocols like MCP or extensive tool calling has things to consider:
- Security: If an LLM can call tools that perform actions (like sending emails or modifying files), you must have strong security. Who is allowed to use which tools? How do you prevent misuse? MCP aims to help with secure access, but careful implementation is crucial. User approval steps might be needed.
- Scalability & Performance: What happens when thousands of users are asking the LLM to call dozens of different tools? Servers need to handle the load. Too many back-and-forth calls can slow things down (latency). MCP includes features like parallel processing and caching to help mitigate this, aiming for high throughput and low latency. There’s a hidden problem with MCP tool calls if you know what I mean. Want to learn more? Wait for my next one ;)
- Complexity: Managing many servers, tools, and their dependencies can become complex. Debugging issues across the client, the protocol, and the server requires good tools and practices.
- Reliability: If an external tool or API fails, the LLM needs to handle that gracefully. The client or middleware needs robust error handling (see the sketch after this list).
- Hallucination (Less with MCP, but still relevant): While MCP standardizes the call, the LLM still needs to correctly decide which tool to call and with what arguments. General tool calling approaches can sometimes see LLMs invent tool calls. Embedded or protocol-based approaches like MCP aim to reduce this risk.
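On the client side, a small amount of defensive wrapping goes a long way toward the security and reliability points above. Here is a hedged sketch around the `session.call_tool` call from earlier; the approval list and timeout are illustrative policy choices, not part of MCP itself.

import asyncio

# Tools that change state should need an explicit human OK before they run (illustrative list).
TOOLS_REQUIRING_APPROVAL = {"send_email", "delete_file"}

async def call_tool_safely(session, tool_name: str, tool_args: dict, timeout_s: float = 20.0):
    """Wrap session.call_tool with a user-approval gate, a timeout, and graceful failure."""
    if tool_name in TOOLS_REQUIRING_APPROVAL:
        answer = input(f"Allow the model to run '{tool_name}' with {tool_args}? [y/N] ")
        if answer.strip().lower() != "y":
            return "Tool call declined by the user."
    try:
        result = await asyncio.wait_for(session.call_tool(tool_name, tool_args), timeout=timeout_s)
        return result.content
    except asyncio.TimeoutError:
        return f"The '{tool_name}' tool timed out after {timeout_s} seconds."
    except Exception as exc:
        # Surface a readable error so the LLM (or the user) can recover gracefully
        return f"The '{tool_name}' tool failed: {exc}"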
The Future is Action! (MCP and Tool Calling Ahead)
The trend is clear: we want LLMs to do things, not just talk about them. Function/Tool calling is a massive step in this direction.
We’ll likely see:
- More Sophisticated Tools: LLMs interacting with complex software, databases, and even physical devices (IoT).
- Better Integration: Smoother ways to connect LLMs to existing enterprise systems and APIs.
- Smarter Agents: LLMs that can plan and execute multi-step tasks using multiple tools in sequence.
- Improved Frameworks: More robust ways to manage tool execution, possibly using “embedded” approaches where a library handles the calls reliably, reducing errors and latency.
- Standardization Efforts: Protocols like MCP could become more important to ensure tools and AIs from different creators can work together seamlessly.
Want to run your own Claude Desktop-like experience with Gemini?
Wait no more! Check out the GitHub link below!