
How many ‘R’s are in Strawberry? Achieving AGI with MCP and Gemini CLI

6 min read · Sep 3, 2025

I am now seeking $500M for my AGI startup.

It was a long-running meme that these state-of-the-art large language models could not accurately answer that question. It’s a question a three-year-old can answer, and they’re hardly going to be smart enough to take my job.

What’s going on? A large language model isn’t really smart. It is only capable of matching patterns of language through tokens. A compound word like “strawberry” gets split into tokens such as “straw” + “berry”, so the model never sees the individual letters, which makes an LLM bad at this kind of task.

Perhaps you could use an increasingly complex reasoning model just to do this, but you have to wonder whether that actually makes sense. If I want to do matrix multiplication, I can do it all on a CPU instead of a GPU. It will take a long time and be horribly inefficient, but I can do it.

It also seems inefficient to spend a minute of compute performing an action that can be done in other systems in microseconds with a single line of code.

While early LLMs were just acting on their own, the recent push into “Agentic AI” pairs LLMs with third-party tools that can act as domain experts. In this case, an expert in the domain of counting letters.

Developing MCP Tools

To do this, I followed the guidance of the Currency Agent codelab, which provides the necessary tooling out of the box. With the FastMCP library, all of the backend coding and interfacing is done for you.

My MCP tool is a Python function tagged with @mcp.tool(). From there, you can do basically whatever Python you want. In an agent-to-agent system, you could call other agents in a potentially endless chain.
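For context, that decorator assumes a FastMCP server instance (and a logger) already exist in the file. Here is a minimal sketch of that scaffolding, assuming the standalone FastMCP package; the server name, port, and transport are my own placeholders, and the codelab’s actual setup may differ:

import logging

from fastmcp import FastMCP

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

# The server instance that @mcp.tool() registers tools onto.
mcp = FastMCP("counting-tools")

# ... tool definitions like get_letter_count go here ...

if __name__ == "__main__":
    # Serve over streamable HTTP so clients can reach it at http://127.0.0.1:8080/mcp/
    mcp.run(transport="http", host="127.0.0.1", port=8080)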

# `mcp` (the FastMCP server instance) and `logger` are set up earlier in the
# server, as in the sketch above.
@mcp.tool()
def get_letter_count(
    word: str = "blueberry",
    letter: str = "b",
):
    """Use this to get the number of letters in a provided word.

    Args:
        word: The word to get the letter count from (eg., "blueberry").
        letter: The letter you are looking for (eg., "b")

    Returns:
        An integer showing the number of the given letter in the given word.
    """
    logger.info(
        f"--- 🛠️ Tool: get_letter_count called for getting # of {letter} from {word} ---"
    )
    return {"count": word.count(letter)}

You can see there is a lot of text here and very little code. It’s actually trivial to count the occurrences of a letter in a word in Python, which makes it even sillier to spend a minute of compute on it.
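For reference, the counting itself is just the built-in str.count method:

# Counting occurrences of a letter in a word is a one-liner.
"strawberry".count("r")  # 3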

Most of this tool is in the description. This is read by the LLM and used to figure out when the tool should be called, what parameters should be passed in, and how to interpret the response. In many ways, this reminds me a lot of how Dialogflow works, with its slot filling and intent handlers.

Actually Running It

Once I have my tool, I can run it on a local server. To use it from the Gemini CLI, I can open ~/.gemini/settings.json and add this local server under the mcpServers object. In theory I could add a virtually endless list of tools, though there would probably be discovery issues at some point.

"my-mcp-server": {  
"httpUrl": "http://127.0.0.1:8080/mcp/"
}

When Gemini CLI is loaded, I can then view all my available MCP servers and tools by running /mcp list:

When I make a query that could potentially run in one of my tools, I get a prompt where I can enable this tool.


Comparing Numbers

Another interesting failure mode for LLMs is comparing decimal numbers. Given two numbers like 8.9 and 8.11, a model can easily believe that 8.11 is bigger. If you’re viewing these as text and not numbers, it’s an easy mistake to make.

I’ve seen a few ways people have tried to teach the LLM about this, but again it seems like an area where a simple CPU is superior. After all, numerical comparison is one of the most basic things a CPU can do.
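In plain Python, the comparison is already correct, as long as the values are treated as floats rather than strings:

8.9 > 8.11   # True: floats compare numerically, not character by character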

So I built a similarly simple MCP tool, which ends up being just a few lines.

@mcp.tool()
def get_math_comparison(
    first: float = 8.9,
    second: float = 8.11,
):
    """Use this to compare whether the first number is greater than (or less than) the second number
    If I were to ask if 8.9 is greater than 8.11, this tool is right for you

    Args:
        first: The first number being passed in, the left hand side (eg., 8.9)
        second: The second number being passed in, the right hand side (eg., 8.11)

    Returns:
        Returns a string if the first number is "greater than" or "less than" the second number
    """
    logger.info(
        f"--- 🛠️ Tool: get_math_comparison called for getting {first} > {second} ? ---"
    )
    if first > second:
        return {"text": f"{first} is greater than {second}", "simple": ">", "expression": f"{first} > {second}"}
    elif second > first:
        return {"text": f"{first} is less than {second}", "simple": "<", "expression": f"{first} < {second}"}
    else:
        return {"text": f"{first} is equal to {second}", "simple": "=", "expression": f"{first} = {second}"}

Building this took me five minutes or so. I was feeling excited to load this up and take a screenshot of my superior AI when I reached a surprising result:

Hmm. For some reason even though I’m telling it the exact answer in a few different ways, the LLM refuses to believe me.

After a few more chats, I had to go back and think harder, not about the logic but about the text around it. I’m not interacting with the end user. I’m not supposed to be giving them a JSON output. My actual interaction is with the LLM.

@mcp.tool()
def get_math_comparison(
    first: float = 8.9,
    second: float = 8.11,
):
    """Use this to compare whether the first number is greater than (or less than) the second number
    If I were to ask if 8.9 is greater than 8.11, this tool is right for you

    Args:
        first: The first number being passed in, the left hand side (eg., 8.9)
        second: The second number being passed in, the right hand side (eg., 8.11)

    Returns:
        A string representing the *answer*. This can be given verbatim.
        A string representing the *simple* mathematical symbol showing the comparison.
        A string representing the mathematical *expression*.
    """
    logger.info(
        f"--- 🛠️ Tool: get_math_comparison called for getting {first} > {second} ? ---"
    )
    if first > second:
        return {"answer": f"{first} is greater than {second}", "simple": ">", "expression": f"{first} > {second}"}
    elif second > first:
        return {"answer": f"{first} is less than {second}", "simple": "<", "expression": f"{first} < {second}"}
    else:
        return {"answer": f"{first} is equal to {second}", "simple": "=", "expression": f"{first} = {second}"}

The method’s docstring is supposed to tell the LLM how to interact with my tool and how to interpret the response. Giving it an abstract key like “text” evidently isn’t clear enough. I need to be much more explicit about what my response contains and how it should be handled.

So I fleshed out the Returns section. I tell it that the answer field can be given verbatim. I don’t need it opining. I don’t need it to add its own narrative. If it just repeated me, it’d be more correct.
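With that change, the default inputs produce a response the model can pass straight through:

# For first=8.9, second=8.11 the tool now returns:
{"answer": "8.9 is greater than 8.11", "simple": ">", "expression": "8.9 > 8.11"}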


And there we go.

What else can I build?

Now that I can build simple tools, LLMs are more useful to me. In fact, I can keep building over time, adding new things as necessary.

Many of my new ideas might require a little more infrastructure, though. For instance, pulling books from my Goodreads to-read list and placing holds on them through my local library’s website. I can use Browser MCP for some of that, but I also believe I will need a higher-level interface, like an agent, to orchestrate these calls together.

Either way, I am quite excited for the potential of what I can build next.
