I Love MCP. Here Is What I Learned Building Four of Them

I love MCP and consider myself an early adopter. We have built several versions of our own MCP server for DBGorilla. Early on we built a Node version from scratch using the MCP SDK. We then built a version with FastAPI MCP, since we already use FastAPI and the ability to reuse existing schemas and docstrings was compelling. We hit an auth limitation there, submitted a PR, and moved on when the PR sat. Next we built a standalone Python version with FastMCP. Our latest is FastMCP 2.0 embedded directly in the app. I have also used and tested a lot of third-party MCP servers. And I watched what happened to the context window.

Stephen Nolan | Co-Founder & COO

The Potential Challenge

Every MCP tool you connect injects its schema into the prompt. Every parameter, every description, every type definition. Connect a few MCP servers and your agent is burning tokens on tool descriptions before it even starts thinking about your task.

MCP scales your capabilities. It also scales your prompt.
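
To get a feel for that scaling, here is a rough sketch that serializes a tool definition and estimates its prompt cost. The tool, its parameter names, and the four-characters-per-token ratio are all assumptions for illustration. Real tokenizers and real schemas will give different numbers, so treat the output as an order-of-magnitude guide only.

```python
import json

# Hypothetical tool definition, shaped like the name/description/inputSchema
# entries an MCP server advertises. None of these names come from a real server.
EXAMPLE_TOOL = {
    "name": "run_query",
    "description": "Run a read-only SQL query against a registered database.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "database_id": {"type": "string", "description": "ID of the target database."},
            "sql": {"type": "string", "description": "SQL statement to execute."},
            "timeout_seconds": {"type": "integer", "description": "Query timeout."},
        },
        "required": ["database_id", "sql"],
    },
}


def estimate_tokens(tools, chars_per_token=4):
    """Crude estimate: serialized length divided by an assumed chars-per-token ratio."""
    serialized = json.dumps(tools, separators=(",", ":"))
    return len(serialized) // chars_per_token


def schema_overhead(tools_per_server, servers):
    """Estimate the prompt cost of connecting several servers with similar tools."""
    tools = [EXAMPLE_TOOL] * (tools_per_server * servers)
    return estimate_tokens(tools)


if __name__ == "__main__":
    for servers in (1, 3, 5):
        print(servers, "servers:", schema_overhead(20, servers), "tokens (rough)")
```

The estimate grows roughly linearly with the number of tools, and all of it is spent before the agent reads the first word of the task.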

What the Community Is Doing About It

The context cost of MCP tool schemas is well documented at this point. People are actively working on solutions such as tool search, lazy loading, and schema deduplication. The problem is real and the community is responding.

What We Did About It

We have invested heavily in keeping our own MCP server lean. Minimal tool descriptions. Self-documenting parameter names so you do not need prose descriptions. Consolidating CRUD operations into single tools with action parameters instead of separate tools for every operation. Shared patterns across domains so you do not end up with duplicate tools doing the same thing. Scoped endpoints so clients only load the tools they actually need.

Delegation Model

On top of that, if your client supports subagents, you can scope each one to a single MCP tool using a delegation model. This is similar to the strategy we use inside the DBGorilla application. The delegation model keeps our main agent’s context clean. Not every MCP client supports this, so your mileage may vary.

An OpenAPI CLI Instead

We found yet another way.

One of our (many amazing) team members came up with a different approach. They initially prototyped a CLI on our agent runtime, built on our OpenAPI spec. FastAPI generates the spec automatically from our route definitions and Pydantic schemas. Zero extra work. The CLI reads that spec and exposes every endpoint. We took that idea and created a client for the entire app.

When an agent uses the CLI, it discovers what it needs on demand.

endpoints --search chat
endpoints --tag "Query Optimization"
call GET /api/v0_1/chat/

Nothing is injected into the prompt. The agent pulls what it needs, when it needs it. Pull-based discovery instead of push-based prompt stuffing.
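
To show the shape of pull-based discovery, here is a minimal sketch of an endpoint search over an OpenAPI spec that has already been loaded into a dictionary. It is not our CLI's implementation. The function name, the sample spec, the summaries, the tag assignments, and the /api/v0_1/optimize/ path are assumptions made up for the example. Only the chat path and the "Query Optimization" tag appear in the commands above.

```python
# HTTP methods that can appear as operation keys under an OpenAPI path item.
HTTP_METHODS = {"get", "put", "post", "delete", "patch", "head", "options", "trace"}

# Tiny hypothetical spec; a real one would be fetched from the running app.
SAMPLE_SPEC = {
    "paths": {
        "/api/v0_1/chat/": {
            "get": {"summary": "List chats", "tags": ["Chat"]},
            "post": {"summary": "Create chat", "tags": ["Chat"]},
        },
        "/api/v0_1/optimize/": {
            "post": {"summary": "Optimize a query", "tags": ["Query Optimization"]},
        },
    }
}


def find_endpoints(spec, search=None, tag=None):
    """Return (METHOD, path, summary) tuples matching an optional keyword and tag."""
    results = []
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method not in HTTP_METHODS:
                continue  # skip non-operation keys such as "parameters"
            summary = operation.get("summary", "")
            if tag is not None and tag not in operation.get("tags", []):
                continue
            if search is not None and search.lower() not in f"{path} {summary}".lower():
                continue
            results.append((method.upper(), path, summary))
    # Sort by path, then method, for stable output.
    return sorted(results, key=lambda r: (r[1], r[0]))


if __name__ == "__main__":
    for method, path, summary in find_endpoints(SAMPLE_SPEC, search="chat"):
        print(method, path, summary)
```

The agent only pays for the handful of lines a search returns, and only when it asks for them.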

What the Agent Actually Said

I pointed an agent at the CLI in a fresh context window and prompted: “see what you are capable of doing with the @cli/dbg-api-cli tool, have fun with it.” No tool definitions injected. No schema bloat. Just the agent discovering what it needed on its own. It really wanted me to believe it was having fun exploring it.

“[redacted] endpoints! That is a big API.”
“We are in! Now let me go wild. Let me explore the system in parallel.”
“Nice! There are 6 existing chats, setup is complete, and lots of health endpoints. Let me dig into the interesting stuff. Check system health, peek into an existing chat, and explore the more exotic endpoints.”
“The streaming chat works beautifully! The DBGorilla AI introduced itself as a PostgreSQL database assistant with a rich set of capabilities. Query optimization, schema analysis, parameter tuning, cloning, and more.”
“Now let me have some more fun. Let me follow up on that chat and check a few more endpoints.”
“This is a beautifully designed single-file CLI using PEP 723 inline script metadata. It only needs uv installed. Typer, httpx, and rich are declared in the script header and auto-installed. The architecture follows a layered pattern: auth management (JWT caching + auto-refresh), HTTP client helpers, SSE streaming, OpenAPI spec caching, and Typer command groups.”

You can tell those are real AI responses. The enthusiasm is a giveaway.

Do You Even Need MCP?

There is also a question worth asking before you reach for MCP at all. Does the model already know the API? I do not need a GitHub MCP server. The model was trained on that API and the ‘gh’ CLI. The same goes for Prometheus, Grafana, Azure, AWS, and others. The model already knows how to call them. An MCP server for a well-known API injects tool definitions for knowledge the model already has. In my experience, that is the purest form of prompt waste. I am not trying to save money, just trying to keep these agents focused.

The real value of MCP is for proprietary APIs the model has never seen. That is exactly what we build. So for us, it matters.

The Takeaway

For quick integrations where you connect a tool and go, MCP is great. For larger proprietary API surfaces, an OpenAPI spec and a thin CLI might be the better play.

We did not choose one over the other. We built both. The right tool depends on the job. And full credit to the team for seeing this way before I ever would have. The CLI is probably my favorite tool right now.

If you are building AI agent tooling, think about what your context window looks like at scale. There are a lot of smart solutions emerging. This is just one more option that worked for us.

dbgorilla