An empirical investigation into whether stricter JSON schemas for MCP tool parameters improve agent reliability. Using a FastMCP expense-logging server with multiple schema variants (bare string, Annotated descriptions, Literal/Enum types, regex patterns), the author ran evaluations across 17 test cases, multiple OpenAI models (gpt-4o, gpt-4.1-mini, gpt-5.3-codex), reasoning effort levels, and two agent frameworks (Pydantic AI and GitHub Copilot SDK). Key findings: combining enum constraints with descriptions (Annotated[Enum]) yielded the best category accuracy but at double the token cost; date schema strictness had zero measurable impact on frontier models; model choice and reasoning effort level mattered more than schema strictness; and both agent frameworks produced identical results. The conclusion is that modern frontier models are well-trained for tool calling and mostly need clarity for ambiguous fields rather than strict type constraints, though stricter types still benefit server-side code quality.
Table of contents
A basic MCP tool and schema
Annotating parameters with descriptions
Constraining parameters with types
Setting up evaluations
Evaluation results: category
Evaluation results: date
Cross-model evaluations
Impact of reasoning effort
Comparing agent frameworks
Takeaways
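The schema variants compared in the post can be sketched as plain Python signatures. This is a hypothetical illustration, not the author's actual FastMCP server code: the function names, enum values, and description strings are made up, and the real server would decorate these functions as MCP tools.

```python
from enum import Enum
from typing import Annotated, Literal, get_type_hints


class Category(str, Enum):
    # Hypothetical category set; the post's actual values may differ.
    FOOD = "food"
    TRAVEL = "travel"
    OFFICE = "office"


# Variant 1: bare string -- the generated JSON schema gives the model
# no guidance about valid values.
def log_expense_bare(category: str, date: str) -> str: ...


# Variant 2: Annotated descriptions -- guidance in the schema's
# "description" field, but no hard constraint on values.
def log_expense_annotated(
    category: Annotated[str, "Expense category, e.g. 'food' or 'travel'"],
    date: Annotated[str, "Date of the expense in YYYY-MM-DD format"],
) -> str: ...


# Variant 3: Literal/Enum types -- the schema enumerates the only
# valid values, so the model must pick one of them.
def log_expense_enum(
    category: Category,
    date: Annotated[str, "Date of the expense in YYYY-MM-DD format"],
) -> str: ...


# Variant 4: Annotated[Enum] -- constraint plus description, the
# combination the post found most accurate (at higher token cost).
def log_expense_annotated_enum(
    category: Annotated[Category, "Expense category"],
    date: Annotated[str, "Date of the expense in YYYY-MM-DD format"],
) -> str: ...
```

Frameworks like FastMCP derive each tool's JSON schema from these type hints, which is why tightening the Python annotation tightens the schema the model sees.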