Independent exam prep · Not affiliated with, authorized by, or endorsed by Anthropic.

Tool Design & MCP: What CCA Domain 4 Actually Covers (18%)

Domain 4 — Tool Design & MCP — is worth 18% of the CCA Foundations exam, and it's the domain that most rewards having actually built something an agent depends on, rather than just having used one. The core insight it's testing for is simple to state and surprisingly easy to forget under exam pressure: a tool's name, description, and schema aren't documentation for a human reader. They're the interface the model reasons over when it decides what to do next. Get that interface wrong, and no amount of prompting fixes it.

Domain 4 is one of the five sections covered in our full exam-domain breakdown — here's a deeper look at what it tests, how it plays out in a concrete design decision, and where candidates lose marks they shouldn't.

What This Domain Actually Tests

  • Writing tool names and descriptions that genuinely guide the model's choices, not just label them
  • Choosing tool granularity: fewer broad tools versus more narrowly scoped ones
  • How MCP servers expose tools, resources, and prompts to a client in a standard, predictable way
  • Designing for graceful failure — error messages a model can actually act on
  • Scoping permissions to the minimum a tool needs to do its job, and no more
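
To make the first bullet concrete, here is a minimal sketch contrasting a definition that only labels a tool with one that guides the model's choice. The name / description / input_schema layout is an assumption modeled on common JSON Schema based tool definitions (exact field names vary by API), and both tools and all wording are hypothetical.

```python
# Hypothetical tool definitions for illustration only; the dict layout is an
# assumption, not a specific vendor's required format.

# Labels the tool but gives the model nothing to reason over.
vague_tool = {
    "name": "events",
    "description": "Handles events.",
    "input_schema": {
        "type": "object",
        "properties": {"data": {"type": "string"}},
    },
}

# Says when to use the tool, what it returns, and what it will never do.
guiding_tool = {
    "name": "find_events",
    "description": (
        "Search the user's calendar for events inside a time window. "
        "Use this first to look up an event_id before updating or cancelling. "
        "Read-only: it never modifies the calendar."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "start": {
                "type": "string",
                "description": "Window start, ISO 8601, e.g. 2025-03-01T09:00:00+00:00",
            },
            "end": {
                "type": "string",
                "description": "Window end, ISO 8601, must be after start",
            },
            "query": {
                "type": "string",
                "description": "Optional text to match against event titles",
            },
        },
        "required": ["start", "end"],
    },
}
```

The second definition carries the same kind of information a colleague would need to pick the right function: when to call it, what each argument means, and what side effects to expect.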

Granularity Is a Design Decision, Not a Style Preference

One of the recurring choices this domain tests is whether to build one flexible, general-purpose tool or several narrow, single-purpose ones. There's no universally correct answer, but there is a consistent way to reason about it. Narrow tools let you encode exactly the right constraints into each one's schema: a "cancel" action and a "create" action have almost nothing in common in terms of which fields are required, optional, or forbidden. They also make the model's job simpler. Choosing the one obviously right tool from a short, clearly differentiated list is a much easier inference than correctly populating a generic "action" parameter buried inside a do-everything tool.

General-purpose tools earn their keep when the underlying actions really are homogeneous: a single read-only query tool against a reporting database, for instance, where every call shares the same schema and the same low risk profile. The tell is whether the actions genuinely share a shape and a risk level. If they don't, splitting them isn't over-engineering; it's the design choice that reduces the chance of the model doing the wrong thing at the wrong moment.
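
As a rough illustration of a homogeneous general-purpose tool, the sketch below wraps an in-memory SQLite database in a single read-only query function. The table, the function names (make_reporting_db, run_report_query), and the result shape are all assumptions made for the example; the read-only guarantee comes from SQLite's query_only pragma.

```python
import sqlite3


def make_reporting_db() -> sqlite3.Connection:
    """Build a tiny in-memory reporting database (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany(
        "INSERT INTO sales VALUES (?, ?)", [("north", 120), ("south", 80)]
    )
    conn.commit()
    # From here on, any statement that would modify the database fails.
    conn.execute("PRAGMA query_only = ON")
    return conn


def run_report_query(conn: sqlite3.Connection, sql: str, max_rows: int = 50) -> dict:
    """One tool, one schema (a SQL string), one risk level (read-only)."""
    try:
        cursor = conn.execute(sql)
        rows = cursor.fetchmany(max_rows)
    except sqlite3.Error as exc:
        return {
            "ok": False,
            "error": f"Query rejected ({exc}). This tool is read-only; "
                     "rewrite the request as a SELECT statement.",
        }
    columns = [col[0] for col in cursor.description or []]
    return {"ok": True, "columns": columns, "rows": [list(r) for r in rows]}
```

Every call has the same shape and the same worst case, so there is little to gain from splitting this into per-table or per-report tools.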

Worked Scenario: Tools for a Calendar-Management Agent

Say you're designing the toolset for an agent that manages a user's calendar: creating, updating, cancelling, and searching events. Two natural designs present themselves. Option one: a single manage_event tool with an action field (create / update / cancel / find) and one large, flexible payload. Option two: four distinct tools (create_event, update_event, cancel_event, find_events), each with its own tightly scoped schema.

Option two generally wins, and the reasons are concrete rather than aesthetic. create_event can require a start time and a title; cancel_event shouldn't require either — it just needs an identifier. A single combined schema either has to make everything optional (which invites incomplete creates) or enforce fields that don't apply to half its actions (which invites the model to invent values just to satisfy validation). Splitting also makes permissioning cleaner: find_events can be granted freely, while cancel_event — an action with real consequences for the user — can be gated behind a confirmation step without affecting the other three. None of that clean separation survives once everything funnels through one generic action parameter.
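
The trade-off can be sketched in a few lines. This is a simplified illustration, not a real validation library: the field names (event_id, title, start), the needs_confirmation flag, and the check_call helper are all hypothetical, and only required-field checking is modeled.

```python
# Option one: a single manage_event schema. Only `action` can be required,
# because no other field applies to all four actions.
MANAGE_EVENT = {
    "type": "object",
    "properties": {
        "action": {"type": "string", "enum": ["create", "update", "cancel", "find"]},
        "event_id": {"type": "string"},
        "title": {"type": "string"},
        "start": {"type": "string"},
    },
    "required": ["action"],
}

# Option two: one entry per tool, each with its own required fields and
# its own permission policy (hypothetical flag).
TOOLS = {
    "create_event": {"required": ["title", "start"], "needs_confirmation": False},
    "update_event": {"required": ["event_id"], "needs_confirmation": False},
    "cancel_event": {"required": ["event_id"], "needs_confirmation": True},
    "find_events": {"required": [], "needs_confirmation": False},
}


def missing_fields(required: list, args: dict) -> list:
    """Return the required field names absent from args."""
    return [name for name in required if name not in args]


def check_call(tool_name: str, args: dict, confirmed: bool = False) -> str:
    """Validate a call against its own tool's rules before running it."""
    spec = TOOLS[tool_name]
    missing = missing_fields(spec["required"], args)
    if missing:
        return f"rejected: {tool_name} requires {', '.join(missing)}"
    if spec["needs_confirmation"] and not confirmed:
        return f"needs_confirmation: ask the user before running {tool_name}"
    return "ok"


# An incomplete create passes the combined schema's required-field check...
print(missing_fields(MANAGE_EVENT["required"], {"action": "create"}))
# ...but the narrow tool catches it, and only cancel_event is gated.
print(check_call("create_event", {"title": "Standup"}))
print(check_call("cancel_event", {"event_id": "evt_1"}))
```

Under these assumptions the combined schema has no way to reject a create with no title or start time, while the per-tool table rejects it and applies the confirmation gate to cancel_event alone.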

Traps That Catch Candidates in This Domain

  • Designing tools around your API's shape instead of the model's task. A tool that mirrors a backend endpoint one-to-one often hands the model work it shouldn't have to do — pagination, filtering, stitching together related calls — when the tool itself could have done that and returned exactly what's needed.
  • Returning raw, unprocessed data from a tool call. Dumping a full API response back into the conversation burns context on data the model doesn't need and forces it to do the filtering work in its head. A well-designed tool pre-processes and returns only what's relevant to the task at hand.
  • Writing error messages for an engineer instead of the model. "Error 422: Unprocessable Entity" tells the model nothing it can act on. "This event has already started — use update_event to add notes instead of cancelling it" gives it an actual next move, and that's the difference scenario questions are probing for.
  • Assuming more tools means more capability. Every tool in the list is an option the model has to correctly pass over on every turn where it isn't the right one. The exam tends to reward the smallest tool set that fully covers the task, not the most exhaustive one.
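
Two of the traps above, raw responses and engineer-facing errors, can be illustrated with a short sketch. The backend response shape (items, summary, etag, nextPageToken) and both function names are assumptions invented for the example, not a real calendar API.

```python
from datetime import datetime, timezone


def summarize_events(raw_response: dict, limit: int = 5) -> list:
    """Return only the fields the model needs, instead of the full payload."""
    events = raw_response.get("items", [])
    return [
        {"event_id": e["id"], "title": e["summary"], "start": e["start"]}
        for e in events[:limit]
    ]


def cancel_event(event: dict, now: datetime) -> dict:
    """Cancel an event, or explain in plain language what to do instead."""
    start = datetime.fromisoformat(event["start"])
    if start <= now:
        # Actionable message for the model, rather than a bare status code.
        return {
            "ok": False,
            "error": "This event has already started. Use update_event to "
                     "add notes instead of cancelling it.",
        }
    return {"ok": True, "cancelled": event["event_id"]}


# Hypothetical raw backend response with fields the model never needs.
raw = {
    "items": [
        {
            "id": "evt_1",
            "summary": "Standup",
            "start": "2025-03-01T09:00:00+00:00",
            "etag": "x1",
            "attendees": [],
        }
    ],
    "nextPageToken": "abc",
}
events = summarize_events(raw)
after_start = datetime(2025, 3, 1, 10, 0, tzinfo=timezone.utc)
print(cancel_event(events[0], after_start)["error"])
```

The trimmed result spends context only on identifiers the model will reuse, and the error text names the next tool to try.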

This domain rewards the kind of pattern recognition that comes from seeing a lot of tool-design decisions laid out side by side — which is exactly what scenario practice builds. Try our free practice question bank (400 scenario-based questions, fully explained), or take the free 10-question diagnostic to find out where Domain 4 sits in your prep right now.