Independent exam prep · Not affiliated with, authorized by, or endorsed by Anthropic.

Prompt Engineering & Structured Output: What CCA Domain 3 Tests (20%)

Domain 3 — Prompt Engineering & Structured Output — makes up 20% of the CCA Foundations exam, and it might be the domain where intuition betrays people most often. Most candidates have written prompts. Far fewer have had to reason carefully about why a prompt that worked fine in testing started failing in production, or what the actual difference is between "asking Claude to format its answer a certain way" and "guaranteeing a format that downstream code can depend on without a human checking it." That distinction is most of what this domain is testing.

It's one piece of the picture covered in our five-domain exam breakdown — here's a closer look at what Domain 3 covers, how it shows up in a realistic scenario, and where candidates tend to lose marks.

What This Domain Actually Tests

  • Structuring prompts clearly — role framing, examples, and unambiguous instructions
  • When few-shot examples resolve ambiguity faster than additional prose ever would
  • Forcing structured output through schemas and tool definitions, rather than instructions alone
  • Recognising when extended reasoning genuinely helps a task, and when it just adds latency
  • Iterating on prompts using real evaluation results rather than a gut feeling about what "sounds right"

The Distinction Underneath Almost Every Question in This Domain

Prompts shape behaviour. They do not guarantee it. That single sentence is the foundation of most of what Domain 3 tests, and it cuts in both directions — which is what makes it trickier than it sounds. Prompting is the right tool for shaping content: tone, reasoning approach, what to prioritise when instructions trade off against each other, how to handle a case a schema can't anticipate. It's the wrong tool for guaranteeing format — the literal shape of an output that some other system is going to parse, store, or act on with no one checking it first.

Confusing the two in either direction costs marks. Asking Claude to "always respond in valid JSON" and treating that as a reliability guarantee is the well-known failure mode. Less discussed — but just as testable — is the opposite mistake: building an elaborate validation schema and assuming it has somehow made the model's reasoning about ambiguous content more reliable. A schema enforces shape. It says nothing about whether the values inside that shape are actually correct.

Worked Scenario: Extracting Structured Records From Resumes

A recruiting platform needs Claude to read uploaded resumes — wildly inconsistent in format, length, and language — and produce a structured record (name, years of experience, skills, education level) that's written straight into a database with no human review step.

The first version of a pipeline like this often looks like a system prompt that says something close to "extract the following fields and respond with a JSON object containing exactly these keys." It performs beautifully in a demo with three clean, well-formatted resumes. In production, on resume #400 — a PDF-to-text conversion with broken line breaks, a candidate who phrases "experience" in a non-standard way, a model that adds one helpful clarifying sentence before the JSON — the pipeline breaks, and it breaks silently, because nothing was actually validating the shape of what came back.

The fix isn't a sterner prompt. It's defining a tool with a strict input schema, forcing the model to use it, and validating the result server-side as a backstop against the rare schema violation that still slips through. That handles the format guarantee. The judgment calls — how to record "years of experience" for a career-changer, what to do when an education history is genuinely ambiguous — are exactly where prompting still earns its keep: clear instructions and a couple of well-chosen examples for the truly ambiguous cases, layered on top of a schema that ensures whatever the model decides comes back in a shape your database can actually use.

Traps That Catch Candidates in This Domain

  • Believing stronger prompt language fixes a format-reliability problem. Capitalising the instruction, repeating it, adding "this is critical" — none of it converts a probabilistic instruction into a structural guarantee. If a downstream system needs a guaranteed shape, the answer involves a schema, not stronger wording.
  • Skipping examples on genuinely ambiguous tasks. When the right output is hard to describe in the abstract — tone, edge-case handling, how to format something unusual — a couple of well-chosen examples often resolve more ambiguity, faster, than another paragraph of instructions trying to describe the same thing in words.
  • Reaching for extended reasoning by default. Added reasoning steps cost time and tokens. The exam tends to test whether you can tell the difference between tasks that genuinely benefit from working through a problem step by step (multi-factor judgment calls, tricky trade-offs) and tasks that are simple lookups or formatting jobs that just get slower with no quality benefit.
  • Equating "well-formed" with "correct." A schema-compliant response can still contain a confidently wrong value. Format enforcement and factual verification solve two different problems, and a complete design addresses both — not just the one that's easier to test for.

This domain rewards having actually felt a prompt fail in a realistic way — which is hard to simulate and easy to practise against with the right questions. Work through our scenario-based practice bank (400 questions, all five domains, full explanations), or start with the free 10-question diagnostic to see exactly how Domain 3 reasoning is landing for you today.