The honest answer: harder than most people expect the first time they sit it, and more passable than it looks once you understand what it actually tests. The CCA Foundations exam has a scaled pass score of 720 out of 1,000. That threshold filters out candidates who are relying on general AI knowledge or surface-level Claude familiarity. It does not require expert-level production experience in every domain. The gap between those two statements is where most preparation strategies go wrong.
What Kind of Hard Is It?
The exam is not hard in the way trivia is hard — it doesn't require you to memorise obscure facts or recall exact API parameter names. It's hard in the way engineering judgment is hard: it presents realistic scenarios and asks you to identify the correct architectural decision among four plausible options. Three of those four options are typically defensible on their surface. One is demonstrably better for the specific constraints described in the scenario.
This means you can't pass by reading a summary of what each domain covers. You need to develop the ability to reason through trade-offs under time pressure. Candidates who pass quickly are typically those who have internalised the frameworks — not those who've memorised the frameworks.
The Five Domains, Ranked by Reported Difficulty
1. Context Management (hardest for most candidates)
This domain covers token budgets, prompt caching, multi-turn context design, and the CALM framework. It consistently catches candidates regardless of their experience level. Engineers who've built with Claude tend to have strong intuitions about the symptoms of poor context management (cost spikes, inconsistent behaviour, degraded quality over long conversations) but haven't systematised the underlying mechanics.
The specifics that most often determine pass/fail: prompt caching has a 5-minute TTL and requires consecutive identical prefix blocks to activate — most people know caching exists but not its precise mechanics. Cache write costs 25% more than uncached; reads cost 90% less. The exam tests whether you can make correct decisions about when caching is worth using, not just whether you know it exists. The CALM framework (Context-Aware Language Management) provides a structured approach for context design that appears in exam scenarios requiring you to diagnose and correct a poorly designed multi-turn system.
2. Agentic Architecture (hardest for non-engineers)
The highest-weighted domain (27%) tests your ability to design agentic systems that are reliable, safe, and cost-controlled. The core principles — minimal footprint, human-in-the-loop checkpoints, reversible actions before irreversible ones, failure handling in multi-agent pipelines — are straightforward to understand conceptually. Where candidates fail is in applying them precisely when scenarios involve competing constraints.
A typical difficult question: an agentic loop is producing inconsistent results. Four potential causes are offered, each plausible. The correct answer requires you to distinguish between a context window management issue, a tool definition issue, a prompt design issue, and a model reliability issue — not just identify that something is wrong.
3. Tool Design and MCP (hardest for conceptual learners)
The Model Context Protocol domain catches candidates who learned about MCP from blog posts rather than from building with it. The exam tests the transport layer distinction (stdio vs SSE — when to use each and why), tool schema design principles, security boundaries in MCP servers, and the architectural patterns for connecting Claude to external systems.
The most common failure point: candidates can describe what MCP does but can't reason about which transport is appropriate for a given deployment scenario, or can't identify the security vulnerability in a described MCP server implementation.
4. Claude Code Configuration (moderate difficulty)
This domain covers CLAUDE.md files, Claude Code setup, slash commands, hooks, and permission management. Candidates with direct Claude Code experience typically find this domain manageable. The difficulty is in the details: the difference between global and project-level CLAUDE.md files, the precedence rules when both exist, and the specific hook types (PreToolUse, PostToolUse, Stop, Notification) and their appropriate uses.
Candidates without Claude Code experience find this domain disproportionately hard because the concepts don't map to prior knowledge. Everything needs to be learned from scratch rather than updated from a foundation.
5. Prompt Engineering (most accessible domain)
The PRECISE framework (Persona, Role, Explicit instructions, Context, Instructions, Steps, Examples) provides a structured vocabulary that most experienced Claude users have implicitly used without formalising. The domain covers system prompt design, chain-of-thought prompting, few-shot examples, and prompt anti-patterns. Most candidates with meaningful Claude experience find this the domain they need the least preparation time for.
The trap: overconfidence. Candidates who've done a lot of prompt engineering often skip this domain during preparation and lose points on the specific framework terminology or the exact definition of prompt anti-patterns the exam uses.
How Long Does It Take to Prepare?
2–3 weeks for engineers with hands-on Claude production experience who study systematically. These candidates have strong intuitions from real work but need to fill specific gaps (prompt caching mechanics, MCP transport specifics, CALM and PRECISE framework vocabulary) that production experience doesn't always develop fully.
4–6 weeks for engineers familiar with AI development but without deep Claude-specific experience. These candidates need to build from a solid foundation but aren't starting from zero on the underlying concepts.
6–8 weeks for non-engineers (product managers, technical consultants, architects from adjacent fields) who need to build conceptual foundations before working on exam-specific preparation. These candidates can and do pass — the exam tests architectural reasoning, not code-writing ability — but the foundation-building phase takes longer.
What Catches People Off Guard
Three things consistently surprise candidates who failed their first attempt:
- The precision of the context management domain. Knowing that prompt caching exists is not enough. The exam requires you to know the TTL, the cost differential, the conditions under which it activates, and when it's not the right tool. Vague knowledge produces wrong answers on questions that have specifically correct answers.
- The scenario depth in Agentic Architecture. Scenarios are not abstract. They describe a specific system with specific failure modes and ask you to identify the precise cause and correct fix. Candidates who studied principles without working through practice scenarios get caught by the specificity.
- The time pressure. 60 questions in 120 minutes is 2 minutes per question. Questions are paragraph-length scenarios, not one-line recall prompts. Candidates who haven't practiced under time pressure often run short of time on the latter questions — which tends to affect Agentic Architecture scores disproportionately since the most complex scenarios cluster later in the exam.
Is It Passable on the First Attempt?
Yes, if you prepare correctly. "Correctly" means: practice with timed questions at exam length (not just reading domain summaries), cover all five domains systematically instead of doubling down on your strongest areas, and specifically address prompt caching mechanics and MCP transport selection since those are the most common gaps even for experienced candidates.
Candidates who fail typically did one of three things: underestimated the specificity required in context management, relied on general AI knowledge without Claude-specific depth, or ran out of time because they hadn't practiced at exam pace.
Our free 10-question diagnostic shows you your starting point across all five domains in under 20 minutes — so you know which areas to prioritise before you begin preparing. The full 60-question timed exam simulation replicates real exam conditions with domain-weighted scoring so you can measure readiness before test day, not after.