🧠 Claude Code Mastery Playbook

Understanding How Claude Code Works Better Than 99% of Users

Paint-by-Numbers Guide | Leverage-Optimized Learning Path

Watch Original Video by Mark (1hr deep-dive)

🎯 What You'll Master

This playbook distills Mark's 1-hour video into actionable steps, prioritized by leverage and temporal dependencies. Each section adds deeper insights ("why this matters") and teaching prompts to push your understanding further.

Sections: 9 | Time: 2-3h | Checkpoints: 45+

1 Core Architecture & System Design

Priority: Foundation | Goal: Understand how components interact

start-here 15-20 min
🧩 The 5 Core Components

What makes Claude Code work:

  • CLI: Terminal interface you interact with
  • Session Manager: Handles conversation state & context
  • Tool Executor: Runs file ops, bash commands, searches
  • Permission Layer: Security gatekeeper (ask/allow/block)
  • Claude API: Intelligence brain orchestrating everything
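A minimal sketch of how a permission layer like the one above can mediate tool calls. The tool names and policy table are hypothetical, chosen for illustration; this is not Claude Code's actual implementation.

```python
# Hypothetical permission gatekeeper: every tool call passes through
# a policy that answers "allow", "ask", or "block".
POLICY = {
    "read_file": "allow",   # read-only: safe to run without asking
    "edit_file": "ask",     # mutating: confirm with the user first
    "run_bash":  "ask",
    "rm_rf":     "block",   # destructive: never run
}

def gate(tool_name, confirm=lambda tool: False):
    """Return True if the tool may run under the policy."""
    decision = POLICY.get(tool_name, "ask")  # unknown tools default to "ask"
    if decision == "allow":
        return True
    if decision == "block":
        return False                         # blocked even if user says yes
    return confirm(tool_name)                # "ask": defer to the user

print(gate("read_file"))                     # allowed without asking
print(gate("rm_rf", confirm=lambda t: True)) # blocked regardless of consent
```

The useful property: "YOLO mode" is just swapping the `confirm` callback for one that always returns True, while `block` entries still hold.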
Why This Architecture Matters

Leverage insight: Claude Code isn't magic; it's smart orchestration of existing open-source tools. Understanding this means you can:

  • Predict what Claude can/can't do
  • Debug when things break (is it the tool or the AI?)
  • Build your own tools using the same pattern
  • Appreciate why some tasks are fast (native tools) vs slow (AI reasoning)
# View architecture in action
/context

# Shows:
# - System prompt overhead  
# - claude.md impact
# - Current token usage
# - What's consuming your bucket

📊 The Gather → Act → Verify Loop

Every Claude Code operation follows this pattern:

  1. GATHER: Reads files, searches code, explores structure
  2. ACT: Edits files, creates folders, runs bash commands
  3. VERIFY: Runs tests, checks outputs, loops if needed
The Compound Effect of This Loop

Why this matters: This loop is the difference between 10x productivity and frustration. Understanding it means:

  • You can interrupt bad loops early: If Claude gathers for 5 min, your prompt was too vague
  • You can optimize each phase: Better prompts → faster gather → less wasted context
  • You can provide success criteria: "Verify by running npm test and checking for 0 errors"
  • You prevent infinite loops: Claude will loop forever without clear exit conditions
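The loop and its exit conditions can be sketched in Python. The `gather`/`act`/`verify` callables are illustrative stand-ins for Claude's real tool calls, not Claude Code internals:

```python
# Gather -> Act -> Verify with explicit exit conditions.
def run_task(gather, act, verify, max_iterations=5):
    """Loop until verify() passes or the iteration budget runs out.

    Without a concrete verify() check and an iteration budget, the
    loop has no exit condition -- the "infinite loop" failure mode.
    """
    for attempt in range(1, max_iterations + 1):
        context = gather()       # read files, search code
        act(context)             # edit files, run commands
        if verify():             # e.g. "npm test passes with 0 errors"
            return attempt       # success criterion met
    raise RuntimeError("Exit condition never reached; refine the prompt")

# Toy run: the "fix" lands on the second attempt
state = {"fixed": False, "tries": 0}
def fake_act(ctx):
    state["tries"] += 1
    state["fixed"] = state["tries"] >= 2

attempts = run_task(lambda: {}, fake_act, lambda: state["fixed"])
print(attempts)  # 2
```

Note how "Verify by running npm test and checking for 0 errors" maps directly onto supplying a concrete `verify` predicate.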

2 Context Window Management (MOST CRITICAL)

Priority: HIGHEST | Goal: Master the 200k token "bucket" constraint

master-this-first 30-35 min
🪣 The Bucket Metaphor

Your context window is a bucket:

  • Opus 4.5 = 200,000 token capacity (~150k words)
  • Every message, file read, tool result fills it
  • When full → forced to compact (loses information)
  • Quality degrades sharply after 40-50% full
🚨 The Quality Cliff at 50%

After 50% context usage: Claude gets lazier, makes more errors, forgets earlier decisions, gives repetitive suggestions. This is THE constraint that separates experts from beginners.

Why Context Management Is Your Highest-Leverage Skill

The multiplier effect:

  • Bad context management: 10 min of productivity per session before compaction → restart loop → 1 hour wasted
  • Good context management: 2+ hours of sustained quality → 10x more work done
  • The asymmetry: One mistake (reading a PDF) nukes your entire session. One good habit (slim claude.md) pays dividends every session forever.
  • Compound advantage: Experts preserve context → build more → learn faster → compound skill growth
# Check bucket status anytime
/context

# Strategic monitoring:
# - Start session: Should be <10% used
# - Mid-session: If >40%, consider clearing or switching terminals
# - Before big task: If >60%, start fresh terminal
📊 What Fills the Bucket Fastest (Ranked by Danger)

Token consumption ranked (worst offenders first):

  1. 🚨 PDFs: 1.8M tokens for 50-page doc (instant death: 900% of your bucket!)
  2. ⚠️ Large file reads: 10k+ line files = 15-30k tokens
  3. ⚠️ MCP servers: 10-50k tokens at session start (before you type anything)
  4. ⚠️ Tool result spam: JSON-heavy database/API responses
  5. ⚠️ Bloated claude.md: 5-15k tokens loaded EVERY session
  6. Normal: Conversation history grows ~500-1000 tokens per exchange
⚠️ Mark's Real Example (PDF Token Bomb)
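A rough pre-flight audit catches most of these offenders before they hit the bucket. This sketch uses the common "1 token ≈ 4 characters" rule of thumb, which is an approximation, not a real tokenizer; actual counts vary by content:

```python
# Pre-flight bucket audit before asking Claude to read a file.
BUCKET = 200_000        # Opus-class context window (per the text above)
QUALITY_CLIFF = 0.5     # degradation reported past ~50% usage

def estimate_tokens(text: str) -> int:
    """Heuristic: ~4 characters per token. An approximation only."""
    return len(text) // 4

def bucket_share(text: str, already_used: int = 0) -> float:
    """Fraction of the bucket a read would bring you to."""
    return (already_used + estimate_tokens(text)) / BUCKET

doc = "x" * 400_000     # roughly a very large file's worth of characters
share = bucket_share(doc)
print(f"{share:.0%} of bucket")   # this one read alone hits the cliff
print(share >= QUALITY_CLIFF)     # True -> offload it instead of reading
```

Running the same estimate on a real PDF's raw bytes is what exposes the "invisible formatting metadata" problem: the character count dwarfs the visible text.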

Reading a 50-page economic report consumed 100% of context in ONE operation. The file was 1.8 MILLION tokens because PDFs are full of invisible formatting metadata.

Solution: External API (Gemini 2.5 Flash with 1M context) + return markdown summary only = 98% token savings

The Hidden Costs & Second-Order Effects

What beginners miss:

  • MCP bloat cascade: You install 10 MCPs for "just in case" → Each session starts at 30% capacity → Only get 70% productivity → Never realize MCPs are the problem
  • The compaction amnesia spiral: Fill bucket → compact → lose key decisions → make mistakes based on forgotten context → waste time debugging → fill bucket faster next time
  • Thrashing pattern: Hit 80% → compact → resume → hit 80% again in 10 min → restart session → lose momentum
  • Opportunity cost: Every wasted token is a lost opportunity for Claude to hold MORE useful information
# SOLUTION: Offload large docs to external API

"Create a skill called 'read_large_doc' that:
1. Uses Gemini 2.5 Flash API (1M context window)
2. Reads the PDF file
3. Converts to markdown (strips hidden metadata)  
4. Returns 2-3 page summary with key points
5. Saves my Claude Code context for actual work

Put Gemini API key in .env file."

# Result: 1.8M tokens → ~5k token summary (>97% savings)
💡 Context Preservation Strategies (4 High-Leverage Tactics)

Strategy 1: External API Offloading

When: Any file >20 pages or >5k lines

How: Create Python script that uses Gemini/GPT, returns summary only

Savings: 90-98% token reduction

Strategy 2: Sub-Agent Delegation

When: Exploration tasks (codebase mapping, research)

How: Spin up sub-agent with virgin 200k context → it explores → reports back summary

Savings: Main session stays clean, sub-agent context is disposable

Strategy 3: Multi-Terminal Workflow

When: Mutually exclusive tasks (frontend/backend/testing)

How: Terminal 1 = Frontend, Terminal 2 = Backend, Terminal 3 = Testing

Savings: 3x the effective context (600k total vs 200k compressed)

Strategy 4: Slim Claude.md (Routing Pattern)

When: Always

How: Keep claude.md <2k tokens; use it as an index to other docs

Savings: 5-15k tokens per session start
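In practice the routing pattern can look like this (the playbook file names are illustrative):

```markdown
# claude.md  (routing index; keep under ~2k tokens)

## Project
One-paragraph description of the project and its stack.

## Routing (read on demand, not up front)
- Frontend conventions: read `playbooks/frontend.md`
- API and database: read `playbooks/backend.md`
- Testing and CI: read `playbooks/testing.md`

## Hard rules
- Run tests before declaring a task done.
- Never read PDFs directly; use the read_large_doc skill.
```

Claude only pays the token cost of a playbook when a task actually routes to it.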

The Compounding Leverage of These Strategies

Multiplier math:

  • Beginner: No optimization → 30min productive session → 6 restarts per 3hr coding block = 3hr actual work
  • Intermediate: Slim claude.md + avoiding PDFs → 60min sessions → 3 restarts = 4.5hr equivalent work (1.5x multiplier)
  • Expert: All 4 strategies + multi-terminal → 120min+ sessions → 0-1 restart = 7hr equivalent work in 3hr block (2.3x multiplier)
  • Annual impact: 2.3x multiplier × 500 coding hours/year = 650 "free" hours gained
# Audit your claude.md token usage
"Read my claude.md file and analyze:
1. How many tokens is it currently?
2. What content is repetitive or unnecessary?
3. What should be moved to separate playbooks?
4. Rewrite it to under 2k tokens using routing pattern
5. Show before/after token counts"

3 Tool Mastery: Read, Write, Edit, Search

Priority: Core Skill | Goal: Understand how Claude navigates code efficiently

intermediate 25-30 min
πŸ” The Tool Arsenal & When to Use Each
  • Read: Load file contents (⚠️ token-heavy)
  • Write: Create new files from scratch
  • Edit: Surgical string replacement (token-efficient)
  • Glob: Pattern-based file finding (*.ts, **/*.jsx)
  • Grep (ripgrep): Fast text search across codebase
Why Tool Choice Multiplies Your Productivity

The leverage cascade: Glob finds 5 relevant files out of 100 → Grep searches within those 5 → Read loads only the 1 file that matters → Edit makes surgical change → Total tokens: ~5k instead of 150k if you read everything.

Expert pattern: Always search before reading. Reading is expensive, searching is cheap.

# Smart tool workflow example:

"Fix the login button bug. Before reading ANY files:
1. Use glob to find all *.tsx files
2. Use grep to search for 'login' in those files
3. Read ONLY the file that contains the login button
4. Make the fix with surgical edit
5. Verify the change worked"

# This approach uses ~5k tokens vs ~50k if you read the whole codebase

📄 One-Page Quick Reference: 10 High-Leverage Principles & Hacks

Print this. Memorize this. Use this. This is your 80/20.

1. Context is King

The Rule: Never exceed 50% context usage

The Hack: Run /context every 15min. If >40%, switch terminals or clear.

ROI: 2-3x longer productive sessions

2. PDFs Are Poison

The Rule: Never directly read PDFs in Claude Code

The Hack: Create "read_large_doc" skill using Gemini API → 97% token savings

ROI: Prevents instant session death

3. Search Before Read

The Rule: Use glob/grep to narrow scope before reading files

The Hack: "Find it with grep, confirm with read" = 10x token efficiency

ROI: 80% reduction in wasted context

4. Slim Claude.md

The Rule: Keep claude.md under 2k tokens

The Hack: Use routing pattern: "For X, read playbook-X.md"

ROI: Save 5-15k tokens every session start

5. Multi-Terminal Mastery

The Rule: Separate concerns across terminals

The Hack: T1=Frontend, T2=Backend, T3=Testing = 3x effective context

ROI: Work 3 hours without compaction

6. MCP Minimalism

The Rule: Only use MCPs you need EVERY session

The Hack: Convert occasional MCPs to skills (just-in-time loading)

ROI: 20-40k token savings at session start

7. Sub-Agent Delegation

The Rule: Use sub-agents for exploration tasks

The Hack: Virgin 200k context for dirty work → summary only back to main

ROI: Explore 50k LOC without touching main context

8. Plan → Clear → Execute

The Rule: For complex builds, plan first then clear context

The Hack: Use Plan Mode → save plan.md → /clear → execute with fresh context

ROI: Build complex features without mid-build degradation

9. Permission Graduation

The Rule: Start "Ask mode", graduate to "YOLO" after learning

The Hack: 20 sessions in Ask mode = pattern recognition → safe to YOLO

ROI: 3x faster iteration after graduation

10. Session-End Capture

The Rule: Always capture learnings before closing session

The Hack: "Update claude.md with: [what we learned]" + git commit

ROI: Never relearn the same thing twice

🎯 Your Action Plan (Do This Next)

  1. RIGHT NOW: Run /context and audit your current usage
  2. TODAY: Audit your claude.md (make it <2k tokens)
  3. THIS WEEK: Create "read_large_doc" skill with Gemini API
  4. THIS MONTH: Practice multi-terminal workflows until natural
  5. ONGOING: Never exceed 50% context, capture learnings every session