Understanding How Claude Code Works Better Than 99% of Users
Paint-by-Numbers Guide | Leverage-Optimized Learning Path
Watch Original Video by Mark (1-hour deep dive)

This playbook distills Mark's 1-hour video into actionable steps, prioritized by leverage and temporal dependencies. Each section includes deeper insights ("why this matters") and teaching prompts to deepen understanding.
Priority: Foundation | Goal: Understand how components interact
start-here | 15-20 min

What makes Claude Code work:
Leverage insight: Claude Code isn't magic; it's smart orchestration of existing open-source tools. Understanding this means you can:
```
# View architecture in action
/context

# Shows:
# - System prompt overhead
# - claude.md impact
# - Current token usage
# - What's consuming your bucket
```
Paste this into any LLM to explore deeper:
Every Claude Code operation follows this pattern:
Why this matters: This loop is the difference between 10x productivity and frustration. Understanding it means:
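The loop itself isn't spelled out here, but the generic agentic pattern it refers to is: the model proposes a tool call, the tool runs, the result flows back into context, and this repeats until the model produces a final answer. A minimal, hypothetical Python sketch (the `model` and tool names are placeholders, not Claude Code internals):

```python
# Minimal sketch of a generic agentic loop (hypothetical illustration, not
# Claude Code's actual implementation): the model proposes tool calls, we
# execute them, and results flow back into context until the model is done.

def run_agent(model, tools, task, max_steps=10):
    context = [task]                      # everything the model has seen so far
    for _ in range(max_steps):
        action = model(context)           # model decides: tool call or final answer
        if action["type"] == "final":
            return action["answer"]
        result = tools[action["tool"]](action["args"])  # execute the tool
        context.append(result)            # note: every result consumes context tokens
    return None                           # step budget exhausted


# Usage with a stubbed model that greps once, then answers:
def fake_model(context):
    if len(context) == 1:
        return {"type": "tool", "tool": "grep", "args": "login"}
    return {"type": "final", "answer": f"found: {context[-1]}"}

tools = {"grep": lambda pattern: f"2 matches for '{pattern}'"}
print(run_agent(fake_model, tools, "fix the login bug"))
```

The key detail for the rest of this playbook: `context.append(result)` is where your token bucket drains. Every tool result stays in the loop's memory.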
Priority: HIGHEST | Goal: Master the 200k token "bucket" constraint
master-this-first | 30-35 min

Your context window is a bucket:
After 50% context usage: Claude gets lazier, makes more errors, forgets earlier decisions, gives repetitive suggestions. This is THE constraint that separates experts from beginners.
The multiplier effect:
```
# Check bucket status anytime
/context

# Strategic monitoring:
# - Start session: should be <10% used
# - Mid-session: if >40%, consider clearing or switching terminals
# - Before big task: if >60%, start a fresh terminal
```
Token consumption ranked (worst offenders first):
Reading a 50-page economic report consumed 100% of context in ONE operation. The file was 1.8 MILLION tokens because PDFs are full of invisible formatting metadata.
Solution: External API (Gemini 2.5 Flash with 1M context) + return markdown summary only = 98% token savings
What beginners miss:
```
# SOLUTION: Offload large docs to external API
"Create a skill called 'read_large_doc' that:
1. Uses Gemini 2.5 Flash API (1M context window)
2. Reads the PDF file
3. Converts to markdown (strips hidden metadata)
4. Returns a 2-3 page summary with key points
5. Saves my Claude Code context for actual work
Put the Gemini API key in a .env file."

# Result: 1.8M tokens → 5k token summary = 97% savings
```
When: Any file >20 pages or >5k lines
How: Create Python script that uses Gemini/GPT, returns summary only
Savings: 90-98% token reduction
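A minimal sketch of what such a "read_large_doc" script could look like, assuming the `google-generativeai` package and a `GEMINI_API_KEY` environment variable; the model name and prompt wording are illustrative, not prescribed by the video:

```python
# Sketch of a "read_large_doc" skill: offload a huge PDF to Gemini and return
# only a short markdown summary. Assumes the google-generativeai package and
# a GEMINI_API_KEY env var (e.g. loaded from .env); model name is illustrative.
import os

def approx_tokens(text: str) -> int:
    # Rough rule of thumb: ~4 characters per token for English text.
    return len(text) // 4

def read_large_doc(pdf_path: str) -> str:
    import google.generativeai as genai  # pip install google-generativeai
    genai.configure(api_key=os.environ["GEMINI_API_KEY"])
    model = genai.GenerativeModel("gemini-2.5-flash")  # large (1M-token) context
    pdf = genai.upload_file(pdf_path)    # Gemini ingests the raw file directly
    response = model.generate_content(
        [pdf, "Convert to markdown and return a 2-3 page summary of key points."]
    )
    return response.text                 # only this summary enters Claude's context
```

Expose this to Claude Code as a skill or script it can call: the millions of raw PDF tokens are burned in Gemini's context, and only the few-thousand-token summary ever lands in your session.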
When: Exploration tasks (codebase mapping, research)
How: Spin up a sub-agent with a virgin 200k context → it explores → reports back a summary
Savings: Main session stays clean, sub-agent context is disposable
When: Mutually exclusive tasks (frontend/backend/testing)
How: Terminal 1 = Frontend, Terminal 2 = Backend, Terminal 3 = Testing
Savings: 3x the effective context (600k total vs 200k compressed)
When: Always
How: Keep claude.md <2k tokens, use it as index to other docs
Savings: 5-15k tokens per session start
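What a routing-pattern claude.md can look like in practice (a hypothetical sketch; the project, stack, and file names are illustrative):

```markdown
# Project: Acme App

## Stack
Next.js 14 (frontend), FastAPI (backend), Postgres, pnpm.

## Routing (read on demand, not up front)
- For frontend conventions, read docs/playbook-frontend.md
- For API/schema work, read docs/playbook-backend.md
- For testing patterns, read docs/playbook-testing.md
- For deployment, read docs/playbook-deploy.md

## Hard rules
- Never commit directly to main.
- Run `pnpm test` before claiming a fix works.
```

The file itself stays a few hundred tokens; the playbooks it points to are loaded only when a task actually needs them.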
Multiplier math:
```
# Audit your claude.md token usage
"Read my claude.md file and analyze:
1. How many tokens is it currently?
2. What content is repetitive or unnecessary?
3. What should be moved to separate playbooks?
4. Rewrite it to under 2k tokens using the routing pattern
5. Show before/after token counts"
```
Priority: Core Skill | Goal: Understand how Claude navigates code efficiently
intermediate | 25-30 min

The leverage cascade: Glob finds 5 relevant files out of 100 → Grep searches within those 5 → Read loads only the 1 file that matters → Edit makes a surgical change. Total tokens: ~5k instead of the ~150k you'd burn reading everything.
Expert pattern: Always search before reading. Reading is expensive, searching is cheap.
```
# Smart tool workflow example:
"Fix the login button bug. Before reading ANY files:
1. Use glob to find all *.tsx files
2. Use grep to search for 'login' in those files
3. Read ONLY the file that contains the login button
4. Make the fix with a surgical edit
5. Verify the change worked"

# This approach uses ~5k tokens vs ~50k if you read the whole codebase
```
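The glob → grep → read cascade can be sketched in plain Python to show why it's cheap (a hypothetical illustration of the pattern, not Claude Code's actual tool implementation):

```python
# Sketch of the glob -> grep -> read cascade: narrow the candidate set with
# cheap operations before paying for an expensive full-file read.
from pathlib import Path

def glob_files(root: str, pattern: str) -> list:
    return sorted(Path(root).rglob(pattern))          # cheap: filenames only

def grep_files(paths: list, needle: str) -> list:
    # Still cheap: scan file contents locally, keep only matching files;
    # none of this text has to enter the model's context.
    return [p for p in paths if needle in p.read_text(errors="ignore")]

def cascade(root: str, pattern: str, needle: str):
    candidates = grep_files(glob_files(root, pattern), needle)
    if not candidates:
        return None
    return candidates[0].read_text()                  # expensive read: one file only
```

Only the final `read_text` result would ever be loaded into context; globbing and grepping prune the other 99 files for free.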
Print this. Memorize this. Use this. This is your 80/20.
The Rule: Never exceed 50% context usage
The Hack: Run /context every 15min. If >40%, switch terminals or clear.
ROI: 2-3x longer productive sessions
The Rule: Never directly read PDFs in Claude Code
The Hack: Create a "read_large_doc" skill using the Gemini API → 97% token savings
ROI: Prevents instant session death
The Rule: Use glob/grep to narrow scope before reading files
The Hack: "Find it with grep, confirm with read" = 10x token efficiency
ROI: 80% reduction in wasted context
The Rule: Keep claude.md under 2k tokens
The Hack: Use routing pattern: "For X, read playbook-X.md"
ROI: Save 5-15k tokens every session start
The Rule: Separate concerns across terminals
The Hack: T1=Frontend, T2=Backend, T3=Testing = 3x effective context
ROI: Work 3 hours without compaction
The Rule: Only use MCPs you need EVERY session
The Hack: Convert occasional MCPs to skills (just-in-time loading)
ROI: 20-40k token savings at session start
The Rule: Use sub-agents for exploration tasks
The Hack: Virgin 200k context for the dirty work → summary only back to main
ROI: Explore 50k LOC without touching main context
The Rule: For complex builds, plan first then clear context
The Hack: Use Plan Mode → save plan.md → /clear → execute with fresh context
ROI: Build complex features without mid-build degradation
The Rule: Start in "Ask mode"; graduate to "YOLO" after learning
The Hack: 20 sessions in Ask mode = pattern recognition → safe to YOLO
ROI: 3x faster iteration after graduation
The Rule: Always capture learnings before closing session
The Hack: "Update claude.md with: [what we learned]" + git commit
ROI: Never relearn the same thing twice
Next step: run /context and audit your current usage.