Section 01

Core Premise — HC Pages as Knowledge Base

Traditional AI assistants require a dedicated knowledge pipeline: documents are chunked, embedded into vectors, stored in a vector database, and retrieved via similarity search (RAG). The HC Page Agent eliminates this entire stack.

Key Insight

Every HC page already contains structured JSON metadata. The hc-metadata block is the embedding. The neural registry is the vector index. The page URL is the retrieval mechanism. The knowledge base already exists — it's the published pages themselves.
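
As a rough illustration of why no separate index is needed, the sketch below pulls an hc-metadata block out of a page's HTML and flattens its title, summary, and tags into searchable text. The embedding mechanism assumed here (a `<script type="application/json" id="hc-metadata">` element), the `HcMetadata` shape, and both function names are assumptions for illustration; the page itself does not specify how the block is stored.

```typescript
// Hypothetical shape: only title, summary, and tags are assumed here.
interface HcMetadata {
  title?: string;
  summary?: string;
  tags?: string[];
  [key: string]: unknown;
}

// ASSUMPTION: the block lives in <script type="application/json" id="hc-metadata">.
function extractHcMetadata(html: string): HcMetadata | null {
  const match = html.match(
    /<script\b[^>]*\bid=["']hc-metadata["'][^>]*>([\s\S]*?)<\/script>/i
  );
  if (!match) return null;
  try {
    return JSON.parse(match[1] ?? "") as HcMetadata;
  } catch {
    return null; // malformed JSON is treated as "no metadata"
  }
}

// Title, summary, and tags flattened into one lowercase string for keyword matching.
function searchableText(meta: HcMetadata): string {
  return [meta.title ?? "", meta.summary ?? "", ...(meta.tags ?? [])]
    .join(" ")
    .toLowerCase();
}
```

A keyword lookup then reduces to `searchableText(meta).includes(term.toLowerCase())`, with no embedding call involved.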

Why This Works

HC vs. RAG: Side-by-Side

Traditional RAG Pipeline

  • Upload documents to ingestion system
  • Chunk text (512–1024 tokens per chunk)
  • Generate embeddings (OpenAI/Cohere API call per chunk)
  • Store in vector DB (Pinecone/Weaviate/Chroma)
  • Similarity search on every query
  • Re-index when documents change
  • Lost context at chunk boundaries
  • $50–200/month for vector DB

HC Page Agent

  • Pages published via existing HC pipeline
  • No chunking — metadata IS the structured summary
  • No embeddings — tags + title = searchable
  • No vector DB — registry is the index
  • Direct URL fetch on every query
  • Instant updates on publish
  • Full page context preserved
  • $0 additional infrastructure
Section 02

Discovery Protocol — The Registry Crawl

When a user asks the agent a question, it follows a structured discovery protocol to find relevant context. The protocol is designed to minimize API calls while maximizing coverage.

  1. Read Own Page (FREE)
     Agent reads the hc-metadata from the page it's embedded on. This data is already in the DOM, so there is zero network cost. Provides immediate context: page title, type, tags, summary, and any structured data (blockers, actions, meetings array).
  2. Resolve Registry Stack (FREE)
     Read registry_stack from hc-metadata to find parent registries. The stack is a 5-level hierarchy: meta → standard → organization → project → user. Fallback: check registries object, then site root /.
  3. Fetch Registry Metadata (1 API CALL)
     Fetch the parent registry page and parse its hc-metadata. This returns the full catalog: meetings[], artifacts[], blockers.urgent[], blockers.proactive[]. Each entry has name, URL, summary, tags, and date.
  4. Match Query to Entries (FREE)
     Agent scans registry entries for relevance: keyword match in title/summary/tags, person filter (attendee lists), date range filter. This narrows hundreds of entries to 1–5 candidates without any API calls.
  5. Deep Read Linked Pages (1–5 API CALLS)
     For each relevant candidate, fetch the page and extract its hc-metadata + hc-context-public. This gives the agent deep context: full meeting notes, detailed action items, decision rationale, attendee lists, tool specs.
  6. Synthesize Answer (1 LLM CALL)
     Agent combines own-page context + registry overview + deep-read details into a cited answer. Every claim references a specific page title + date. Never fabricates data.
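
The registry-resolution step, including its fallback chain, can be sketched roughly as follows. The field names registry_stack and registries come from the text; their value types (an array of URLs or paths, and a level-to-URL map) and the function name `resolveRegistryUrls` are assumptions made for illustration.

```typescript
// Hypothetical metadata shape; value types are assumed, not confirmed by the source.
interface PageMetadata {
  registry_stack?: string[];           // assumed: registry URLs or site-relative paths
  registries?: Record<string, string>; // assumed: level name -> URL or path
}

// Returns absolute registry URLs to try, in order, without duplicates.
function resolveRegistryUrls(meta: PageMetadata, pageUrl: string): string[] {
  let candidates: string[];
  if (meta.registry_stack && meta.registry_stack.length > 0) {
    candidates = meta.registry_stack;            // preferred source
  } else if (meta.registries && Object.keys(meta.registries).length > 0) {
    candidates = Object.values(meta.registries); // first fallback
  } else {
    candidates = ["/"];                          // last resort: site root
  }

  const seen = new Set<string>();
  for (const candidate of candidates) {
    try {
      // Resolves relative paths against the current page URL
      seen.add(new URL(candidate, pageUrl).href);
    } catch {
      // Skip entries that cannot be parsed as URLs
    }
  }
  return Array.from(seen);
}
```

Because this step only reads data already present in the page, it stays free of network calls; the first fetch happens when one of the returned URLs is requested.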

Discovery Flow Diagram

User Question
     |
     v
[ChatAgent Widget] -----> [Own Page hc-metadata] (instant, in DOM)
     |
     v
[POST /api/hc/registry-agent]
     |
     +--> [Registry hc-metadata] (1 fetch)
     |        |
     |        +--> meetings[]: 22 entries with URL, summary, tags
     |        +--> artifacts[]: core docs, sprint pages
     |        +--> blockers.urgent[]: active blockers
     |        +--> blockers.proactive[]: action items
     |
     +--> [Match & Filter] (local, no API)
     |        |
     |        +--> "What did Jason discuss?" => filter by attendee
     |        +--> "Day 3 blockers?" => filter by tag + date
     |        +--> "LinkedIn strategy?" => keyword search
     |
     +--> [Deep Read 1-5 Pages] (targeted fetches)
     |        |
     |        +--> Parse hc-metadata for meeting_data
     |        +--> Parse hc-context-public for decisions, tools
     |
     v
[Cited Answer with Page References]
Section 03

Existing Implementation — What's Built Today

Three components already exist that implement parts of this architecture: the ChatAgent component, the Registry Agent API, and the Skill Injector. The gap is connecting them with tool-use capability.

A. ChatAgent Component

templates/athio-registry-jsx.jsx : lines 237–348

A React component embedded in the Athio JSX registry. Renders as a floating action button (bottom-right). On click, opens a chat panel with message history. Currently:

Current Limitation

The ChatAgent sends the entire registry metadata as a flat blob. The agent cannot selectively fetch individual pages. If the registry has 22 entries, all 22 summaries are jammed into the system prompt, but none of the linked pages' deep content (meeting notes, decisions, action items) is accessible.

B. Registry Agent API

app/api/hc/registry-agent/route.ts

Server-side API that proxies to the Anthropic API with registry context. Supports four actions:

Action  | Purpose                      | Input                        | Output
chat    | Free-form Q&A about registry | User message + registry data | Natural language answer
assess  | Analyze new page impact      | Page metadata                | Blockers, actions, decisions extracted
rebuild | Re-aggregate linked pages    | Registry ID                  | Inconsistency report
cleanup | Flag stale items             | Registry data                | Suggested removals
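
One way to model the four actions is a discriminated union keyed on the action name, validated before any LLM call is made. The sketch below is illustrative only: the field names (`message`, `registry`, `page_metadata`, `registry_id`) are assumptions derived from the Input column, not the route's actual request schema.

```typescript
// Hypothetical request shapes, one per action.
type AgentRequest =
  | { action: "chat"; message: string; registry: object }
  | { action: "assess"; page_metadata: object }
  | { action: "rebuild"; registry_id: string }
  | { action: "cleanup"; registry: object };

const isObject = (v: unknown): v is Record<string, unknown> =>
  typeof v === "object" && v !== null && !Array.isArray(v);

// Returns a typed request, or null when the body does not match any action.
function parseAgentRequest(body: unknown): AgentRequest | null {
  if (!isObject(body)) return null;
  const { action, message, registry, page_metadata, registry_id } = body;

  switch (action) {
    case "chat":
      return typeof message === "string" && isObject(registry)
        ? { action: "chat", message, registry }
        : null;
    case "assess":
      return isObject(page_metadata) ? { action: "assess", page_metadata } : null;
    case "rebuild":
      return typeof registry_id === "string" ? { action: "rebuild", registry_id } : null;
    case "cleanup":
      return isObject(registry) ? { action: "cleanup", registry } : null;
    default:
      return null; // unknown or missing action
  }
}
```

A route handler could respond with HTTP 400 whenever this returns null, so malformed requests never reach the Anthropic API.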

C. Skill Injector (Registry Walk)

lib/hc/skill-injector.ts

The page enhancer already implements the exact discovery pattern the agent needs, but on the client side for cross-reference cards:

  1. Reads registry_stack from page's hc-metadata
  2. Fetches each registry in parallel (deduplicates by URL)
  3. For meta-registries, recursively discovers sub-registries (depth-1)
  4. Matches current page's tags against registry entries
  5. Injects cross-reference cards (max 8) with level badges
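
The tag-matching and capping steps above (4 and 5) might look roughly like this. It is a sketch under stated assumptions, not the code in skill-injector.ts: the `RegistryEntry` shape, the function name, and the ranking by number of shared tags are all illustrative choices.

```typescript
interface RegistryEntry {
  name: string;
  url: string;
  tags?: string[];
}

interface CrossReference {
  entry: RegistryEntry;
  sharedTags: string[];
}

// Rank entries by the number of tags shared with the current page; keep the top `max`.
function matchCrossReferences(
  pageTags: string[],
  entries: RegistryEntry[],
  currentUrl: string,
  max = 8
): CrossReference[] {
  const wanted = new Set(pageTags.map((t) => t.toLowerCase()));
  const seenUrls = new Set<string>();
  const matches: CrossReference[] = [];

  for (const entry of entries) {
    // Skip the page itself and entries already seen under the same URL
    if (entry.url === currentUrl || seenUrls.has(entry.url)) continue;
    seenUrls.add(entry.url);

    const sharedTags = (entry.tags ?? []).filter((t) => wanted.has(t.toLowerCase()));
    if (sharedTags.length > 0) matches.push({ entry, sharedTags });
  }

  // Most shared tags first; sort is stable, so ties keep registry order
  return matches
    .sort((a, b) => b.sharedTags.length - a.sharedTags.length)
    .slice(0, max);
}
```

The same matching logic is what the agent's registry search needs, which is why the walk is a natural reuse candidate.
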
Reuse Opportunity

The toHcFormat(html, sourceUrl) function in skill-injector.ts converts any HC page to a plain-text .hc format optimized for LLM consumption. This is exactly what the agent needs to read linked pages — structured text extraction from HTML, already built.

D. HC Format Export

The toHcFormat function produces output like:

output.hc
# Meeting Title — HC v1.3.3
> Source: https://ideas.asapai.net/meeting-slug
> Type: meeting | ID: artifact-id-here

---
## hc-metadata
{ "title": "...", "meeting_data": { "blockers": [...], "decisions": [...] } }

---
## hc-instructions
How to use this meeting page...

---
## hc-context-public
{ "attendees": [...], "action_items": [...], "decisions": [...] }
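
To show how a consumer could work with this format, the sketch below splits .hc text back into its named sections. `parseHcSections` is a hypothetical helper, not part of skill-injector.ts, and it assumes sections are delimited exactly as in the sample above: a line containing only `---` followed by a `## name` heading.

```typescript
// Hypothetical helper: maps section name (e.g. "hc-metadata") to its raw body text.
function parseHcSections(hcText: string): Record<string, string> {
  const sections: Record<string, string> = {};

  // ASSUMPTION: sections are separated by lines containing only '---'
  for (const chunk of hcText.split(/^---[ \t]*$/m)) {
    const lines = chunk.trim().split("\n");
    const heading = (lines[0] ?? "").match(/^##\s+(.+)$/);
    if (!heading || !heading[1]) continue; // the title/source preamble has no '## ' heading
    sections[heading[1].trim()] = lines.slice(1).join("\n").trim();
  }
  return sections;
}
```

With this, `JSON.parse(parseHcSections(text)["hc-metadata"])` would recover the metadata object, provided that section holds valid JSON.
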
Section 04

Target Architecture — Tool-Using Agent

The key upgrade is giving the agent tool-use capability via the Anthropic Messages API. Instead of cramming all context into the system prompt, the agent decides what to read based on the user's question.

Architecture Diagram

[Browser: ChatAgent Widget]
     |
     |  POST /api/hc/registry-agent
     |  { message, page_metadata }
     v
[API Route: registry-agent]
     |
     |  1. Build system prompt (page context + tool definitions)
     |  2. Call Anthropic Messages API with tool_use enabled
     v
[Claude Sonnet + Tools]
     |
     |  Agent decides which tools to call:
     |
     +--> read_own_page()            Read page hc-metadata
     +--> list_registry_entries()    Fetch registry catalog
     +--> read_page(url)             Fetch any linked page
     +--> search_registry(query)     Filter entries by keyword/tag/person
     +--> walk_meta_registry()       Discover all registries
     |
     |  Tool results fed back to Claude
     |  Claude generates final answer with citations
     v
[Response streamed to ChatAgent]

Tool-Use Loop (Server-Side)

The registry-agent API handles the tool-use loop server-side. The client sends a single request and receives a final answer — no client-side tool orchestration needed.

Pseudo-code: Tool Loop
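import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const MAX_ITERATIONS = 3;   // tool-loop rounds before a forced answer
const MAX_PAGE_FETCHES = 5; // read_page calls allowed per turn

// Concatenate the text blocks of a response (the first block is not guaranteed to be text)
const textOf = (content) =>
  content.filter((b) => b.type === "text").map((b) => b.text).join("");

async function handleChat(message, pageMetadata) {
  // Tool definitions (name, description, input_schema) for the five agent tools
  const tools = [readOwnPage, listRegistryEntries, readPage, searchRegistry, walkMetaRegistry];
  const systemPrompt = buildSystemPrompt(pageMetadata);
  const request = { model: "claude-sonnet-4-20250514", max_tokens: 2000, system: systemPrompt, tools };

  const messages = [{ role: "user", content: message }];
  let totalFetches = 0;

  // Tool-use loop (bounded to prevent runaway)
  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await anthropic.messages.create({ ...request, messages });

    // Anything other than a tool request (end_turn, max_tokens, ...) ends the loop
    if (response.stop_reason !== "tool_use") {
      return textOf(response.content);
    }

    // Echo the assistant turn once, then answer every tool_use block in a single user message
    messages.push({ role: "assistant", content: response.content });
    const toolResults = [];
    for (const block of response.content) {
      if (block.type !== "tool_use") continue;

      let result;
      if (block.name === "read_page" && totalFetches >= MAX_PAGE_FETCHES) {
        // Over budget: skip the fetch but still return a tool_result, which the API requires
        result = "Error: page fetch limit reached for this turn.";
      } else {
        if (block.name === "read_page") totalFetches++;
        result = await executeTool(block.name, block.input);
      }
      toolResults.push({ type: "tool_result", tool_use_id: block.id, content: result });
    }
    messages.push({ role: "user", content: toolResults });
  }

  // Iterations exhausted: force a final answer with tool use disabled
  const final = await anthropic.messages.create({ ...request, messages, tool_choice: { type: "none" } });
  return textOf(final.content);
}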
Section 05

Agent Tools Specification

Tool | Input | Output | Cost
read_own_page | None (uses page metadata passed in request) | Parsed hc-metadata + hc-context-public JSON | 0 fetches
list_registry_entries | registry_url? (defaults to registry_stack[0]) | Array of entries: {name, url, summary, tags, date, type} | 1 fetch
read_page | page_url (must be on user's domain) | Page's hc-metadata + hc-context-public as structured text via toHcFormat() | 1 fetch
search_registry | query, filters?: {tags[], person, date_from, date_to} | Filtered entries matching criteria | 0–1 fetches
walk_meta_registry | meta_registry_url? (defaults to /meta-registry) | Array of registry descriptors: {name, url, mode, entry_count, level} | 1 fetch
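
For reference, two rows of this table are expressed below as tool definitions in the shape the Anthropic Messages API accepts (a name, a description, and a JSON Schema under input_schema). The descriptions and the local `ToolDefinition` type are illustrative assumptions; only the tool names and parameter names come from the table.

```typescript
// Minimal local type mirroring the tool shape: name, description, JSON Schema input.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

const readPageTool: ToolDefinition = {
  name: "read_page",
  description:
    "Fetch one HC page on the user's domain and return its hc-metadata and hc-context-public as structured text.",
  input_schema: {
    type: "object",
    properties: {
      page_url: { type: "string", description: "Absolute URL of the page to read" },
    },
    required: ["page_url"],
  },
};

const searchRegistryTool: ToolDefinition = {
  name: "search_registry",
  description: "Filter registry entries by keyword, tags, person, or date range.",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Keyword to match in name, summary, or tags" },
      filters: {
        type: "object",
        properties: {
          tags: { type: "array", items: { type: "string" } },
          person: { type: "string" },
          date_from: { type: "string", description: "Start date (assumed ISO 8601)" },
          date_to: { type: "string", description: "End date (assumed ISO 8601)" },
        },
      },
    },
    required: ["query"],
  },
};

const tools: ToolDefinition[] = [readPageTool, searchRegistryTool];
```

The remaining three tools would follow the same pattern, with optional parameters simply left out of `required`.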

Tool: read_page — Implementation Detail

This is the most important tool. It fetches any HC page by URL and returns structured content for the LLM.

Implementation
// toHcFormat is the existing export from lib/hc/skill-injector.ts;
// getUserDomains is assumed to return the hostnames the user owns.
async function readPage(pageUrl: string, userId: string): Promise<string> {
  // Parse first: new URL() throws on malformed input
  let url: URL;
  try {
    url = new URL(pageUrl);
  } catch {
    return "Error: Invalid page URL.";
  }

  // Validate URL is on an allowed domain
  const allowedDomains = await getUserDomains(userId);
  if (!allowedDomains.includes(url.hostname)) {
    return "Error: Cannot read pages from unauthorized domain.";
  }

  // Fetch the page HTML
  const res = await fetch(url);
  if (!res.ok) return `Error: Page returned ${res.status}`;

  const html = await res.text();

  // Convert to LLM-optimized format using existing toHcFormat()
  const hcText = toHcFormat(html, pageUrl);

  // Truncate to prevent context window overflow (max 8,000 characters per page)
  return hcText.substring(0, 8000);
}
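
The domain check is the security-critical part of this tool, so it is worth isolating. The sketch below is one possible form of it; `isAllowedPageUrl` is a hypothetical helper, and the choice of exact hostname matching plus an http(s)-only rule is an assumption about the intended policy rather than a confirmed requirement.

```typescript
// Returns true only for http(s) URLs whose hostname exactly matches an allowed domain.
function isAllowedPageUrl(pageUrl: string, allowedDomains: string[]): boolean {
  let url: URL;
  try {
    url = new URL(pageUrl);
  } catch {
    return false; // malformed URL
  }

  // Reject file:, data:, javascript:, and any other non-web scheme
  if (url.protocol !== "https:" && url.protocol !== "http:") return false;

  // Exact match only: a substring or suffix test would accept look-alike hosts
  const allowed = new Set(allowedDomains.map((d) => d.toLowerCase()));
  return allowed.has(url.hostname); // URL parsing already lowercases the hostname
}
```

Exact matching matters here: a check such as `hostname.includes(domain)` would accept a host like `ideas.asapai.net.evil.example`, which defeats the restriction.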

Tool: search_registry — Implementation Detail

Implementation
function searchRegistry(
  registryData: any,
  query: string,
  filters?: { tags?: string[], person?: string, date_from?: string, date_to?: string }
): any[] {
  const allEntries = [
    ...(registryData.meetings || []),
    ...(registryData.artifacts || []),
  ];

  return allEntries.filter(entry => {
    // Keyword match
    if (query) {
      const text = [entry.name, entry.summary, ...(entry.tags || [])].join(" ").toLowerCase();
      if (!text.includes(query.toLowerCase())) return false;
    }

    // Person filter
    if (filters?.person) {
      const people = [...(entry.attendees || []), entry.owner || ""].map(s => s.toLowerCase());
      if (!people.some(p => p.includes(filters.person.toLowerCase()))) return false;
    }

    // Date range filter
    if (filters?.date_from) {
      const entryDate = new Date(entry.meeting_date || entry.added);
      if (entryDate < new Date(filters.date_from)) return false;
    }
    if (filters?.date_to) {
      const entryDate = new Date(entry.meeting_date || entry.added);
      if (entryDate > new Date(filters.date_to)) return false;
    }

    // Tag filter
    if (filters?.tags?.length) {
      const entryTags = (entry.tags || []).map(t => t.toLowerCase());
      if (!filters.tags.some(t => entryTags.includes(t.toLowerCase()))) return false;
    }

    return true;
  });
}
Section 06

Security, Limits & Context Management

Security Policies

Policy | Rule | Rationale
Domain Restriction | Agent can only fetch pages from domains the authenticated user owns | Prevents data exfiltration: the agent can't read arbitrary URLs
Auth Required | Registry-agent API requires valid Supabase session | No anonymous access to LLM-powered features
Read-Only (v1) | Agent cannot write, update, or delete any data | Safety first; write capability deferred to v2
API Key Server-Side | Anthropic API key stored as ANTHROPIC_API_KEY env var on Vercel, never sent to client | Prevent key leakage

Rate Limits & Budgets

Limit | Value | Notes
Page fetches per turn | 5 max | Prevents runaway crawls; usually 1–3 suffice
Content per page | 8,000 chars | toHcFormat() output is truncated to this length in read_page
Total context per turn | ~50K tokens | System prompt + tools + messages + tool results
Tool loop iterations | 3 max | Agent gets 3 rounds of tool calls before forced response
API calls per user/minute | 10 | Standard rate limit via Vercel edge middleware
Max output tokens | 2,000 | Keeps responses focused and fast
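
The per-user limit of 10 calls per minute can be illustrated with a sliding-window counter. This is a minimal in-memory sketch, not the Vercel edge middleware mentioned in the table: the class name is hypothetical, and in a serverless deployment the hit log would need to live in a shared store, since in-memory state is not shared across instances.

```typescript
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  // Defaults match the table: 10 calls per 60 seconds. The clock is injectable for testing.
  constructor(limit = 10, windowMs = 60000, now: () => number = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the call if the user is under the limit.
  allow(userId: string): boolean {
    const current = this.now();
    const cutoff = current - this.windowMs;

    // Keep only timestamps that still fall inside the window
    const recent = (this.hits.get(userId) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(current);

    this.hits.set(userId, recent);
    return allowed;
  }
}
```

A route handler would call `limiter.allow(userId)` before invoking the model and return HTTP 429 when it yields false.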

Context Window Strategy

Budget Allocation

Claude Sonnet's 200K context window is allocated as follows:

System prompt: ~2K tokens (instructions + page metadata summary)

Tool definitions: ~1.5K tokens (5 tools with descriptions)

Registry overview: ~5–15K tokens (depends on entry count)

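Deep-read pages: ~8K tokens per page × 5 max = 40K tokens (a conservative ceiling; the 8,000-character cap per page will usually produce fewer tokens than this)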

Conversation history: ~5K tokens (recent messages)

Output budget: 2K tokens

Total worst case: ~65K tokens — well within 200K limit.
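
The worst-case total can be checked by summing the upper bound of each line item. The snippet below only restates the figures above; the variable names are illustrative.

```typescript
// Worst-case figures from the budget above, in thousands of tokens.
const budgetK = {
  systemPrompt: 2,
  toolDefinitions: 1.5,
  registryOverview: 15, // upper end of the 5-15K range
  deepReadPages: 8 * 5, // 8K per page, 5 pages max
  conversationHistory: 5,
  output: 2,
};

const worstCaseK = Object.values(budgetK).reduce((sum, k) => sum + k, 0);
const contextWindowK = 200;
const headroomK = contextWindowK - worstCaseK;

console.log(`worst case: ${worstCaseK}K, headroom: ${headroomK}K`);
// prints "worst case: 65.5K, headroom: 134.5K"
```

The sum is 65.5K, which rounds to the ~65K quoted above and leaves roughly two thirds of the window unused.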

Section 07

Open Questions & Design Decisions

# | Question | Options | Recommendation
1 | Registry data caching between messages? | Fresh fetch each time vs. cache for session duration | Cache for 5 minutes. Registry changes are infrequent during a conversation.
2 | Cross-registry discovery? | Restrict to own registry vs. allow meta-registry walk | Allow meta-registry walk but only for user's own domains. Enables "what's happening across all registries?" queries.
3 | Write operations in v2? | Read-only forever vs. allow dismiss/annotate | v2: Allow dismiss + add-note. Never allow delete or structural changes via agent.
4 | Streaming responses? | Wait for full response vs. SSE streaming | SSE streaming for chat action. Non-streaming for assess/rebuild/cleanup (need structured JSON output).
5 | Conversation persistence? | localStorage vs. server session vs. ephemeral | localStorage with 10-message window. Cheap, no server state, clears on page navigation (acceptable for page-scoped agent).
6 | Agent on non-registry pages? | Registry-only vs. any HC page | Any HC page with a registry_stack. Agent discovers context by walking up to its registry. A meeting page agent can answer questions about the meeting + related context.
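
The 5-minute registry cache recommended in question 1 could be as small as the sketch below. `TtlCache` is a hypothetical class, shown in-memory with an injectable clock; where the cache actually lives (server memory, edge store, or client) is left open by the recommendation.

```typescript
class TtlCache<T> {
  private readonly entries = new Map<string, { value: T; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  // Default TTL is 5 minutes, matching the recommendation. The clock is injectable for testing.
  constructor(ttlMs = 5 * 60 * 1000, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  get(key: string): T | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // evict stale entries lazily
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: T): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}
```

Keyed by registry URL, a cache like this would let follow-up questions in the same conversation skip the registry fetch, at the cost of seeing registry updates up to five minutes late.
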
Section 08

Rollback & Recovery

This page is a new artifact — it creates no risk to existing pages. Recovery strategy:

Layer | Mechanism | Action
Git | Source HTML committed before publish | git revert to remove temp file
Supabase | page_versions table auto-snapshots on registry update | Restore previous version via API or SQL Editor
Registry | Auto-cascade creates version before mutating registry | If the registry entry is unwanted, restore registry from version
Simple Delete | This is a standalone page with no dependencies | Delete from Supabase pages table (zero blast radius)
Pre-Publish Checkpoint

Git HEAD at time of generation: e715500. No existing files modified. The temp file tmp-hc-page-agent-architecture.html is the only new artifact. Registry entry can be independently reverted.