HC Page Agent: Architecture & Discovery Protocol

Section 01

Core Premise — HC Pages as Knowledge Base

Traditional AI assistants require a dedicated knowledge pipeline: documents are chunked, embedded into vectors, stored in a vector database, and retrieved via similarity search (RAG). The HC Page Agent eliminates this entire stack.

Key Insight

Every HC page already contains structured JSON metadata. The hc-metadata block is the embedding. The neural registry is the vector index. The page URL is the retrieval mechanism. The knowledge base already exists — it's the published pages themselves.

Why This Works

hc-metadata — Structured JSON with title, tags, summary, meeting data, action items. This IS the structured representation that RAG tries to create from unstructured text.
hc-instructions — Tells the agent HOW to interpret the page. No prompt engineering needed per document — each page carries its own instructions.
hc-context-public — Deep structured data (attendees, decisions, blockers, tool specs). Machine-readable by design.
Neural registry — A page that indexes other pages, with arrays of entries including URL, name, summary, tags, and dates. This is the table of contents an agent can scan to find relevant pages.
Instant updates — Publish a page, and the agent sees it on the next query. No re-indexing, no embedding pipeline, no stale cache.
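
To make the mapping concrete, the shapes described above might be typed roughly as follows. This is a minimal sketch: only title, tags, summary, name, URL, and date are named in this document, so the remaining fields, the date format, and the `findByTag` helper are assumptions for illustration.

```typescript
// Hypothetical shapes. Only title, tags, summary, name, url, and date are
// named in this document; everything else here is an assumption.
interface HcMetadata {
  title: string;
  tags: string[];
  summary: string;
  registry_stack?: string[]; // assumed: list of parent registry URLs
}

interface RegistryEntry {
  name: string;
  url: string;
  summary: string;
  tags: string[];
  date: string; // assumed ISO 8601, e.g. "2025-01-15"
}

// A registry is itself an HC page whose metadata carries entry arrays.
interface RegistryMetadata extends HcMetadata {
  meetings: RegistryEntry[];
  artifacts: RegistryEntry[];
}

// "Retrieval" is a scan of the registry; the matching entry's URL is then fetched.
function findByTag(registry: RegistryMetadata, tag: string): RegistryEntry[] {
  const wanted = tag.toLowerCase();
  return registry.meetings
    .concat(registry.artifacts)
    .filter((entry) => entry.tags.some((t) => t.toLowerCase() === wanted));
}
```

The point of the sketch is that the lookup is a plain array scan over data the pages already publish, with no embedding step in between.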

HC vs. RAG: Side-by-Side

Traditional RAG Pipeline

Upload documents to ingestion system
Chunk text (512–1024 tokens per chunk)
Generate embeddings (OpenAI/Cohere API call per chunk)
Store in vector DB (Pinecone/Weaviate/Chroma)
Similarity search on every query
Re-index when documents change
Lost context at chunk boundaries
$50–200/month for vector DB

HC Page Agent

Pages published via existing HC pipeline
No chunking — metadata IS the structured summary
No embeddings — tags + title = searchable
No vector DB — registry is the index
Direct URL fetch on every query
Instant updates on publish
Full page context preserved
$0 additional infrastructure

Section 02

Discovery Protocol — The Registry Crawl

When a user asks the agent a question, it follows a structured discovery protocol to find relevant context. The protocol is designed to minimize API calls while maximizing coverage.

Read Own Page (FREE)

Agent reads the hc-metadata from the page it's embedded on. This data is already in the DOM — zero network cost. Provides immediate context: page title, type, tags, summary, and any structured data (blockers, actions, meetings array).
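
As a rough illustration of this step, the sketch below pulls a named JSON block out of a page's HTML. It assumes each hc-* block is embedded as a `<script type="application/json" id="...">` element; this document does not specify the actual embedding, and `extractJsonBlock` is a hypothetical helper.

```typescript
// Assumption: each hc-* block is a <script type="application/json" id="...">
// element. The real embedding format is not specified in this document.
function extractJsonBlock(html: string, blockId: string): unknown {
  // Escape regex metacharacters in the id before building the pattern
  const escapedId = blockId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    "<script[^>]*\\bid=[\"']" + escapedId + "[\"'][^>]*>([\\s\\S]*?)</script>",
    "i"
  );
  const match = pattern.exec(html);
  if (!match) return null; // block not present on this page
  try {
    return JSON.parse(match[1]);
  } catch {
    return null; // malformed JSON is treated as "no metadata"
  }
}
```

Inside the browser widget the same block could be read straight from the DOM (for example via `document.getElementById("hc-metadata")?.textContent`), which is why this step costs no network call. The string-based version is shown because it also works server-side on fetched HTML.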

Resolve Registry Stack (FREE)

Read registry_stack from hc-metadata to find parent registries. The stack is a 5-level hierarchy: meta → standard → organization → project → user. Fallback: check registries object, then site root /.
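
The fallback chain can be sketched as a small pure function. The shape of the `registries` object (a map from level name to URL) and the most-specific-first ordering are assumptions; only the three-step fallback itself comes from the text above.

```typescript
type RegistryLevel = "meta" | "standard" | "organization" | "project" | "user";

interface PageMetadata {
  registry_stack?: string[];
  registries?: Partial<Record<RegistryLevel, string>>; // assumed shape
}

// Remove duplicate URLs while keeping first-seen order
function dedupe(urls: string[]): string[] {
  return urls.filter((url, index) => urls.indexOf(url) === index);
}

function resolveRegistryUrls(meta: PageMetadata): string[] {
  // 1. Preferred: the explicit registry_stack
  if (meta.registry_stack && meta.registry_stack.length > 0) {
    return dedupe(meta.registry_stack);
  }
  // 2. Fallback: the registries object, most specific level first (assumed order)
  if (meta.registries) {
    const order: RegistryLevel[] = ["user", "project", "organization", "standard", "meta"];
    const urls: string[] = [];
    for (const level of order) {
      const url = meta.registries[level];
      if (url) urls.push(url);
    }
    if (urls.length > 0) return dedupe(urls);
  }
  // 3. Last resort: the site root
  return ["/"];
}
```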

Fetch Registry Metadata (1 API CALL)

Fetch the parent registry page and parse its hc-metadata. This returns the full catalog: meetings[], artifacts[], blockers.urgent[], blockers.proactive[]. Each entry has name, URL, summary, tags, and date.

Match Query to Entries (FREE)

Agent scans registry entries for relevance: keyword match in title/summary/tags, person filter (attendee lists), date range filter. This narrows hundreds of entries to 1–5 candidates without any API calls.
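
One way to narrow a large catalog to a handful of candidates is a simple local score over name, summary, and tags. The sketch below is illustrative only: the weights, the minimum term length, and the names `scoreEntry` and `topCandidates` are assumptions, not the shipped matching logic.

```typescript
interface CatalogEntry {
  name: string;
  url: string;
  summary: string;
  tags: string[];
  date: string;
}

// Hypothetical weights: an exact tag hit counts more than a summary hit.
function scoreEntry(entry: CatalogEntry, terms: string[]): number {
  const name = entry.name.toLowerCase();
  const summary = entry.summary.toLowerCase();
  const tags = entry.tags.map((t) => t.toLowerCase());
  let score = 0;
  for (const term of terms) {
    if (tags.indexOf(term) !== -1) score += 3;
    if (name.indexOf(term) !== -1) score += 2;
    if (summary.indexOf(term) !== -1) score += 1;
  }
  return score;
}

// Keep only entries that match at least one term, best matches first.
function topCandidates(entries: CatalogEntry[], query: string, limit = 5): CatalogEntry[] {
  const terms = query
    .toLowerCase()
    .split(/\W+/)
    .filter((t) => t.length > 2); // drop empty strings and very short words
  return entries
    .map((entry) => ({ entry, score: scoreEntry(entry, terms) }))
    .filter((scored) => scored.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map((scored) => scored.entry);
}
```

Because everything runs over data already fetched in the previous step, this narrowing adds no API calls; the `limit` default of 5 mirrors the 1–5 candidate range described above.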

Deep Read Linked Pages (1–5 API CALLS)

For each relevant candidate, fetch the page and extract its hc-metadata + hc-context-public. This gives the agent deep context: full meeting notes, detailed action items, decision rationale, attendee lists, tool specs.

Synthesize Answer (1 LLM CALL)

Agent combines own-page context + registry overview + deep-read details into a cited answer. Every claim references a specific page title + date, and the agent must not fabricate data.

Discovery Flow Diagram

User Question
     |
     v
[ChatAgent Widget] -----> [Own Page hc-metadata] (instant, in DOM)
     |
     v
[POST /api/hc/registry-agent]
     |
     +--> [Registry hc-metadata] (1 fetch)
     |       |
     |       +--> meetings[]: 22 entries with URL, summary, tags
     |       +--> artifacts[]: core docs, sprint pages
     |       +--> blockers.urgent[]: active blockers
     |       +--> blockers.proactive[]: action items
     |
     +--> [Match & Filter] (local, no API)
     |       |
     |       +--> "What did Jason discuss?" => filter by attendee
     |       +--> "Day 3 blockers?" => filter by tag + date
     |       +--> "LinkedIn strategy?" => keyword search
     |
     +--> [Deep Read 1-5 Pages] (targeted fetches)
     |       |
     |       +--> Parse hc-metadata for meeting_data
     |       +--> Parse hc-context-public for decisions, tools
     |
     v
[Cited Answer with Page References]

Section 03

Existing Implementation — What's Built Today

Several components already exist that implement parts of this architecture: the ChatAgent component, the registry-agent API, the skill injector, and the HC format export. The gap is connecting them with tool-use capability.

A. ChatAgent Component

templates/athio-registry-jsx.jsx : lines 237–348

A React component embedded in the Athio JSX registry. Renders as a floating action button (bottom-right). On click, opens a chat panel with message history. Currently:

Context source: Passes the entire metadata object (loaded from hc-metadata via loadMetadata()) as the registry_data field in the API request.
API call: First tries POST /api/hc/registry-agent. If 404, falls back to POST /api/chat with the metadata stringified into the system prompt (truncated to 12KB).
Message format: {role: "user"|"assistant", content: string} — standard chat format.
UX: Loading spinner ("Thinking..."), auto-scroll to bottom, Enter-to-send.
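
The request-with-fallback behaviour described in this list might look roughly like the sketch below. The `post` parameter is an injected stand-in for `fetch` so the logic stays testable; the body field names for `/api/chat` (`system`, `messages`) and the reading of "12KB" as 12 × 1024 characters are assumptions.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

// "12KB" interpreted as a character count (assumption)
const MAX_PROMPT_CHARS = 12 * 1024;

// Stringify the metadata into a system prompt, truncated to the cap
function buildFallbackSystemPrompt(metadata: Record<string, unknown>): string {
  return "Registry metadata:\n" + JSON.stringify(metadata).slice(0, MAX_PROMPT_CHARS);
}

// Minimal response shape needed by the caller (a subset of fetch's Response)
type PostFn = (
  url: string,
  body: unknown
) => Promise<{ status: number; json(): Promise<unknown> }>;

async function sendChat(
  messages: ChatMessage[],
  metadata: Record<string, unknown>,
  post: PostFn
): Promise<unknown> {
  // Preferred path: the dedicated registry agent
  const primary = await post("/api/hc/registry-agent", {
    action: "chat",
    messages,
    registry_data: metadata,
  });
  if (primary.status !== 404) return primary.json();

  // Fallback path: generic chat endpoint with metadata in the system prompt
  const fallback = await post("/api/chat", {
    system: buildFallbackSystemPrompt(metadata),
    messages,
  });
  return fallback.json();
}
```

Truncating the stringified metadata can cut a JSON object mid-field, which is one more reason the tool-using design later in this document avoids the flat-blob approach.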

Current Limitation

The ChatAgent sends the entire registry metadata as a flat blob. The agent cannot selectively fetch individual pages. If the registry has 22 entries, all 22 summaries are jammed into the system prompt, but none of the linked pages' deep content (meeting notes, decisions, action items) is accessible.

B. Registry Agent API

app/api/hc/registry-agent/route.ts

Server-side API that proxies to the Anthropic API with registry context. Supports four actions:

Action	Purpose	Input	Output
`chat`	Free-form Q&A about registry	User message + registry data	Natural language answer
`assess`	Analyze new page impact	Page metadata	Blockers, actions, decisions extracted
`rebuild`	Re-aggregate linked pages	Registry ID	Inconsistency report
`cleanup`	Flag stale items	Registry data	Suggested removals
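
The four actions lend themselves to a discriminated union, so the route can validate the `action` field before doing any work. This is a sketch under assumptions: the input field names (`message`, `registry_data`, `page_metadata`, `registry_id`) paraphrase the table's Input column and are not confirmed names from route.ts.

```typescript
const ACTIONS = ["chat", "assess", "rebuild", "cleanup"] as const;
type Action = (typeof ACTIONS)[number];

// Field names are assumptions; the table above describes inputs informally.
type AgentRequest =
  | { action: "chat"; message: string; registry_data: unknown }
  | { action: "assess"; page_metadata: unknown }
  | { action: "rebuild"; registry_id: string }
  | { action: "cleanup"; registry_data: unknown };

// Runtime guard for the untyped request body
function isAction(value: unknown): value is Action {
  return typeof value === "string" && (ACTIONS as readonly string[]).indexOf(value) !== -1;
}

// The switch is exhaustive: adding a fifth action without handling it
// here becomes a compile-time error.
function summarizeRequest(req: AgentRequest): string {
  switch (req.action) {
    case "chat":
      return "chat: " + req.message;
    case "assess":
      return "assess: analyze new page impact";
    case "rebuild":
      return "rebuild: " + req.registry_id;
    case "cleanup":
      return "cleanup: flag stale items";
  }
}
```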

C. Skill Injector (Registry Walk)

lib/hc/skill-injector.ts

The page enhancer already implements the exact discovery pattern the agent needs, but on the client side for cross-reference cards:

Reads registry_stack from page's hc-metadata
Fetches each registry in parallel (deduplicates by URL)
For meta-registries, recursively discovers sub-registries (depth-1)
Matches current page's tags against registry entries
Injects cross-reference cards (max 8) with level badges
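
The tag-matching and capping steps of this walk can be illustrated with a small pure function. It is a sketch of the pattern, not the code in skill-injector.ts: the entry shape, the overlap-count ranking, and the name `matchCrossReferences` are assumptions.

```typescript
interface RegistryEntryRef {
  name: string;
  url: string;
  tags: string[];
  level: string; // registry level shown on the card badge
}

function matchCrossReferences(
  pageTags: string[],
  pageUrl: string,
  entries: RegistryEntryRef[],
  maxCards = 8
): RegistryEntryRef[] {
  const wanted = pageTags.map((t) => t.toLowerCase());
  const seen = new Set<string>();
  const scored: { entry: RegistryEntryRef; overlap: number }[] = [];

  for (const entry of entries) {
    // Skip the page itself and anything already seen via another registry
    if (entry.url === pageUrl || seen.has(entry.url)) continue;
    seen.add(entry.url);
    const overlap = entry.tags.filter((t) => wanted.indexOf(t.toLowerCase()) !== -1).length;
    if (overlap > 0) scored.push({ entry, overlap });
  }

  // Most shared tags first, capped at maxCards
  return scored
    .sort((a, b) => b.overlap - a.overlap)
    .slice(0, maxCards)
    .map((s) => s.entry);
}
```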

Reuse Opportunity

The toHcFormat(html, sourceUrl) function in skill-injector.ts converts any HC page to a plain-text .hc format optimized for LLM consumption. This is exactly what the agent needs to read linked pages — structured text extraction from HTML, already built.

D. HC Format Export

The toHcFormat function produces output like:

output.hc

# Meeting Title — HC v1.3.3
> Source: https://ideas.asapai.net/meeting-slug
> Type: meeting | ID: artifact-id-here

---
## hc-metadata
{ "title": "...", "meeting_data": { "blockers": [...], "decisions": [...] } }

---
## hc-instructions
How to use this meeting page...

---
## hc-context-public
{ "attendees": [...], "action_items": [...], "decisions": [...] }

Section 04

Target Architecture — Tool-Using Agent

The key upgrade is giving the agent tool-use capability via the Anthropic Messages API. Instead of cramming all context into the system prompt, the agent decides what to read based on the user's question.

Architecture Diagram

[Browser: ChatAgent Widget]
     |
     |  POST /api/hc/registry-agent
     |  { message, page_metadata }
     v
[API Route: registry-agent]
     |
     |  1. Build system prompt (page context + tool definitions)
     |  2. Call Anthropic Messages API with tool_use enabled
     v
[Claude Sonnet + Tools]
     |
     |  Agent decides which tools to call:
     |
     +--> read_own_page()           Read page hc-metadata
     +--> list_registry_entries()   Fetch registry catalog
     +--> read_page(url)            Fetch any linked page
     +--> search_registry(query)    Filter entries by keyword/tag/person
     +--> walk_meta_registry()      Discover all registries
     |
     |  Tool results fed back to Claude
     |  Claude generates final answer with citations
     v
[Response streamed to ChatAgent]

Tool-Use Loop (Server-Side)

The registry-agent API handles the tool-use loop server-side. The client sends a single request and receives a final answer — no client-side tool orchestration needed.

Pseudo-code: Tool Loop

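import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment
const MODEL = "claude-sonnet-4-20250514";
const MAX_ITERATIONS = 3;   // rounds of tool calls before a forced response
const MAX_PAGE_FETCHES = 5; // read_page calls allowed per turn

// Concatenate the text blocks of a response (content may hold several blocks)
const textOf = (response) =>
  response.content.filter((b) => b.type === "text").map((b) => b.text).join("");

async function handleChat(message, pageMetadata) {
  // Tool definitions for the five tools specified in Section 05
  const tools = [readOwnPage, listRegistryEntries, readPage, searchRegistry, walkMetaRegistry];
  const systemPrompt = buildSystemPrompt(pageMetadata);
  const request = { model: MODEL, max_tokens: 2000, system: systemPrompt, tools };

  const messages = [{ role: "user", content: message }];
  let totalFetches = 0;

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const response = await anthropic.messages.create({ ...request, messages });

    // If no tool use was requested, we have the final answer
    if (response.stop_reason !== "tool_use") return textOf(response);

    // Record the assistant turn once, then answer every tool_use block in a
    // single user message (one tool_result per tool_use id)
    messages.push({ role: "assistant", content: response.content });
    const toolResults = [];
    for (const block of response.content) {
      if (block.type !== "tool_use") continue;
      let result;
      if (block.name === "read_page" && totalFetches >= MAX_PAGE_FETCHES) {
        // Safety: cap at 5 page fetches per turn
        result = "Error: page fetch limit reached for this turn.";
      } else {
        if (block.name === "read_page") totalFetches++;
        result = await executeTool(block.name, block.input);
      }
      toolResults.push({ type: "tool_result", tool_use_id: block.id, content: result });
    }
    messages.push({ role: "user", content: toolResults });
  }

  // Iteration cap reached: force a final answer with tool use disabled
  const final = await anthropic.messages.create({
    ...request,
    messages,
    tool_choice: { type: "none" },
  });
  return textOf(final);
}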

Section 05

Agent Tools Specification

Tool	Input	Output	Cost
`read_own_page`	None (uses page metadata passed in request)	Parsed hc-metadata + hc-context-public JSON	0 fetches
`list_registry_entries`	`registry_url?` (defaults to registry_stack[0])	Array of entries: `{name, url, summary, tags, date, type}`	1 fetch
`read_page`	`page_url` (must be on user's domain)	Page's hc-metadata + hc-context-public as structured text via `toHcFormat()`	1 fetch
`search_registry`	`query, filters?: {tags[], person, date_from, date_to}`	Filtered entries matching criteria	0–1 fetches
`walk_meta_registry`	`meta_registry_url?` (defaults to `/meta-registry`)	Array of registry descriptors: `{name, url, mode, entry_count, level}`	1 fetch
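
For reference, the five tools could be declared to the Anthropic Messages API along the following lines. The API expects each tool as a `name`, a `description`, and a JSON Schema under `input_schema`; the description wording, the ISO date format, and the `ToolDefinition` and `urlParam` helpers are assumptions for this sketch.

```typescript
// Local mirror of the tool shape the Anthropic Messages API accepts:
// a name, a description, and a JSON Schema under input_schema.
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

// Hypothetical helper for the repeated optional-URL parameters
const urlParam = (description: string) => ({ type: "string", description });

const agentTools: ToolDefinition[] = [
  {
    name: "read_own_page",
    description: "Return the hc-metadata and hc-context-public of the current page.",
    input_schema: { type: "object", properties: {} },
  },
  {
    name: "list_registry_entries",
    description: "Fetch a registry and list its entries.",
    input_schema: {
      type: "object",
      properties: { registry_url: urlParam("Defaults to registry_stack[0].") },
    },
  },
  {
    name: "read_page",
    description: "Fetch one HC page on the user's domain as structured text.",
    input_schema: {
      type: "object",
      properties: { page_url: urlParam("Absolute URL of the page to read.") },
      required: ["page_url"],
    },
  },
  {
    name: "search_registry",
    description: "Filter registry entries by keyword, tags, person, or date range.",
    input_schema: {
      type: "object",
      properties: {
        query: { type: "string" },
        filters: {
          type: "object",
          properties: {
            tags: { type: "array", items: { type: "string" } },
            person: { type: "string" },
            date_from: { type: "string", description: "ISO date (assumed format)" },
            date_to: { type: "string", description: "ISO date (assumed format)" },
          },
        },
      },
      required: ["query"],
    },
  },
  {
    name: "walk_meta_registry",
    description: "List all registries known to the meta-registry.",
    input_schema: {
      type: "object",
      properties: { meta_registry_url: urlParam("Defaults to /meta-registry.") },
    },
  },
];
```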

Tool: `read_page` — Implementation Detail

This is the most important tool. It fetches any HC page by URL and returns structured content for the LLM.

Implementation

// toHcFormat(html, sourceUrl) lives in lib/hc/skill-injector.ts;
// getUserDomains(userId) is assumed to return the hostnames the user owns.
async function readPage(pageUrl: string, userId: string): Promise<string> {
  // Parse the URL first; malformed input is rejected rather than thrown
  let urlDomain: string;
  try {
    urlDomain = new URL(pageUrl).hostname;
  } catch {
    return "Error: Invalid page URL.";
  }

  // Validate URL is on an allowed domain
  const allowedDomains = await getUserDomains(userId);
  if (!allowedDomains.includes(urlDomain)) {
    return "Error: Cannot read pages from unauthorized domain.";
  }

  // Fetch the page HTML
  const res = await fetch(pageUrl);
  if (!res.ok) return `Error: Page returned ${res.status}`;

  const html = await res.text();

  // Convert to LLM-optimized format using existing toHcFormat()
  const hcText = toHcFormat(html, pageUrl);

  // Truncate to prevent context window overflow (max 8,000 characters per page)
  return hcText.substring(0, 8000);
}
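
The domain check is the security-critical part of this tool, so it may be worth isolating as a pure function that can be unit-tested without network access. The sketch below assumes an exact hostname match and additionally requires HTTPS; both the HTTPS rule and the name `isAllowedPageUrl` are assumptions rather than stated policy.

```typescript
function isAllowedPageUrl(pageUrl: string, allowedDomains: string[]): boolean {
  let parsed: URL;
  try {
    parsed = new URL(pageUrl);
  } catch {
    return false; // relative or malformed URLs are rejected
  }
  // Assumption: only HTTPS pages are fetched
  if (parsed.protocol !== "https:") return false;

  // Exact hostname comparison: a suffix or substring test would let
  // "ideas.asapai.net.evil.com" pass as "ideas.asapai.net"
  const host = parsed.hostname.toLowerCase();
  return allowedDomains.some((domain) => domain.toLowerCase() === host);
}
```

Comparing the parsed `hostname` rather than the raw string also handles URLs that hide the real host behind a userinfo prefix, such as `https://ideas.asapai.net@evil.com/`.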

Tool: `search_registry` — Implementation Detail

Implementation

function searchRegistry(
  registryData: any,
  query: string,
  filters?: { tags?: string[], person?: string, date_from?: string, date_to?: string }
): any[] {
  const allEntries = [
    ...(registryData.meetings || []),
    ...(registryData.artifacts || []),
  ];

  return allEntries.filter(entry => {
    // Keyword match
    if (query) {
      const text = [entry.name, entry.summary, ...(entry.tags || [])].join(" ").toLowerCase();
      if (!text.includes(query.toLowerCase())) return false;
    }

    // Person filter
    if (filters?.person) {
      // Hoisted so the non-undefined narrowing survives inside the callback
      const person = filters.person.toLowerCase();
      const people = [...(entry.attendees || []), entry.owner || ""].map(s => s.toLowerCase());
      if (!people.some(p => p.includes(person))) return false;
    }

    // Date range filter
    if (filters?.date_from) {
      const entryDate = new Date(entry.meeting_date || entry.added);
      if (entryDate < new Date(filters.date_from)) return false;
    }
    if (filters?.date_to) {
      const entryDate = new Date(entry.meeting_date || entry.added);
      if (entryDate > new Date(filters.date_to)) return false;
    }

    // Tag filter
    if (filters?.tags?.length) {
      const entryTags = (entry.tags || []).map(t => t.toLowerCase());
      if (!filters.tags.some(t => entryTags.includes(t.toLowerCase()))) return false;
    }

    return true;
  });
}

Section 06

Security, Limits & Context Management

Security Policies

Policy	Rule	Rationale
Domain Restriction	Agent can only fetch pages from domains the authenticated user owns	Prevents data exfiltration — agent can't read arbitrary URLs
Auth Required	Registry-agent API requires valid Supabase session	No anonymous access to LLM-powered features
Read-Only (v1)	Agent cannot write, update, or delete any data	Safety first — write capability deferred to v2
API Key Server-Side	Anthropic API key stored as `ANTHROPIC_API_KEY` env var on Vercel, never sent to client	Prevent key leakage

Rate Limits & Budgets

Limit	Value	Notes
Page fetches per turn	5 max	Prevents runaway crawls; usually 1–3 suffice
Content per page	8,000 chars	Truncated via `toHcFormat()` output
Total context per turn	~50K tokens typical, ~65K worst case	System prompt + tools + messages + tool results
Tool loop iterations	3 max	Agent gets 3 rounds of tool calls before forced response
API calls per user/minute	10	Standard rate limit via Vercel edge middleware
Max output tokens	2,000	Keeps responses focused and fast
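
The per-user limit of 10 calls per minute can be illustrated with a sliding-window counter. This is an in-memory sketch for explanation only: the table places the real limit in Vercel edge middleware, where instances do not share memory, so a production version would need a shared store. The class name and the injectable `now` parameter are assumptions.

```typescript
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>(); // userId -> timestamps (ms)
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the user is under the limit.
  // `now` is injectable so the behaviour can be tested deterministically.
  allow(userId: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.hits.get(userId) || []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(userId, recent);
    return true;
  }
}

// 10 API calls per user per minute, as in the table above
const registryAgentLimiter = new SlidingWindowLimiter(10, 60_000);
```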

Context Window Strategy

Budget Allocation

Claude Sonnet's 200K context window is allocated as follows:

System prompt: ~2K tokens (instructions + page metadata summary)

Tool definitions: ~1.5K tokens (5 tools with descriptions)

Registry overview: ~5–15K tokens (depends on entry count)

Deep-read pages: ~8K tokens per page × 5 max = 40K tokens (an upper bound; each page is capped at 8,000 characters, which is typically fewer tokens)

Conversation history: ~5K tokens (recent messages)

Output budget: 2K tokens

Total worst case: ~65K tokens — well within 200K limit.
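
The arithmetic behind the worst-case figure can be checked in a few lines. The budget values are taken from the list above (using the upper end of the registry range); the `estimateTokens` helper uses a rough 4-characters-per-token rule of thumb, which is an assumption and varies with content.

```typescript
// Budget figures from the allocation above, in thousands of tokens
const budgetK = {
  systemPrompt: 2,
  toolDefinitions: 1.5,
  registryOverview: 15, // upper end of the 5–15K range
  deepReadPages: 8 * 5, // 8K per page × 5 pages
  conversationHistory: 5,
  output: 2,
};

const CONTEXT_WINDOW_K = 200;

// Sum of all line items: 2 + 1.5 + 15 + 40 + 5 + 2 = 65.5
const worstCaseK = Object.values(budgetK).reduce((sum, k) => sum + k, 0);
const headroomK = CONTEXT_WINDOW_K - worstCaseK; // 134.5

// Rough heuristic only: about 4 characters per token for English text
function estimateTokens(charCount: number): number {
  return Math.ceil(charCount / 4);
}
```

Under that heuristic an 8,000-character page is closer to 2,000 tokens than 8,000, so the deep-read line item is a conservative ceiling rather than an expected value.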

Section 07

Open Questions & Design Decisions

#	Question	Options	Recommendation
1	Registry data caching between messages?	Fresh fetch each time vs. cache for session duration	Cache for 5 minutes. Registry changes are infrequent during a conversation.
2	Cross-registry discovery?	Restrict to own registry vs. allow meta-registry walk	Allow meta-registry walk but only for user's own domains. Enables "what's happening across all registries?" queries.
3	Write operations in v2?	Read-only forever vs. allow dismiss/annotate	v2: Allow dismiss + add-note. Never allow delete or structural changes via agent.
4	Streaming responses?	Wait for full response vs. SSE streaming	SSE streaming for chat action. Non-streaming for assess/rebuild/cleanup (need structured JSON output).
5	Conversation persistence?	localStorage vs. server session vs. ephemeral	localStorage with 10-message window. Cheap, no server state, clears on page navigation (acceptable for page-scoped agent).
6	Agent on non-registry pages?	Registry-only vs. any HC page	Any HC page with a `registry_stack`. Agent discovers context by walking up to its registry. A meeting page agent can answer questions about the meeting + related context.
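
Recommendations 1 and 5 are both small enough to sketch. The cache below expires registry data after a fixed TTL, and `trimHistory` keeps only the most recent messages. The class and function names are hypothetical, and the injectable `now` parameter exists only to make the expiry testable.

```typescript
class TtlCache<V> {
  private store = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.store.get(key);
    if (!hit) return undefined;
    if (now >= hit.expiresAt) {
      this.store.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.store.set(key, { value, expiresAt: now + this.ttlMs });
  }
}

// Question 1: cache registry metadata for 5 minutes, keyed by registry URL
const registryCache = new TtlCache<unknown>(5 * 60 * 1000);

// Question 5: keep only the last `windowSize` messages of a conversation
function trimHistory<T>(messages: T[], windowSize = 10): T[] {
  return messages.slice(-windowSize);
}
```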

Section 08

Rollback & Recovery

This page is a new artifact — it creates no risk to existing pages. Recovery strategy:

Layer	Mechanism	Action
Git	Source HTML committed before publish	`git revert` to remove temp file
Supabase	`page_versions` table auto-snapshots on registry update	Restore previous version via API or SQL Editor
Registry	Auto-cascade creates version before mutating registry	If the registry entry is unwanted, restore registry from version
Simple Delete	This is a standalone page with no dependencies	Delete from Supabase `pages` table — zero blast radius

Pre-Publish Checkpoint

Git HEAD at time of generation: e715500. No existing files modified. The temp file tmp-hc-page-agent-architecture.html is the only new artifact. Registry entry can be independently reverted.

How to use this page

For LLM agents reading this page:

For humans:
