An LLM agent that uses HC pages as its knowledge base, crawls the neural registry for context, and reads linked pages on demand. No RAG. No vector DB. Just structured HTML.
Traditional AI assistants require a dedicated knowledge pipeline: documents are chunked, embedded into vectors, stored in a vector database, and retrieved via similarity search (RAG). The HC Page Agent eliminates this entire stack.
Every HC page already contains structured JSON metadata. The hc-metadata block is the embedding. The neural registry is the vector index. The page URL is the retrieval mechanism. The knowledge base already exists — it's the published pages themselves.
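As a minimal sketch of what "the knowledge base is the page itself" means in practice, the helper below pulls an embedded JSON block out of raw page HTML. It assumes the blocks are stored in `<script type="application/json" id="...">` elements; the source does not state how HC pages embed them, so both that markup and the `extractHcBlock` name are hypothetical.

```typescript
// Hypothetical helper: pull one embedded HC JSON block out of raw page HTML.
// ASSUMPTION: blocks live in <script type="application/json" id="..."> tags;
// the real HC embedding may differ.
function extractHcBlock(html: string, blockId: string): unknown {
  // Escape regex metacharacters so the id is matched literally.
  const escapedId = blockId.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  const pattern = new RegExp(
    `<script[^>]*\\bid=["']${escapedId}["'][^>]*>([\\s\\S]*?)</script>`,
    "i",
  );
  const match = pattern.exec(html);
  if (!match) return null;
  try {
    return JSON.parse(match[1]);
  } catch {
    return null; // Malformed JSON is treated the same as a missing block.
  }
}

// Illustrative page content, not taken from a real HC page.
const sampleHtml = `
<html><body>
<script type="application/json" id="hc-metadata">
{"title": "Sprint Review", "tags": ["meeting"], "registry_stack": ["https://example.com/registry"]}
</script>
</body></html>`;

const sampleMeta = extractHcBlock(sampleHtml, "hc-metadata") as
  | { title: string; tags: string[] }
  | null;
```

In a browser the same data could be read straight from the DOM; the regex form is used here only so the sketch also runs server-side without an HTML parser.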
hc-metadata: Structured JSON with title, tags, summary, meeting data, action items. This IS the structured representation that RAG tries to create from unstructured text.
hc-instructions: Tells the agent HOW to interpret the page. No prompt engineering needed per document; each page carries its own instructions.
hc-context-public: Deep structured data (attendees, decisions, blockers, tool specs). Machine-readable by design.
When a user asks the agent a question, it follows a structured discovery protocol to find relevant context. The protocol is designed to minimize API calls while maximizing coverage.
1. Read hc-metadata from the page it's embedded on. This data is already in the DOM, so it has zero network cost. It provides immediate context: page title, type, tags, summary, and any structured data (blockers, actions, meetings array).
2. Walk registry_stack from hc-metadata to find parent registries. The stack is a 5-level hierarchy: meta → standard → organization → project → user. Fallback: check the registries object, then the site root /.
3. Fetch the registry's hc-metadata. This returns the full catalog: meetings[], artifacts[], blockers.urgent[], blockers.proactive[]. Each entry has name, URL, summary, tags, and date.
4. Read the hc-metadata + hc-context-public of the relevant linked pages. This gives the agent deep context: full meeting notes, detailed action items, decision rationale, attendee lists, tool specs.
Two components already exist that implement parts of this architecture. The gap is connecting them with tool-use capability.
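The registry-resolution step of the discovery protocol (registry_stack first, then the registries object, then the site root) can be sketched as a small pure function. The `PageMetadata` shape and the `resolveRegistryUrl` name are assumptions for illustration: the source only implies that registry_stack entries are URLs, and says nothing about the shape of the registries object.

```typescript
// Hypothetical shape of the fields the protocol reads from hc-metadata.
// ASSUMPTION: registry_stack is an array of registry URLs and `registries`
// maps a level name to a URL; the source does not spell out either shape.
interface PageMetadata {
  registry_stack?: string[];
  registries?: Record<string, string>;
}

// Resolve which registry to fetch, in the fallback order the protocol gives:
// registry_stack, then the registries object, then the site root.
function resolveRegistryUrl(meta: PageMetadata, pageUrl: string): string {
  const fromStack = meta.registry_stack?.find((u) => u.trim() !== "");
  if (fromStack) return new URL(fromStack, pageUrl).href;

  const fromObject = Object.values(meta.registries ?? {}).find((u) => u.trim() !== "");
  if (fromObject) return new URL(fromObject, pageUrl).href;

  // Last resort: the site root of the page the agent is embedded on.
  return new URL("/", pageUrl).href;
}
```

Relative entries are resolved against the current page URL, so a stack entry such as `/registry` still yields an absolute URL that a later fetch can use.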
templates/athio-registry-jsx.jsx : lines 237–348
A React component embedded in the Athio JSX registry. Renders as a floating action button (bottom-right). On click, opens a chat panel with message history. Currently:
Sends the metadata object (loaded from hc-metadata via loadMetadata()) as the registry_data field in the API request.
Calls POST /api/hc/registry-agent. If that returns 404, falls back to POST /api/chat with the metadata stringified into the system prompt (truncated to 12KB).
Exchanges messages as {role: "user"|"assistant", content: string}, the standard chat format.
The ChatAgent sends the entire registry metadata as a flat blob. The agent cannot selectively fetch individual pages. If the registry has 22 entries, all 22 summaries are packed into the system prompt, but none of the linked pages' deep content (meeting notes, decisions, action items) is accessible.
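To make the flat-blob limitation concrete, here is a hedged sketch of what the fallback path amounts to: the whole metadata object is serialized and cut at a fixed size before being placed in the system prompt. The function name, the prompt wording, and the use of a character count as a stand-in for 12KB are all assumptions, not the component's actual code.

```typescript
const MAX_METADATA_CHARS = 12 * 1024; // 12KB cap from the text, approximated in characters

// Hypothetical helper: inline registry metadata into a system prompt,
// truncating the serialized JSON so the prompt stays bounded.
function buildFallbackSystemPrompt(
  metadata: unknown,
  maxChars: number = MAX_METADATA_CHARS,
): string {
  const serialized = JSON.stringify(metadata) ?? "null";
  const truncated = serialized.length > maxChars;
  const body = truncated ? serialized.slice(0, maxChars) : serialized;
  return [
    "You answer questions about the registry described below.",
    truncated ? "Note: the registry data was truncated and may be incomplete JSON." : "",
    "Registry data:",
    body,
  ]
    .filter((line) => line !== "")
    .join("\n");
}
```

A cut at a fixed offset can land in the middle of an entry, which is one more reason to let the agent request pages selectively instead of relying on a single serialized blob.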
app/api/hc/registry-agent/route.ts
Server-side API that proxies to the Anthropic API with registry context. Supports four actions:
| Action | Purpose | Input | Output |
|---|---|---|---|
| chat | Free-form Q&A about registry | User message + registry data | Natural language answer |
| assess | Analyze new page impact | Page metadata | Blockers, actions, decisions extracted |
| rebuild | Re-aggregate linked pages | Registry ID | Inconsistency report |
| cleanup | Flag stale items | Registry data | Suggested removals |
lib/hc/skill-injector.ts
The page enhancer already implements the exact discovery pattern the agent needs, but on the client side for cross-reference cards:
Reads registry_stack from the page's hc-metadata.
The toHcFormat(html, sourceUrl) function in skill-injector.ts converts any HC page to a plain-text .hc format optimized for LLM consumption. This is exactly what the agent needs to read linked pages: structured text extraction from HTML, already built.
The toHcFormat function produces output like:
# Meeting Title — HC v1.3.3
> Source: https://ideas.asapai.net/meeting-slug
> Type: meeting | ID: artifact-id-here
---
## hc-metadata
{ "title": "...", "meeting_data": { "blockers": [...], "decisions": [...] } }
---
## hc-instructions
How to use this meeting page...
---
## hc-context-public
{ "attendees": [...], "action_items": [...], "decisions": [...] }
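Because the output above uses a regular layout (a `## <name>` heading per section, `---` between sections), a consumer can split it back into named sections with a few lines of code. The following is a sketch under that reading of the sample; `splitHcSections` is a hypothetical helper and not part of skill-injector.ts.

```typescript
// Hypothetical parser for the plain-text layout shown above: sections are
// introduced by "## <name>" and separated by "---" lines.
// Caveat: a literal "---" line inside a section's content would end it early.
function splitHcSections(hcText: string): Map<string, string> {
  const sections = new Map<string, string>();
  let current: string | null = null;
  let buffer: string[] = [];

  // Store the section collected so far (if any) and reset the buffer.
  const flush = (): void => {
    if (current !== null) sections.set(current, buffer.join("\n").trim());
    buffer = [];
  };

  for (const line of hcText.split(/\r?\n/)) {
    const heading = /^##\s+(.+)$/.exec(line);
    if (heading) {
      flush();
      current = heading[1].trim();
    } else if (line.trim() === "---") {
      flush();
      current = null;
    } else if (current !== null) {
      buffer.push(line);
    }
  }
  flush();
  return sections;
}
```

Lines before the first `##` heading (the title and the `>` source lines) are skipped, so the map contains only the hc-* sections.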
The key upgrade is giving the agent tool-use capability via the Anthropic Messages API. Instead of cramming all context into the system prompt, the agent decides what to read based on the user's question.
The registry-agent API handles the tool-use loop server-side. The client sends a single request and receives a final answer — no client-side tool orchestration needed.
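async function handleChat(message, pageMetadata) {
  const tools = [readOwnPage, listRegistryEntries, readPage, searchRegistry, walkMetaRegistry];
  const systemPrompt = buildSystemPrompt(pageMetadata);
  const messages = [{ role: "user", content: message }];
  const request = { model: "claude-sonnet-4-20250514", max_tokens: 2000, system: systemPrompt, tools };
  let totalFetches = 0;

  // Join the text blocks of a response; the first block is not guaranteed to be text
  const textOf = (response) =>
    response.content.filter((b) => b.type === "text").map((b) => b.text).join("\n");

  // Tool-use loop (max 3 iterations to prevent runaway)
  for (let i = 0; i < 3; i++) {
    const response = await anthropic.messages.create({ ...request, messages });

    // Anything other than a tool request is the final answer
    if (response.stop_reason !== "tool_use") {
      return textOf(response);
    }

    // Record the assistant turn once, then answer every tool call in one user turn
    messages.push({ role: "assistant", content: response.content });
    const toolResults = [];
    for (const block of response.content) {
      if (block.type !== "tool_use") continue;
      let result;
      // Safety: cap at 5 page fetches per turn, but still return a result for the call
      if (block.name === "read_page" && totalFetches >= 5) {
        result = "Error: page fetch limit (5) reached for this turn.";
      } else {
        if (block.name === "read_page") totalFetches += 1;
        result = await executeTool(block.name, block.input);
      }
      toolResults.push({ type: "tool_result", tool_use_id: block.id, content: result });
    }
    messages.push({ role: "user", content: toolResults });
  }

  // Loop budget exhausted: force a text answer from the context gathered so far
  const final = await anthropic.messages.create({
    ...request,
    tool_choice: { type: "none" },
    messages,
  });
  return textOf(final);
}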
| Tool | Input | Output | Cost |
|---|---|---|---|
| read_own_page | None (uses page metadata passed in request) | Parsed hc-metadata + hc-context-public JSON | 0 fetches |
| list_registry_entries | registry_url? (defaults to registry_stack[0]) | Array of entries: {name, url, summary, tags, date, type} | 1 fetch |
| read_page | page_url (must be on user's domain) | Page's hc-metadata + hc-context-public as structured text via toHcFormat() | 1 fetch |
| search_registry | query, filters?: {tags[], person, date_from, date_to} | Filtered entries matching criteria | 0–1 fetches |
| walk_meta_registry | meta_registry_url? (defaults to /meta-registry) | Array of registry descriptors: {name, url, mode, entry_count, level} | 1 fetch |
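For the Anthropic Messages API, each row of the table has to be expressed as a tool definition with a `name`, a `description`, and a JSON Schema under `input_schema`. The sketch below does this for two of the five tools. The description wording, the choice of required fields, and the ISO 8601 date format are assumptions; the local `ToolDefinition` interface is a simplified stand-in for the SDK's own type.

```typescript
// Minimal local type for a Messages API tool definition
// (name, description, JSON Schema under input_schema).
interface ToolDefinition {
  name: string;
  description: string;
  input_schema: {
    type: "object";
    properties: Record<string, unknown>;
    required?: string[];
  };
}

const readPageTool: ToolDefinition = {
  name: "read_page",
  description:
    "Fetch one HC page on the user's domain and return its hc-metadata and hc-context-public as structured text.",
  input_schema: {
    type: "object",
    properties: {
      page_url: { type: "string", description: "Absolute URL of the HC page to read." },
    },
    required: ["page_url"],
  },
};

const searchRegistryTool: ToolDefinition = {
  name: "search_registry",
  description: "Filter registry entries by keyword, tags, person, or date range.",
  input_schema: {
    type: "object",
    properties: {
      query: { type: "string" },
      filters: {
        type: "object",
        properties: {
          tags: { type: "array", items: { type: "string" } },
          person: { type: "string" },
          date_from: { type: "string", description: "ISO 8601 date (assumed format)." },
          date_to: { type: "string", description: "ISO 8601 date (assumed format)." },
        },
      },
    },
    required: ["query"],
  },
};

const toolDefinitions: ToolDefinition[] = [readPageTool, searchRegistryTool];
```

The model chooses tools largely from the descriptions, so stating the cost or restriction of each tool there (for example, that read_page only works on the user's domain) is likely to reduce wasted calls.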
read_page: Implementation Detail
This is the most important tool. It fetches any HC page by URL and returns structured content for the LLM.
async function readPage(pageUrl: string, userId: string): Promise<string> {
  // Validate that the URL parses and is on an allowed domain
  let urlDomain: string;
  try {
    urlDomain = new URL(pageUrl).hostname;
  } catch {
    return "Error: Invalid page URL.";
  }
  const allowedDomains = await getUserDomains(userId);
  if (!allowedDomains.includes(urlDomain)) {
    return "Error: Cannot read pages from unauthorized domain.";
  }

  // Fetch the page HTML
  const res = await fetch(pageUrl);
  if (!res.ok) return `Error: Page returned ${res.status}`;
  const html = await res.text();

  // Convert to LLM-optimized format using existing toHcFormat()
  const hcText = toHcFormat(html, pageUrl);

  // Truncate to prevent context window overflow (max 8,000 characters per page)
  return hcText.substring(0, 8000);
}
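The domain check is the part of read_page that carries the security policy, so it is worth isolating into a function that can be tested without network access. The sketch below is one possible form; `isAllowedPageUrl` is a hypothetical name, and both the https-only rule and the exact-hostname match (no subdomain wildcards) are assumptions that go beyond what the source specifies.

```typescript
// Hypothetical guard: true only for well-formed https URLs whose hostname
// exactly matches one of the user's domains (no subdomain wildcards).
function isAllowedPageUrl(pageUrl: string, allowedDomains: readonly string[]): boolean {
  let parsed: URL;
  try {
    parsed = new URL(pageUrl);
  } catch {
    return false; // Not an absolute, parseable URL.
  }
  // ASSUMPTION: only https pages are readable; the source does not say this.
  if (parsed.protocol !== "https:") return false;

  // URL parsing already lowercases the hostname; normalize the allow-list too.
  const host = parsed.hostname.toLowerCase();
  return allowedDomains.some((domain) => domain.toLowerCase() === host);
}
```

Comparing parsed hostnames, not raw strings, is what rejects look-alikes such as `ideas.asapai.net.evil.example`. Note also that `fetch` follows redirects by default, so an allowed URL could still redirect elsewhere; passing `redirect: "manual"` or re-checking the final URL would close that gap.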
search_registry: Implementation Detail
function searchRegistry(
  registryData: any,
  query: string,
  filters?: { tags?: string[], person?: string, date_from?: string, date_to?: string }
): any[] {
  const allEntries = [
    ...(registryData.meetings || []),
    ...(registryData.artifacts || []),
  ];
  // Normalize the query and filter values once instead of on every entry
  const needle = query ? query.toLowerCase() : "";
  const person = filters?.person?.toLowerCase();
  const wantedTags = (filters?.tags || []).map(t => t.toLowerCase());
  return allEntries.filter(entry => {
    // Keyword match against name, summary, and tags
    if (needle) {
      const text = [entry.name, entry.summary, ...(entry.tags || [])].join(" ").toLowerCase();
      if (!text.includes(needle)) return false;
    }
    // Person filter (substring match on attendees and owner)
    if (person) {
      const people = [...(entry.attendees || []), entry.owner || ""].map(s => s.toLowerCase());
      if (!people.some(p => p.includes(person))) return false;
    }
    // Date range filter; entries without a parseable date are not excluded
    const entryDate = new Date(entry.meeting_date || entry.added);
    if (filters?.date_from && entryDate < new Date(filters.date_from)) return false;
    if (filters?.date_to && entryDate > new Date(filters.date_to)) return false;
    // Tag filter (entry must carry at least one requested tag)
    if (wantedTags.length) {
      const entryTags = (entry.tags || []).map(t => t.toLowerCase());
      if (!wantedTags.some(t => entryTags.includes(t))) return false;
    }
    return true;
  });
}
| Policy | Rule | Rationale |
|---|---|---|
| Domain Restriction | Agent can only fetch pages from domains the authenticated user owns | Prevents data exfiltration — agent can't read arbitrary URLs |
| Auth Required | Registry-agent API requires valid Supabase session | No anonymous access to LLM-powered features |
| Read-Only (v1) | Agent cannot write, update, or delete any data | Safety first — write capability deferred to v2 |
| API Key Server-Side | Anthropic API key stored as ANTHROPIC_API_KEY env var on Vercel, never sent to client | Prevents key leakage |
| Limit | Value | Notes |
|---|---|---|
| Page fetches per turn | 5 max | Prevents runaway crawls; usually 1–3 suffice |
| Content per page | 8,000 chars | toHcFormat() output is truncated to this length |
| Total context per turn | ~65K tokens (worst case) | System prompt + tools + messages + tool results |
| Tool loop iterations | 3 max | Agent gets 3 rounds of tool calls before forced response |
| API calls per user/minute | 10 | Standard rate limit via Vercel edge middleware |
| Max output tokens | 2,000 | Keeps responses focused and fast |
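The per-user rate limit in the table can be illustrated with a sliding-window counter. This is only a sketch of the technique: the class name is hypothetical, the state lives in process memory, and the source says the real limit is enforced in Vercel edge middleware, where instances do not share memory and a shared store would be needed.

```typescript
// Hypothetical in-memory sliding-window limiter: at most `limit` calls per
// `windowMs` for each user. State is per process only.
class SlidingWindowLimiter {
  private readonly calls = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number = 10, windowMs: number = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the user is under the limit.
  // `now` is injectable so the behavior can be tested without waiting.
  tryAcquire(userId: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.calls.get(userId) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(now);
    this.calls.set(userId, recent);
    return allowed;
  }
}
```

A sliding window avoids the burst that a fixed one-minute bucket allows at bucket boundaries, at the cost of storing one timestamp per recent call.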
Claude Sonnet's 200K context window is allocated as follows:
System prompt: ~2K tokens (instructions + page metadata summary)
Tool definitions: ~1.5K tokens (5 tools with descriptions)
Registry overview: ~5–15K tokens (depends on entry count)
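Deep-read pages: ~8K per page × 5 max = 40K tokens (a conservative ceiling: each page is capped at 8,000 characters, which is typically closer to 2K tokens at roughly 4 characters per token)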
Conversation history: ~5K tokens (recent messages)
Output budget: 2K tokens
Total worst case: ~65K tokens — well within 200K limit.
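The worst-case figure can be checked by summing the upper bound of each line item. The snippet below simply restates the budget as data, in thousands of tokens; the variable names are illustrative.

```typescript
// Worst-case token budget per turn, using the upper bounds listed above
// (all values in thousands of tokens).
const budgetK = {
  systemPrompt: 2,
  toolDefinitions: 1.5,
  registryOverview: 15, // upper end of the 5-15K range
  deepReadPages: 8 * 5, // 8K per page, 5 pages max
  conversationHistory: 5,
  output: 2,
};

const CONTEXT_WINDOW_K = 200;
const totalK = Object.values(budgetK).reduce((sum, k) => sum + k, 0);
const headroomK = CONTEXT_WINDOW_K - totalK;

console.log(`worst case: ${totalK}K tokens, headroom: ${headroomK}K tokens`);
```

The sum comes to 65.5K, which the text rounds to ~65K, leaving roughly two thirds of the window unused even in the worst case.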
| # | Question | Options | Recommendation |
|---|---|---|---|
| 1 | Registry data caching between messages? | Fresh fetch each time vs. cache for session duration | Cache for 5 minutes. Registry changes are infrequent during a conversation. |
| 2 | Cross-registry discovery? | Restrict to own registry vs. allow meta-registry walk | Allow meta-registry walk but only for user's own domains. Enables "what's happening across all registries?" queries. |
| 3 | Write operations in v2? | Read-only forever vs. allow dismiss/annotate | v2: Allow dismiss + add-note. Never allow delete or structural changes via agent. |
| 4 | Streaming responses? | Wait for full response vs. SSE streaming | SSE streaming for chat action. Non-streaming for assess/rebuild/cleanup (need structured JSON output). |
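| 5 | Conversation persistence? | localStorage vs. server session vs. ephemeral | localStorage with a 10-message window, keyed by page URL. Cheap, no server state, and history stays scoped to the page (acceptable for page-scoped agent). localStorage itself persists across navigation, so the per-page key is what provides the scoping. |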
| 6 | Agent on non-registry pages? | Registry-only vs. any HC page | Any HC page with a registry_stack. Agent discovers context by walking up to its registry. A meeting page agent can answer questions about the meeting + related context. |
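Recommendation 1 (cache registry data for 5 minutes) amounts to a small time-to-live cache in front of the registry fetch. A minimal sketch follows; the `TtlCache` class is hypothetical, and where the cache lives (client, server instance, or a shared store) is left open by the source.

```typescript
// Hypothetical TTL cache for registry data (default TTL: 5 minutes).
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly ttlMs: number;

  constructor(ttlMs: number = 5 * 60 * 1000) {
    this.ttlMs = ttlMs;
  }

  // Returns the cached value, or undefined if it is missing or expired.
  // `now` is injectable so expiry can be tested without waiting.
  get(key: string, now: number = Date.now()): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.entries.delete(key); // Expired: evict so the caller refetches.
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

Keying the cache by registry URL would let the same instance also serve the meta-registry walk from question 2, since each registry is cached independently.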
This page is a new artifact — it creates no risk to existing pages. Recovery strategy:
| Layer | Mechanism | Action |
|---|---|---|
| Git | Source HTML committed before publish | git revert to remove temp file |
| Supabase | page_versions table auto-snapshots on registry update | Restore previous version via API or SQL Editor |
| Registry | Auto-cascade creates version before mutating registry | If the registry entry is unwanted, restore registry from version |
| Simple Delete | This is a standalone page with no dependencies | Delete from Supabase pages table — zero blast radius |
Git HEAD at time of generation: e715500. No existing files modified. The temp file tmp-hc-page-agent-architecture.html is the only new artifact. Registry entry can be independently reverted.