Test Guide: Card #1264
forge-conductor | script | ralph/forge-conductor-phase-1-task-1-create-conductor-state-table-in-sup-20260413_041526 | 2026-04-13 04:22:27
What Was Built
Ralph completed Phase 1, Task 1: create the conductor_state table in Supabase. The task was to write the SQL to scripts/sql/conductor-state.sql using the schema from Vault/projects/forge-conductor/PRD.md section 4, run it via bash scripts/run-sql.sh, and verify the table returns an empty array via curl to the Supabase REST API. The schema includes the columns has_alignment, has_discovery, has_design, has_draw, has_prd, current_stage, next_step, checkpoint_gate, checkpoint_status, checkpoint_summary, checkpoint_decision_gate (JSONB), tickle_tier, depends_on, and subsumes.
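The column list can be spot-checked directly against the Supabase REST API once the SQL has run. A minimal sketch, assuming the standard /rest/v1 PostgREST endpoint and that the project URL and key are exported as SUPABASE_URL and SUPABASE_SERVICE_KEY (both variable names are placeholders, not confirmed by this card):

    # Select all expected columns with limit=0: PostgREST rejects unknown columns,
    # so an HTTP 200 with "[]" confirms every listed column exists.
    curl -s "$SUPABASE_URL/rest/v1/conductor_state?select=has_alignment,has_discovery,has_design,has_draw,has_prd,current_stage,next_step,checkpoint_gate,checkpoint_status,checkpoint_summary,checkpoint_decision_gate,tickle_tier,depends_on,subsumes&limit=0" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY"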
What This Unlocks
Phase 1, Task 1 created the state table the rest of forge-conductor is expected to build on, so verify the changes work as expected before later tasks depend on the table.
How To Verify
Follow each step. In the dashboard /verify tab, mark each as passed or failed, then submit your verdict.
Step 1. Run the script with realistic inputs — not --help, not bad args. Run the ACTUAL thing it was built to do and look at the output.
Expected: Output that makes sense to read — structured sections, clear labels, numbers that tell you something about your system. Not raw dumps or walls of text.
Why: The real test is whether the output actually helps you understand something or make a decision. If it is unreadable or meaningless, it failed.
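For this card, the "actual thing" is the flow the task description spells out: apply the SQL via the runner script, then hit the Supabase REST API. A minimal sketch, assuming run-sql.sh takes the SQL file path as its first argument (check the script if not) and the same SUPABASE_URL / SUPABASE_SERVICE_KEY placeholders as above:

    # Apply the migration through the project's SQL runner.
    bash scripts/run-sql.sh scripts/sql/conductor-state.sql

    # A freshly created, empty table should come back as an empty JSON array.
    curl -s "$SUPABASE_URL/rest/v1/conductor_state" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY"
    # Expected output: []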
Step 2. Read the output like a non-developer. Is it scannable? Can you find the important information in 10 seconds? Are there clear sections, headers, or highlights?
Manual: Read the terminal output from step 1. Look for: clear section headers, important numbers highlighted or labeled, actionable items called out.
Expected: Output should be structured and scannable — key findings near the top, details below. You should be able to tell someone what it found in one sentence.
Why: Output quality check. A script that runs perfectly but produces unreadable output is useless to you.
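Raw curl output comes back as a single unformatted line, so if readability is the question, pretty-print it first; for example:

    # Pretty-print the REST response so keys and structure are easy to scan.
    curl -s "$SUPABASE_URL/rest/v1/conductor_state" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY" | python3 -m json.tool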
Step 3. Decision check: after reading the output, do you know what to DO next? Does it tell you something new about your system, or just confirm what you already knew?
Manual: Based on the output, write down one thing you would DO differently. If you cannot think of anything, the output might not be actionable enough.
Expected: You should have at least one actionable takeaway — something to fix, investigate, or feel confident about.
Why: The purpose of any tool is to help you make decisions. If the output does not change your behavior, the tool needs work.
Step 4. Check where the output goes — if it writes files, look at them. If it feeds the dashboard, check the dashboard page. If it publishes, check the published URL.
Expected: Output artifacts (files, dashboard data, published pages) exist and are readable.
Why: Integration check — the script does not exist in isolation. Its outputs should flow into the rest of the system.
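For this card the artifacts are the committed SQL file and the table it creates. A quick check, under the same credential assumptions as above:

    # The migration file should exist and contain the CREATE TABLE statement.
    ls -l scripts/sql/conductor-state.sql
    grep -i "create table" scripts/sql/conductor-state.sql

    # The table should be reachable over REST: expect a 200, not a 404.
    curl -s -o /dev/null -w "%{http_code}\n" "$SUPABASE_URL/rest/v1/conductor_state" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY"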
Step 5. Try it with broken or missing inputs — does it explain what went wrong clearly, or does it just crash with a stack trace?
Expected: Clear error message explaining what is wrong and what to do about it. Not a Python traceback or cryptic exit code.
Why: Error UX matters — when something goes wrong, you need to know WHY and WHAT TO DO, not just that it failed.
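Two easy ways to provoke a failure for this card, again assuming run-sql.sh takes a file path argument:

    # Point the runner at a file that does not exist; it should say so plainly,
    # not fall over with a raw stack trace.
    bash scripts/run-sql.sh scripts/sql/does-not-exist.sql

    # Query a table that was never created; Supabase returns a structured error
    # body, and whatever wraps this call should surface that message clearly.
    curl -s "$SUPABASE_URL/rest/v1/no_such_table" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY"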
QA Gate Results
No QA gate log found.
Diff Summary
scripts/__pycache__/check_logs.cpython-312.pyc | Bin 7552 -> 7547 bytes
scripts/sql/conductor-state.sql | 37 +++++++++++++++++++++
.../test_check_logs.cpython-312-pytest-9.0.2.pyc | Bin 29773 -> 29768 bytes
...est_approve_reject.cpython-312-pytest-9.0.2.pyc | Bin 18808 -> 18803 bytes
4 files changed, 37 insertions(+)