Test Guide: Card #1264
forge-conductor | script | ralph/forge-conductor-phase-1-task-1-create-conductor-state-table-in-sup-20260413_041526 | 2026-04-13 04:22:27
What Was Built
Ralph completed Phase 1, Task 1: create the conductor_state table in Supabase. The task was to write the SQL to scripts/sql/conductor-state.sql using the schema from Vault/projects/forge-conductor/PRD.md section 4, run it via bash scripts/run-sql.sh, and verify the table returns an empty array via curl to the Supabase REST API. The schema includes the columns has_alignment, has_discovery, has_design, has_draw, has_prd, current_stage, next_step, checkpoint_gate, checkpoint_status, checkpoint_summary, checkpoint_decision_gate (JSONB), tickle_tier, depends_on, and subsumes.
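The column list can be spot-checked directly against the Supabase REST API once the SQL has run. A minimal sketch, assuming the standard /rest/v1 PostgREST endpoint and that the project URL and key are exported as SUPABASE_URL and SUPABASE_SERVICE_KEY (both variable names are placeholders, not confirmed by this card):

    # Select all expected columns with limit=0: PostgREST rejects unknown columns,
    # so an HTTP 200 with "[]" confirms every listed column exists.
    curl -s "$SUPABASE_URL/rest/v1/conductor_state?select=has_alignment,has_discovery,has_design,has_draw,has_prd,current_stage,next_step,checkpoint_gate,checkpoint_status,checkpoint_summary,checkpoint_decision_gate,tickle_tier,depends_on,subsumes&limit=0" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY"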
What This Unlocks
Phase 1, Task 1 created the state table the rest of forge-conductor is expected to build on, so verify the changes work as expected before later tasks depend on the table.
How To Verify
Follow each step. In the dashboard /verify tab, mark each as passed or failed, then submit your verdict.
Step 1. Run the script with realistic inputs — not --help, not bad args. Run the ACTUAL thing it was built to do and look at the output.
Expected: Output that makes sense to read — structured sections, clear labels, numbers that tell you something about your system. Not raw dumps or walls of text.
Why: The real test is whether the output actually helps you understand something or make a decision. If it is unreadable or meaningless, it failed.
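For this card, the "actual thing" is the flow the task description spells out: apply the SQL via the runner script, then hit the Supabase REST API. A minimal sketch, assuming run-sql.sh takes the SQL file path as its first argument (check the script if not) and the same SUPABASE_URL / SUPABASE_SERVICE_KEY placeholders as above:

    # Apply the migration through the project's SQL runner.
    bash scripts/run-sql.sh scripts/sql/conductor-state.sql

    # A freshly created, empty table should come back as an empty JSON array.
    curl -s "$SUPABASE_URL/rest/v1/conductor_state" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY"
    # Expected output: []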
Step 2. Read the output like a non-developer. Is it scannable? Can you find the important information in 10 seconds? Are there clear sections, headers, or highlights?
Manual: Read the terminal output from step 1. Look for: clear section headers, important numbers highlighted or labeled, actionable items called out.
Expected: Output should be structured and scannable — key findings near the top, details below. You should be able to tell someone what it found in one sentence.
Why: Output quality check. A script that runs perfectly but produces unreadable output is useless to you.
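Raw curl output comes back as a single unformatted line, so if readability is the question, pretty-print it first; for example:

    # Pretty-print the REST response so keys and structure are easy to scan.
    curl -s "$SUPABASE_URL/rest/v1/conductor_state" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY" | python3 -m json.tool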
Step 3. Decision check: after reading the output, do you know what to DO next? Does it tell you something new about your system, or just confirm what you already knew?
Manual: Based on the output, write down one thing you would DO differently. If you cannot think of anything, the output might not be actionable enough.
Expected: You should have at least one actionable takeaway — something to fix, investigate, or feel confident about.
Why: The purpose of any tool is to help you make decisions. If the output does not change your behavior, the tool needs work.
Step 4. Check where the output goes — if it writes files, look at them. If it feeds the dashboard, check the dashboard page. If it publishes, check the published URL.
Expected: Output artifacts (files, dashboard data, published pages) exist and are readable.
Why: Integration check — the script does not exist in isolation. Its outputs should flow into the rest of the system.
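For this card the artifacts are the committed SQL file and the table it creates. A quick check, under the same credential assumptions as above:

    # The migration file should exist and contain the CREATE TABLE statement.
    ls -l scripts/sql/conductor-state.sql
    grep -i "create table" scripts/sql/conductor-state.sql

    # The table should be reachable over REST: expect a 200, not a 404.
    curl -s -o /dev/null -w "%{http_code}\n" "$SUPABASE_URL/rest/v1/conductor_state" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY"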
Step 5. Try it with broken or missing inputs — does it explain what went wrong clearly, or does it just crash with a stack trace?
Expected: Clear error message explaining what is wrong and what to do about it. Not a Python traceback or cryptic exit code.
Why: Error UX matters — when something goes wrong, you need to know WHY and WHAT TO DO, not just that it failed.
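Two easy ways to provoke a failure for this card, again assuming run-sql.sh takes a file path argument:

    # Point the runner at a file that does not exist; it should say so plainly,
    # not fall over with a raw stack trace.
    bash scripts/run-sql.sh scripts/sql/does-not-exist.sql

    # Query a table that was never created; Supabase returns a structured error
    # body, and whatever wraps this call should surface that message clearly.
    curl -s "$SUPABASE_URL/rest/v1/no_such_table" \
      -H "apikey: $SUPABASE_SERVICE_KEY" \
      -H "Authorization: Bearer $SUPABASE_SERVICE_KEY"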
QA Gate Results
No QA gate log found.
Diff Summary
scripts/__pycache__/check_logs.cpython-312.pyc | Bin 7552 -> 7547 bytes
scripts/sql/conductor-state.sql | 37 +++++++++++++++++++++
.../test_check_logs.cpython-312-pytest-9.0.2.pyc | Bin 29773 -> 29768 bytes
...est_approve_reject.cpython-312-pytest-9.0.2.pyc | Bin 18808 -> 18803 bytes
4 files changed, 37 insertions(+)