Test Guide: Card #1056

governance-as-chains | script | ralph/governance-as-chains-wire-chain-deploy-post-into-scripts-deploy-sh-afte-20260413_035751 | 2026-04-13 04:02:53

What Was Built

Ralph completed: wire /chain-deploy post into scripts/deploy.sh. After a successful merge and the existing smoke test, deploy.sh invokes /chain-deploy post (behavioral-verify + notify.sh). The hook is fail-open: if the chain invocation fails, it reports a 'DEPLOY OK — VERIFICATION SKIPPED' warning but does not block the deploy. SKIP_CHAIN_DEPLOY=1 was added as an emergency bypass env var. PRD Phase 5 Task 2.
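The fail-open logic described above can be sketched like this. This is a hypothetical reconstruction, not the actual deploy.sh: CHAIN_CMD stands in for whatever command really runs /chain-deploy post, and the exact messages may differ.

```shell
# Sketch of the fail-open wiring (hypothetical; CHAIN_CMD stands in for
# whatever command actually runs "/chain-deploy post" in deploy.sh).
run_chain_deploy_post() {
    # Emergency bypass: SKIP_CHAIN_DEPLOY=1 skips verification entirely.
    if [ "${SKIP_CHAIN_DEPLOY:-0}" = "1" ]; then
        echo "chain-deploy post skipped (SKIP_CHAIN_DEPLOY=1)"
        return 0
    fi
    # Fail-open: a failed chain invocation warns but never blocks deploy.
    if ${CHAIN_CMD:-false}; then
        echo "chain-deploy post verification passed"
    else
        echo "DEPLOY OK — VERIFICATION SKIPPED"
    fi
    return 0
}

# Simulate a failing chain invocation: the deploy still reports success.
CHAIN_CMD=false
run_chain_deploy_post   # prints: DEPLOY OK — VERIFICATION SKIPPED
```

Note that run_chain_deploy_post returns 0 on every path, which is the whole point of fail-open: verification failures surface as a warning, never as a blocked deploy.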

What This Unlocks

Deploys now get automatic post-deploy verification: after a successful merge and smoke test, scripts/deploy.sh runs /chain-deploy post (behavioral-verify + notify.sh) without anyone having to remember it. Because the hook is fail-open, a broken verification chain can never block a deploy, and SKIP_CHAIN_DEPLOY=1 provides an emergency bypass. The steps below verify the wiring, the fail-open warning, and the bypass.

How To Verify

Follow each step. In the dashboard /verify tab, mark each as passed or failed, then submit your verdict.

Step 1. Run scripts/deploy.sh with realistic inputs — not --help, not bad args. Run the ACTUAL thing it was built to do, i.e. a deploy that reaches the /chain-deploy post hook, and look at the output.
Expected: Output that makes sense to read — structured sections, clear labels, numbers that tell you something about your system. Not raw dumps or walls of text.
Why: The real test: does the output actually help you understand something or make a decision? If it is unreadable or meaningless, it failed.
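A minimal harness for step 1 (the path and stub output are assumptions; point DEPLOY at the real script in your checkout) that captures both the exit status and the log you will read in steps 2 and 3:

```shell
# Run the deploy script and capture both output and exit status.
# scripts/deploy.sh is the expected path; the stub below only kicks in
# when the script is absent, so the snippet runs anywhere.
DEPLOY=scripts/deploy.sh
if [ ! -x "$DEPLOY" ]; then
    DEPLOY='echo [stub] merge ok, smoke test passed, chain-deploy post ok'
fi
log=$($DEPLOY 2>&1)
status=$?
echo "exit status: $status"
printf '%s\n' "$log" | tail -n 20   # the part you actually read
```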
Step 2. Read the output like a non-developer. Is it scannable? Can you find the important information in 10 seconds? Are there clear sections, headers, or highlights?
Manual: Read the terminal output from step 1. Look for: clear section headers, important numbers highlighted or labeled, actionable items called out.
Expected: Output should be structured and scannable — key findings near the top, details below. You should be able to tell someone what it found in one sentence.
Why: Output quality check. A script that runs perfectly but produces unreadable output is useless to you.
Step 3. Decision check: after reading the output, do you know what to DO next? Does it tell you something new about your system, or just confirm what you already knew?
Manual: Based on the output, write down one thing you would DO differently. If you cannot think of anything, the output might not be actionable enough.
Expected: You should have at least one actionable takeaway — something to fix, investigate, or feel confident about.
Why: The purpose of any tool is to help you make decisions. If the output does not change your behavior, the tool needs work.
Step 4. Check where the output goes — if it writes files, look at them. If it feeds the dashboard, check the dashboard page. If it publishes, check the published URL.
Expected: Output artifacts (files, dashboard data, published pages) exist and are readable.
Why: Integration check — the script does not exist in isolation. Its outputs should flow into the rest of the system.
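Part of step 4 can be scripted. ARTIFACT below is a placeholder path, not the script's real output location; substitute whatever deploy.sh actually writes (log file, dashboard data, published page source):

```shell
# Check that an output artifact exists, is non-empty, and is readable.
# ARTIFACT is a placeholder; the demo file is created here only so the
# snippet runs anywhere.
ARTIFACT=${ARTIFACT:-/tmp/chain-deploy-demo.log}
printf 'demo artifact contents\n' > "$ARTIFACT"   # stand-in file

if [ -s "$ARTIFACT" ] && [ -r "$ARTIFACT" ]; then
    echo "OK: $ARTIFACT ($(wc -c < "$ARTIFACT") bytes)"
else
    echo "FAIL: $ARTIFACT missing, empty, or unreadable" >&2
fi
```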
Step 5. Try it with broken or missing inputs, and simulate a failing /chain-deploy invocation — does it explain what went wrong clearly (including the fail-open 'DEPLOY OK — VERIFICATION SKIPPED' warning), or does it just crash with a stack trace? Also confirm SKIP_CHAIN_DEPLOY=1 bypasses the chain entirely.
Expected: Clear error message explaining what is wrong and what to do about it; on chain failure, the 'DEPLOY OK — VERIFICATION SKIPPED' warning with the deploy still completing; with SKIP_CHAIN_DEPLOY=1, the chain skipped. Not a Python traceback or cryptic exit code.
Why: Error UX matters — when something goes wrong, you need to know WHY and WHAT TO DO, not just that it failed.
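A rough error-UX classifier for step 5. run_broken is a stub standing in for a real bad invocation (something like running deploy.sh with a bogus flag); the classification rules are an illustration, not a spec:

```shell
# Classify a failure message: PASS if it explains itself, FAIL if it is
# a raw traceback. run_broken is a stub for a deliberately bad invocation.
run_broken() {
    echo 'error: unknown flag --no-such-flag (see usage above)' >&2
    return 2
}

err=$(run_broken 2>&1)
rc=$?
if printf '%s' "$err" | grep -qi 'traceback'; then
    echo "FAIL: raw stack trace (exit $rc)"
elif printf '%s' "$err" | grep -qi 'error:'; then
    echo "PASS: human-readable error (exit $rc)"
else
    echo "WARN: cryptic or silent failure (exit $rc)"
fi
```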

QA Gate Results

No QA gate log found.

Diff Summary

 .claude/skills/chain-deploy                        |   1 +
 scripts/__pycache__/check_logs.cpython-312.pyc     | Bin 7552 -> 7552 bytes
 scripts/deploy.sh                                  |  36 ++++++-
 skills/chain-deploy/SKILL.md                       | 120 +++++++++++++++++++++
 .../test_check_logs.cpython-312-pytest-9.0.2.pyc   | Bin 29773 -> 29773 bytes
 ...est_approve_reject.cpython-312-pytest-9.0.2.pyc | Bin 18808 -> 18808 bytes
 6 files changed, 156 insertions(+), 1 deletion(-)