Test Guide: Card #738
ralph-v5 | script | ralph/ralph-v5--ralph-v5-phase-4-task-13-add-progress-events-thro-20260408_150644 | 2026-04-08 15:24:33
What Was Built
Ralph completed: [Ralph v5] Phase 4 Task 13: Add progress events throughout ralph.sh and ralph-poller.sh, using emit_event from pulse.sh.
Events emitted:
planning: when the planner starts
planned: when the plan JSON is produced
coding: when Claude is spawned
testing: when tests start
quality: when quality gates start
uat_waiting: when the Telegram message is sent
deploying: when deploy.sh is called
Acceptance: after a full task run, ralph_events has the planning→planned→coding→testing→deploying sequence for that run_id.
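To illustrate the pattern described above, the sketch below shows where each event might be emitted during one run. The real emit_event lives in pulse.sh and its signature is not shown in this guide. The two-argument form (run_id, event name), the stand-in function body, the EVENTS_FILE variable, and the run_task wrapper are all assumptions made so the example runs on its own.

```shell
#!/usr/bin/env sh
set -eu

# ASSUMPTION: stand-in for emit_event from pulse.sh. The real signature and
# storage are not shown in this guide; here events are appended to a temp file.
EVENTS_FILE="${EVENTS_FILE:-$(mktemp)}"

emit_event() {
  local run_id="$1" event="$2"
  # One tab-separated record per event: UTC timestamp, run_id, event name.
  printf '%s\t%s\t%s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$run_id" "$event" >> "$EVENTS_FILE"
}

# Hypothetical outline of one task run. Each ':' is a placeholder for real work.
run_task() {
  local run_id="$1"
  emit_event "$run_id" planning      # planner starts
  :                                  # ... run the planner ...
  emit_event "$run_id" planned       # plan JSON produced
  emit_event "$run_id" coding        # Claude spawned
  :                                  # ... coding happens ...
  emit_event "$run_id" testing       # tests start
  emit_event "$run_id" quality       # quality gates start
  emit_event "$run_id" uat_waiting   # Telegram sent
  emit_event "$run_id" deploying     # deploy.sh called
}

run_task "demo-run-1"

# Show run_id and event name for every recorded event.
cut -f2,3 "$EVENTS_FILE"
```

Emitting each event immediately before the stage it names means a run that stalls still shows the stage it reached last.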
What This Unlocks
Each stage of a run is recorded in ralph_events, so progress can be followed per run_id. Verify that the changes work as expected.
How To Verify
Follow each step. In the dashboard /verify tab, mark each as passed or failed, then submit your verdict.
Step 1. Run the script with realistic inputs, not --help and not bad arguments. Run the actual thing it was built to do (here, a full task run through ralph.sh) and look at the output.
Expected: Output that makes sense to read — structured sections, clear labels, numbers that tell you something about your system. Not raw dumps or walls of text.
Why: The real test: does the output actually help you understand something or make a decision? If it is unreadable or meaningless, it failed.
Step 2. Read the output like a non-developer. Is it scannable? Can you find the important information in 10 seconds? Are there clear sections, headers, or highlights?
Manual: Read the terminal output from step 1. Look for: clear section headers, important numbers highlighted or labeled, actionable items called out.
Expected: Output should be structured and scannable — key findings near the top, details below. You should be able to tell someone what it found in one sentence.
Why: Output quality check. A script that runs perfectly but produces unreadable output is useless to you.
Step 3. Decision check: after reading the output, do you know what to DO next? Does it tell you something new about your system, or just confirm what you already knew?
Manual: Based on the output, write down one thing you would DO differently. If you cannot think of anything, the output might not be actionable enough.
Expected: You should have at least one actionable takeaway — something to fix, investigate, or feel confident about.
Why: The purpose of any tool is to help you make decisions. If the output does not change your behavior, the tool needs work.
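Step 4. Check where the output goes. If it writes files, look at them. If it feeds the dashboard, check the dashboard page. If it publishes, check the published URL. For this card, check ralph_events for the run_id of the run from step 1.
Expected: Output artifacts (files, dashboard data, published pages) exist and are readable, and ralph_events has the planning→planned→coding→testing→deploying sequence for that run_id.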
Why: Integration check — the script does not exist in isolation. Its outputs should flow into the rest of the system.
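The acceptance sequence can be checked mechanically once the event names for one run_id have been extracted in the order they were recorded. The sketch below assumes they arrive one per line on stdin. How ralph_events is stored and queried is not shown in this guide, so the extraction step is left out, and has_sequence is a hypothetical helper name.

```shell
# Succeeds if the names given as arguments appear, in that order, among the
# event names read from stdin (one per line). Other events may be interleaved.
has_sequence() {
  local event
  while IFS= read -r event; do
    # Stop early once every expected name has been matched.
    if [ "$#" -eq 0 ]; then
      break
    fi
    # Consume the next expected name when the current event matches it.
    if [ "$event" = "$1" ]; then
      shift
    fi
  done
  [ "$#" -eq 0 ]
}

# Example input. In practice this would be the event names recorded in
# ralph_events for one run_id.
events='planning
planned
coding
testing
quality
uat_waiting
deploying'

if printf '%s\n' "$events" | has_sequence planning planned coding testing deploying; then
  echo "sequence OK"
else
  echo "sequence missing or out of order"
fi
```

The check is a subsequence match rather than an exact match, because the acceptance criterion names five events while a full run may also record quality and uat_waiting between them.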
Step 5. Try it with broken or missing inputs — does it explain what went wrong clearly, or does it just crash with a stack trace?
Expected: A clear error message explaining what is wrong and what to do about it, not a raw stack trace or cryptic exit code.
Why: Error UX matters — when something goes wrong, you need to know WHY and WHAT TO DO, not just that it failed.
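As a reference point for what a passing result in step 5 could look like, here is a minimal sketch of an input check that states both the problem and the next action. It is not taken from ralph.sh: the explain and check_inputs helpers, the argument order (task ID, then plan file), and the message wording are all hypothetical.

```shell
# Hypothetical helper: print what is wrong and what to try, on stderr.
explain() {
  printf 'ERROR: %s\n' "$1" >&2
  printf 'TRY:   %s\n' "$2" >&2
}

# Hypothetical input check for a script that takes a task ID and a plan file.
# Returns 2 on bad input so callers can tell usage errors from other failures.
check_inputs() {
  local task_id="${1:-}" plan_file="${2:-}"
  if [ -z "$task_id" ]; then
    explain "missing task ID" "pass the task ID as the first argument, for example: 738"
    return 2
  fi
  if [ ! -r "$plan_file" ]; then
    explain "cannot read plan file '$plan_file'" "check the path and permissions, then re-run"
    return 2
  fi
}

# Example: called with no arguments, this prints the first ERROR/TRY pair.
check_inputs || echo "exit status: $?"
```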
QA Gate Results
[2026-04-08 15:24:33] === QA CHECK — Task #738 ===
[2026-04-08 15:24:33] Project: ralph-v5
[2026-04-08 15:24:33] Task: [Ralph v5] Phase 4 Task 13: Add progress events throughout ralph.sh and ralph-poller.sh. Emit: planning (when planner starts), planned (when plan JSON produced), coding (when Claude spawned), testing (when tests start), quality (when quality gates start), uat_waiting (when Telegram sent), deploying (when deploy.sh called). Use emit_event from pulse.sh. Acceptance: after full task run, ralph_events has planning→planned→coding→testing→deploying sequence for that run_id.
[2026-04-08 15:24:33] Branch: ralph/ralph-v5--ralph-v5-phase-4-task-13-add-progress-events-thro-20260408_150644
[2026-04-08 15:24:33]
[2026-04-08 15:24:33] --- CHECK 1: SMOKE TEST ---
[2026-04-08 15:24:33] (could not read services.json)
[2026-04-08 15:24:33] CHECK 1: PASS
[2026-04-08 15:24:33]
[2026-04-08 15:24:33] --- CHECK 2: SPEC MATCH ---
[2026-04-08 15:24:33] WARN: LiteLLM call failed — skipping spec match (defaulting to PASS)
[2026-04-08 15:24:33] Spec match: SKIPPED (no LLM response)
[2026-04-08 15:24:33]
[2026-04-08 15:24:33] === OVERALL: PASS ===
[2026-04-08 15:24:33] WARN: Comms service unavailable — notification logged only
[2026-04-08 15:24:33] === QA CHECK COMPLETE ===
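When reading a QA log like the one above, it can help to pull out the lines showing that a check was skipped or ran without its inputs, since the overall result can still read PASS. The sketch below is one way to do that. The qa_caveats name and the match patterns are assumptions based only on the wording in this log.

```shell
# Print the lines of a QA log (read from stdin) that indicate a check was
# skipped or ran with missing inputs. Prints nothing if there are none.
qa_caveats() {
  grep -E 'WARN:|SKIPPED|could not read' || true
}

# A few lines copied from the log above, used as sample input.
sample_log='[2026-04-08 15:24:33] (could not read services.json)
[2026-04-08 15:24:33] CHECK 1: PASS
[2026-04-08 15:24:33] Spec match: SKIPPED (no LLM response)
[2026-04-08 15:24:33] === OVERALL: PASS ==='

printf '%s\n' "$sample_log" | qa_caveats
```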
Diff Summary
scripts/pulse.sh | 2 +-
scripts/ralph.sh | 17 +++++++++++++++++
2 files changed, 18 insertions(+), 1 deletion(-)