Test Guide: Card #738

ralph-v5 | script | ralph/ralph-v5--ralph-v5-phase-4-task-13-add-progress-events-thro-20260408_150644 | 2026-04-08 15:24:33

What Was Built

Ralph completed: [Ralph v5] Phase 4 Task 13: Add progress events throughout ralph.sh and ralph-poller.sh, using emit_event from pulse.sh.

Events to emit:
  planning      when the planner starts
  planned       when the plan JSON is produced
  coding        when Claude is spawned
  testing       when tests start
  quality       when quality gates start
  uat_waiting   when the Telegram message is sent
  deploying     when deploy.sh is called

Acceptance: after a full task run, ralph_events has the planning→planned→coding→testing→deploying sequence for that run_id.
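
The acceptance criterion can be checked mechanically once a run's event names have been pulled out of ralph_events. The following is a minimal sketch, not part of the change itself: `check_sequence` is a hypothetical helper, and it assumes the event names arrive on stdin, one per line, oldest first. How they are exported from ralph_events is not described in this guide and is left to the reader.

```shell
# Hypothetical helper (not part of ralph.sh or pulse.sh).
# Reads event names from stdin, one per line, oldest first (an assumption),
# and checks that the acceptance sequence appears in that order.
# Other events (for example quality or uat_waiting) may appear in between.
check_sequence() {
  seq_remaining="planning planned coding testing deploying"
  while IFS= read -r seq_event; do
    # First word of the remaining list is the next stage we expect.
    seq_next=${seq_remaining%% *}
    if [ -n "$seq_remaining" ] && [ "$seq_event" = "$seq_next" ]; then
      case $seq_remaining in
        *" "*) seq_remaining=${seq_remaining#* } ;;  # drop the matched stage
        *)     seq_remaining="" ;;                   # last stage matched
      esac
    fi
  done
  if [ -z "$seq_remaining" ]; then
    echo "OK: full sequence present"
    return 0
  fi
  # Report the first stage that was never seen in order.
  echo "MISSING: ${seq_remaining%% *}"
  return 1
}
```

For example, `printf '%s\n' planning planned coding testing quality uat_waiting deploying | check_sequence` reports the sequence as present, while a run that skipped `planned` is reported as missing that stage.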

What This Unlocks

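With these events in place, the progress of a task run can be followed stage by stage in ralph_events by looking up its run_id. The steps below verify that the change works as expected.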

How To Verify

Follow each step. In the dashboard /verify tab, mark each as passed or failed, then submit your verdict.

Step 1. Run the script with realistic inputs — not --help, not bad args. Run the ACTUAL thing it was built to do and look at the output.

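Expected: Output that makes sense to read: structured sections, clear labels, numbers that tell you something about your system. Not raw dumps or walls of text. For this card specifically, ralph_events should contain the planning→planned→coding→testing→deploying sequence for the run's run_id.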

Why: The real test: does the output actually help you understand something or make a decision? If it is unreadable or meaningless, it failed.

Step 2. Read the output like a non-developer. Is it scannable? Can you find the important information in 10 seconds? Are there clear sections, headers, or highlights?

Manual: Read the terminal output from step 1. Look for: clear section headers, important numbers highlighted or labeled, actionable items called out.

Expected: Output should be structured and scannable — key findings near the top, details below. You should be able to tell someone what it found in one sentence.

Why: Output quality check. A script that runs perfectly but produces unreadable output is useless to you.

Step 3. Decision check: after reading the output, do you know what to DO next? Does it tell you something new about your system, or just confirm what you already knew?

Manual: Based on the output, write down one thing you would DO differently. If you cannot think of anything, the output might not be actionable enough.

Expected: You should have at least one actionable takeaway — something to fix, investigate, or feel confident about.

Why: The purpose of any tool is to help you make decisions. If the output does not change your behavior, the tool needs work.

Step 4. Check where the output goes — if it writes files, look at them. If it feeds the dashboard, check the dashboard page. If it publishes, check the published URL.

Expected: Output artifacts (files, dashboard data, published pages) exist and are readable.

Why: Integration check — the script does not exist in isolation. Its outputs should flow into the rest of the system.

Step 5. Try it with broken or missing inputs — does it explain what went wrong clearly, or does it just crash with a stack trace?

Expected: Clear error message explaining what is wrong and what to do about it. Not a raw stack trace or a cryptic exit code.

Why: Error UX matters — when something goes wrong, you need to know WHY and WHAT TO DO, not just that it failed.
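
As a point of comparison for this step, here is a minimal sketch of the kind of failure message to look for. `require_file` is a hypothetical helper written for illustration; it is not taken from ralph.sh, and the real scripts may report errors differently.

```shell
# Hypothetical helper: fail with a message that names the problem
# and tells the reader what to do next, instead of a bare exit code.
require_file() {
  if [ ! -r "$1" ]; then
    echo "ERROR: cannot read '$1'. Check that the file exists and is readable, then re-run." >&2
    return 1
  fi
}
```

A message in this shape passes the step: it names the input that is missing and gives a next action. A bare non-zero exit with no explanation does not.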

QA Gate Results

[2026-04-08 15:24:33] === QA CHECK — Task #738 ===
[2026-04-08 15:24:33] Project: ralph-v5
[2026-04-08 15:24:33] Task: [Ralph v5] Phase 4 Task 13: Add progress events throughout ralph.sh and ralph-poller.sh. Emit: planning (when planner starts), planned (when plan JSON produced), coding (when Claude spawned), testing (when tests start), quality (when quality gates start), uat_waiting (when Telegram sent), deploying (when deploy.sh called). Use emit_event from pulse.sh. Acceptance: after full task run, ralph_events has planning→planned→coding→testing→deploying sequence for that run_id.
[2026-04-08 15:24:33] Branch: ralph/ralph-v5--ralph-v5-phase-4-task-13-add-progress-events-thro-20260408_150644
[2026-04-08 15:24:33] 
[2026-04-08 15:24:33] --- CHECK 1: SMOKE TEST ---
[2026-04-08 15:24:33] (could not read services.json)
[2026-04-08 15:24:33] CHECK 1: PASS
[2026-04-08 15:24:33] 
[2026-04-08 15:24:33] --- CHECK 2: SPEC MATCH ---
[2026-04-08 15:24:33] WARN: LiteLLM call failed — skipping spec match (defaulting to PASS)
[2026-04-08 15:24:33] Spec match: SKIPPED (no LLM response)
[2026-04-08 15:24:33] 
[2026-04-08 15:24:33] === OVERALL: PASS ===
[2026-04-08 15:24:33] WARN: Comms service unavailable — notification logged only
[2026-04-08 15:24:33] === QA CHECK COMPLETE ===
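
Several lines in the log above show a check that was skipped or degraded rather than actually run. A small sketch for surfacing such lines when reading a QA log follows. `qa_caveats` is a hypothetical helper, the log file path is whatever the reader saved the log to, and the patterns are copied from this one log, so other runs may use different wording.

```shell
# Hypothetical helper: print QA log lines that suggest a check was
# skipped or could not run. Patterns come from the log in this guide only.
# Exits non-zero when no such line is found (standard grep behaviour).
qa_caveats() {
  grep -E 'WARN:|SKIPPED|could not read' "$1"
}
```

Running `qa_caveats` on a saved copy of the log prints only the caveat lines, which makes it easier to see how much of an overall PASS rests on checks that actually ran.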

Diff Summary

 scripts/pulse.sh |  2 +-
 scripts/ralph.sh | 17 +++++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)