diff --git a/.claude/skills/debug-agent/SKILL.md b/.claude/skills/debug-agent/SKILL.md
new file mode 100644
index 00000000..95cad3af
--- /dev/null
+++ b/.claude/skills/debug-agent/SKILL.md
@@ -0,0 +1,184 @@
+---
+name: debug-agent
+description: >-
+  Systematic evidence-based debugging using runtime logs. Generates hypotheses,
+  instruments code with NDJSON logs, guides reproduction, analyzes log evidence,
+  and iterates until root cause is proven with cited log lines. Use when the
+  user reports a bug, unexpected behavior, or asks to debug an issue.
+---
+
+# Debug Mode
+
+You are now in **DEBUG MODE**. You must debug with **runtime evidence**.
+
+**Why this approach:** AI agents often jump straight to a fix and claim 100% confidence, then fail because they lack runtime information.
+They guess from code alone. You **must NOT** fix bugs this way; you need actual runtime data.
+
+**Your systematic workflow:**
+
+1. **Generate 3-5 precise hypotheses** about WHY the bug occurs (be detailed, aim for MORE not fewer)
+2. **Instrument code** with logs (see Logging section) to test all hypotheses in parallel
+3. **Reproduce the bug.**
+   - **If a failing test already exists**: run it directly.
+   - **If reproduction is straightforward** (e.g., a single CLI command, a curl request, a simple script): write and run an ad hoc reproduction script yourself. Tailor it to the runtime — Playwright/Puppeteer for browser bugs, a Node/Python/shell script for backend bugs, etc.
+   - **Otherwise**: ask the user to reproduce it. Provide clear, numbered steps. Remind them to restart apps/services if instrumented files are cached or bundled. Offer: "If you'd like me to write a reproduction script instead, let me know."
+   - Once the user confirms a reproduction pathway (manual or automated), reuse it for all subsequent iterations without re-asking.
+4. **Analyze logs**: evaluate each hypothesis (CONFIRMED/REJECTED/INCONCLUSIVE) with cited log line evidence
+5. **Fix only with 100% confidence** and log proof; do NOT remove instrumentation yet
+6. **Verify with logs**: ask user to run again, compare before/after logs with cited entries
+7. **If logs prove success** and user confirms: remove all instrumentation by searching for `#region debug log` / `#endregion` markers and deleting those blocks (see Cleanup section). **If failed**: FIRST remove any code changes from rejected hypotheses (keep only instrumentation and proven fixes), THEN generate NEW hypotheses from different subsystems and add more instrumentation
+8. **After confirmed success**: explain the problem and provide a concise summary of the fix (1-2 lines)
+
+**Critical constraints:**
+
+- NEVER fix without runtime evidence first
+- ALWAYS rely on runtime information + code (never code alone)
+- Do NOT remove instrumentation before post-fix verification logs prove success and user confirms that there are no more issues
+- Fixes often fail; iteration is expected and preferred. Taking longer with more data yields better, more precise fixes
+
+---
+
+## Logging
+
+### STEP 0: Start the logging server (MANDATORY BEFORE ANY INSTRUMENTATION)
+
+Run the debug server in **daemon mode** before any instrumentation. The `--daemon` flag starts the server in the background and exits immediately with the server info — no backgrounding or `&` required.
+
+```bash
+npx debug-agent --daemon
+```
+
+The command prints a single JSON line to stdout and exits:
+
+```json
+{
+  "sessionId": "a1b2c3",
+  "port": 54321,
+  "endpoint": "http://127.0.0.1:54321/ingest/a1b2c3",
+  "logPath": "/tmp/debug-agent/debug-a1b2c3.log"
+}
+```
+
+Capture and remember these values:
+
+- **Server endpoint**: The `endpoint` value (the HTTP endpoint URL where logs will be sent via POST requests)
+- **Log path**: The `logPath` value (NDJSON logs are written here)
+- **Session ID**: The `sessionId` value (unique identifier for this debug session)
+
+If the server fails to start, STOP IMMEDIATELY and inform the user.
+
+- DO NOT PROCEED with instrumentation without valid logging configuration.
+- The server is idempotent — if one is already running, it returns the existing server's info instead of starting a duplicate.
+- You do not need to pre-create the log file; it will be created automatically when your instrumentation first writes to it.
+
+### STEP 1: Understand the log format
+
+- Logs are written in **NDJSON format** (one JSON object per line) to the file specified by the **log path**.
+- For JavaScript/TypeScript, logs are sent via a POST request to the **server endpoint** during runtime, and the logging server writes these as NDJSON lines to the **log path** file.
+- For other languages (Python, Go, Rust, Java, C/C++, Ruby, etc.), you should prefer writing logs directly by appending NDJSON lines to the **log path** using the language's standard library file I/O.
+
+Example log entry:
+
+```json
+{
+  "sessionId": "a1b2c3",
+  "id": "log_1733456789_abc",
+  "timestamp": 1733456789000,
+  "location": "test.js:42",
+  "message": "User score",
+  "data": { "userId": 5, "score": 85 },
+  "runId": "run1",
+  "hypothesisId": "A"
+}
+```
+
+### STEP 2: Insert instrumentation logs
+
+- In **JavaScript/TypeScript files**, use this one-line fetch template, even if filesystem access is available. Replace `ENDPOINT` and `SESSION_ID` with the values from Step 0, and set `runId`, `hypothesisId`, `location`, `message`, and `data` for each log:
+
+```js
+fetch('ENDPOINT',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({sessionId:'SESSION_ID',runId:'run1',hypothesisId:'A',location:'file.js:LINE',message:'desc',data:{k:v},timestamp:Date.now()})}).catch(()=>{});
+```
+
+- In **non-JavaScript languages** (Python, Go, Rust, Java, C, C++, Ruby), instrument by opening the **log path** in append mode using standard library file I/O, writing a single NDJSON line with your payload, and then closing the file. Keep these snippets as tiny and compact as possible (ideally one line, or just a few).
+
+- Decide how many instrumentation logs to insert based on the complexity of the code under investigation and the hypotheses you are testing. A single well-placed log may be enough when the issue is highly localized; complex multi-step flows may need more. Aim for the minimum number that can confirm or reject ALL your hypotheses. Guidelines:
+  - At least 1 log is required; never skip instrumentation entirely
+  - Do not exceed 10 logs — if you think you need more, narrow your hypotheses first
+  - Typical range is 2-6 logs, but use your judgment
+
+- Choose log placements from these categories as relevant to your hypotheses:
+  - Function entry with parameters
+  - Function exit with return values
+  - Values BEFORE critical operations
+  - Values AFTER critical operations
+  - Branch execution paths (which if/else executed)
+  - Suspected error/edge case values
+  - State mutations and intermediate values
+
+- Each log must map to at least one hypothesis (include `hypothesisId` in payload).
+- Use this payload structure: `{sessionId, runId, hypothesisId, location, message, data, timestamp}`
+- **REQUIRED:** Wrap EACH debug log in a collapsible code region:
+  - Use language-appropriate region syntax (e.g., `// #region debug log`, `// #endregion` for JS/TS)
+  - This keeps the editor clean by auto-folding debug instrumentation
+- **FORBIDDEN:** Logging secrets (tokens, passwords, API keys, PII)
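+
+For non-JavaScript code, the append-mode approach described above might look like the following Python sketch. The helper name `debug_log`, the `LOG_PATH` and `SESSION_ID` constants, and the `compute_score` function are placeholders for illustration, not part of debug-agent; substitute the `logPath` and `sessionId` values from Step 0.
+
+```python
+import json
+import time
+
+# Assumed placeholders: use the logPath and sessionId printed in Step 0.
+LOG_PATH = "/tmp/debug-agent/debug-a1b2c3.log"
+SESSION_ID = "a1b2c3"
+
+
+def debug_log(location, message, data, hypothesis_id, run_id="run1", log_path=LOG_PATH):
+    """Append one NDJSON line (one JSON object per line) to the log file."""
+    entry = {
+        "sessionId": SESSION_ID,
+        "runId": run_id,
+        "hypothesisId": hypothesis_id,
+        "location": location,
+        "message": message,
+        "data": data,
+        "timestamp": int(time.time() * 1000),  # epoch milliseconds
+    }
+    # Append mode creates the file if it is missing; "with" closes it afterwards.
+    with open(log_path, "a", encoding="utf-8") as f:
+        f.write(json.dumps(entry) + "\n")
+
+
+def compute_score(user_id, raw):
+    """Hypothetical function under investigation, with one instrumented entry log."""
+    #region debug log
+    debug_log("scores.py:27", "compute_score entry", {"userId": user_id, "raw": raw}, "A")
+    #endregion
+    return raw * 2
+```
+
+The markers are written as `#region debug log` / `#endregion` with no space after `#`, so that the Cleanup search for `#region debug log` matches them.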
+
+### STEP 3: Clear previous log file before each run (MANDATORY)
+
+- Send a `DELETE` request to the **server endpoint** to clear the log file before each run. For example: `curl -X DELETE ENDPOINT` (replace `ENDPOINT` with the endpoint value from Step 0).
+- This ensures clean logs for the new run without mixing old and new data.
+- Clearing the log file is NOT the same as removing instrumentation; do not remove any debug logs from code here.
+- **CRITICAL:** Only clear YOUR session's logs (via your endpoint from Step 0). NEVER delete, modify, or overwrite log files belonging to other debug sessions.
+
+### STEP 4: Read logs after user runs the program
+
+- After the user runs the program and confirms completion in their interface, read the file at the **log path** with the file-read tool. Do NOT ask them to type "done".
+- The log file will contain NDJSON entries (one JSON object per line) from your instrumentation.
+- Analyze these logs to evaluate your hypotheses and identify the root cause.
+- If the log file is empty or missing, tell the user the reproduction may have failed and ask them to try again.
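+
+The analysis in this step can be sketched in Python as follows. `load_entries` and `group_by_hypothesis` are hypothetical helper names, not part of debug-agent; the field names (`hypothesisId`, `runId`) come from the example log entry in Step 1.
+
+```python
+import json
+from collections import defaultdict
+
+
+def load_entries(log_path):
+    """Parse an NDJSON log file into (line_number, entry) pairs."""
+    entries = []
+    try:
+        with open(log_path, encoding="utf-8") as f:
+            for line_number, line in enumerate(f, start=1):
+                line = line.strip()
+                if line:  # skip blank lines
+                    entries.append((line_number, json.loads(line)))
+    except FileNotFoundError:
+        pass  # a missing file is treated like an empty log
+    return entries
+
+
+def group_by_hypothesis(entries, run_id=None):
+    """Group entries by hypothesisId, optionally keeping only one runId."""
+    groups = defaultdict(list)
+    for line_number, entry in entries:
+        if run_id is None or entry.get("runId") == run_id:
+            key = entry.get("hypothesisId", "unassigned")
+            groups[key].append((line_number, entry))
+    return dict(groups)
+```
+
+Keeping the line number next to each entry makes it easy to cite the exact log line when marking a hypothesis CONFIRMED, REJECTED, or INCONCLUSIVE. An empty result from `load_entries` corresponds to the empty-or-missing case above.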
+
+### STEP 5: Keep logs during fixes
+
+- When implementing a fix, DO NOT remove debug logs yet.
+- Logs MUST remain active for verification runs.
+- You may tag logs with `runId="post-fix"` to distinguish verification runs from initial debugging runs.
+- FORBIDDEN: Removing or modifying any previously added logs in any files before post-fix verification logs are analyzed or the user explicitly confirms success.
+- Only remove logs after a successful post-fix verification run (log-based proof) or explicit user request to remove.
+
+---
+
+## Critical Reminders (must follow)
+
+- Keep instrumentation active during fixes; do not remove or modify logs until verification succeeds or the user explicitly confirms.
+- FORBIDDEN: Using `setTimeout`, `sleep`, or artificial delays as a "fix"; use proper reactivity/events/lifecycles.
+- FORBIDDEN: Removing instrumentation before analyzing post-fix verification logs or receiving explicit user confirmation.
+- Verification requires before/after log comparison with cited log lines; do not claim success without log proof.
+- Clear logs by sending a DELETE request to the server endpoint.
+- Do not create the log file manually; it's created automatically.
+- Clearing the log file is not removing instrumentation.
+- NEVER delete or modify log files that do not belong to this session. Only touch the log file at the exact path from Step 0.
+- Base every fix on evidence from the logs; when the evidence is insufficient, generate new hypotheses instead of guessing.
+- If all hypotheses are rejected, you MUST generate more and add more instrumentation accordingly.
+- **Remove code changes from rejected hypotheses:** When logs prove a hypothesis wrong, revert the code changes made for that hypothesis. Do not let defensive guards, speculative fixes, or unproven changes accumulate. Only keep modifications that are supported by runtime evidence.
+- Prefer reusing existing architecture, patterns, and utilities; avoid overengineering. Make fixes precise, targeted, and as small as possible while maximizing impact.
+
+## Cleanup
+
+When it is time to remove instrumentation (after verified fix or user request):
+
+1. Search all files for `#region debug log` markers (e.g., grep/ripgrep for `#region debug log`)
+2. For each match, delete everything from the `#region debug log` line through its corresponding `#endregion` line (inclusive)
+3. Grep again to verify zero markers remain
+4. Run `git diff` to review all changes — confirm only your intentional fix remains and no stray debug code was missed
+
+This is why wrapping every debug log in `#region debug log` / `#endregion` is mandatory — it enables deterministic cleanup.
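+
+Step 2 above can be approximated with a small Python helper. This is a sketch that assumes regions are never nested and that each marker sits on its own line; `strip_debug_regions` is a hypothetical name, not something debug-agent ships.
+
+```python
+def strip_debug_regions(source):
+    """Return source text with every debug-log region removed, markers included."""
+    kept = []
+    inside = False
+    for line in source.splitlines(keepends=True):
+        if not inside and "#region debug log" in line:
+            inside = True  # drop the opening marker line
+        elif inside:
+            if "#endregion" in line:
+                inside = False  # drop the closing marker line
+        else:
+            kept.append(line)
+    if inside:
+        raise ValueError("unterminated '#region debug log' block")
+    return "".join(kept)
+```
+
+An `#endregion` line is only consumed while inside a debug-log region, so unrelated regions in the file are left alone. Reviewing `git diff` afterwards is still required.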
+
+---
+
+## Server API reference
+
+| Request                     | Effect                                      |
+| --------------------------- | ------------------------------------------- |
+| `POST /ingest/:sessionId`   | Append JSON body as NDJSON line to log file |
+| `GET /ingest/:sessionId`    | Read full log file contents                 |
+| `DELETE /ingest/:sessionId` | Clear the log file                          |
diff --git a/.codex/skills/debug-agent/SKILL.md b/.codex/skills/debug-agent/SKILL.md
new file mode 100644
index 00000000..95cad3af
--- /dev/null
+++ b/.codex/skills/debug-agent/SKILL.md
@@ -0,0 +1,184 @@
+---
+name: debug-agent
+description: >-
+  Systematic evidence-based debugging using runtime logs. Generates hypotheses,
+  instruments code with NDJSON logs, guides reproduction, analyzes log evidence,
+  and iterates until root cause is proven with cited log lines. Use when the
+  user reports a bug, unexpected behavior, or asks to debug an issue.
+---
+
+# Debug Mode
+
+You are now in **DEBUG MODE**. You must debug with **runtime evidence**.
+
+**Why this approach:** AI agents often jump straight to a fix and claim 100% confidence, then fail because they lack runtime information.
+They guess from code alone. You **must NOT** fix bugs this way; you need actual runtime data.
+
+**Your systematic workflow:**
+
+1. **Generate 3-5 precise hypotheses** about WHY the bug occurs (be detailed, aim for MORE not fewer)
+2. **Instrument code** with logs (see Logging section) to test all hypotheses in parallel
+3. **Reproduce the bug.**
+   - **If a failing test already exists**: run it directly.
+   - **If reproduction is straightforward** (e.g., a single CLI command, a curl request, a simple script): write and run an ad hoc reproduction script yourself. Tailor it to the runtime — Playwright/Puppeteer for browser bugs, a Node/Python/shell script for backend bugs, etc.
+   - **Otherwise**: ask the user to reproduce it. Provide clear, numbered steps. Remind them to restart apps/services if instrumented files are cached or bundled. Offer: "If you'd like me to write a reproduction script instead, let me know."
+   - Once the user confirms a reproduction pathway (manual or automated), reuse it for all subsequent iterations without re-asking.
+4. **Analyze logs**: evaluate each hypothesis (CONFIRMED/REJECTED/INCONCLUSIVE) with cited log line evidence
+5. **Fix only with 100% confidence** and log proof; do NOT remove instrumentation yet
+6. **Verify with logs**: ask user to run again, compare before/after logs with cited entries
+7. **If logs prove success** and user confirms: remove all instrumentation by searching for `#region debug log` / `#endregion` markers and deleting those blocks (see Cleanup section). **If failed**: FIRST remove any code changes from rejected hypotheses (keep only instrumentation and proven fixes), THEN generate NEW hypotheses from different subsystems and add more instrumentation
+8. **After confirmed success**: explain the problem and provide a concise summary of the fix (1-2 lines)
+
+**Critical constraints:**
+
+- NEVER fix without runtime evidence first
+- ALWAYS rely on runtime information + code (never code alone)
+- Do NOT remove instrumentation before post-fix verification logs prove success and user confirms that there are no more issues
+- Fixes often fail; iteration is expected and preferred. Taking longer with more data yields better, more precise fixes
+
+---
+
+## Logging
+
+### STEP 0: Start the logging server (MANDATORY BEFORE ANY INSTRUMENTATION)
+
+Run the debug server in **daemon mode** before any instrumentation. The `--daemon` flag starts the server in the background and exits immediately with the server info — no backgrounding or `&` required.
+
+```bash
+npx debug-agent --daemon
+```
+
+The command prints a single JSON line to stdout and exits:
+
+```json
+{
+  "sessionId": "a1b2c3",
+  "port": 54321,
+  "endpoint": "http://127.0.0.1:54321/ingest/a1b2c3",
+  "logPath": "/tmp/debug-agent/debug-a1b2c3.log"
+}
+```
+
+Capture and remember these values:
+
+- **Server endpoint**: The `endpoint` value (the HTTP endpoint URL where logs will be sent via POST requests)
+- **Log path**: The `logPath` value (NDJSON logs are written here)
+- **Session ID**: The `sessionId` value (unique identifier for this debug session)
+
+If the server fails to start, STOP IMMEDIATELY and inform the user.
+
+- DO NOT PROCEED with instrumentation without valid logging configuration.
+- The server is idempotent — if one is already running, it returns the existing server's info instead of starting a duplicate.
+- You do not need to pre-create the log file; it will be created automatically when your instrumentation first writes to it.
+
+### STEP 1: Understand the log format
+
+- Logs are written in **NDJSON format** (one JSON object per line) to the file specified by the **log path**.
+- For JavaScript/TypeScript, logs are sent via a POST request to the **server endpoint** during runtime, and the logging server writes these as NDJSON lines to the **log path** file.
+- For other languages (Python, Go, Rust, Java, C/C++, Ruby, etc.), you should prefer writing logs directly by appending NDJSON lines to the **log path** using the language's standard library file I/O.
+
+Example log entry:
+
+```json
+{
+  "sessionId": "a1b2c3",
+  "id": "log_1733456789_abc",
+  "timestamp": 1733456789000,
+  "location": "test.js:42",
+  "message": "User score",
+  "data": { "userId": 5, "score": 85 },
+  "runId": "run1",
+  "hypothesisId": "A"
+}
+```
+
+### STEP 2: Insert instrumentation logs
+
+- In **JavaScript/TypeScript files**, use this one-line fetch template, even if filesystem access is available. Replace `ENDPOINT` and `SESSION_ID` with the values from Step 0, and set `runId`, `hypothesisId`, `location`, `message`, and `data` for each log:
+
+```js
+fetch('ENDPOINT',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({sessionId:'SESSION_ID',runId:'run1',hypothesisId:'A',location:'file.js:LINE',message:'desc',data:{k:v},timestamp:Date.now()})}).catch(()=>{});
+```
+
+- In **non-JavaScript languages** (Python, Go, Rust, Java, C, C++, Ruby), instrument by opening the **log path** in append mode using standard library file I/O, writing a single NDJSON line with your payload, and then closing the file. Keep these snippets as tiny and compact as possible (ideally one line, or just a few).
+
+- Decide how many instrumentation logs to insert based on the complexity of the code under investigation and the hypotheses you are testing. A single well-placed log may be enough when the issue is highly localized; complex multi-step flows may need more. Aim for the minimum number that can confirm or reject ALL your hypotheses. Guidelines:
+  - At least 1 log is required; never skip instrumentation entirely
+  - Do not exceed 10 logs — if you think you need more, narrow your hypotheses first
+  - Typical range is 2-6 logs, but use your judgment
+
+- Choose log placements from these categories as relevant to your hypotheses:
+  - Function entry with parameters
+  - Function exit with return values
+  - Values BEFORE critical operations
+  - Values AFTER critical operations
+  - Branch execution paths (which if/else executed)
+  - Suspected error/edge case values
+  - State mutations and intermediate values
+
+- Each log must map to at least one hypothesis (include `hypothesisId` in payload).
+- Use this payload structure: `{sessionId, runId, hypothesisId, location, message, data, timestamp}`
+- **REQUIRED:** Wrap EACH debug log in a collapsible code region:
+  - Use language-appropriate region syntax (e.g., `// #region debug log`, `// #endregion` for JS/TS)
+  - This keeps the editor clean by auto-folding debug instrumentation
+- **FORBIDDEN:** Logging secrets (tokens, passwords, API keys, PII)
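+
+For non-JavaScript code, the append-mode approach described above might look like the following Python sketch. The helper name `debug_log`, the `LOG_PATH` and `SESSION_ID` constants, and the `compute_score` function are placeholders for illustration, not part of debug-agent; substitute the `logPath` and `sessionId` values from Step 0.
+
+```python
+import json
+import time
+
+# Assumed placeholders: use the logPath and sessionId printed in Step 0.
+LOG_PATH = "/tmp/debug-agent/debug-a1b2c3.log"
+SESSION_ID = "a1b2c3"
+
+
+def debug_log(location, message, data, hypothesis_id, run_id="run1", log_path=LOG_PATH):
+    """Append one NDJSON line (one JSON object per line) to the log file."""
+    entry = {
+        "sessionId": SESSION_ID,
+        "runId": run_id,
+        "hypothesisId": hypothesis_id,
+        "location": location,
+        "message": message,
+        "data": data,
+        "timestamp": int(time.time() * 1000),  # epoch milliseconds
+    }
+    # Append mode creates the file if it is missing; "with" closes it afterwards.
+    with open(log_path, "a", encoding="utf-8") as f:
+        f.write(json.dumps(entry) + "\n")
+
+
+def compute_score(user_id, raw):
+    """Hypothetical function under investigation, with one instrumented entry log."""
+    #region debug log
+    debug_log("scores.py:27", "compute_score entry", {"userId": user_id, "raw": raw}, "A")
+    #endregion
+    return raw * 2
+```
+
+The markers are written as `#region debug log` / `#endregion` with no space after `#`, so that the Cleanup search for `#region debug log` matches them.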
+
+### STEP 3: Clear previous log file before each run (MANDATORY)
+
+- Send a `DELETE` request to the **server endpoint** to clear the log file before each run. For example: `curl -X DELETE ENDPOINT` (replace `ENDPOINT` with the endpoint value from Step 0).
+- This ensures clean logs for the new run without mixing old and new data.
+- Clearing the log file is NOT the same as removing instrumentation; do not remove any debug logs from code here.
+- **CRITICAL:** Only clear YOUR session's logs (via your endpoint from Step 0). NEVER delete, modify, or overwrite log files belonging to other debug sessions.
+
+### STEP 4: Read logs after user runs the program
+
+- After the user runs the program and confirms completion in their interface, read the file at the **log path** with the file-read tool. Do NOT ask them to type "done".
+- The log file will contain NDJSON entries (one JSON object per line) from your instrumentation.
+- Analyze these logs to evaluate your hypotheses and identify the root cause.
+- If the log file is empty or missing, tell the user the reproduction may have failed and ask them to try again.
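+
+The analysis in this step can be sketched in Python as follows. `load_entries` and `group_by_hypothesis` are hypothetical helper names, not part of debug-agent; the field names (`hypothesisId`, `runId`) come from the example log entry in Step 1.
+
+```python
+import json
+from collections import defaultdict
+
+
+def load_entries(log_path):
+    """Parse an NDJSON log file into (line_number, entry) pairs."""
+    entries = []
+    try:
+        with open(log_path, encoding="utf-8") as f:
+            for line_number, line in enumerate(f, start=1):
+                line = line.strip()
+                if line:  # skip blank lines
+                    entries.append((line_number, json.loads(line)))
+    except FileNotFoundError:
+        pass  # a missing file is treated like an empty log
+    return entries
+
+
+def group_by_hypothesis(entries, run_id=None):
+    """Group entries by hypothesisId, optionally keeping only one runId."""
+    groups = defaultdict(list)
+    for line_number, entry in entries:
+        if run_id is None or entry.get("runId") == run_id:
+            key = entry.get("hypothesisId", "unassigned")
+            groups[key].append((line_number, entry))
+    return dict(groups)
+```
+
+Keeping the line number next to each entry makes it easy to cite the exact log line when marking a hypothesis CONFIRMED, REJECTED, or INCONCLUSIVE. An empty result from `load_entries` corresponds to the empty-or-missing case above.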
+
+### STEP 5: Keep logs during fixes
+
+- When implementing a fix, DO NOT remove debug logs yet.
+- Logs MUST remain active for verification runs.
+- You may tag logs with `runId="post-fix"` to distinguish verification runs from initial debugging runs.
+- FORBIDDEN: Removing or modifying any previously added logs in any files before post-fix verification logs are analyzed or the user explicitly confirms success.
+- Only remove logs after a successful post-fix verification run (log-based proof) or explicit user request to remove.
+
+---
+
+## Critical Reminders (must follow)
+
+- Keep instrumentation active during fixes; do not remove or modify logs until verification succeeds or the user explicitly confirms.
+- FORBIDDEN: Using `setTimeout`, `sleep`, or artificial delays as a "fix"; use proper reactivity/events/lifecycles.
+- FORBIDDEN: Removing instrumentation before analyzing post-fix verification logs or receiving explicit user confirmation.
+- Verification requires before/after log comparison with cited log lines; do not claim success without log proof.
+- Clear logs by sending a DELETE request to the server endpoint.
+- Do not create the log file manually; it's created automatically.
+- Clearing the log file is not removing instrumentation.
+- NEVER delete or modify log files that do not belong to this session. Only touch the log file at the exact path from Step 0.
+- Base every fix on evidence from the logs; when the evidence is insufficient, generate new hypotheses instead of guessing.
+- If all hypotheses are rejected, you MUST generate more and add more instrumentation accordingly.
+- **Remove code changes from rejected hypotheses:** When logs prove a hypothesis wrong, revert the code changes made for that hypothesis. Do not let defensive guards, speculative fixes, or unproven changes accumulate. Only keep modifications that are supported by runtime evidence.
+- Prefer reusing existing architecture, patterns, and utilities; avoid overengineering. Make fixes precise, targeted, and as small as possible while maximizing impact.
+
+## Cleanup
+
+When it is time to remove instrumentation (after verified fix or user request):
+
+1. Search all files for `#region debug log` markers (e.g., grep/ripgrep for `#region debug log`)
+2. For each match, delete everything from the `#region debug log` line through its corresponding `#endregion` line (inclusive)
+3. Grep again to verify zero markers remain
+4. Run `git diff` to review all changes — confirm only your intentional fix remains and no stray debug code was missed
+
+This is why wrapping every debug log in `#region debug log` / `#endregion` is mandatory — it enables deterministic cleanup.
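+
+Step 2 above can be approximated with a small Python helper. This is a sketch that assumes regions are never nested and that each marker sits on its own line; `strip_debug_regions` is a hypothetical name, not something debug-agent ships.
+
+```python
+def strip_debug_regions(source):
+    """Return source text with every debug-log region removed, markers included."""
+    kept = []
+    inside = False
+    for line in source.splitlines(keepends=True):
+        if not inside and "#region debug log" in line:
+            inside = True  # drop the opening marker line
+        elif inside:
+            if "#endregion" in line:
+                inside = False  # drop the closing marker line
+        else:
+            kept.append(line)
+    if inside:
+        raise ValueError("unterminated '#region debug log' block")
+    return "".join(kept)
+```
+
+An `#endregion` line is only consumed while inside a debug-log region, so unrelated regions in the file are left alone. Reviewing `git diff` afterwards is still required.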
+
+---
+
+## Server API reference
+
+| Request                     | Effect                                      |
+| --------------------------- | ------------------------------------------- |
+| `POST /ingest/:sessionId`   | Append JSON body as NDJSON line to log file |
+| `GET /ingest/:sessionId`    | Read full log file contents                 |
+| `DELETE /ingest/:sessionId` | Clear the log file                          |
diff --git a/.cursor/skills/debug-agent/SKILL.md b/.cursor/skills/debug-agent/SKILL.md
new file mode 100644
index 00000000..95cad3af
--- /dev/null
+++ b/.cursor/skills/debug-agent/SKILL.md
@@ -0,0 +1,184 @@
+---
+name: debug-agent
+description: >-
+  Systematic evidence-based debugging using runtime logs. Generates hypotheses,
+  instruments code with NDJSON logs, guides reproduction, analyzes log evidence,
+  and iterates until root cause is proven with cited log lines. Use when the
+  user reports a bug, unexpected behavior, or asks to debug an issue.
+---
+
+# Debug Mode
+
+You are now in **DEBUG MODE**. You must debug with **runtime evidence**.
+
+**Why this approach:** AI agents often jump straight to a fix and claim 100% confidence, then fail because they lack runtime information.
+They guess from code alone. You **must NOT** fix bugs this way; you need actual runtime data.
+
+**Your systematic workflow:**
+
+1. **Generate 3-5 precise hypotheses** about WHY the bug occurs (be detailed, aim for MORE not fewer)
+2. **Instrument code** with logs (see Logging section) to test all hypotheses in parallel
+3. **Reproduce the bug.**
+   - **If a failing test already exists**: run it directly.
+   - **If reproduction is straightforward** (e.g., a single CLI command, a curl request, a simple script): write and run an ad hoc reproduction script yourself. Tailor it to the runtime — Playwright/Puppeteer for browser bugs, a Node/Python/shell script for backend bugs, etc.
+   - **Otherwise**: ask the user to reproduce it. Provide clear, numbered steps. Remind them to restart apps/services if instrumented files are cached or bundled. Offer: "If you'd like me to write a reproduction script instead, let me know."
+   - Once the user confirms a reproduction pathway (manual or automated), reuse it for all subsequent iterations without re-asking.
+4. **Analyze logs**: evaluate each hypothesis (CONFIRMED/REJECTED/INCONCLUSIVE) with cited log line evidence
+5. **Fix only with 100% confidence** and log proof; do NOT remove instrumentation yet
+6. **Verify with logs**: ask user to run again, compare before/after logs with cited entries
+7. **If logs prove success** and user confirms: remove all instrumentation by searching for `#region debug log` / `#endregion` markers and deleting those blocks (see Cleanup section). **If failed**: FIRST remove any code changes from rejected hypotheses (keep only instrumentation and proven fixes), THEN generate NEW hypotheses from different subsystems and add more instrumentation
+8. **After confirmed success**: explain the problem and provide a concise summary of the fix (1-2 lines)
+
+**Critical constraints:**
+
+- NEVER fix without runtime evidence first
+- ALWAYS rely on runtime information + code (never code alone)
+- Do NOT remove instrumentation before post-fix verification logs prove success and user confirms that there are no more issues
+- Fixes often fail; iteration is expected and preferred. Taking longer with more data yields better, more precise fixes
+
+---
+
+## Logging
+
+### STEP 0: Start the logging server (MANDATORY BEFORE ANY INSTRUMENTATION)
+
+Run the debug server in **daemon mode** before any instrumentation. The `--daemon` flag starts the server in the background and exits immediately with the server info — no backgrounding or `&` required.
+
+```bash
+npx debug-agent --daemon
+```
+
+The command prints a single JSON line to stdout and exits:
+
+```json
+{
+  "sessionId": "a1b2c3",
+  "port": 54321,
+  "endpoint": "http://127.0.0.1:54321/ingest/a1b2c3",
+  "logPath": "/tmp/debug-agent/debug-a1b2c3.log"
+}
+```
+
+Capture and remember these values:
+
+- **Server endpoint**: The `endpoint` value (the HTTP endpoint URL where logs will be sent via POST requests)
+- **Log path**: The `logPath` value (NDJSON logs are written here)
+- **Session ID**: The `sessionId` value (unique identifier for this debug session)
+
+If the server fails to start, STOP IMMEDIATELY and inform the user.
+
+- DO NOT PROCEED with instrumentation without valid logging configuration.
+- The server is idempotent — if one is already running, it returns the existing server's info instead of starting a duplicate.
+- You do not need to pre-create the log file; it will be created automatically when your instrumentation first writes to it.
+
+### STEP 1: Understand the log format
+
+- Logs are written in **NDJSON format** (one JSON object per line) to the file specified by the **log path**.
+- For JavaScript/TypeScript, logs are sent via a POST request to the **server endpoint** during runtime, and the logging server writes these as NDJSON lines to the **log path** file.
+- For other languages (Python, Go, Rust, Java, C/C++, Ruby, etc.), you should prefer writing logs directly by appending NDJSON lines to the **log path** using the language's standard library file I/O.
+
+Example log entry:
+
+```json
+{
+  "sessionId": "a1b2c3",
+  "id": "log_1733456789_abc",
+  "timestamp": 1733456789000,
+  "location": "test.js:42",
+  "message": "User score",
+  "data": { "userId": 5, "score": 85 },
+  "runId": "run1",
+  "hypothesisId": "A"
+}
+```
+
+### STEP 2: Insert instrumentation logs
+
+- In **JavaScript/TypeScript files**, use this one-line fetch template, even if filesystem access is available. Replace `ENDPOINT` and `SESSION_ID` with the values from Step 0, and set `runId`, `hypothesisId`, `location`, `message`, and `data` for each log:
+
+```js
+fetch('ENDPOINT',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({sessionId:'SESSION_ID',runId:'run1',hypothesisId:'A',location:'file.js:LINE',message:'desc',data:{k:v},timestamp:Date.now()})}).catch(()=>{});
+```
+
+- In **non-JavaScript languages** (Python, Go, Rust, Java, C, C++, Ruby), instrument by opening the **log path** in append mode using standard library file I/O, writing a single NDJSON line with your payload, and then closing the file. Keep these snippets as tiny and compact as possible (ideally one line, or just a few).
+
+- Decide how many instrumentation logs to insert based on the complexity of the code under investigation and the hypotheses you are testing. A single well-placed log may be enough when the issue is highly localized; complex multi-step flows may need more. Aim for the minimum number that can confirm or reject ALL your hypotheses. Guidelines:
+  - At least 1 log is required; never skip instrumentation entirely
+  - Do not exceed 10 logs — if you think you need more, narrow your hypotheses first
+  - Typical range is 2-6 logs, but use your judgment
+
+- Choose log placements from these categories as relevant to your hypotheses:
+  - Function entry with parameters
+  - Function exit with return values
+  - Values BEFORE critical operations
+  - Values AFTER critical operations
+  - Branch execution paths (which if/else executed)
+  - Suspected error/edge case values
+  - State mutations and intermediate values
+
+- Each log must map to at least one hypothesis (include `hypothesisId` in payload).
+- Use this payload structure: `{sessionId, runId, hypothesisId, location, message, data, timestamp}`
+- **REQUIRED:** Wrap EACH debug log in a collapsible code region:
+  - Use language-appropriate region syntax (e.g., `// #region debug log`, `// #endregion` for JS/TS)
+  - This keeps the editor clean by auto-folding debug instrumentation
+- **FORBIDDEN:** Logging secrets (tokens, passwords, API keys, PII)
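+
+For non-JavaScript code, the append-mode approach described above might look like the following Python sketch. The helper name `debug_log`, the `LOG_PATH` and `SESSION_ID` constants, and the `compute_score` function are placeholders for illustration, not part of debug-agent; substitute the `logPath` and `sessionId` values from Step 0.
+
+```python
+import json
+import time
+
+# Assumed placeholders: use the logPath and sessionId printed in Step 0.
+LOG_PATH = "/tmp/debug-agent/debug-a1b2c3.log"
+SESSION_ID = "a1b2c3"
+
+
+def debug_log(location, message, data, hypothesis_id, run_id="run1", log_path=LOG_PATH):
+    """Append one NDJSON line (one JSON object per line) to the log file."""
+    entry = {
+        "sessionId": SESSION_ID,
+        "runId": run_id,
+        "hypothesisId": hypothesis_id,
+        "location": location,
+        "message": message,
+        "data": data,
+        "timestamp": int(time.time() * 1000),  # epoch milliseconds
+    }
+    # Append mode creates the file if it is missing; "with" closes it afterwards.
+    with open(log_path, "a", encoding="utf-8") as f:
+        f.write(json.dumps(entry) + "\n")
+
+
+def compute_score(user_id, raw):
+    """Hypothetical function under investigation, with one instrumented entry log."""
+    #region debug log
+    debug_log("scores.py:27", "compute_score entry", {"userId": user_id, "raw": raw}, "A")
+    #endregion
+    return raw * 2
+```
+
+The markers are written as `#region debug log` / `#endregion` with no space after `#`, so that the Cleanup search for `#region debug log` matches them.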
+
+### STEP 3: Clear previous log file before each run (MANDATORY)
+
+- Send a `DELETE` request to the **server endpoint** to clear the log file before each run. For example: `curl -X DELETE ENDPOINT` (replace `ENDPOINT` with the endpoint value from Step 0).
+- This ensures clean logs for the new run without mixing old and new data.
+- Clearing the log file is NOT the same as removing instrumentation; do not remove any debug logs from code here.
+- **CRITICAL:** Only clear YOUR session's logs (via your endpoint from Step 0). NEVER delete, modify, or overwrite log files belonging to other debug sessions.
+
+### STEP 4: Read logs after user runs the program
+
+- After the user runs the program and confirms completion in their interface, read the file at the **log path** with the file-read tool. Do NOT ask them to type "done".
+- The log file will contain NDJSON entries (one JSON object per line) from your instrumentation.
+- Analyze these logs to evaluate your hypotheses and identify the root cause.
+- If the log file is empty or missing, tell the user the reproduction may have failed and ask them to try again.
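+
+The analysis in this step can be sketched in Python as follows. `load_entries` and `group_by_hypothesis` are hypothetical helper names, not part of debug-agent; the field names (`hypothesisId`, `runId`) come from the example log entry in Step 1.
+
+```python
+import json
+from collections import defaultdict
+
+
+def load_entries(log_path):
+    """Parse an NDJSON log file into (line_number, entry) pairs."""
+    entries = []
+    try:
+        with open(log_path, encoding="utf-8") as f:
+            for line_number, line in enumerate(f, start=1):
+                line = line.strip()
+                if line:  # skip blank lines
+                    entries.append((line_number, json.loads(line)))
+    except FileNotFoundError:
+        pass  # a missing file is treated like an empty log
+    return entries
+
+
+def group_by_hypothesis(entries, run_id=None):
+    """Group entries by hypothesisId, optionally keeping only one runId."""
+    groups = defaultdict(list)
+    for line_number, entry in entries:
+        if run_id is None or entry.get("runId") == run_id:
+            key = entry.get("hypothesisId", "unassigned")
+            groups[key].append((line_number, entry))
+    return dict(groups)
+```
+
+Keeping the line number next to each entry makes it easy to cite the exact log line when marking a hypothesis CONFIRMED, REJECTED, or INCONCLUSIVE. An empty result from `load_entries` corresponds to the empty-or-missing case above.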
+
+### STEP 5: Keep logs during fixes
+
+- When implementing a fix, DO NOT remove debug logs yet.
+- Logs MUST remain active for verification runs.
+- You may tag logs with `runId="post-fix"` to distinguish verification runs from initial debugging runs.
+- FORBIDDEN: Removing or modifying any previously added logs in any files before post-fix verification logs are analyzed or the user explicitly confirms success.
+- Only remove logs after a successful post-fix verification run (log-based proof) or explicit user request to remove.
+
+---
+
+## Critical Reminders (must follow)
+
+- Keep instrumentation active during fixes; do not remove or modify logs until verification succeeds or the user explicitly confirms.
+- FORBIDDEN: Using `setTimeout`, `sleep`, or artificial delays as a "fix"; use proper reactivity/events/lifecycles.
+- FORBIDDEN: Removing instrumentation before analyzing post-fix verification logs or receiving explicit user confirmation.
+- Verification requires before/after log comparison with cited log lines; do not claim success without log proof.
+- Clear logs by sending a DELETE request to the server endpoint.
+- Do not create the log file manually; it's created automatically.
+- Clearing the log file is not removing instrumentation.
+- NEVER delete or modify log files that do not belong to this session. Only touch the log file at the exact path from Step 0.
+- Base every fix on evidence from the logs; when the evidence is insufficient, generate new hypotheses instead of guessing.
+- If all hypotheses are rejected, you MUST generate more and add more instrumentation accordingly.
+- **Remove code changes from rejected hypotheses:** When logs prove a hypothesis wrong, revert the code changes made for that hypothesis. Do not let defensive guards, speculative fixes, or unproven changes accumulate. Only keep modifications that are supported by runtime evidence.
+- Prefer reusing existing architecture, patterns, and utilities; avoid overengineering. Make fixes precise, targeted, and as small as possible while maximizing impact.
+
+## Cleanup
+
+When it is time to remove instrumentation (after verified fix or user request):
+
+1. Search all files for `#region debug log` markers (e.g., grep/ripgrep for `#region debug log`)
+2. For each match, delete everything from the `#region debug log` line through its corresponding `#endregion` line (inclusive)
+3. Grep again to verify zero markers remain
+4. Run `git diff` to review all changes — confirm only your intentional fix remains and no stray debug code was missed
+
+This is why wrapping every debug log in `#region debug log` / `#endregion` is mandatory — it enables deterministic cleanup.
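+
+Step 2 above can be approximated with a small Python helper. This is a sketch that assumes regions are never nested and that each marker sits on its own line; `strip_debug_regions` is a hypothetical name, not something debug-agent ships.
+
+```python
+def strip_debug_regions(source):
+    """Return source text with every debug-log region removed, markers included."""
+    kept = []
+    inside = False
+    for line in source.splitlines(keepends=True):
+        if not inside and "#region debug log" in line:
+            inside = True  # drop the opening marker line
+        elif inside:
+            if "#endregion" in line:
+                inside = False  # drop the closing marker line
+        else:
+            kept.append(line)
+    if inside:
+        raise ValueError("unterminated '#region debug log' block")
+    return "".join(kept)
+```
+
+An `#endregion` line is only consumed while inside a debug-log region, so unrelated regions in the file are left alone. Reviewing `git diff` afterwards is still required.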
+
+---
+
+## Server API reference
+
+| Request                     | Effect                                      |
+| --------------------------- | ------------------------------------------- |
+| `POST /ingest/:sessionId`   | Append JSON body as NDJSON line to log file |
+| `GET /ingest/:sessionId`    | Read full log file contents                 |
+| `DELETE /ingest/:sessionId` | Clear the log file                          |