Add debug-agent skill (millionco/debug-agent)

This commit is contained in:
Tommaso Casaburi
2026-04-11 15:59:38 +07:00
parent b1d9aac99b
commit 95089ffe1e
3 changed files with 552 additions and 0 deletions

View File

@@ -0,0 +1,184 @@
---
name: debug-agent
description: >-
Systematic evidence-based debugging using runtime logs. Generates hypotheses,
instruments code with NDJSON logs, guides reproduction, analyzes log evidence,
and iterates until root cause is proven with cited log lines. Use when the
user reports a bug, unexpected behavior, or asks to debug an issue.
---
# Debug Mode
You are now in **DEBUG MODE**. You must debug with **runtime evidence**.
**Why this approach:** Traditional AI agents jump to fixes claiming 100% confidence, but fail due to lacking runtime information.
They guess based on code alone. You **cannot** and **must NOT** fix bugs this way — you need actual runtime data.
**Your systematic workflow:**
1. **Generate 3-5 precise hypotheses** about WHY the bug occurs (be detailed, aim for MORE not fewer)
2. **Instrument code** with logs (see Logging section) to test all hypotheses in parallel
3. **Reproduce the bug.**
- **If a failing test already exists**: run it directly.
- **If reproduction is straightforward** (e.g., a single CLI command, a curl request, a simple script): write and run an ad hoc reproduction script yourself. Tailor it to the runtime — Playwright/Puppeteer for browser bugs, a Node/Python/shell script for backend bugs, etc.
- **Otherwise**: ask the user to reproduce it. Provide clear, numbered steps. Remind them to restart apps/services if instrumented files are cached or bundled. Offer: "If you'd like me to write a reproduction script instead, let me know."
- Once the user confirms a reproduction pathway (manual or automated), reuse it for all subsequent iterations without re-asking.
4. **Analyze logs**: evaluate each hypothesis (CONFIRMED/REJECTED/INCONCLUSIVE) with cited log line evidence
5. **Fix only with 100% confidence** and log proof; do NOT remove instrumentation yet
6. **Verify with logs**: ask user to run again, compare before/after logs with cited entries
7. **If logs prove success** and user confirms: remove all instrumentation by searching for `#region debug log` / `#endregion` markers and deleting those blocks (see Cleanup section). **If failed**: FIRST remove any code changes from rejected hypotheses (keep only instrumentation and proven fixes), THEN generate NEW hypotheses from different subsystems and add more instrumentation
8. **After confirmed success**: explain the problem and provide a concise summary of the fix (1-2 lines)
**Critical constraints:**
- NEVER fix without runtime evidence first
- ALWAYS rely on runtime information + code (never code alone)
- Do NOT remove instrumentation before post-fix verification logs prove success and user confirms that there are no more issues
- Fixes often fail; iteration is expected and preferred. Taking longer with more data yields better, more precise fixes
---
## Logging
### STEP 0: Start the logging server (MANDATORY BEFORE ANY INSTRUMENTATION)
Run the debug server in **daemon mode** before any instrumentation. The `--daemon` flag starts the server in the background and exits immediately with the server info — no backgrounding or `&` required.
```bash
npx debug-agent --daemon
```
The command prints a single JSON line to stdout and exits:
```json
{
"sessionId": "a1b2c3",
"port": 54321,
"endpoint": "http://127.0.0.1:54321/ingest/a1b2c3",
"logPath": "/tmp/debug-agent/debug-a1b2c3.log"
}
```
Capture and remember these values:
- **Server endpoint**: The `endpoint` value (the HTTP endpoint URL where logs will be sent via POST requests)
- **Log path**: The `logPath` value (NDJSON logs are written here)
- **Session ID**: The `sessionId` value (unique identifier for this debug session)
If the server fails to start, STOP IMMEDIATELY and inform the user.
- DO NOT PROCEED with instrumentation without valid logging configuration.
- The server is idempotent — if one is already running, it returns the existing server's info instead of starting a duplicate.
- You do not need to pre-create the log file; it will be created automatically when your instrumentation first writes to it.
### STEP 1: Understand the log format
- Logs are written in **NDJSON format** (one JSON object per line) to the file specified by the **log path**.
- For JavaScript/TypeScript, logs are sent via a POST request to the **server endpoint** during runtime, and the logging server writes these as NDJSON lines to the **log path** file.
- For other languages (Python, Go, Rust, Java, C/C++, Ruby, etc.), you should prefer writing logs directly by appending NDJSON lines to the **log path** using the language's standard library file I/O.
Example log entry:
```json
{
"sessionId": "a1b2c3",
"id": "log_1733456789_abc",
"timestamp": 1733456789000,
"location": "test.js:42",
"message": "User score",
"data": { "userId": 5, "score": 85 },
"runId": "run1",
"hypothesisId": "A"
}
```
### STEP 2: Insert instrumentation logs
- In **JavaScript/TypeScript files**, use this one-line fetch template (replace `ENDPOINT` and `SESSION_ID` with values from Step 0), even if filesystem access is available:
```
fetch('ENDPOINT',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({sessionId:'SESSION_ID',location:'file.js:LINE',message:'desc',data:{k:v},timestamp:Date.now()})}).catch(()=>{});
```
- In **non-JavaScript languages** (Python, Go, Rust, Java, C, C++, Ruby), instrument by opening the **log path** in append mode using standard library file I/O, writing a single NDJSON line with your payload, and then closing the file. Keep these snippets as tiny and compact as possible (ideally one line, or just a few).
- Decide how many instrumentation logs to insert based on the complexity of the code under investigation and the hypotheses you are testing. A single well-placed log may be enough when the issue is highly localized; complex multi-step flows may need more. Aim for the minimum number that can confirm or reject ALL your hypotheses. Guidelines:
- At least 1 log is required; never skip instrumentation entirely
- Do not exceed 10 logs — if you think you need more, narrow your hypotheses first
- Typical range is 2-6 logs, but use your judgment
- Choose log placements from these categories as relevant to your hypotheses:
- Function entry with parameters
- Function exit with return values
- Values BEFORE critical operations
- Values AFTER critical operations
- Branch execution paths (which if/else executed)
- Suspected error/edge case values
- State mutations and intermediate values
- Each log must map to at least one hypothesis (include `hypothesisId` in payload).
- Use this payload structure: `{sessionId, runId, hypothesisId, location, message, data, timestamp}`
- **REQUIRED:** Wrap EACH debug log in a collapsible code region:
- Use language-appropriate region syntax (e.g., `// #region debug log`, `// #endregion` for JS/TS)
- This keeps the editor clean by auto-folding debug instrumentation
- **FORBIDDEN:** Logging secrets (tokens, passwords, API keys, PII)
### STEP 3: Clear previous log file before each run (MANDATORY)
- Send a `DELETE` request to the **server endpoint** to clear the log file before each run. For example: `curl -X DELETE ENDPOINT` (replace `ENDPOINT` with the endpoint value from Step 0).
- This ensures clean logs for the new run without mixing old and new data.
- Clearing the log file is NOT the same as removing instrumentation; do not remove any debug logs from code here.
- **CRITICAL:** Only clear YOUR session's logs (via your endpoint from Step 0). NEVER delete, modify, or overwrite log files belonging to other debug sessions.
### STEP 4: Read logs after user runs the program
- After the user runs the program and confirms completion in their interface, do NOT ask them to type "done"; then use the file-read tool to read the file at the **log path**.
- The log file will contain NDJSON entries (one JSON object per line) from your instrumentation.
- Analyze these logs to evaluate your hypotheses and identify the root cause.
- If log file is empty or missing: tell user the reproduction may have failed and ask them to try again.
### STEP 5: Keep logs during fixes
- When implementing a fix, DO NOT remove debug logs yet.
- Logs MUST remain active for verification runs.
- You may tag logs with `runId="post-fix"` to distinguish verification runs from initial debugging runs.
- FORBIDDEN: Removing or modifying any previously added logs in any files before post-fix verification logs are analyzed or the user explicitly confirms success.
- Only remove logs after a successful post-fix verification run (log-based proof) or explicit user request to remove.
---
## Critical Reminders (must follow)
- Keep instrumentation active during fixes; do not remove or modify logs until verification succeeds or the user explicitly confirms.
- FORBIDDEN: Using `setTimeout`, `sleep`, or artificial delays as a "fix"; use proper reactivity/events/lifecycles.
- FORBIDDEN: Removing instrumentation before analyzing post-fix verification logs or receiving explicit user confirmation.
- Verification requires before/after log comparison with cited log lines; do not claim success without log proof.
- Clear logs by sending a DELETE request to the server endpoint.
- Do not create the log file manually; it's created automatically.
- Clearing the log file is not removing instrumentation.
- NEVER delete or modify log files that do not belong to this session. Only touch the log file at the exact path from Step 0.
- Always try to rely on generating new hypotheses and using evidence from the logs to provide fixes.
- If all hypotheses are rejected, you MUST generate more and add more instrumentation accordingly.
- **Remove code changes from rejected hypotheses:** When logs prove a hypothesis wrong, revert the code changes made for that hypothesis. Do not let defensive guards, speculative fixes, or unproven changes accumulate. Only keep modifications that are supported by runtime evidence.
- Prefer reusing existing architecture, patterns, and utilities; avoid overengineering. Make fixes precise, targeted, and as small as possible while maximizing impact.
## Cleanup
When it is time to remove instrumentation (after verified fix or user request):
1. Search all files for `#region debug log` markers (e.g., grep/ripgrep for `#region debug log`)
2. For each match, delete everything from the `#region debug log` line through its corresponding `#endregion` line (inclusive)
3. Grep again to verify zero markers remain
4. Run `git diff` to review all changes — confirm only your intentional fix remains and no stray debug code was missed
This is why wrapping every debug log in `#region debug log` / `#endregion` is mandatory — it enables deterministic cleanup.
---
## Server API reference
| Method | Effect |
| --------------------------- | ------------------------------------------- |
| `POST /ingest/:sessionId` | Append JSON body as NDJSON line to log file |
| `GET /ingest/:sessionId` | Read full log file contents |
| `DELETE /ingest/:sessionId` | Clear the log file |

View File

@@ -0,0 +1,184 @@
---
name: debug-agent
description: >-
Systematic evidence-based debugging using runtime logs. Generates hypotheses,
instruments code with NDJSON logs, guides reproduction, analyzes log evidence,
and iterates until root cause is proven with cited log lines. Use when the
user reports a bug, unexpected behavior, or asks to debug an issue.
---
# Debug Mode
You are now in **DEBUG MODE**. You must debug with **runtime evidence**.
**Why this approach:** Traditional AI agents jump to fixes claiming 100% confidence, but fail due to lacking runtime information.
They guess based on code alone. You **cannot** and **must NOT** fix bugs this way — you need actual runtime data.
**Your systematic workflow:**
1. **Generate 3-5 precise hypotheses** about WHY the bug occurs (be detailed, aim for MORE not fewer)
2. **Instrument code** with logs (see Logging section) to test all hypotheses in parallel
3. **Reproduce the bug.**
- **If a failing test already exists**: run it directly.
- **If reproduction is straightforward** (e.g., a single CLI command, a curl request, a simple script): write and run an ad hoc reproduction script yourself. Tailor it to the runtime — Playwright/Puppeteer for browser bugs, a Node/Python/shell script for backend bugs, etc.
- **Otherwise**: ask the user to reproduce it. Provide clear, numbered steps. Remind them to restart apps/services if instrumented files are cached or bundled. Offer: "If you'd like me to write a reproduction script instead, let me know."
- Once the user confirms a reproduction pathway (manual or automated), reuse it for all subsequent iterations without re-asking.
4. **Analyze logs**: evaluate each hypothesis (CONFIRMED/REJECTED/INCONCLUSIVE) with cited log line evidence
5. **Fix only with 100% confidence** and log proof; do NOT remove instrumentation yet
6. **Verify with logs**: ask user to run again, compare before/after logs with cited entries
7. **If logs prove success** and user confirms: remove all instrumentation by searching for `#region debug log` / `#endregion` markers and deleting those blocks (see Cleanup section). **If failed**: FIRST remove any code changes from rejected hypotheses (keep only instrumentation and proven fixes), THEN generate NEW hypotheses from different subsystems and add more instrumentation
8. **After confirmed success**: explain the problem and provide a concise summary of the fix (1-2 lines)
**Critical constraints:**
- NEVER fix without runtime evidence first
- ALWAYS rely on runtime information + code (never code alone)
- Do NOT remove instrumentation before post-fix verification logs prove success and user confirms that there are no more issues
- Fixes often fail; iteration is expected and preferred. Taking longer with more data yields better, more precise fixes
---
## Logging
### STEP 0: Start the logging server (MANDATORY BEFORE ANY INSTRUMENTATION)
Run the debug server in **daemon mode** before any instrumentation. The `--daemon` flag starts the server in the background and exits immediately with the server info — no backgrounding or `&` required.
```bash
npx debug-agent --daemon
```
The command prints a single JSON line to stdout and exits:
```json
{
"sessionId": "a1b2c3",
"port": 54321,
"endpoint": "http://127.0.0.1:54321/ingest/a1b2c3",
"logPath": "/tmp/debug-agent/debug-a1b2c3.log"
}
```
Capture and remember these values:
- **Server endpoint**: The `endpoint` value (the HTTP endpoint URL where logs will be sent via POST requests)
- **Log path**: The `logPath` value (NDJSON logs are written here)
- **Session ID**: The `sessionId` value (unique identifier for this debug session)
If the server fails to start, STOP IMMEDIATELY and inform the user.
- DO NOT PROCEED with instrumentation without valid logging configuration.
- The server is idempotent — if one is already running, it returns the existing server's info instead of starting a duplicate.
- You do not need to pre-create the log file; it will be created automatically when your instrumentation first writes to it.
### STEP 1: Understand the log format
- Logs are written in **NDJSON format** (one JSON object per line) to the file specified by the **log path**.
- For JavaScript/TypeScript, logs are sent via a POST request to the **server endpoint** during runtime, and the logging server writes these as NDJSON lines to the **log path** file.
- For other languages (Python, Go, Rust, Java, C/C++, Ruby, etc.), you should prefer writing logs directly by appending NDJSON lines to the **log path** using the language's standard library file I/O.
Example log entry:
```json
{
"sessionId": "a1b2c3",
"id": "log_1733456789_abc",
"timestamp": 1733456789000,
"location": "test.js:42",
"message": "User score",
"data": { "userId": 5, "score": 85 },
"runId": "run1",
"hypothesisId": "A"
}
```
### STEP 2: Insert instrumentation logs
- In **JavaScript/TypeScript files**, use this one-line fetch template (replace `ENDPOINT` and `SESSION_ID` with values from Step 0), even if filesystem access is available:
```
fetch('ENDPOINT',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({sessionId:'SESSION_ID',location:'file.js:LINE',message:'desc',data:{k:v},timestamp:Date.now()})}).catch(()=>{});
```
- In **non-JavaScript languages** (Python, Go, Rust, Java, C, C++, Ruby), instrument by opening the **log path** in append mode using standard library file I/O, writing a single NDJSON line with your payload, and then closing the file. Keep these snippets as tiny and compact as possible (ideally one line, or just a few).
- Decide how many instrumentation logs to insert based on the complexity of the code under investigation and the hypotheses you are testing. A single well-placed log may be enough when the issue is highly localized; complex multi-step flows may need more. Aim for the minimum number that can confirm or reject ALL your hypotheses. Guidelines:
- At least 1 log is required; never skip instrumentation entirely
- Do not exceed 10 logs — if you think you need more, narrow your hypotheses first
- Typical range is 2-6 logs, but use your judgment
- Choose log placements from these categories as relevant to your hypotheses:
- Function entry with parameters
- Function exit with return values
- Values BEFORE critical operations
- Values AFTER critical operations
- Branch execution paths (which if/else executed)
- Suspected error/edge case values
- State mutations and intermediate values
- Each log must map to at least one hypothesis (include `hypothesisId` in payload).
- Use this payload structure: `{sessionId, runId, hypothesisId, location, message, data, timestamp}`
- **REQUIRED:** Wrap EACH debug log in a collapsible code region:
- Use language-appropriate region syntax (e.g., `// #region debug log`, `// #endregion` for JS/TS)
- This keeps the editor clean by auto-folding debug instrumentation
- **FORBIDDEN:** Logging secrets (tokens, passwords, API keys, PII)
### STEP 3: Clear previous log file before each run (MANDATORY)
- Send a `DELETE` request to the **server endpoint** to clear the log file before each run. For example: `curl -X DELETE ENDPOINT` (replace `ENDPOINT` with the endpoint value from Step 0).
- This ensures clean logs for the new run without mixing old and new data.
- Clearing the log file is NOT the same as removing instrumentation; do not remove any debug logs from code here.
- **CRITICAL:** Only clear YOUR session's logs (via your endpoint from Step 0). NEVER delete, modify, or overwrite log files belonging to other debug sessions.
### STEP 4: Read logs after user runs the program
- After the user runs the program and confirms completion in their interface, do NOT ask them to type "done"; then use the file-read tool to read the file at the **log path**.
- The log file will contain NDJSON entries (one JSON object per line) from your instrumentation.
- Analyze these logs to evaluate your hypotheses and identify the root cause.
- If log file is empty or missing: tell user the reproduction may have failed and ask them to try again.
### STEP 5: Keep logs during fixes
- When implementing a fix, DO NOT remove debug logs yet.
- Logs MUST remain active for verification runs.
- You may tag logs with `runId="post-fix"` to distinguish verification runs from initial debugging runs.
- FORBIDDEN: Removing or modifying any previously added logs in any files before post-fix verification logs are analyzed or the user explicitly confirms success.
- Only remove logs after a successful post-fix verification run (log-based proof) or explicit user request to remove.
---
## Critical Reminders (must follow)
- Keep instrumentation active during fixes; do not remove or modify logs until verification succeeds or the user explicitly confirms.
- FORBIDDEN: Using `setTimeout`, `sleep`, or artificial delays as a "fix"; use proper reactivity/events/lifecycles.
- FORBIDDEN: Removing instrumentation before analyzing post-fix verification logs or receiving explicit user confirmation.
- Verification requires before/after log comparison with cited log lines; do not claim success without log proof.
- Clear logs by sending a DELETE request to the server endpoint.
- Do not create the log file manually; it's created automatically.
- Clearing the log file is not removing instrumentation.
- NEVER delete or modify log files that do not belong to this session. Only touch the log file at the exact path from Step 0.
- Always try to rely on generating new hypotheses and using evidence from the logs to provide fixes.
- If all hypotheses are rejected, you MUST generate more and add more instrumentation accordingly.
- **Remove code changes from rejected hypotheses:** When logs prove a hypothesis wrong, revert the code changes made for that hypothesis. Do not let defensive guards, speculative fixes, or unproven changes accumulate. Only keep modifications that are supported by runtime evidence.
- Prefer reusing existing architecture, patterns, and utilities; avoid overengineering. Make fixes precise, targeted, and as small as possible while maximizing impact.
## Cleanup
When it is time to remove instrumentation (after verified fix or user request):
1. Search all files for `#region debug log` markers (e.g., grep/ripgrep for `#region debug log`)
2. For each match, delete everything from the `#region debug log` line through its corresponding `#endregion` line (inclusive)
3. Grep again to verify zero markers remain
4. Run `git diff` to review all changes — confirm only your intentional fix remains and no stray debug code was missed
This is why wrapping every debug log in `#region debug log` / `#endregion` is mandatory — it enables deterministic cleanup.
---
## Server API reference
| Method | Effect |
| --------------------------- | ------------------------------------------- |
| `POST /ingest/:sessionId` | Append JSON body as NDJSON line to log file |
| `GET /ingest/:sessionId` | Read full log file contents |
| `DELETE /ingest/:sessionId` | Clear the log file |

View File

@@ -0,0 +1,184 @@
---
name: debug-agent
description: >-
Systematic evidence-based debugging using runtime logs. Generates hypotheses,
instruments code with NDJSON logs, guides reproduction, analyzes log evidence,
and iterates until root cause is proven with cited log lines. Use when the
user reports a bug, unexpected behavior, or asks to debug an issue.
---
# Debug Mode
You are now in **DEBUG MODE**. You must debug with **runtime evidence**.
**Why this approach:** Traditional AI agents jump to fixes claiming 100% confidence, but fail due to lacking runtime information.
They guess based on code alone. You **cannot** and **must NOT** fix bugs this way — you need actual runtime data.
**Your systematic workflow:**
1. **Generate 3-5 precise hypotheses** about WHY the bug occurs (be detailed, aim for MORE not fewer)
2. **Instrument code** with logs (see Logging section) to test all hypotheses in parallel
3. **Reproduce the bug.**
- **If a failing test already exists**: run it directly.
- **If reproduction is straightforward** (e.g., a single CLI command, a curl request, a simple script): write and run an ad hoc reproduction script yourself. Tailor it to the runtime — Playwright/Puppeteer for browser bugs, a Node/Python/shell script for backend bugs, etc.
- **Otherwise**: ask the user to reproduce it. Provide clear, numbered steps. Remind them to restart apps/services if instrumented files are cached or bundled. Offer: "If you'd like me to write a reproduction script instead, let me know."
- Once the user confirms a reproduction pathway (manual or automated), reuse it for all subsequent iterations without re-asking.
4. **Analyze logs**: evaluate each hypothesis (CONFIRMED/REJECTED/INCONCLUSIVE) with cited log line evidence
5. **Fix only with 100% confidence** and log proof; do NOT remove instrumentation yet
6. **Verify with logs**: ask user to run again, compare before/after logs with cited entries
7. **If logs prove success** and user confirms: remove all instrumentation by searching for `#region debug log` / `#endregion` markers and deleting those blocks (see Cleanup section). **If failed**: FIRST remove any code changes from rejected hypotheses (keep only instrumentation and proven fixes), THEN generate NEW hypotheses from different subsystems and add more instrumentation
8. **After confirmed success**: explain the problem and provide a concise summary of the fix (1-2 lines)
**Critical constraints:**
- NEVER fix without runtime evidence first
- ALWAYS rely on runtime information + code (never code alone)
- Do NOT remove instrumentation before post-fix verification logs prove success and user confirms that there are no more issues
- Fixes often fail; iteration is expected and preferred. Taking longer with more data yields better, more precise fixes
---
## Logging
### STEP 0: Start the logging server (MANDATORY BEFORE ANY INSTRUMENTATION)
Run the debug server in **daemon mode** before any instrumentation. The `--daemon` flag starts the server in the background and exits immediately with the server info — no backgrounding or `&` required.
```bash
npx debug-agent --daemon
```
The command prints a single JSON line to stdout and exits:
```json
{
"sessionId": "a1b2c3",
"port": 54321,
"endpoint": "http://127.0.0.1:54321/ingest/a1b2c3",
"logPath": "/tmp/debug-agent/debug-a1b2c3.log"
}
```
Capture and remember these values:
- **Server endpoint**: The `endpoint` value (the HTTP endpoint URL where logs will be sent via POST requests)
- **Log path**: The `logPath` value (NDJSON logs are written here)
- **Session ID**: The `sessionId` value (unique identifier for this debug session)
If the server fails to start, STOP IMMEDIATELY and inform the user.
- DO NOT PROCEED with instrumentation without valid logging configuration.
- The server is idempotent — if one is already running, it returns the existing server's info instead of starting a duplicate.
- You do not need to pre-create the log file; it will be created automatically when your instrumentation first writes to it.
### STEP 1: Understand the log format
- Logs are written in **NDJSON format** (one JSON object per line) to the file specified by the **log path**.
- For JavaScript/TypeScript, logs are sent via a POST request to the **server endpoint** during runtime, and the logging server writes these as NDJSON lines to the **log path** file.
- For other languages (Python, Go, Rust, Java, C/C++, Ruby, etc.), you should prefer writing logs directly by appending NDJSON lines to the **log path** using the language's standard library file I/O.
Example log entry:
```json
{
"sessionId": "a1b2c3",
"id": "log_1733456789_abc",
"timestamp": 1733456789000,
"location": "test.js:42",
"message": "User score",
"data": { "userId": 5, "score": 85 },
"runId": "run1",
"hypothesisId": "A"
}
```
### STEP 2: Insert instrumentation logs
- In **JavaScript/TypeScript files**, use this one-line fetch template (replace `ENDPOINT` and `SESSION_ID` with values from Step 0), even if filesystem access is available:
```
fetch('ENDPOINT',{method:'POST',headers:{'Content-Type':'application/json'},body:JSON.stringify({sessionId:'SESSION_ID',location:'file.js:LINE',message:'desc',data:{k:v},timestamp:Date.now()})}).catch(()=>{});
```
- In **non-JavaScript languages** (Python, Go, Rust, Java, C, C++, Ruby), instrument by opening the **log path** in append mode using standard library file I/O, writing a single NDJSON line with your payload, and then closing the file. Keep these snippets as tiny and compact as possible (ideally one line, or just a few).
- Decide how many instrumentation logs to insert based on the complexity of the code under investigation and the hypotheses you are testing. A single well-placed log may be enough when the issue is highly localized; complex multi-step flows may need more. Aim for the minimum number that can confirm or reject ALL your hypotheses. Guidelines:
- At least 1 log is required; never skip instrumentation entirely
- Do not exceed 10 logs — if you think you need more, narrow your hypotheses first
- Typical range is 2-6 logs, but use your judgment
- Choose log placements from these categories as relevant to your hypotheses:
- Function entry with parameters
- Function exit with return values
- Values BEFORE critical operations
- Values AFTER critical operations
- Branch execution paths (which if/else executed)
- Suspected error/edge case values
- State mutations and intermediate values
- Each log must map to at least one hypothesis (include `hypothesisId` in payload).
- Use this payload structure: `{sessionId, runId, hypothesisId, location, message, data, timestamp}`
- **REQUIRED:** Wrap EACH debug log in a collapsible code region:
- Use language-appropriate region syntax (e.g., `// #region debug log`, `// #endregion` for JS/TS)
- This keeps the editor clean by auto-folding debug instrumentation
- **FORBIDDEN:** Logging secrets (tokens, passwords, API keys, PII)
### STEP 3: Clear previous log file before each run (MANDATORY)
- Send a `DELETE` request to the **server endpoint** to clear the log file before each run. For example: `curl -X DELETE ENDPOINT` (replace `ENDPOINT` with the endpoint value from Step 0).
- This ensures clean logs for the new run without mixing old and new data.
- Clearing the log file is NOT the same as removing instrumentation; do not remove any debug logs from code here.
- **CRITICAL:** Only clear YOUR session's logs (via your endpoint from Step 0). NEVER delete, modify, or overwrite log files belonging to other debug sessions.
### STEP 4: Read logs after user runs the program
- After the user runs the program and confirms completion in their interface, do NOT ask them to type "done"; then use the file-read tool to read the file at the **log path**.
- The log file will contain NDJSON entries (one JSON object per line) from your instrumentation.
- Analyze these logs to evaluate your hypotheses and identify the root cause.
- If log file is empty or missing: tell user the reproduction may have failed and ask them to try again.
### STEP 5: Keep logs during fixes
- When implementing a fix, DO NOT remove debug logs yet.
- Logs MUST remain active for verification runs.
- You may tag logs with `runId="post-fix"` to distinguish verification runs from initial debugging runs.
- FORBIDDEN: Removing or modifying any previously added logs in any files before post-fix verification logs are analyzed or the user explicitly confirms success.
- Only remove logs after a successful post-fix verification run (log-based proof) or explicit user request to remove.
---
## Critical Reminders (must follow)
- Keep instrumentation active during fixes; do not remove or modify logs until verification succeeds or the user explicitly confirms.
- FORBIDDEN: Using `setTimeout`, `sleep`, or artificial delays as a "fix"; use proper reactivity/events/lifecycles.
- FORBIDDEN: Removing instrumentation before analyzing post-fix verification logs or receiving explicit user confirmation.
- Verification requires before/after log comparison with cited log lines; do not claim success without log proof.
- Clear logs by sending a DELETE request to the server endpoint.
- Do not create the log file manually; it's created automatically.
- Clearing the log file is not removing instrumentation.
- NEVER delete or modify log files that do not belong to this session. Only touch the log file at the exact path from Step 0.
- Always try to rely on generating new hypotheses and using evidence from the logs to provide fixes.
- If all hypotheses are rejected, you MUST generate more and add more instrumentation accordingly.
- **Remove code changes from rejected hypotheses:** When logs prove a hypothesis wrong, revert the code changes made for that hypothesis. Do not let defensive guards, speculative fixes, or unproven changes accumulate. Only keep modifications that are supported by runtime evidence.
- Prefer reusing existing architecture, patterns, and utilities; avoid overengineering. Make fixes precise, targeted, and as small as possible while maximizing impact.
## Cleanup
When it is time to remove instrumentation (after verified fix or user request):
1. Search all files for `#region debug log` markers (e.g., grep/ripgrep for `#region debug log`)
2. For each match, delete everything from the `#region debug log` line through its corresponding `#endregion` line (inclusive)
3. Grep again to verify zero markers remain
4. Run `git diff` to review all changes — confirm only your intentional fix remains and no stray debug code was missed
This is why wrapping every debug log in `#region debug log` / `#endregion` is mandatory — it enables deterministic cleanup.
---
## Server API reference
| Method | Effect |
| --------------------------- | ------------------------------------------- |
| `POST /ingest/:sessionId` | Append JSON body as NDJSON line to log file |
| `GET /ingest/:sessionId` | Read full log file contents |
| `DELETE /ingest/:sessionId` | Clear the log file |