mirror of
https://github.com/meshtastic/firmware.git
synced 2026-05-29 19:24:46 -04:00
Meshtastic MCP Server — Test Harness
Automated test suite for the MCP server, organized around real operator concerns rather than generic "unit vs hardware".
Tiers
| Dir | Hardware | Question this tier answers |
|---|---|---|
| `unit/` | none | Do the parsing / filtering / profile-generation primitives work? |
| `provisioning/` | 1 device, per-test bake | Did my pre-bake recipe stick? Does it survive a factory reset? |
| `admin/` | 1 device, shared bake | Do my daily admin ops (owner, channel URL, config writes) round-trip? |
| `mesh/` | 2 devices, shared bake | Do my devices actually form a mesh? Send + receive? ACKs? |
| `telemetry/` | 2 devices, shared bake | Is telemetry reporting? Is position broadcast correct? |
| `monitor/` | 1 device, shared bake | Is the boot log clean (no panics)? |
| `fleet/` | varies | Are my CI runs isolated from each other? Are reflashes idempotent? |
Quick start
cd mcp-server
pip install -e ".[test]"
# No hardware — 33 unit tests, ~3 seconds
pytest tests/unit -v
# Hub attached (nRF52840 + ESP32-S3) — first run bakes, then exercises everything
pytest tests/ --html=report.html
# Hub already baked with session profile (dev loop) — skip bake
pytest tests/ --assume-baked --html=report.html
# Force a rebake (new firmware, new seed, etc.)
pytest tests/ --force-bake --html=report.html
CLI flags
- `--force-bake`: always reflash both roles at session start, even if the current state matches the session profile.
- `--assume-baked`: skip `test_00_bake.py` entirely. Use when you know the devices are already baked and want a fast dev loop.
- `--hub-profile=<yaml>`: point at a YAML file for non-default hub hardware. Default targets VID `0x239a` (nRF52) and `0x303a`/`0x10c4` (ESP32-S3).
- `--no-teardown-rebake`: skip the session-end rebake that `provisioning/` and `fleet/` tests perform. Useful in rapid iteration.
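For readers who have not written custom pytest flags before, the following is a minimal sketch of how flags like these are typically registered through pytest's `pytest_addoption` hook. It is not the harness's actual `conftest.py`: the help strings paraphrase the list above, and the defaults (`False`, `None`) are assumptions.

```python
def pytest_addoption(parser):
    """Register the four harness flags (illustrative sketch, not the real conftest.py)."""
    # Boolean switches: absent means False (assumed default).
    parser.addoption(
        "--force-bake", action="store_true", default=False,
        help="Always reflash both roles at session start.",
    )
    parser.addoption(
        "--assume-baked", action="store_true", default=False,
        help="Skip test_00_bake.py; devices are assumed to be baked already.",
    )
    # Path to a YAML hub description; None stands for the built-in default hub.
    parser.addoption(
        "--hub-profile", action="store", default=None, metavar="YAML",
        help="YAML file describing non-default hub hardware.",
    )
    parser.addoption(
        "--no-teardown-rebake", action="store_true", default=False,
        help="Skip the session-end rebake done by provisioning/ and fleet/ tests.",
    )
```

A fixture could then read a flag with `request.config.getoption("--force-bake")`.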
Environment variables
- `MESHTASTIC_FIRMWARE_ROOT`: firmware repo path (defaults to `../` from `tests/`)
- `MESHTASTIC_MCP_ENV_NRF52`: PlatformIO env for the nRF52 role (default `rak4631`)
- `MESHTASTIC_MCP_ENV_ESP32S3`: PlatformIO env for the ESP32-S3 role (default `heltec-v3`)
- `MESHTASTIC_MCP_SEED`: override the session PSK seed (default: `pytest-<unix-ts>`). Set this to reproduce a specific failing run.
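The defaults above can be sketched as a small resolver. This is only an illustration of the documented behaviour: the function name `resolve_settings`, its parameters, and the dictionary keys are hypothetical, and the firmware root is left as `None` when unset rather than guessing how the harness resolves the relative path.

```python
import os
import time


def resolve_settings(environ=None, now=None):
    """Apply the documented defaults to the harness environment variables.

    `environ` and `now` are injectable so the function can be exercised
    without touching the real environment or clock (hypothetical helper).
    """
    env = os.environ if environ is None else environ
    unix_ts = int(time.time() if now is None else now)
    return {
        # None means "let the harness fall back to its relative default".
        "firmware_root": env.get("MESHTASTIC_FIRMWARE_ROOT"),
        "env_nrf52": env.get("MESHTASTIC_MCP_ENV_NRF52", "rak4631"),
        "env_esp32s3": env.get("MESHTASTIC_MCP_ENV_ESP32S3", "heltec-v3"),
        # A fresh seed per session unless one is pinned for reproduction.
        "seed": env.get("MESHTASTIC_MCP_SEED", f"pytest-{unix_ts}"),
    }
```

Pinning `MESHTASTIC_MCP_SEED` to the seed shown in a failing run's report is what makes that run reproducible, since the default changes with every session.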
Fixtures you'll use when adding tests
All defined in conftest.py:
- `hub_devices` → `{"nrf52": "/dev/cu.X", "esp32s3": "/dev/cu.Y"}`. Auto-skips the test if a required role isn't present.
- `test_profile` → USERPREFS dict for the session (`build_testing_profile`).
- `no_region_profile` → variant without `USERPREFS_CONFIG_LORA_REGION`.
- `baked_mesh` → verifies both devices are baked with the session profile (does NOT reflash; that's `test_00_bake.py`'s job).
- `baked_single` → single verified baked device; parametrize `request.param` to pick role.
- `serial_capture` → factory; `cap = serial_capture("esp32s3")` starts a pio device monitor session, drains into a per-test buffer, attaches the buffer to the pytest-html report on failure.
- `wait_until` → exponential-backoff polling helper; `wait_until(lambda: predicate(), timeout=60)` replaces flaky `time.sleep()` patterns.
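To make the idea behind `wait_until` concrete, here is a minimal sketch of an exponential-backoff polling helper. It is not the fixture's actual implementation: only the `predicate` and `timeout` arguments appear in this README, while `initial_delay`, `max_delay`, `backoff`, and the choice of `TimeoutError` are assumptions.

```python
import time


def wait_until(predicate, timeout=60.0, initial_delay=0.1, max_delay=5.0, backoff=2.0):
    """Poll `predicate` until it returns a truthy value or `timeout` seconds pass.

    The pause between polls starts at `initial_delay` and is multiplied by
    `backoff` after each miss, capped at `max_delay` (assumed parameters).
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        result = predicate()
        if result:
            return result
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise TimeoutError(f"condition not met within {timeout}s")
        # Never sleep past the deadline.
        time.sleep(min(delay, remaining))
        delay = min(delay * backoff, max_delay)
```

Compared with a fixed `time.sleep()`, this returns as soon as the condition holds and fails with a clear error when it never does, which is what removes both the slow and the flaky cases.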
Reports
pytest --html=report.html produces a self-contained HTML with:
- Per-test pass/fail/skip with timings
- On failure: serial log capture from any `serial_capture` fixture used
- On failure: `device_info` + LoRa config JSON for every role on the hub
- Session seed and session start time (for reproducibility)
pytest --junitxml=junit.xml produces CI-integration XML.
tool_coverage.json is emitted at session end in the tests directory — shows
which of the 38 MCP tools the run exercised. Useful for closing test gaps.
Adding a new test
- Pick the category that matches the operator concern (not the technical surface). "Does my fleet's owner name persist" is `admin/`, not `unit/`.
- If you need both devices, depend on `baked_mesh`. If you need one, depend on `baked_single`. If you need to mutate hardware state, put it in `provisioning/` or `fleet/` and add a `try`/`finally` teardown that re-bakes the session profile.
- Use `wait_until` for anything involving LoRa timing; fixed `sleep()` produces flakes.
- Use `serial_capture` when you need to observe firmware log output (e.g. "did the packet get decoded?").
- Add a `@pytest.mark.timeout(N)`; mesh tests routinely hit LoRa-airtime waits, and the default pytest timeout is infinite.
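The `try`/`finally` teardown mentioned in the list above can be sketched as a small context manager. The name `rebake_after` and the `rebake` callable are hypothetical stand-ins for whatever the harness uses to re-bake the session profile; the point is only that the re-bake runs even when the test body raises.

```python
from contextlib import contextmanager


@contextmanager
def rebake_after(rebake):
    """Run the wrapped block, then always call `rebake()` (hypothetical helper)."""
    try:
        yield
    finally:
        # Executes on success, assertion failure, or any other exception.
        rebake()


# Usage with a stand-in rebake function that only records that it ran.
events = []
try:
    with rebake_after(lambda: events.append("rebake")):
        events.append("mutate device state")
        raise AssertionError("simulated test failure")
except AssertionError:
    events.append("failure propagated")
```

After this runs, `events` holds the mutation, the re-bake, and the propagated failure in that order, so a failing test still leaves the device on the session profile.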
Troubleshooting
- All hardware tests SKIP → hub not detected. Plug in the USB hub, verify with `pytest tests/ --collect-only` or `python -c "from meshtastic_mcp import devices; print(devices.list_devices())"`.
- `baked_mesh` fails with "devices not baked" → run `pytest tests/test_00_bake.py` first, or pass `--force-bake` on the full run.
- Mesh formation tests time out → check that both devices are on the same session profile (`--force-bake` forces both to the current seed).
- Provisioning tests leave device in bad state → teardowns re-bake, but if a test crashes between "bake broken state" and "bake good state", run `pytest tests/test_00_bake.py --force-bake` to recover.