Meshtastic Firmware - Copilot Instructions
This document provides context and guidelines for AI assistants working with the Meshtastic firmware codebase.
Project Overview
Meshtastic is an open-source LoRa mesh networking project for long-range, low-power communication without relying on internet or cellular infrastructure. The firmware enables text messaging, location sharing, and telemetry over a decentralized mesh network. The project uses C++17 as its language standard across all platforms.
Supported Hardware Platforms
- ESP32 (ESP32, ESP32-S3, ESP32-C3, ESP32-C6) - Most common platform
- nRF52 (nRF52840, nRF52833) - Low power Nordic chips
- RP2040/RP2350 - Raspberry Pi Pico variants
- STM32WL - STM32 with integrated LoRa
- Linux/Portduino - Native Linux builds (Raspberry Pi, etc.)
- macOS native - Headless meshtasticd on Apple Silicon / x86_64; see variants/native/portduino/platformio.ini for Homebrew prereqs + CH341 LoRa setup
Supported Radio Chips
- SX1262/SX1268 - Sub-GHz LoRa (868/915 MHz regions)
- SX1280 - 2.4 GHz LoRa
- LR1110/LR1120/LR1121 - Wideband radios (sub-GHz and 2.4 GHz capable, but not simultaneously)
- RF95 - Legacy RFM95 modules
- LLCC68 - Low-cost LoRa
MQTT Integration
MQTT provides a bridge between Meshtastic mesh networks and the internet, enabling nodes with network connectivity to share messages with remote meshes or external services.
Key Components
- src/mqtt/MQTT.cpp - Main MQTT client singleton, handles connection and message routing
- src/mqtt/ServiceEnvelope.cpp - Protobuf wrapper for mesh packets sent over MQTT
- moduleConfig.mqtt - MQTT module configuration
MQTT Topic Structure
Messages are published/subscribed using a hierarchical topic format:
{root}/{channel_id}/{gateway_id}
- root - Configurable prefix (default: msh)
- channel_id - Channel name/identifier
- gateway_id - Node ID of the publishing gateway
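As a minimal sketch of the topic format above, the three components can be joined with `/`. The helper name `buildTopic` and its signature are hypothetical and are not taken from src/mqtt/MQTT.cpp; the channel name and node ID in the usage line are example values only.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not the firmware's API): joins root, channel_id and
// gateway_id into the hierarchical topic described above.
std::string buildTopic(const std::string &root, const std::string &channelId, const std::string &gatewayId)
{
    assert(!root.empty()); // an empty root would yield a topic starting with '/'
    return root + "/" + channelId + "/" + gatewayId;
}
```

For example, `buildTopic("msh", "LongFast", "!a1b2c3d4")` yields `msh/LongFast/!a1b2c3d4`.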
Configuration Defaults (from Default.h)
#define default_mqtt_address "mqtt.meshtastic.org"
#define default_mqtt_username "meshdev"
#define default_mqtt_password "large4cats"
#define default_mqtt_root "msh"
#define default_mqtt_encryption_enabled true
#define default_mqtt_tls_enabled false
Key Concepts
- Uplink - Mesh packets sent TO the MQTT broker (controlled by uplink_enabled per channel)
- Downlink - MQTT messages received and injected INTO the mesh (controlled by downlink_enabled per channel)
- Encryption - When encryption_enabled is true, only encrypted packets are sent; plaintext JSON is disabled
- ServiceEnvelope - Protobuf wrapper containing packet + channel_id + gateway_id for routing
- JSON Support - Optional JSON encoding for integration with external systems (disabled on nRF52 by default)
PKI Messages
PKI (Public Key Infrastructure) messages have special handling:
- Accepted on a special "PKI" channel
- Allow encrypted DMs between nodes that discovered each other on downlink-enabled channels
Encryption & Key Management
Meshtastic packets on the air are typically encrypted one of two ways: the per-channel symmetric layer (AES-CTR with a shared PSK) for broadcasts and channel traffic, and the per-peer PKI layer (X25519 ECDH → AES-256-CCM) for direct messages and remote admin. A channel with a 0-byte PSK (or Ham mode, which wipes PSKs) transmits cleartext — see the size table below. Both are implemented in src/mesh/CryptoEngine.cpp; the send/receive dispatch lives in src/mesh/Router.cpp; admin authorization lives in src/modules/AdminModule.cpp.
High-level model
- Channels are symmetric rooms: anyone with the PSK can read any message on the channel. Channel 0 is the "primary" channel and ships with the short-form default PSK on factory devices, forming the public mesh most users join. (The LoRa modem preset LONG_FAST lives on config.lora.modem_preset and is an independent field; don't conflate "channel 0 default PSK" with the modem preset name.)
- DMs addressed to a single node require PKI so that other holders of the channel PSK can't read them. Outside Ham mode, Meshtastic does not fall back to channel-symmetric encryption when the destination public key is unknown.
- Remote admin is a DM carrying an AdminMessage. The receiver only acts on it if the sender's public key is on its allowlist (config.security.admin_key[0..2]).
- Ham mode (owner.is_licensed=true, where owner is the local meshtastic_User record) disables PKI entirely and sends cleartext, because FCC Part 97 prohibits encryption on amateur bands.
- No ratchet, no session. Every packet is encrypted from scratch: a stateless design that matches the high-loss, store-and-forward nature of LoRa.
Symmetric channel encryption (AES-CTR)
CryptoEngine::encryptPacket / decrypt / encryptAESCtr in src/mesh/CryptoEngine.cpp.
- Cipher: AES-CTR, AES-128 or AES-256 depending on key length. Same routine in both directions (CTR is a stream cipher, so encrypt == decrypt).
- Key: ChannelSettings.psk bytes. Size semantics:
  - 0 bytes → no encryption, cleartext on the air
  - 1 byte → short-form index into the well-known defaultpsk[] in src/mesh/Channels.h. Index 0 = cleartext; 1 = defaultpsk unchanged; 2..255 = defaultpsk with its last byte incremented by (index − 1). This is what the CLI's --ch-set psk default produces.
  - 16 bytes → raw AES-128 key
  - 32 bytes → raw AES-256 key
  - 2..15 bytes → zero-padded to 16 and used as AES-128 (with a warn log); 17..31 bytes → zero-padded to 32 and used as AES-256 (with a warn log). Defensive fallback for malformed PSK input, not something to rely on.
- Nonce (128 bit): packet_id (u64 LE) ‖ from_node (u32 LE) ‖ block_counter (u32, starts at 0). Built in CryptoEngine::initNonce.
- No AEAD: channel packets carry no MAC, so the channel-hash byte is not an integrity or authenticity check. Channels::getHash is a 1-byte XOR-derived hint over the channel name bytes and PSK bytes that helps receivers pick a candidate channel/PSK for decryption. Because it is only a small hint and collisions are easy to find, it should be described purely as a PSK-selection aid, not as a security filter an attacker cannot bypass.
- Channel 0 is special in one way only: it's the channel the Router attempts PKI decryption on before falling through to AES-CTR. Non-zero channels always go straight to AES-CTR.
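The PSK size table and the nonce layout above can be sketched as three small functions. This is an illustration only: `effectiveKeyLen`, `expandShortPsk` and `buildCtrNonce` are hypothetical names rather than CryptoEngine or Channels APIs, the base key is passed in instead of reproducing defaultpsk[], and the sketch assumes the expanded short-form key is 16 bytes (AES-128).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Effective AES key length in bytes for a PSK, per the size table above.
// Returns 0 when the channel is cleartext.
// Assumption: the expanded short-form default key is 16 bytes (AES-128).
std::size_t effectiveKeyLen(const std::vector<uint8_t> &psk)
{
    assert(psk.size() <= 32);
    if (psk.empty())
        return 0;
    if (psk.size() == 1)
        return psk[0] == 0 ? 0 : 16;   // short-form index; 0 = cleartext
    return psk.size() <= 16 ? 16 : 32; // short keys are zero-padded up
}

// Short-form expansion: index 1 keeps the base key, 2..255 bump its last byte by (index - 1).
std::array<uint8_t, 16> expandShortPsk(std::array<uint8_t, 16> base, uint8_t index)
{
    assert(index >= 1); // index 0 means cleartext and has no key
    base[15] = static_cast<uint8_t>(base[15] + (index - 1));
    return base;
}

// 128-bit CTR nonce: packet_id (u64 LE) || from_node (u32 LE) || block_counter (u32, zero).
std::array<uint8_t, 16> buildCtrNonce(uint64_t packetId, uint32_t fromNode)
{
    std::array<uint8_t, 16> nonce{}; // bytes 12..15 stay 0: the block counter starts at 0
    for (int i = 0; i < 8; ++i)
        nonce[i] = static_cast<uint8_t>(packetId >> (8 * i));
    for (int i = 0; i < 4; ++i)
        nonce[8 + i] = static_cast<uint8_t>(fromNode >> (8 * i));
    return nonce;
}
```

Because the nonce is derived only from packet_id and from_node, the receiver can rebuild it from the packet header without any extra bytes on the wire.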
PKI encryption for DMs (X25519 ECDH + AES-256-CCM)
CryptoEngine::encryptCurve25519 / decryptCurve25519 in src/mesh/CryptoEngine.cpp.
- Keypair: Curve25519 (aka X25519), 32-byte public + 32-byte private. Stored in config.security.public_key / private_key; the public half is mirrored into owner.public_key so it rides along in NodeInfo broadcasts and propagates through the mesh like any other identity field.
- Key generation (generateKeyPair): stirs HardwareRNG::fill() (64 B from platform TRNG when available), the 16-byte myNodeInfo.device_id, and a call to random() into the rweather/Crypto library's software RNG, then Curve25519::dh1. regeneratePublicKey recomputes the public half from a known private (used when restoring from backup).
- Keygen entry points: at boot, NodeDB calls generateKeyPair (or regeneratePublicKey when a stored private key is present and passes a low-entropy check) directly when !owner.is_licensed and config.lora.region != UNSET. ensurePkiKeys wraps the same logic for runtime/admin flows: it's the path AdminModule::handleSetConfig runs when first assigning a valid region or when security config is written. Do not assume it's the universal boot-time gate, because the NodeDB path bypasses it.
- Handshake: Curve25519::dh2(local_private, remote_public) → 32-byte shared secret → SHA-256 → 32-byte AES-256 key. Recomputed per packet. The SHA-256 step is effectively a KDF over the raw ECDH output.
- Cipher: AES-256-CCM via aes_ccm_ae / aes_ccm_ad (src/mesh/aes-ccm.cpp). MAC length (the M parameter) is 8 bytes. No AAD, so the MAC authenticates the payload only.
- Nonce (13 bytes / 104 bit): aes_ccm_ae / aes_ccm_ad use a 13-byte CCM nonce (L = 2 is hardcoded in src/mesh/aes-ccm.cpp), not a 16-byte nonce. For PKI packets, CryptoEngine::initNonce(fromNode, packetNum, extraNonce) starts from the usual packet-derived nonce material, then overwrites nonce bytes 4..7 with a fresh 32-bit extraNonce = random(). The effective nonce bytes are therefore: bytes 0..3 = packet_id, bytes 4..7 = transmitted extraNonce, bytes 8..11 = from_node, byte 12 = 0x00. The receiver reconstructs the same 13-byte nonce from the packet metadata plus the appended extraNonce.
- Wire overhead: 12 bytes appended to the ciphertext = 8-byte MAC ‖ 4-byte extraNonce. Defined as MESHTASTIC_PKC_OVERHEAD = 12 in src/mesh/RadioInterface.h. Only the 4-byte extraNonce is sent; the rest of the 13-byte CCM nonce is reconstructed from packet fields as described above. The Router's send path checks this overhead against MAX_LORA_PAYLOAD_LEN before committing to PKI.
- Send selection (Router::send): the sender enters the PKI path when all of the following hold: we're the originator AND not Ham mode AND not Portduino simradio AND not on the serial / gpio channels (unless the packet is already marked pki_encrypted) AND config.security.private_key.size == 32 AND destination is a single node (not broadcast) AND the portnum isn't infrastructure. TRACEROUTE_APP, NODEINFO_APP, ROUTING_APP, and POSITION_APP are routed through channel encryption even when DMed (these need to be readable by relaying peers). Once on the PKI path, if the destination's public key isn't in our NodeDB the send fails with PKI_SEND_FAIL_PUBLIC_KEY; it does not silently fall back to channel encryption. If the client explicitly set pki_encrypted=true and any condition blocks PKI, the send fails with PKI_FAILED.
- Receive selection (Router::perhapsDecode): try PKI decrypt first when channel == 0 AND isToUs(p) AND not broadcast AND both peers have public keys in NodeDB AND rawSize > MESHTASTIC_PKC_OVERHEAD. On success the packet gets pki_encrypted=true stamped and the sender's public key copied into p->public_key for downstream authorization.
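The 13-byte CCM nonce layout and the 12-byte wire overhead described above can be sketched as follows. The names `buildPkiNonce`, `fitsPki` and `kPkcOverhead` are hypothetical. Two points are assumptions: the byte order inside each 4-byte field is taken to be little-endian, and the size check uses `<=`; the real comparison in Router::send may differ. The maximum payload length is passed in rather than hardcoded.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mirrors MESHTASTIC_PKC_OVERHEAD from the text: 8-byte MAC + 4-byte extraNonce.
constexpr std::size_t kPkcOverhead = 12;

// Rebuilds the 13-byte CCM nonce: bytes 0..3 packet_id, 4..7 extraNonce,
// 8..11 from_node, byte 12 = 0x00. Little-endian field order is an assumption.
std::array<uint8_t, 13> buildPkiNonce(uint32_t packetId, uint32_t fromNode, uint32_t extraNonce)
{
    std::array<uint8_t, 13> nonce{}; // zero-initialised, so byte 12 stays 0x00
    auto put32 = [&nonce](std::size_t offset, uint32_t value) {
        for (std::size_t i = 0; i < 4; ++i)
            nonce[offset + i] = static_cast<uint8_t>(value >> (8 * i));
    };
    put32(0, packetId);
    put32(4, extraNonce);
    put32(8, fromNode);
    return nonce;
}

// True when plaintext plus PKI overhead still fits in one LoRa payload.
bool fitsPki(std::size_t plaintextLen, std::size_t maxPayloadLen)
{
    assert(maxPayloadLen >= kPkcOverhead);
    return plaintextLen + kPkcOverhead <= maxPayloadLen;
}
```

Only `extraNonce` travels with the packet; the other nonce fields already exist in the packet header, which is why the overhead stays at 12 bytes.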
Remote admin authorization
Implemented in src/modules/AdminModule.cpp → handleReceivedProtobuf. The authorization check runs in this order:
- Response messages: if messageIsResponse(r) is true (the payload is a response to one of our earlier admin requests), it's accepted without any further check. The in-file comment flags this as a known-untightened gap: a stricter implementation would remember which public_key we last queried and reject responses that don't match.
- Local admin: mp.from == 0 (phone app over BLE, serial CLI, internal module); never travels over the air. Rejected if config.security.is_managed is true, because managed devices expect admin to arrive over the air through an authorized remote path.
- Legacy admin channel (deprecated): the packet arrived on a channel named literally "admin". Gated by config.security.admin_channel_enabled; returns NOT_AUTHORIZED if the flag is false. Kept for backward compatibility; new deployments should use PKI admin.
- PKI admin (preferred for remote): mp.pki_encrypted == true AND mp.public_key matches one of config.security.admin_key[0..2] (up to three authorized 32-byte Curve25519 public keys, typically copied from the admin node's own user.public_key).
- Fallthrough → NOT_AUTHORIZED.
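The ordering above can be sketched as a single decision function. Everything here is a simplified stand-in: `AdminRequest`, `SecurityConfig`, `AdminAuth` and `authorize` are hypothetical types and names, not the structures AdminModule actually uses, and the sketch flattens packet and config state into plain fields.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

using Key = std::array<uint8_t, 32>;

enum class AdminAuth { Accept, NotAuthorized };

// Hypothetical flattened view of the inputs the checks above consult.
struct AdminRequest {
    bool isResponse = false;     // stands in for messageIsResponse(r)
    uint32_t from = 0;           // 0 means local (BLE, serial, internal)
    bool onAdminChannel = false; // arrived on the channel named "admin"
    bool pkiEncrypted = false;
    Key senderKey{};
};

struct SecurityConfig {
    bool isManaged = false;
    bool adminChannelEnabled = false;
    std::vector<Key> adminKeys; // up to three authorized public keys
};

// Applies the checks in the documented order; the first matching rule decides.
AdminAuth authorize(const AdminRequest &r, const SecurityConfig &cfg)
{
    assert(cfg.adminKeys.size() <= 3);
    if (r.isResponse)
        return AdminAuth::Accept;
    if (r.from == 0)
        return cfg.isManaged ? AdminAuth::NotAuthorized : AdminAuth::Accept;
    if (r.onAdminChannel)
        return cfg.adminChannelEnabled ? AdminAuth::Accept : AdminAuth::NotAuthorized;
    if (r.pkiEncrypted &&
        std::find(cfg.adminKeys.begin(), cfg.adminKeys.end(), r.senderKey) != cfg.adminKeys.end())
        return AdminAuth::Accept;
    return AdminAuth::NotAuthorized; // fallthrough
}
```

Keeping the rules in one early-return chain makes the precedence explicit: a managed device rejects local admin before any channel or key check is reached.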
On top of authorization, any remote admin message that mutates state (not a request, not a response) also has to pass a session-key check (checkPassKey): the client must first pull a fresh 8-byte session_passkey via get_admin_session_key_request, then echo that passkey back in the mutating message. The device rotates the passkey after 150 s and rejects values older than 300 s — a narrow anti-replay window on top of the PKI layer.
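A minimal sketch of the two time windows, assuming timestamps in whole seconds. `SessionPasskeySketch` is a hypothetical class, not the AdminModule implementation: the fresh key is supplied by the caller to keep the example deterministic, and the exact boundary comparisons (`>` for rotation, `<` for acceptance) are assumptions.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using PassKey = std::array<uint8_t, 8>;

class SessionPasskeySketch
{
  public:
    static constexpr uint32_t kRotateAfterSec = 150;
    static constexpr uint32_t kRejectAfterSec = 300;

    // Returns the current passkey, replacing it with `fresh` when none was
    // issued yet or the current one is older than the rotation window.
    const PassKey &request(uint32_t nowSec, const PassKey &fresh)
    {
        if (!issued_ || nowSec - issuedAtSec_ > kRotateAfterSec) {
            key_ = fresh;
            issuedAtSec_ = nowSec;
            issued_ = true;
        }
        return key_;
    }

    // A mutating admin message must echo a passkey that matches and is still young enough.
    bool check(const PassKey &presented, uint32_t nowSec) const
    {
        return issued_ && (nowSec - issuedAtSec_ < kRejectAfterSec) && presented == key_;
    }

  private:
    PassKey key_{};
    uint32_t issuedAtSec_ = 0;
    bool issued_ = false;
};
```

The gap between the two constants means a client that fetched a key just before rotation still has time to use it.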
config.security.is_managed = true disables local admin writes (mp.from == 0 is rejected). It does not by itself force every admin action through PKI — the legacy "admin" channel still authorizes remote admin when config.security.admin_channel_enabled == true. The AdminModule refuses to persist is_managed=true unless at least one admin_key is populated — a deliberate guard against operators locking themselves out.
Key-rotation hazards (actions that invalidate peers)
- factory_reset_device (the "full" variant, calls NodeDB::factoryReset(eraseBleBonds=true)) → wipes the X25519 private key; a fresh keypair is generated on the next region-set. Every existing peer holds the old public key, so DMs to this node silently fail PKI decrypt until every peer re-exchanges NodeInfo.
- factory_reset_config (the "partial" variant, calls NodeDB::factoryReset() with eraseBleBonds=false) → preserves the X25519 private key in installDefaultConfig(preserveKey=true); the public key is zeroed and gets rebuilt from the preserved private key on the next boot via the NodeDB path's regeneratePublicKey call. Identity is preserved and the mesh does not need to re-exchange keys.
- region=UNSET → valid region → ensurePkiKeys runs inside the same handleSetConfig path; missing keys get generated at that moment.
- Ham mode transitions: entering Ham mode (user.is_licensed=true) runs Channels::ensureLicensedOperation, which wipes every channel PSK (all traffic becomes cleartext) and disables the legacy admin channel. The X25519 private key is preserved on the device but not used because Router::send skips PKI when owner.is_licensed is true. Leaving Ham mode re-enables PKI with the preserved keypair but does not restore the wiped channel PSKs; the operator has to re-set them.
- Channel 0 PSK change → every peer must re-learn the channel hash; cached NodeInfo becomes temporarily unreachable until the next broadcast.
- security.private_key blanked via admin → regenerates both halves (unless in Ham mode) and propagates the new public key via NodeInfo.
NodeDB Layout (v25)
DEVICESTATE_CUR_VER = 25, DEVICESTATE_MIN_VER = 24. The on-device NodeDB was split in v25 into a slim header table plus four optional satellite stores. Older v24 saves auto-migrate at boot. Old training-data instincts (node->user.long_name, node->position.latitude_i, node->is_favorite, node->device_metrics.battery_level) are wrong now — the fields aren't there. Read this section before touching anything that walks nodeDB->meshNodes.
Slim NodeInfoLite
UserLite is flattened onto NodeInfoLite (no nested sub-message); position and device_metrics are removed entirely (tags reserved). MAC address is dropped. Long names are capped at 25 chars (max_size:25 in deviceonly.options); hw_model and role are int_size:8. Encoded size dropped from ~166 B → ~105 B per node.
Booleans are bit-packed into NodeInfoLite.bitfield. Do not read or write the bits directly — use the inline helpers in src/mesh/NodeDB.h:
nodeInfoLiteHasUser(n) // bit 5 — user fields populated
nodeInfoLiteIsFavorite(n) // bit 3
nodeInfoLiteIsIgnored(n) // bit 4
nodeInfoLiteIsMuted(n) // bit 1
nodeInfoLiteIsLicensed(n) // bit 6 — Ham mode peer
nodeInfoLiteIsKeyManuallyVerified(n) // bit 0
nodeInfoLiteHasIsUnmessagable(n) // bit 8 — "is_unmessagable was sent"
nodeInfoLiteIsUnmessagable(n) // bit 7
// via_mqtt is bit 2 (mask exposed; predicate uses the mask directly)
nodeInfoLiteSetBit(n, NODEINFO_BITFIELD_IS_FAVORITE_MASK, true); // setter
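To illustrate the pattern behind these helpers, here is a self-contained sketch using the bit positions from the comments above. `NodeInfoLiteSketch`, `sketchGetBit`, `sketchSetBit` and the `k...Mask` constants are hypothetical stand-ins; the real struct, mask names and bitfield width live in the generated protobuf code and src/mesh/NodeDB.h and may differ.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the real NodeInfoLite; only the bitfield is modelled.
// A 32-bit field is an assumption (bit 8 is in use, so 8 bits would not be enough).
struct NodeInfoLiteSketch {
    uint32_t bitfield = 0;
};

// Masks derived from the bit positions listed above.
constexpr uint32_t kIsFavoriteMask = 1u << 3;
constexpr uint32_t kIsIgnoredMask = 1u << 4;
constexpr uint32_t kHasUserMask = 1u << 5;

inline bool sketchGetBit(const NodeInfoLiteSketch &n, uint32_t mask)
{
    return (n.bitfield & mask) != 0;
}

inline void sketchSetBit(NodeInfoLiteSketch &n, uint32_t mask, bool on)
{
    if (on)
        n.bitfield |= mask;
    else
        n.bitfield &= ~mask;
}
```

Routing every read and write through helpers like these keeps call sites independent of the bit assignments, so a layout change only touches the masks.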
Satellite stores
Four std::unordered_map<NodeNum, …> members on NodeDB, each gated by its own build flag:
| Map | Value type | Build flag |
|---|---|---|
| nodePositions | meshtastic_PositionLite | MESHTASTIC_EXCLUDE_POSITIONDB |
| nodeTelemetry | meshtastic_DeviceMetrics | MESHTASTIC_EXCLUDE_TELEMETRYDB |
| nodeEnvironment | meshtastic_EnvironmentMetrics | MESHTASTIC_EXCLUDE_ENVIRONMENTDB |
| nodeStatus | meshtastic_StatusMessage | MESHTASTIC_EXCLUDE_STATUSDB |
Defaults are ON (i.e., maps excluded) for STM32WL only — see src/mesh/mesh-pb-constants.h. On every other arch all four maps are present. When excluded, the map member is absent and the corresponding accessors return false.
All four maps are guarded by mutable concurrency::Lock satelliteMutex — concurrent access from receive threads, the phone API state machine, and the renderer is the rule, not the exception.
Accessor convention
Never hand out pointers into the maps. Use the copy-out accessors on NodeDB:
bool copyNodePosition(NodeNum, meshtastic_PositionLite &out) const;
bool copyNodeTelemetry(NodeNum, meshtastic_DeviceMetrics &out) const;
bool copyNodeEnvironment(NodeNum, meshtastic_EnvironmentMetrics &out) const;
bool copyNodeStatus(NodeNum, meshtastic_StatusMessage &out) const;
Each takes the lock, copies the value if present, returns false if the entry is absent or the DB is excluded. Pass-by-out-param is deliberate — pointer-style accessors would invite UAF and lock-leak bugs across the renderer. The "has any X" convenience predicates (hasValidPosition etc.) are implemented in terms of these.
Writers go through setNodeStatus, updatePosition, updateTelemetry (which dispatches on which_variant for device vs environment metrics) — these own the lock and the eviction hooks.
Eviction
Every code path that drops a node from the header table must also evict the satellites. The single chokepoint is eraseNodeSatellites(NodeNum); it's already called from getOrCreateMeshNode's oldest-boring eviction, removeNodeByNum, both branches of resetNodes, cleanupMeshDB, addFromContact's ignored-branch, and AdminModule's set_ignored_node. Add new eviction sites here, not by calling .erase() directly.
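The copy-out accessor, locked writer, and single eviction chokepoint described above can be sketched together for one satellite map. This is a simplified model, not the NodeDB code: `SatelliteSketch` and `PositionSketch` are hypothetical, and `std::mutex` stands in for concurrency::Lock.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <unordered_map>

using NodeNum = uint32_t;

// Stand-in for meshtastic_PositionLite; only a few illustrative fields.
struct PositionSketch {
    int32_t latitude_i = 0;
    int32_t longitude_i = 0;
    uint32_t time = 0;
};

class SatelliteSketch
{
  public:
    // Writer: owns the lock for the duration of the update.
    void updatePosition(NodeNum num, const PositionSketch &pos)
    {
        std::lock_guard<std::mutex> guard(satelliteMutex);
        nodePositions[num] = pos;
    }

    // Copy-out accessor: never exposes a pointer into the map.
    bool copyNodePosition(NodeNum num, PositionSketch &out) const
    {
        std::lock_guard<std::mutex> guard(satelliteMutex);
        auto it = nodePositions.find(num);
        if (it == nodePositions.end())
            return false;
        out = it->second;
        return true;
    }

    // Single eviction chokepoint; a fuller model would erase from all four maps here.
    void eraseNodeSatellites(NodeNum num)
    {
        std::lock_guard<std::mutex> guard(satelliteMutex);
        nodePositions.erase(num);
    }

  private:
    mutable std::mutex satelliteMutex; // mutable so const accessors can lock
    std::unordered_map<NodeNum, PositionSketch> nodePositions;
};
```

Callers check the return value before using `out`; a `false` result covers both "no entry" and, in the real firmware, "store compiled out".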
Sync flow: thin NodeInfo + post-COMPLETE_ID replay (no opt-in)
There is no capability flag and no special "gradient" nonce. The default sync flow is:
- Config / module-config / channel / metadata segments (same as before).
- STATE_SEND_OWN_NODEINFO: our own NodeInfo, still bundled with our position and device_metrics (because the replay snapshot excludes our own NodeNum). Emitted via ConvertToNodeInfo(lite).
- STATE_SEND_OTHER_NODEINFOS: every other peer's NodeInfo, always thin (no position, no device_metrics). Emitted via ConvertToNodeInfoThin(lite).
- STATE_SEND_FILEMANIFEST → STATE_SEND_COMPLETE_ID: the phone sees config_complete_id and treats sync as done.
- STATE_SEND_PACKETS: live mesh packets, with a trailing replay drain interleaved. The replay drain walks four cached satellite stores in order (positions → telemetry → environment → status) and emits each cached entry as an ordinary MeshPacket on the matching portnum (POSITION_APP, TELEMETRY_APP device + environment variants, NODE_STATUS_APP). These are indistinguishable on the wire from live mesh traffic, so clients need no special handling; any code that already updates UI on POSITION_APP etc. works.
PhoneAPI::sendConfigComplete() arms replayPhase = REPLAY_PHASE_POSITIONS for default/full sync and for SPECIAL_NONCE_ONLY_NODES, while SPECIAL_NONCE_ONLY_CONFIG skips replay. The drain runs inside STATE_SEND_PACKETS via popReplayPacket(), at lower priority than live traffic. When all four phases have drained, replayPhase flips back to REPLAY_PHASE_IDLE and shrink_to_fit is called on the snapshot vectors.
STM32WL, and any other build with all four MESHTASTIC_EXCLUDE_*DB flags set, produces zero replay packets: popReplayPacket advances through each phase in microseconds without emitting anything.
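The phase progression of the replay drain can be sketched as a small state machine. `ReplayDrainSketch` is a hypothetical model: strings stand in for MeshPackets, the interleaving with live traffic is omitted, and the real popReplayPacket() signature is not reproduced.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>
#include <vector>

enum class ReplayPhase { Idle, Positions, Telemetry, Environment, Status };

class ReplayDrainSketch
{
  public:
    // One snapshot vector per satellite store, in drain order.
    using Snapshots = std::array<std::vector<std::string>, 4>;

    // Arms the drain, as happens after config-complete in the flow above.
    void arm(Snapshots snapshots)
    {
        snapshots_ = std::move(snapshots);
        cursor_ = 0;
        phase_ = ReplayPhase::Positions;
    }

    // Returns the next cached entry, skipping empty phases; nullopt once idle.
    std::optional<std::string> pop()
    {
        while (phase_ != ReplayPhase::Idle) {
            auto &store = snapshots_[static_cast<std::size_t>(phase_) - 1];
            if (cursor_ < store.size())
                return store[cursor_++];
            store.clear();
            store.shrink_to_fit(); // give the snapshot memory back
            cursor_ = 0;
            phase_ = (phase_ == ReplayPhase::Status) ? ReplayPhase::Idle
                                                     : static_cast<ReplayPhase>(static_cast<int>(phase_) + 1);
        }
        return std::nullopt;
    }

    ReplayPhase phase() const { return phase_; }

  private:
    Snapshots snapshots_{};
    std::size_t cursor_ = 0;
    ReplayPhase phase_ = ReplayPhase::Idle;
};
```

With all four snapshots empty, a single `pop()` call walks every phase and returns `std::nullopt`, which matches the zero-replay behavior described for builds that exclude all four stores.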
Special nonces that still mean something:
- SPECIAL_NONCE_ONLY_CONFIG (69420): skip node sync entirely, just config.
- SPECIAL_NONCE_ONLY_NODES (69421): skip config segments, jump straight to STATE_SEND_OWN_NODEINFO. Still gets the post-COMPLETE_ID replay drain.
There are no other reserved nonces; everything else is a fresh random want_config_id from the client.
v24 → v25 migration
The legacy migration code lives in src/mesh/NodeDBLegacyMigration.cpp, not in NodeDB.cpp. It owns the meshtastic_NodeDatabase_Legacy callback and NodeDB::migrateLegacyNodeDatabase(). The legacy proto descriptor is protobufs/meshtastic/deviceonly_legacy.proto (only included by the migration TU). The boot path peeks the file's leading version tag, runs the migration if version < 25, then re-saves in v25 layout. The legacy descriptor is scheduled for removal once DEVICESTATE_MIN_VER is bumped.
Read-site rules of thumb
- Never node->position.X / node->device_metrics.X: those fields no longer exist. Pull from the satellite map via copyNodePosition / copyNodeTelemetry.
- Never node->user.long_name: long_name, short_name, public_key, hw_model, role, macaddr (gone), is_licensed, is_unmessagable are flat on NodeInfoLite.
- Never node->is_favorite / node->is_ignored / node->via_mqtt / node->is_key_manually_verified: use the bitfield helpers.
- Never assume nodeDB->getMeshNode(num)->position.time: call copyNodePosition and check the return.
- Don't lock satelliteMutex yourself in renderer code; the copy-out accessors already do.
Unit tests for the conversion layer live in test/test_type_conversions/test_main.cpp (Unity) — bitfield round-trips, long_name truncation, thin-vs-full conversions. Add cases there when extending the schema.
Project Structure
firmware/
├── src/ # Main source code
│ ├── main.cpp # Application entry point
│ ├── mesh/ # Core mesh networking
│ │ ├── NodeDB.* # Node database management
│ │ ├── Router.* # Packet routing
│ │ ├── Channels.* # Channel management
│ │ ├── CryptoEngine.* # AES-CTR (channels) + X25519 ECDH→AES-256-CCM (PKI for DMs/admin)
│ │ ├── *Interface.* # Radio interface implementations
│ │ ├── api/ # WiFi/Ethernet server APIs (ServerAPI, PacketAPI)
│ │ ├── http/ # HTTP server (WebServer, ContentHandler)
│ │ ├── wifi/ # WiFi support (WiFiAPClient)
│ │ ├── eth/ # Ethernet support (ethClient)
│ │ ├── udp/ # UDP multicast
│ │ ├── compression/ # Message compression (unishox2)
│ │ └── generated/ # Protobuf generated code
│ ├── modules/ # Feature modules (Position, Telemetry, etc.)
│ │ └── Telemetry/ # Telemetry subsystem
│ │ └── Sensor/ # 50+ I2C sensor drivers
│ ├── gps/ # GPS handling
│ ├── graphics/ # Display drivers and UI
│ │ └── niche/ # Specialized UIs (InkHUD e-ink framework)
│ ├── platform/ # Platform-specific code (esp32, nrf52, rp2xx0, stm32wl, portduino)
│ ├── input/ # Input device handling (InputBroker, keyboards, buttons)
│ ├── detect/ # I2C hardware auto-detection (80+ device types)
│ ├── motion/ # Accelerometer drivers (BMA423, BMI270, MPU6050, etc.)
│ ├── mqtt/ # MQTT bridge client
│ ├── power/ # Power HAL
│ ├── nimble/ # BLE via NimBLE
│ ├── buzz/ # Audio/notification (buzzer, RTTTL)
│ ├── serialization/ # JSON serialization, COBS encoding
│ ├── watchdog/ # Hardware watchdog thread
│ ├── concurrency/ # Threading utilities (OSThread, Lock)
│ ├── PowerFSM.* # Power finite state machine
│ └── Observer.h # Observer/Observable event pattern
├── variants/ # Hardware variant definitions
│ ├── esp32/ # ESP32 variants
│ ├── esp32s3/ # ESP32-S3 variants
│ ├── esp32c3/ # ESP32-C3 variants
│ ├── esp32c6/ # ESP32-C6 variants
│ ├── nrf52840/ # nRF52 variants
│ ├── rp2040/ # RP2040/RP2350 variants
│ ├── stm32/ # STM32WL variants
│ └── native/ # Linux/Portduino variants
├── protobufs/ # Protocol buffer definitions
├── boards/ # Custom PlatformIO board definitions
├── test/ # Unit tests (12 test suites)
└── bin/ # Build and utility scripts
Coding Conventions
General Style
- Follow existing code style - run trunk fmt before commits
- Prefer LOG_DEBUG, LOG_INFO, LOG_WARN, LOG_ERROR for logging
- Use assert() for invariants that should never fail
- C++17 features are available (std::optional, structured bindings, if constexpr, etc.)
- Keep code comments minimal: one or two lines, max. Comment only when the why isn't obvious from the code; never restate what the next line does. No multi-paragraph block comments explaining straightforward changes. The diff and commit message carry the rationale; the code carries the behavior.
- Use Throttle for time-based rate limiting, not raw millis() math. src/mesh/Throttle.h provides Throttle::isWithinTimespanMs(lastMs, intervalMs) (returns true while inside the cooldown) and Throttle::execute(&lastMs, intervalMs, func) (function-pointer form that updates the timestamp on fire). Use these for any "did N ms pass since X" check; raw millis() > lastMs + N is rollover-unsafe (breaks after ~49.7 days) and inconsistent with the rest of the codebase. The helpers compute now - lastMs with unsigned subtraction, which wraps correctly.
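The rollover argument in the last item can be checked with a standalone sketch. These two functions are illustrative only: unlike the real Throttle helpers they take the current time as an explicit `nowMs` parameter (an assumption made so the example needs no `millis()`), and the names carry a `Sketch` suffix to mark them as stand-ins.

```cpp
#include <cassert>
#include <cstdint>

// Rollover-safe: unsigned subtraction yields the true elapsed time
// even when nowMs has wrapped past zero and lastMs has not.
bool isWithinTimespanMsSketch(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs)
{
    return static_cast<uint32_t>(nowMs - lastMs) < intervalMs;
}

// The naive form the text warns about: lastMs + intervalMs can overflow,
// so the comparison reports "elapsed" far too early near the wrap point.
bool naiveHasElapsedSketch(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs)
{
    return nowMs > static_cast<uint32_t>(lastMs + intervalMs);
}
```

With lastMs = 0xFFFFFFF0, intervalMs = 100 and nowMs = 0xFFFFFFF8, only 8 ms have passed: the subtraction form correctly reports "still within the timespan", while the naive sum wraps to 84 and wrongly reports the interval as elapsed.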
Naming Conventions
- Classes: PascalCase (e.g., PositionModule, NodeDB)
- Functions/Methods: camelCase (e.g., sendOurPosition, getNodeNum)
- Constants/Defines: UPPER_SNAKE_CASE (e.g., MAX_INTERVAL, ONE_DAY)
- Member variables: camelCase (e.g., lastGpsSend, nodeDB)
- Config defines: USERPREFS_* for user-configurable options
Key Patterns
Module System
Modules use a three-tier class hierarchy:
- MeshModule - Base class. Implement wantPacket() and handleReceived(). Returns ProcessMessage::STOP or ProcessMessage::CONTINUE.
- SinglePortModule - Handles a single portnum. Simplified wantPacket() that checks decoded.portnum.
- ProtobufModule<T> - Template for protobuf-based modules. Handles encoding/decoding automatically.
Most modules also inherit from OSThread for periodic tasks (the "mixin" pattern):
class MyModule : public ProtobufModule<meshtastic_MyMessage>, private concurrency::OSThread
{
  public:
    MyModule();

  protected:
    virtual bool handleReceivedProtobuf(const meshtastic_MeshPacket &mp, meshtastic_MyMessage *msg) override;
    virtual meshtastic_MeshPacket *allocReply() override; // Generate response packets
    virtual int32_t runOnce() override;                   // Periodic task (returns next interval in ms)
    virtual bool alterReceivedProtobuf(meshtastic_MeshPacket &mp, meshtastic_MyMessage *msg); // Modify in-flight
    virtual bool wantUIFrame();                           // Request a UI display frame
};
Modules are registered in src/modules/Modules.cpp guarded by MESHTASTIC_EXCLUDE_* flags.
Observer/Observable Pattern
Event-driven communication between subsystems uses src/Observer.h:
// Observable emits events
Observable<const meshtastic::Status *> newStatus;
newStatus.notifyObservers(&status);
// Observer receives events via callback
CallbackObserver<MyClass, const meshtastic::Status *> statusObserver =
CallbackObserver<MyClass, const meshtastic::Status *>(this, &MyClass::handleStatusUpdate);
Configuration Access
- config.* - Device configuration (LoRa, position, power, etc.)
- moduleConfig.* - Module-specific configuration
- channels.* - Channel configuration and management
- owner - Device owner info
- myNodeInfo - Local node info
Default Values
Use the Default class helpers in src/mesh/Default.h:
- Default::getConfiguredOrDefaultMs(configured, default) - Returns ms, using default if configured is 0
- Default::getConfiguredOrDefault(configured, default) - Generic configured/default getter
- Default::getConfiguredOrMinimumValue(configured, min) - Enforces minimum values
- Default::getConfiguredOrDefaultMsScaled(configured, default, numNodes) - Scales based on network size
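The behavior of two of these helpers can be sketched as free functions. These are stand-ins rather than the Default class itself: the `Sketch` suffix marks them as hypothetical, and the sketch assumes configured and default intervals are stored in seconds and converted to milliseconds on return.

```cpp
#include <cassert>
#include <cstdint>

// Sketch of the "configured or default" rule: 0 means "not configured".
// Assumption: inputs are seconds, result is milliseconds.
constexpr uint32_t getConfiguredOrDefaultMsSketch(uint32_t configuredSecs, uint32_t defaultSecs)
{
    return (configuredSecs > 0 ? configuredSecs : defaultSecs) * 1000u;
}

// Sketch of the minimum-enforcing rule: values below the floor are raised to it.
constexpr uint32_t getConfiguredOrMinimumValueSketch(uint32_t configured, uint32_t minimum)
{
    return configured < minimum ? minimum : configured;
}
```

For instance, an unset interval with a 900 s default resolves to 900000 ms, and a configured value of 30 against a minimum of 60 is raised to 60.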
Thread Safety
- Use concurrency::Lock and concurrency::LockGuard for mutex protection
- Radio SPI access uses SPILock
- Prefer OSThread for background tasks
Hardware Detection
src/detect/ScanI2C automatically enumerates 80+ I2C device types at boot including displays, sensors, RTCs, keyboards, PMUs, and touch controllers. This drives automatic initialization of the correct drivers.
Graphics/UI System
Multiple display driver families in src/graphics/:
- OLED: SSD1306, SH1106, ST7567
- TFT: TFTDisplay (LovyanGFX-based)
- E-Ink: EInkDisplay2, EInkDynamicDisplay, EInkParallelDisplay
InkHUD (src/graphics/niche/InkHUD/) is an event-driven e-ink UI framework:
- Applet-based architecture — modular display tiles
- Read-only, static display optimized for minimal refreshes and low power
- Configured per-variant via nicheGraphics.h
- Separate PlatformIO config: src/graphics/niche/InkHUD/PlatformioConfig.ini
Input System
src/input/InputBroker is the centralized input event dispatcher. Supports multiple input sources: buttons, keyboards (BBQ10, Cardputer, TCA8418), touch screens, rotary encoders, and matrix keyboards.
Power Management
src/PowerFSM.* implements a finite state machine with states: stateON, statePOWER, stateSERIAL, stateDARK. Key events: EVENT_PRESS, EVENT_WAKE_TIMER, EVENT_LOW_BATTERY, EVENT_RECEIVED_MSG, EVENT_SHUTDOWN. Conditionally excluded with MESHTASTIC_EXCLUDE_POWER_FSM (falls back to FakeFsm).
Motion Sensors
src/motion/AccelerometerThread provides background motion monitoring with automatic screen wake and double-tap button press detection. Supports 10+ accelerometer/gyroscope chips (BMA423, BMI270, MPU6050, LIS3DH, LSM6DS3, STK8XXX, QMA6100P, ICM20948, BMX160).
Telemetry Sensor Library
src/modules/Telemetry/Sensor/ contains 50+ I2C sensor drivers organized by category:
- Power monitoring: INA219/226/260/3221, MAX17048
- Environmental: BME280/680, SCD4X (CO₂), SEN5X (particulate)
- Humidity/Temperature: SHT3X/4X, AHT10, MCP9808, MLX90614
- Light: BH1750, TSL2561/2591, VEML7700, LTR390UV, OPT3001
- Air quality: PMSA003I, SFA30
- Specialized: CGRadSens (radiation), NAU7802 (weight scale)
API/Networking
src/mesh/api/ provides a template-based ServerAPI for client communication over WiFi (WiFiServerAPI) and Ethernet (ethServerAPI). Default port: 4403. HTTP server in src/mesh/http/. JSON serialization in src/serialization/MeshPacketSerializer.
Hardware Variants
Each hardware variant has:
- variant.h - Pin definitions and hardware capabilities
- platformio.ini - Build configuration
- Optional: pins_arduino.h, rfswitch.h, nicheGraphics.h (for InkHUD variants)
Key defines in variant.h:
#define USE_SX1262 // Radio chip selection
#define HAS_GPS 1 // Hardware capabilities
#define HAS_SCREEN 1 // Display present
#define LORA_CS 36 // Pin assignments
#define SX126X_DIO1 14 // Radio-specific pins
Protobuf Messages
- Defined in protobufs/meshtastic/*.proto (~32 proto files)
- Generated code in src/mesh/generated/meshtastic/
- Regenerate with bin/regen-protos.sh
- Message types prefixed with meshtastic_
- Nanopb .options files control field sizes and encoding
Conditional Compilation
#if !MESHTASTIC_EXCLUDE_GPS // Feature exclusion
#if !MESHTASTIC_EXCLUDE_WIFI // Network feature exclusion
#if !MESHTASTIC_EXCLUDE_BLUETOOTH // BLE exclusion
#if !MESHTASTIC_EXCLUDE_POWER_FSM // Power FSM exclusion
#ifdef ARCH_ESP32 // Architecture-specific
#ifdef ARCH_NRF52 // Nordic platform
#ifdef ARCH_RP2040 // Raspberry Pi Pico
#ifdef ARCH_PORTDUINO // Linux native
#if defined(USE_SX1262) // Radio-specific
#ifdef HAS_SCREEN // Hardware capability
#if USERPREFS_EVENT_MODE // User preferences
Build System
Agent Tooling Baseline
Mirror counterpart: AGENTS.md under Agent Tooling Baseline.
To reduce avoidable agent mistakes, assume these tools are available (or install them before significant repo work):
- Required CLI basics: bash, git, find, grep, sed, awk, xargs
- Strongly recommended: rg (ripgrep) for fast file/text search, jq for JSON processing
- Build/test tools: python3, pip, virtualenv (python3 -m venv), platformio (pio)
- Containerized native testing: docker (fallback for non-Linux hosts; macOS can also build natively via pio run -e native-macos)
Fallback expectations for agents:
- If rg is unavailable, use find + grep instead of failing.
- For native tests on hosts without Linux deps, prefer ./bin/test-native-docker.sh.
- The simulator helper script is ./bin/test-simulator.sh.
Uses PlatformIO with custom scripts:
- bin/platformio-pre.py - Pre-build script
- bin/platformio-custom.py - Custom build logic, manifest generation
Build commands:
pio run -e tbeam # Build specific target
pio run -e tbeam -t upload # Build and upload
pio run -e native # Build native/Linux version
pio run -e native-macos # Build headless macOS meshtasticd (Homebrew prereqs in variants/native/portduino/platformio.ini)
Build Manifest
bin/platformio-custom.py emits a build manifest with metadata:
- hasMui, hasInkHud - UI capability flags (overridable via custom_meshtastic_has_mui, custom_meshtastic_has_ink_hud)
- Architecture normalization (e.g., esp32s3 → esp32-s3 for API compatibility)
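The architecture normalization mentioned above can be pictured as a small alias lookup. This is a minimal Python sketch, not the logic in bin/platformio-custom.py: the `ARCH_ALIASES` table and `normalize_arch` function are hypothetical names, and only the esp32s3 → esp32-s3 pair comes from the text.

```python
# Hypothetical alias table; only the esp32s3 entry is taken from the document.
ARCH_ALIASES = {"esp32s3": "esp32-s3"}


def normalize_arch(raw: str) -> str:
    """Return the API-facing architecture name; unknown names pass through."""
    key = raw.strip().lower()
    return ARCH_ALIASES.get(key, key)
```

Passing unknown names through unchanged keeps the sketch from guessing at mappings the document does not state.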
Common Tasks
Adding a New Module
- Create src/modules/MyModule.cpp and .h
- Inherit from appropriate base class (MeshModule, SinglePortModule, or ProtobufModule<T>)
- Mix in concurrency::OSThread if periodic work is needed
- Register in src/modules/Modules.cpp guarded by #if !MESHTASTIC_EXCLUDE_MYMODULE
- Add protobuf messages if needed in protobufs/meshtastic/
- Add test suite in test/test_mymodule/ if applicable
Adding a New Hardware Variant
- Create directory under variants/<arch>/<name>/
- Add variant.h with pin definitions and hardware capability defines
- Add platformio.ini with build config; use extends to reference common base (e.g., esp32s3_base)
- Set custom_meshtastic_support_level = 1 (PR builds) or 2 (merge builds)
- For e-ink displays, add nicheGraphics.h for InkHUD configuration
Adding a New Telemetry Sensor
- Create driver in src/modules/Telemetry/Sensor/ following existing sensor pattern
- Register I2C address in src/detect/ScanI2C for auto-detection
- Integrate with the appropriate telemetry module (Environment, Health, Power, AirQuality)
- Add proto fields in protobufs/meshtastic/telemetry.proto if new data types are needed
Modifying Configuration Defaults
- Check src/mesh/Default.h for default value defines
- Check src/mesh/NodeDB.cpp for initialization logic
- Consider isDefaultChannel() checks for public channel restrictions
Important Considerations
Traffic Management
The mesh network has limited bandwidth. When modifying broadcast intervals:
- Respect minimum intervals on default/public channels
- Use Default::getConfiguredOrMinimumValue() to enforce minimums
- Consider numOnlineNodes scaling for congestion control
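The minimum-interval rule above amounts to a clamp: a configured interval is honored only when it is not shorter than the enforced floor. The following Python sketch illustrates that idea only. It is not the firmware's C++ implementation of Default::getConfiguredOrMinimumValue(); the function name and the example numbers are hypothetical, and the clamp behavior is inferred from the helper's name.

```python
def configured_or_minimum(configured_secs: int, minimum_secs: int) -> int:
    """Return the configured interval, but never less than the minimum.

    Assumption: the real helper behaves like a simple lower-bound clamp.
    """
    return max(configured_secs, minimum_secs)


# Hypothetical values: a user asks for 60 s on a channel whose floor is 3600 s.
requested = 60
floor = 3600
effective = configured_or_minimum(requested, floor)
```

A broadcast path would use `effective` rather than `requested`, so an aggressive user setting cannot flood a public channel.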
Power Management
Many devices are battery-powered:
- Use IF_ROUTER(routerVal, normalVal) for role-based defaults
- Check config.power.is_power_saving for power-saving modes
- Implement proper sleep() methods in radio interfaces
Channel Security
- channels.isDefaultChannel(index) - Check if using default/public settings
- Default channels get stricter rate limits to prevent abuse
- Private channels may have relaxed limits
GitHub Actions CI/CD
The project uses GitHub Actions extensively for CI/CD. Key workflows are in .github/workflows/:
Core CI Workflows
- main_matrix.yml - Main CI pipeline, runs on push to master/develop and PRs
  - Uses bin/generate_ci_matrix.py to dynamically generate build targets
  - Builds all supported hardware variants
  - PRs build a subset (--level pr) for faster feedback
- trunk_check.yml - Code quality checks on PRs
  - Runs Trunk.io for linting and formatting
  - Must pass before merge
- tests.yml - End-to-end and hardware tests
  - Runs daily on schedule
  - Includes native tests and hardware-in-the-loop testing
- test_native.yml - Native platform unit tests
  - Runs pio test -e native
Release Workflows
- release_channels.yml - Triggered on GitHub release publish
  - Builds Docker images
  - Packages for PPA (Ubuntu), OBS (openSUSE), and COPR (Fedora)
  - Handles Alpha/Beta/Stable release channels
- nightly.yml - Nightly builds from develop branch
- docker_build.yml / docker_manifest.yml - Docker image builds
Build Matrix Generation
The CI uses bin/generate_ci_matrix.py to dynamically select which targets to build:
# Generate full build matrix
./bin/generate_ci_matrix.py all
# Generate PR-level matrix (subset for faster builds)
./bin/generate_ci_matrix.py all --level pr
Variants can specify their support level in platformio.ini:
- custom_meshtastic_support_level = 1 - Actively supported, built on every PR
- custom_meshtastic_support_level = 2 - Supported, built on merge to main branches
- board_level = extra - Extra builds, only on full releases
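The support-level rules above can be read as a filter over variants. This Python sketch is an illustration under stated assumptions, not the logic of bin/generate_ci_matrix.py: the `Variant` class, the `select` function, the trigger labels ("pr", "merge", "release"), and the example env names other than tbeam are all hypothetical. Variants with no support level are assumed here to build only on full releases.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    env: str
    support_level: Optional[int] = None  # custom_meshtastic_support_level
    board_level: Optional[str] = None    # e.g. "extra"


def select(variants: List[Variant], trigger: str) -> List[str]:
    """Pick env names for a hypothetical trigger: 'pr', 'merge', or 'release'."""
    if trigger == "release":
        return [v.env for v in variants]  # everything, including extras
    allowed = {1} if trigger == "pr" else {1, 2}
    return [
        v.env
        for v in variants
        if v.board_level != "extra" and v.support_level in allowed
    ]


variants = [
    Variant("tbeam", support_level=1),
    Variant("example-level2", support_level=2),   # hypothetical env name
    Variant("example-extra", board_level="extra"),  # hypothetical env name
]
```

With this sample, a PR would build only `tbeam`, a merge would add `example-level2`, and a release would build all three.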
Running Workflows Locally
Most workflows can be triggered manually via workflow_dispatch for testing.
Testing
Native unit tests (C++)
Unit tests in test/ directory with 12 test suites:
- test_crypto/ - Cryptography
- test_mqtt/ - MQTT integration
- test_radio/ - Radio interface
- test_mesh_module/ - Module framework
- test_meshpacket_serializer/ - Packet serialization
- test_transmit_history/ - Retransmission tracking
- test_atak/ - ATAK integration
- test_default/ - Default configuration
- test_http_content_handler/ - HTTP handling
- test_serial/ - Serial communication
Run with: pio test -e native
Simulation testing: bin/test-simulator.sh
Quick entry point for new test modules: test/README.md (native unit-test authoring guide, skeleton, pitfalls, and setup checklist).
Hardware-in-the-loop tests (mcp-server/tests/)
Separate pytest suite that exercises real USB-connected Meshtastic devices. See the MCP Server & Hardware Test Harness section below for invocation, tier layout, and agent usage rules.
MCP Server & Hardware Test Harness
The mcp-server/ directory houses a firmware-aware MCP server plus a pytest-based integration suite. AI agents that speak MCP get a well-defined tool surface for flashing, configuring, and inspecting physical Meshtastic devices — use it instead of hand-rolling pio or meshtastic --port calls where possible. mcp-server/README.md is the operator-facing setup doc; this section is the agent-facing usage contract.
The repo registers the server via .mcp.json at the repo root — Claude Code picks it up automatically once mcp-server/.venv/ is built (cd mcp-server && python3 -m venv .venv && .venv/bin/pip install -e '.[test]').
When to use which surface
| Goal | Tool |
|---|---|
| Find a connected device | mcp__meshtastic__list_devices |
| Read a live node's config/state | mcp__meshtastic__device_info, list_nodes, get_config |
| Mutate a device (owner, region, channels, reboot) | set_owner, set_config, set_channel_url, reboot, shutdown, factory_reset — all require confirm=True |
| Flash firmware to a variant | pio_flash (any arch) or erase_and_flash (ESP32 factory install) |
| Stream serial logs while debugging | serial_open → serial_read loop → serial_close |
| Administer userPrefs.jsonc build-time constants | userprefs_get, userprefs_set, userprefs_reset, userprefs_manifest |
| Run the regression suite | ./mcp-server/run-tests.sh (or /test slash command) |
| Diagnose a specific device | /diagnose [role] slash command (read-only) |
| Triage a flaky test | /repro <node-id> [count] slash command |
One MCP call per port at a time. SerialInterface holds an exclusive OS-level lock on the serial port for its lifetime. If a serial_* session is open on /dev/cu.usbmodem101, calling device_info on the same port will fail fast pointing at the active session. Sequence calls: open → read/mutate → close, then next device. Never parallelize tool calls on the same port.
MCP tool surface (43 tools)
Grouped by purpose. Full argument shapes in mcp-server/README.md; a few high-value signatures are called out here.
- Discovery & metadata: list_devices, list_boards, get_board
- Build & flash: build, clean, pio_flash, erase_and_flash (ESP32 only), update_flash (ESP32 OTA), touch_1200bps
- Serial sessions (long-running, 10k-line ring buffer): serial_open, serial_read, serial_list, serial_close
- Device reads: device_info, list_nodes
- Device writes: set_owner, get_config, set_config, get_channel_url, set_channel_url, send_text, send_input_event (inject a button/key press via the firmware's InputBroker), set_debug_log_api; destructive/power-state writes require confirm=True: reboot, shutdown, factory_reset
- userPrefs admin (build-time constants, not runtime config): userprefs_get, userprefs_set, userprefs_reset, userprefs_manifest, userprefs_testing_profile
- Vendor escape hatches: esptool_chip_info, esptool_erase_flash, esptool_raw, nrfutil_dfu, nrfutil_raw, picotool_info, picotool_load, picotool_raw
- USB power control (via uhubctl, per-port PPPS toggle): uhubctl_list (read-only), uhubctl_power (action='on'|'off', confirm=True), uhubctl_cycle (delay_s, confirm=True). Target by raw (location, port) or by role ("nrf52", "esp32s3"); role lookup checks MESHTASTIC_UHUBCTL_LOCATION_<ROLE> + _PORT_<ROLE> env vars first, falls back to VID auto-detection.
- Observability (UI tier + operator ad-hoc): capture_screen(role, ocr=True) grabs a USB-webcam frame of the device OLED and optionally OCRs it. Requires mcp-server[ui] extras (opencv-python-headless, easyocr) and MESHTASTIC_UI_CAMERA_DEVICE_<ROLE> env var; falls through to a 1×1 black PNG NullBackend when unconfigured.
confirm=True is a tool-level gate on top of whatever permission prompt your MCP host shows. Don't bypass it by asking the host to auto-approve — it exists specifically because MCP hosts sometimes remember "always allow this tool" and that's dangerous for factory_reset, erase_and_flash, uhubctl_power(action='off'), and uhubctl_cycle.
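A tool-level gate of this kind is commonly written as a decorator that rejects the call unless the caller passes `confirm=True` explicitly. The Python sketch below shows the pattern only; `requires_confirm` and `ConfirmationRequired` are hypothetical names, and the `factory_reset` body is a placeholder that touches no device.

```python
import functools


class ConfirmationRequired(Exception):
    """Raised when a destructive tool is called without confirm=True."""


def requires_confirm(func):
    """Refuse to run the wrapped tool unless confirm is exactly True."""

    @functools.wraps(func)
    def wrapper(*args, confirm=False, **kwargs):
        if confirm is not True:
            raise ConfirmationRequired(f"{func.__name__} requires confirm=True")
        return func(*args, **kwargs)

    return wrapper


@requires_confirm
def factory_reset(port: str) -> str:
    # Placeholder body for illustration; no hardware is contacted.
    return f"factory reset issued on {port}"
```

Checking `confirm is not True` rather than truthiness means values such as `"yes"` or `1` do not slip through, which matches the intent of an explicit gate.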
TCP / native-host nodes. Setting MESHTASTIC_MCP_TCP_HOST=<host[:port]> makes list_devices surface a meshtasticd daemon (e.g. the native-macos build) as a synthetic tcp://host:port entry, and connect() routes through meshtastic.tcp_interface.TCPInterface instead of SerialInterface. Every read/write/admin tool that flows through connect() works against the daemon transparently. USB-only tools (pio_flash, erase_and_flash, update_flash, touch_1200bps, serial_open, esptool_*, nrfutil_*, picotool_*) raise a clear ConnectionError when handed a tcp:// port; pio_flash against a native* env raises a FlashError (no upload step — use build and run the binary directly). The pytest harness still assumes USB-attached devices per role; TCP-aware fixtures are deferred. See mcp-server/README.md § "TCP / native-host nodes".
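How a `<host[:port]>` value might turn into the synthetic `tcp://host:port` entry can be sketched as follows. This is an assumption-laden illustration, not the server's parser: `tcp_entry` is a hypothetical name, the fallback to port 4403 (the default Meshtastic TCP API port) is assumed rather than confirmed for this server, and IPv6 literals are not handled.

```python
DEFAULT_TCP_PORT = 4403  # assumed fallback when no port is given


def tcp_entry(value: str) -> str:
    """Turn 'host' or 'host:port' into a 'tcp://host:port' string."""
    host, sep, port = value.strip().partition(":")
    if not host:
        raise ValueError("MESHTASTIC_MCP_TCP_HOST must name a host")
    port_number = int(port) if sep else DEFAULT_TCP_PORT
    return f"tcp://{host}:{port_number}"
```

A value with a trailing colon but no port raises `ValueError` from `int("")`, which is a reasonable failure for a malformed setting.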
Hardware test suite (mcp-server/run-tests.sh)
The wrapper auto-detects connected devices (VID → role map: 0x239A → nrf52, 0x303A/0x10C4 → esp32s3), maps each role to a PlatformIO env (nrf52 → rak4631, esp32s3 → heltec-v3, overridable via MESHTASTIC_MCP_ENV_<ROLE>), then invokes pytest. Zero pre-flight config needed from the operator.
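The two mappings described above (USB VID to role, then role to PlatformIO env with an environment override) can be sketched in Python. The table contents come from the paragraph above; the function names are hypothetical, the override variable is assumed to use the upper-cased role name, and the real wrapper may structure this differently.

```python
import os

# VID -> role, as listed in the text.
VID_TO_ROLE = {0x239A: "nrf52", 0x303A: "esp32s3", 0x10C4: "esp32s3"}

# role -> default PlatformIO env, as listed in the text.
DEFAULT_ENV = {"nrf52": "rak4631", "esp32s3": "heltec-v3"}


def role_for_vid(vid: int):
    """Return the role for a USB vendor ID, or None if it is not recognized."""
    return VID_TO_ROLE.get(vid)


def env_for_role(role: str, environ=None) -> str:
    """Return the PlatformIO env for a role, honoring MESHTASTIC_MCP_ENV_<ROLE>."""
    environ = os.environ if environ is None else environ
    return environ.get(f"MESHTASTIC_MCP_ENV_{role.upper()}", DEFAULT_ENV[role])
```

For example, `env_for_role("nrf52", {"MESHTASTIC_MCP_ENV_NRF52": "my-env"})` would return the override (`my-env` is a made-up env name), while an empty mapping falls back to `rak4631`.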
Suite tiers (collected + run in this order via pytest_collection_modifyitems):
- tests/unit/ - pure Python (boards parse, pio wrapper, userPrefs parse, testing profile, uhubctl parser). No hardware.
- tests/test_00_bake.py - flashes each detected device with current userPrefs.jsonc merged with the session's test profile. Has its own skip-if-already-baked check comparing region + primary channel to the session profile; skips cheaply on warm devices.
- tests/mesh/ - multi-device mesh: bidirectional send, broadcast delivery, direct-with-ACK, mesh formation within 60s. Parametrized [nrf52->esp32s3] and [esp32s3->nrf52]. Includes test_peer_offline_recovery, which uses uhubctl to physically power off one peer mid-conversation (requires uhubctl; skips without).
- tests/telemetry/ - DEVICE_METRICS_APP broadcast timing.
- tests/monitor/ - boot-log panic check.
- tests/recovery/ - uhubctl power-cycle round-trip + NVS persistence across hard reset. Requires uhubctl installed and a PPPS-capable hub; entire tier auto-skips otherwise.
- tests/ui/ - input-broker-driven screen navigation with camera + OCR evidence.
- tests/fleet/ - PSK seed session isolation.
- tests/admin/ - channel URL roundtrip, owner persistence across reboot.
- tests/provisioning/ - region + modem + slot bake, admin key presence, UNSET region blocks TX, userPrefs survive factory reset.
Invocation patterns:
./mcp-server/run-tests.sh # full suite (auto-bake-if-needed)
./mcp-server/run-tests.sh --force-bake # reflash before testing
./mcp-server/run-tests.sh --assume-baked # skip bake (caller vouches for device state)
./mcp-server/run-tests.sh tests/mesh # one tier
./mcp-server/run-tests.sh tests/mesh/test_direct_with_ack.py # one file
./mcp-server/run-tests.sh -k telemetry # name filter
No hardware detected? The wrapper auto-narrows to tests/unit/ only and prints detected hub : (none) in the pre-flight header. Agents interpreting the output should call this out explicitly — a 52-test green run without hardware is qualitatively different from a 12-unit-test green run.
Artifacts every run produces:
- mcp-server/tests/report.html - self-contained pytest-html. Each test gets a Meshtastic debug section with the tail of firmware log + device state dump. Open this first on failures; it's the canonical evidence source.
- mcp-server/tests/junit.xml - CI-parseable.
- mcp-server/tests/reportlog.jsonl - pytest-reportlog stream ($report_type keyed JSONL). Consumed by the live TUI.
- mcp-server/tests/fwlog.jsonl - firmware log mirror from the meshtastic.log.line pubsub topic. Populated by the _firmware_log_stream autouse session fixture.
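Because reportlog.jsonl is one JSON object per line keyed by `$report_type`, a quick outcome tally needs only the standard library. The sketch below assumes the usual pytest-reportlog fields (`nodeid`, `when`, `outcome`) and counts only the call phase, so tests skipped during setup are not included. The function name and the sample node ids are hypothetical.

```python
import json
from collections import Counter


def tally_outcomes(lines) -> Counter:
    """Count call-phase TestReport outcomes from reportlog-style JSONL lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if record.get("$report_type") != "TestReport":
            continue  # e.g. SessionStart / CollectReport / SessionFinish
        if record.get("when") == "call":
            counts[record.get("outcome")] += 1
    return counts


# Hypothetical sample records standing in for a real reportlog.jsonl.
sample = [
    '{"$report_type": "SessionStart"}',
    '{"$report_type": "TestReport", "nodeid": "tests/unit/test_a.py::test_ok", "when": "setup", "outcome": "passed"}',
    '{"$report_type": "TestReport", "nodeid": "tests/unit/test_a.py::test_ok", "when": "call", "outcome": "passed"}',
    '{"$report_type": "TestReport", "nodeid": "tests/mesh/test_b.py::test_bad", "when": "call", "outcome": "failed"}',
    '{"$report_type": "SessionFinish"}',
]
summary = tally_outcomes(sample)
```

To run it against a real artifact, pass an open file handle for mcp-server/tests/reportlog.jsonl instead of `sample`.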
Live TUI (meshtastic-mcp-test-tui)
A Textual-based live view that wraps run-tests.sh. Tails reportlog for per-test state, streams firmware logs, polls device state at startup + post-run (gated out of the active run because hub_devices holds exclusive port locks). Key bindings:
| Key | Action |
|---|---|
| r | re-run focused test (leaf → that node id; internal node → directory or -k) |
| f | filter tree by substring |
| d | failure detail modal (pulls longrepr + captured stdout from the reportlog) |
| g | export reproducer bundle (tar.gz with README, test_report.json, time-filtered fwlog, devices.json, env.json) |
| l | toggle firmware log pane |
| x | tool coverage modal |
| c | cross-run history sparkline |
| q | quit (SIGINT → SIGTERM → SIGKILL escalation, 5-s windows each) |
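The quit escalation in the last row (SIGINT, then SIGTERM, then SIGKILL, with a wait window after each) is a standard pattern for stopping a child process. The Python sketch below shows one way to write it on a POSIX host; it is not the TUI's code, and `stop_process` is a hypothetical name.

```python
import signal
import subprocess
import sys


def stop_process(proc: subprocess.Popen, window_s: float = 5.0):
    """Escalate SIGINT -> SIGTERM -> SIGKILL, waiting window_s after each.

    Returns the child's return code, or None if it is still alive after
    the final window (POSIX only, since SIGKILL is not defined on Windows).
    """
    for sig in (signal.SIGINT, signal.SIGTERM, signal.SIGKILL):
        if proc.poll() is not None:
            break  # already exited; no further signals needed
        proc.send_signal(sig)
        try:
            proc.wait(timeout=window_s)
        except subprocess.TimeoutExpired:
            continue  # did not exit in time; escalate to the next signal
    return proc.returncode


def spawn_sleeper() -> subprocess.Popen:
    """Start a throwaway child that sleeps, for trying the escalation out."""
    return subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
```

Starting with SIGINT gives the child a chance to run its own teardown, which matters here because the test session restores userPrefs.jsonc at exit.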
Launch:
cd mcp-server
.venv/bin/meshtastic-mcp-test-tui # full suite
.venv/bin/meshtastic-mcp-test-tui tests/mesh # args pass through to pytest
The plain CLI stays primary; the TUI is for operators who want a live dashboard. Both consume the same run-tests.sh.
Slash commands (Claude Code + Copilot)
Three AI-assisted workflows wrap the test harness. Claude Code operators get /test, /diagnose, /repro; Copilot operators get /mcp-test, /mcp-diagnose, /mcp-repro. Bodies:
- .claude/commands/{test,diagnose,repro}.md
- .github/prompts/mcp-{test,diagnose,repro}.prompt.md
.claude/commands/README.md is the index.
House rules for agents running these prompts:
- Interpret failures, don't just echo them. Pull firmware log tails from report.html and classify each failure as transient / environmental / regression. Use the exact format in .claude/commands/test.md.
- No destructive writes without operator approval. Any skill that could reflash, factory-reset, or reboot a device must describe the action and stop. The operator authorizes.
- Sequential MCP calls per port. See above.
- "Unknown" is a valid classification. If evidence doesn't support a root cause, say so and list what would disambiguate. Do not invent.
Key fixtures (test authors + agents debugging)
mcp-server/tests/conftest.py provides:
- _session_userprefs (autouse session) - snapshots userPrefs.jsonc at session start, merges the session test profile via userprefs.merge_active(test_profile), restores at teardown. Four layers of safety: pytest teardown + atexit + sidecar file (userPrefs.jsonc.mcp-session-bak) + startup self-heal in run-tests.sh. Do not edit userPrefs.jsonc from inside a test.
- _firmware_log_stream (autouse session) - subscribes to meshtastic.log.line pubsub on every connected SerialInterface and mirrors lines to tests/fwlog.jsonl. Drives the TUI firmware-log pane.
- _debug_log_buffer (autouse per-test) - captures last 200 firmware log lines + device state for attachment to the pytest-html Meshtastic debug section on failure.
- hub_devices (session) - dict[role, SerialInterface] with session-long exclusive port locks. Reason the TUI's device poller is gated to startup + post-run only.
- baked_mesh - parametrized mesh-pair fixture; depends on test_00_bake. pytest_generate_tests in conftest.py auto-generates [nrf52->esp32s3] and [esp32s3->nrf52] variants.
- test_profile - session-scoped dict: region, primary channel, admin key, PSK seed. Derived from MESHTASTIC_MCP_SEED (defaults to mcp-<user>-<host>).
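The snapshot-and-restore behavior described for _session_userprefs can be illustrated with a small context manager that copies the file to a sidecar and puts it back on exit. This is a minimal Python sketch of one of the safety layers only (no atexit hook or startup self-heal), not the fixture itself; `snapshot_and_restore` and `demo` are hypothetical names, and the sidecar suffix is taken from the text.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def snapshot_and_restore(path: Path, sidecar_suffix: str = ".mcp-session-bak"):
    """Copy path to a sidecar, yield, then restore the original and clean up."""
    sidecar = path.with_name(path.name + sidecar_suffix)
    shutil.copy2(path, sidecar)
    try:
        yield sidecar
    finally:
        shutil.copy2(sidecar, path)  # restore even if the body raised
        sidecar.unlink()


def demo() -> str:
    """Modify a throwaway userPrefs.jsonc inside the block; return its final text."""
    with tempfile.TemporaryDirectory() as tmp:
        prefs = Path(tmp) / "userPrefs.jsonc"
        prefs.write_text("// original\n{}\n")
        with snapshot_and_restore(prefs):
            prefs.write_text("// merged with test profile\n{}\n")
        return prefs.read_text()
```

Keeping the backup on disk rather than in memory is what lets a later run recover the file if the process is killed before the `finally` block runs.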
Firmware integration points tied to the test harness
Two firmware changes exist specifically so the test harness works reliably. Keep these in mind when touching related code.
- src/mesh/StreamAPI.cpp + StreamAPI.h - emitLogRecord uses a dedicated fromRadioScratchLog + txBufLog pair and a concurrency::Lock streamLock. Before this fix, debug_log_api_enabled=true would tear FromRadio protobufs on the serial transport because emitTxBuffer and emitLogRecord shared a single scratch buffer. The conftest enables the log stream session-wide; without this fix the device would corrupt its own FromRadio replies mid-session.
- src/mesh/PhoneAPI.cpp - ToRadio Heartbeat(nonce=1) triggers nodeInfoModule->sendOurNodeInfo(NODENUM_BROADCAST, true, 0, true) for serial clients, mirroring the pre-existing behavior for TCP/UDP clients in PacketAPI.cpp. The mesh tests rely on this to force a NodeInfo broadcast right after connect so the peer discovers them before the test's first assertion.
If you're modifying StreamAPI, PhoneAPI, NodeInfoModule, or userPrefs flow, run ./mcp-server/run-tests.sh at minimum before asking for review.
Recovery playbooks
| Symptom | First check | Fix |
|---|---|---|
| userPrefs.jsonc dirty after test run | git status --porcelain userPrefs.jsonc | If non-empty, re-run ./mcp-server/run-tests.sh once; the pre-flight self-heal restores from sidecar. If still dirty, git checkout userPrefs.jsonc. |
| Port busy / wedged CP2102 on macOS | lsof /dev/cu.usbserial-0001 | Kill the holder. USB replug if the kernel still reports busy. Often a stale pio device monitor or zombie meshtastic_mcp process. |
| nRF52 appears unresponsive | list_devices shows VID 0x239A but device_info times out | touch_1200bps(port=...) drops it into the DFU bootloader → pio_flash re-installs. |
| Device fully wedged (Guru Meditation, frozen CDC, no DFU) | list_devices shows the VID but every admin call times out | uhubctl_cycle(role="nrf52", confirm=True) hard-power-cycles the port via USB hub PPPS. baked_single's auto-recovery hook does this once automatically if uhubctl is installed. Falls back to physical replug if no PPPS hub. |
| Multiple MCP server processes | ps aux \| grep meshtastic_mcp shows >1 | Kill all but the one your MCP host spawned. Zombies hold ports and break tests. |
| Mesh formation fails, one side sees peer but other doesn't | /diagnose (or list_nodes on both sides) | Asymmetric NodeInfo. test_direct_with_ack has a heal path; /repro it a few times. If persistent, both devices' clocks may be out of sync with their NodeInfo cooldown. |
| "role not present on hub" in skip reasons | list_devices | Expected if a device is unplugged. Reconnect before re-running the tier. |
| Entire tests/recovery/ tier skipped | command -v uhubctl | Expected if uhubctl isn't on PATH. Install via brew install uhubctl (macOS) or apt install uhubctl (Debian/Ubuntu). Also skips if no hub advertises PPPS. |
| Entire tests/ui/ tier skipped ("firmware not baked with USERPREFS_UI_TEST_LOG") | reportlog.jsonl for the skip reason | Re-run with --force-bake so the UI-log macro gets compiled into the fresh firmware. First run after the Round-3 landing always re-bakes. |
| tests/ui/ runs but captures are all 1×1 black PNGs | MESHTASTIC_UI_CAMERA_DEVICE_ESP32S3 | Env var not set → NullBackend. Point a USB webcam at the heltec-v3 OLED and set the device index; .venv/bin/python -c "import cv2; [print(i, cv2.VideoCapture(i).read()[0]) for i in range(5)]" discovers it. |
| Tests fail only on first attempt then pass on rerun | — | State leak from a prior session. Run with --force-bake to reset to a known state. |
Never do these without asking
- factory_reset - wipes node identity; regenerates PKI keypair. Mesh peers will reject old DMs until re-exchange. Legitimate only when the operator explicitly wants it.
- erase_and_flash - full chip erase; destroys all on-device state.
- esptool_erase_flash / esptool_raw write/erase - bypasses pio's safety chain.
- set_config on lora.region - changes regulatory domain; requires physical-location context the operator has and the agent doesn't.
- reboot / shutdown mid-test - breaks fixture invariants.
- push -f, rebase -i, reset --hard, or any history-rewriting git operation.
- Clicking computer-use tools on web links in Mail/Messages/PDFs - open URLs via the claude-in-chrome MCP so the extension's link-safety checks apply.