* Add clamping logic for milliseconds conversion and unit tests
* Simplify comments in secondsToMsClamped function
Removed detailed comments about seconds to milliseconds conversion.
Router::handleReceived stores its allocCopy of the encrypted packet in the
class member p_encrypted. callModules() invokes module replies that re-enter
the router via MeshService::sendToMesh -> Router::sendLocal, which on a
broadcast reply recursively calls handleReceived. The inner call overwrites
the outer's p_encrypted without releasing it; on the way out it nulls the
member, the outer release(p_encrypted) now releases nullptr, and the original
allocation is permanently leaked. ~381 B per recursion.
Promote p_encrypted to a function-local so each invocation owns its own copy
for its full lifetime. The MQTT-publish null check at the call site (added by
PR #9136 as a workaround for this bug) stays in place because allocCopy can
still legitimately return nullptr on packetPool exhaustion.
Copilot's review of PR #8999 (the original introduction) flagged this exact
pattern at merge time:
"Storing p_encrypted as a class member can cause issues with recursive or
concurrent calls to handleReceived() since each call would overwrite the
previous packet pointer."
The historical reason for the member (S&F needing to retain the encrypted
copy across calls) was satisfied differently by PR #9799 (ServerAPI converted
to std::unique_ptr + cleanup on connection close), so the member is no longer
load-bearing.
Reproduces issues #9632 / #10101 / #8729 (heap leak when MeshMonitor
connected; TCP drops on Station G2 / LILYGO ServerAPI dump abort).
Hardware A/B on Station G2 under sustained TCP-API retry storm (open :4403,
request config, disconnect mid-stream, repeat at ~0.6/s) - 9 min run:
| Build | heapFree drift | rebootCount delta |
| this patch | -1.5 KB (noise)| 0 |
| stock 2.7.13 | -73 KB (8.1KB/min) | +1 (OOM crash) |
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Ben Meadors <benmmeadors@gmail.com>
Router::handleReceived stores its allocCopy of the encrypted packet in the
class member p_encrypted. callModules() invokes module replies that re-enter
the router via MeshService::sendToMesh -> Router::sendLocal, which on a
broadcast reply recursively calls handleReceived. The inner call overwrites
the outer's p_encrypted without releasing it; on the way out it nulls the
member, the outer release(p_encrypted) now releases nullptr, and the original
allocation is permanently leaked. ~381 B per recursion.
Promote p_encrypted to a function-local so each invocation owns its own copy
for its full lifetime. The MQTT-publish null check at the call site (added by
PR #9136 as a workaround for this bug) stays in place because allocCopy can
still legitimately return nullptr on packetPool exhaustion.
Copilot's review of PR #8999 (the original introduction) flagged this exact
pattern at merge time:
"Storing p_encrypted as a class member can cause issues with recursive or
concurrent calls to handleReceived() since each call would overwrite the
previous packet pointer."
The historical reason for the member (S&F needing to retain the encrypted
copy across calls) was satisfied differently by PR #9799 (ServerAPI converted
to std::unique_ptr + cleanup on connection close), so the member is no longer
load-bearing.
Reproduces issues #9632 / #10101 / #8729 (heap leak when MeshMonitor
connected; TCP drops on Station G2 / LILYGO ServerAPI dump abort).
Hardware A/B on Station G2 under sustained TCP-API retry storm (open :4403,
request config, disconnect mid-stream, repeat at ~0.6/s) - 9 min run:
| Build | heapFree drift | rebootCount delta |
| this patch | -1.5 KB (noise)| 0 |
| stock 2.7.13 | -73 KB (8.1KB/min) | +1 (OOM crash) |
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Ben Meadors <benmmeadors@gmail.com>
* Use hash table for O(1) lookup of recently seen packets
* Eliminate a packet lookup during deduplication
* Infinite loop checks for find and remove
* Consolidate conditional compilation
* Exclude hash table from minimal build
* Additional comment on hash table capacity
* Unit tests for packet history changes
* Update incorrect comment about size clamp
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Const
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Ben Meadors <benmmeadors@gmail.com>
* PhoneAPI: add missing tak_tag case + skip reserved gap in module-config iteration
The STATE_SEND_MODULECONFIG state machine iterates config_state through
ModuleConfigType enum values (1..MAX+1 = 16) and switches on
meshtastic_ModuleConfig_*_tag values. Two of the resulting tag values
land in the default / LOG_ERROR path:
1. `tak_tag` (16) — the meshtastic_ModuleConfig_TAKConfig struct exists
in the protobuf and has a `.tak` member in payload_variant, but no
PhoneAPI case ever sends it to the phone. As a result, Android
clients can't read the persisted TAK (Team Awareness Kit) module
config at all. Added case that sends moduleConfig.tak, matching the
pattern used for all other module-config tags (paxcounter,
traffic_management, etc.). NodeDB already persists the struct via
the moduleConfig protobuf save; this just wires the read path to
the phone.
2. Tag 14 — reserved gap in the oneof numbering. No payload_variant
member exists at tag 14. Without this patch, every phone reconnect
walks through config_state=14 and hits
`LOG_ERROR("Unknown module config type %d", config_state)`. On an
active deployment that's ~1,400 LOG_ERROR lines per day per node —
burying real errors. Added explicit `case 14: break;` so the gap
is silently skipped.
Also: lowered the `default:` log level from LOG_ERROR to LOG_DEBUG. A
truly-new unknown type number would indicate firmware lagging the
protobuf — annoying but not an error event worth LOG_ERROR, especially
since this path runs on every phone handshake. If a new ModuleConfig
tag appears, devs will notice via the phone UI missing it, not via log.
Observed on a Station G2 fleet: 1403 "Unknown module config type 16"
and 1427 "Unknown module config type 14" LOG_ERROR lines in 24 hours
from routine phone reconnects.
* Get rid of the placeholder
---------
Co-authored-by: Ben Meadors <benmmeadors@gmail.com>
The "Invalid protobufs (bad psk?)" and "Invalid portnum (bad psk?)"
messages fire every time a neighbor transmits on a channel whose
8-bit hash matches one of ours but the PSK differs. In RF environments
with multiple mesh groups nearby this is routine — a single device
can see dozens of these per minute from SAR/MeshCA/private networks
sharing a hash collision.
LOG_ERROR for a benign "not our PSK" event:
- spams the log when you have any neighboring mesh group
- makes a genuine PSK misconfiguration on YOUR own channel
indistinguishable from the constant cross-channel noise
- hides actual errors in the stream
LOG_DEBUG matches how similar decryption-failure paths are handled
elsewhere (eg. the PKC "decrypt attempted but failed" uses LOG_WARN).
Dropping the trailing "!" as well — these are expected events, not
exceptional ones.
The constructor sets `RadioLibInterface::instance = this` immediately,
before `init()` runs. `initLoRa()` in RadioInterface.cpp creates each
radio variant with `new SX1262Interface(...)` or similar, then calls
`init()`, and if init fails the `unique_ptr<RadioInterface>` is reset
to nullptr — destroying the object — while the static `instance`
pointer continues to point at the freed memory.
Main loop then checks `RadioLibInterface::instance != nullptr` and
calls `pollMissedIrqs()` or `resetAGC()` on the dangling pointer →
Guru Meditation (IllegalInstruction / LoadProhibited).
Reported in #9880 on an ESP32-S3 dev board without radio hardware
attached, where init always fails and the leftover pointer crashes
the device on the next `loop()` iteration.
Fix: add a virtual destructor to `RadioLibInterface` that clears the
static pointer iff it still references this object. A later successful
init() may have replaced `instance` with a different interface — the
`instance == this` guard preserves that case.
Fixes#9880
Co-authored-by: Ben Meadors <benmmeadors@gmail.com>
The CALIBRATE_ALL (0x7F) command inside resetAGC() clears bit 0 of the
undocumented 0x8B5 register. That bit is set once in init() by #9571 and
#9777 to improve SX1262 RX sensitivity, and the AGC-reset path was not
re-applying it. Result: every SX1262 node silently loses the RX
sensitivity patch ~60s after boot and never recovers until reboot.
Empirically confirmed on Heltec Mesh Node T114 (nRF52840 + SX1262):
- Post-calibration read of 0x8B5 = 0x04 (bit 0 cleared)
- After re-apply: 0x05 (bit 0 set)
Reproducible every AGC_RESET_INTERVAL_MS tick.
Fix re-applies the register bit alongside the existing post-calibration
re-applies (setDio2AsRfSwitch, setRxBoostedGainMode).
The CALIBRATE_ALL (0x7F) command inside resetAGC() clears bit 0 of the
undocumented 0x8B5 register. That bit is set once in init() by #9571 and
#9777 to improve SX1262 RX sensitivity, and the AGC-reset path was not
re-applying it. Result: every SX1262 node silently loses the RX
sensitivity patch ~60s after boot and never recovers until reboot.
Empirically confirmed on Heltec Mesh Node T114 (nRF52840 + SX1262):
- Post-calibration read of 0x8B5 = 0x04 (bit 0 cleared)
- After re-apply: 0x05 (bit 0 set)
Reproducible every AGC_RESET_INTERVAL_MS tick.
Fix re-applies the register bit alongside the existing post-calibration
re-applies (setDio2AsRfSwitch, setRxBoostedGainMode).
* Add powerlimits to reconfigured radio settings as well as init settings.
* Refactor preamble length handling for wide LoRa configurations
* Moved the preamble setting to the main class and made the references static
The W5100S Ethernet chip has only 4 hardware sockets. On RAK4631
Ethernet gateways with syslog and NTP enabled, all 4 sockets were
permanently consumed (NTP UDP + Syslog UDP + TCP API listener + TCP
API client), leaving none for MQTT, DHCP lease renewal, or additional
TCP connections.
- NTP: Remove permanent timeClient.begin() at startup; NTPClient::update()
auto-initializes when needed. Add timeClient.end() after each query to
release the UDP socket immediately.
- Syslog: Remove socket allocation from Syslog::enable(). Open and close
the UDP socket on-demand in _sendLog() around each message send.
- MQTT: Fix socket leak in isValidConfig() where a successful test
connection was never closed (PubSubClient destructor does not call
disconnect). Add explicit pubSub->disconnect() before returning.
Made-with: Cursor
Co-authored-by: Ben Meadors <benmmeadors@gmail.com>
The W5100S Ethernet chip has only 4 hardware sockets. On RAK4631
Ethernet gateways with syslog and NTP enabled, all 4 sockets were
permanently consumed (NTP UDP + Syslog UDP + TCP API listener + TCP
API client), leaving none for MQTT, DHCP lease renewal, or additional
TCP connections.
- NTP: Remove permanent timeClient.begin() at startup; NTPClient::update()
auto-initializes when needed. Add timeClient.end() after each query to
release the UDP socket immediately.
- Syslog: Remove socket allocation from Syslog::enable(). Open and close
the UDP socket on-demand in _sendLog() around each message send.
- MQTT: Fix socket leak in isValidConfig() where a successful test
connection was never closed (PubSubClient destructor does not call
disconnect). Add explicit pubSub->disconnect() before returning.
Made-with: Cursor
Co-authored-by: Ben Meadors <benmmeadors@gmail.com>
* Fix TransmitHistory to improve epoch handling
* Enable epoch handling in unit tests
* Improve comments and test handling for epoch persistence in TransmitHistory
* Add boot-relative timestamp handling and unit tests for TransmitHistory
* loadFromDisk should handle legacy entries and clean up old v1 files after migration
* Revert "loadFromDisk should handle legacy entries and clean up old v1 files after migration"
This reverts commit eb7e5c7acf.
* Add NodeInfoModule integration for RTC quality changes and trigger immediate checks
* Update test conditions for RTC quality checks