tailscale/wgengine
Brad Fitzpatrick 70de111394 wgengine/magicsock: fix three race conditions in TestTwoDevicePing
Fix three independent sources of flakiness, at least as diagnosed by
Claude; empirically, the test no longer flakes as it did before:

1. Poll for connection counter data instead of reading immediately.
   The conncount callback fires asynchronously on received WireGuard
   traffic, so after counts.Reset() there is no guarantee the counter
   has been repopulated before checkStats reads it. Use tstest.WaitFor
   with a 5s timeout to retry until a matching connection appears.

2. Replace the *2 symmetry assumption in global metric assertions.
   metricSendUDP and friends are AggregateCounters that sum per-conn
   expvars from both magicsock instances. The old assertion assumed
   both instances had identical packet counts, which breaks under
   asymmetric background WireGuard activity (handshake retries, etc).
   The new assertGlobalMetricsMatchPerConn computes the actual sum of
   both conns' expvars and compares against the AggregateCounter value.

3. Tolerate physical stats being 0 when user metrics are non-zero.
   A rebind event replaces the socket mid-measurement, resetting the
   physical connection counter while user metrics still reflect packets
   processed before the rebind. Log instead of failing in this case.
   Also move counts.Reset() after metric reads and reorder the reset
   sequence (counts before metrics) to minimize the race window.

Fixes tailscale/tailscale#13420

Change-Id: I7b090a4dc229a862c1a52161b3f2547ec1d1f23f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-11 07:34:52 -07:00