This applies the same treatment from 8f210454dd (netlog) to wglog,
ending use of netmap.NetworkMap and instead getting the canonical data
from LocalBackend/nodeBackend.
This is a prerequisite for removing netmap.NetworkMap from
upstream callers, like wgengine.Engine in general.
Updates #12542
Change-Id: Icb5af0799322def048a6f594b49f7d11273f025d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Outbound packets produced by netstack (used by tailscaled with
--tun userspace-networking, by tsnet, and by the SOCKS5/HTTP proxies)
enter the wrapper via InjectOutbound{,PacketBuffer} and take the
injectedRead path, which bypasses Filter.RunOut.
RunOut's side effect for UDP/SCTP is to insert the reverse-flow tuple
into the connection-tracking LRU so that Filter.RunIn admits inbound
replies that no explicit ACL rule covers. Skipping it on the injected
path meant a netstack-side dial of UDP would send fine but the reply
would be dropped as "no matching rule". The kernel-TUN path was
already fine because it goes through RunOut.
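The reverse-flow bookkeeping described above can be illustrated with a minimal sketch. It is not the filter package's implementation: the tracker type, its unbounded map (standing in for the LRU), and the runOut/runIn/mustFlow names are hypothetical and exist only to show why skipping the outbound step makes the reply fail to match.

```go
package main

import (
	"fmt"
	"net/netip"
)

// flow identifies one direction of a UDP conversation.
type flow struct {
	src, dst netip.AddrPort
}

// tracker is a hypothetical stand-in for the connection-tracking LRU.
// A real implementation would bound its size and evict old entries.
type tracker struct {
	seen map[flow]bool
}

func newTracker() *tracker { return &tracker{seen: make(map[flow]bool)} }

// mustFlow builds a flow from two "ip:port" strings and panics on bad input.
func mustFlow(src, dst string) flow {
	return flow{src: netip.MustParseAddrPort(src), dst: netip.MustParseAddrPort(dst)}
}

// runOut records the reverse of an outbound flow so that replies are admitted.
func (t *tracker) runOut(f flow) {
	t.seen[flow{src: f.dst, dst: f.src}] = true
}

// runIn reports whether an inbound flow matches a recorded reverse flow.
func (t *tracker) runIn(f flow) bool { return t.seen[f] }

func main() {
	out := mustFlow("100.64.0.1:40000", "100.64.0.2:53")
	reply := mustFlow("100.64.0.2:53", "100.64.0.1:40000")

	t := newTracker()
	fmt.Println(t.runIn(reply)) // false: the outbound packet skipped runOut
	t.runOut(out)
	fmt.Println(t.runIn(reply)) // true: the reverse flow was recorded
}
```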
Fixes #14229
Fixes #20064
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I816ef55c493a12ff4f561cd89c095559b5c2743b
When recommending an exit node, suggestExitNodeLocked ranks candidates by
the latency to their home DERP region, taken from the most recent netcheck
report. But netcheck alternates between full reports, which probe every
region, and incremental reports, which only re-probe the home region and a
handful of the fastest regions. When the most recent report is incremental,
the suggestion fell back to a random pick for exit nodes that are far away.
Now we rank candidates against the best recent latency, tracked by the
`netcheck.Client` - the same data that is used to pick the preferred
DERP. It uses a history of measurements which includes a full netcheck
report, so should cover all DERP regions.
Updates tailscale/corp#17516
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The Logger previously took a *netmap.NetworkMap at Startup and on every
ReconfigNetworkMap call, denormalizing it into per-IP and self lookup
maps. That denormalization is O(n) over all peers and ran on every
netmap update, contributing to the broader quadratic behavior we want
to eliminate when a single peer is added or removed.
Instead, this makes netlog ask LocalBackend (well, nodeBackend) for
the info it needs, letting us remove the netmap.NetworkMap type
entirely from the netlog package.
This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.
Updates #12542
Change-Id: Ib5f2de96e788a667332c0a6f7ac833b3d0053b5c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Mappings from transit IPs to real IPs are stored ephemerally in the
connector, so they're lost on restart. When we send a packet to the
connector with a transit IP it does not recognize, it sends us a TSMP
message saying so (see #19883). If we (the client) know of such a
mapping, we now re-send it to the connector so that a connection can
proceed.
Fixes tailscale/corp#34256
Signed-off-by: Naman Sood <mail@nsood.in>
decode6 didn't parse the IPv6 Fragment extension header (Next Header 44),
so any source-fragmented IPv6 packet was classified as an unknown protocol
and matched no ACL rule. The filter then silently dropped it and counted it
as an "acl" drop, even on allow-all tailnets, blackholing large UDP (DNS,
WebRTC, etc.) over a tailnet's IPv6 addresses. IPv4 fragments were already
handled by decode4.
Parse the fragment header the same way: read the first fragment's transport
ports so the filter matches it like an unfragmented packet, pass later
fragments through as ipproto.Fragment, and reject overlapping-fragment
offsets (RFC 1858) and first fragments too short to hold the transport
header as unknown.
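For reference, the 8-byte Fragment extension header (RFC 8200, section 4.5) can be decoded as in the sketch below. This is not decode6 itself: fragInfo, parseFragHeader and isFirst are hypothetical names, and the overlap and minimum-length checks mentioned above are left out.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// fragHeaderLen is the fixed size of the IPv6 Fragment extension header.
const fragHeaderLen = 8

// fragInfo holds the decoded fields of a Fragment header.
type fragInfo struct {
	nextHeader uint8  // protocol of the payload that follows
	offset     uint16 // fragment offset, in 8-byte units
	more       bool   // M flag: more fragments follow
	id         uint32 // identification shared by all fragments
}

var errShort = errors.New("fragment header too short")

// parseFragHeader decodes the 8-byte Fragment header at the start of b.
func parseFragHeader(b []byte) (fragInfo, error) {
	if len(b) < fragHeaderLen {
		return fragInfo{}, errShort
	}
	// Bytes 2-3: 13-bit offset, 2 reserved bits, 1-bit M flag.
	offFlags := binary.BigEndian.Uint16(b[2:4])
	return fragInfo{
		nextHeader: b[0],
		offset:     offFlags >> 3,
		more:       offFlags&1 != 0,
		id:         binary.BigEndian.Uint32(b[4:8]),
	}, nil
}

// isFirst reports whether this is the first fragment, the only one that
// carries the transport header and therefore the ports.
func (f fragInfo) isFirst() bool { return f.offset == 0 }

func main() {
	// Next Header 17 (UDP), offset 0, M=1, identification 0x01020304.
	hdr := []byte{17, 0, 0x00, 0x01, 0x01, 0x02, 0x03, 0x04}
	f, err := parseFragHeader(hdr)
	fmt.Println(f.nextHeader, f.offset, f.more, f.isFirst(), err)
}
```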
Fixes #20083
Signed-off-by: Steve Avery <hello@stevenavery.com>
Bumps wireguard-go pin to include the roaming endpoints fix, and
two internal enhancements.
Pulls stock wireguard-go for non-tailscale simulation in tests,
to use its endpoint discovery mechanism.
Updates #20082
Change-Id: I2ff282cb7fe4ab099ce5e780a1d40ae86a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
Package features/conn25 wires up the hooks directly on the tun wrapper
without needing to go through the userspace engine, so this codepath is
unused and not needed.
Updates #cleanup
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
Since deltas are only (at present) received from the control plane, processing
a delta signifies we are no longer operating on a netmap fully loaded from
cache, even if most of the netmap is still in the same configuration.
Updates #12542
Change-Id: I84132c4bf2dde6e5c1c57144645edb986b051dca
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The 1 minute timeout was hitting timers inside wireguard-go, leading
to stale connections hanging forever. Increasing the timeout to 2 minutes
makes a small subset of cached connections establish direct connections
slightly slower.
Updates to wireguard-go will allow a better hook for when to send these
messages in the future. This change only fixes the error mode, but
if better triggers arrive in wireguard-go, we should use
those.
Updates #20081
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
9be21088f4 changed sending disco pings so
a callMeMaybe would not be gated by endpoints existing if the node
was running off of a cached netmap.
This commit partly reverts that change, but keeps a few bug fixes from
that commit, along with the tests it introduced, which are now skipped.
The behaviour prior to 9be21088f4 is
retained.
Updates #20085
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This is a refinement of #19916. Previously, we would only emit a latency log
when going from a cached netmap to an uncached one (i.e., from the control
plane). We would like to know the latency in both conditions, though, so
instead use the validity of the previous self state.
Updates #12639
Updates tailscale/projects#27
Change-Id: I6bbeb5d3162f1f98cdb3dcd244f67ef31c170957
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
magicsock de-duplicates NetInfo callbacks against c.netInfoLast, a cache
that lives on the long-lived magicsock.Conn. That cache survives a control
client swap (interactive login or profile switch), where only the control
client (and its own per-client NetInfo dedup) is replaced. As a result, the
first netcheck after the swap produces a structurally-identical NetInfo
(same PreferredDERP, same NAT shape), magicsock suppresses it as unchanged,
and the new control session never learns our home DERP. Peers can't reach
the node over DERP until some unrelated NetInfo field happens to change.
Add Conn.ResetNetInfoLast to clear the dedup cache, and call it from
LocalBackend.setControlClientLocked whenever a control client is installed,
so the next netcheck re-reports the current NetInfo to the new client.
netInfoLast is only a dedup/optimization cache (all readers nil-guard, and
it is recomputed by every netcheck), so clearing it can only add a delivery,
never lose or misroute one; it is scoped to control-client lifecycle events,
not steady-state operation.
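The suppression and reset behaviour can be sketched in isolation. The types below are hypothetical and much smaller than the real NetInfo and magicsock.Conn; the sketch only shows why a structurally identical report is dropped until the cache is cleared.

```go
package main

import "fmt"

// netInfo is a hypothetical, trimmed-down stand-in for the real NetInfo.
type netInfo struct {
	preferredDERP int
}

// reporter suppresses callbacks whose value equals the last one delivered.
type reporter struct {
	last     *netInfo
	callback func(netInfo)
}

// report delivers ni unless it matches the cached last value.
func (r *reporter) report(ni netInfo) {
	if r.last != nil && *r.last == ni {
		return
	}
	r.last = &ni
	r.callback(ni)
}

// resetLast clears the dedup cache so the next report is always delivered.
func (r *reporter) resetLast() { r.last = nil }

func main() {
	delivered := 0
	r := &reporter{callback: func(netInfo) { delivered++ }}
	r.report(netInfo{preferredDERP: 1})
	r.report(netInfo{preferredDERP: 1}) // suppressed as unchanged
	r.resetLast()                       // e.g. a new control client was installed
	r.report(netInfo{preferredDERP: 1}) // delivered again
	fmt.Println(delivered)              // 2
}
```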
Updates #17887
Fixes #20024
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Commit 2b338dd6a8 removed watchdogEngine because it was weird
(so many methods) and increasingly unnecessary after we'd cleaned up
and simplified so much of the locking.
This adds back a watchdog, but one that is easier to maintain and more
idiomatic.
Updates #19759
Change-Id: I86c458473e126c0809f37696446ce7acf4cc4eb9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In order to allow us to measure the performance effects of client-side netmap
caching, both with and without the feature enabled, add logs to record how long
it takes after a client restart or profile switch for the node to establish
contact with peers, relative to the first uncached netmap.
We do this by keeping track of a timestamp when the connection is constructed,
and logging a record for "new" peer contacts that records how long (in
microseconds) it took from the time the peer was recorded as a candidate. The
message includes whether the contact was via DERP or direct, and whether a
cached netmap was in use at the time.
This builds on and extends the counters from #19699, but here we include new
contacts whether or not a cached netmap is in use, so that we can establish a
baseline for comparison.
Updates #12639
Updates tailscale/projects#27
Change-Id: I4f6d050e221f3881848d05a0425c4a5d1a59294c
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
The routecheck package parallels the netcheck package, where the
former checks routes and routers while the latter checks networks.
Like netcheck, it compiles reports for other systems to consume.
Historically, the client has never known whether a peer is actually
reachable. Most of the time this doesn’t matter, since the client will
want to establish a WireGuard tunnel to any given destination.
However, if the client needs to choose between two or more nodes,
then it should try to choose a node that it can reach.
Suggested exit nodes are one such example, where the client filters
out any nodes that aren’t connected to the control plane. Sometimes an
exit node will get disconnected from the control plane: when the
network between the two is unreliable or when the exit node is too
busy to keep its control connection alive. In these cases, Control
disables the Node.Online flag for the exit node and broadcasts this
across the tailnet. Arguably, the client should never have relied on
this flag, since it only makes sense in the admin console.
This patch implements an initial routecheck client that can probe
every node that your client knows about. You should not ping scan your
visible tailnet; this method is for debugging only.
This patch also introduces a new OnNetMapToggle hook, which fires when
the netmap transitions from nil to non-nil, or vice versa. This
happens either when the client receives its first MapResponse after
connecting to the control plane, or when it clears the netmap while it
is disconnecting. Routecheck uses this to wait for a valid netmap
so it knows which peers to probe.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Add four control-plane node attributes that let us disable UDP GSO/GRO
on the magicsock UDP socket and UDP/TCP GRO on the Tailscale TUN
device.
These complement the pre-existing TS_DEBUG_DISABLE_UDP_{GRO,GSO} and
TS_TUN_DISABLE_{UDP,TCP}_GRO envknobs. They exist so we can mitigate
upstream Linux kernel regressions on a deployed fleet without
requiring a client release, after two incidents (#13041, #19777) where
buggy kernel patches landed upstream and the fix took an excessively
long time to reach downstream distros.
Knob changes are reacted to in setNetworkMapInternal / SetNetworkMap via
a comparison against a cached "last applied" value and only an actual
transition triggers work: magicsock Rebind()+ReSTUN for UDP,
ApplyGROKnobs for TUN. The TUN side is gated by buildfeatures.HasGRO and
is one-way (wireguard-go GRO disablement is sticky); re-enabling
requires a client restart.
Updates #13041
Updates #19777
Change-Id: I802993070afa659cc06809bb0bfbb7f8a0cdb273
Signed-off-by: James Tucker <james@tailscale.com>
Originally found when adding tests for working with cached netmaps, and
finding the added tests to be flaky.
When working off of a cached netmap, if a node exists in the cached
netmap but does not yet have any endpoints, DERP connections are
available but not direct ones. By sending callMeMaybe to nodes
without endpoints in the cached netmap, we can establish direct
connections for this edge case.
Additionally, ensure that TSMP disco advert messages are not sent if the
endpoint does not have a valid address yet.
Fixes #19843
Updates #19597
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
For large tailnets (~50k+ nodes) with frequent peer churn (ephemeral
GitHub Actions workers etc.), tailscaled used to rebuild the full
netmap and fan it out on the IPN bus on every MapResponse that
added or removed a peer. There were two O(N) costs per delta: the
full netmap rebuild + every Notify.NetMap encode to every bus watcher.
This change tackles both:
1. Plumb O(1) peer add/remove through the delta path. PeersChanged
   and PeersRemoved no longer prevent the delta happy path; instead,
   they mutate the per-node-backend peer map in place.
2. Restrict ipn.Notify.NetMap emission to the platforms whose host
   GUIs still depend on it (Windows, macOS, iOS) and migrate
   in-tree consumers off it everywhere else:
   - Migrate reactive consumers (containerboot, kube agents,
     sniproxy, tsconsensus, etc.) off Notify.NetMap to the
     previously-added Notify.SelfChange signal so they no longer
     have to subscribe to the full netmap.
   - Add ipn.NotifyNoNetMap so GUI clients on "legacy-emit" platforms
     that have already migrated can opt out of the per-watcher
     NetMap encode.
   - Gate Notify.NetMap emission on the producer side by a compile-
     time GOOS check, so the supporting code is dead-code-eliminated
     on Linux and other geese where no GUI consumer needs it.
Re-running BenchmarkGiantTailnet from tstest/largetailnet, which was
added along with baseline numbers on unmodified main in ad5436af0d,
the per-delta cost (one peer add+remove pair) is now ~O(1) regardless
of tailnet size N:
N          no-watcher (ms/op)     bus-watcher (ms/op)
        before    now  factor   before    now  factor
10000       32   0.11    300x      166   0.13   1300x
50000      222   0.11   2000x      865   0.13   6700x
100000     504   0.12   4100x     1765   0.13  13400x
250000    1551   0.12  12500x     4696   0.15  32400x
Updates #12542
Change-Id: I94e34b37331d1a8ec74c299deffadf4d061fda9e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
SetDERPMap spawns a goroutine that calls ReSTUN, which logs via the
test logger. If the test returns before that goroutine logs, the
goroutine races with testing cleanup.
Use tstest.WhileTestRunningLogger so the goroutine's logf call becomes
a no-op once the test finishes.
Fixes #19829
Change-Id: I1097f98e40ffd1c5dd7fb7a715c918255853e3c6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The Engine watchdog wrapped every wgengine.Engine method call in a
goroutine with a 45s timeout and crashed the process on timeout. It
was added years ago to surface deadlocks during development, but the
underlying deadlocks have long since been fixed, and even when it did
fire it produced obscure stack traces (from inside the watchdog
goroutine, not the original caller) without buying much.
Audit of userspaceEngine's methods shows none have cyclic locking or
unbounded blocking now that ResetAndStop no longer loops waiting for
DERPs to drain (fa49009ee). The watchdog is dead weight; remove it
along with the TS_DEBUG_DISABLE_WATCHDOG escape hatch.
Updates #19759
Change-Id: Iba9d718fe1f8718a6631296e336b138c31b99ff1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
linuxRouter has two blocks (connmark rules and the CGNAT drop rule) that
gate on cfg.NetfilterMode, the requested config state. If
setNetfilterModeLocked fails, these blocks still assume the requested
mode is in effect, which may cause an error.
We now gate both blocks on r.netfilterMode, matching the pattern used by
SNAT, stateful, and loopback paths.
Fixes #19737
Change-Id: Ia6003a082db99c376e662132d725661afbac0ee9
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
Since f343b496c3 ("wgengine, all: remove LazyWG, use wireguard-go
callback API for on-demand peers"), Reconfig is fully synchronous:
magicConn.UpdatePeers, wgdev.RemovePeer, router.Set, and dns.Set all
return when the work is done, and the peer list is updated under
wgLock before Reconfig returns. So after Reconfig with empty configs,
len(st.Peers) is already 0.
The old loop also waited for st.DERPs to drain to 0, but UpdatePeers
only edits maps; active DERP connections idle out on their own
timeout. The sole caller (LocalBackend.stopEngineAndWait) doesn't
inspect st.DERPs anyway; it just hands the Status to
setWgengineStatusLocked. So the drain-wait was for nothing observable
and could theoretically (or at least appear to readers to) loop
forever holding b.mu. Remove that reader confusion by removing
the backoff loop entirely.
Updates #19759
Change-Id: Ibfac3f0baabcad7604b713c934a8fc37932e0a50
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The codegen path for map-of-slice-of-pointer fields skipped
nil-valued entries. That dropped the key from the map.
This broke how dns.Config.Routes uses nil values as sentinels.
Fixes #19730
Fixes #19732
Fixes #19746
Fixes #19744
Change-Id: Ic6400227f4ab21b3ca0e8c0eeecf9b83d145a9ab
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
This patch fixes a data race in wgengine/netstack that surfaced while
running both TestTCPForwardLimits and TestTCPForwardLimits_PerClient.
Both tests set the TS_DEBUG_NETSTACK envknob, and netstack.Impl.Close
leaked its inject goroutine.
The inject goroutine also reads the TS_DEBUG_NETSTACK envknob, so if
it is still running when the next test starts, its read races with
that test's write.
This patch also cleans up the tests a bit, ensuring that neither of
them runs in T.Parallel. It also adds a T.Cleanup call to clear the
envknob.
Fixes #19720
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Add new clientmetric counters for establishing contact with peers while using
cached network map data. To do this, instrument the magicsock.Conn with a bit
to indicate whether its peer data came from a cached netmap. If so, there are
two conditions we will count as establishing connectivity to a peer:
- Receipt of a CallMeMaybe from a peer via disco.
- Establishing a valid endpoint address for a peer.
In vmtest, add Env.ClientMetrics to scrape metrics from the specified node.
Use this to check that counters were updated in caching tests.
Updates https://github.com/tailscale/projects/issues/13
Updates #12639
Change-Id: Ie8cf3244ac8af4f5bcfe4d0d944078da2ba08990
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Per recent chat with @raggi about all this, I went and looked at this
test again.
Updates #cleanup
Change-Id: Icb7d87b1ed2cebf481ee4e358a3aa603e63fb8a4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Replace the UAPI text protocol-based wireguard configuration with
wireguard-go's new direct callback API (SetPeerLookupFunc,
SetPeerByIPPacketFunc, RemoveMatchingPeers, SetPrivateKey).
Instead of computing a trimmed wireguard config ahead of time upon
control plane updates and pushing it via UAPI, install callbacks so
wireguard-go creates peers on demand when packets arrive. This removes
all the LazyWG trimming machinery: idle peer tracking, activity maps,
noteRecvActivity callbacks, the KeepFullWGConfig control knob, and the
ts_omit_lazywg build tag.
For incoming packets, PeerLookupFunc answers wireguard-go's questions
about unknown public keys by looking up the peer in the full config.
For outgoing packets, PeerByIPPacketFunc (installed from
LocalBackend.lookupPeerByIP) maps destination IPs to node public keys
using the existing nodeByAddr index.
Updates tailscale/corp#12345
Change-Id: I4cba80979ac49a1231d00a01fdba5f0c2af95dd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Commit 78627c132f changed the signature of magicsock.Conn.SetDERPMap to
take an additional bool doReStun parameter. Avoid both the boolean
parameter and the API signature change by restoring SetDERPMap to its
original single-argument form and adding a new SetDERPMapWithoutReSTUN
method for the cache-loading caller that wants to skip the post-set
ReSTUN.
Updates #19490
Change-Id: I97d9e82156bfc546ccf59756d1ea52f039b5de06
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The mismatch behaviour of falling back to a previous key could end up
breaking connections when the netmap update took longer than the 2
seconds allowed in controlClient.auto for netmap updates, or if the
controlClient context was canceled. This could end up breaking
legitimate updates to the netmap for disco keys coming from control.
Instead, log the event, and let the connection be reset to that of the
key as that is safer.
Issue found by @bradfitz.
Updates #19574
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
With netmap caching, the home DERP of the self node was neither saved to
the cache nor loaded from it, so nodes did not stick to a DERP when
starting without a connection to control.
Instead, make sure that when a cache is available, load that cache,
before looking for DERP servers. This is implemented by allowing a skip
of ReSTUN in setting the DERP map (we must have a DERP map before
setting the home DERP), so the DERP from cache will set itself and be
sticky until a connection to control is established.
Making DERP only change when connected to control is handled by existing
code from f072d017bd.
Updates #19490
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
On systems where this sysctl defaults to 0 (including GCP VMs), rp_filter performs its lookup with fwmark=0, hits rule 5270 then table 52 and routes to 0.0.0.0/0 dev tailscale0, and drops every reply packet arriving on the physical interface as a martian. This breaks all connectivity when using an exit node: DERP, DNS, control plane, and even the cloud metadata service.
Set src_valid_mark=1 when enabling the connmark rules so the rp_filter workaround actually works in these cases.
Updates #3310
Updates tailscale/corp#37846
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Previously, handleLocalPackets intercepted traffic to the Tailscale
service IP (100.100.100.100 / fd7a:115c:a1e0::53) only for an allow-list
of ports: TCP 53/80/8080 and UDP 53. Any other port returned
filter.Accept, letting the packet fall through to the ACL filter and
wireguard-go, which would attempt a peer lookup. No peer owns the
quad-100 AllowedIP, so after ~5s pendopen.go would log:
open-conn-track: timeout opening ...; no associated peer node
This is the common "conntrack error no peer found for 100.100.100.100:853"
log spam seen in the wild (e.g. from systemd-resolved or another
resolver speculatively trying DoT on quad-100). It also leaks quad-100
packets onto the tailnet.
Remove the port allow-list so handleLocalPackets absorbs every quad-100
packet into netstack regardless of IP protocol or port. Traffic never
reaches the conntrack / peer-routing layers.
With the allow-list gone, acceptTCP needs a corresponding guard: on a
quad-100 TCP port we don't serve, execution used to fall through to the
isTailscaleIP case (quad-100 is in the tailscale IP range), which
rewrote the dial target to 127.0.0.1:<port> and forwardTCP'd the
connection to whatever happened to be listening on the host's loopback
at that port. Add a hittingServiceIP case that RSTs cleanly instead,
placed before the isTailscaleIP fallthrough.
TestQuad100UnservedTCPPortDoesNotForward is a new integration test that
injects a TCP SYN to 100.100.100.100:853 via handleLocalPackets, stubs
forwardDialFunc, and asserts the dialer is not invoked; it catches
regressions of the acceptTCP recursion/loopback-redirection case.
Fixes #15796
Fixes #19421
Updates #3261
Updates #11305
Signed-off-by: James Tucker <james@tailscale.com>
When there is an active connection between devices, do not send new
disco keys via TSMP.
Updates #12639
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Replace Conn.peers (sorted views.Slice) with peersByID, a
map[tailcfg.NodeID]tailcfg.NodeView. The only caller that needed
the sorted slice (the disco message receive path's binary search)
becomes a single map lookup. Drop nodesEqual.
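The difference between the two lookups can be sketched as follows. The peer and nodeID types are hypothetical simplifications of tailcfg.NodeView and tailcfg.NodeID; the point is only that the map form needs no ordering to be maintained when a single peer is added or removed.

```go
package main

import (
	"fmt"
	"sort"
)

type nodeID int64

// peer is a hypothetical, minimal stand-in for a node view.
type peer struct {
	id   nodeID
	name string
}

// findSorted looks up a peer in a slice sorted by ID using binary search.
func findSorted(peers []peer, id nodeID) (peer, bool) {
	i := sort.Search(len(peers), func(i int) bool { return peers[i].id >= id })
	if i < len(peers) && peers[i].id == id {
		return peers[i], true
	}
	return peer{}, false
}

// findByID looks up a peer in a map keyed by ID. Adding or removing one
// peer does not require re-sorting anything.
func findByID(peersByID map[nodeID]peer, id nodeID) (peer, bool) {
	p, ok := peersByID[id]
	return p, ok
}

func main() {
	sorted := []peer{{1, "a"}, {5, "b"}, {9, "c"}}
	byID := make(map[nodeID]peer, len(sorted))
	for _, p := range sorted {
		byID[p.id] = p
	}
	p1, ok1 := findSorted(sorted, 5)
	p2, ok2 := findByID(byID, 5)
	fmt.Println(p1.name, ok1, p2.name, ok2) // b true b true
}
```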
Add Conn.UpsertPeer / Conn.RemovePeer for O(1) single-peer endpoint
work. RemovePeer also performs a targeted single-disco-key cleanup
(previously that scan was O(discoInfo)).
Extract the shared per-peer upsert body as upsertPeerLocked; still
used by SetNetworkMap's bulk path. SetNetworkMap is documented as
the bulk / initial / self-change path; UpsertPeer and RemovePeer
are preferred for single-peer changes.
Make the relay server set update O(1) per peer: add serverUpsertCh
/ serverRemoveCh to relayManager with matching run-loop handlers.
UpsertPeer / RemovePeer evaluate the per-peer relay predicate
locally and dispatch upsert or remove. The full-rebuild
updateRelayServersSet stays for the initial netmap, filter
changes, and fallback.
Move the hasPeerRelayServers atomic from Conn onto relayManager,
next to the serversByNodeKey map it summarizes. The run loop is
now the single writer and needs no back-pointer to Conn;
endpoint's two hot-path readers take one extra hop to
de.c.relayManager.hasPeerRelayServers but the cost is the same
atomic load.
No callers use UpsertPeer/RemovePeer yet; a subsequent change will
plumb per-peer add/remove through the incremental map update path.
Updates #12542
Change-Id: If6a3442fe29ccbd77890ea61b754a4d1ad6ef225
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Endpoint's best address was cleared on trustBestAddrUntil expiry
only if it was a udprelay connection. This generalizes invalidation
to also cover direct UDP.
Trust deadline is checked in two cases:
- On disco ping timeout from the endpoint's best address.
  Traffic goes DERP-only, heartbeats to the old address stop.
  The discovery pings are still in flight, handled by the following.
- On disco ping success from an alternative. BestAddr switches to the
  working path, trust refreshed, eager discovery stops. The still
  in flight pongs are handled by betterAddr().
Updates #19407
Change-Id: Ic41ed18edb4a6e4350a2d49271ba01566a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
pickPort would bind a UDP socket on :0 to get a free port, close
the socket, then hope to rebind to the same port in NewConn. This
is a TOCTOU race that can cause flaky test failures when another
process grabs the port in between.
Instead, pass Port: 0 to NewConn and let the OS assign the port
atomically, then read back the assigned port via conn.LocalPort().
Fixes #19409
Change-Id: Ie44b599fb93c361e29a05f2171ad747c46f82b7a
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Avery Pennarun <apenwarr@tailscale.com>
Clients with the newly added node attribute
`"disable-linux-cgnat-drop-rule"` will not automatically drop inbound
traffic on non-Tailscale network interfaces with the source IP in the
CGNAT IP range. This is an initial proof-of-concept for enabling
connectivity with off-Tailnet CGNAT endpoints.
Fixes tailscale/corp#36270
Signed-off-by: Naman Sood <mail@nsood.in>
reflect.DeepEqual is expensive and allocates heavily. Replace it with
a field-by-field comparison that does zero allocations.
Adds tests and benchmarks for the new Equal method.
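As an illustration of the approach, here is a field-by-field Equal on a hypothetical config struct; the real type and its fields are not shown in this message. One behavioural difference to keep in mind: slices.Equal treats a nil slice and an empty slice as equal, whereas reflect.DeepEqual does not.

```go
package main

import (
	"fmt"
	"slices"
)

// config is a hypothetical struct used only for this illustration.
type config struct {
	name   string
	mtu    int
	routes []string
}

// Equal compares two configs field by field, without reflection and
// without allocating.
func (c *config) Equal(o *config) bool {
	if c == o {
		return true
	}
	if c == nil || o == nil {
		return false
	}
	return c.name == o.name &&
		c.mtu == o.mtu &&
		slices.Equal(c.routes, o.routes)
}

func main() {
	a := &config{name: "tun0", mtu: 1280, routes: []string{"10.0.0.0/8"}}
	b := &config{name: "tun0", mtu: 1280, routes: []string{"10.0.0.0/8"}}
	fmt.Println(a.Equal(b)) // true
	b.mtu = 1500
	fmt.Println(a.Equal(b)) // false
}
```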
Fixes #19363
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
The compare-metrics-stats subtest reset two independent counting
systems (physical connection counters and expvar.Int user metrics)
non-atomically. Background WireGuard keepalives arriving between the
resets could increment one system but not the other, causing
off-by-one packet/byte mismatches in either direction.
Replace the reset-then-compare pattern with snapshot-and-delta:
snapshot both systems before pings, snapshot again after, and compare
the deltas. This eliminates the non-atomic reset window entirely.
As a belt-and-suspenders safety net, tolerate a difference of exactly
one packet (and corresponding bytes) from a stray keepalive that
could still arrive in the narrow window between the two snapshots.
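A minimal sketch of snapshot-and-delta with a one-packet tolerance follows. The counters struct and helper names are hypothetical and do not mirror the test's real types; they only show that comparing deltas removes the need to reset either system.

```go
package main

import "fmt"

// counters is a hypothetical snapshot of two independently maintained
// counting systems that should agree with each other.
type counters struct {
	physical int64
	user     int64
}

// delta returns how much each counter grew between two snapshots.
func delta(before, after counters) counters {
	return counters{
		physical: after.physical - before.physical,
		user:     after.user - before.user,
	}
}

// withinOne reports whether a and b differ by at most one, which
// tolerates a single stray packet landing between the snapshots.
func withinOne(a, b int64) bool {
	d := a - b
	return d >= -1 && d <= 1
}

func main() {
	// The absolute values differ, but no reset is needed: only growth counts.
	before := counters{physical: 100, user: 97}
	after := counters{physical: 110, user: 107}
	d := delta(before, after)
	fmt.Println(d.physical, d.user, withinOne(d.physical, d.user)) // 10 10 true
}
```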
flakestress passes with ~5900 runs (~2800 without -race, ~3100 with
-race), but it also passed previously. This is an annoying one to
repro.
Fixes #11762
Change-Id: I3447ad67e71c8146e85eed38b7a665033ef9e284
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Avery found a bunch of tests that fail with -count=2.
Updates tailscale/corp#40176 (tracks making our CI detect them)
Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
The maxInFlightConnectionAttemptsForTest and
maxInFlightConnectionAttemptsPerClientForTest globals were plain ints
read by background gVisor TCP handler goroutines (via
wrapTCPProtocolHandler) and written by tstest.Replace cleanup in
TestTCPForwardLimits_PerClient. When a gVisor goroutine outlived the
test cleanup window, the race detector caught the unsynchronized
access.
The race-prone code was introduced in c5abbcd4b4 (2024-02-26,
"wgengine/netstack: add a per-client limit for in-flight TCP
forwards") which added both the plain int globals and the
TestTCPForwardLimits_PerClient test that writes them via
tstest.Replace. It is not obvious why this has only recently started
being detected as a data race; likely some combination of gVisor
version bumps, Go toolchain scheduler changes, and additional
TCP-injecting subtests (e.g. 03461ea7f, 2026-01-30) increased
goroutine churn enough to hit the window.
Change both globals to atomic.Int32 and replace tstest.Replace (which
does non-atomic *target = old on cleanup) with explicit Store/Cleanup
pairs.
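The shape of that change can be sketched as below. The variable name, the default of 16, and the setForTest helper are hypothetical; in a real test, the returned restore func would be passed to t.Cleanup.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// maxInFlightForTest is a hypothetical test override; zero means
// "use the default".
var maxInFlightForTest atomic.Int32

// setForTest stores v and returns a func that restores the previous
// value atomically, suitable for passing to t.Cleanup.
func setForTest(v int32) (restore func()) {
	old := maxInFlightForTest.Swap(v)
	return func() { maxInFlightForTest.Store(old) }
}

// limit is what a background goroutine would call to read the override.
func limit() int32 {
	if v := maxInFlightForTest.Load(); v > 0 {
		return v
	}
	return 16 // hypothetical default
}

func main() {
	restore := setForTest(2)
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = limit() // concurrent reads are race-free
		}()
	}
	restore() // may run while readers are still active
	wg.Wait()
	fmt.Println(limit()) // 16
}
```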
Fixes #19118
Change-Id: Id26ba6fbfb2e4ade319976db80af8e16c7c8778e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
These test failures were never caught by CI because the package in question
was missing from our privileged tests list. tailscale/corp#40007 covers improving
our process around this.
Fixes #19316
Signed-off-by: Amal Bansode <amal@tailscale.com>
Start using a common helper for tests to declare that they require root.
This is step 1. A later step will then make this helper track which tests were
skipped so a subsequent pass will run these tests as root.
Updates tailscale/corp#40007
Change-Id: I4979e1def0fa3691d38c83f48c89aaa443e7f62e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixes UDP listeners on VIP Service addresses not receiving inbound traffic.
- Modified shouldProcessInbound to check for registered UDP transport endpoints when processing packets to service VIPs
- Uses FindTransportEndpoint to determine if a UDP listener exists for the destination VIP/port
- Supports both IPv4 and IPv6
The aim was to mirror the existing TCP logic, providing feature parity for UDP-based services on VIP Services.
Fixes #18971
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through
the control client, noise transport, DERP, and wgengine layers so
that platforms like Android can inject user-installed CA certificates
into Go's TLS verification.
tlsdial.Config now honors base.RootCAs as additional trusted roots,
tried after system roots and before the baked-in LetsEncrypt fallback.
SetConfigExpectedCert gets the same treatment for domain-fronted DERP.
The Android client will set sys.ExtraRootCAs with a pool built from
x509.SystemCertPool + user-installed certs obtained via the Android
KeyStore API, replacing the current SSL_CERT_DIR environment variable
approach.
Updates #8085
Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>