Commit Graph

1384 Commits

Author SHA1 Message Date
James Tucker
25b8ed8d9e control/controlknobs,net/{batching,tstun},wgengine: add nodecaps to disable UDP & TUN GRO/GSO
Add four control-plane node attributes that let us disable UDP GSO/GRO
on the magicsock UDP socket and UDP/TCP GRO on the Tailscale TUN
device.

These complement the pre-existing TS_DEBUG_DISABLE_UDP_{GRO,GSO} and
TS_TUN_DISABLE_{UDP,TCP}_GRO envknobs. They exist so we can mitigate
upstream Linux kernel regressions on a deployed fleet without
requiring a client release, after two incidents (#13041, #19777) where
buggy kernel patches landed upstream and the fix took an excessively
long time to reach downstream distros.

Knob changes are reacted to in setNetworkMapInternal / SetNetworkMap via
a comparison against a cached "last applied" value and only an actual
transition triggers work: magicsock Rebind()+ReSTUN for UDP,
ApplyGROKnobs for TUN. The TUN side is gated by buildfeatures.HasGRO and
is one-way (wireguard-go GRO disablement is sticky); re-enabling
requires a client restart.

Updates #13041
Updates #19777

Change-Id: I802993070afa659cc06809bb0bfbb7f8a0cdb273
Signed-off-by: James Tucker <james@tailscale.com>
2026-05-27 17:10:14 -07:00
kari-ts
1a17ec1988 net/netmon: in Android, replace system/bin/ip call with cached LinkProperties gateway (#19804)
bind() on NETLINK_ROUTE sockets does not work on Android 11+ (https://developer.android.com/identity/user-data-ids#mac-11-plus) . Since system/bin/ip uses bind(), likelyHomeRouterIPHelper() always fails on Andoroid 11+, so that GatewayAndSelfIP never caches the result, causing repeated ip process spawns on every periodic ReSTUN.

This replaces the system/bin/ip fallback with a cached gateway IP pushed from Android’s ConnectivityManager via LinkProperties.getRoutes(). This is the same patterm used by UpdateLastKnownDefaultRouteInterface for the interface name (see https://github.com/tailscale/tailscale/pull/11784/). We keep the proc/net/route path as a fallback for early startup before NetworkChangeCallback has fired.

Updates tailscale/tailscale#18622
Updates tailscale/tailscale#13352

Signed-off-by: kari-ts <kari@tailscale.com>
2026-05-27 15:42:48 -07:00
James Tucker
dea49bb4da net/batching: add envknobs to disable UDP GRO & GSO
It is sometimes useful when diagnosing subtle and specific performance
problems to rule out GRO/GSO independently and/or toggle them to
influence packet pacing.

Updates #17835
Updates tailscale/corp#31164

Signed-off-by: James Tucker <james@tailscale.com>
2026-05-27 12:05:00 -07:00
BeckyPauley
0ed6da2826 cmd/k8s-operator, net/netutil: support 4via6 in egress proxy and connector (#19863)
Add support for configuring egress to destinations reachable via 4via6
subnet routes. This change affects standalone egress proxy only- egress
ProxyGroup needs IPv6 support before being able to support 4via6. Egress may
be configured using either the synthesized 4via6 address or the MagicDNS
name (in the form
<IPv4-address-with-hyphens-instead-of-dots>-via-<siteid>[.*]).

Also update the Connector to validate and advertise 4via6 subnet routes.
Export net/netutil.ValidateViaPrefix so it can be reused by the Connector
validation logic.

Updates #19334

Signed-off-by: Becky Pauley <becky@tailscale.com>
2026-05-27 10:54:35 +01:00
Adrian Dewhurst
5d8f401956 net/dns: fix handling non-IP single split DNS
Fixes #19834

Change-Id: I4d48efed00cd080b14c6fd713ff21e53a5a6ee3c
Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>
2026-05-22 20:45:58 -04:00
Simon Law
7dabebc691 net/traffic: switch rendezvous hashing from SHA256 to FNV-1a (#19821)
In PR tailscale/corp#30448, we originally decided to break ties using
SHA256 for our rendezvous hashing algorithm. Now that we’ve had some
experience with it, we think that FNV-1a is a better choice. It
distributes bits evenly, it’s much faster, and it doesn’t need to be
cryptographically secure. The FNV designers recommend FNV-1a over the
deprecated FNV-1.

This PR makes the switch and updates the related tests, since changing
the algorithm changes which stable pick gets selected. As of 2026-05,
this is the best time to make this change, since there are almost no
clients in the wild with traffic steering enabled.

Updates #17366
Updates tailscale/corp#29964
Updates tailscale/corp#29966
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-05-21 10:11:59 -07:00
Simon Law
7ebca58042 net/traffic,ipn/ipnlocal: extract traffic steering utilities (#19682)
The traffic package contains helpers for evaluating traffic steering
scores and picking appropriate nodes. These were extracted from
ipnlocal.suggestExitNodeUsingTrafficSteering so they can be reused by
the new routecheck package to probe exit nodes in priority order.

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-05-21 08:28:27 -07:00
Brad Fitzpatrick
f3a117e813 net/tsdial: run happy eyeballs across A and AAAA in UserDial
When tailscaled is running in userspace-networking mode behind an
exit node (e.g. as a SOCKS5 proxy), it resolves a hostname and then
dials a single resolved IP through the tunnel. If the name has both
A and AAAA, Go's net.Resolver merges them and we pick ips[0], which
on an IPv6-native host is usually AAAA. If the exit node has no IPv6
egress (or vice versa), the dial fails silently through the tunnel
and the user sees a hang.

Resolve all candidates and race connect attempts across address
families with a 300ms happy-eyeballs delay, matching Go's net.Dialer
default and the existing pattern in net/dnscache (commit ee0a03b14).
First success wins; losers are cancelled and any conns they produce
are closed. A failBoost channel wakes the launcher when a connect
fails fast (e.g. ICMP "no route" via the tunnel) so we don't sit on
the 300ms timer when the answer is already known.

userDialResolve is refactored into userDialResolveAll (returns the
full candidate list) plus a thin single-IP wrapper for callers like
UserDialPlan that don't race. UserDial's per-IP dispatch (netstack
vs peer dialer vs SystemDial vs std) is extracted to dialOneUser so
each candidate can route correctly on its own merits.

Also fix serveDial in localapi to pass the original hostname to
UserDial rather than a pre-resolved IP, so the race can fire.

This fix is single-ended: it works against any exit node, including
old ones, with no protocol changes. The trade-off versus filtering
on the exit-node side via PeerAPI DoH is that every dial through an
unreachable-family exit node costs one failed connect attempt per
cache window, rather than zero, which is acceptable given the
simplicity.

Fixes #19792
Fixes #13257

Change-Id: I9d7645d0034caf3ee22ecdd8070798353f77e94b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-20 18:35:55 -07:00
Claus Lensbøl
ee0a03b140 net/dnscache: run happy eyeballs with more than one dest IP (#19770)
If the context given to DialContext has a shorter lifetime than the OS
TCP SYN timeout, and TCP SYNs are dropped from the path to the remote,
DialContext would never fall back to try IPv6 after IPv4.

Instead, use the normal happy eyeballs race if there is more than one
address. This does remove the implicit prioritization of IPv4 over IPv6
in cases where there is only a single IPv4 remote address.

Updates #13346

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-05-19 12:59:11 -04:00
Fernando Serboncini
2a06fb66d0 cmd/cloner: preserve nil-valued entries when cloning map (#19749)
The codegen path for map-of-slice-of-pointer fields, skipped
nil-valued entries. That dropped the key from the map.

This broke how dns.Config.Routes uses nil values sentinels.

Fixes #19730
Fixes #19732
Fixes #19746
Fixes #19744

Change-Id: Ic6400227f4ab21b3ca0e8c0eeecf9b83d145a9ab

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-05-14 10:30:59 -04:00
Nick Khyl
32f984f54c net/dns: create a new hosts file if it doesn't exist on Windows
A missing hosts file is not a fatal error. We should log it, but still proceed
and create a new one instead of failing the DNS reconfiguration completely.

Fixes #19733

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2026-05-13 16:10:36 -05:00
Brad Fitzpatrick
883d4fd2cd wgengine/netstack, net/ping: stop using pro-bing and use our net/ping instead
Fixes #19633
Fixes #13760

Change-Id: I0fa9423523a3a0fb1dfcde57de0f26e51723ff97
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-04 14:05:24 -07:00
Fran Bull
bdf3419e7d net/dns: add custom scheme resolvers
If another part of the client code registers a custom scheme with the
forwarder, the forwarder will check resolver addresses to see if they
match the scheme. If they do, the corresponding custom scheme handler
will be called to find the actual address for the resolver at this
moment. If the handler returns the empty string then that resolver will
be ignored.

This is useful if you want to dynamically determine where to send
certain DNS requests. It is being added to support new app connector
(conn25) work that would like to make sure it sends DNS requests to the
current connector peer in a high availability configuration.

Updates tailscale/corp#39858

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-05-01 14:01:10 -07:00
Brad Fitzpatrick
f343b496c3 wgengine, all: remove LazyWG, use wireguard-go callback API for on-demand peers
Replace the UAPI text protocol-based wireguard configuration with
wireguard-go's new direct callback API (SetPeerLookupFunc,
SetPeerByIPPacketFunc, RemoveMatchingPeers, SetPrivateKey).

Instead of computing a trimmed wireguard config ahead of time upon
control plane updates and pushing it via UAPI, install callbacks so
wireguard-go creates peers on demand when packets arrive. This removes
all the LazyWG trimming machinery: idle peer tracking, activity maps,
noteRecvActivity callbacks, the KeepFullWGConfig control knob, and the
ts_omit_lazywg build tag.

For incoming packets, PeerLookupFunc answers wireguard-go's questions
about unknown public keys by looking up the peer in the full config.
For outgoing packets, PeerByIPPacketFunc (installed from
LocalBackend.lookupPeerByIP) maps destination IPs to node public keys
using the existing nodeByAddr index.

Updates tailscale/corp#12345

Change-Id: I4cba80979ac49a1231d00a01fdba5f0c2af95dd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 19:46:19 -07:00
Andrew Dunham
33714211c8 net/dns: use os.Root to prevent path traversal in darwin resolver
The darwinConfigurator writes split DNS resolver files to
/etc/resolver/$SUFFIX using os.WriteFile with string concatenation.
A crafted MatchDomain value containing path traversal sequences
(e.g. "../evil") could write files outside the resolver directory.

Use os.OpenRoot to confine all file operations in SetDNS and
removeResolverFiles to the resolver directory. os.Root rejects any
path component that escapes the root, returning an error instead of
following the traversal.

Also parametrize the resolver directory path on the struct to enable
testing with t.TempDir(), and add tests.

As far as I can tell, this would require a malicious controlplane to
exploit, but still worth fixing.

Updates tailscale/corp#39751

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2026-04-28 11:08:22 -04:00
Brad Fitzpatrick
0e10a3f580 net/tsdial, ipn/localapi, client/local: let clients dial non-Tailscale addresses directly
Add a tsdial.Dialer.UserDialPlan method that resolves an address and
reports whether the dialer would route it via Tailscale. The LocalAPI
/dial handler now uses this to skip proxying for addresses that aren't
Tailscale routes (e.g. localhost), returning a Dial-Self response with
the resolved address so the client can dial it directly. This avoids
an unnecessary round-trip through the daemon for local connections.

The client's UserDial handles the new response by dialing the resolved
address itself, and the server passes the pre-resolved IP:port for
Tailscale dials to avoid redundant DNS lookups.

Thanks to giacomo and Moyao for pointing this out!

Updates tailscale/corp#39702

Change-Id: I78d640f11ccd92f43ddd505cbb0db8fee19f43a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 09:33:27 -07:00
Andrew Dunham
d52ae45e9b cmd/cloner: deep-clone pointer elements in map-of-slice values
The cloner's codegen for map[K][]*V fields was doing a shallow
append (copying pointer values) instead of cloning each element.
This meant that cloned structs aliased the original's pointed-to
values through the map's slice entries.

Mirror the existing standalone-slice logic that checks
ContainsPointers(sliceType.Elem()) and generates per-element
cloning for pointer, interface, and struct types.

Regenerate net/dns and tailcfg which both had affected
map[...][]*dnstype.Resolver fields.

Fixes #19284

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2026-04-17 11:36:05 -04:00
Brad Fitzpatrick
49eb1b5d26 net/dns: fix TestDNSTrampleRecovery failure under flakestress
The test had two problems:

1. runFileWatcher passed hardcoded "/etc/" to the inotify watcher,
   but the test filesystem uses a temp directory prefix. The watcher
   was watching the real /etc/, never seeing the test's file writes.

2. The test's watchFile used gonotify.NewDirWatcher which creates
   goroutines that block on real inotify syscalls. These don't work
   inside synctest's fake-time bubble. The test only passed standalone
   by accident: gonotify walks /etc/ on startup producing fake events
   that happened to trigger trample detection at the right time.

Fix the path issue by adding ActualPath to the wholeFileFS interface,
which translates logical paths (like "/etc/resolv.conf") to real
filesystem paths (respecting any test prefix). Use it in
runFileWatcher so the inotify watch targets the correct directory.

Replace gonotify in the test with a one-shot timer that synctest can
advance through fake time, reliably triggering the trample check.

Fixes #19400

Change-Id: Idb252881ec24d0ab3b3c1d154dbdaf532db837d4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-14 06:55:35 -07:00
Brad Fitzpatrick
9fbe4b3ed2 all: fix six tests that failed with -count=2
Avery found a bunch of tests that fail with -count=2.

Updates tailscale/corp#40176 (tracks making our CI detect them)

Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
2026-04-13 18:52:57 -07:00
Brad Fitzpatrick
a182b864ac tsd, all: add Sys.ExtraRootCAs, plumb through TLS dial paths
Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through
the control client, noise transport, DERP, and wgengine layers so
that platforms like Android can inject user-installed CA certificates
into Go's TLS verification.

tlsdial.Config now honors base.RootCAs as additional trusted roots,
tried after system roots and before the baked-in LetsEncrypt fallback.
SetConfigExpectedCert gets the same treatment for domain-fronted DERP.

The Android client will set sys.ExtraRootCAs with a pool built from
x509.SystemCertPool + user-installed certs obtained via the Android
KeyStore API, replacing the current SSL_CERT_DIR environment variable
approach.

Updates #8085

Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-07 18:10:54 -07:00
James Tucker
21695cdbf8 ipn/ipnlocal,net/netmon: make frequent darkwake more efficient
Investigating battery costs on a busy tailnet I noticed a large number
of nodes regularly reconnecting to control and DERP. In one case I was
able to analyze closely `pmset` reported the every-minute wake-ups being
triggered by bluetooth. The node was by side effect reconnecting to
control constantly, and this was at times visible to peers as well.

Three changes here improve the situation:
- Short time jumps (less than 10 minutes) no longer produce "major
  network change" events, and so do not trigger full rebind/reconnect.
- Many "incidental" fields on interfaces are ignored, like MTU, flags
  and so on - if the route is still good, the rest should be manageable.
- Additional log output will provide more detail about the cause of
  major network change events.

Updates #3363

Signed-off-by: James Tucker <james@tailscale.com>
2026-04-06 15:46:51 -07:00
Brad Fitzpatrick
86f42ea87b cmd/cloner, cmd/viewer: handle named map/slice types with Clone/View methods
The cloner and viewer code generators didn't handle named types
with basic underlying types (map/slice) that have their own Clone
or View methods. For example, a type like:

    type Map map[string]any
    func (m Map) Clone() Map { ... }
    func (m Map) View() MapView { ... }

When used as a struct field, the cloner would descend into the
underlying map[string]any and fail because it can't clone the any
(interface{}) value type. Similarly, the viewer would try to create
a MapFnOf view and fail.

Fix the cloner to check for a Clone method on the named type
before falling through to the underlying type handling.

Fix the viewer to check for a View method on named map/slice types,
so the type author can provide a purpose-built safe view that
doesn't leak raw any values. Named map/slice types without a View
method fall through to normal handling, which correctly rejects
types like map[string]any as unsupported.

Updates tailscale/corp#39502 (needed by tailscale/corp#39594)

Change-Id: Iaef0192a221e02b4b8e409c99ef8398090327744
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-05 20:20:32 -07:00
Brad Fitzpatrick
5ef3713c9f cmd/vet: add subtestnames analyzer; fix all existing violations
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.

The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
  regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
  slice/map literal with bad name field values

Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.

Updates #19242

Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-05 15:52:51 -07:00
M. J. Fromberger
211ef67222 tailcfg,ipn/ipnlocal: regulate netmap caching via a node attribute (#19117)
Add a new tailcfg.NodeCapability (NodeAttrCacheNetworkMaps) to control whether
a node with support for caching network maps will attempt to do so. Update the
capability version to reflect this change (mainly as a safety measure, as the
control plane does not currently need to know about it).

Use the presence (or absence) of the node attribute to decide whether to create
and update a netmap cache for each profile. If caching is disabled, discard the
cached data; this allows us to use the presence of a cached netmap as an
indicator it should be used (unless explicitly overridden). Add a test that
verifies the attribute is respected. Reverse the sense of the environment knob
to be true by default, with an override to disable caching at the client
regardless what the node attribute says.

Move the creation/update of the netmap cache (when enabled) until after
successfully applying the network map, to reduce the possibility that we will
cache (and thus reuse after a restart) a network map that fails to correctly
configure the client.

Updates #12639

Change-Id: I1df4dd791fdb485c6472a9f741037db6ed20c47e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-04-01 15:02:53 -07:00
Alex Chan
4ace87a965 net,tsnet: fix the capitalisation of "Wireshark"
See https://www.wireshark.org/; there's no intercapped S.

Updates #cleanup

Change-Id: I7c89a3fc6fb0436d0ce0e25a620bde7e310e89d2
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-26 19:39:29 +00:00
Alex Valiushko
330a17b7d7 net/batching: use vectored writes on Linux (#19054)
On Linux batching.Conn will now write a vector of
coalesced buffers via sendmmsg(2) instead of copying
fragments into a single buffer.

Scatter-gather I/O has been available on Linux since the
earliest days (reworked in 2.6.24). Kernel passes fragments
to the driver if it supports it, otherwise linearizes
upon receiving the data.

Removing the copy overhead from userspace yields up to 4-5%
packet and bitrate improvement on Linux with GSO enabled:
46Gb/s 4.4m pps vs 44Gb/s 4.2m pps w/32 Peer Relay client flows.

Updates tailscale/corp#36989


Change-Id: Idb2248d0964fb011f1c8f957ca555eab6a6a6964

Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
2026-03-25 16:38:54 -07:00
Greg Steuck
954a2dfd31 net/dns: fix duplicate search line entries (OpenBSD, primarily)
Fixes #12360

Signed-off-by: Greg Steuck <greg@nest.cx>
2026-03-25 10:19:02 -07:00
Jordan Whited
f0ba1f3909 net/udprelay: remove experimental label from package docs
Update #cleanup

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-24 10:41:17 -07:00
Jordan Whited
9c36a71a90 feature/*,net/tstun: add tundev_txq_drops clientmetric on Linux
By polling RTM_GETSTATS via netlink. RTM_GETSTATS is a relatively
efficient and targeted (single device) polling method available since
Linux v4.7.

The tundevstats "feature" can be extended to other platforms in the
future, and it's trivial to add new rtnl_link_stats64 counters on
Linux.

Updates tailscale/corp#38181

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-24 09:44:58 -07:00
Alex Chan
1d0fde6fc2 all: use bart.Lite instead of bart.Table where appropriate
When we don't care about the payload value and are just checking whether
a set contains an IP/prefix, we can use `bart.Lite` for the same lookup
times but a lower memory footprint.

Fixes #19075

Change-Id: Ia709e8b718666cc61ea56eac1066467ae0b6e86c
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-03-24 14:45:23 +00:00
Brendan Creane
0b4c0f2080 net/dns/resolver: treat DNS REFUSED responses as soft errors in forwarder race (#19053)
When racing multiple upstream DNS resolvers, a REFUSED (RCode 5) response
from a broken or misconfigured resolver could win the race and be returned
to the client before healthier resolvers had a chance to respond with a
valid answer. This caused complete DNS failure in cases where, e.g., a
broken upstream resolver returned REFUSED quickly while a working resolver
(such as 1.1.1.1) was still responding.

Previously, only SERVFAIL (RCode 2) was treated as a soft error. REFUSED
responses were returned as successful bytes and could win the race
immediately. This change also treats REFUSED as a soft error in the UDP
and TCP forwarding paths, so the race continues until a better answer
arrives. If all resolvers refuse, the first REFUSED response is returned
to the client.

Additionally, SERVFAIL responses from upstream resolvers are now returned
verbatim to the client rather than replaced with a locally synthesized
packet. Synthesized SERVFAIL responses were authoritative and guaranteed
to include a question section echoing the original query; upstream
responses carry no such guarantees but may include extended error
information (e.g. RFC 8914 extended DNS errors) that would otherwise
be lost.

Fixes #19024

Signed-off-by: Brendan Creane <bcreane@gmail.com>
2026-03-23 10:40:05 -07:00
Claus Lensbøl
85bb5f84a5 wgengine/magicsock,control/controlclient: do not overwrite discokey with old key (#18606)
When a client starts up without being able to connect to control, it
sends its discoKey to other nodes it wants to communicate with over
TSMP. This disco key will be a newer key than the one control knows
about.

If the client that can connect to control gets a full netmap, ensure
that the disco key for the node not connected to control is not
overwritten with the stale key control knows about.

This is implemented through keeping track of mapSession and use that for
the discokey injection if it is available. This ensures that we are not
constantly resetting the wireguard connection when getting the wrong
keys from control.

This is implemented as:
 - If the key is received via TSMP:
   - Set lastSeen for the peer to now()
   - Set online for the peer to false
 - When processing new keys, only accept keys where either:
   - Peer is online
   - lastSeen is newer than existing last seen

If mapSession is not available, as in we are not yet connected to
control, punt down the disco key injection to magicsock.

Ideally, we will want to have mapSession be long lived at some point in
the near future so we only need to inject keys in one location and then
also use that for testing and loading the cache, but that is a yak for
another PR.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-20 08:56:27 -04:00
Nick Khyl
0d8d3831b9 net/dns: use the correct separator for multiple servers in the same NRPT rule on Windows
If an NRPT rule lists more than one server, those servers should be separated by a semicolon (";"),
rather than a semicolon followed by a space ("; "). Otherwise, Windows fails to parse the created
registry value, and DNS resolution may fail.

https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-gpnrpt/06088ca3-4cf1-48fa-8837-ca8d853ee1e8

Fixes #19040
Updates #15404 (enabled MagicDNS IPv6 by default, adding a second server and triggering the issue)

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2026-03-19 09:07:39 -05:00
Claus Lensbøl
2534bc3202 net/tstun: do not write when Wrapper is closed (#19038)
Two methods could deadlock during shutdown when closing the wrapper.

Ensure that the writers are aware of the wrapper being closed.

Fixes #19037

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-18 17:53:34 -04:00
Jordan Whited
31d65a909d net/batching: eliminate gso helper func indirection
These were previously swappable for historical reasons that are no
longer relevant.

Removing the indirection enables future inlining optimizations if we
simplify further.

Updates tailscale/corp#38703

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-18 10:11:33 -07:00
Jordan Whited
96dde53b43 net/{batching,udprelay},wgengine/magicsock: add SO_RXQ_OVFL clientmetrics
For the purpose of improved observability of UDP socket receive buffer
overflows on Linux.

Updates tailscale/corp#37679

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-13 14:27:03 -07:00
kari-ts
4c7c1091ba netns: add Android callback to bind socket to network (#18915)
After switching from cellular to wifi without ipv6, ForeachInterface still sees rmnet prefixes, so HaveV6 stays true, and magicsock keeps attempting ipv6 connections that either route through cellular or time out for users on wifi without ipv6

This:
-Adds SetAndroidBindToNetworkFunc, a callback to bind the socket to the selected Android Network object

Updates tailscale/tailscale#6152

Signed-off-by: kari-ts <kari@tailscale.com>
2026-03-11 12:28:28 -07:00
Jordan Whited
607d01cdae net/batching: clarify & simplify single packet read limitations
ReadFromUDPAddrPort worked if UDP GRO was unsupported, but we don't
actually want attempted usage, nor does any exist today. Future work
on tailscale/corp#37679 would have required more complexity in this
method, vs clarifying the API intents.

Updates tailscale/corp#37679

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-03-11 10:25:58 -07:00
Brad Fitzpatrick
bd2a2d53d3 all: use Go 1.26 things, run most gofix modernizers
I omitted a lot of the min/max modernizers because they didn't
result in more clear code.

Some of it's older "for x := range 123".

Also: errors.AsType, any, fmt.Appendf, etc.

Updates #18682

Change-Id: I83a451577f33877f962766a5b65ce86f7696471c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-06 13:32:03 -08:00
Brad Fitzpatrick
2a64c03c95 types/ptr: deprecate ptr.To, use Go 1.26 new
Updates #18682

Change-Id: I62f6aa0de2a15ef8c1435032c6aa74a181c25f8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 20:13:18 -08:00
Brad Fitzpatrick
2810f0c6f1 all: fix typos in comments
Fix its/it's, who's/whose, wether/whether, missing apostrophes
in contractions, and other misspellings across the codebase.

Updates #cleanup

Change-Id: I20453b81a7aceaa14ea2a551abba08a2e7f0a1d8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-05 13:52:01 -08:00
Claus Lensbøl
1b53c00f2b clientupdate,net/tstun: add support for OpenWrt 25.12.0 using apk (#18545)
OpenWrt is changing to using alpine like `apk` for package installation
over its previous opkg. Additionally, they are not using the same repo
files as alpine making installation fail.

Add support for the new repository files and ensure that the required
package detection system uses apk.

Updates #18535

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-05 13:39:07 -05:00
Brad Fitzpatrick
87bf76de89 net/porttrack: change magic listen address format for Go 1.26
Go 1.26's url.Parser is stricter and made our tests elsewhere fail
with this scheme because when these listen addresses get shoved
into a URL, it can't parse back out.

I verified this makes tests elsewhere pass with Go 1.26.

Updates #18682

Change-Id: I04dd3cee591aa85a9417a0bbae2b6f699d8302fa
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-04 21:57:05 -08:00
Daniel Pañeda
d58bfb8a1b net/udprelay: use GOMAXPROCS instead of NumCPU for socket count
runtime.NumCPU() returns the number of CPUs on the host, which in
containerized environments is the node's CPU count rather than the
container's CPU limit. This causes excessive memory allocation in
pods with low CPU requests running on large nodes, as each socket's
packetReadLoop allocates significant buffer memory.

Use runtime.GOMAXPROCS(0) instead, which is container-aware since
Go 1.25 and respects CPU limits set via cgroups.

Fixes #18774

Signed-off-by: Daniel Pañeda <daniel.paneda@clickhouse.com>
2026-03-04 16:30:12 -08:00
Mike O'Driscoll
2c9ffdd188 cmd/tailscale,ipn,net/netutil: remove rp_filter strict mode warnings (#18863)
PR #18860 adds firewall rules in the mangle table to save outbound packet
marks to conntrack and restore them on reply packets before the routing
decision. When reply packets have their marks restored, the kernel uses
the correct routing table (based on the mark) and the packets pass the
rp_filter check.

This makes the risk check and reverse path filtering warnings unnecessary.

Updates #3310
Fixes tailscale/corp#37846

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-03-04 14:09:19 -05:00
Brad Fitzpatrick
d42b3743b7 net/porttrack: add net.Listen wrapper to help tests allocate ports race-free
Updates tailscale/corp#27805
Updates tailscale/corp#27806
Updates tailscale/corp#37964

Change-Id: I7bb5ed7f258e840a8208e5d725c7b2f126d7ef96
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-03-03 20:56:20 -08:00
Claus Lensbøl
2d21dd46cd wgengine/magicsoc,net/tstun: put disco key advertisement behind a nob (#18857)
To be less spammy in stable, add a nob that disables the creation and
processing of TSMPDiscoKeyAdvertisements until we have a proper rollout
mechanism.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-03-03 09:04:37 -05:00
James Tucker
45305800a6 net/netmon: ignore NetBird interface on Linux
Windows and macOS are not covered by this change, as neither have safely
distinct names to make it easy to do so. This covers the requested case
on Linux.

Updates #18824

Signed-off-by: James Tucker <james@tailscale.com>
2026-02-27 17:38:52 -08:00
joshua stein
518d241700 netns,wgengine: add OpenBSD support to netns via an rtable
When an exit node has been set and a new default route is added,
create a new rtable in the default rdomain and add the current
default route via its physical interface.  When control() is
requesting a connection not go through the exit-node default route,
we can use the SO_RTABLE socket option to force it through the new
rtable we created.

Updates #17321

Signed-off-by: joshua stein <jcs@jcs.org>
2026-02-25 12:44:32 -08:00
Brad Fitzpatrick
eb819c580e cmd/containerboot, net/dns/resolver: remove unused funcs in tests
staticcheck was complaining about it on a PR
I sent: https://github.com/tailscale/tailscale/actions/runs/22408882872/job/64876543467?pr=18804

And: https://github.com/tailscale/tailscale/actions/runs/22408882872/job/64876543475?pr=18804

Updates #cleanup
Updates #18157

Change-Id: I6225481f3aab9e43ef1920aa1a12e86c5073a638
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-02-25 10:24:04 -08:00