Outbound packets produced by netstack (used by tailscaled with
--tun userspace-networking, by tsnet, and by the SOCKS5/HTTP proxies)
enter the wrapper via InjectOutbound{,PacketBuffer} and take the
injectedRead path, which bypasses Filter.RunOut.
RunOut's side effect for UDP/SCTP is to insert the reverse-flow tuple
into the connection-tracking LRU so that Filter.RunIn admits inbound
replies that no explicit ACL rule covers. Skipping it on the injected
path meant a netstack-side dial of UDP would send fine but the reply
would be dropped as "no matching rule". The kernel-TUN path was
already fine because it goes through RunOut.
Fixes #14229
Fixes #20064
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I816ef55c493a12ff4f561cd89c095559b5c2743b
suggestExitNodeLocked now ranks exit node candidates using the per-region
latency tracked by the netcheck Client (RecentRegionLatency), which merges
the reports retained in c.prev. That history is only useful for far-away
regions if it contains a full netcheck report, since incremental reports
only re-probe the home region and a handful of the fastest ones.
The full-report cadence in GetReport and the c.prev retention window were
two independent 5-min constants - the way we schedule netchecks ensured
that the history always contained a full report, but it was not a strong
contract and we did not have any checks around this.
Now the full report interval and the retention window are driven by the
same var, and a test confirms that the history contains a full report.
Updates tailscale/corp#17516
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
When recommending an exit node, suggestExitNodeLocked ranks candidates by
the latency to their home DERP region, taken from the most recent netcheck
report. But netcheck alternates between full reports, which probe every
region, and incremental reports, which only re-probe the home region and a
handful of the fastest regions. When the most recent report is incremental,
the suggestion fell back to a random pick for exit nodes that are far away.
Now we rank candidates against the best recent latency, tracked by the
`netcheck.Client` - the same data that is used to pick the preferred
DERP. It uses a history of measurements which includes a full netcheck
report, so it should cover all DERP regions.
Updates tailscale/corp#17516
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
In direct mode we write resolv.conf via a temp file and rename(2), which
preserves the source's generic etc_t label instead of net_conf_t, causing
AVC denials when NetworkManager later manages the file. Run restorecon
after the rename (Linux, SELinux-enforcing, best effort) to restore the
policy-default label.
Fixes #20149
Signed-off-by: Brendan Creane <bcreane@gmail.com>
Add support for configuring egress to destinations reachable via 4via6
subnet routes, using either the synthesized 4via6 address or the MagicDNS
name (in the form <IPv4-with-hyphens>-via-<siteID>[.*]).
Also update the Connector to validate and advertise 4via6 subnet routes.
Export net/netutil.ValidateViaPrefix so it can be reused by the Connector
validation logic.
This change only affects standalone egress proxies — ProxyGroup egress
requires IPv6 support before it can use 4via6.
Updates #19334
Change-Id: I6faecd6eb61ab55fc0cd97fe417af6b6a12fe7fc
Signed-off-by: Becky Pauley <becky@tailscale.com>
Mappings from transit IPs to real IPs are stored ephemerally in the
connector, so they're lost on restart. When we send a packet to the
connector with a transit IP it does not recognize, it sends us a TSMP
message saying so (see #19883). If we (the client) know of such a
mapping, we now re-send it to the connector so that a connection can
proceed.
Fixes tailscale/corp#34256.
Signed-off-by: Naman Sood <mail@nsood.in>
Follow-up cleanups to the IPv6 fragment extension header support added in
the previous commit:
- Document that minFragBlks is sized for IPv4 but intentionally reused by
decode6 for IPv6 fragments, where it is conservative (IPv6 fragments
carry no per-fragment IP header) and only ever rejects more later
fragments as Unknown, never fewer.
- Add a TestDecode case for a first fragment reachable only through a
chained extension header (base Next Header = Hop-by-Hop Options, which
chains to Fragment). decode6 only parses the Fragment header when it is
the base header's immediate Next Header, so this must classify as
Unknown. The test locks in that scoping decision.
Updates #20083
Updates #20140
Change-Id: Ibece03c6baf2385b0cc399f179819b08cbe921cc
Signed-off-by: James Tucker <james@tailscale.com>
decode6 didn't parse the IPv6 Fragment extension header (Next Header 44),
so any source-fragmented IPv6 packet was classified as an unknown protocol
and matched no ACL rule. The filter then silently dropped it and counted it
as an "acl" drop, even on allow-all tailnets, blackholing large UDP (DNS,
WebRTC, etc.) over a tailnet's IPv6 addresses. IPv4 fragments were already
handled by decode4.
Parse the fragment header the same way: read the first fragment's transport
ports so the filter matches it like an unfragmented packet, pass later
fragments through as ipproto.Fragment, and reject overlapping-fragment
offsets (RFC 1858) and first fragments too short to hold the transport
header as unknown.
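
For reference, the 8-byte Fragment extension header can be decoded as in the sketch below. It follows the RFC 8200 layout only. The `fragHeader` type and `parseFragHeader` function are hypothetical names, not decode6's code, and the overlap and minimum-length checks mentioned above are left out.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// fragHeader holds the decoded fields of an IPv6 Fragment extension header.
type fragHeader struct {
	NextHeader uint8  // protocol of the fragmented payload, e.g. 17 for UDP
	Offset     uint16 // fragment offset in bytes (wire value is in 8-byte units)
	More       bool   // M flag: more fragments follow
	ID         uint32 // identification shared by all fragments of a packet
}

// parseFragHeader decodes the fixed 8-byte header at the start of b.
func parseFragHeader(b []byte) (fragHeader, error) {
	if len(b) < 8 {
		return fragHeader{}, errors.New("short fragment header")
	}
	v := binary.BigEndian.Uint16(b[2:4]) // 13-bit offset, 2 reserved bits, M flag
	return fragHeader{
		NextHeader: b[0],
		Offset:     (v >> 3) * 8,
		More:       v&1 == 1,
		ID:         binary.BigEndian.Uint32(b[4:8]),
	}, nil
}

func main() {
	first := []byte{17, 0, 0x00, 0x01, 0xde, 0xad, 0xbe, 0xef}
	h, err := parseFragHeader(first)
	fmt.Printf("%+v %v\n", h, err)
	// Only a fragment with Offset == 0 carries the transport header, so
	// only that one can be matched on ports.
	fmt.Println("first fragment:", h.Offset == 0)
}
```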
Fixes #20083
Signed-off-by: Steve Avery <hello@stevenavery.com>
This removes deprecated magic-dns formats for 4via6 subnet routers.
These are superseded by the current format: Q-R-S-T-via-X.
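
A minimal sketch of how a Q-R-S-T-via-X label maps to a 4via6 address, assuming the site ID X is written in decimal and the label has already been separated from any domain suffix. `parseViaName` is a hypothetical helper, not the resolver's implementation. The address layout (fd7a:115c:a1e0:b1a::/64, then a 32-bit site ID, then the IPv4 address) follows the published 4via6 format.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
	"strconv"
	"strings"
)

// parseViaName converts a label such as "10-1-2-3-via-7" into the
// corresponding 4via6 address.
func parseViaName(label string) (netip.Addr, error) {
	v4part, site, ok := strings.Cut(label, "-via-")
	if !ok {
		return netip.Addr{}, fmt.Errorf("%q: missing -via-", label)
	}
	v4, err := netip.ParseAddr(strings.ReplaceAll(v4part, "-", "."))
	if err != nil || !v4.Is4() {
		return netip.Addr{}, fmt.Errorf("%q: bad IPv4 part", label)
	}
	id, err := strconv.ParseUint(site, 10, 32) // assumption: decimal site ID
	if err != nil {
		return netip.Addr{}, fmt.Errorf("%q: bad site ID", label)
	}
	a := netip.MustParseAddr("fd7a:115c:a1e0:b1a::").As16()
	binary.BigEndian.PutUint32(a[8:12], uint32(id)) // site ID in bytes 8..11
	b := v4.As4()
	copy(a[12:], b[:]) // IPv4 address in the last four bytes
	return netip.AddrFrom16(a), nil
}

func main() {
	fmt.Println(parseViaName("10-1-2-3-via-7"))
	fmt.Println(parseViaName("10-1-2-via-7"))
}
```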
Fixes #20053
Change-Id: I0eed1f057f856f248c4dc8ce3b751f6c7edcfbfd
Signed-off-by: Becky Pauley <becky@tailscale.com>
macOS 26.4 emits RTM_MISS on the routing socket for every failed route
lookup. skipRouteMessage never inspected the message type, so each miss
woke the monitor as a link change and triggered a netcheck. On networks
without an IPv6 default route the netcheck's IPv6 DERP probes fail and
emit more RTM_MISS messages, sustaining the loop indefinitely: netchecks
run at roughly 40x the intended rate, with sustained probe traffic and
corresponding CPU and battery cost.
RTM_MISS scales with traffic volume, not network state, and is never
the leading signal for a topology change: route withdrawals emit
RTM_DELETE synchronously before any subsequent lookup can miss, so
ignoring it loses no signal. Other routing daemons (bird, dhcpcd, frr)
ignore it as well.
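
The filtering decision can be illustrated with a small sketch. The constants carry the BSD routing-socket values from net/route.h and are redeclared here so the example builds on any platform. `wakesMonitor` is a hypothetical name and is not the signature of skipRouteMessage.

```go
package main

import "fmt"

// Routing-socket message types as defined in BSD's net/route.h,
// redeclared locally for portability of this sketch.
const (
	rtmAdd    = 0x1
	rtmDelete = 0x2
	rtmMiss   = 0x7
)

// wakesMonitor reports whether a routing message of the given type should
// be treated as a possible link change. Lookup misses are ignored because
// they track traffic volume rather than topology.
func wakesMonitor(msgType int) bool {
	return msgType != rtmMiss
}

func main() {
	for _, t := range []int{rtmAdd, rtmDelete, rtmMiss} {
		fmt.Printf("type %#x wakes monitor: %v\n", t, wakesMonitor(t))
	}
}
```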
Same fix as coder/tailscale@e956a95074.
Fixes #19324
Signed-off-by: Doug Bryant <dougbryant@anthropic.com>
When MagicDNS is enabled but no global upstream resolvers are configured,
the forwarder only handles specific suffixes and defers other names to the
system resolver. A query it has no resolver for is expected in that case, so
don't raise the dns-forward-failing warning unless a default "." route makes
Tailscale the default resolver.
Fixes #19931
Signed-off-by: Brendan Creane <bcreane@gmail.com>
Commit 2b338dd6a8 removed watchdogEngine because it was weird
(so many methods) and increasingly unnecessary after we'd cleaned up
and simplified so much of the locking.
This adds back a watchdog, but an easier to maintain one that's more
idiomatic.
Updates #19759
Change-Id: I86c458473e126c0809f37696446ce7acf4cc4eb9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Several packages built their HTTP transports with
    http.DefaultTransport.(*http.Transport).Clone()
The standard library only documents http.DefaultTransport as an
http.RoundTripper, so an application is free to replace it with a
RoundTripper that is not a *http.Transport (e.g. an instrumented or
tracing wrapper). When such an application embeds tsnet.Server, the
unchecked type assertion panics as soon as tsnet brings up its control
connection, DNS bootstrap, or log uploader.
Add netutil.NewDefaultTransport, which returns a clone of the global
when it is still the standard *http.Transport (preserving existing
behavior) and otherwise returns a fresh transport mirroring the stdlib
defaults. Route every clone site through it.
Updates #19937
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Achille Roussel <achille.roussel@gmail.com>
Introduce a new `tailscale routecheck` command which prints a report
of high-availability routers that are reachable.
This command rhymes with the `tailscale netcheck` command, but
instead of reporting on local network conditions, `routecheck` reports
on remote connectivity.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
When a connector receives a packet from a client on a transit IP that it
can't find a real IP mapping for, it drops the packet. This commit
starts notifying the client of this dropping over TSMP, so the client
can tell the connector to re-establish the transit IP-real IP binding.
Updates tailscale/corp#34256.
Signed-off-by: Naman Sood <mail@nsood.in>
In order to support a `tailscale routecheck` command, we introduce the
`/localapi/v0/routecheck` endpoint to the local API. This endpoint
returns the most recent report collected by the routecheck client.
If `force=true` is an argument in the query string, then this endpoint
will actively probe before returning the report.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The routecheck package parallels the netcheck package, where the
former checks routes and routers while the latter checks networks.
Like netcheck, it compiles reports for other systems to consume.
Historically, the client has never known whether a peer is actually
reachable. Most of the time this doesn’t matter, since the client will
want to establish a WireGuard tunnel to any given destination.
However, if the client needs to choose between two or more nodes,
then it should try to choose a node that it can reach.
Suggested exit nodes are one such example, where the client filters
out any nodes that aren’t connected to the control plane. Sometimes an
exit node will get disconnected from the control plane: when the
network between the two is unreliable or when the exit node is too
busy to keep its control connection alive. In these cases, Control
disables the Node.Online flag for the exit node and broadcasts this
across the tailnet. Arguably, the client should never have relied on
this flag, since it only makes sense in the admin console.
This patch implements an initial routecheck client that can probe
every node that your client knows about. You should not ping scan your
visible tailnet; this method is for debugging only.
This patch also introduces a new OnNetMapToggle hook, which fires when
the netmap transitions from nil to non-nil, or vice versa. This
happens either when the client receives its first MapResponse after
connecting to the control plane, or when it clears the netmap while it
is disconnecting. Routecheck uses this to wait for a valid netmap
so it knows which peers to probe.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
In case we land on this branch during a goto retry. Also, protect
Geneve offset from mutation across retries.
Fixes #19927
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In PR #19682, we introduced the traffic package which provides a
traffic.Scores.SortNodes method that uses rendezvous hashing to
break ties by equally distributing the “best” node for any given client.
This PR adds a fuzzer to make sure this algorithm is not wildly unfair.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This NodeCapability works around the UDP GSO bugs introduced by
torvalds/linux@b10b446 (v7.0-rc1). These bugs were later fixed by
torvalds/linux@78effd8 and torvalds/linux@5f17ae0 (v7.1-rc5). These
Linux kernel bugs cause mangled UDP headers and UDP checksums, resulting
in high levels of packet loss.
The aforementioned bugs have already made their way downstream into
various distros, e.g. Ubuntu 26.04 LTS. Impacted users are now dealing
with poor UDP performance in tailscaled, and in any other software that
makes use of UDP GSO.
Not all users of the affected kernels are impacted as the relevant
kernel code path sits between kernel and netdev driver, and behaviors
vary by driver/device capability.
We cannot detect impact at runtime, as this would require gathering all
netdevs, and performing loopback tests. This is invasive and in many
cases impossible.
So, we are left to choose between disabling UDP GSO for all users on
affected kernels, whether they experience real impact or not, or try
and work around the bugs. Disabling UDP GSO for a user that is not
impacted can cut max throughput in half, and consume more CPU cycles.
This commit attempts to work around the bugs by avoiding UDP GSO when
batches are small, and injecting a 1-byte sentinel tail payload when
they are large. This tail payload is smaller than "GSO size", which
sidesteps the primary trigger of all fragments in a batch being
equal in length.
The end result is slightly increased payload and packet overhead, but
functional UDP GSO for all Linux 7.0-7.1.4 users, regardless of
netdev/driver.
Updates #19777
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Add four control-plane node attributes that let us disable UDP GSO/GRO
on the magicsock UDP socket and UDP/TCP GRO on the Tailscale TUN
device.
These complement the pre-existing TS_DEBUG_DISABLE_UDP_{GRO,GSO} and
TS_TUN_DISABLE_{UDP,TCP}_GRO envknobs. They exist so we can mitigate
upstream Linux kernel regressions on a deployed fleet without
requiring a client release, after two incidents (#13041, #19777) where
buggy kernel patches landed upstream and the fix took an excessively
long time to reach downstream distros.
Knob changes are reacted to in setNetworkMapInternal / SetNetworkMap via
a comparison against a cached "last applied" value and only an actual
transition triggers work: magicsock Rebind()+ReSTUN for UDP,
ApplyGROKnobs for TUN. The TUN side is gated by buildfeatures.HasGRO and
is one-way (wireguard-go GRO disablement is sticky); re-enabling
requires a client restart.
Updates #13041
Updates #19777
Change-Id: I802993070afa659cc06809bb0bfbb7f8a0cdb273
Signed-off-by: James Tucker <james@tailscale.com>
bind() on NETLINK_ROUTE sockets does not work on Android 11+ (https://developer.android.com/identity/user-data-ids#mac-11-plus). Since system/bin/ip uses bind(), likelyHomeRouterIPHelper() always fails on Android 11+, so GatewayAndSelfIP never caches the result, causing repeated ip process spawns on every periodic ReSTUN.
This replaces the system/bin/ip fallback with a cached gateway IP pushed from Android’s ConnectivityManager via LinkProperties.getRoutes(). This is the same pattern used by UpdateLastKnownDefaultRouteInterface for the interface name (see https://github.com/tailscale/tailscale/pull/11784/). We keep the proc/net/route path as a fallback for early startup before NetworkChangeCallback has fired.
Updates tailscale/tailscale#18622
Updates tailscale/tailscale#13352
Signed-off-by: kari-ts <kari@tailscale.com>
It is sometimes useful when diagnosing subtle and specific performance
problems to rule out GRO/GSO independently and/or toggle them to
influence packet pacing.
Updates #17835
Updates tailscale/corp#31164
Signed-off-by: James Tucker <james@tailscale.com>
Add support for configuring egress to destinations reachable via 4via6
subnet routes. This change affects standalone egress proxies only; egress
ProxyGroup needs IPv6 support before it can support 4via6. Egress may
be configured using either the synthesized 4via6 address or the MagicDNS
name (in the form
<IPv4-address-with-hyphens-instead-of-dots>-via-<siteid>[.*]).
Also update the Connector to validate and advertise 4via6 subnet routes.
Export net/netutil.ValidateViaPrefix so it can be reused by the Connector
validation logic.
Updates #19334
Signed-off-by: Becky Pauley <becky@tailscale.com>
In PR tailscale/corp#30448, we originally decided to break ties using
SHA256 for our rendezvous hashing algorithm. Now that we’ve had some
experience with it, we think that FNV-1a is a better choice. It
distributes bits evenly, it’s much faster, and it doesn’t need to be
cryptographically secure. The FNV designers recommend FNV-1a over the
deprecated FNV-1.
This PR makes the switch and updates the related tests, since changing
the algorithm changes which stable pick gets selected. As of 2026-05,
this is the best time to make this change, since there are almost no
clients in the wild with traffic steering enabled.
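
As an illustration of rendezvous hashing with FNV-1a, here is a minimal sketch. The key layout (client ID, a zero byte, node ID) and the lexical tie-break are assumptions of this example and do not describe the traffic package's actual scoring.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score hashes the (client, node) pair with 64-bit FNV-1a.
func score(clientID, nodeID string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(clientID))
	h.Write([]byte{0}) // separator so "ab"+"c" and "a"+"bc" hash differently
	h.Write([]byte(nodeID))
	return h.Sum64()
}

// pick returns the node with the highest score for this client. The result
// does not depend on the order of nodes, and removing a node that did not
// win leaves the pick unchanged.
func pick(clientID string, nodes []string) string {
	best, bestScore := "", uint64(0)
	for i, n := range nodes {
		if s := score(clientID, n); i == 0 || s > bestScore || (s == bestScore && n < best) {
			best, bestScore = n, s
		}
	}
	return best
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c", "node-d"}
	for _, c := range []string{"client-1", "client-2", "client-3"} {
		fmt.Println(c, "->", pick(c, nodes))
	}
}
```

Because each client hashes every candidate independently, swapping the hash function changes which node wins for a given client, which is why the stable picks in the tests had to be updated.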
Updates #17366
Updates tailscale/corp#29964
Updates tailscale/corp#29966
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The traffic package contains helpers for evaluating traffic steering
scores and picking appropriate nodes. These were extracted from
ipnlocal.suggestExitNodeUsingTrafficSteering so they can be reused by
the new routecheck package to probe exit nodes in priority order.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
When tailscaled is running in userspace-networking mode behind an
exit node (e.g. as a SOCKS5 proxy), it resolves a hostname and then
dials a single resolved IP through the tunnel. If the name has both
A and AAAA, Go's net.Resolver merges them and we pick ips[0], which
on an IPv6-native host is usually AAAA. If the exit node has no IPv6
egress (or vice versa), the dial fails silently through the tunnel
and the user sees a hang.
Resolve all candidates and race connect attempts across address
families with a 300ms happy-eyeballs delay, matching Go's net.Dialer
default and the existing pattern in net/dnscache (commit ee0a03b14).
First success wins; losers are cancelled and any conns they produce
are closed. A failBoost channel wakes the launcher when a connect
fails fast (e.g. ICMP "no route" via the tunnel) so we don't sit on
the 300ms timer when the answer is already known.
userDialResolve is refactored into userDialResolveAll (returns the
full candidate list) plus a thin single-IP wrapper for callers like
UserDialPlan that don't race. UserDial's per-IP dispatch (netstack
vs peer dialer vs SystemDial vs std) is extracted to dialOneUser so
each candidate can route correctly on its own merits.
Also fix serveDial in localapi to pass the original hostname to
UserDial rather than a pre-resolved IP, so the race can fire.
This fix is single-ended: it works against any exit node, including
old ones, with no protocol changes. The trade-off versus filtering
on the exit-node side via PeerAPI DoH is that every dial through an
unreachable-family exit node costs one failed connect attempt per
cache window, rather than zero, which is acceptable given the
simplicity.
Fixes #19792
Fixes #13257
Change-Id: I9d7645d0034caf3ee22ecdd8070798353f77e94b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If the context given to DialContext has a shorter lifetime than the OS
TCP SYN timeout, and TCP SYNs are dropped from the path to the remote,
DialContext would never fall back to try IPv6 after IPv4.
Instead, use the normal happy eyeballs race if there is more than one
address. This does remove the implicit prioritization of IPv4 over IPv6
in cases where there is only a single IPv4 remote address.
Updates #13346
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
The codegen path for map-of-slice-of-pointer fields skipped
nil-valued entries, which dropped the key from the map.
This broke how dns.Config.Routes uses nil values as sentinels.
Fixes #19730
Fixes #19732
Fixes #19746
Fixes #19744
Change-Id: Ic6400227f4ab21b3ca0e8c0eeecf9b83d145a9ab
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
A missing hosts file is not a fatal error. We should log it, but still proceed
and create a new one instead of failing the DNS reconfiguration completely.
Fixes #19733
Signed-off-by: Nick Khyl <nickk@tailscale.com>
If another part of the client code registers a custom scheme with the
forwarder, the forwarder will check resolver addresses to see if they
match the scheme. If they do, the corresponding custom scheme handler
will be called to find the actual address for the resolver at this
moment. If the handler returns the empty string then that resolver will
be ignored.
This is useful if you want to dynamically determine where to send
certain DNS requests. It is being added to support new app connector
(conn25) work that would like to make sure it sends DNS requests to the
current connector peer in a high availability configuration.
Updates tailscale/corp#39858
Signed-off-by: Fran Bull <fran@tailscale.com>
Replace the UAPI text protocol-based wireguard configuration with
wireguard-go's new direct callback API (SetPeerLookupFunc,
SetPeerByIPPacketFunc, RemoveMatchingPeers, SetPrivateKey).
Instead of computing a trimmed wireguard config ahead of time upon
control plane updates and pushing it via UAPI, install callbacks so
wireguard-go creates peers on demand when packets arrive. This removes
all the LazyWG trimming machinery: idle peer tracking, activity maps,
noteRecvActivity callbacks, the KeepFullWGConfig control knob, and the
ts_omit_lazywg build tag.
For incoming packets, PeerLookupFunc answers wireguard-go's questions
about unknown public keys by looking up the peer in the full config.
For outgoing packets, PeerByIPPacketFunc (installed from
LocalBackend.lookupPeerByIP) maps destination IPs to node public keys
using the existing nodeByAddr index.
Updates tailscale/corp#12345
Change-Id: I4cba80979ac49a1231d00a01fdba5f0c2af95dd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The darwinConfigurator writes split DNS resolver files to
/etc/resolver/$SUFFIX using os.WriteFile with string concatenation.
A crafted MatchDomain value containing path traversal sequences
(e.g. "../evil") could write files outside the resolver directory.
Use os.OpenRoot to confine all file operations in SetDNS and
removeResolverFiles to the resolver directory. os.Root rejects any
path component that escapes the root, returning an error instead of
following the traversal.
Also parametrize the resolver directory path on the struct to enable
testing with t.TempDir(), and add tests.
As far as I can tell, this would require a malicious controlplane to
exploit, but still worth fixing.
Updates tailscale/corp#39751
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Add a tsdial.Dialer.UserDialPlan method that resolves an address and
reports whether the dialer would route it via Tailscale. The LocalAPI
/dial handler now uses this to skip proxying for addresses that aren't
Tailscale routes (e.g. localhost), returning a Dial-Self response with
the resolved address so the client can dial it directly. This avoids
an unnecessary round-trip through the daemon for local connections.
The client's UserDial handles the new response by dialing the resolved
address itself, and the server passes the pre-resolved IP:port for
Tailscale dials to avoid redundant DNS lookups.
Thanks to giacomo and Moyao for pointing this out!
Updates tailscale/corp#39702
Change-Id: I78d640f11ccd92f43ddd505cbb0db8fee19f43a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The cloner's codegen for map[K][]*V fields was doing a shallow
append (copying pointer values) instead of cloning each element.
This meant that cloned structs aliased the original's pointed-to
values through the map's slice entries.
Mirror the existing standalone-slice logic that checks
ContainsPointers(sliceType.Elem()) and generates per-element
cloning for pointer, interface, and struct types.
Regenerate net/dns and tailcfg which both had affected
map[...][]*dnstype.Resolver fields.
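
What per-element cloning means for such a field can be sketched by hand. The `resolver` type and `cloneRoutes` function are hypothetical stand-ins for the generated code. Keeping keys whose slice is nil is an extra assumption of this sketch.

```go
package main

import "fmt"

// resolver is a stand-in for a pointed-to element type.
type resolver struct{ Addr string }

// cloneRoutes deep-copies a map[string][]*resolver so that the result shares
// no pointers with src.
func cloneRoutes(src map[string][]*resolver) map[string][]*resolver {
	if src == nil {
		return nil
	}
	dst := make(map[string][]*resolver, len(src))
	for k, s := range src {
		if s == nil {
			dst[k] = nil // keep the key; a nil slice may be meaningful
			continue
		}
		c := make([]*resolver, len(s))
		for i, p := range s {
			if p != nil {
				v := *p // copy the value, not the pointer
				c[i] = &v
			}
		}
		dst[k] = c
	}
	return dst
}

func main() {
	src := map[string][]*resolver{"example.com.": {{Addr: "100.100.100.100"}}}
	dst := cloneRoutes(src)
	dst["example.com."][0].Addr = "changed"
	// A shallow append would have changed src as well.
	fmt.Println(src["example.com."][0].Addr)
}
```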
Fixes #19284
Signed-off-by: Andrew Dunham <andrew@tailscale.com>
The test had two problems:
1. runFileWatcher passed hardcoded "/etc/" to the inotify watcher,
but the test filesystem uses a temp directory prefix. The watcher
was watching the real /etc/, never seeing the test's file writes.
2. The test's watchFile used gonotify.NewDirWatcher which creates
goroutines that block on real inotify syscalls. These don't work
inside synctest's fake-time bubble. The test only passed standalone
by accident: gonotify walks /etc/ on startup producing fake events
that happened to trigger trample detection at the right time.
Fix the path issue by adding ActualPath to the wholeFileFS interface,
which translates logical paths (like "/etc/resolv.conf") to real
filesystem paths (respecting any test prefix). Use it in
runFileWatcher so the inotify watch targets the correct directory.
Replace gonotify in the test with a one-shot timer that synctest can
advance through fake time, reliably triggering the trample check.
Fixes #19400
Change-Id: Idb252881ec24d0ab3b3c1d154dbdaf532db837d4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Avery found a bunch of tests that fail with -count=2.
Updates tailscale/corp#40176 (tracks making our CI detect them)
Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>
Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through
the control client, noise transport, DERP, and wgengine layers so
that platforms like Android can inject user-installed CA certificates
into Go's TLS verification.
tlsdial.Config now honors base.RootCAs as additional trusted roots,
tried after system roots and before the baked-in LetsEncrypt fallback.
SetConfigExpectedCert gets the same treatment for domain-fronted DERP.
The Android client will set sys.ExtraRootCAs with a pool built from
x509.SystemCertPool + user-installed certs obtained via the Android
KeyStore API, replacing the current SSL_CERT_DIR environment variable
approach.
Updates #8085
Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Investigating battery costs on a busy tailnet I noticed a large number
of nodes regularly reconnecting to control and DERP. In one case I was
able to analyze closely, `pmset` reported the every-minute wake-ups being
triggered by Bluetooth. As a side effect, the node was reconnecting to
control constantly, and this was at times visible to peers as well.
Three changes here improve the situation:
- Short time jumps (less than 10 minutes) no longer produce "major
network change" events, and so do not trigger full rebind/reconnect.
- Many "incidental" fields on interfaces are ignored, like MTU, flags
and so on - if the route is still good, the rest should be manageable.
- Additional log output will provide more detail about the cause of
major network change events.
Updates #3363
Signed-off-by: James Tucker <james@tailscale.com>
The cloner and viewer code generators didn't handle named types
with basic underlying types (map/slice) that have their own Clone
or View methods. For example, a type like:
    type Map map[string]any
    func (m Map) Clone() Map { ... }
    func (m Map) View() MapView { ... }
When used as a struct field, the cloner would descend into the
underlying map[string]any and fail because it can't clone the any
(interface{}) value type. Similarly, the viewer would try to create
a MapFnOf view and fail.
Fix the cloner to check for a Clone method on the named type
before falling through to the underlying type handling.
Fix the viewer to check for a View method on named map/slice types,
so the type author can provide a purpose-built safe view that
doesn't leak raw any values. Named map/slice types without a View
method fall through to normal handling, which correctly rejects
types like map[string]any as unsupported.
Updates tailscale/corp#39502 (needed by tailscale/corp#39594)
Change-Id: Iaef0192a221e02b4b8e409c99ef8398090327744
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.
The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
slice/map literal with bad name field values
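
The kind of check involved can be sketched with a simple predicate. The allowed character set here (ASCII letters, digits, hyphen, underscore) is an assumption made for illustration, and `needsQuoting` is a hypothetical name. The analyzer's real rule set may be broader or narrower.

```go
package main

import "fmt"

// needsQuoting reports whether a subtest name contains any character outside
// the assumed safe set, meaning it may need quoting or escaping when passed
// to "go test -run".
func needsQuoting(name string) bool {
	for _, r := range name {
		switch {
		case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
		case r == '-', r == '_':
		default:
			return true
		}
	}
	return false
}

func main() {
	for _, name := range []string{"valid-name", "has space", "regex(meta)", "quote's"} {
		fmt.Printf("%q flagged: %v\n", name, needsQuoting(name))
	}
}
```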
Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.
Updates #19242
Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a new tailcfg.NodeCapability (NodeAttrCacheNetworkMaps) to control whether
a node with support for caching network maps will attempt to do so. Update the
capability version to reflect this change (mainly as a safety measure, as the
control plane does not currently need to know about it).
Use the presence (or absence) of the node attribute to decide whether to create
and update a netmap cache for each profile. If caching is disabled, discard the
cached data; this allows us to use the presence of a cached netmap as an
indicator it should be used (unless explicitly overridden). Add a test that
verifies the attribute is respected. Reverse the sense of the environment knob
to be true by default, with an override to disable caching at the client
regardless of what the node attribute says.
Move the creation/update of the netmap cache (when enabled) until after
successfully applying the network map, to reduce the possibility that we will
cache (and thus reuse after a restart) a network map that fails to correctly
configure the client.
Updates #12639
Change-Id: I1df4dd791fdb485c6472a9f741037db6ed20c47e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
On Linux batching.Conn will now write a vector of
coalesced buffers via sendmmsg(2) instead of copying
fragments into a single buffer.
Scatter-gather I/O has been available on Linux since the
earliest days (reworked in 2.6.24). Kernel passes fragments
to the driver if it supports it, otherwise linearizes
upon receiving the data.
Removing the copy overhead from userspace yields up to 4-5%
packet and bitrate improvement on Linux with GSO enabled:
46Gb/s 4.4m pps vs 44Gb/s 4.2m pps w/32 Peer Relay client flows.
Updates tailscale/corp#36989
Change-Id: Idb2248d0964fb011f1c8f957ca555eab6a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>