tailscale

mirror of https://github.com/tailscale/tailscale.git synced 2026-06-23 15:31:47 -04:00

Author	SHA1	Message	Date
Brad Fitzpatrick	e0677ccc76	net/tstun, wgengine/filter: track UDP flow state for injected packets Outbound packets produced by netstack (used by tailscaled with --tun userspace-networking, by tsnet, and by the SOCKS5/HTTP proxies) enter the wrapper via InjectOutbound{,PacketBuffer} and take the injectedRead path, which bypasses Filter.RunOut. RunOut's side effect for UDP/SCTP is to insert the reverse-flow tuple into the connection-tracking LRU so that Filter.RunIn admits inbound replies that no explicit ACL rule covers. Skipping it on the injected path meant a netstack-side dial of UDP would send fine but the reply would be dropped as "no matching rule". The kernel-TUN path was already fine because it goes through RunOut. Fixes #14229 Fixes #20064 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Change-Id: I816ef55c493a12ff4f561cd89c095559b5c2743b	2026-06-22 15:57:37 -07:00
Anton Tolchanov	e9e209673e	net/netcheck: ensure recent history has a full report suggestExitNodeLocked now ranks exit node candidates using the per-region latency tracked by the netcheck Client (RecentRegionLatency), which merges the reports retained in c.prev. That history is only useful for far-away regions if it contains a full netcheck report, since incremental reports only re-probe the home region and a handful of the fastest ones. The full-report cadence in GetReport and the c.prev retention window were two independent 5-min constants - the way we schedule netchecks ensured that the history always contained a full report, but it was not a strong contract and we did not have any checks around this. Now the full report interval and the retention window are driven by the same var, and a test confirms that the history contains a full report. Updates tailscale/corp#17516 Signed-off-by: Anton Tolchanov <anton@tailscale.com>	2026-06-22 12:28:09 +02:00
Anton Tolchanov	f442cda999	ipn/ipnlocal: consider all DERP regions for exit node recommendations When recommending an exit node, suggestExitNodeLocked ranks candidates by the latency to their home DERP region, taken from the most recent netcheck report. But netcheck alternates between full reports, which probe every region, and incremental reports, which only re-probe the home region and a handful of the fastest regions. When the most recent report is incremental, the suggestion fell back to a random choice for exit nodes that are far away. Now we rank candidates against the best recent latency, tracked by the `netcheck.Client` - the same data that is used to pick the preferred DERP. It uses a history of measurements which includes a full netcheck report, so should cover all DERP regions. Updates tailscale/corp#17516 Signed-off-by: Anton Tolchanov <anton@tailscale.com>	2026-06-22 12:28:09 +02:00
Brendan Creane	0861dafddf	net/dns: restore SELinux context on /etc/resolv.conf after rename (#20167) In direct mode we write resolv.conf via a temp file and rename(2), which preserves the source's generic etc_t label instead of net_conf_t, causing AVC denials when NetworkManager later manages the file. Run restorecon after the rename (Linux, SELinux-enforcing, best effort) to restore the policy-default label. Fixes #20149 Signed-off-by: Brendan Creane <bcreane@gmail.com>	2026-06-18 16:36:56 -07:00
Alex Chan	c3c2aa7093	all: don't repeat the the word "the" unnecessarily Updates #cleanup Change-Id: Ic1f430cd5dbf6cc1a385c59074a5d5cabe6fca57 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-06-18 16:32:08 +01:00
BeckyPauley	35a1a413f9	cmd/{containerboot,k8s-operator}: add 4via6 support in singleton egress (#19983) Add support for configuring egress to destinations reachable via 4via6 subnet routes, using either the synthesized 4via6 address or the MagicDNS name (in the form <IPv4-with-hyphens>-via-<siteID>[.*]). Also update the Connector to validate and advertise 4via6 subnet routes. Export net/netutil.ValidateViaPrefix so it can be reused by the Connector validation logic. This change only affects standalone egress proxies; ProxyGroup egress requires IPv6 support before it can use 4via6. Updates #19334 Change-Id: I6faecd6eb61ab55fc0cd97fe417af6b6a12fe7fc Signed-off-by: Becky Pauley <becky@tailscale.com>	2026-06-18 16:13:10 +01:00
Naman Sood	47333e9487	feature/conn25: recreate transit IP mappings when connector loses them Mappings from transit IPs to real IPs are stored ephemerally in the connector, so they're lost on restart. When we send a packet to the connector with a transit IP it does not recognize, it sends us a TSMP message saying so (see #19883). If we (the client) know of such a mapping, we now re-send it to the connector so that a connection can proceed. Fixes tailscale/corp#34256. Signed-off-by: Naman Sood <mail@nsood.in>	2026-06-17 13:50:51 -04:00
James Tucker	26b2ed0a6a	net/packet: clarify minFragBlks reuse for IPv6 and test chained ext header Follow-up cleanups to the IPv6 fragment extension header support added in the previous commit: - Document that minFragBlks is sized for IPv4 but intentionally reused by decode6 for IPv6 fragments, where it is conservative (IPv6 fragments carry no per-fragment IP header) and only ever rejects more later fragments as Unknown, never fewer. - Add a TestDecode case for a first fragment reachable only through a chained extension header (base Next Header = Hop-by-Hop Options, which chains to Fragment). decode6 only parses the Fragment header when it is the base header's immediate Next Header, so this must classify as Unknown. The test locks in that scoping decision. Updates #20083 Updates #20140 Change-Id: Ibece03c6baf2385b0cc399f179819b08cbe921cc Signed-off-by: James Tucker <james@tailscale.com>	2026-06-16 10:16:06 -07:00
Steve Avery	4c4ec3d468	net/packet,wgengine/filter: handle IPv6 fragment extension header decode6 didn't parse the IPv6 Fragment extension header (Next Header 44), so any source-fragmented IPv6 packet was classified as an unknown protocol and matched no ACL rule. The filter then silently dropped it and counted it as an "acl" drop, even on allow-all tailnets, blackholing large UDP (DNS, WebRTC, etc.) over a tailnet's IPv6 addresses. IPv4 fragments were already handled by decode4. Parse the fragment header the same way: read the first fragment's transport ports so the filter matches it like an unfragmented packet, pass later fragments through as ipproto.Fragment, and reject overlapping-fragment offsets (RFC 1858) and first fragments too short to hold the transport header as unknown. Fixes #20083 Signed-off-by: Steve Avery <hello@stevenavery.com>	2026-06-15 11:18:00 -07:00
Alex Chan	abe5fbbf49	all: make this spelling mistake non-existant Updates #cleanup Change-Id: I088aa91218354f6208190c8f6673f9c5a98e65fc Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-06-11 10:37:50 +01:00
BeckyPauley	60b935e30f	net/dns/resolver: remove deprecated 4via6 magic-dns formats (#20057) This removes deprecated magic-dns formats for 4via6 subnet routers. These are superseded by the current format: Q-R-S-T-via-X. Fixes #20053 Change-Id: I0eed1f057f856f248c4dc8ce3b751f6c7edcfbfd Signed-off-by: Becky Pauley <becky@tailscale.com>	2026-06-09 18:10:47 +01:00
Doug Bryant	2767100bc2	net/netmon: skip RTM_MISS route messages on darwin (#20050) macOS 26.4 emits RTM_MISS on the routing socket for every failed route lookup. skipRouteMessage never inspected the message type, so each miss woke the monitor as a link change and triggered a netcheck. On networks without an IPv6 default route the netcheck's IPv6 DERP probes fail and emit more RTM_MISS messages, sustaining the loop indefinitely: netchecks run at roughly 40x the intended rate, with sustained probe traffic and corresponding CPU and battery cost. RTM_MISS scales with traffic volume, not network state, and is never the leading signal for a topology change: route withdrawals emit RTM_DELETE synchronously before any subsequent lookup can miss, so ignoring it loses no signal. Other routing daemons (bird, dhcpcd, frr) ignore it as well. Same fix as coder/tailscale@e956a95074. Fixes #19324 Signed-off-by: Doug Bryant <dougbryant@anthropic.com>	2026-06-08 10:45:13 -07:00
BeckyPauley	98f1ac0880	cmd/k8s-operator, net/netutil: revert 4via6 changes (#19990) Reverts "support 4via6 in egress proxy and connector" (#19863). Updates #19334 Signed-off-by: Becky Pauley <becky@tailscale.com>	2026-06-03 20:20:36 +01:00
Brendan Creane	b26dadf1b5	net/dns/resolver: skip DNS health warning when doing split DNS (#19959) When MagicDNS is enabled but no global upstream resolvers are configured, the forwarder only handles specific suffixes and defers other names to the system resolver. A query it has no resolver for is expected in that case, so don't raise the dns-forward-failing warning unless a default "." route makes Tailscale the default resolver. Fixes #19931 Signed-off-by: Brendan Creane <bcreane@gmail.com>	2026-06-03 09:14:48 -07:00
Brad Fitzpatrick	52400dc6f4	ipn/ipnlocal: add back a watchdog after earlier removal from engine Commit `2b338dd6a8` removed watchdogEngine because it was weird (so many methods) and increasingly unnecessary after we'd cleaned up and simplified so much of the locking. This adds back a watchdog, but an easier to maintain one that's more idiomatic. Updates #19759 Change-Id: I86c458473e126c0809f37696446ce7acf4cc4eb9 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-06-02 11:57:12 -07:00
Achille Roussel	7f3bbc9865	net/netutil: add NewDefaultTransport to avoid http.DefaultTransport panics Several packages built their HTTP transports with http.DefaultTransport.(*http.Transport).Clone(). The standard library only documents http.DefaultTransport as an http.RoundTripper, so an application is free to replace it with a RoundTripper that is not a *http.Transport (e.g. an instrumented or tracing wrapper). When such an application embeds tsnet.Server, the unchecked type assertion panics as soon as tsnet brings up its control connection, DNS bootstrap, or log uploader. Add netutil.NewDefaultTransport, which returns a clone of the global when it is still the standard *http.Transport (preserving existing behavior) and otherwise returns a fresh transport mirroring the stdlib defaults. Route every clone site through it. Updates #19937 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Achille Roussel <achille.roussel@gmail.com>	2026-06-01 12:28:36 -07:00
Simon Law	2d6844c565	cmd/tailscale/cli: add routecheck command (#19641) Introduce a new `tailscale routecheck` command which prints a report of high-availability routers that are reachable. This command rhymes with the `tailscale netcheck` command, but instead of reporting on local network conditions, `routecheck` reports on remote connectivity. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 11:50:24 -07:00
Naman Sood	da51072b98	feature/conn25: send TSMP message to client for no IP mapping on connector When a connector receives a packet from a client on a transit IP that it can't find a real IP mapping for, it drops the packet. This commit starts notifying the client of this dropping over TSMP, so the client can tell the connector to re-establish the transit IP-real IP binding. Updates tailscale/corp#34256. Signed-off-by: Naman Sood <mail@nsood.in>	2026-06-01 14:46:27 -04:00
Simon Law	2ee9eacb94	client/local,ipn/localapi: add /localapi/v0/routecheck endpoint (#19640) In order to support a `tailscale routecheck` command, we introduce the `/localapi/v0/routecheck` endpoint to the local API. This endpoint returns the most recent report collected by the routecheck client. If `force=true` is an argument in the query string, then this endpoint will actively probe before returning the report. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 11:06:14 -07:00
Simon Law	28801674a6	net/routecheck: introduce new package for checking peer reachability (#19639) The routecheck package parallels the netcheck package, where the former checks routes and routers while the latter checks networks. Like netcheck, it compiles reports for other systems to consume. Historically, the client has never known whether a peer is actually reachable. Most of the time this doesn’t matter, since the client will want to establish a WireGuard tunnel to any given destination. However, if the client needs to choose between two or more nodes, then it should try to choose a node that it can reach. Suggested exit nodes are one such example, where the client filters out any nodes that aren’t connected to the control plane. Sometimes an exit node will get disconnected from the control plane: when the network between the two is unreliable or when the exit node is too busy to keep its control connection alive. In these cases, Control disables the Node.Online flag for the exit node and broadcasts this across the tailnet. Arguably, the client should never have relied on this flag, since it only makes sense in the admin console. This patch implements an initial routecheck client that can probe every node that your client knows about. You should not ping scan your visible tailnet; this method is for debugging only. This patch also introduces a new OnNetMapToggle hook, which fires when the netmap transitions from nil to non-nil, or vice versa. This happens either when the client receives its first MapResponse after connecting to the control plane, or when it clears the netmap while it is disconnecting. Routecheck uses this to wait for a valid netmap so it knows which peers to probe. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 10:33:08 -07:00
Jordan Whited	8a294e3c34	net/batching: reset Buffers len in WriteBatchTo In case we land on this branch during a goto retry. Also, protect Geneve offset from mutation across retries. Fixes #19927 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2026-05-31 06:12:53 -07:00
Simon Law	5d935c8900	net/traffic: add fuzz test for sorting nodes by traffic score (#19893) In PR #19682, we introduced the traffic package which provides a traffic.Scores.SortNodes method that uses rendezvous hashing to break ties by equally distributing the “best” node for any given client. This PR adds a fuzzer to make sure this algorithm is not wildly unfair. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-05-29 11:55:49 -07:00
Jordan Whited	8b58bd6c64	net/batching: implement NodeAttrNeverGSOEqualTail This NodeCapability works around the UDP GSO bugs introduced by torvalds/linux@b10b446 (v7.0-rc1). These bugs were later fixed by torvalds/linux@78effd8 and torvalds/linux@5f17ae0 (v7.1-rc5). These Linux kernel bugs cause mangled UDP headers and UDP checksums, resulting in high levels of packet loss. The aforementioned bugs have already made their way downstream into various distros, e.g. Ubuntu 26.04 LTS. Impacted users are now dealing with poor UDP performance in tailscaled, and in any other software that makes use of UDP GSO. Not all users of the affected kernels are impacted as the relevant kernel code path sits between kernel and netdev driver, and behaviors vary by driver/device capability. We cannot detect impact at runtime, as this would require gathering all netdevs, and performing loopback tests. This is invasive and in many cases impossible. So, we are left to choose between disabling UDP GSO for all users on affected kernels, whether they experience real impact or not, or trying to work around the bugs. Disabling UDP GSO for a user that is not impacted can cut max throughput in half, and consume more CPU cycles. This commit attempts to work around the bugs by avoiding UDP GSO when batches are small, and injecting a 1-byte sentinel tail payload when they are large. This tail payload is smaller than "GSO size", which sidesteps the primary trigger of all fragments in a batch being equal in length. The end result is slightly increased payload and packet overhead, but functional UDP GSO for all Linux 7.0-7.1.4 users, regardless of netdev/driver. Updates #19777 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2026-05-29 11:36:35 -07:00
James Tucker	25b8ed8d9e	control/controlknobs,net/{batching,tstun},wgengine: add nodecaps to disable UDP & TUN GRO/GSO Add four control-plane node attributes that let us disable UDP GSO/GRO on the magicsock UDP socket and UDP/TCP GRO on the Tailscale TUN device. These complement the pre-existing TS_DEBUG_DISABLE_UDP_{GRO,GSO} and TS_TUN_DISABLE_{UDP,TCP}_GRO envknobs. They exist so we can mitigate upstream Linux kernel regressions on a deployed fleet without requiring a client release, after two incidents (#13041, #19777) where buggy kernel patches landed upstream and the fix took an excessively long time to reach downstream distros. Knob changes are reacted to in setNetworkMapInternal / SetNetworkMap via a comparison against a cached "last applied" value and only an actual transition triggers work: magicsock Rebind()+ReSTUN for UDP, ApplyGROKnobs for TUN. The TUN side is gated by buildfeatures.HasGRO and is one-way (wireguard-go GRO disablement is sticky); re-enabling requires a client restart. Updates #13041 Updates #19777 Change-Id: I802993070afa659cc06809bb0bfbb7f8a0cdb273 Signed-off-by: James Tucker <james@tailscale.com>	2026-05-27 17:10:14 -07:00
kari-ts	1a17ec1988	net/netmon: in Android, replace system/bin/ip call with cached LinkProperties gateway (#19804) bind() on NETLINK_ROUTE sockets does not work on Android 11+ (https://developer.android.com/identity/user-data-ids#mac-11-plus). Since system/bin/ip uses bind(), likelyHomeRouterIPHelper() always fails on Android 11+, so GatewayAndSelfIP never caches the result, causing repeated ip process spawns on every periodic ReSTUN. This replaces the system/bin/ip fallback with a cached gateway IP pushed from Android’s ConnectivityManager via LinkProperties.getRoutes(). This is the same pattern used by UpdateLastKnownDefaultRouteInterface for the interface name (see https://github.com/tailscale/tailscale/pull/11784/). We keep the proc/net/route path as a fallback for early startup before NetworkChangeCallback has fired. Updates tailscale/tailscale#18622 Updates tailscale/tailscale#13352 Signed-off-by: kari-ts <kari@tailscale.com>	2026-05-27 15:42:48 -07:00
James Tucker	dea49bb4da	net/batching: add envknobs to disable UDP GRO & GSO It is sometimes useful when diagnosing subtle and specific performance problems to rule out GRO/GSO independently and/or toggle them to influence packet pacing. Updates #17835 Updates tailscale/corp#31164 Signed-off-by: James Tucker <james@tailscale.com>	2026-05-27 12:05:00 -07:00
BeckyPauley	0ed6da2826	cmd/k8s-operator, net/netutil: support 4via6 in egress proxy and connector (#19863) Add support for configuring egress to destinations reachable via 4via6 subnet routes. This change affects the standalone egress proxy only; egress ProxyGroup needs IPv6 support before being able to support 4via6. Egress may be configured using either the synthesized 4via6 address or the MagicDNS name (in the form <IPv4-address-with-hyphens-instead-of-dots>-via-<siteid>[.*]). Also update the Connector to validate and advertise 4via6 subnet routes. Export net/netutil.ValidateViaPrefix so it can be reused by the Connector validation logic. Updates #19334 Signed-off-by: Becky Pauley <becky@tailscale.com>	2026-05-27 10:54:35 +01:00
Adrian Dewhurst	5d8f401956	net/dns: fix handling non-IP single split DNS Fixes #19834 Change-Id: I4d48efed00cd080b14c6fd713ff21e53a5a6ee3c Signed-off-by: Adrian Dewhurst <adrian@tailscale.com>	2026-05-22 20:45:58 -04:00
Simon Law	7dabebc691	net/traffic: switch rendezvous hashing from SHA256 to FNV-1a (#19821) In PR tailscale/corp#30448, we originally decided to break ties using SHA256 for our rendezvous hashing algorithm. Now that we’ve had some experience with it, we think that FNV-1a is a better choice. It distributes bits evenly, it’s much faster, and it doesn’t need to be cryptographically secure. The FNV designers recommend FNV-1a over the deprecated FNV-1. This PR makes the switch and updates the related tests, since changing the algorithm changes which stable pick gets selected. As of 2026-05, this is the best time to make this change, since there are almost no clients in the wild with traffic steering enabled. Updates #17366 Updates tailscale/corp#29964 Updates tailscale/corp#29966 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-05-21 10:11:59 -07:00
Simon Law	7ebca58042	net/traffic,ipn/ipnlocal: extract traffic steering utilities (#19682) The traffic package contains helpers for evaluating traffic steering scores and picking appropriate nodes. These were extracted from ipnlocal.suggestExitNodeUsingTrafficSteering so they can be reused by the new routecheck package to probe exit nodes in priority order. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-05-21 08:28:27 -07:00
Brad Fitzpatrick	f3a117e813	net/tsdial: run happy eyeballs across A and AAAA in UserDial When tailscaled is running in userspace-networking mode behind an exit node (e.g. as a SOCKS5 proxy), it resolves a hostname and then dials a single resolved IP through the tunnel. If the name has both A and AAAA, Go's net.Resolver merges them and we pick ips[0], which on an IPv6-native host is usually AAAA. If the exit node has no IPv6 egress (or vice versa), the dial fails silently through the tunnel and the user sees a hang. Resolve all candidates and race connect attempts across address families with a 300ms happy-eyeballs delay, matching Go's net.Dialer default and the existing pattern in net/dnscache (commit `ee0a03b14`). First success wins; losers are cancelled and any conns they produce are closed. A failBoost channel wakes the launcher when a connect fails fast (e.g. ICMP "no route" via the tunnel) so we don't sit on the 300ms timer when the answer is already known. userDialResolve is refactored into userDialResolveAll (returns the full candidate list) plus a thin single-IP wrapper for callers like UserDialPlan that don't race. UserDial's per-IP dispatch (netstack vs peer dialer vs SystemDial vs std) is extracted to dialOneUser so each candidate can route correctly on its own merits. Also fix serveDial in localapi to pass the original hostname to UserDial rather than a pre-resolved IP, so the race can fire. This fix is single-ended: it works against any exit node, including old ones, with no protocol changes. The trade-off versus filtering on the exit-node side via PeerAPI DoH is that every dial through an unreachable-family exit node costs one failed connect attempt per cache window, rather than zero, which is acceptable given the simplicity. Fixes #19792 Fixes #13257 Change-Id: I9d7645d0034caf3ee22ecdd8070798353f77e94b Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-20 18:35:55 -07:00
Claus Lensbøl	ee0a03b140	net/dnscache: run happy eyeballs with more than one dest IP (#19770) If the context given to DialContext has a shorter lifetime than the OS TCP SYN timeout, and TCP SYNs are dropped from the path to the remote, DialContext would never fall back to try IPv6 after IPv4. Instead, use the normal happy eyeballs race if there is more than one address. This does remove the implicit prioritization of IPv4 over IPv6 in cases where there is only a single IPv4 remote address. Updates #13346 Signed-off-by: Claus Lensbøl <claus@tailscale.com>	2026-05-19 12:59:11 -04:00
Fernando Serboncini	2a06fb66d0	cmd/cloner: preserve nil-valued entries when cloning map (#19749) The codegen path for map-of-slice-of-pointer fields skipped nil-valued entries, which dropped the key from the map. This broke how dns.Config.Routes uses nil values as sentinels. Fixes #19730 Fixes #19732 Fixes #19746 Fixes #19744 Change-Id: Ic6400227f4ab21b3ca0e8c0eeecf9b83d145a9ab Signed-off-by: Fernando Serboncini <fserb@tailscale.com>	2026-05-14 10:30:59 -04:00
Nick Khyl	32f984f54c	net/dns: create a new hosts file if it doesn't exist on Windows A missing hosts file is not a fatal error. We should log it, but still proceed and create a new one instead of failing the DNS reconfiguration completely. Fixes #19733 Signed-off-by: Nick Khyl <nickk@tailscale.com>	2026-05-13 16:10:36 -05:00
Brad Fitzpatrick	883d4fd2cd	wgengine/netstack, net/ping: stop using pro-bing and use our net/ping instead Fixes #19633 Fixes #13760 Change-Id: I0fa9423523a3a0fb1dfcde57de0f26e51723ff97 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-04 14:05:24 -07:00
Fran Bull	bdf3419e7d	net/dns: add custom scheme resolvers If another part of the client code registers a custom scheme with the forwarder, the forwarder will check resolver addresses to see if they match the scheme. If they do, the corresponding custom scheme handler will be called to find the actual address for the resolver at this moment. If the handler returns the empty string then that resolver will be ignored. This is useful if you want to dynamically determine where to send certain DNS requests. It is being added to support new app connector (conn25) work that would like to make sure it sends DNS requests to the current connector peer in a high availability configuration. Updates tailscale/corp#39858 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-05-01 14:01:10 -07:00
Brad Fitzpatrick	f343b496c3	wgengine, all: remove LazyWG, use wireguard-go callback API for on-demand peers Replace the UAPI text protocol-based wireguard configuration with wireguard-go's new direct callback API (SetPeerLookupFunc, SetPeerByIPPacketFunc, RemoveMatchingPeers, SetPrivateKey). Instead of computing a trimmed wireguard config ahead of time upon control plane updates and pushing it via UAPI, install callbacks so wireguard-go creates peers on demand when packets arrive. This removes all the LazyWG trimming machinery: idle peer tracking, activity maps, noteRecvActivity callbacks, the KeepFullWGConfig control knob, and the ts_omit_lazywg build tag. For incoming packets, PeerLookupFunc answers wireguard-go's questions about unknown public keys by looking up the peer in the full config. For outgoing packets, PeerByIPPacketFunc (installed from LocalBackend.lookupPeerByIP) maps destination IPs to node public keys using the existing nodeByAddr index. Updates tailscale/corp#12345 Change-Id: I4cba80979ac49a1231d00a01fdba5f0c2af95dd8 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-29 19:46:19 -07:00
Andrew Dunham	33714211c8	net/dns: use os.Root to prevent path traversal in darwin resolver The darwinConfigurator writes split DNS resolver files to /etc/resolver/$SUFFIX using os.WriteFile with string concatenation. A crafted MatchDomain value containing path traversal sequences (e.g. "../evil") could write files outside the resolver directory. Use os.OpenRoot to confine all file operations in SetDNS and removeResolverFiles to the resolver directory. os.Root rejects any path component that escapes the root, returning an error instead of following the traversal. Also parametrize the resolver directory path on the struct to enable testing with t.TempDir(), and add tests. As far as I can tell, this would require a malicious control plane to exploit, but it is still worth fixing. Updates tailscale/corp#39751 Signed-off-by: Andrew Dunham <andrew@tailscale.com>	2026-04-28 11:08:22 -04:00
Brad Fitzpatrick	0e10a3f580	net/tsdial, ipn/localapi, client/local: let clients dial non-Tailscale addresses directly Add a tsdial.Dialer.UserDialPlan method that resolves an address and reports whether the dialer would route it via Tailscale. The LocalAPI /dial handler now uses this to skip proxying for addresses that aren't Tailscale routes (e.g. localhost), returning a Dial-Self response with the resolved address so the client can dial it directly. This avoids an unnecessary round-trip through the daemon for local connections. The client's UserDial handles the new response by dialing the resolved address itself, and the server passes the pre-resolved IP:port for Tailscale dials to avoid redundant DNS lookups. Thanks to giacomo and Moyao for pointing this out! Updates tailscale/corp#39702 Change-Id: I78d640f11ccd92f43ddd505cbb0db8fee19f43a6 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-27 09:33:27 -07:00
Andrew Dunham	d52ae45e9b	cmd/cloner: deep-clone pointer elements in map-of-slice values The cloner's codegen for map[K][]V fields was doing a shallow append (copying pointer values) instead of cloning each element. This meant that cloned structs aliased the original's pointed-to values through the map's slice entries. Mirror the existing standalone-slice logic that checks ContainsPointers(sliceType.Elem()) and generates per-element cloning for pointer, interface, and struct types. Regenerate net/dns and tailcfg which both had affected map[...][]dnstype.Resolver fields. Fixes #19284 Signed-off-by: Andrew Dunham <andrew@tailscale.com>	2026-04-17 11:36:05 -04:00
Brad Fitzpatrick	49eb1b5d26	net/dns: fix TestDNSTrampleRecovery failure under flakestress The test had two problems: 1. runFileWatcher passed hardcoded "/etc/" to the inotify watcher, but the test filesystem uses a temp directory prefix. The watcher was watching the real /etc/, never seeing the test's file writes. 2. The test's watchFile used gonotify.NewDirWatcher which creates goroutines that block on real inotify syscalls. These don't work inside synctest's fake-time bubble. The test only passed standalone by accident: gonotify walks /etc/ on startup producing fake events that happened to trigger trample detection at the right time. Fix the path issue by adding ActualPath to the wholeFileFS interface, which translates logical paths (like "/etc/resolv.conf") to real filesystem paths (respecting any test prefix). Use it in runFileWatcher so the inotify watch targets the correct directory. Replace gonotify in the test with a one-shot timer that synctest can advance through fake time, reliably triggering the trample check. Fixes #19400 Change-Id: Idb252881ec24d0ab3b3c1d154dbdaf532db837d4 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-14 06:55:35 -07:00
Brad Fitzpatrick	9fbe4b3ed2	all: fix six tests that failed with -count=2 Avery found a bunch of tests that fail with -count=2. Updates tailscale/corp#40176 (tracks making our CI detect them) Change-Id: Ie3e4398070dd92e4fe0146badddf1254749cca20 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Co-authored-by: Avery Pennarun <apenwarr@tailscale.com>	2026-04-13 18:52:57 -07:00
Brad Fitzpatrick	a182b864ac	tsd, all: add Sys.ExtraRootCAs, plumb through TLS dial paths Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through the control client, noise transport, DERP, and wgengine layers so that platforms like Android can inject user-installed CA certificates into Go's TLS verification. tlsdial.Config now honors base.RootCAs as additional trusted roots, tried after system roots and before the baked-in LetsEncrypt fallback. SetConfigExpectedCert gets the same treatment for domain-fronted DERP. The Android client will set sys.ExtraRootCAs with a pool built from x509.SystemCertPool + user-installed certs obtained via the Android KeyStore API, replacing the current SSL_CERT_DIR environment variable approach. Updates #8085 Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-07 18:10:54 -07:00
James Tucker	21695cdbf8	ipn/ipnlocal,net/netmon: make frequent darkwake more efficient Investigating battery costs on a busy tailnet I noticed a large number of nodes regularly reconnecting to control and DERP. In one case that I was able to analyze closely, `pmset` reported every-minute wake-ups being triggered by Bluetooth. As a side effect, the node was constantly reconnecting to control, and this was at times visible to peers as well. Three changes here improve the situation: - Short time jumps (less than 10 minutes) no longer produce "major network change" events, and so do not trigger full rebind/reconnect. - Many "incidental" fields on interfaces are ignored, like MTU, flags and so on - if the route is still good, the rest should be manageable. - Additional log output will provide more detail about the cause of major network change events. Updates #3363 Signed-off-by: James Tucker <james@tailscale.com>	2026-04-06 15:46:51 -07:00
Brad Fitzpatrick	86f42ea87b	cmd/cloner, cmd/viewer: handle named map/slice types with Clone/View methods The cloner and viewer code generators didn't handle named types with basic underlying types (map/slice) that have their own Clone or View methods. For example, a type like: type Map map[string]any func (m Map) Clone() Map { ... } func (m Map) View() MapView { ... } When used as a struct field, the cloner would descend into the underlying map[string]any and fail because it can't clone the any (interface{}) value type. Similarly, the viewer would try to create a MapFnOf view and fail. Fix the cloner to check for a Clone method on the named type before falling through to the underlying type handling. Fix the viewer to check for a View method on named map/slice types, so the type author can provide a purpose-built safe view that doesn't leak raw any values. Named map/slice types without a View method fall through to normal handling, which correctly rejects types like map[string]any as unsupported. Updates tailscale/corp#39502 (needed by tailscale/corp#39594) Change-Id: Iaef0192a221e02b4b8e409c99ef8398090327744 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-05 20:20:32 -07:00
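The cloner case described in the commit above can be sketched as follows. The `Map` type and its `Clone` method follow the example in the commit message; the `Config` struct and its hand-written `Clone` are hypothetical and only show the shape of output one would expect once the generator defers to the named type's own method instead of descending into `map[string]any`.

```go
package main

import "fmt"

// Map is a named map type that supplies its own Clone method,
// as in the commit message's example.
type Map map[string]any

// Clone returns a shallow copy of m; values are copied as-is.
func (m Map) Clone() Map {
	if m == nil {
		return nil
	}
	out := make(Map, len(m))
	for k, v := range m {
		out[k] = v
	}
	return out
}

// Config is a hypothetical struct that uses Map as a field.
type Config struct {
	Name  string
	Attrs Map
}

// Clone is written by hand here. It delegates to Map.Clone rather
// than trying to clone the underlying map[string]any element by
// element, which is the behavior the commit describes.
func (c *Config) Clone() *Config {
	if c == nil {
		return nil
	}
	dst := new(Config)
	*dst = *c
	dst.Attrs = c.Attrs.Clone()
	return dst
}

func main() {
	a := &Config{Name: "a", Attrs: Map{"k": 1}}
	b := a.Clone()
	b.Attrs["k"] = 2
	fmt.Println(a.Attrs["k"], b.Attrs["k"]) // prints "1 2"
}
```

Delegating to the type's own method leaves the decision about how deep to copy `any` values with the type author, which is the point the commit message makes for both Clone and View.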
Brad Fitzpatrick	5ef3713c9f	cmd/vet: add subtestnames analyzer; fix all existing violations Add a new vet analyzer that checks t.Run subtest names don't contain characters requiring quoting when re-running via "go test -run". This enforces the style guide rule: don't use spaces or punctuation in subtest names. The analyzer flags: - Direct t.Run calls with string literal names containing spaces, regex metacharacters, quotes, or other problematic characters - Table-driven t.Run(tt.name, ...) calls where tt ranges over a slice/map literal with bad name field values Also fix all 978 existing violations across 81 test files, replacing spaces with hyphens and shortening long sentence-like names to concise hyphenated forms. Updates #19242 Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-04-05 15:52:51 -07:00
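The rule enforced by the analyzer in the commit above can be illustrated with a small standalone Go sketch. It is not the analyzer itself: `badSubtestChars`, `needsQuoting`, and `fixName` are hypothetical names, and the exact character set is an assumption based on the commit's description (spaces, regex metacharacters, quotes, and similar characters).

```go
package main

import (
	"fmt"
	"strings"
)

// badSubtestChars is an assumed set of characters that make a
// subtest name awkward to pass to "go test -run": whitespace,
// quotes, regex metacharacters, and the "/" subtest separator.
const badSubtestChars = " \t\"'`\\.*+?()[]{}|^$/"

// needsQuoting reports whether name contains any assumed bad character.
func needsQuoting(name string) bool {
	return strings.ContainsAny(name, badSubtestChars)
}

// fixName applies the simplest fix mentioned in the commit message:
// runs of whitespace are replaced with single hyphens.
func fixName(name string) string {
	return strings.Join(strings.Fields(name), "-")
}

func main() {
	for _, name := range []string{"handles empty input", "handles-empty-input"} {
		fmt.Printf("%q needsQuoting=%v fixed=%q\n", name, needsQuoting(name), fixName(name))
	}
}
```

A name such as `handles-empty-input` can be passed to `go test -run` without shell quoting or regex escaping, which is the motivation the commit gives for the style rule.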
M. J. Fromberger	211ef67222	tailcfg,ipn/ipnlocal: regulate netmap caching via a node attribute (#19117) Add a new tailcfg.NodeCapability (NodeAttrCacheNetworkMaps) to control whether a node with support for caching network maps will attempt to do so. Update the capability version to reflect this change (mainly as a safety measure, as the control plane does not currently need to know about it). Use the presence (or absence) of the node attribute to decide whether to create and update a netmap cache for each profile. If caching is disabled, discard the cached data; this allows us to use the presence of a cached netmap as an indicator it should be used (unless explicitly overridden). Add a test that verifies the attribute is respected. Reverse the sense of the environment knob to be true by default, with an override to disable caching at the client regardless what the node attribute says. Move the creation/update of the netmap cache (when enabled) until after successfully applying the network map, to reduce the possibility that we will cache (and thus reuse after a restart) a network map that fails to correctly configure the client. Updates #12639 Change-Id: I1df4dd791fdb485c6472a9f741037db6ed20c47e Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>	2026-04-01 15:02:53 -07:00
Alex Chan	4ace87a965	net,tsnet: fix the capitalisation of "Wireshark" See https://www.wireshark.org/; there's no intercapped S. Updates #cleanup Change-Id: I7c89a3fc6fb0436d0ce0e25a620bde7e310e89d2 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-03-26 19:39:29 +00:00
Alex Valiushko	330a17b7d7	net/batching: use vectored writes on Linux (#19054) On Linux batching.Conn will now write a vector of coalesced buffers via sendmmsg(2) instead of copying fragments into a single buffer. Scatter-gather I/O has been available on Linux since the earliest days (reworked in 2.6.24). Kernel passes fragments to the driver if it supports it, otherwise linearizes upon receiving the data. Removing the copy overhead from userspace yields up to 4-5% packet and bitrate improvement on Linux with GSO enabled: 46Gb/s 4.4m pps vs 44Gb/s 4.2m pps w/32 Peer Relay client flows. Updates tailscale/corp#36989 Change-Id: Idb2248d0964fb011f1c8f957ca555eab6a6a6964 Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>	2026-03-25 16:38:54 -07:00
Greg Steuck	954a2dfd31	net/dns: fix duplicate search line entries (OpenBSD, primarily) Fixes #12360 Signed-off-by: Greg Steuck <greg@nest.cx>	2026-03-25 10:19:02 -07:00

1407 Commits