tailscale

mirror of https://github.com/tailscale/tailscale.git synced 2026-06-23 23:41:41 -04:00

Author	SHA1	Message	Date
Simon Law	c898aeb0d8	.github/workflows: fix `-run='^$'` quoting when skipping all tests (#19962) This bug was surfaced by #19960 because benchmarks shouldn’t have run TestListenService, but they did because PowerShell interpreted match empty string `"^$"` as beginning of string `'^'`. This patch has the Windows build run `./tool/go` binaries with bash and synchronizes it with the *nix `bench all` run. Updates #18884 Updates #19960 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 21:20:10 -07:00
Charlotte Som	7ba49cbcbb	words: add 'flops' to the list of scales floating point operations per second is a measure of computational throughput Signed-off-by: Charlotte Som <charlotte@som.codes>	2026-06-01 18:18:54 -07:00
Simon Law	b47dd932f3	cmd/tailscale/cli: use tstime constant for `tailscale routecheck` (#19957) Updates #19928 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 17:42:18 -07:00
ferrumclaudepilgrim	3f70abdc6f	cmd/tailscaled, version/distro: default to userspace-networking on Crostini cros-garcon NULL-derefs on cold-boot netlink enumeration when tailscale0 is present, preventing the Crostini container and ChromeOS Terminal from starting cleanly. This is an upstream ChromiumOS bug in cros-garcon; tailscaled can work around it by defaulting to userspace-networking mode on Crostini. Tailscale SSH continues to work via tailscaled's netstack. Users can override with --tun=tailscale0 on ChromeOS builds where cros-garcon is fixed. Crostini is detected via /opt/google/cros-containers/bin/garcon, which is present in every Crostini penguin container. ssh/tailssh extends the existing Debian default-PATH case to cover Crostini, since Crostini is Debian-based and benefits from the same SSH PATH defaults. RELNOTE: Crostini now defaults to userspace-networking. Fixes #19488 Updates #12090 Signed-off-by: ferrumclaudepilgrim <ferrumclaudepilgrim@users.noreply.github.com>	2026-06-01 17:40:07 -07:00
Brad Fitzpatrick	a6ab7efa4f	ipn/ipnlocal, cmd/tailscale/cli: auto-renew TLS certs and warn while pending The Tailscale daemon only refreshed TLS certs as a side effect of inbound TLS handshakes or "tailscale cert" CLI calls. A node that doesn't see inbound traffic during the renewal window silently rolls past expiry. Add a once-per-hour background loop on LocalBackend that enumerates Serve and Funnel HTTPS hostnames (filtered against the netmap's CertDomains so we don't poke ACME for other nodes' service hostnames) and calls the existing GetCertPEM path. The renewal decision (ARI window, then 2/3 expiry fallback) is unchanged; the loop just guarantees it runs. For visibility during initial issuance or restart with a long-expired cached cert, add a "tls-cert-pending" health Warnable that's set while ACME is in flight and no usable cached cert exists. Async renewal of a still-valid cert intentionally doesn't fire it. And then make the CLI "cert" subcommand print out a warning if it's blocking due to a cert fetch in flight, using that health info. Fixes #19911 Fixes #19912 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Change-Id: I144e46c40e957b2e879587decace32a523a6eade	2026-06-01 16:31:54 -07:00
Simon Law	92bfda580c	cmd/tailscale/cli: fix time in `tailscale routecheck` (#19956) When running `tailscale netcheck`, the reported timestamp used to be in UTC and formatted according to RFC 3339 with a `T` to separate the date from the time: sfllaw@h2co3:~$ tailscale netcheck | head -n3 Report: * Time: 2026-06-01T21:12:32.252620138Z This is machine-readable time leaking out to the user interface. Times in normal commands are formatted for humans to read: sfllaw@h2co3:~$ date Mon 01 Jun 2026 02:39:14 PM PDT sfllaw@h2co3:~$ journalctl -t tailscaled | tail -n1 Jun 01 14:35:21 h2co3 tailscaled[3328921]: wgengine: sending TSMP disco key advertisement to 100.90.144.102 sfllaw@h2co3:~$ timedatectl show Timezone=America/Los_Angeles LocalRTC=no CanNTP=yes NTP=yes NTPSynchronized=yes TimeUSec=Mon 2026-06-01 14:38:32 PDT RTCTimeUSec=Mon 2026-06-01 14:38:32 PDT sfllaw@h2co3:~$ uptime --since 2026-05-15 07:37:45 This PR makes the times printed by the CLI commands consistent: - For `tailscale routecheck`, it now prints local time as `2026-05-15 07:37:45-07:00`. - For `netlogfmt`, it has always printed local time with a space, but now includes the time zone. - All machine-readable outputs continue to be standard RFC 3339 in UTC, i.e. `--format=json`. As part of a general cleanup, this PR also adds standard common time.Format layouts as tstime constants. Fixes #19928 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 16:12:08 -07:00
M. J. Fromberger	8a63c023f0	tailcfg: add a node attribute to explicitly disable netmap caching (#19947) Add a new tailcfg.NodeCapability (NodeAttrDisableCacheNetworkMaps) to allow the policy document to override whether a node will receive the cache-network-maps attribute by default. The client does not interpret this attribute directly, it is used to influence decisions by the control plane. As of 2026-06-01, cache-network-maps is only sent when explicitly requested by the policy. In a future version, we will send it by default for clients with a sufficient capability version (to be added in a future commit), except to ephemeral nodes, unless the policy sets disable-cached-network-maps. Updates #12639 Updates tailscale/projects#28 Change-Id: I6376376d7898f7da8db977e457dcd45df9deef41 Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>	2026-06-01 15:16:45 -07:00
Brad Fitzpatrick	d64aaffc06	control/controlclient: fix map context race Capture Auto.mapCtx while holding Auto.mu before using it for incremental map update forwarding. Pause and restart paths can replace the context under the same mutex, so using it after unlocking races with those writers. Add a race regression test for the UserProfiles path that repeatedly cancels the map context while incremental profile updates are forwarded. Fixes #19953 Change-Id: Icc55c4a0dffbc16d6507a2b446b3909d4d0a0278 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-06-01 13:44:19 -07:00
Brad Fitzpatrick	c234dcc2ef	go.mod: bump wireguard-go `e3ac4a0afb...b48af7099c` Updates tailscale/tailscale#7053 Updates tailscale/corp#36989 Updates tailscale/tailscale#19820 Change-Id: I5652535bd32d3784702dcd2544abd430c2c95c96 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-06-01 12:53:18 -07:00
Achille Roussel	7f3bbc9865	net/netutil: add NewDefaultTransport to avoid http.DefaultTransport panics Several packages built their HTTP transports with http.DefaultTransport.(*http.Transport).Clone(). The standard library only documents http.DefaultTransport as an http.RoundTripper, so an application is free to replace it with a RoundTripper that is not an *http.Transport (e.g. an instrumented or tracing wrapper). When such an application embeds tsnet.Server, the unchecked type assertion panics as soon as tsnet brings up its control connection, DNS bootstrap, or log uploader. Add netutil.NewDefaultTransport, which returns a clone of the global when it is still the standard *http.Transport (preserving existing behavior) and otherwise returns a fresh transport mirroring the stdlib defaults. Route every clone site through it. Updates #19937 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> Signed-off-by: Achille Roussel <achille.roussel@gmail.com>	2026-06-01 12:28:36 -07:00
License Updater	5495eb7e1a	licenses: update license notices Signed-off-by: License Updater <noreply+license-updater@tailscale.com>	2026-06-01 12:09:49 -07:00
Brad Fitzpatrick	0d92a69259	cmd/tailscale/cli: add "tailscale get" command This adds @alexwlchan's proposed "tailscale get" command that reads current preference values, complementing "tailscale set". It uses the same flag names as set. tailscale get # show all settings as a table tailscale get all # same tailscale get accept-dns # show a single value tailscale get --json # output as JSON object tailscale get --set-flags # output as tailscale set argv Fixes #11389 Fixes tailscale/corp#38702 Change-Id: Ie366f27f11ccc56c76fff9a94ed8a9de9c835bd0 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-06-01 11:59:33 -07:00
Simon Law	2d6844c565	cmd/tailscale/cli: add routecheck command (#19641) Introduce a new `tailscale routecheck` command which prints a report of high-availability routers that are reachable. This command rhymes with the `tailscale netcheck` command, but instead of reporting on local network conditions, `routecheck` reports on remote connectivity. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 11:50:24 -07:00
Naman Sood	da51072b98	feature/conn25: send TSMP message to client for no IP mapping on connector When a connector receives a packet from a client on a transit IP that it can't find a real IP mapping for, it drops the packet. This commit starts notifying the client of this dropping over TSMP, so the client can tell the connector to re-establish the transit IP-real IP binding. Updates tailscale/corp#34256. Signed-off-by: Naman Sood <mail@nsood.in>	2026-06-01 14:46:27 -04:00
Evan Lowry	4f07a071e7	client/systray: don't repeat account name for single-user tailnets (#19930) Single-user tailnets often have the same tailnet display name as login name. This change omits the duplication when matching, and skips the user-switching submenu when only one account is configured, to clean up the account display a little bit. Fixes #16889 Signed-off-by: Evan Lowry <evan@tailscale.com>	2026-06-01 15:25:45 -03:00
Brad Fitzpatrick	d961e44856	cmd/testwrapper: auto-retry every failing test Previously, testwrapper only retried tests explicitly annotated with flakytest.Mark. Authors don't pre-emptively mark tests that haven't flaked yet, so the first flake of a brand-new test failed CI even when a re-run would have passed. testwrapper now retries every failing test within a per-test wall-clock budget (default: 5 minute per-attempt timeout capped at 1.5x the first failure duration, 10 minute total). A test that fails and then passes on retry is reported as flaky; a test that never passes within the budget remains a real failure (exit non-zero). For flakeapp's existing log scraping, the wire format is preserved: the "flakytest failures JSON:" line is now emitted only for tests that ultimately flaked (passed on retry). Unmarked tests get a fake issue URL of the form https://github.com/{owner}/{repo}/issues/UNKNOWN where owner/repo is detected from GITHUB_REPOSITORY, the local git remote, or falls back to tailscale/tailscale. A new "permanent test failures JSON:" line is emitted for tests that never passed; flakeapp ignores it for now (a follow-up can teach it to record real failures separately). flakytest.Mark stays as an opt-in API: still useful for tracking a known-flaky test against a real issue and for TS_SKIP_FLAKY_TESTS. Updates tailscale/corp#38960 Change-Id: I56dfc9b023486d239f60793a53e9690578ce8017 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-06-01 11:07:56 -07:00
Simon Law	2ee9eacb94	client/local,ipn/localapi: add /localapi/v0/routecheck endpoint (#19640) In order to support a `tailscale routecheck` command, we introduce the `/localapi/v0/routecheck` endpoint to the local API. This endpoint returns the most recent report collected by the routecheck client. If `force=true` is an argument in the query string, then this endpoint will actively probe before returning the report. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 11:06:14 -07:00
Simon Law	28801674a6	net/routecheck: introduce new package for checking peer reachability (#19639) The routecheck package parallels the netcheck package, where the former checks routes and routers while the latter checks networks. Like netcheck, it compiles reports for other systems to consume. Historically, the client has never known whether a peer is actually reachable. Most of the time this doesn’t matter, since the client will want to establish a WireGuard tunnel to any given destination. However, if the client needs to choose between two or more nodes, then it should try to choose a node that it can reach. Suggested exit nodes are one such example, where the client filters out any nodes that aren’t connected to the control plane. Sometimes an exit node will get disconnected from the control plane: when the network between the two is unreliable or when the exit node is too busy to keep its control connection alive. In these cases, Control disables the Node.Online flag for the exit node and broadcasts this across the tailnet. Arguably, the client should never have relied on this flag, since it only makes sense in the admin console. This patch implements an initial routecheck client that can probe every node that your client knows about. You should not ping scan your visible tailnet, this method is for debugging only. This patch also introduces a new OnNetMapToggle hook, which fires when the netmap transitions from nil to non-nil, or vice versa. This happens either when the client receives its first MapResponse after connecting to the control plane, or when it clears the netmap while it is disconnecting. Routecheck uses this to wait for a valid netmap so it knows which peers to probe. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-06-01 10:33:08 -07:00
Patrick O'Doherty	651049ec19	ssh/tailssh: reject dangerous LD_/DYLD_ env vars in acceptEnv filtering (#19914) Block dynamic linker environment variables (LD_PRELOAD, LD_LIBRARY_PATH, DYLD_INSERT_LIBRARIES, and friends) from being forwarded regardless of acceptEnv policy, preventing privilege escalation via wildcard patterns like "*". We are not aware of any legitimate use of these variables so they are safe to exclude from being passed. Thanks to Tim Sageser (dtrsecurity) for this report. Updates tailscale/corp#42033 Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>	2026-06-01 09:19:27 -07:00
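A deny-before-allow filter of the kind the commit above describes might be sketched like this. The prefix-based check and the names `isLinkerEnv` and `filterEnv` are assumptions for illustration; the actual tailssh list of blocked variables is not shown in this log.

```go
package main

import (
	"fmt"
	"strings"
)

// isLinkerEnv reports whether name looks like a dynamic linker variable.
// Matching on the LD_ and DYLD_ prefixes is an assumption of this sketch.
func isLinkerEnv(name string) bool {
	return strings.HasPrefix(name, "LD_") || strings.HasPrefix(name, "DYLD_")
}

// filterEnv keeps the NAME=value entries that the accept policy allows,
// but drops linker variables no matter what the policy says.
func filterEnv(env []string, accept func(name string) bool) []string {
	var out []string
	for _, kv := range env {
		name, _, ok := strings.Cut(kv, "=")
		if !ok || isLinkerEnv(name) || !accept(name) {
			continue
		}
		out = append(out, kv)
	}
	return out
}

func main() {
	acceptAll := func(string) bool { return true } // stands in for a "*" pattern
	env := []string{
		"TERM=xterm",
		"LD_PRELOAD=/tmp/evil.so",
		"DYLD_INSERT_LIBRARIES=/tmp/x.dylib",
		"LANG=C",
	}
	fmt.Println(filterEnv(env, acceptAll)) // [TERM=xterm LANG=C]
}
```

Checking the deny list before the accept policy is what keeps a wildcard pattern from re-admitting the blocked variables.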
Brad Fitzpatrick	2ba426802f	ipn/ipnlocal: fix 'tailscale status --peers=false' missing user profile Fixes #19894 Change-Id: I310504987170e0742480c8a02706eb0dbf4ec3dc Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-31 20:34:43 -07:00
Martin Zihlmann	3ef42d8b0b	derp/derphttp: drop dial-only proxy port test Signed-off-by: Martin Zihlmann <martizih@outlook.com>	2026-05-31 19:22:11 -07:00
Martin Zihlmann	48eba4e971	derp/derphttp: add tests for proxied CONNECT port selection Adds two tests covering the fix in `0e4c8fc92`: TestDialNodeUsingProxyPort exercises dialNodeUsingProxy directly via a stub CONNECT proxy, asserting the recorded target across four cases: HTTPS/HTTP default fallback and explicit DERPPort override for each. TestConnectThroughProxyHonorsDERPPort drives the full path end-to-end: a real derpserver on an ephemeral TLS port, a real CONNECT proxy that tunnels bytes bidirectionally, and a region client routed through it via feature.HookProxyFromEnvironment. Without the fix, Connect fails because the proxy is asked to dial :443. Signed-off-by: Martin Zihlmann <martizih@outlook.com>	2026-05-31 19:22:11 -07:00
Martin Zihlmann	4c8c0baf2b	derp/derphttp: honor DERPNode.DERPPort in proxied CONNECT dial dialNode picks the destination port from n.DERPPort when non-zero, falling back to 443 (or 3340 when useHTTPS is false). The proxy path, dialNodeUsingProxy, hardcoded "443" in the CONNECT target, so a DERP server reachable only on a custom port was unreachable through HTTPS_PROXY: the proxy would faithfully tunnel to :443 at the DERP hostname, and TLS would either fail cert validation or talk to the wrong service. Mirror dialNode's port selection so both paths behave the same. Fixes #19748 Signed-off-by: Martin Zihlmann <martizih@outlook.com>	2026-05-31 19:22:11 -07:00
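The port selection rule stated in the commit above (use DERPPort when non-zero, else 443, or 3340 without HTTPS) can be sketched as a small helper shared by the direct and proxied dial paths. The names `derpPort` and `connectTarget` are hypothetical and are not the derphttp functions.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// derpPort picks the destination port: an explicit non-zero port wins,
// otherwise 443 for HTTPS or 3340 for plain HTTP.
func derpPort(configured int, useHTTPS bool) string {
	if configured != 0 {
		return strconv.Itoa(configured)
	}
	if useHTTPS {
		return "443"
	}
	return "3340"
}

// connectTarget builds the host:port that a CONNECT request would name.
func connectTarget(host string, configured int, useHTTPS bool) string {
	return net.JoinHostPort(host, derpPort(configured, useHTTPS))
}

func main() {
	fmt.Println(connectTarget("derp.example.com", 0, true))    // derp.example.com:443
	fmt.Println(connectTarget("derp.example.com", 0, false))   // derp.example.com:3340
	fmt.Println(connectTarget("derp.example.com", 8443, true)) // derp.example.com:8443
}
```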
Jordan Whited	8a294e3c34	net/batching: reset Buffers len in WriteBatchTo In case we land on this branch during a goto retry. Also, protect Geneve offset from mutation across retries. Fixes #19927 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2026-05-31 06:12:53 -07:00
Brad Fitzpatrick	3e34e721e8	tsnet: add opt-in SSH support (Server.ListenSSH) This adds tsnet.Server.ListenSSH which, if the SSH feature is linked, returns a net.Listener whose Accept yields *tailssh.Session values (as net.Conn). This lets tsnet apps accept incoming SSH connections to implement custom TUI applications. Basic apps can use net.Conn directly (Read/Write/Close). Rich apps import ssh/tailssh and type-assert for peer identity, PTY, signals, etc. If feature/ssh isn't imported, ListenSSH returns an error. Includes a demo guess-the-number game in tsnet/example/ssh-game. Updates tailscale/corp#37839 Change-Id: I4e7c3c96afb030cdf4da8f2d8b2253820628129a Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-30 14:17:50 -07:00
Fran Bull	c9333854fb	appc,feature/conn25: use custom scheme resolvers for conn25 Currently we are picking a peer for the split dns routes when we get a netmap. Use the new custom scheme resolvers, installed per app in the config in the netmap, to allow us to choose which connector peer should handle a DNS request at the time the request is made. Fixes tailscale/corp#39858 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-05-29 12:23:47 -07:00
Simon Law	5d935c8900	net/traffic: add fuzz test for sorting nodes by traffic score (#19893) In PR #19682, we introduced the traffic package which provides a traffic.Scores.SortNodes method that uses rendezvous hashing to break ties by equally distributing the “best” node for any given client. This PR adds a fuzzer to make sure this algorithm is not wildly unfair. Updates #17366 Updates tailscale/corp#33033 Signed-off-by: Simon Law <sfllaw@tailscale.com>	2026-05-29 11:55:49 -07:00
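Rendezvous (highest-random-weight) hashing, which the commit above names as the tie-breaker, can be illustrated in a few lines. This is a generic sketch of the technique, not the traffic.Scores.SortNodes implementation; the FNV-1a hash and the names `weight` and `pick` are assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// weight hashes the (client, node) pair; each client therefore sees its
// own pseudo-random ordering of the same set of nodes.
func weight(client, node string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(client))
	h.Write([]byte{0}) // separator so "ab"+"c" differs from "a"+"bc"
	h.Write([]byte(node))
	return h.Sum64()
}

// pick returns the node with the highest weight for this client, or ""
// if nodes is empty. The result does not depend on the order of nodes.
func pick(client string, nodes []string) string {
	best, bestWeight := "", uint64(0)
	for i, n := range nodes {
		if w := weight(client, n); i == 0 || w > bestWeight {
			best, bestWeight = n, w
		}
	}
	return best
}

func main() {
	nodes := []string{"router-a", "router-b", "router-c"}
	for _, c := range []string{"client-1", "client-2", "client-3", "client-4"} {
		fmt.Println(c, "->", pick(c, nodes))
	}
}
```

Because every client hashes independently, equally scored nodes end up shared out across clients without any coordination, which is the fairness property a fuzzer can then probe.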
Jordan Whited	8b58bd6c64	net/batching: implement NodeAttrNeverGSOEqualTail This NodeCapability works around the UDP GSO bugs introduced by torvalds/linux@b10b446 (v7.0-rc1). These bugs were later fixed by torvalds/linux@78effd8 and torvalds/linux@5f17ae0 (v7.1-rc5). These Linux kernel bugs cause mangled UDP headers and UDP checksums, resulting in high levels of packet loss. The aforementioned bugs have already made their way downstream into various distros, e.g. Ubuntu 26.04 LTS. Impacted users are now dealing with poor UDP performance in tailscaled, and in any other software that makes use of UDP GSO. Not all users of the affected kernels are impacted as the relevant kernel code path sits between kernel and netdev driver, and behaviors vary by driver/device capability. We cannot detect impact at runtime, as this would require gathering all netdevs, and performing loopback tests. This is invasive and in many cases impossible. So, we are left to choose between disabling UDP GSO for all users on affected kernels, whether they experience real impact or not, or try and work around the bugs. Disabling UDP GSO for a user that is not impacted can cut max throughput in half, and consume more CPU cycles. This commit attempts to workaround the bugs by avoiding UDP GSO when batches are small, and injecting a 1-byte sentinel tail payload when they are large. This tail payload is smaller than "GSO size", which sidesteps the primary trigger of all fragments in a batch being equal in length. The end result is slightly increased payload and packet overhead, but functional UDP GSO for all Linux 7.0-7.1.4 users, regardless of netdev/driver. Updates #19777 Signed-off-by: Jordan Whited <jordan@tailscale.com>	2026-05-29 11:36:35 -07:00
kari-ts	7355116c05	ipn/store: make WriteState(id, nil) delete key instead of adding nil entry (#19920) All StateStore implementations store a nil value in the cache map when WriteState is called with a nil byte slice instead of deleting the key. This causes ReadState to return (nil, nil) instead of (nil, ErrStateNotExist), since the key is still present in the map. This breaks reset-auth in Windows, Linux, and Android, and the node can't log back in without manually editing the state file. (macOS uses a different state store) DeleteProfile, DeleteAllProfilesForUser, setUnattendedModeAsConfigured are impacted but don't seem to break because the deleted keys are not reread. This deletes the key from the cache instead. Fixes tailscale/corp#42477 Signed-off-by: kari-ts <kari@tailscale.com>	2026-05-29 11:22:14 -07:00
Fran Bull	3d5102090f	feature/conn25: use new pool nodeattr We have been reading the pool config from the app nodeattr, but it is global config, not per app, so it needs to be its own thing. Updates tailscale/corp#39999 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-05-29 08:29:34 -07:00
Brad Fitzpatrick	412c812d76	ipn/ipnlocal: use ACME ALPN for authorized Funnel non-CertDomain domains If a user explicitly adds a non-ts.net (not a CertDomain domain) domain like "foo.com" to their serve config as a web target that's also an allowed funnel domain (using raw "tailscale serve set-config"), then use the new ALPN cert fetching (from `b553969b`) to get certs for that domain. This is just plumbing; there's no new product functionality to actually enable this easily client-side, and it also has no visible product surface to enable it server-side. Updates tailscale/corp#41736 Change-Id: Ie2e421ac9611bce64bba3de6a454b2d505ea0e8a Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-28 13:33:45 -07:00
Tom Proctor	788a49eca5	.github/workflows: run vet on GitHub-hosted runners (#19913) The github-ci-vm machine that runs our self-hosted CI for this repo is only designed for the `vm` job in test.yml. That uses a different cache dir which is causing github-ci-vm's small disk to fill up. Switch to ubuntu 24.04 like the rest of our CI for this repo that doesn't require anything special. Updates tailscale/corp#40465 Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>	2026-05-28 21:30:46 +01:00
James Tucker	524a374f01	tsnet: wait for peer in netmap before pinging in setupTwoClientTest If we dispatch a ping too early (after a later patch removes a 250ms blockage) then the ping may be lost due to the peers not yet knowing about each other. The ping is retained in order to setup and ensure a wireguard session prior to test flow. Updates #19822 Change-Id: I6cfea28931646a9387b6ffc2654e72cd846f4e55 Signed-off-by: James Tucker <james@tailscale.com> Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-28 11:27:54 -07:00
Brad Fitzpatrick	c086992f4f	cmd/tailscale/cli: add whoami subcommand Add a "tailscale whoami" subcommand that is equivalent to running "tailscale whois $(tailscale ip -4)" but more ergonomic. It supports the --json flag just like whois, and shares the WhoIsResponse rendering code with whois. Fixes #19907 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com> Change-Id: I8f33ba7a5608bab7dffa8213303beb5f345936d3	2026-05-28 10:49:17 -07:00
Alex Chan	9d126aec34	all: remove network lock references from private method names Updates tailscale/corp#37904 Change-Id: I312d46d958209ca3d1152d1877fb91a57c91798d Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-05-28 18:00:36 +01:00
Brendan Creane	8d90a6ab1e	ipn/ipnlocal: add HTTP/2 Content-Type tests for serve reverse proxy (#19905) Adds two tests exercising the HTTP/2-inbound -> plaintext HTTP/1.1 backend path through serve's reverseProxy and through the full serveWebHandler entry point (with a funnel serveHTTPContext). Updates #19866 Signed-off-by: Brendan Creane <bcreane@gmail.com>	2026-05-28 09:46:36 -07:00
Alex Chan	f4a280cdbd	all: update a few more references to network/tailnet lock Updates tailscale/corp#37904 Change-Id: I746b06328e080fa2b9ff28a2d099f95645aa3d0b Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-05-28 16:44:16 +01:00
Alex Chan	446ae97491	ipn: improve --exit-node hostname error during startup When parsing the `tailscale up --exit-node=ARG` argument, we try to resolve hostnames by searching the list of peers. However, at startup, the peer list is empty, causing hostname lookups to trivially fail with an unhelpful "invalid value" error. Improve the error message when the peer list is empty to inform the user that hostnames cannot be resolved during startup, and advise them to use the exit node's Tailscale IP address instead. Also, clarify that hostnames must be peer hostnames, not arbitrary hostnames. Fixes #19882 Change-Id: I9390a427c2863d657cf46c5e33b43cb3c5363764 Signed-off-by: Alex Chan <alexc@tailscale.com>	2026-05-28 16:43:45 +01:00
dragondscv	4b8115bb2c	cmd/containerboot: clamp MSS to PMTU for proxy group pods (#19686) Single-pod ingress/egress proxies already called ClampMSSToPMTU when setting up forwarding rules, but the proxy group (HA) code paths in egressservices.go and ingressservices.go did not. This caused TCP connections through proxy group pods to suffer from MSS/MTU mismatch issues in environments where path MTU discovery is not working. Add ClampMSSToPMTU calls in the egress sync loop (alongside the existing EnsureSNATForDst call) and in addDNATRuleForSvc (alongside the existing EnsureDNATRuleForSvc call), mirroring what the single-pod forwarding rules already do. Also add MSS clamping assertions to TestSyncIngressConfigs and track ClampMSSToPMTU calls in FakeNetfilterRunner. Fixes issue #19812 https://github.com/tailscale/tailscale/issues/19812. Tracking internal ticket TSS-86326. Signed-off-by: Jay Tung <ltung@crusoeenergy.com> Co-authored-by: Jay Tung <ltung@crusoeenergy.com> Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-28 12:57:38 +01:00
Brad Fitzpatrick	782c73bf41	cmd/containerboot: fix data race in TestContainerBoot Parallel subtests share ipn.Notify pointers (e.g. runningNotify). When multiple subtests reached the same phase concurrently, they all wrote to the shared notify's InitialStatus field without synchronization, triggering the race detector. Fix by shallow-copying ipn.Notify before setting InitialStatus, so each test iteration works on its own copy. Updates #19380 Change-Id: I9dd40037e02146166f006f4f7c1ddcc47adba191 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-27 18:40:03 -07:00
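The shallow-copy fix described above can be sketched as follows. The `notify` struct is a hypothetical, simplified stand-in for ipn.Notify, and `withInitialStatus` is an invented helper name.

```go
package main

import (
	"fmt"
	"sync"
)

// notify is a simplified, hypothetical stand-in for a notification
// value shared between parallel subtests.
type notify struct {
	Version       string
	InitialStatus bool
}

// withInitialStatus copies the shared value and mutates only the copy,
// so concurrent callers never write to the same memory.
func withInitialStatus(shared *notify) *notify {
	n := *shared // shallow copy; the shared value is only read
	n.InitialStatus = true
	return &n
}

func main() {
	shared := &notify{Version: "v1"}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = withInitialStatus(shared) // each goroutine gets its own copy
		}()
	}
	wg.Wait()
	fmt.Println(shared.InitialStatus) // false: the shared value is untouched
}
```

A shallow copy is enough here only because the mutated field lives directly in the struct; fields that are pointers or maps would still be shared between copies.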
James Tucker	25b8ed8d9e	control/controlknobs,net/{batching,tstun},wgengine: add nodecaps to disable UDP & TUN GRO/GSO Add four control-plane node attributes that let us disable UDP GSO/GRO on the magicsock UDP socket and UDP/TCP GRO on the Tailscale TUN device. These complement the pre-existing TS_DEBUG_DISABLE_UDP_{GRO,GSO} and TS_TUN_DISABLE_{UDP,TCP}_GRO envknobs. They exist so we can mitigate upstream Linux kernel regressions on a deployed fleet without requiring a client release, after two incidents (#13041, #19777) where buggy kernel patches landed upstream and the fix took an excessively long time to reach downstream distros. Knob changes are reacted to in setNetworkMapInternal / SetNetworkMap via a comparison against a cached "last applied" value and only an actual transition triggers work: magicsock Rebind()+ReSTUN for UDP, ApplyGROKnobs for TUN. The TUN side is gated by buildfeatures.HasGRO and is one-way (wireguard-go GRO disablement is sticky); re-enabling requires a client restart. Updates #13041 Updates #19777 Change-Id: I802993070afa659cc06809bb0bfbb7f8a0cdb273 Signed-off-by: James Tucker <james@tailscale.com>	2026-05-27 17:10:14 -07:00
Brad Fitzpatrick	94af1b00fb	cmd/testwrapper, tstest: move test sharding out of test code Previously, sharding required tests to opt in by calling tstest.Shard, which used a process-global counter to assign each test to a shard. This had two problems: most tests didn't call it, so they ran on every shard (defeating the purpose), and shard assignments were unstable (depended on call order, so adding a test could reshuffle others). Remove tstest.Shard and tstest.SkipOnUnshardedCI entirely. Instead, have testwrapper implement sharding automatically for all tests: when TS_TEST_SHARD=N/M is set, it uses "go list -json" (no compilation) to find test source files, scans them for top-level Test/Benchmark/ Example/Fuzz function names, and filters by fnv32a(name) % M == N-1. The filtered names are passed as an anchored -run regex to go test. Using go list instead of "go test -list" avoids linking the test binary twice (Go's build cache does not cache test binary linking). Fixes #19886 Change-Id: I62ab7b3d757324d4c5fd0b5de50c1e3742681791 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-27 16:53:17 -07:00
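The shard filter quoted in the commit above, `fnv32a(name) % M == N-1`, can be sketched with the standard `hash/fnv` package. The function name `inShard` is hypothetical, and the sketch leaves out the `go list` scanning and the `-run` regex construction.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// inShard reports whether a test name belongs to shard n of m, where n
// is 1-based as in TS_TEST_SHARD=N/M.
func inShard(name string, n, m uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(name))
	return h.Sum32()%m == n-1
}

func main() {
	names := []string{"TestAlpha", "TestBeta", "BenchmarkGamma", "FuzzDelta"}
	const m = 3
	for n := uint32(1); n <= m; n++ {
		for _, name := range names {
			if inShard(name, n, m) {
				fmt.Printf("shard %d/%d: %s\n", n, m, name)
			}
		}
	}
}
```

Because the assignment depends only on the name, adding or removing a test leaves every other test in its shard, which addresses the instability the commit describes.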
James Scott	db60aa8eca	logtail: gate "logtail started" behind TS_DEBUG_LOGTAIL envknob (#19891) Gates the unnecessary "logtail started" message behind the debug envknob TS_DEBUG_LOGTAIL. This is extra log spam that isn't needed unless we are debugging. Updates tailscale/corp#40908 Signed-off-by: James Scott <jim@tailscale.com>	2026-05-27 15:48:44 -07:00
kari-ts	1a17ec1988	net/netmon: in Android, replace system/bin/ip call with cached LinkProperties gateway (#19804) bind() on NETLINK_ROUTE sockets does not work on Android 11+ (https://developer.android.com/identity/user-data-ids#mac-11-plus). Since system/bin/ip uses bind(), likelyHomeRouterIPHelper() always fails on Android 11+, so that GatewayAndSelfIP never caches the result, causing repeated ip process spawns on every periodic ReSTUN. This replaces the system/bin/ip fallback with a cached gateway IP pushed from Android’s ConnectivityManager via LinkProperties.getRoutes(). This is the same pattern used by UpdateLastKnownDefaultRouteInterface for the interface name (see https://github.com/tailscale/tailscale/pull/11784/). We keep the proc/net/route path as a fallback for early startup before NetworkChangeCallback has fired. Updates tailscale/tailscale#18622 Updates tailscale/tailscale#13352 Signed-off-by: kari-ts <kari@tailscale.com>	2026-05-27 15:42:48 -07:00
Brad Fitzpatrick	c9fb05b6f5	ipn/ipnlocal: don't dup-suppress UserProfiles on IPNBus on profile switches Fixes #19889 Change-Id: I324a735c13772c0c79ed7392c0baa5064b34823b Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-27 14:47:02 -07:00
Brad Fitzpatrick	364b952d62	cmd/containerboot: track peers from IPN bus updates, stop using netmap.NetworkMap Some tests in another repo were broken by tailscale/tailscale#19607. This fixes them, by finishing off the rest of the migration away from netmap.NetworkMap on the IPN bus in containerboot. Containerboot used to rebuild a full NetworkMap-shaped view while reacting to IPN bus notifications. Now it instead has its own netmapState type (immutable) of exactly what it needs to track, and sends those immutable values around, making cheap edits of new immutable values when an IPN bus edit arrives. This should make cmd/containerboot scale to much larger tailnets now too. Fixes #19852 Fixes tailscale/corp#42347 Updates #12542 Change-Id: I88adaf061f85f677f954a764935e6654329d75a6 Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>	2026-05-27 14:12:48 -07:00
Fran Bull	80dc7a8d07	feature/conn25: disallow addrs assignment overwriting. We don't want addr assignments to be lost from the collection before they can be returned to the IP pools, otherwise we will get orphan addresses marked inUse in the pools that will never be returned. Fixes tailscale/corp#39975 Signed-off-by: Fran Bull <fran@tailscale.com>	2026-05-27 13:54:40 -07:00
Patrick O'Doherty	8501be1990	go.mod: bump dependencies to resolve govulncheck warnings (#19884) Bump the following: go get -u github.com/moby/spdystream@v0.5.1 go get -u golang.org/x/crypto@v0.52.0 go get -u golang.org/x/net@v0.55.0 to resolve open govulncheck warnings. Updates #cleanup Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>	2026-05-27 12:24:59 -07:00
James Tucker	dea49bb4da	net/batching: add envknobs to disable UDP GRO & GSO It is sometimes useful when diagnosing subtle and specific performance problems to rule out GRO/GSO independently and/or toggle them to influence packet pacing. Updates #17835 Updates tailscale/corp#31164 Signed-off-by: James Tucker <james@tailscale.com>	2026-05-27 12:05:00 -07:00
James Tucker	d1912167dc	feature/taildrop: replace outgoing-file progress channel with synchronous reporter serveFilePut tracked outgoing-file progress through an unbuffered progressUpdates channel whose close was owned by the request goroutine while writers were spread across manifest parsing, the progresstracking.Reader callback, singleFilePut failure paths, and the success path. That writer-closes mismatch made the send-on-closed-channel panic effectively unfixable in place. Replace it with a request-scoped outgoingProgress reporter. Transfer code reports state by method call; the reporter coalesces hot-path updates and is flushed once via defer in serveFilePut. With no producer channel to close, the panic is structurally impossible. Fixes #19115 Fixes #19817 Change-Id: I8f00d982d2c79880dfc1f8104c5eed06e94b5a6c Signed-off-by: James Tucker <james@tailscale.com>	2026-05-27 12:00:34 -07:00

10724 Commits