Commit Graph

10724 Commits

Author SHA1 Message Date
Simon Law
c898aeb0d8 .github/workflows: fix -run='^$' quoting when skipping all tests (#19962)
This bug was surfaced by #19960 because benchmarks shouldn’t have run
TestListenService, but they did because PowerShell interpreted match
empty string `"^$"` as beginning of string `'^'`.

This patch has the Windows build run `./tool/go` binaries with bash
and synchronizes it with the *nix `bench all` run.

Updates #18884
Updates #19960

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 21:20:10 -07:00
Charlotte Som
7ba49cbcbb words: add 'flops' to the list of scales
floating point operations per second is a measure
of computational throughput

Signed-off-by: Charlotte Som <charlotte@som.codes>
2026-06-01 18:18:54 -07:00
Simon Law
b47dd932f3 cmd/tailscale/cli: use tstime constant for tailscale routecheck (#19957)
Updates #19928

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 17:42:18 -07:00
ferrumclaudepilgrim
3f70abdc6f cmd/tailscaled, version/distro: default to userspace-networking on Crostini
cros-garcon NULL-derefs on cold-boot netlink enumeration when
tailscale0 is present, preventing the Crostini container and
ChromeOS Terminal from starting cleanly. This is an upstream
ChromiumOS bug in cros-garcon; tailscaled can work around it
by defaulting to userspace-networking mode on Crostini.

Tailscale SSH continues to work via tailscaled's netstack.
Users can override with --tun=tailscale0 on ChromeOS builds
where cros-garcon is fixed.

Crostini is detected via /opt/google/cros-containers/bin/garcon,
which is present in every Crostini penguin container.

ssh/tailssh extends the existing Debian default-PATH case to
cover Crostini, since Crostini is Debian-based and benefits
from the same SSH PATH defaults.

RELNOTE: Crostini now defaults to userspace-networking.

Fixes #19488
Updates #12090

Signed-off-by: ferrumclaudepilgrim <ferrumclaudepilgrim@users.noreply.github.com>
2026-06-01 17:40:07 -07:00
Brad Fitzpatrick
a6ab7efa4f ipn/ipnlocal, cmd/tailscale/cli: auto-renew TLS certs and warn while pending
The Tailscale daemon only refreshed TLS certs as a side effect of inbound
TLS handshakes or "tailscale cert" CLI calls. A node that doesn't see
inbound traffic during the renewal window silently rolls past expiry.

Add a once-per-hour background loop on LocalBackend that enumerates Serve
and Funnel HTTPS hostnames (filtered against the netmap's CertDomains so
we don't poke ACME for other nodes' service hostnames) and calls the
existing GetCertPEM path. The renewal decision (ARI window, then 2/3
expiry fallback) is unchanged; the loop just guarantees it runs.

For visibility during initial issuance or restart with a long-expired
cached cert, add a "tls-cert-pending" health Warnable that's set while
ACME is in flight and no usable cached cert exists. Async renewal of a
still-valid cert intentionally doesn't fire it. And then make the CLI "cert"
subcommand print out a warning if it's blocking due to a cert fetch
in flight, using that health info.

Fixes #19911
Fixes #19912

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I144e46c40e957b2e879587decace32a523a6eade
2026-06-01 16:31:54 -07:00
Simon Law
92bfda580c cmd/tailscale/cli: fix time in tailscale routecheck (#19956)
When running `tailscale netcheck`, the reported timestamp used to be
in UTC and formatted according to RFC 3339 with a `T` to separate the
date from the time:

	sfllaw@h2co3:~$ tailscale netcheck | head -n3

	Report:
		* Time: 2026-06-01T21:12:32.252620138Z

This is machine-readable time leaking out to the user interface. Times
in normal commands are formatted for humans to read:

	sfllaw@h2co3:~$ date
	Mon 01 Jun 2026 02:39:14 PM PDT
	sfllaw@h2co3:~$ journalctl -t tailscaled | tail -n1
	Jun 01 14:35:21 h2co3 tailscaled[3328921]: wgengine: sending TSMP disco key advertisement to 100.90.144.102
	sfllaw@h2co3:~$ timedatectl show
	Timezone=America/Los_Angeles
	LocalRTC=no
	CanNTP=yes
	NTP=yes
	NTPSynchronized=yes
	TimeUSec=Mon 2026-06-01 14:38:32 PDT
	RTCTimeUSec=Mon 2026-06-01 14:38:32 PDT
	sfllaw@h2co3:~$ uptime --since
	2026-05-15 07:37:45

This PR makes the times printed by the CLI commands consistent:

- For `tailscale routecheck`, it now prints local time as
  `2026-05-15 07:37:45-07:00`.
- For `netlogfmt`, it has always printed local time with a space,
  but now includes the time zone.
- All machine-readable outputs continue to be standard RFC 3339 in
  UTC, i.e. `--format=json`.

As part of a general cleanup, this PR also adds standard common
time.Format layouts as tstime constants.

Fixes #19928

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 16:12:08 -07:00
M. J. Fromberger
8a63c023f0 tailcfg: add a node attribute to explicitly disable netmap caching (#19947)
Add a new tailcfg.NodeCapability (NodeAttrDisableCacheNetworkMaps) to allow the
policy document to override whether a node will receive the cache-network-maps
attribute by default. The client does not interpret this attribute directly, it
is used to influence decisions by the control plane.

As of 2026-06-01, cache-network-maps is only sent when explicitly requested by
the policy. In a future version, we will send it by default for clients with a
sufficient capability version (to be added in a future commit), except to
ephemeral nodes, unless the policy sets disable-cached-network-maps.

Updates #12639
Updates tailscale/projects#28

Change-Id: I6376376d7898f7da8db977e457dcd45df9deef41
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-06-01 15:16:45 -07:00
Brad Fitzpatrick
d64aaffc06 control/controlclient: fix map context race
Capture Auto.mapCtx while holding Auto.mu before using it for
incremental map update forwarding. Pause and restart paths can replace
the context under the same mutex, so using it after unlocking races
with those writers.

Add a race regression test for the UserProfiles path that repeatedly
cancels the map context while incremental profile updates are
forwarded.

Fixes #19953

Change-Id: Icc55c4a0dffbc16d6507a2b446b3909d4d0a0278
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-01 13:44:19 -07:00
Brad Fitzpatrick
c234dcc2ef go.mod: bump wireguard-go
e3ac4a0afb...b48af7099c

Updates tailscale/tailscale#7053
Updates tailscale/corp#36989
Updates tailscale/tailscale#19820

Change-Id: I5652535bd32d3784702dcd2544abd430c2c95c96
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-01 12:53:18 -07:00
Achille Roussel
7f3bbc9865 net/netutil: add NewDefaultTransport to avoid http.DefaultTransport panics
Several packages built their HTTP transports with

    http.DefaultTransport.(*http.Transport).Clone()

The standard library only documents http.DefaultTransport as an
http.RoundTripper, so an application is free to replace it with a
RoundTripper that is not a *http.Transport (e.g. an instrumented or
tracing wrapper). When such an application embeds tsnet.Server, the
unchecked type assertion panics as soon as tsnet brings up its control
connection, DNS bootstrap, or log uploader.

Add netutil.NewDefaultTransport, which returns a clone of the global
when it is still the standard *http.Transport (preserving existing
behavior) and otherwise returns a fresh transport mirroring the stdlib
defaults. Route every clone site through it.

Updates #19937

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Achille Roussel <achille.roussel@gmail.com>
2026-06-01 12:28:36 -07:00
License Updater
5495eb7e1a licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2026-06-01 12:09:49 -07:00
Brad Fitzpatrick
0d92a69259 cmd/tailscale/cli: add "tailscale get" command
This adds @alexwlchan's proposed "tailscale get" command that reads
current preference values, complementing "tailscale set". It uses the
same flag names as set.

  tailscale get              # show all settings as a table
  tailscale get all          # same
  tailscale get accept-dns   # show a single value
  tailscale get --json       # output as JSON object
  tailscale get --set-flags  # output as tailscale set argv

Fixes #11389
Fixes tailscale/corp#38702

Change-Id: Ie366f27f11ccc56c76fff9a94ed8a9de9c835bd0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-01 11:59:33 -07:00
Simon Law
2d6844c565 cmd/tailscale/cli: add routecheck command (#19641)
Introduce a new `tailscale routecheck` command which prints a report
of high-availability routers that are reachable.

This command rhymes with the `tailscale netcheck` command and but
instead of reporting on local network conditions, `routecheck` reports
on remote connectivity.

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 11:50:24 -07:00
Naman Sood
da51072b98 feature/conn25: send TSMP message to client for no IP mapping on connector
When a connector receives a packet from a client on a transit IP that it
can't find a real IP mapping for, it drops the packet. This commit
starts notifying the client of this dropping over TSMP, so the client
can tell the connector to re-establish the transit IP-real IP binding.

Updates tailscale/corp#34256.

Signed-off-by: Naman Sood <mail@nsood.in>
2026-06-01 14:46:27 -04:00
Evan Lowry
4f07a071e7 client/systray: don't repeat account name for single-user tailnets (#19930)
Single-user tailnets often have the same tailnet display name as login
name.

This change omits the duplication when matching, and skips the
user-switching submenu when only one account is configured, to clean up
the account display a little bit.

Fixes #16889

Signed-off-by: Evan Lowry <evan@tailscale.com>
2026-06-01 15:25:45 -03:00
Brad Fitzpatrick
d961e44856 cmd/testwrapper: auto-retry every failing test
Previously, testwrapper only retried tests explicitly annotated with
flakytest.Mark. Authors don't pre-emptively mark tests that haven't
flaked yet, so the first flake of a brand-new test failed CI even
when a re-run would have passed.

testwrapper now retries every failing test within a per-test wall-clock
budget (default: 5 minute per-attempt timeout capped at 1.5x the first
failure duration, 10 minute total). A test that fails and then passes
on retry is reported as flaky; a test that never passes within the
budget remains a real failure (exit non-zero).

For flakeapp's existing log scraping, the wire format is preserved:
the "flakytest failures JSON:" line is now emitted only for tests
that ultimately flaked (passed on retry). Unmarked tests get a fake
issue URL of the form https://github.com/{owner}/{repo}/issues/UNKNOWN
where owner/repo is detected from GITHUB_REPOSITORY, the local git
remote, or falls back to tailscale/tailscale. A new "permanent test
failures JSON:" line is emitted for tests that never passed; flakeapp
ignores it for now (a follow-up can teach it to record real failures
separately).

flakytest.Mark stays as an opt-in API: still useful for tracking a
known-flaky test against a real issue and for TS_SKIP_FLAKY_TESTS.

Updates tailscale/corp#38960

Change-Id: I56dfc9b023486d239f60793a53e9690578ce8017
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-01 11:07:56 -07:00
Simon Law
2ee9eacb94 client/local,ipn/localapi: add /localapi/v0/routecheck endpoint (#19640)
In order to support a `tailscale routecheck` command, we introduce the
`/localapi/v0/routecheck` endpoint to the local API. This endpoint
returns the most recent report collected by the routecheck client.
If `force=true` is an argument in the query string, then this endpoint
will actively probe before returning the report.

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 11:06:14 -07:00
Simon Law
28801674a6 net/routecheck: introduce new package for checking peer reachability (#19639)
The routecheck package parallels the netcheck package, where the
former checks routes and routers while the latter checks networks.
Like netcheck, it compiles reports for other systems to consume.

Historically, the client has never known whether a peer is actually
reachable. Most of the time this doesn’t matter, since the client will
want to establish a WireGuard tunnel to any given destination.
However, if the client needs to choose between two or more nodes,
then it should try to choose a node that it can reach.

Suggested exit nodes are one such example, where the client filters
out any nodes that aren’t connected to the control plane. Sometimes an
exit node will get disconnected from the control plane: when the
network between the two is unreliable or when the exit node is too
busy to keep its control connection alive. In these cases, Control
disables the Node.Online flag for the exit node and broadcasts this
across the tailnet. Arguably, the client should never have relied on
this flag, since it only makes sense in the admin console.

This patch implements an initial routecheck client that can probe
every node that your client knows about. You should not ping scan your
visible tailnet, this method is for debugging only.

This patch also introduces a new OnNetMapToggle hook, which fires when
the netmap transitions from nil to non-nil, or vice versa. This
happens either when the client receives its first MapResponse after
connecting to the control plane, or when it clears the netmap while it
is disconnecting. Routecheck uses this to wait for a valid netmap
so it knows which peers to probe.

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 10:33:08 -07:00
Patrick O'Doherty
651049ec19 ssh/tailssh: reject dangerous LD_/DYLD_ env vars in acceptEnv filtering (#19914)
Block dynamic linker environment variables (LD_PRELOAD, LD_LIBRARY_PATH,
DYLD_INSERT_LIBRARIES, and friends) from being forwarded regardless of
acceptEnv policy, preventing privilege escalation via wildcard patterns
like "*".

We are not aware of any legitimate use of these variables so they are
safe to exclude from being passed.

Thanks to Tim Sageser (dtrsecurity) for this report.

Updates tailscale/corp#42033

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2026-06-01 09:19:27 -07:00
Brad Fitzpatrick
2ba426802f ipn/ipnlocal: fix 'tailscale status --peers=false' missing user profile
Fixes #19894

Change-Id: I310504987170e0742480c8a02706eb0dbf4ec3dc
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-31 20:34:43 -07:00
Martin Zihlmann
3ef42d8b0b derp/derphttp: drop dial-only proxy port test
Signed-off-by: Martin Zihlmann <martizih@outlook.com>
2026-05-31 19:22:11 -07:00
Martin Zihlmann
48eba4e971 derp/derphttp: add tests for proxied CONNECT port selection
Adds two tests covering the fix in 0e4c8fc92:

TestDialNodeUsingProxyPort exercises dialNodeUsingProxy directly via a
stub CONNECT proxy, asserting the recorded target across four cases:
HTTPS/HTTP default fallback and explicit DERPPort override for each.

TestConnectThroughProxyHonorsDERPPort drives the full path end-to-end:
a real derpserver on an ephemeral TLS port, a real CONNECT proxy that
tunnels bytes bidirectionally, and a region client routed through it
via feature.HookProxyFromEnvironment. Without the fix, Connect fails
because the proxy is asked to dial :443.

Signed-off-by: Martin Zihlmann <martizih@outlook.com>
2026-05-31 19:22:11 -07:00
Martin Zihlmann
4c8c0baf2b derp/derphttp: honor DERPNode.DERPPort in proxied CONNECT dial
dialNode picks the destination port from n.DERPPort when non-zero,
falling back to 443 (or 3340 when useHTTPS is false). The proxy path,
dialNodeUsingProxy, hardcoded "443" in the CONNECT target, so a DERP
server reachable only on a custom port was unreachable through
HTTPS_PROXY: the proxy would faithfully tunnel to :443 at the DERP
hostname, and TLS would either fail cert validation or talk to the
wrong service.

Mirror dialNode's port selection so both paths behave the same.

Fixes #19748

Signed-off-by: Martin Zihlmann <martizih@outlook.com>
2026-05-31 19:22:11 -07:00
Jordan Whited
8a294e3c34 net/batching: reset Buffers len in WriteBatchTo
In case we land on this branch during a goto retry. Also, protect
Geneve offset from mutation across retries.

Fixes #19927

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-05-31 06:12:53 -07:00
Brad Fitzpatrick
3e34e721e8 tsnet: add opt-in SSH support (Server.ListenSSH)
This adds tsnet.Server.ListenSSH which, if the SSH feature is linked,
returns a net.Listener whose Accept yields *tailssh.Session values (as
net.Conn). This lets tsnet apps accept incoming SSH connections to
implement custom TUI applications.

Basic apps can use net.Conn directly (Read/Write/Close). Rich apps
import ssh/tailssh and type-assert for peer identity, PTY, signals,
etc. If feature/ssh isn't imported, ListenSSH returns an error.

Includes a demo guess-the-number game in tsnet/example/ssh-game.

Updates tailscale/corp#37839

Change-Id: I4e7c3c96afb030cdf4da8f2d8b2253820628129a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-30 14:17:50 -07:00
Fran Bull
c9333854fb appc,feature/conn25: use custom scheme resolvers for conn25
Currently we are picking a peer for the split dns routes when we get a
netmap. Use the new custom scheme resolvers, installed per app in the
config in the netmap, to allow us to choose which connector peer should
handle a DNS request at the time the request is made.

Fixes tailscale/corp#39858

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-05-29 12:23:47 -07:00
Simon Law
5d935c8900 net/traffic: add fuzz test for sorting nodes by traffic score (#19893)
In PR #19682, we introduced the traffic package which provides a
traffic.Scores.SortNodes method that uses rendezvous hashing to
break ties by equally distribute the “best” node for any given client.

This PR adds a fuzzer to make sure this algorithm is not wildly unfair.

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-05-29 11:55:49 -07:00
Jordan Whited
8b58bd6c64 net/batching: implement NodeAttrNeverGSOEqualTail
This NodeCapability works around the UDP GSO bugs introduced by
torvalds/linux@b10b446 (v7.0-rc1). These bugs were later fixed by
torvalds/linux@78effd8 and torvalds/linux@5f17ae0 (v7.1-rc5). These
Linux kernel bugs cause mangled UDP headers and UDP checksums, resulting
in high levels of packet loss.

The aforementioned bugs have already made their way downstream into
various distros, e.g. Ubuntu 26.04 LTS. Impacted users are now dealing
with poor UDP performance in tailscaled, and in any other software that
makes use of UDP GSO.

Not all users of the affected kernels are impacted as the relevant
kernel code path sits between kernel and netdev driver, and behaviors
vary by driver/device capability.

We cannot detect impact at runtime, as this would require gathering all
netdevs, and performing loopback tests. This is invasive and in many
cases impossible.

So, we are left to choose between disabling UDP GSO for all users on
affected kernels, whether they experience real impact or not, or try
and work around the bugs. Disabling UDP GSO for a user that is not
impacted can cut max throughput in half, and consume more CPU cycles.

This commit attempts to workaround the bugs by avoiding UDP GSO when
batches are small, and injecting a 1-byte sentinel tail payload when
they are large. This tail payload is smaller than "GSO size", which
sidesteps the primary trigger of all fragments in a batch being
equal in length.

The end result is slightly increased payload and packet overhead, but
functional UDP GSO for all Linux 7.0-7.1.4 users, regardless of
netdev/driver.

Updates #19777

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-05-29 11:36:35 -07:00
kari-ts
7355116c05 ipn/store: make WriteState(id, nil) delete key instead of adding nil entry (#19920)
All StateStore implementations store a nil value in the cache map when WriteState is called with a nil byte slice instead of deleting the key. This causes ReadState to return (nil, nil) instead of (nil, ErrStateNotExist), since the key is still present in the map.

This breaks reset-auth in Windows, Linux, and Android, and the node can't log back in without manually editing the state file. (macOS uses a different state store)
DeleteProfile, DeleteAllProfilesForUser, setUnattendedModeAsConfigured are impacted but don't seem to break because the deleted keys are not reread.

This deletes the key from the cache instead.

Fixes tailscale/corp#42477

Signed-off-by: kari-ts <kari@tailscale.com>
2026-05-29 11:22:14 -07:00
Fran Bull
3d5102090f feature/conn25: use new pool nodeattr
We have been reading the pool config from the app nodeattr, but it is
global config, not per app, so it needs to be its own thing.

Updates tailscale/corp#39999

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-05-29 08:29:34 -07:00
Brad Fitzpatrick
412c812d76 ipn/ipnlocal: use ACME ALPN for authorized Funnel non-CertDomain domains
If a user explicitly adds a non-ts.net (not a CertDomain domain) domain
like "foo.com" to their serve config as a web target that's also an allowed
funnel domain (using raw "tailscale serve set-config"), then use the new
ALPN cert fetching (from b553969b) to get certs for that domain.

This is just plumbing; there's no new product functionality to
actually enable this easily client-side, and it also has no visible
product surface to enable it server-side.

Updates tailscale/corp#41736

Change-Id: Ie2e421ac9611bce64bba3de6a454b2d505ea0e8a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-28 13:33:45 -07:00
Tom Proctor
788a49eca5 .github/workflows: run vet on GitHub-hosted runners (#19913)
The github-ci-vm machine that runs our self-hosted CI for this repo is
only designed for the `vm` job in test.yml. That uses a different cache
dir which is causing github-ci-vm's small disk to fill up. Switch to
ubuntu 24.04 like the rest of our CI for this repo that doesn't require
anything special.

Updates tailscale/corp#40465

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2026-05-28 21:30:46 +01:00
James Tucker
524a374f01 tsnet: wait for peer in netmap before pinging in setupTwoClientTest
If we dispatch a ping too early (after a later patch removes a 250ms
blockage) then the ping may be lost due to the peers not yet knowing
about each other. The ping is retained in order to setup and ensure a
wireguard session prior to test flow.

Updates #19822

Change-Id: I6cfea28931646a9387b6ffc2654e72cd846f4e55
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-28 11:27:54 -07:00
Brad Fitzpatrick
c086992f4f cmd/tailscale/cli: add whoami subcommand
Add a "tailscale whoami" subcommand that is equivalent to running
"tailscale whois $(tailscale ip -4)" but more ergonomic. It supports
the --json flag just like whois, and shares the WhoIsResponse
rendering code with whois.

Fixes #19907

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I8f33ba7a5608bab7dffa8213303beb5f345936d3
2026-05-28 10:49:17 -07:00
Alex Chan
9d126aec34 all: remove network lock references from private method names
Updates tailscale/corp#37904

Change-Id: I312d46d958209ca3d1152d1877fb91a57c91798d
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-28 18:00:36 +01:00
Brendan Creane
8d90a6ab1e ipn/ipnlocal: add HTTP/2 Content-Type tests for serve reverse proxy (#19905)
Adds two tests exercising the HTTP/2-inbound -> plaintext HTTP/1.1 backend
path through serve's reverseProxy and through the full serveWebHandler
entry point (with a funnel serveHTTPContext).

Updates #19866

Signed-off-by: Brendan Creane <bcreane@gmail.com>
2026-05-28 09:46:36 -07:00
Alex Chan
f4a280cdbd all: update a few more references to network/tailnet lock
Updates tailscale/corp#37904

Change-Id: I746b06328e080fa2b9ff28a2d099f95645aa3d0b
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-28 16:44:16 +01:00
Alex Chan
446ae97491 ipn: improve --exit-node hostname error during startup
When parsing the `tailscale up --exit-node=ARG` argument, we try to
resolve hostnames by searching the list of peers. However, at startup,
the peer list is empty, causing hostname lookups to trivially fail with
an unhelpful "invalid value" erorr.

Improve the error message when the peer list is empty to inform the user
that hostnames cannot be resolved during startup, and advise them to use
the exit node's Tailscale IP address instead.

Also, clarify that hostnames must be peer hostnames, not arbitrary
hostnames.

Fixes #19882

Change-Id: I9390a427c2863d657cf46c5e33b43cb3c5363764
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-28 16:43:45 +01:00
dragondscv
4b8115bb2c cmd/containerboot: clamp MSS to PMTU for proxy group pods (#19686)
Single-pod ingress/egress proxies already called ClampMSSToPMTU when
setting up forwarding rules, but the proxy group (HA) code paths in
egressservices.go and ingressservices.go did not. This caused TCP
connections through proxy group pods to suffer from MSS/MTU mismatch
issues in environments where path MTU discovery is not working.

Add ClampMSSToPMTU calls in the egress sync loop (alongside the existing
EnsureSNATForDst call) and in addDNATRuleForSvc (alongside the existing
EnsureDNATRuleForSvc call), mirroring what the single-pod forwarding
rules already do.

Also add MSS clamping assertions to TestSyncIngressConfigs and track
ClampMSSToPMTU calls in FakeNetfilterRunner.

Fixes issue #19812 https://github.com/tailscale/tailscale/issues/19812.
Tracking internal ticket TSS-86326.

Signed-off-by: Jay Tung <ltung@crusoeenergy.com>
Co-authored-by: Jay Tung <ltung@crusoeenergy.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 12:57:38 +01:00
Brad Fitzpatrick
782c73bf41 cmd/containerboot: fix data race in TestContainerBoot
Parallel subtests share *ipn.Notify pointers (e.g. runningNotify).
When multiple subtests reached the same phase concurrently, they
all wrote to the shared notify's InitialStatus field without
synchronization, triggering the race detector.

Fix by shallow-copying *ipn.Notify before setting InitialStatus,
so each test iteration works on its own copy.

Updates #19380

Change-Id: I9dd40037e02146166f006f4f7c1ddcc47adba191
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-27 18:40:03 -07:00
James Tucker
25b8ed8d9e control/controlknobs,net/{batching,tstun},wgengine: add nodecaps to disable UDP & TUN GRO/GSO
Add four control-plane node attributes that let us disable UDP GSO/GRO
on the magicsock UDP socket and UDP/TCP GRO on the Tailscale TUN
device.

These complement the pre-existing TS_DEBUG_DISABLE_UDP_{GRO,GSO} and
TS_TUN_DISABLE_{UDP,TCP}_GRO envknobs. They exist so we can mitigate
upstream Linux kernel regressions on a deployed fleet without
requiring a client release, after two incidents (#13041, #19777) where
buggy kernel patches landed upstream and the fix took an excessively
long time to reach downstream distros.

Knob changes are reacted to in setNetworkMapInternal / SetNetworkMap via
a comparison against a cached "last applied" value and only an actual
transition triggers work: magicsock Rebind()+ReSTUN for UDP,
ApplyGROKnobs for TUN. The TUN side is gated by buildfeatures.HasGRO and
is one-way (wireguard-go GRO disablement is sticky); re-enabling
requires a client restart.

Updates #13041
Updates #19777

Change-Id: I802993070afa659cc06809bb0bfbb7f8a0cdb273
Signed-off-by: James Tucker <james@tailscale.com>
2026-05-27 17:10:14 -07:00
Brad Fitzpatrick
94af1b00fb cmd/testwrapper, tstest: move test sharding out of test code
Previously, sharding required tests to opt in by calling tstest.Shard,
which used a process-global counter to assign each test to a shard.
This had two problems: most tests didn't call it, so they ran on every
shard (defeating the purpose), and shard assignments were unstable
(depended on call order, so adding a test could reshuffle others).

Remove tstest.Shard and tstest.SkipOnUnshardedCI entirely. Instead,
have testwrapper implement sharding automatically for all tests: when
TS_TEST_SHARD=N/M is set, it uses "go list -json" (no compilation) to
find test source files, scans them for top-level Test/Benchmark/
Example/Fuzz function names, and filters by fnv32a(name) % M == N-1.
The filtered names are passed as an anchored -run regex to go test.

Using go list instead of "go test -list" avoids linking the test binary
twice (Go's build cache does not cache test binary linking).

Fixes #19886

Change-Id: I62ab7b3d757324d4c5fd0b5de50c1e3742681791
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-27 16:53:17 -07:00
James Scott
db60aa8eca logtail: gate "logtail started" behind TS_DEBUG_LOGTAIL envknob (#19891)
Gates the unnecessary "logtail started" message behind
the debug envknob TS_DEBUG_LOGTAIL. This is extra log spam that isn't
needed unless we are debugging.

Updates tailscale/corp#40908

Signed-off-by: James Scott <jim@tailscale.com>
2026-05-27 15:48:44 -07:00
kari-ts
1a17ec1988 net/netmon: in Android, replace system/bin/ip call with cached LinkProperties gateway (#19804)
bind() on NETLINK_ROUTE sockets does not work on Android 11+ (https://developer.android.com/identity/user-data-ids#mac-11-plus) . Since system/bin/ip uses bind(), likelyHomeRouterIPHelper() always fails on Andoroid 11+, so that GatewayAndSelfIP never caches the result, causing repeated ip process spawns on every periodic ReSTUN.

This replaces the system/bin/ip fallback with a cached gateway IP pushed from Android’s ConnectivityManager via LinkProperties.getRoutes(). This is the same patterm used by UpdateLastKnownDefaultRouteInterface for the interface name (see https://github.com/tailscale/tailscale/pull/11784/). We keep the proc/net/route path as a fallback for early startup before NetworkChangeCallback has fired.

Updates tailscale/tailscale#18622
Updates tailscale/tailscale#13352

Signed-off-by: kari-ts <kari@tailscale.com>
2026-05-27 15:42:48 -07:00
Brad Fitzpatrick
c9fb05b6f5 ipn/ipnlocal: don't dup-suppress UserProfiles on IPNBus on profile switches
Fixes #19889

Change-Id: I324a735c13772c0c79ed7392c0baa5064b34823b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-27 14:47:02 -07:00
Brad Fitzpatrick
364b952d62 cmd/containerboot: track peers from IPN bus updates, stop using netmap.NetworkMap
Some tests in another repo were broken by tailscale/tailscale#19607.
This fixes them, by finishing off the rest of the migration away from
netmap.NetworkMap on the IPN bus in containerboot.

Containerboot used to rebuild a full NetworkMap-shaped view while
reacting to IPN bus notifications. Now it insteads has its own
netmapState type (immutable) of exactly what it needs to track, and
sends those immutable values around, making cheap edits of new
immutable values when an IPN bus edit arrives.

This should make cmd/containerboot scale to much larger tailnets now too.

Fixes #19852
Fixes tailscale/corp#42347
Updates #12542

Change-Id: I88adaf061f85f677f954a764935e6654329d75a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-27 14:12:48 -07:00
Fran Bull
80dc7a8d07 feature/conn25: disallow addrs assignment overwriting.
We don't want addr assignments to be lost from the collection before
they can be returned to the IP pools, otherwise we will get orphan
addresses marked inUse in the pools that will never be returned.

Fixes tailscale/corp#39975

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-05-27 13:54:40 -07:00
Patrick O'Doherty
8501be1990 go.mod: bump dependencies to resolve govulncheck warnings (#19884)
Bump the following:
  go get -u github.com/moby/spdystream@v0.5.1
  go get -u golang.org/x/crypto@v0.52.0
  go get -u golang.org/x/net@v0.55.0

to resolve open govulncheck warnings.

Updates #cleanup

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2026-05-27 12:24:59 -07:00
James Tucker
dea49bb4da net/batching: add envknobs to disable UDP GRO & GSO
It is sometimes useful when diagnosing subtle and specific performance
problems to rule out GRO/GSO independently and/or toggle them to
influence packet pacing.

Updates #17835
Updates tailscale/corp#31164

Signed-off-by: James Tucker <james@tailscale.com>
2026-05-27 12:05:00 -07:00
James Tucker
d1912167dc feature/taildrop: replace outgoing-file progress channel with synchronous reporter
serveFilePut tracked outgoing-file progress through an unbuffered
progressUpdates channel whose close was owned by the request goroutine
while writers were spread across manifest parsing, the
progresstracking.Reader callback, singleFilePut failure paths, and the
success path. That writer-closes mismatch made the
send-on-closed-channel panic effectively unfixable in place.

Replace it with a request-scoped outgoingProgress reporter. Transfer
code reports state by method call; the reporter coalesces hot-path
updates and is flushed once via defer in serveFilePut. With no
producer channel to close, the panic is structurally impossible.

Fixes #19115
Fixes #19817

Change-Id: I8f00d982d2c79880dfc1f8104c5eed06e94b5a6c
Signed-off-by: James Tucker <james@tailscale.com>
2026-05-27 12:00:34 -07:00