Commit Graph

2773 Commits

Author SHA1 Message Date
Brendan Creane
c48f953840 cmd/tailscale/cli, ipn/conffile: accept legacy serve config in set-config (#20056)
tailscale serve set-config now also accepts the legacy raw ipn.ServeConfig
format (as emitted by `tailscale serve status --json` and consumed via
TS_SERVE_CONFIG, which has no "version" field), so the common
serve-status-edit-set workflow stops failing. Only the services-oriented
content is applied; any node-level fields are skipped with a warning to
stderr pointing users at get-config to migrate.

Fixes tailscale/corp#39793

Signed-off-by: Brendan Creane <bcreane@gmail.com>
2026-06-12 18:52:17 -07:00
Alex Valiushko
7d18a06292 go.mod,wgengine/magicsock: pull wireguard-go fix for roaming endpoints (#20118)
Bumps wireguard-go pin to include the roaming endpoints fix, and
two internal enhancements.

Pulls stock wireguard-go for non-tailscale simulation in tests,
to use its endpoint discovery mechanism.

Updates #20082

Change-Id: I2ff282cb7fe4ab099ce5e780a1d40ae86a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
2026-06-12 10:50:35 -07:00
James Tucker
b6713e9bc8 cmd/tailscale/cli: check kubeconfig writability instead of refusing $KUBECONFIG (#20009)
When running under the macOS sandbox, "tailscale configure kubeconfig"
refused outright whenever $KUBECONFIG was set, assuming the path would
not be writable. Yet when $KUBECONFIG was unset it happily relied on the
home-relative-path entitlement to write to ~/.kube/config, so the two
paths made inconsistent assumptions about what the sandbox can reach.

Resolve the kubeconfig path first, then check whether the target file
(or the nearest existing parent directory) is actually writable. Only
report an error if it is not, and include macOS sandbox guidance in that
error since a path outside the home directory is the likely cause. This
lets a $KUBECONFIG that does point under the home directory work, rather
than being rejected unconditionally.

Fixes #20007

Change-Id: I9880363c38b981efaed7e97367851ddacf647be1

Signed-off-by: James Tucker <james@tailscale.com>
2026-06-12 10:48:07 +01:00
David Bond
7fb6751ddd cmd/k8s-operator: rework [unexpected] log lines (#20065)
* cmd/k8s-operator: rework [unexpected] log lines

This commit modifies several places in the operator logs where we
prepend `[unexpected]` to instead use an appropriate logging level.

The `[unexpected]` prefix is intended to be used when the program
violates some internal invariant (or for example, a database has
become corrupted). Many of these cases were simply log lines that
then fell back to a default value/behaviour. These have been releveled
to warnings.

Some of these log lines also seemed extraneous, for example service
reconcilers logging when there is no proxy group annotation. As
far as I can tell we've never had any predicates for limiting the
services reconciled to ones with that annotation, so they can just
be removed to reduce log spam.

Fixes: #cleanup

Signed-off-by: David Bond <davidsbond93@gmail.com>

* Update cmd/k8s-operator/egress-services-readiness.go

Co-authored-by: BeckyPauley <64131207+BeckyPauley@users.noreply.github.com>
Signed-off-by: David Bond <davidsbond@users.noreply.github.com>

* Update cmd/k8s-operator/operator.go

Co-authored-by: BeckyPauley <64131207+BeckyPauley@users.noreply.github.com>
Signed-off-by: David Bond <davidsbond@users.noreply.github.com>

---------

Signed-off-by: David Bond <davidsbond93@gmail.com>
Signed-off-by: David Bond <davidsbond@users.noreply.github.com>
Co-authored-by: BeckyPauley <64131207+BeckyPauley@users.noreply.github.com>
2026-06-11 14:48:48 +01:00
Örjan Fors
be44e66e99 cmd/tailscale: stop defaulting ssh username to local username (#19358)
Prevent tailscale ssh from automatically adding a username when
connecting to a server; only forward one if provided. The previous
behaviour prevented username overrides in the ssh configuration, since
the provided username takes precedence over the configured one.

This also keeps tailscale ssh a thin wrapper around ssh by not
adding any extra arguments unless required.

Fixes #19357

Signed-off-by: Örjan Fors <o@42mm.org>
2026-06-11 12:11:37 +01:00
Brad Fitzpatrick
6ab5d91071 go.mod: bump some deps to match corp
Updates tailscale/corp#43243
Updates #20067

Change-Id: I27e19f34e2216f3ac1a4e2a6b38c0ac473b8c7ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-10 21:29:22 -05:00
David Bond
e4ea65d32d cmd/k8s-operator: workload identity support for multi-tailnet (#20016)
This commit modifies the reconciler for the `Tailnet` custom resource
to allow referenced secrets to specify an `audience` field. If a
referenced secret contains both an `audience` and `client_id` we assume
the user's intention is to use workload identity.

In that case, we configure the tailscale API client to authenticate
using the Kubernetes token request API against the operator's service
account. This requires the operator to be aware of its own service
account name.

A small change has also been made to the messages added to the `Tailnet`
CRD's status field in the event that it is missing scopes to make it
clearer that certain scopes may not be applied.

Closes: #19090
Updates: #19471

Signed-off-by: David Bond <davidsbond93@gmail.com>
2026-06-10 10:22:19 +01:00
Adriano Sela Aviles
913df7e6ea cmd/tailscale/cli: unit tests for tailscale ip
Updates #20035

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-06-09 11:43:24 -07:00
Anthony
819f3ba7c1 cmd/k8s-operator: allow custom annotations on deployment (#17143)
Fixes #17188

Signed-off-by: Anthony SCHWARTZ <antho.schwartz@gmail.com>
Signed-off-by: Anthony SCHWARTZ <anthony.schwartz@ext.ec.europa.eu>
2026-06-09 15:25:40 +01:00
Alex Chan
65a117184b all: rename NetworkLock functions/types to TailnetLock
To avoid breaking downstream code, add deprecated aliases for all the
old names.
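
A deprecated alias in Go usually looks like the sketch below. The identifiers here are made up for illustration and are not the names touched by this commit.

```go
package main

import "fmt"

// TailnetLockStatus is a hypothetical type standing in for a renamed one.
type TailnetLockStatus struct {
	Enabled bool
}

// NetworkLockStatus is the old name, kept so downstream code still compiles.
//
// Deprecated: use TailnetLockStatus.
type NetworkLockStatus = TailnetLockStatus

// TailnetLockEnabled reports whether the lock is enabled.
func TailnetLockEnabled(s TailnetLockStatus) bool { return s.Enabled }

// NetworkLockEnabled is the old name of TailnetLockEnabled.
//
// Deprecated: use TailnetLockEnabled.
func NetworkLockEnabled(s NetworkLockStatus) bool { return TailnetLockEnabled(s) }

func main() {
	old := NetworkLockStatus{Enabled: true}
	// Because the alias names the same type, old and new APIs interoperate.
	fmt.Println(NetworkLockEnabled(old), TailnetLockEnabled(old))
}
```

A type alias (`=`) rather than a new named type keeps values assignable in both directions, and the `Deprecated:` paragraph is what tooling such as gopls and staticcheck picks up.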

Updates tailscale/corp#37904

Change-Id: I86d0b0d7da371946440b181c665448f91c3ef8d2
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-06-08 13:14:28 +01:00
Adriano Sela Aviles
83c8440834 cmd/tailscale/cli: add service support to tailscale ip
Fixes #20035

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-06-05 18:49:17 -07:00
Nick Khyl
c0d0621417 logpolicy,tsnet: remove syspolicy dependency
tsnet depends on logpolicy, which in turn depended on util/syspolicy
because of a single LogTarget policy setting it uses.

In this commit, we replace that dependency with a feature.Hook,
which only tailscaled or its platform-specific alternatives should set.

Updates #20031

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2026-06-05 16:21:27 -05:00
Andrew Lytvynov
c07bf57eba cmd/tailscaled: only warn about unsupported attestation when enabled (#20028)
We don't need to log if the policy doesn't actually say that hardware
attestation must be enabled.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-06-05 14:00:07 -07:00
Simon Law
84ffcd2759 cmd/tailscale/cli/jsonoutput: provide examples for jsonoutput.DNS* (#19998)
This patch adds examples for unmarshalling the JSON outputs of the
following commands:

	tailscale dns query --json
	tailscale dns status --json

It also adds an example usage of `tailscale dns` to both
jsonoutput.DNSQueryResult and jsonoutput.DNSStatusResult.

Updates #13326
Updates #18750

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-05 02:02:15 -07:00
Brad Fitzpatrick
772be1b0cc gokrazy, clientupdate: add start of Gokrazy auto-updates, tests
This adds support for Gokrazy GAF (Gokrazy Archive Format) zip
auto-updates, starting to wire up Tailscale's clientupdate mechanism
to Gokrazy's update mechanism.

Currently there's just a CLI command to update from a GAF URL,
with an --unsigned flag for use in a new natlab vmtest.

Next step would be publishing unstable track GAF files on
pkgs.tailscale.com, with detached signatures, and then making the
clientupdate mechanism also download those and check signatures.

Updates #20002

Change-Id: Ib03c56f17a57f8a4638398ef83549dac4813323d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-04 11:20:14 -07:00
Simon Law
6ff761c5f8 cmd/tailscale/cli/jsonoutput: fix flag parsing for boolean values (#19996)
SchemaVersion didn’t actually parse boolean values properly, so
calling `tailscale lock status --json=false` would fail with:

	invalid boolean value "false" for -json: invalid integer value passed to --json: "false"

This patch makes SchemaVersion.Set delegate to flag.FlagSet for its
argument parsing, with accompanying tests that ensure that both
boolean and integer values are parsed properly.

It also removes the restriction that prevented the flag from appearing
multiple times in the arguments list. Now, the final flag clobbers all
previous ones, aligning this behaviour with the standard flag package.
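
As a rough illustration of a flag.Value that accepts both booleans and integers, consider the sketch below. The `schemaVersion` type is hypothetical and parses with strconv directly, whereas the commit delegates to the flag package; only the observable behaviour (boolean and integer values, last flag wins) is meant to match.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"strconv"
)

// schemaVersion is a hypothetical stand-in for the real SchemaVersion type.
type schemaVersion struct {
	IsSet   bool
	Version int
}

// String renders the zero value as "false" so help output stays boolean-like.
func (v *schemaVersion) String() string {
	if v == nil || !v.IsSet {
		return "false"
	}
	return strconv.Itoa(v.Version)
}

// Set accepts a non-negative integer version or a boolean.
func (v *schemaVersion) Set(s string) error {
	if n, err := strconv.Atoi(s); err == nil {
		if n < 0 {
			return fmt.Errorf("invalid version %d", n)
		}
		v.IsSet, v.Version = n > 0, n
		return nil
	}
	b, err := strconv.ParseBool(s)
	if err != nil {
		return fmt.Errorf("invalid boolean or integer value %q", s)
	}
	v.IsSet, v.Version = b, 0
	if b {
		v.Version = 1
	}
	return nil
}

// IsBoolFlag lets the flag package accept a bare --json with no value.
func (v *schemaVersion) IsBoolFlag() bool { return true }

// parse runs a throwaway FlagSet over args and returns the final value.
func parse(args ...string) (schemaVersion, error) {
	var v schemaVersion
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.SetOutput(io.Discard)
	fs.Var(&v, "json", "output in JSON format")
	err := fs.Parse(args)
	return v, err
}

func main() {
	for _, args := range [][]string{{"--json"}, {"--json=false"}, {"--json=2"}, {"--json=2", "--json=false"}} {
		v, err := parse(args...)
		fmt.Println(args, v.String(), err)
	}
}
```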

We also change the SchemaVersion.String output for the zero value to
"false", so that the default help message doesn’t change when we
switch other commands over from their boolean representations:

	user@host:~$ tailscale whoami --help
	FLAGS
	  --json, --json=false
	        output in JSON format (default false)

Updates #17613

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-04 10:18:47 -07:00
Simon Law
0bbaed6af4 cmd/tailscale/cli/jsonoutput: rename exported identifiers (#19994)
Since we don’t think anyone has actually imported the jsonoutput
package yet, we still have a chance to rename its fundamental types:

1. Rename the JSONSchemaVersion struct to SchemaVersion because
   it is a flag.Value that can represent any schema version.

2. Rename the JSONSchemaVersion.Value field to SchemaVersion.Version
   so the struct reads better:

	if args.json.IsSet && args.json.Version == 1 {
		// ...
	}

Updates #17613

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-04 09:55:48 -07:00
Simon Law
f05e145d7a cmd/tailscale/cli/jsonoutput: improve doc comments and add examples (#19993)
This patch:

1. Removes hardcoded mentions of a `--json` flag from the
   documentation for JSONSchemaVersion, because the type could be used
   for anything.

2. Removes `code` formatting because Go doc comments don’t support
   this syntax.

3. Fixes [links] in doc comments so they link to the types’
   online documentation.

4. Checks that JSONSchemaVersion satisfies the flag.Value interface.

5. Adds documentation examples for using both JSONSchemaVersion and
   ResponseEnvelope.

Updates #17613
Updates #18750

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-04 09:23:49 -07:00
Brad Fitzpatrick
dfb605db4a cmd/ssh-auth-none-demo: update SSH demo a bit
Per chat with an SSH client author who wanted a URL in the output.

And then make it more clear what parts are banners.

Updates #cleanup

Change-Id: If5033ad9dc0dba3d833f24ea39e117a455010492
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-03 19:36:19 -07:00
BeckyPauley
98f1ac0880 cmd/k8s-operator, net/netutil: revert 4via6 changes (#19990)
Reverts "support 4via6 in egress proxy and connector" (#19863)

Updates #19334

Signed-off-by: Becky Pauley <becky@tailscale.com>
2026-06-03 20:20:36 +01:00
Kabir
01c59d84a0 cmd/tailscale/cli: show services in serve status (#19600)
The "tailscale serve status" human-readable output previously showed
only serve-based proxies, not services.

Fixes https://github.com/tailscale/corp/issues/34163

Change-Id: Ie48858a8d8afd7184979d0fe2ab21ebd6fd0d4a0

Signed-off-by: Kabir Sikand <kabir@tailscale.com>
2026-06-02 17:09:54 -04:00
Simon Law
b47dd932f3 cmd/tailscale/cli: use tstime constant for tailscale routecheck (#19957)
Updates #19928

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 17:42:18 -07:00
ferrumclaudepilgrim
3f70abdc6f cmd/tailscaled, version/distro: default to userspace-networking on Crostini
cros-garcon NULL-derefs on cold-boot netlink enumeration when
tailscale0 is present, preventing the Crostini container and
ChromeOS Terminal from starting cleanly. This is an upstream
ChromiumOS bug in cros-garcon; tailscaled can work around it
by defaulting to userspace-networking mode on Crostini.

Tailscale SSH continues to work via tailscaled's netstack.
Users can override with --tun=tailscale0 on ChromeOS builds
where cros-garcon is fixed.

Crostini is detected via /opt/google/cros-containers/bin/garcon,
which is present in every Crostini penguin container.

ssh/tailssh extends the existing Debian default-PATH case to
cover Crostini, since Crostini is Debian-based and benefits
from the same SSH PATH defaults.

RELNOTE: Crostini now defaults to userspace-networking.

Fixes #19488
Updates #12090

Signed-off-by: ferrumclaudepilgrim <ferrumclaudepilgrim@users.noreply.github.com>
2026-06-01 17:40:07 -07:00
Brad Fitzpatrick
a6ab7efa4f ipn/ipnlocal, cmd/tailscale/cli: auto-renew TLS certs and warn while pending
The Tailscale daemon only refreshed TLS certs as a side effect of inbound
TLS handshakes or "tailscale cert" CLI calls. A node that doesn't see
inbound traffic during the renewal window silently rolls past expiry.

Add a once-per-hour background loop on LocalBackend that enumerates Serve
and Funnel HTTPS hostnames (filtered against the netmap's CertDomains so
we don't poke ACME for other nodes' service hostnames) and calls the
existing GetCertPEM path. The renewal decision (ARI window, then 2/3
expiry fallback) is unchanged; the loop just guarantees it runs.

For visibility during initial issuance or restart with a long-expired
cached cert, add a "tls-cert-pending" health Warnable that's set while
ACME is in flight and no usable cached cert exists. Async renewal of a
still-valid cert intentionally doesn't fire it. And then make the CLI "cert"
subcommand print out a warning if it's blocking due to a cert fetch
in flight, using that health info.

Fixes #19911
Fixes #19912

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I144e46c40e957b2e879587decace32a523a6eade
2026-06-01 16:31:54 -07:00
Simon Law
92bfda580c cmd/tailscale/cli: fix time in tailscale routecheck (#19956)
When running `tailscale netcheck`, the reported timestamp used to be
in UTC and formatted according to RFC 3339 with a `T` to separate the
date from the time:

	sfllaw@h2co3:~$ tailscale netcheck | head -n3

	Report:
		* Time: 2026-06-01T21:12:32.252620138Z

This is machine-readable time leaking out to the user interface. Times
in normal commands are formatted for humans to read:

	sfllaw@h2co3:~$ date
	Mon 01 Jun 2026 02:39:14 PM PDT
	sfllaw@h2co3:~$ journalctl -t tailscaled | tail -n1
	Jun 01 14:35:21 h2co3 tailscaled[3328921]: wgengine: sending TSMP disco key advertisement to 100.90.144.102
	sfllaw@h2co3:~$ timedatectl show
	Timezone=America/Los_Angeles
	LocalRTC=no
	CanNTP=yes
	NTP=yes
	NTPSynchronized=yes
	TimeUSec=Mon 2026-06-01 14:38:32 PDT
	RTCTimeUSec=Mon 2026-06-01 14:38:32 PDT
	sfllaw@h2co3:~$ uptime --since
	2026-05-15 07:37:45

This PR makes the times printed by the CLI commands consistent:

- For `tailscale routecheck`, it now prints local time as
  `2026-05-15 07:37:45-07:00`.
- For `netlogfmt`, it has always printed local time with a space,
  but now includes the time zone.
- All machine-readable outputs continue to be standard RFC 3339 in
  UTC, i.e. `--format=json`.

As part of a general cleanup, this PR also adds standard common
time.Format layouts as tstime constants.

Fixes #19928

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 16:12:08 -07:00
Achille Roussel
7f3bbc9865 net/netutil: add NewDefaultTransport to avoid http.DefaultTransport panics
Several packages built their HTTP transports with

    http.DefaultTransport.(*http.Transport).Clone()

The standard library only documents http.DefaultTransport as an
http.RoundTripper, so an application is free to replace it with a
RoundTripper that is not a *http.Transport (e.g. an instrumented or
tracing wrapper). When such an application embeds tsnet.Server, the
unchecked type assertion panics as soon as tsnet brings up its control
connection, DNS bootstrap, or log uploader.

Add netutil.NewDefaultTransport, which returns a clone of the global
when it is still the standard *http.Transport (preserving existing
behavior) and otherwise returns a fresh transport mirroring the stdlib
defaults. Route every clone site through it.

Updates #19937

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Achille Roussel <achille.roussel@gmail.com>
2026-06-01 12:28:36 -07:00
Brad Fitzpatrick
0d92a69259 cmd/tailscale/cli: add "tailscale get" command
This adds @alexwlchan's proposed "tailscale get" command that reads
current preference values, complementing "tailscale set". It uses the
same flag names as set.

  tailscale get              # show all settings as a table
  tailscale get all          # same
  tailscale get accept-dns   # show a single value
  tailscale get --json       # output as JSON object
  tailscale get --set-flags  # output as tailscale set argv

Fixes #11389
Fixes tailscale/corp#38702

Change-Id: Ie366f27f11ccc56c76fff9a94ed8a9de9c835bd0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-01 11:59:33 -07:00
Simon Law
2d6844c565 cmd/tailscale/cli: add routecheck command (#19641)
Introduce a new `tailscale routecheck` command which prints a report
of high-availability routers that are reachable.

This command rhymes with the `tailscale netcheck` command, but
instead of reporting on local network conditions, `routecheck` reports
on remote connectivity.

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 11:50:24 -07:00
Brad Fitzpatrick
d961e44856 cmd/testwrapper: auto-retry every failing test
Previously, testwrapper only retried tests explicitly annotated with
flakytest.Mark. Authors don't pre-emptively mark tests that haven't
flaked yet, so the first flake of a brand-new test failed CI even
when a re-run would have passed.

testwrapper now retries every failing test within a per-test wall-clock
budget (default: 5 minute per-attempt timeout capped at 1.5x the first
failure duration, 10 minute total). A test that fails and then passes
on retry is reported as flaky; a test that never passes within the
budget remains a real failure (exit non-zero).

For flakeapp's existing log scraping, the wire format is preserved:
the "flakytest failures JSON:" line is now emitted only for tests
that ultimately flaked (passed on retry). Unmarked tests get a fake
issue URL of the form https://github.com/{owner}/{repo}/issues/UNKNOWN
where owner/repo is detected from GITHUB_REPOSITORY, the local git
remote, or falls back to tailscale/tailscale. A new "permanent test
failures JSON:" line is emitted for tests that never passed; flakeapp
ignores it for now (a follow-up can teach it to record real failures
separately).

flakytest.Mark stays as an opt-in API: still useful for tracking a
known-flaky test against a real issue and for TS_SKIP_FLAKY_TESTS.

Updates tailscale/corp#38960

Change-Id: I56dfc9b023486d239f60793a53e9690578ce8017
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-01 11:07:56 -07:00
Simon Law
2ee9eacb94 client/local,ipn/localapi: add /localapi/v0/routecheck endpoint (#19640)
In order to support a `tailscale routecheck` command, we introduce the
`/localapi/v0/routecheck` endpoint to the local API. This endpoint
returns the most recent report collected by the routecheck client.
If `force=true` is an argument in the query string, then this endpoint
will actively probe before returning the report.

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 11:06:14 -07:00
Simon Law
28801674a6 net/routecheck: introduce new package for checking peer reachability (#19639)
The routecheck package parallels the netcheck package, where the
former checks routes and routers while the latter checks networks.
Like netcheck, it compiles reports for other systems to consume.

Historically, the client has never known whether a peer is actually
reachable. Most of the time this doesn’t matter, since the client will
want to establish a WireGuard tunnel to any given destination.
However, if the client needs to choose between two or more nodes,
then it should try to choose a node that it can reach.

Suggested exit nodes are one such example, where the client filters
out any nodes that aren’t connected to the control plane. Sometimes an
exit node will get disconnected from the control plane: when the
network between the two is unreliable or when the exit node is too
busy to keep its control connection alive. In these cases, Control
disables the Node.Online flag for the exit node and broadcasts this
across the tailnet. Arguably, the client should never have relied on
this flag, since it only makes sense in the admin console.

This patch implements an initial routecheck client that can probe
every node that your client knows about. You should not ping scan your
visible tailnet; this method is for debugging only.

This patch also introduces a new OnNetMapToggle hook, which fires when
the netmap transitions from nil to non-nil, or vice versa. This
happens either when the client receives its first MapResponse after
connecting to the control plane, or when it clears the netmap while it
is disconnecting. Routecheck uses this to wait for a valid netmap
so it knows which peers to probe.
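
The toggle semantics (fire only on nil to non-nil transitions or the reverse) can be sketched like this. The `tracker` and `netMap` types are hypothetical and unrelated to the real hook's signature.

```go
package main

import "fmt"

// netMap is a hypothetical placeholder for a network map.
type netMap struct{ peers []string }

// tracker remembers the current netmap and calls onToggle only when presence
// changes, not on every update.
type tracker struct {
	cur      *netMap
	onToggle func(present bool)
}

// set stores nm and fires the hook if the netmap appeared or disappeared.
func (t *tracker) set(nm *netMap) {
	was, is := t.cur != nil, nm != nil
	t.cur = nm
	if was != is && t.onToggle != nil {
		t.onToggle(is)
	}
}

func main() {
	t := &tracker{onToggle: func(present bool) {
		fmt.Println("netmap present:", present)
	}}
	t.set(&netMap{peers: []string{"a"}})      // fires: present
	t.set(&netMap{peers: []string{"a", "b"}}) // no event, still present
	t.set(nil)                                // fires: absent
}
```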

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-01 10:33:08 -07:00
Brad Fitzpatrick
c086992f4f cmd/tailscale/cli: add whoami subcommand
Add a "tailscale whoami" subcommand that is equivalent to running
"tailscale whois $(tailscale ip -4)" but more ergonomic. It supports
the --json flag just like whois, and shares the WhoIsResponse
rendering code with whois.

Fixes #19907

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I8f33ba7a5608bab7dffa8213303beb5f345936d3
2026-05-28 10:49:17 -07:00
Alex Chan
9d126aec34 all: remove network lock references from private method names
Updates tailscale/corp#37904

Change-Id: I312d46d958209ca3d1152d1877fb91a57c91798d
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-28 18:00:36 +01:00
Alex Chan
446ae97491 ipn: improve --exit-node hostname error during startup
When parsing the `tailscale up --exit-node=ARG` argument, we try to
resolve hostnames by searching the list of peers. However, at startup,
the peer list is empty, causing hostname lookups to trivially fail with
an unhelpful "invalid value" error.

Improve the error message when the peer list is empty to inform the user
that hostnames cannot be resolved during startup, and advise them to use
the exit node's Tailscale IP address instead.

Also, clarify that hostnames must be peer hostnames, not arbitrary
hostnames.

Fixes #19882

Change-Id: I9390a427c2863d657cf46c5e33b43cb3c5363764
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-28 16:43:45 +01:00
dragondscv
4b8115bb2c cmd/containerboot: clamp MSS to PMTU for proxy group pods (#19686)
Single-pod ingress/egress proxies already called ClampMSSToPMTU when
setting up forwarding rules, but the proxy group (HA) code paths in
egressservices.go and ingressservices.go did not. This caused TCP
connections through proxy group pods to suffer from MSS/MTU mismatch
issues in environments where path MTU discovery is not working.

Add ClampMSSToPMTU calls in the egress sync loop (alongside the existing
EnsureSNATForDst call) and in addDNATRuleForSvc (alongside the existing
EnsureDNATRuleForSvc call), mirroring what the single-pod forwarding
rules already do.

Also add MSS clamping assertions to TestSyncIngressConfigs and track
ClampMSSToPMTU calls in FakeNetfilterRunner.

Fixes issue #19812 https://github.com/tailscale/tailscale/issues/19812.
Tracking internal ticket TSS-86326.

Signed-off-by: Jay Tung <ltung@crusoeenergy.com>
Co-authored-by: Jay Tung <ltung@crusoeenergy.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-28 12:57:38 +01:00
Brad Fitzpatrick
782c73bf41 cmd/containerboot: fix data race in TestContainerBoot
Parallel subtests share *ipn.Notify pointers (e.g. runningNotify).
When multiple subtests reached the same phase concurrently, they
all wrote to the shared notify's InitialStatus field without
synchronization, triggering the race detector.

Fix by shallow-copying *ipn.Notify before setting InitialStatus,
so each test iteration works on its own copy.
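
A small sketch of the shallow-copy pattern follows, using a hypothetical `notify` struct rather than the real ipn.Notify.

```go
package main

import (
	"fmt"
	"sync"
)

// notify is a hypothetical stand-in for a notification shared between tests.
type notify struct {
	State         string
	InitialStatus string
}

// withInitialStatus copies n by value and sets the field on the copy, so the
// shared original is never written to.
func withInitialStatus(n *notify, status string) *notify {
	n2 := *n // shallow copy: fields are copied, pointees would still be shared
	n2.InitialStatus = status
	return &n2
}

func main() {
	shared := &notify{State: "Running"}
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			n := withInitialStatus(shared, fmt.Sprintf("subtest-%d", i))
			_ = n // each goroutine works on its own copy
		}(i)
	}
	wg.Wait()
	fmt.Printf("shared untouched: %q\n", shared.InitialStatus)
}
```

A shallow copy is sufficient here only because the written field lives directly in the struct; anything reached through a shared pointer would still need its own copy or a lock.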

Updates #19380

Change-Id: I9dd40037e02146166f006f4f7c1ddcc47adba191
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-27 18:40:03 -07:00
Brad Fitzpatrick
94af1b00fb cmd/testwrapper, tstest: move test sharding out of test code
Previously, sharding required tests to opt in by calling tstest.Shard,
which used a process-global counter to assign each test to a shard.
This had two problems: most tests didn't call it, so they ran on every
shard (defeating the purpose), and shard assignments were unstable
(depended on call order, so adding a test could reshuffle others).

Remove tstest.Shard and tstest.SkipOnUnshardedCI entirely. Instead,
have testwrapper implement sharding automatically for all tests: when
TS_TEST_SHARD=N/M is set, it uses "go list -json" (no compilation) to
find test source files, scans them for top-level Test/Benchmark/
Example/Fuzz function names, and filters by fnv32a(name) % M == N-1.
The filtered names are passed as an anchored -run regex to go test.

Using go list instead of "go test -list" avoids linking the test binary
twice (Go's build cache does not cache test binary linking).

Fixes #19886

Change-Id: I62ab7b3d757324d4c5fd0b5de50c1e3742681791
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-27 16:53:17 -07:00
Brad Fitzpatrick
364b952d62 cmd/containerboot: track peers from IPN bus updates, stop using netmap.NetworkMap
Some tests in another repo were broken by tailscale/tailscale#19607.
This fixes them, by finishing off the rest of the migration away from
netmap.NetworkMap on the IPN bus in containerboot.

Containerboot used to rebuild a full NetworkMap-shaped view while
reacting to IPN bus notifications. Now it instead has its own
netmapState type (immutable) of exactly what it needs to track, and
sends those immutable values around, making cheap edits of new
immutable values when an IPN bus edit arrives.

This should make cmd/containerboot scale to much larger tailnets now too.

Fixes #19852
Fixes tailscale/corp#42347
Updates #12542

Change-Id: I88adaf061f85f677f954a764935e6654329d75a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-27 14:12:48 -07:00
Patrick O'Doherty
8501be1990 go.mod: bump dependencies to resolve govulncheck warnings (#19884)
Bump the following:
  go get -u github.com/moby/spdystream@v0.5.1
  go get -u golang.org/x/crypto@v0.52.0
  go get -u golang.org/x/net@v0.55.0

to resolve open govulncheck warnings.

Updates #cleanup

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2026-05-27 12:24:59 -07:00
Jordan Whited
4aef023765 cmd/tailscaled,types/logger: remove TS_DEBUG_MEMORY and associated logger
Commit e5a8cf3b1 added feature/runtimemetrics, which emits heap bytes
and total process memory as clientmetrics when the
NodeAttrEmitRuntimeMetrics capability is set. That subsumes the job of
the TS_DEBUG_MEMORY envknob, whose only effect is to prefix every log
line with Go heap+stack and Maxrss via logger.RusagePrefixLog.

Updates tailscale/corp#39434

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-05-27 09:09:05 -07:00
Artem Leshchev
5652b6c9c0 cmd/k8s-operator: fix token exchange for identity federation (#19845)
tailscale-client-go-v2 natively supports identity federation authentication,
and #19010 started using the required authentication provider, but the manual
token exchange was never removed. As a result, we exchanged the JWT for an auth
token and then tried to use that auth token for a second exchange.
This commit removes the legacy mechanism, fully relying on
tailscale-client-go-v2 to handle authentication.

Fixes #19844

Signed-off-by: Artem Leshchev <matshch@avride.ai>
2026-05-27 16:45:07 +01:00
Jason Dillingham
0e2b3f31af cmd/k8s-operator: stabilize StaticEndpoints order in ProxyGroup reconciles (#19755)
findStaticEndpoints built its return slice by iterating nodes.Items in
the order returned by r.List, which is not guaranteed to be stable
across calls. When the resulting set of addresses already matched the
existing config Secret, the slice could still permute between
reconciles, making the marshalled config Secret differ byte-for-byte.
That tripped the DeepEqual check on the config Secret, which rewrote
the Secret, which fired a watch event, which re-enqueued the
ProxyGroup, looping forever.

Detect this case and return the existing currAddrs slice unchanged
when the resulting set is the same, preserving the "use the currently
used IPs first" intent without spurious writes.

Fixes #19700

Signed-off-by: Jason Dillingham <jasonmdillingham@gmail.com>
2026-05-27 14:28:04 +01:00
Erisa A
e2a0d45418 cmd/tailscale/cli: fix time parsing in debug daemon-logs (#19875)
Fixes #19874

Signed-off-by: Erisa A <erisa@tailscale.com>
2026-05-27 12:30:28 +01:00
BeckyPauley
0ed6da2826 cmd/k8s-operator, net/netutil: support 4via6 in egress proxy and connector (#19863)
Add support for configuring egress to destinations reachable via 4via6
subnet routes. This change affects standalone egress proxy only- egress
ProxyGroup needs IPv6 support before being able to support 4via6. Egress may
be configured using either the synthesized 4via6 address or the MagicDNS
name (in the form
<IPv4-address-with-hyphens-instead-of-dots>-via-<siteid>[.*]).

Also update the Connector to validate and advertise 4via6 subnet routes.
Export net/netutil.ValidateViaPrefix so it can be reused by the Connector
validation logic.

Updates #19334

Signed-off-by: Becky Pauley <becky@tailscale.com>
2026-05-27 10:54:35 +01:00
Jordan Whited
e5a8cf3b18 control/controlknobs,feature/*,ipn/ipnlocal,tailcfg: add runtimemetrics
Emit runtime metrics as clientmetrics when the
NodeAttrEmitRuntimeMetrics NodeCapability is present.

We start small with just 2 metrics: heap bytes and total process memory.

Updates tailscale/corp#39434

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-05-26 16:02:01 -07:00
Simon Law
7dabebc691 net/traffic: switch rendezvous hashing from SHA256 to FNV-1a (#19821)
In PR tailscale/corp#30448, we originally decided to break ties using
SHA256 for our rendezvous hashing algorithm. Now that we’ve had some
experience with it, we think that FNV-1a is a better choice. It
distributes bits evenly, it’s much faster, and it doesn’t need to be
cryptographically secure. The FNV designers recommend FNV-1a over the
deprecated FNV-1.
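
For readers unfamiliar with rendezvous (highest-random-weight) hashing, a generic sketch using FNV-1a follows. It is not the net/traffic implementation: the 64-bit width, the key/node separator, and the `score` and `pick` names are all assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score hashes the (key, node) pair with 64-bit FNV-1a.
func score(key, node string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(key))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") differ
	h.Write([]byte(node))
	return h.Sum64()
}

// pick returns the node with the highest score for key. Equal scores are
// resolved by name so the result never depends on input order.
func pick(key string, nodes []string) string {
	var best string
	var bestScore uint64
	for i, n := range nodes {
		s := score(key, n)
		if i == 0 || s > bestScore || (s == bestScore && n < best) {
			best, bestScore = n, s
		}
	}
	return best
}

func main() {
	nodes := []string{"exit-a", "exit-b", "exit-c"}
	fmt.Println(pick("client-1", nodes))
	fmt.Println(pick("client-2", nodes))
}
```

The useful property is stability: removing a node that was not the winner leaves every existing pick unchanged.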

This PR makes the switch and updates the related tests, since changing
the algorithm changes which stable pick gets selected. As of 2026-05,
this is the best time to make this change, since there are almost no
clients in the wild with traffic steering enabled.

Updates #17366
Updates tailscale/corp#29964
Updates tailscale/corp#29966
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-05-21 10:11:59 -07:00
Brad Fitzpatrick
aa5da2e5f2 ipn/ipnlocal, control/controlclient: process node adds/removes in constant time
For large tailnets (~50k+ nodes) with frequent peer churn (ephemeral
GitHub Actions workers etc.), tailscaled used to rebuild the full
netmap and fan it out on the IPN bus on every MapResponse that
added or removed a peer. There were two O(N) costs per delta: the
full netmap rebuild + every Notify.NetMap encode to every bus watcher.

This change tackles both:

  1. Plumb O(1) peer add/remove through the delta path. PeersChanged
     and PeersRemoved no longer prevent the delta happy path; instead,
     they mutate the per-node-backend peer map in place.

  2. Restrict ipn.Notify.NetMap emission to the platforms whose host
     GUIs still depend on it (Windows, macOS, iOS) and migrate
     in-tree consumers off it everywhere else:

     - Migrate reactive consumers (containerboot, kube agents,
       sniproxy, tsconsensus, etc.) off Notify.NetMap to the
       previously-added Notify.SelfChange signal so they no longer
       have to subscribe to the full netmap.
     - Add ipn.NotifyNoNetMap so GUI clients on "legacy-emit" platforms
       that have already migrated can opt out of the per-watcher
       NetMap encode.
     - Gate Notify.NetMap emission on the producer side by a compile-
       time GOOS check, so the supporting code is dead-code-eliminated
       on Linux and other geese where no GUI consumer needs it.
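
The two ideas above can be sketched roughly as follows. This is a minimal
sketch under stated assumptions, not the ipnlocal implementation: the
NodeID, Peer, peerMap, applyDelta, and legacyEmitNetMap names are
hypothetical, and the GOOS values are assumed from the platform list.

```go
package main

import (
	"fmt"
	"runtime"
)

// NodeID and Peer are hypothetical stand-ins for the real node types.
type NodeID int64

type Peer struct {
	ID   NodeID
	Name string
}

// legacyEmitNetMap is a compile-time constant because runtime.GOOS is a
// constant, so code guarded by it can be eliminated on other platforms.
const legacyEmitNetMap = runtime.GOOS == "windows" ||
	runtime.GOOS == "darwin" ||
	runtime.GOOS == "ios"

// peerMap is mutated in place; the cost of a delta depends on the size
// of the delta, not on the number of peers already in the map.
type peerMap map[NodeID]Peer

func (m peerMap) applyDelta(changed []Peer, removed []NodeID) {
	for _, p := range changed {
		m[p.ID] = p
	}
	for _, id := range removed {
		delete(m, id)
	}
}

func main() {
	m := peerMap{1: {ID: 1, Name: "a"}, 2: {ID: 2, Name: "b"}}
	m.applyDelta([]Peer{{ID: 3, Name: "c"}}, []NodeID{1})
	fmt.Println(len(m), "peers after delta")
	if legacyEmitNetMap {
		fmt.Println("this platform would still encode the full netmap")
	}
}
```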

Re-running BenchmarkGiantTailnet from tstest/largetailnet (added,
along with baseline numbers on unmodified main, in ad5436af0d) shows
that the per-delta cost (one peer add+remove pair) is now ~O(1)
regardless of tailnet size N:

    N         no-watcher (ms/op)            bus-watcher (ms/op)
              before    now     factor      before    now     factor
     10000        32   0.11       300x         166   0.13      1300x
     50000       222   0.11      2000x         865   0.13      6700x
    100000       504   0.12      4100x        1765   0.13     13400x
    250000      1551   0.12     12500x        4696   0.15     32400x

Updates #12542

Change-Id: I94e34b37331d1a8ec74c299deffadf4d061fda9e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-21 09:26:19 -07:00
Simon Law
7ebca58042 net/traffic,ipn/ipnlocal: extract traffic steering utilities (#19682)
The traffic package contains helpers for evaluating traffic steering
scores and picking appropriate nodes. These were extracted from
ipnlocal.suggestExitNodeUsingTrafficSteering so they can be reused by
the new routecheck package to probe exit nodes in priority order.

Updates #17366
Updates tailscale/corp#33033

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-05-21 08:28:27 -07:00
Aria Stewart
61277e3ad4 Construct IPv6 ingress URLs correctly
Fixes #19338

Signed-off-by: Aria Stewart <aredridel@dinhe.net>
2026-05-20 17:21:35 -07:00
Brad Fitzpatrick
04ae61fe4b tstest/integration/jswasmtest: add headless-Chromium tests for @tailscale/connect
Add Go tests that drive a real headless Chromium (via chromedp) against
the built cmd/tsconnect/pkg/ artifact and verify the @tailscale/connect
public API surface end-to-end. The package has not been republished in
three years, in part because no test exercises the produced artifact at
runtime — only tsc --noEmit and a Go build run in CI.

TestCreateIPN loads pkg.js into the browser, calls createIPN with a junk
auth key, and asserts that pkg.createIPN / pkg.runSSHSession are
functions and that createIPN() returns an IPN with the documented
run/login/logout/ssh/fetch methods. No control-plane traffic.

TestFetchTailnetPeer stands up a full local tailnet (testcontrol +
DERP + a tsnet.Server peer) and verifies that the browser-side WASM
client can join over WebSocket-noise to the same control, connect to
DERP over WSS, and then ipn.fetch() an HTTP service hosted on the tsnet
peer through the tailnet. The test asserts the response body matches a
known string. Browser state transitions are logged: NoState -> NeedsLogin
-> Starting -> Running.

Tests are opt-in via --run-headless-browser-tests (matching the existing
--run-vm-tests pattern in tstest/natlab/vmtest) so they never fire in
casual `go test ./...` runs. When the flag is set, a test is skipped if
cmd/tsconnect/pkg/ has not been built, and fails with t.Error if no
chromium binary is found on $PATH (honoring $CHROME_BIN as an override).
findChromium also falls back to /Applications/Google Chrome.app and
/Applications/Chromium.app on darwin, since macOS Chrome's executable
lives inside an .app bundle and is not on $PATH by default. The
.github/workflows/test.yml wasm job is extended to install
google-chrome-stable and run the tests with the flag after build-pkg.

To prevent silently testing a stale pkg/main.wasm (built from an older
checkout than the rest of the test invocation), build-pkg now writes
pkg/build-info.json recording the sha256 of the raw (pre-wasm-opt)
go-build output. The test does its own `go build` of
cmd/tsconnect/wasm with the same -tags/-trimpath/-ldflags (factored
into a new cmd/tsconnect/wasmbuild package shared by both call sites)
and t.Fatalfs with a "rebuild" instruction on mismatch. Cost is
near-zero because the Go build cache from the prior build-pkg makes
the rebuild a cache hit.
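
The staleness check described above can be sketched as follows. This is
only a sketch: the buildInfo struct, its rawSHA256 field name, and the
recordInfo/checkFresh helpers are hypothetical and do not reflect the
actual build-info.json layout.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// buildInfo mirrors a hypothetical build-info.json; the field name is an
// assumption made for this example.
type buildInfo struct {
	RawSHA256 string `json:"rawSHA256"`
}

// recordInfo is what a build step might write next to the artifact.
func recordInfo(raw []byte) []byte {
	sum := sha256.Sum256(raw)
	b, _ := json.Marshal(buildInfo{RawSHA256: hex.EncodeToString(sum[:])})
	return b
}

// checkFresh compares the recorded hash with the hash of a fresh build
// and returns an error telling the user to rebuild on mismatch.
func checkFresh(infoJSON, rebuilt []byte) error {
	var bi buildInfo
	if err := json.Unmarshal(infoJSON, &bi); err != nil {
		return fmt.Errorf("parsing build info: %w", err)
	}
	sum := sha256.Sum256(rebuilt)
	if got := hex.EncodeToString(sum[:]); got != bi.RawSHA256 {
		return fmt.Errorf("stale artifact (recorded %s, rebuilt %s); rebuild the package", bi.RawSHA256, got)
	}
	return nil
}

func main() {
	info := recordInfo([]byte("wasm bytes from build-pkg"))
	fmt.Println(checkFresh(info, []byte("wasm bytes from build-pkg")))
	fmt.Println(checkFresh(info, []byte("wasm bytes from a newer checkout")))
}
```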

The new wasmbuild package also replaces cmd/tsconnect's hardcoded -tags
string with a minimal-feature-set computation. wasmbuild.Keep names the
small set of feature/featuretags entries the browser client actually
needs (netstack, logtail, dns, health, c2n, ipnbus); wasmbuild.Tags()
emits a ts_omit_<f> for every other omittable feature in
feature/featuretags.Features, with transitive deps
expanded via featuretags.Requires. An init() panics if Keep references
a feature unknown to feature/featuretags so a rename there fails
loudly. Net effect on size: 32M raw / 9.4M brotli before this change,
25M raw / 4.4M brotli after — vs the last-published 1.39.98 at 21M /
3.8M. The transitive package-import graph is unchanged (176
tailscale.com/* packages either way): featuretags omits eliminate
dead code via `const HasX = false`, not imports. Trimming the import
graph would require a separate, larger refactor splitting interface
packages by build tag.
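
The tag computation can be pictured with the sketch below. It is a
minimal illustration, not the wasmbuild package: the omitTags function,
its parameters, and the feature names in main are hypothetical; only the
ts_omit_ prefix and the keep/requires idea come from the description
above.

```go
package main

import (
	"fmt"
	"sort"
)

// omitTags returns a ts_omit_<f> tag for every feature that is neither
// kept nor transitively required by a kept feature. It panics if keep
// names an unknown feature, so a rename fails loudly.
// Hypothetical helper; not the wasmbuild API.
func omitTags(features, keep []string, requires map[string][]string) []string {
	known := make(map[string]bool, len(features))
	for _, f := range features {
		known[f] = true
	}
	kept := make(map[string]bool)
	var visit func(f string)
	visit = func(f string) {
		if !known[f] {
			panic("unknown feature: " + f)
		}
		if kept[f] {
			return // already expanded; also guards against cycles
		}
		kept[f] = true
		for _, dep := range requires[f] {
			visit(dep)
		}
	}
	for _, f := range keep {
		visit(f)
	}
	var tags []string
	for _, f := range features {
		if !kept[f] {
			tags = append(tags, "ts_omit_"+f)
		}
	}
	sort.Strings(tags)
	return tags
}

func main() {
	features := []string{"netstack", "dns", "ssh", "serve"}
	requires := map[string][]string{"netstack": {"dns"}}
	fmt.Println(omitTags(features, []string{"netstack"}, requires))
}
```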

Writing TestFetchTailnetPeer surfaced several real issues, all fixed
here:

  * cmd/tsconnect built the wasm with the nethttpomithttp2 tag, but
    control/ts2021 (since commit 1d93bdce2, "control/controlclient:
    remove x/net/http2, use net/http", Oct 2025) requires HTTP/2 from
    net/http's bundled implementation. With nethttpomithttp2 set, the
    bundle is excluded and the wasm client cannot speak HTTP/2 to any
    control plane, including production. Drop the tag. Wasm size grows
    ~1 MB raw / ~300 KB brotli (more than offset by the feature
    pruning above). The last published @tailscale/connect (1.39.98,
    early 2023) pre-dates the regression, which is why no consumer has
    reported the breakage.

  * tstest/integration/testcontrol.Server's /ts2021 noise upgrade
    endpoint rejected anything but POST. WebSocket clients (the only
    transport available to browser-WASM) come in as GET. Allow both;
    the controlhttp AcceptHTTP path dispatches on the Upgrade header,
    so the websocket library still enforces GET for WS upgrades.
    This matches production, where the same controlhttpserver.AcceptHTTP
    routes purely on the Upgrade header without checking method.

  * derp/derphttp's urlString built the DERP URL from node.HostName
    only, dropping node.DERPPort. Non-WS clients use a separate code
    path (connectToHost) that honors DERPPort, but WebSocket-only
    clients (browser-WASM) went through urlString and so could not
    reach a DERP running on any port other than 443. Include the port
    when it differs from the scheme default.
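
The port handling in the last item can be sketched as below. This is an
illustrative sketch, not derphttp's urlString: the derpURL function, the
/derp path, and the treatment of port 0 as "use the default" are
assumptions, and host is assumed to be a DNS name rather than an IP
literal.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// derpURL builds a URL for a DERP node, adding the port only when it
// differs from the scheme default. Hypothetical helper for illustration.
func derpURL(scheme, host string, port int) string {
	def := 443 // https and wss
	if scheme == "http" || scheme == "ws" {
		def = 80
	}
	if port == 0 || port == def {
		return scheme + "://" + host + "/derp"
	}
	return scheme + "://" + net.JoinHostPort(host, strconv.Itoa(port)) + "/derp"
}

func main() {
	fmt.Println(derpURL("wss", "derp.example.com", 0))
	fmt.Println(derpURL("wss", "derp.example.com", 8443))
}
```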

Also move addWebSocketSupport from cmd/derper (where it was main-only)
to derp/derpserver.AddWebSocketSupport so tstest/integration.RunDERPAndSTUN
can wrap its DERP handler with WebSocket support — without that, the
test DERP would not accept the browser's wss connection.

Fixes #9394

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Iff9cdee303e3b239924249b5bffb2fd04e02f391
2026-05-20 10:48:29 -07:00