10550 Commits

Author SHA1 Message Date
Alex Valiushko
01d0bdd253 cmd/derper,derp: add metrics for rate limit hits (#19560)
Expvars track count of rate limiters exceeding their threshold.
Covers (1) global rate limiter and (2) total of local rate limiters.

Also publish optional rate-limit metrics during ExpVar() call
if -rate-config is specified. Fixes current rate-limit metrics
being published outside of "derp" in /debug/vars.

Updates tailscale/corp#38509

Change-Id: Ic7f5a1e890d0d7d3d7b679daa4b5f8926a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
2026-04-29 10:29:09 -07:00
Claus Lensbøl
be7cce74ba wgengine/userspace: do not fall back to old key on tsmpLearned mismatch (#19575)
The mismatch behaviour of falling back to a previous key could end up
breaking connections when the netmap update took longer than the 2
seconds allowed in controlClient.auto for netmap updates, or if the
controlClient context was canceled. This could end up breaking
legitimate updates to the netmap for disco keys coming from control.

Instead, log the event, and let the connection be reset to that of the
key as that is safer.

Issue found by @bradfitz.

Updates #19574

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-29 13:23:04 -04:00
Brad Fitzpatrick
fd6ae2fad4 tstest/natlab/vmtest: serialize per-platform setup with sync.Once
Two cloud-platform nodes (e.g. sr-a and sr-b in TestSiteToSite) boot in
parallel via errgroup and both call ensureCompiled and the inline image
preparation block, racing to Begin() the same shared *Step (which is
deduped by name in Env.Step). The second goroutine panics:

    panic: Step "Compile linux_amd64 binaries": Begin called in state running
    panic: Step "Prepare ubuntu-24.04 image": Begin called in state done

ensureCompiled had a TOCTOU dedup attempt (released compileMu before
doing the work, only added to the compiled set at the end), and image
preparation had no dedup at all.

Replace the compiled set with a per-key map[string]*sync.Once for each
of compile and image preparation, so concurrent callers serialize on
the Once and only the first executes Begin/work/End.
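The per-key Once pattern described above can be sketched as follows. This is a minimal illustration, not the vmtest code: the `onceMap` type, the `do` method, and the key string are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
)

// onceMap runs a function at most once per key, even with concurrent callers.
// (Hypothetical helper; the real vmtest fields are not shown in the commit.)
type onceMap struct {
	mu sync.Mutex
	m  map[string]*sync.Once
}

// do runs f for key if it has not run yet. Concurrent callers with the same
// key block inside once.Do until the first caller's f has returned.
func (o *onceMap) do(key string, f func()) {
	o.mu.Lock()
	if o.m == nil {
		o.m = make(map[string]*sync.Once)
	}
	once, ok := o.m[key]
	if !ok {
		once = new(sync.Once)
		o.m[key] = once
	}
	o.mu.Unlock() // release before running f so other keys are not blocked
	once.Do(f)
}

// countRuns calls do for one key from n goroutines and reports how many
// times the work actually executed.
func countRuns(n int) int {
	var om onceMap
	var wg sync.WaitGroup
	runs := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			om.do("linux_amd64", func() { runs++ })
		}()
	}
	wg.Wait()
	return runs
}

func main() {
	fmt.Println(countRuns(8)) // prints 1
}
```

Unlike a check-then-record set, the lookup and the claim on the work happen under one lock, so there is no window in which two callers both decide to do the work.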

Fixes commit 02ffe5baa8.

Updates #13038

Change-Id: If710bcc9e0aafebf0ad5b61553bae11458d976d7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 09:54:58 -07:00
Brad Fitzpatrick
02ffe5baa8 tstest/natlab/vmtest: add macOS VM snapshot caching for fast test starts
Cache a pre-booted macOS VM snapshot on disk so subsequent test runs
restore from the snapshot instead of cold-booting. The snapshot is keyed
by the Tart base image digest and a code version constant
(macOSSnapshotCodeVersion); bumping either invalidates the cache.

Snapshot preparation (one-time):
- Boot the Tart base image with a NAT NIC (--nat-nic flag)
- Wait for SSH, compile and install cmd/tta as a LaunchDaemon
- TTA polls the host via AF_VSOCK for an IP assignment; during prep
  the host replies "wait"
- Disconnect NIC, save VM state via SIGINT

Test fast path (cached, ~7s to agent connected):
- APFS clone the snapshot, write test-specific config.json
- Launch Host.app with --disconnected-nic --attach-network --assign-ip
- VZ restores from SaveFile.vzvmsave (~5s with 4GB RAM)
- TTA's vsock poll gets the IP config, sets static IP via ifconfig
  (bypasses DHCP entirely), switches driver addr to the IP directly
  (bypasses DNS), and resets the dial context so the reverse-dial
  reconnects immediately
- TTA agent connects to test driver within ~2s of IP assignment

Key optimizations:
- 4GB RAM instead of 8GB: halves SaveFile.vzvmsave (1.4GB vs 2.4GB),
  halves restore time (5.5s vs 11s)
- AF_VSOCK IP assignment: bypasses macOS DHCP (~5-7s saved)
- Direct IP dial: bypasses DNS resolution for test-driver.tailscale
- Dial context reset: cancels stale in-flight dials from snapshot
- Kill instead of SIGINT for test VM cleanup (no state save needed)
- Parallel VM launches

Also:
- Add TestDriverIPv4/TestDriverPort constants to vnet
- Add --nat-nic and --assign-ip flags to Host.app
- Fix SIGINT handler: retain DispatchSource globally, use dispatchMain()
- Add vsock listener (port 51011) to Host.app for IP config protocol
- Add disconnectNetwork() to VMController for clean snapshot state
- Fix Makefile: set -o pipefail so xcodebuild failures aren't swallowed

Updates #13038

Change-Id: Icbab73b57af7df3ae96136fb49cda2536310f31b
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 08:17:13 -07:00
M. J. Fromberger
7b53550fe6 control/controlclient: fix a nil-indirection bug in DERP key pruning (#19565)
Upon deciding to update the LastSeen timestamp, we weren't checking that the
field we are replacing into was non-nil. Rather than add an additional check,
just allocate a fresh pointer for the updated time.
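A minimal sketch of the fix's shape, assuming a record with an optional timestamp pointer. The `peer` type and the `touch` and `newTouchedPeer` helpers are hypothetical and do not mirror the controlclient types.

```go
package main

import (
	"fmt"
	"time"
)

// peer is a hypothetical stand-in for a record with an optional timestamp.
type peer struct {
	LastSeen *time.Time
}

// touch records t as the last-seen time by pointing LastSeen at a fresh
// value. Writing through the old pointer (*p.LastSeen = t) would panic
// whenever LastSeen is nil.
func touch(p *peer, t time.Time) {
	p.LastSeen = &t
}

// newTouchedPeer builds a peer with a nil LastSeen and then touches it.
func newTouchedPeer(unixSec int64) *peer {
	p := &peer{}
	touch(p, time.Unix(unixSec, 0))
	return p
}

func main() {
	p := newTouchedPeer(1700000000)
	fmt.Println(p.LastSeen != nil) // prints true
}
```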

Updates #19564

Change-Id: I589ebe65175fc7677c04a31dd6c4670e2531ee62
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-04-29 07:57:38 -07:00
David Bond
a29e42135b cmd/k8s-operator: add nodeSelector to DNSConfig resource (#19429)
This commit modifies the `DNSConfig` resource to allow customisation of
the `spec.nodeSelector` field in the nameserver pods.

Closes: https://github.com/tailscale/tailscale/issues/19419

Signed-off-by: David Bond <davidsbond93@gmail.com>
2026-04-29 15:56:33 +01:00
Brad Fitzpatrick
4cec06b8f2 tstest/natlab/vmtest: add macOS VM screenshot streaming to web UI
When --vmtest-web is set, Host.app is launched with --screenshot-port 0
to start a localhost HTTP server that captures the VZVirtualMachineView
display. The Go test harness parses the SCREENSHOT_PORT=<port> line from
stdout, then polls every 2 seconds for JPEG thumbnails and pushes them
over WebSocket to the web dashboard.

Clicking a screenshot thumbnail opens a full-resolution image proxied
through the web UI's /screenshot/{node} endpoint.

Screenshot events are excluded from the EventBus history (they're large
and only the latest matters, stored in NodeStatus.Screenshot).

Updates #13038

Change-Id: I9bc67ddd1cc72948b33c555d4be3d8db06a41f6d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 07:48:26 -07:00
Claus Lensbøl
78627c132f wgengine/magicsock,ipn/ipnlocal: store and load homeDERP from cache (#19491)
With netmap caching, the home DERP of the self node was neither saved to
the cache nor loaded from it, so nodes did not stick to a DERP when
starting without a connection to control.

Instead, when a cache is available, load it before looking for DERP
servers. This is implemented by allowing ReSTUN to be skipped when
setting the DERP map (we must have a DERP map before setting the home
DERP), so the DERP from the cache is applied and stays sticky until a
connection to control is established.

Making DERP only change when connected to control is handled by existing
code from f072d017bd.

Updates #19490

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-29 10:24:09 -04:00
Alex Chan
1841a93ab2 ssh/tailssh: mark TestSSHRecordingCancelsSessionsOnUploadFailure as flaky (again)
This test is still flaking on macOS, so mark it as such so we can track
and investigate further.

Updates #7707

Change-Id: I640da3c1068a90a9815caab2df9431bceb01f846
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-29 14:22:09 +01:00
Alex Chan
bb91bb842c all: remove everything related to non-seamless key renewal
Seamless key renewal has been the default in all clients since 1.90.
We retained the ability to disable it from the control plane as a
precaution, but we haven't seen any issues that require us to disable it.

We're now removing all the code for non-seamless key renewal, because we
don't expect to turn it on again, and indeed it's been untested in the
field for three releases so might contain latent bugs!

Updates tailscale/corp#33042

Change-Id: I4b80bf07a3a50298d1c303743484169accc8844b
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-29 10:03:26 +01:00
Noel O'Brien
40088602c9 cmd/hello: remove hello.ipn.dev (#19567)
Fixes #19566

Signed-off-by: Noel O'Brien <noel@tailscale.com>
2026-04-28 17:54:29 -07:00
Brad Fitzpatrick
b2d4ba04b6 tstest/natlab/vmtest: add macOS VM support using Tart base images
Add macOS VM support to the vmtest framework using Tart's pre-built
macOS images (ghcr.io/cirruslabs/macos-tahoe-base) instead of building
from IPSW. The Tart image has SIP disabled and SSH enabled.

At test time, the Tart base image's disk, NVRAM, and hardware identity
are APFS-cloned into a tailmac-compatible directory layout, and the VM
is booted headlessly via tailmac's Host.app (Virtualization.framework)
with its NIC connected to vnet's dgram socket.

New features:
- tailmac.go: ensureTartImage (auto-pull), cloneTartToTailmac (format
  conversion), startTailMacVM (launch + cleanup)
- NoAgent() node option for VMs without TTA installed
- LANPing() for ICMP reachability testing via TTA's /ping endpoint
- IsMacOS field on OSImage, with GOOS/GOARCH support
- Dgram socket listener in Start() for macOS VMs
- Fix ReadFromUnix error spam on dgram socket close in vnet

TestMacOSAndLinuxCanPing verifies a macOS Tart VM and a gokrazy Linux
VM can ping each other on the same vnet LAN.

Updates #13038

Change-Id: I5e73a27878abf009f780fdf11a346fc857711cff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 12:51:40 -07:00
Brad Fitzpatrick
ec7b11d986 tstest/natlab/vmtest, cmd/tta: add TestTaildrop
Add a vmtest that brings up two Ubuntu nodes, each behind its own
EasyNAT, joined to the tailnet. The sender pushes a small file via
"tailscale file cp" and the receiver fetches it via "tailscale file
get --wait", asserting that the filename and contents round-trip
unchanged.

To make Taildrop work in vmtest, three small pieces were needed:

The Linux/FreeBSD cloud-init now starts tailscaled with --statedir as
well as --state=mem:, so the daemon has a VarRoot to host Taildrop's
incoming-files directory. State itself remains in-memory (so nothing
persists across reboots); only the var-root scratch space is on disk.

vmtest.New grows a variadic EnvOption parameter and a SameTailnetUser
helper. When the option is passed, Start sets AllNodesSameUser=true
on the embedded testcontrol.Server. Cross-node Taildrop requires the
sender and receiver to share a Tailnet user (or have an explicit
PeerCapabilityFileSharingTarget granted between them, which we don't
plumb here), so TestTaildrop opts in. Existing tests don't.

cmd/tta gains /taildrop-send and /taildrop-recv handlers that wrap
"tailscale file cp" and "tailscale file get --wait", plus
Env.SendTaildropFile and Env.RecvTaildropFile helpers in vmtest that
drive them.

Updates #13038

Change-Id: I8f5f70f88106e6e2ee07780dd46fe00f8efcfdf1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 12:27:55 -07:00
Brad Fitzpatrick
4b8e0ede6d tstest/natlab/{vmtest,vnet}, cmd/tta: add TestMullvadExitNode
Add a vmtest that brings up a Tailscale client, an Ubuntu VM acting
as a Mullvad-style plain-WireGuard exit node, and a non-Tailscale
webserver, each on its own NAT'd vnet network with a distinct WAN
IP. The test exercises Tailscale's IsWireGuardOnly peer code path:
the way the control plane wires Mullvad exit nodes into a client's
netmap, including the per-client SelfNodeV4MasqAddrForThisPeer
source-IP rewrite that lets a Tailscale CGNAT IP egress through a
plain-WireGuard tunnel that has no idea what Tailscale is.

The mullvad VM doesn't run wireguard-tools or kernel WireGuard;
instead, a new TTA endpoint /wg-server-up creates a real Linux TUN
named wg0, drives it with wireguard-go (already vendored), and
configures the kernel side (ip addr/up, ip_forward, iptables NAT
MASQUERADE) so decrypted traffic from the peer egresses with the
mullvad VM's WAN IP. Userspace vs kernel WireGuard makes no
difference on the wire — what's being tested is Tailscale's
plain-WireGuard exit-node code path, not the kernel module — and
this lets the test avoid downloading and installing .deb packages
inside the VM.

Adds Env.BringUpMullvadWGServer (calls /wg-server-up, returns the
generated WG public key as a key.NodePublic), Env.SetExitNodeIP
(EditPrefs ExitNodeIP directly, for exit nodes whose IPs aren't
discoverable via TTA), Env.ControlServer (exposes the underlying
testcontrol.Server so tests can UpdateNode / SetMasqueradeAddresses
to inject custom peers), and Env.Status (fetches a node's tailscale
status, used to read the client's pubkey so we can pin it as the
WG server's only allowed peer).

The test verifies that the webserver's echoed source IP is the
client's WAN with no exit node selected, the mullvad VM's WAN with
the WG-only peer selected as exit, and the client's WAN again after
clearing.

Updates #13038

Change-Id: I5bac4e0d832f05929f12cb77fa9946d7f5fb5ef1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 11:31:48 -07:00
Andrew Lytvynov
da0a277565 client/web: fail /api/routes requests with empty flags (#19548)
If both ExitNode and AdvertiseRoutes flags are empty, then the request
is invalid and should fail. Previously it would wipe out any existing
values configured for these prefs because of the assumption in the
handler that exactly one of them is set.

Updates https://github.com/tailscale/corp/issues/40851

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-28 11:16:47 -07:00
Brad Fitzpatrick
f7f8b0a0a5 cmd/tailscale/cli: drive "file cp" progress and offline warning from peerAPI
The Online bit in PeerStatus comes from control's last-known state and
can lag reality, so gating "tailscale file cp" on it is both unreliable
and pushes correctness onto the server. Just try the push directly.

In runCp, when the target's PeerStatus says it's offline, no longer
fail upfront; getTargetStableID returns the StableID anyway. Replace
the static "is offline" warning with a 3-second timer armed for the
first file: if the timer fires before peerAPI bytes have flowed, we
print a warning to stderr. The wording depends on whether control
reported the peer offline ("is reportedly offline; trying anyway") or
online ("is not replying; trying anyway"). The warning is printed with
a leading vt100 clear-line and a trailing newline so it doesn't get
painted over by the progress redraw and so the next progress redraw
lands on a fresh line below it.

Both the timer disarm and the progress display now read from
tailscaled's OutgoingFile.Sent (subscribed via WatchIPNBus) instead of
the local-body counter. That's the difference between bytes-acked-by-
local-tailscaled (what countingReader.n was measuring; useless for
detecting an unreachable peer because for small files net/http buffers
the entire body into the unix-socket conn before the peerAPI dial has
even started) and bytes-pulled-toward-peerAPI (what tailscaled is
actually doing, reflected in OutgoingFile.Sent). The previous code
reported 100% within milliseconds for a 3 KiB file even when the peer
was unreachable.

Add --update-interval (default 250ms) to control the progress repaint
cadence; zero or negative disables the progress display entirely. The
printer now also stops repainting once it observes Sent at full size
with a near-zero rate for >2s, so a stuck transfer doesn't keep
clobbering whatever the rest of runCp is trying to print.

Updates #18740

Change-Id: I189bd1c2cd8e094d372c4fee23114b1d2f8024b4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 11:03:58 -07:00
Brad Fitzpatrick
88cb6f58f8 tool/updateflakes, cmd/nardump: replace update-flake.sh with Go tool
Consolidate go.mod.sri and go.toolchain.rev.sri into a single
flakehashes.json file at the repo root, owned by a new Go program at
tool/updateflakes. The JSON is consumed by flake.nix via
builtins.fromJSON and by any future Go code via the FlakeHashes
struct that defines its schema.

Each block records its input fingerprint alongside the SRI it
produced: the goModSum (a sha256 over go.mod and go.sum) for the
vendor block, and the literal rev string from go.toolchain.rev for
the toolchain block. updateflakes regenerates a block only when its
recorded fingerprint disagrees with the current input.
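Content-based gating of this kind can be sketched as below. This is an assumption-laden illustration: the `block` struct, `fingerprint`, and `refresh` are hypothetical names, and the length-prefixed hashing scheme is a choice made for the example, not necessarily how updateflakes computes goModSum.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// fingerprint hashes several inputs together. Each input is length-prefixed
// so that moving bytes from one input to the next changes the result.
func fingerprint(inputs ...[]byte) string {
	h := sha256.New()
	for _, in := range inputs {
		fmt.Fprintf(h, "%d:", len(in))
		h.Write(in)
	}
	return hex.EncodeToString(h.Sum(nil))
}

// block is a hypothetical record pairing an output with the fingerprint of
// the inputs that produced it.
type block struct {
	Fingerprint string
	SRI         string
}

// refresh regenerates b only when the recorded fingerprint disagrees with
// the current one. It reports whether regen was called.
func refresh(b *block, current string, regen func() string) bool {
	if b.Fingerprint == current {
		return false // inputs unchanged; skip the expensive work
	}
	b.SRI = regen()
	b.Fingerprint = current
	return true
}

func main() {
	var b block
	fp := fingerprint([]byte("module example\n"), []byte("h1:abc\n"))
	fmt.Println(refresh(&b, fp, func() string { return "sha256-placeholder" }))
	fmt.Println(refresh(&b, fp, func() string { return "sha256-unused" }))
}
```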

Doing the gating by content rather than file mtimes avoids the usual
mtime hazards across git checkouts, clones, and merges. It also
means re-runs with no input changes are essentially free, and a
re-run that touches only one input pays only for that one block.

The two blocks have no shared state -- vendor invokes go mod vendor
into one tempdir, toolchain fetches and extracts a tarball into
another -- so they run concurrently via errgroup. Cold time is
bounded by the slower of the two rather than their sum.

Also takes the opportunity to fold the toolchain fetch into a single
curl|tar pipeline (no intermediate .tar.gz on disk).

Split cmd/nardump into a thin package main and a new package nardump
library at cmd/nardump/nardump that holds the NAR encoder and SRI
helper. tool/updateflakes imports the library directly rather than
building and exec'ing the nardump binary at runtime. The library
uses fs.ReadLink (Go 1.25+) instead of os.Readlink, so it no longer
requires the caller to chdir into the FS root for symlink targets to
resolve. WriteNAR now wraps its writer in a bufio.Writer internally
(unless the caller already passed one) and flushes on return, so
callers don't pay for tiny writes against slow underlying writers.
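The wrap-unless-already-buffered idea can be sketched like this. `writeTokens` and `countingWriter` are hypothetical names for the example; the real WriteNAR signature is not shown in the commit message.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
)

// countingWriter counts how many Write calls reach the underlying writer.
type countingWriter struct{ calls int }

func (c *countingWriter) Write(p []byte) (int, error) {
	c.calls++
	return len(p), nil
}

// writeTokens emits each token as a separate small write. It buffers
// internally unless the caller already passed a *bufio.Writer, and it
// flushes before returning.
func writeTokens(w io.Writer, tokens []string) error {
	bw, ok := w.(*bufio.Writer)
	if !ok {
		bw = bufio.NewWriter(w)
	}
	for _, t := range tokens {
		if _, err := bw.WriteString(t); err != nil {
			return err
		}
	}
	return bw.Flush()
}

func main() {
	cw := &countingWriter{}
	_ = writeTokens(cw, []string{"nix-archive-1", "(", "type", "regular", ")"})
	fmt.Println(cw.calls) // prints 1: five small writes reach cw as one Write
}
```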

The cache-busting line in flake.nix and shell.nix is known to live
at end of file, so updateCacheBust walks the lines in reverse.

make tidy timings on this machine, before: ~14s every run.
After:

  warm (no input changes):       0.05s
  vendor block stale only:       1.4s
  toolchain block stale only:    5.0s
  cold (no flakehashes.json):    5.0s

Updates #6845

Change-Id: I0340608798f1614abf147a491bf7c68a198a0db4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 10:18:32 -07:00
Andrew Dunham
33714211c8 net/dns: use os.Root to prevent path traversal in darwin resolver
The darwinConfigurator writes split DNS resolver files to
/etc/resolver/$SUFFIX using os.WriteFile with string concatenation.
A crafted MatchDomain value containing path traversal sequences
(e.g. "../evil") could write files outside the resolver directory.

Use os.OpenRoot to confine all file operations in SetDNS and
removeResolverFiles to the resolver directory. os.Root rejects any
path component that escapes the root, returning an error instead of
following the traversal.
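A minimal sketch of confining writes with os.Root (Go 1.24 or later). The `writeResolver` and `tryWrite` helpers and the file contents are hypothetical and are not taken from the darwinConfigurator code.

```go
package main

import (
	"fmt"
	"os"
)

// writeResolver writes data to dir/name, refusing any name that would
// escape dir (for example via "..").
func writeResolver(dir, name string, data []byte) error {
	root, err := os.OpenRoot(dir)
	if err != nil {
		return err
	}
	defer root.Close()

	f, err := root.Create(name) // fails if name resolves outside the root
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

// tryWrite reports whether writing name under a fresh temp dir succeeds.
func tryWrite(name string) bool {
	dir, err := os.MkdirTemp("", "resolver")
	if err != nil {
		return false
	}
	defer os.RemoveAll(dir)
	return writeResolver(dir, name, []byte("nameserver 100.100.100.100\n")) == nil
}

func main() {
	fmt.Println(tryWrite("example.com"), tryWrite("../evil")) // prints true false
}
```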

Also parametrize the resolver directory path on the struct to enable
testing with t.TempDir(), and add tests.

As far as I can tell, this would require a malicious control plane to
exploit, but it is still worth fixing.

Updates tailscale/corp#39751

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
2026-04-28 11:08:22 -04:00
Brad Fitzpatrick
b9eac14ef9 tstest/natlab/vmtest: add web UI for watching VM tests live
Add an optional --vmtest-web flag that starts an HTTP server showing a
live dashboard for vmtest runs. The dashboard includes:

- Step progress tracker showing all test phases (compile, image prep,
  QEMU launch, agent connect, tailscale up, test-specific steps)
  with status icons and elapsed times
- Per-VM "virtual monitor" cards showing serial console output
  streamed in realtime via WebSocket
- Per-NIC DHCP status (supporting multi-homed VMs like subnet routers)
- Per-node Tailscale status (hidden for non-tailnet VMs)
- Test status badge (Running/Passed/Failed) with live elapsed timer
- Event log showing all lifecycle events chronologically

Architecture follows the existing util/eventbus HTMX+WebSocket pattern:
the server pushes HTML fragments with hx-swap-oob attributes over a
WebSocket, and HTMX routes them to the correct DOM elements by ID.

Key components:
- vmstatus.go: Step tracker (Begin/End lifecycle), EventBus (pub/sub
  with history for late joiners), VMEvent types, NodeStatus tracking
- web.go: HTTP server, WebSocket handler, template loading, ANSI-to-HTML
  conversion via robert-nix/ansihtml, deterministic port selection
- assets/: HTML templates, CSS, HTMX library (copied from eventbus)
- vnet/vnet.go: DHCP event callback on Server for observing DHCP lifecycle
- qemu.go: Console log file tailing with manual offset-based reading

Usage:
  go test ./tstest/natlab/vmtest/ --run-vm-tests --vmtest-web=:0 -v

When using :0, a deterministic port based on the test name is tried
first so re-runs get the same URL, falling back to OS-assigned on
conflict.
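The port-selection behaviour can be sketched as below. This is an illustration only: the FNV hash, the 20000-29999 range, and the function names are assumptions for the example, not what web.go does.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"net"
)

// portFor maps a test name to a port in [20000, 30000).
// The hash function and the range are assumptions for this sketch.
func portFor(name string) int {
	h := fnv.New32a()
	h.Write([]byte(name))
	return 20000 + int(h.Sum32()%10000)
}

// listen tries the name-derived port first, so re-runs of the same test get
// the same URL, and falls back to an OS-assigned port on conflict.
func listen(name string) (net.Listener, error) {
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", portFor(name)))
	if err == nil {
		return ln, nil
	}
	return net.Listen("tcp", "127.0.0.1:0")
}

func main() {
	ln, err := listen("TestExitNode")
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("dashboard at http://" + ln.Addr().String())
}
```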

Updates #13038

Change-Id: I45281347b3d7af78ed9f4ff896033984f84dcb4d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 07:46:04 -07:00
Alex Chan
0ac09721df tka: reduce boilerplate code in the tests
Updates #cleanup

Change-Id: Id69d509f5e470fb5fb50b5c5c4ca61f000389c53
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-28 16:42:48 +02:00
Brad Fitzpatrick
cb239808a6 tstest/natlab/vmtest: add --test-version flag
Add a --test-version flag to run the natlab VM tests against
released tailscale/tailscaled binaries downloaded from
pkgs.tailscale.com instead of building from the source tree.

The value can be a concrete release like "1.97.255", or "stable" /
"unstable" which resolve to the latest TarballsVersion on that track
via pkgs.tailscale.com/<track>/?mode=json. The track for a concrete
version is derived from its minor (even=stable, odd=unstable). The
host architecture (amd64 or arm64) selects the tarball.
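Deriving the track from a concrete version's minor number can be sketched as follows; `trackFor` is a hypothetical name and the parsing is deliberately simple.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// trackFor returns "stable" for an even minor version and "unstable" for an
// odd one, given a version string such as "1.97.255".
func trackFor(version string) (string, error) {
	parts := strings.Split(version, ".")
	if len(parts) < 2 {
		return "", fmt.Errorf("malformed version %q", version)
	}
	minor, err := strconv.Atoi(parts[1])
	if err != nil || minor < 0 {
		return "", fmt.Errorf("malformed minor in version %q", version)
	}
	if minor%2 == 0 {
		return "stable", nil
	}
	return "unstable", nil
}

func main() {
	track, err := trackFor("1.97.255")
	fmt.Println(track, err) // prints: unstable <nil>
}
```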

Tarballs are cached + extracted under
~/.cache/tailscale-vmtest/builds/<version>_<arch>/ so they are not
re-fetched per test. tta is still always built from the local tree.
Cloud VMs (Ubuntu, Debian) pick up the downloaded binaries via the
existing files.tailscale file server. Non-Linux GOOS (FreeBSD) falls
back to building from source since pkgs.tailscale.com only ships
Linux tarballs. Gokrazy nodes continue to use binaries baked into
the gokrazy image; --test-version is a no-op for them.

Updates #13038

Change-Id: I213ef7db362dd17bf69d2685cbf2ab0ec5a3fee1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-28 06:59:26 -07:00
Daniel Pañeda
7735b15de3 cmd/k8s-operator: truncate long label values in metrics resources (#18895)
* cmd/k8s-operator: truncate long label values in metrics resources

Kubernetes label values have a 63-character limit, but resource names
can be up to 253 characters. When a Service or Ingress with a long
name is exposed via Tailscale, the operator fails to reconcile because
it uses the parent resource name directly as label values on metrics
Services.

Truncate label values that may exceed the limit by keeping the first
54 characters and appending a SHA256-based hash suffix to preserve
uniqueness.
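One way to realise this truncation is sketched below. The exact suffix layout is an assumption: a hyphen plus 8 hex characters of the SHA-256 digest brings 54 characters up to the 63-character limit. The operator's TruncateLabelValue may differ in detail.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

const (
	maxLabelLen = 63 // Kubernetes label value limit
	keepLen     = 54 // prefix kept from the original value
)

// truncateLabel returns s unchanged when it fits. Otherwise it keeps the
// first 54 bytes and appends "-" plus 8 hex characters of the SHA-256 of the
// full value (assumed layout: 54 + 1 + 8 = 63). It slices bytes, which is
// safe for the ASCII-only names Kubernetes allows.
func truncateLabel(s string) string {
	if len(s) <= maxLabelLen {
		return s
	}
	sum := sha256.Sum256([]byte(s))
	return s[:keepLen] + "-" + hex.EncodeToString(sum[:])[:8]
}

func main() {
	long := "my-service-with-a-very-long-name-that-goes-well-beyond-the-label-limit"
	fmt.Println(len(truncateLabel(long))) // prints 63
}
```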

Fixes #18894

Signed-off-by: Daniel Pañeda <daniel.paneda@clickhouse.com>
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* cmd/k8s-operator: move TruncateLabelValue to shared k8s-operator package

Move the label truncation helper to k8s-operator/utils.go so it can be
reused by other components that need to produce valid Kubernetes labels.

Signed-off-by: Daniel Pañeda <daniel.paneda@clickhouse.com>
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* cmd/k8s-operator: truncate long domain label values in cert resources

Applies TruncateLabelValue to certResourceLabels in order to prevent API
server validation failures. This covers both the HA Ingress and kube-apiserver
proxy reconcilers, as both flow through certResourceLabels.

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

* cmd/k8s-operator: remove empty metrics_resources_test.go, use hyphens in test names to satisfy go vet

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>

---------

Signed-off-by: Daniel Pañeda <daniel.paneda@clickhouse.com>
Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Co-authored-by: chaosinthecrd <tom@tmlabs.co.uk>
2026-04-28 14:11:59 +01:00
Kristoffer Dalby
384b7fb561 release/dist/qnap: preserve .codesigning files as build artifacts
Stop deleting .qpkg.codesigning files in build-qpkg.sh and include
them in the returned artifact list from buildQPKG.

These files contain the last 32 characters of the base64-encoded CMS
signature produced by QDK code signing. They are consumed by pkgserve
to populate <signature> entries in the QNAP repository XML, matching
the format used by myqnap.org and qnapclub.eu.

Updates corp#33203

Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
2026-04-28 12:29:56 +01:00
Will Norris
2d85f37f39 client/systray: support several different color themes
Currently we only have a dark theme icon with white and grey dots over
a black background. For some desktops, a logo with black and grey dots
over a white background might be preferable. And for desktops where the
bar is *almost* black or white, but not quite, an option to render the
logo with dots only and no background can look really nice.

Add a new -theme flag to the systray command with the default staying
the same as it is today.

Updates #18303

Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
2026-04-27 18:54:14 -07:00
License Updater
325f52c654 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2026-04-27 18:38:06 -07:00
Brad Fitzpatrick
d0ae993334 tstest/natlab/vmtest: add more subnet router tests
Add two tests building on TestExitNode's framework:

TestSubnetRouterPublicIP brings up a client, a subnet router, and a
webserver, each on its own NAT'd network with distinct WAN IPs. The
subnet router advertises the webserver's network as a route. The test
toggles the client's --accept-routes preference and asserts that the
webserver's echoed source IP switches between the client's own WAN
(direct dial) and the subnet router's WAN (forwarded through the
router and SNAT'd).

TestSubnetRouterAndExitNode adds a fourth node, an exit node that
advertises 0.0.0.0/0 + ::/0, and uses a table-driven layout with
subtests to cover the four combinations of (exit on/off, subnet
on/off). The case where both are on confirms longest-prefix match
wins: the subnet router's /24 takes precedence over the exit node's
/0. The exit node itself is configured with --accept-routes=off so
that, in the exit-only case, it forwards directly to the simulated
internet rather than re-routing the forwarded traffic via the subnet
router (which would otherwise mask the exit node's WAN as the
observed source).
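Longest-prefix match, as relied on in the both-on case, can be sketched as follows. The route table, addresses, and names here are hypothetical and are not taken from the test.

```go
package main

import (
	"fmt"
	"net/netip"
)

// route pairs a destination prefix with the node that handles it.
type route struct {
	prefix netip.Prefix
	via    string
}

// pick returns the handler of the most specific route containing ip,
// or "" when no route matches.
func pick(routes []route, ip netip.Addr) string {
	best, bestBits := "", -1
	for _, r := range routes {
		if r.prefix.Contains(ip) && r.prefix.Bits() > bestBits {
			best, bestBits = r.via, r.prefix.Bits()
		}
	}
	return best
}

// lookup resolves ip against a demo table with a default route and a /24.
func lookup(ip string) string {
	routes := []route{
		{netip.MustParsePrefix("0.0.0.0/0"), "exit-node"},
		{netip.MustParsePrefix("10.0.5.0/24"), "subnet-router"},
	}
	return pick(routes, netip.MustParseAddr(ip))
}

func main() {
	fmt.Println(lookup("10.0.5.7")) // prints subnet-router
	fmt.Println(lookup("8.8.8.8"))  // prints exit-node
}
```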

Adds an Env.SetAcceptRoutes helper for toggling the RouteAll pref via
EditPrefs, used by both tests.

Updates #13038

Change-Id: Ifc2726db1df2f039c477c222484f535bebc40445
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 17:06:17 -07:00
Brad Fitzpatrick
c0e6ffed0d tstest/tailmac: add NIC hot-swap, disconnected NIC, and screenshot server
Add NIC attachment hot-swap support to Host.app: VZNetworkDevice.attachment
is writable at runtime, so --disconnected-nic creates a NIC with no
attachment, and --attach-network hot-swaps it to a vnet dgram socket
after boot/restore. macOS detects link-up and does DHCP.

Refactor TailMacConfigHelper: extract createDgramAttachment() and
createDisconnectedNetworkDeviceConfiguration() from the monolithic
createSocketNetworkDeviceConfiguration().

Add --screenshot-port flag for headless mode. Host.app serves GET
/screenshot as JPEG via a localhost HTTP server, capturing the
VZVirtualMachineView via CGWindowListCreateImage. The Go test harness
polls these to push live thumbnails to the web dashboard.

Also: SIGINT handler in headless mode for clean VM state save.

Updates #13038

Change-Id: I42fba0ecd760371b4ec5b26a0557e3dd0ba9ecae
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 17:03:09 -07:00
Brad Fitzpatrick
5c1738fd56 tstest/natlab/{vmtest,vnet}, cmd/tta: add TestExitNode
Add a vmtest TestExitNode that brings up a client, two exit nodes, and a
non-Tailscale webserver, each on its own NAT'd vnet network with a
distinct WAN IP. The test cycles the client's exit node setting between
off, exit1, and exit2 and asserts that the webserver echoes the expected
post-NAT source IP for each.

Three pieces were needed to make this work:

vnet now forwards TCP between simulated networks at the packet level,
mirroring the existing UDP path. When a guest VM sends TCP to another
simulated network's WAN IP, the source network's gateway rewrites src
via doNATOut and routeTCPPacket hands the packet off to the destination
network, which rewrites dst via doNATIn and writes the rewritten frame
onto the destination LAN. The TCP stacks of the two guest VM kernels
talk end-to-end; vnet just NATs the IP/port headers in flight, so all
TCP semantics (handshakes, options, sequence numbers, payload) are
preserved without a gvisor TCP termination in the middle. Adds a
focused TestInterNetworkTCP that exercises this path without any
Tailscale machinery.

cmd/tta binds its outbound dial to the default route's interface using
SO_BINDTODEVICE. Without that, the moment tailscaled installs
0.0.0.0/0 → tailscale0 in response to setting an exit node, TTA's
existing TCP connection to test-driver gets rerouted through the exit
node. From the test driver's perspective the connection's packets then
arrive with the exit node's WAN IP as the source rather than the
client's, so they don't match the existing flow and the connection is
dead — manifesting in the test as a hang on EditPrefs (which had
actually completed in milliseconds on the daemon side, but whose
response never made it back). Pinning the socket to the underlying NIC
keeps TTA's agent connection on a real interface regardless of any
policy routing tailscaled installs later. We bind rather than carry the
Tailscale bypass fwmark because the fwmark approach is conditional on
tailscaled having configured SO_MARK-based policy routing, while
binding is unconditional.

vmtest grows an Env.SetExitNode helper that sets ExitNodeIP via
EditPrefs through the agent, used by the new test.

Updates #13038

Change-Id: I9fc8f91848b7aa2297ef3eaf71fed9d96056a024
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 16:54:20 -07:00
Alex Chan
10b63f27ce tstest/clock: explain what happens if you don't set a Start time
While working on #19444, I assumed that omitting `Start` would return a
clock that started at January 1, year 1, because that's the zero value
for a `time.Time`, but actually it uses the current UTC time instead.

This behaviour is non-obvious, so document it.

Updates #cleanup

Change-Id: Id91400778578655953ff3e1671ce470db97cfe91
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-04-28 00:15:46 +02:00
Brad Fitzpatrick
ad5436af0d tstest/largetailnet, tstest/integration/testcontrol: add in-process large-tailnet benchmark
Add a Go benchmark that exercises a single tailnet client (a [tsnet.Server]
running in the test process) against a synthetic large initial netmap and
a stream of caller-driven peer add/remove deltas, all in-process.

The harness is split in two parts:

  - tstest/largetailnet, a reusable package containing a [Streamer]
    that hijacks the map long-poll on a [testcontrol.Server] via the new
    AltMapStream hook, sends one initial MapResponse with N synthetic
    peers, and forwards caller-supplied delta MapResponses on the same
    stream. Helpers like MakePeer / AllocPeer build synthetic peers with
    unique IDs and addresses derived from the Tailscale ULA range.

  - tstest/largetailnet/largetailnet_test.go, BenchmarkGiantTailnet
    (headless tailscaled workload, no IPN bus subscriber) and
    BenchmarkGiantTailnetBusWatcher (GUI-client workload with one
    Notify subscriber attached). Both are gated on
    --actually-test-giant-tailnet (skipped by default), stand up an
    in-process testcontrol + tsnet.Server, let Up block until the
    initial N-peer netmap has been processed, then ResetTimer and run
    add+remove pairs via b.Loop. Per-delta sync is via a test-only
    [ipnlocal.LocalBackend.AwaitNodeKeyForTest] channel that closes
    once the just-added peer key appears in the netmap (no-watcher
    variant) or via bus-Notify drain (bus-watcher variant).

To support the hijack, [testcontrol.Server] grows an AltMapStream hook
and a small MapStreamWriter interface for benchmarks/stress tests that
need to drive a controlled MapResponse sequence; the normal serveMap
path is untouched when AltMapStream is nil. The streamer answers
non-streaming "lite" map polls (which controlclient issues before the
streaming long-poll to push HostInfo) with an empty MapResponse and
returns immediately, so the streaming poll that follows is the one
that gets the initial netmap.
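
The lite-poll versus streaming-poll handling described above can be sketched roughly as below. All types here (`mapRequest`, `mapResponse`, `responseWriter`, `streamer`) are simplified, hypothetical stand-ins; the real AltMapStream and MapStreamWriter signatures are not shown in this message and are not assumed here.

```go
package main

import "fmt"

// mapRequest and mapResponse are hypothetical, heavily simplified stand-ins
// for the control protocol messages.
type mapRequest struct {
	Stream bool // true for the long-poll, false for a "lite" poll
}

type mapResponse struct {
	Peers []string
}

// responseWriter is a hypothetical minimal writer for one map poll.
type responseWriter interface {
	Send(*mapResponse) error
}

// streamer sends one initial response with n synthetic peers and then
// forwards caller-supplied deltas on the same stream.
type streamer struct {
	n      int
	deltas <-chan *mapResponse
}

func (s *streamer) serve(req mapRequest, w responseWriter) error {
	if !req.Stream {
		// Lite poll: answer with an empty response and return at once,
		// so the streaming poll that follows gets the initial netmap.
		return w.Send(&mapResponse{})
	}
	initial := &mapResponse{Peers: make([]string, 0, s.n)}
	for i := 0; i < s.n; i++ {
		initial.Peers = append(initial.Peers, fmt.Sprintf("peer-%d", i))
	}
	if err := w.Send(initial); err != nil {
		return err
	}
	for d := range s.deltas {
		if err := w.Send(d); err != nil {
			return err
		}
	}
	return nil
}

// recorder collects responses so the sequence can be inspected.
type recorder struct{ sent []*mapResponse }

func (r *recorder) Send(m *mapResponse) error {
	r.sent = append(r.sent, m)
	return nil
}

func main() {
	deltas := make(chan *mapResponse, 1)
	deltas <- &mapResponse{Peers: []string{"peer-new"}}
	close(deltas)
	s := &streamer{n: 3, deltas: deltas}

	lite, long := &recorder{}, &recorder{}
	_ = s.serve(mapRequest{Stream: false}, lite)
	_ = s.serve(mapRequest{Stream: true}, long)
	fmt.Println(len(lite.sent), len(long.sent), len(long.sent[0].Peers))
}
```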

The benchmark is intended for before/after comparisons of netmap- and
delta-handling changes targeted at large tailnets. CPU profiles on
unmodified main show the expected O(N) hotspots:
setControlClientStatusLocked / authReconfigLocked /
userspaceEngine.Reconfig / setNetMapLocked, plus JSON encoding of the
full Notify.NetMap to bus watchers (which dominates the BusWatcher
variant).

Median ms/op over 10 runs on unmodified main, by tailnet size N:

       N      no-watcher   bus-watcher
   10000          32          166
   50000         222          865
  100000         504         1765
  250000        1551         4696

Recommended invocation:

	go test ./tstest/largetailnet/ -run=^$ \
	    -bench='BenchmarkGiantTailnet(BusWatcher)?$' \
	    -benchtime=2000x -timeout=10m \
	    --actually-test-giant-tailnet \
	    --giant-tailnet-n=250000 \
	    -cpuprofile=/tmp/giant.cpu.pprof

Updates #12542

Change-Id: I4f5b2bb271a36ba853d5a0ffe82054ef2b15c585
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 11:47:12 -07:00
Mike O'Driscoll
33342aec32 The connmark save/restore rules in mangle/PREROUTING restore the Tailscale bypass fwmark (0x80000) onto reply packets so that rp_filter's reverse-path check routes through the main table instead of table 52. However, the kernel only uses the packet's fwmark during the rp_filter lookup when net.ipv4.conf.all.src_valid_mark=1. (#19537)
On systems where this sysctl defaults to 0 (including GCP VMs), rp_filter performs its lookup with fwmark=0, matches rule 5270, consults table 52, finds the route 0.0.0.0/0 dev tailscale0, and therefore drops every reply packet arriving on the physical interface as a martian. This breaks all connectivity when using an exit node: DERP, DNS, control plane, and even the cloud metadata service.

Set src_valid_mark=1 when enabling the connmark rules so the rp_filter workaround actually works in these cases.

Updates #3310
Updates tailscale/corp#37846

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-04-27 13:52:45 -04:00
Brad Fitzpatrick
0e10a3f580 net/tsdial, ipn/localapi, client/local: let clients dial non-Tailscale addresses directly
Add a tsdial.Dialer.UserDialPlan method that resolves an address and
reports whether the dialer would route it via Tailscale. The LocalAPI
/dial handler now uses this to skip proxying for addresses that aren't
Tailscale routes (e.g. localhost), returning a Dial-Self response with
the resolved address so the client can dial it directly. This avoids
an unnecessary round-trip through the daemon for local connections.

The client's UserDial handles the new response by dialing the resolved
address itself, and the server passes the pre-resolved IP:port for
Tailscale dials to avoid redundant DNS lookups.

Thanks to giacomo and Moyao for pointing this out!

Updates tailscale/corp#39702

Change-Id: I78d640f11ccd92f43ddd505cbb0db8fee19f43a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-27 09:33:27 -07:00
Andrew Lytvynov
649781df84 util/pidowner: remove unused package (#19521)
Added in 2020, this appears to be unused.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-27 09:25:46 -07:00
Andrew Lytvynov
a70629eae3 util/topk: remove unused package (#19524)
Added in 2024 and appears unused.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-27 09:13:40 -07:00
Andrew Lytvynov
346d6bb04c util/sysresources: remove unused package (#19523)
Added a few years ago and appears to be unused.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-27 09:13:30 -07:00
Andrew Lytvynov
64bb40b45b util/pool: remove unused package (#19522)
Added in 2024 and appears to be unused.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-27 09:13:14 -07:00
BeckyPauley
7477a6ee47 cmd/k8s-operator: use dynamic resource names in e2e ingress tests (#19536)
Replace hardcoded resource names with dynamically generated names in
k8s-operator-e2e ingress tests to avoid collisions with stale resources.

Updates tailscale/corp#40612

Signed-off-by: Becky Pauley <becky@tailscale.com>
2026-04-27 13:40:46 +01:00
Evan Lowry
3a05c450ce posture: add HealthTracker for serial number retrieval (#19181)
Device posture checking can fail while enabled if tailscaled does not
have access to smbios. Previously, this was only observable by looking
in the tailscaled logs.

Fixes tailscale/corp#39314

Signed-off-by: Evan Lowry <evan@tailscale.com>
2026-04-25 15:42:47 -03:00
Brad Fitzpatrick
f3b2f9b0ef all: fix duplicate package docs and tighten TestPackageDocs
TestPackageDocs walked into directories starting with "." (such as
.claude worktrees) and only logged warnings on duplicate package docs
across files in a directory. Skip dot-directories (which covers the
old .git but also .claude), ignore files with "//go:build ignore" so
command files don't falsely trip the duplicate check, and promote the
duplicate-doc warning to a t.Errorf.

While here, deduplicate the package docs that were previously only
logged: drop the redundant comment from client/systray/startup-creator.go,
move the comprehensive taildrop doc into feature/taildrop/doc.go, and
remove a leftover doc fragment from feature/condlite/expvar/omit.go.

The tstest/integration/vms allowlist is no longer needed since the
//go:build ignore filter now handles its dns_tester.go and udp_tester.go
files generically.

Fixes #19526

Change-Id: Id794d96bd728826a1883a054e4a244f90fa05d3d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-24 19:01:43 -07:00
Andrew Lytvynov
873b8b8e2e maths: remove unused package (#19516)
Added in 2025 and appears to be unused.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-24 16:17:10 -07:00
Andrew Lytvynov
d64ed4af89 util/expvarx: remove unused package (#19519)
Added in 2024 and appears to be unused.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-24 16:16:42 -07:00
Andrew Lytvynov
4195e34f79 util/cstruct: remove unused package (#19518)
Added in 2022 and appears to be unused.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-24 16:09:54 -07:00
Andrew Lytvynov
323198b348 envknob/logknob: remove unused package (#19515)
Added in 2023 and appears to be unused.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-24 15:48:06 -07:00
James Tucker
1b40911611 wgengine/netstack: absorb all quad-100 traffic locally, never leak to peers
Previously, handleLocalPackets intercepted traffic to the Tailscale
service IP (100.100.100.100 / fd7a:115c:a1e0::53) only for an allow-list
of ports: TCP 53/80/8080 and UDP 53. Any other port returned
filter.Accept, letting the packet fall through to the ACL filter and
wireguard-go, which would attempt a peer lookup. No peer owns the
quad-100 AllowedIP, so after ~5s pendopen.go would log:

    open-conn-track: timeout opening ...; no associated peer node

This is the common "conntrack error no peer found for 100.100.100.100:853"
log spam seen in the wild (e.g. from systemd-resolved or another
resolver speculatively trying DoT on quad-100). It also leaks quad-100
packets onto the tailnet.

Remove the port allow-list so handleLocalPackets absorbs every quad-100
packet into netstack regardless of IP protocol or port. Traffic never
reaches the conntrack / peer-routing layers.

With the allow-list gone, acceptTCP needs a corresponding guard: on a
quad-100 TCP port we don't serve, execution used to fall through to the
isTailscaleIP case (quad-100 is in the tailscale IP range), which
rewrote the dial target to 127.0.0.1:<port> and forwardTCP'd the
connection to whatever happened to be listening on the host's loopback
at that port. Add a hittingServiceIP case that RSTs cleanly instead,
placed before the isTailscaleIP fallthrough.

TestQuad100UnservedTCPPortDoesNotForward is a new integration test that
injects a TCP SYN to 100.100.100.100:853 via handleLocalPackets, stubs
forwardDialFunc, and asserts the dialer is not invoked; it catches
regressions of the acceptTCP recursion/loopback-redirection case.

Fixes #15796
Fixes #19421
Updates #3261
Updates #11305

Signed-off-by: James Tucker <james@tailscale.com>
2026-04-24 12:42:16 -07:00
Brad Fitzpatrick
006d7e180e version: use debug.ReadBuildInfo in CmdName on non-Windows
CmdName was re-opening the running executable and scanning it in
64KiB chunks for the Go modinfo markers on every call. The same
modinfo is already parsed at startup and exposed via
runtime/debug.ReadBuildInfo, so prefer that on non-Windows. Windows
still takes the scanning path because its GUI-binary override keys
off the on-disk executable name.
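
As a rough illustration of the cheaper path, the main package's import path is available from runtime/debug.ReadBuildInfo without touching the executable on disk. The helper names below are hypothetical, and deriving the name from the last path element is an assumption, not necessarily what CmdName does.

```go
package main

import (
	"fmt"
	"path"
	"runtime/debug"
)

// cmdNameFromPath returns the last element of a main package import path,
// e.g. "tailscaled" for "tailscale.com/cmd/tailscaled".
func cmdNameFromPath(mainPkgPath string) string {
	if mainPkgPath == "" {
		return ""
	}
	return path.Base(mainPkgPath)
}

// cmdName reads the build info embedded in the running binary, which the
// runtime has already parsed, and derives a command name from it.
func cmdName() string {
	info, ok := debug.ReadBuildInfo()
	if !ok {
		return "" // binary was built without module information
	}
	return cmdNameFromPath(info.Path)
}

func main() {
	fmt.Println(cmdNameFromPath("tailscale.com/cmd/tailscaled"))
	fmt.Println(cmdName())
}
```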

benchstat of BenchmarkCmdName (Linux, before vs after):

    goos: linux
    goarch: amd64
    pkg: tailscale.com/version
    cpu: Intel(R) Xeon(R) 6975P-C
               │  /tmp/old.txt  │            /tmp/new.txt             │
               │     sec/op     │   sec/op     vs base                │
    CmdName-16   556045.5n ± 1%   825.6n ± 1%  -99.85% (p=0.000 n=10)

               │ /tmp/old.txt  │             /tmp/new.txt             │
               │     B/op      │     B/op      vs base                │
    CmdName-16   64.587Ki ± 0%   1.156Ki ± 0%  -98.21% (p=0.000 n=10)

               │ /tmp/old.txt │            /tmp/new.txt            │
               │  allocs/op   │ allocs/op   vs base                │
    CmdName-16     8.000 ± 0%   7.000 ± 0%  -12.50% (p=0.000 n=10)

Fixes #19486

Change-Id: I925c5e28b64815a602459beb6c8dab8779339a6c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-24 09:48:11 -07:00
Fran Bull
306fab796c feature/conn25: add the ability to return addresses to the IP Pools
This will be used as part of the address assignment expiry work.

Updates tailscale/corp#39975

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-04-24 08:48:48 -07:00
kari-ts
aa740cb393 ipnlocal/drive: reduce noisy per-peer remote logs (#19493)
This drops the per-peer "appending remote" log emitted while constructing the remote list, which can get noisy on big tailnets. It keeps the logs around remote availability checks, including whether a peer is missing, offline, lacks PeerAPI reachability, lacks sharing permission, or is available.

Updates tailscale/corp#40580

Signed-off-by: kari-ts <kari@tailscale.com>
2026-04-24 08:26:33 -07:00
Andrew Lytvynov
ad9e6c1925 go.mod: bump github.com/google/go-containerregistry (#19500)
This drops an indirect dependency on the old github.com/docker/docker
(which was replaced with github.com/moby/moby) and fixes a couple of
recent CVEs.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-04-23 10:39:27 -07:00
Claus Lensbøl
ee76a7d3f8 wgengine/magicsock: do not send TSMP disco when connected (#19497)
When there is an active connection between devices, do not send new
disco keys via TSMP.

Updates #12639

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-23 12:23:57 -04:00
Brad Fitzpatrick
a7d8aeb8ae misc/genreadme,tempfork/pkgdoc,tsnet: generate README.md files from godoc
Adds a CI check to keep opted-in directories' README.md files in sync
with their package godoc. For now tsnet (and its sub-packages under
tsnet/example) is the only opted-in tree. The list of directories
lives in misc/genreadme/genreadme.go as defaultRoots, so CI and humans
both just run `./tool/go run ./misc/genreadme` with no arguments.

The check piggybacks on the existing go_generate job in test.yml and
fails if any README.md is out of date, pointing the user at the same
command.

Along the way:

 - tempfork/pkgdoc now emits Markdown instead of plain text: headings
   become level-2 with no {#hdr-...} anchors, and [Symbol] doc links
   resolve to pkg.go.dev URLs, including for symbols in the current
   package (which the default Printer would otherwise emit as bare
   #Name fragments with no backing anchor in a README). Parsing no
   longer uses parser.ImportsOnly, so doc.Package knows the package's
   symbols and can resolve [Symbol] links at all.

 - genreadme also emits a pkg.go.dev Go Reference badge at the top of
   a library package's README; suppressed for package main.

 - tsnet/tsnet.go's package godoc is expanded in idiomatic godoc
   syntax — [Type], [Type.Method], reference-style [link]: URL
   definitions — rather than Markdown-flavored [text](url) or
   backtick-quoted identifiers, so that both pkg.go.dev and the
   generated README.md render cleanly from a single source.

Fixes #19431
Fixes #19483
Fixes #19470

Change-Id: I8ca37e9e7b3bd446b8bfa7a91ac548f142688cb1
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Walter Poupore <walterp@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-22 15:13:09 -07:00