When MagicDNS is enabled but no global upstream resolvers are configured,
the forwarder only handles specific suffixes and defers other names to the
system resolver. A query it has no resolver for is expected in that case, so
don't raise the dns-forward-failing warning unless a default "." route makes
Tailscale the default resolver.
Fixes #19931
Signed-off-by: Brendan Creane <bcreane@gmail.com>
This resolves a local privilege escalation (LPE). Prior to this change,
a non-admin user could utilize serve to access local Unix sockets they
otherwise should not be able to access. For example,
tailscale serve --http 80 unix:/var/run/docker.sock
would give the user access to the Docker socket (usually root only).
This works because tailscaled has root access and implements the proxy
to the socket (see also: 'the confused deputy problem').
We resolve the problem by refusing to serve Unix targets altogether
unless instructed to by a root user.
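The gate can be pictured roughly as below. This is only a sketch, not the code in tailscaled: checkServeTarget, its callerIsRoot parameter, and errUnixTargetDenied are hypothetical names, and how the daemon actually identifies a root caller is not shown.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errUnixTargetDenied is a hypothetical error used only in this sketch.
var errUnixTargetDenied = errors.New("unix socket targets require root")

// checkServeTarget refuses "unix:" proxy targets unless the requesting
// user is root. All other targets pass through unchanged.
func checkServeTarget(target string, callerIsRoot bool) error {
	if strings.HasPrefix(target, "unix:") && !callerIsRoot {
		return errUnixTargetDenied
	}
	return nil
}

func main() {
	fmt.Println(checkServeTarget("unix:/var/run/docker.sock", false))
	fmt.Println(checkServeTarget("unix:/var/run/docker.sock", true))
	fmt.Println(checkServeTarget("http://127.0.0.1:8080", false))
}
```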
Thanks to Tim Sageser (dtrsecurity) for this report.
Fixes tailscale/corp#41998
Signed-off-by: Harry Harpham <harry@tailscale.com>
The test was asserting that a tailnet ping between two nodes traversed
DERP rather than going direct. But that wasn't really the point of the test,
and I kept forgetting ways that magicsock could find direct paths and
thus break this test.
So loosen it.
We really just want to see whether DERP worked at all and was used in the process
of getting a ping through, whether it was direct or not.
And that "tailscale debug derp" worked at all, which was what the bug
was about to begin with.
No need for all the "must be over DERP" stuff.
Updates #15579
Change-Id: I70ca63dc10919efa3d193b7af1d31a4a3b9d3950
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
vnet only ever sent IPv6 RAs in response to a Router Solicitation. In
practice this meant gokrazy VMs running with a dual-stack LAN never
installed vnet's IPv6 default route: gokrazy brings the link up via
DHCPv4 and the kernel never emits an RS on its own under that init
path. Off-link IPv6 destinations like the fake DERP servers were
therefore unreachable from any gokrazy test node that also had v4
on the same interface. (Pure-v6 nodes happened to work because the
kernel sends an RS as part of v6-only autoconf.)
Fix this in two complementary ways:
- Send an unsolicited RA every 5s to the link-local all-nodes group
on every v6-enabled network. This matches what real routers do
(RFC 4861 §6.2.1, MaxRtrAdvInterval; we use a much shorter
interval than the spec's 200s default so short-lived tests don't
have to wait).
- Send a unicast RA to a newly-registered MAC as soon as a client
first transmits on the wire. Without this the first periodic RA
can land before any VM has connected and the next one isn't
until the next tick, which can be longer than the test runs.
Factor the RA serialization out into buildIPv6RouterAdvertisement so
the solicited, periodic, and per-client paths all share one body.
Update TestSelfSignedDERPHashPinning to use a dual-stack hard-NAT
builder and assert zero errors from DebugDERPRegion (instead of
filtering "over IPv6" errors as it had to before this change). The
new builder also sets TS_DEBUG_STRIP_ENDPOINTS=1 on tailscaled so
disco can't find a direct path: without endpoint stripping, the now-
working non-NATted IPv6 LAN gives the two hard-NAT'd nodes a direct
route, defeating the test's "must traverse DERP" assertion. (Hard
NAT alone was enough before this change because v6 routing was
broken.) Also update sendBetweenClients in the vnet unit tests to
tolerate the new on-register RA noise on its read path.
Updates #13038
Updates #19973
Change-Id: Ic281dc53702a25fa773c46313f453837814233e8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
serveDebugDERPRegion built its TLS config with
ServerName: cmp.Or(derpNode.CertName, derpNode.HostName), which for a
"sha256-raw:<hex>" CertName passed the raw fingerprint to Go's stock
verifier as a hostname; the handshake always failed with a hostname
mismatch. This is the second half of #15579; the first half (tailscaled
itself failing with "unexpected multiple certs presented") was fixed in
Extract a tlsConfigForNode helper that mirrors derphttp.Client.tlsClient
so that sha256-raw and domain-fronting CertName values are dispatched
to tlsdial.SetConfigExpectedCertHash and tlsdial.SetConfigExpectedCert
respectively, falling back to HostName when CertName is empty.
The core fix here was originally written by @imnuke in #19965; that PR
also added a unit test in ipn/localapi/debugderp_test.go which is
replaced in this commit by a new vmtest that exercises the whole stack:
vnet now serves a self-signed cert valid for each fake DERP node's
HostName and exposes its SHA-256 fingerprint, and vmtest grows a new
SelfSignedDERPCertPinning EnvOption that swaps the test DERP map's
nodes to CertName="sha256-raw:<hex>" with InsecureForTests cleared.
TestSelfSignedDERPHashPinning then stands up two hard-NAT'd nodes, has
them communicate over DERP, and calls DebugDERPRegion on each. Before
this fix the test fails with the exact x509 hostname-mismatch error
from the original bug; after, it passes.
Updates #15579
Change-Id: I61f38ffebc7ac5abc962639db1ae88f5cd8633b1
Co-authored-by: Nuke <nuke@imnuke.dev>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Commit 2b338dd6a8 removed watchdogEngine because it was weird
(so many methods) and increasingly unnecessary after we'd cleaned up
and simplified so much of the locking.
This adds back a watchdog, but one that's easier to maintain and more
idiomatic.
Updates #19759
Change-Id: I86c458473e126c0809f37696446ce7acf4cc4eb9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add support for --strip option to strip symbols.
Building a heavily customized binary with custom flags needed some additional work, so I thought I would contribute it back upstream.
Signed-off-by: Jamie Sinn <james.sinn@sinndevelopment.com>
In order to allow us to measure the performance effects of client-side netmap
caching, both with and without the feature enabled, add logs to record how long
it takes after a client restart or profile switch for the node to establish
contact with peers, relative to the first uncached netmap.
We do this by keeping track of a timestamp when the connection is constructed,
and logging a record for "new" peer contacts that records how long (in
microseconds) it took from the time the peer was recorded as a candidate. The
message includes whether the contact was via DERP or direct, and whether a
cached netmap was in use at the time.
This builds on and extends the counters from #19699, but here we include new
contacts whether or not a cached netmap is in use, so that we can establish a
baseline for comparison.
Updates #12639
Updates tailscale/projects#27
Change-Id: I4f6d050e221f3881848d05a0425c4a5d1a59294c
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
This bug was surfaced by #19960 because benchmarks shouldn’t have run
TestListenService, but they did because PowerShell interpreted the
match-empty-string pattern `"^$"` as the beginning-of-string pattern `'^'`.
This patch has the Windows build run `./tool/go` binaries with bash
and synchronizes it with the *nix `bench all` run.
Updates #18884
Updates #19960
Signed-off-by: Simon Law <sfllaw@tailscale.com>
cros-garcon NULL-derefs on cold-boot netlink enumeration when
tailscale0 is present, preventing the Crostini container and
ChromeOS Terminal from starting cleanly. This is an upstream
ChromiumOS bug in cros-garcon; tailscaled can work around it
by defaulting to userspace-networking mode on Crostini.
Tailscale SSH continues to work via tailscaled's netstack.
Users can override with --tun=tailscale0 on ChromeOS builds
where cros-garcon is fixed.
Crostini is detected via /opt/google/cros-containers/bin/garcon,
which is present in every Crostini penguin container.
ssh/tailssh extends the existing Debian default-PATH case to
cover Crostini, since Crostini is Debian-based and benefits
from the same SSH PATH defaults.
RELNOTE: Crostini now defaults to userspace-networking.
Fixes #19488
Updates #12090
Signed-off-by: ferrumclaudepilgrim <ferrumclaudepilgrim@users.noreply.github.com>
The Tailscale daemon only refreshed TLS certs as a side effect of inbound
TLS handshakes or "tailscale cert" CLI calls. A node that doesn't see
inbound traffic during the renewal window silently rolls past expiry.
Add a once-per-hour background loop on LocalBackend that enumerates Serve
and Funnel HTTPS hostnames (filtered against the netmap's CertDomains so
we don't poke ACME for other nodes' service hostnames) and calls the
existing GetCertPEM path. The renewal decision (ARI window, then 2/3
expiry fallback) is unchanged; the loop just guarantees it runs.
For visibility during initial issuance or restart with a long-expired
cached cert, add a "tls-cert-pending" health Warnable that's set while
ACME is in flight and no usable cached cert exists. Async renewal of a
still-valid cert intentionally doesn't fire it. And then make the CLI "cert"
subcommand print out a warning if it's blocking due to a cert fetch
in flight, using that health info.
Fixes #19911
Fixes #19912
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I144e46c40e957b2e879587decace32a523a6eade
When running `tailscale netcheck`, the reported timestamp used to be
in UTC and formatted according to RFC 3339 with a `T` to separate the
date from the time:
sfllaw@h2co3:~$ tailscale netcheck | head -n3
Report:
* Time: 2026-06-01T21:12:32.252620138Z
This is machine-readable time leaking out to the user interface. Times
in normal commands are formatted for humans to read:
sfllaw@h2co3:~$ date
Mon 01 Jun 2026 02:39:14 PM PDT
sfllaw@h2co3:~$ journalctl -t tailscaled | tail -n1
Jun 01 14:35:21 h2co3 tailscaled[3328921]: wgengine: sending TSMP disco key advertisement to 100.90.144.102
sfllaw@h2co3:~$ timedatectl show
Timezone=America/Los_Angeles
LocalRTC=no
CanNTP=yes
NTP=yes
NTPSynchronized=yes
TimeUSec=Mon 2026-06-01 14:38:32 PDT
RTCTimeUSec=Mon 2026-06-01 14:38:32 PDT
sfllaw@h2co3:~$ uptime --since
2026-05-15 07:37:45
This PR makes the times printed by the CLI commands consistent:
- For `tailscale routecheck`, it now prints local time as
`2026-05-15 07:37:45-07:00`.
- For `netlogfmt`, it has always printed local time with a space,
but now includes the time zone.
- All machine-readable outputs continue to be standard RFC 3339 in
UTC, i.e. `--format=json`.
As part of a general cleanup, this PR also adds standard common
time.Format layouts as tstime constants.
Fixes #19928
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Add a new tailcfg.NodeCapability (NodeAttrDisableCacheNetworkMaps) to allow the
policy document to override whether a node will receive the cache-network-maps
attribute by default. The client does not interpret this attribute directly; it
is used to influence decisions by the control plane.
As of 2026-06-01, cache-network-maps is only sent when explicitly requested by
the policy. In a future version, we will send it by default for clients with a
sufficient capability version (to be added in a future commit), except to
ephemeral nodes, unless the policy sets disable-cached-network-maps.
Updates #12639
Updates tailscale/projects#28
Change-Id: I6376376d7898f7da8db977e457dcd45df9deef41
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Capture Auto.mapCtx while holding Auto.mu before using it for
incremental map update forwarding. Pause and restart paths can replace
the context under the same mutex, so using it after unlocking races
with those writers.
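The pattern looks roughly like this. The auto type here is a cut-down, hypothetical stand-in for the real Auto, and currentMapCtx and restart are illustrative names only.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// auto is a hypothetical, reduced stand-in for the real type.
type auto struct {
	mu        sync.Mutex
	mapCtx    context.Context
	mapCancel context.CancelFunc
}

func newAuto() *auto {
	a := &auto{}
	a.mapCtx, a.mapCancel = context.WithCancel(context.Background())
	return a
}

// restart replaces the map context while holding mu, as the pause and
// restart paths do.
func (a *auto) restart() {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.mapCancel()
	a.mapCtx, a.mapCancel = context.WithCancel(context.Background())
}

// currentMapCtx snapshots mapCtx under mu so callers can use the copy
// after unlocking without racing with restart.
func (a *auto) currentMapCtx() context.Context {
	a.mu.Lock()
	defer a.mu.Unlock()
	return a.mapCtx
}

func main() {
	a := newAuto()
	ctx := a.currentMapCtx() // captured while holding the lock
	a.restart()
	fmt.Println(ctx.Err() != nil, a.currentMapCtx().Err() == nil)
}
```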
Add a race regression test for the UserProfiles path that repeatedly
cancels the map context while incremental profile updates are
forwarded.
Fixes #19953
Change-Id: Icc55c4a0dffbc16d6507a2b446b3909d4d0a0278
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Several packages built their HTTP transports with
http.DefaultTransport.(*http.Transport).Clone()
The standard library only documents http.DefaultTransport as an
http.RoundTripper, so an application is free to replace it with a
RoundTripper that is not a *http.Transport (e.g. an instrumented or
tracing wrapper). When such an application embeds tsnet.Server, the
unchecked type assertion panics as soon as tsnet brings up its control
connection, DNS bootstrap, or log uploader.
Add netutil.NewDefaultTransport, which returns a clone of the global
when it is still the standard *http.Transport (preserving existing
behavior) and otherwise returns a fresh transport mirroring the stdlib
defaults. Route every clone site through it.
Updates #19937
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Achille Roussel <achille.roussel@gmail.com>
This adds @alexwlchan's proposed "tailscale get" command that reads
current preference values, complementing "tailscale set". It uses the
same flag names as set.
tailscale get # show all settings as a table
tailscale get all # same
tailscale get accept-dns # show a single value
tailscale get --json # output as JSON object
tailscale get --set-flags # output as tailscale set argv
Fixes #11389
Fixes tailscale/corp#38702
Change-Id: Ie366f27f11ccc56c76fff9a94ed8a9de9c835bd0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Introduce a new `tailscale routecheck` command which prints a report
of high-availability routers that are reachable.
This command rhymes with the `tailscale netcheck` command, but
instead of reporting on local network conditions, `routecheck` reports
on remote connectivity.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
When a connector receives a packet from a client on a transit IP that it
can't find a real IP mapping for, it drops the packet. This commit
starts notifying the client of this dropping over TSMP, so the client
can tell the connector to re-establish the transit IP-real IP binding.
Updates tailscale/corp#34256
Signed-off-by: Naman Sood <mail@nsood.in>
Single-user tailnets often have the same tailnet display name as login
name.
This change omits the duplication when matching, and skips the
user-switching submenu when only one account is configured, to clean up
the account display a little bit.
Fixes #16889
Signed-off-by: Evan Lowry <evan@tailscale.com>
Previously, testwrapper only retried tests explicitly annotated with
flakytest.Mark. Authors don't pre-emptively mark tests that haven't
flaked yet, so the first flake of a brand-new test failed CI even
when a re-run would have passed.
testwrapper now retries every failing test within a per-test wall-clock
budget (default: 5 minute per-attempt timeout capped at 1.5x the first
failure duration, 10 minute total). A test that fails and then passes
on retry is reported as flaky; a test that never passes within the
budget remains a real failure (exit non-zero).
For flakeapp's existing log scraping, the wire format is preserved:
the "flakytest failures JSON:" line is now emitted only for tests
that ultimately flaked (passed on retry). Unmarked tests get a fake
issue URL of the form https://github.com/{owner}/{repo}/issues/UNKNOWN
where owner/repo is detected from GITHUB_REPOSITORY, the local git
remote, or falls back to tailscale/tailscale. A new "permanent test
failures JSON:" line is emitted for tests that never passed; flakeapp
ignores it for now (a follow-up can teach it to record real failures
separately).
flakytest.Mark stays as an opt-in API: still useful for tracking a
known-flaky test against a real issue and for TS_SKIP_FLAKY_TESTS.
Updates tailscale/corp#38960
Change-Id: I56dfc9b023486d239f60793a53e9690578ce8017
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In order to support a `tailscale routecheck` command, we introduce the
`/localapi/v0/routecheck` endpoint to the local API. This endpoint
returns the most recent report collected by the routecheck client.
If `force=true` is an argument in the query string, then this endpoint
will actively probe before returning the report.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The routecheck package parallels the netcheck package, where the
former checks routes and routers while the latter checks networks.
Like netcheck, it compiles reports for other systems to consume.
Historically, the client has never known whether a peer is actually
reachable. Most of the time this doesn’t matter, since the client will
want to establish a WireGuard tunnel to any given destination.
However, if the client needs to choose between two or more nodes,
then it should try to choose a node that it can reach.
Suggested exit nodes are one such example, where the client filters
out any nodes that aren’t connected to the control plane. Sometimes an
exit node will get disconnected from the control plane: when the
network between the two is unreliable or when the exit node is too
busy to keep its control connection alive. In these cases, Control
disables the Node.Online flag for the exit node and broadcasts this
across the tailnet. Arguably, the client should never have relied on
this flag, since it only makes sense in the admin console.
This patch implements an initial routecheck client that can probe
every node that your client knows about. You should not ping scan your
visible tailnet; this method is for debugging only.
This patch also introduces a new OnNetMapToggle hook, which fires when
the netmap transitions from nil to non-nil, or vice versa. This
happens either when the client receives its first MapResponse after
connecting to the control plane, or when it clears the netmap while it
is disconnecting. Routecheck uses this to wait for a valid netmap
so it knows which peers to probe.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Block dynamic linker environment variables (LD_PRELOAD, LD_LIBRARY_PATH,
DYLD_INSERT_LIBRARIES, and friends) from being forwarded regardless of
acceptEnv policy, preventing privilege escalation via wildcard patterns
like "*".
We are not aware of any legitimate use of these variables so they are
safe to exclude from being passed.
Thanks to Tim Sageser (dtrsecurity) for this report.
Updates tailscale/corp#42033
Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
Adds two tests covering the fix in 0e4c8fc92:
TestDialNodeUsingProxyPort exercises dialNodeUsingProxy directly via a
stub CONNECT proxy, asserting the recorded target across four cases:
HTTPS/HTTP default fallback and explicit DERPPort override for each.
TestConnectThroughProxyHonorsDERPPort drives the full path end-to-end:
a real derpserver on an ephemeral TLS port, a real CONNECT proxy that
tunnels bytes bidirectionally, and a region client routed through it
via feature.HookProxyFromEnvironment. Without the fix, Connect fails
because the proxy is asked to dial :443.
Signed-off-by: Martin Zihlmann <martizih@outlook.com>
dialNode picks the destination port from n.DERPPort when non-zero,
falling back to 443 (or 3340 when useHTTPS is false). The proxy path,
dialNodeUsingProxy, hardcoded "443" in the CONNECT target, so a DERP
server reachable only on a custom port was unreachable through
HTTPS_PROXY: the proxy would faithfully tunnel to :443 at the DERP
hostname, and TLS would either fail cert validation or talk to the
wrong service.
Mirror dialNode's port selection so both paths behave the same.
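The shared port selection amounts to something like the sketch below; portFor and connectTarget are hypothetical helper names, and the example hostname is made up.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// portFor returns the custom DERP port when set, else 443 for HTTPS
// and 3340 otherwise.
func portFor(derpPort int, useHTTPS bool) string {
	if derpPort != 0 {
		return strconv.Itoa(derpPort)
	}
	if useHTTPS {
		return "443"
	}
	return "3340"
}

// connectTarget builds the host:port string a CONNECT proxy is asked to dial.
func connectTarget(host string, derpPort int, useHTTPS bool) string {
	return net.JoinHostPort(host, portFor(derpPort, useHTTPS))
}

func main() {
	fmt.Println(connectTarget("derp.example.com", 8443, true))
	fmt.Println(connectTarget("derp.example.com", 0, true))
	fmt.Println(connectTarget("derp.example.com", 0, false))
}
```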
Fixes #19748
Signed-off-by: Martin Zihlmann <martizih@outlook.com>
In case we land on this branch during a goto retry. Also, protect
Geneve offset from mutation across retries.
Fixes #19927
Signed-off-by: Jordan Whited <jordan@tailscale.com>
This adds tsnet.Server.ListenSSH which, if the SSH feature is linked,
returns a net.Listener whose Accept yields *tailssh.Session values (as
net.Conn). This lets tsnet apps accept incoming SSH connections to
implement custom TUI applications.
Basic apps can use net.Conn directly (Read/Write/Close). Rich apps
import ssh/tailssh and type-assert for peer identity, PTY, signals,
etc. If feature/ssh isn't imported, ListenSSH returns an error.
Includes a demo guess-the-number game in tsnet/example/ssh-game.
Updates tailscale/corp#37839
Change-Id: I4e7c3c96afb030cdf4da8f2d8b2253820628129a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Currently we are picking a peer for the split dns routes when we get a
netmap. Use the new custom scheme resolvers, installed per app in the
config in the netmap, to allow us to choose which connector peer should
handle a DNS request at the time the request is made.
Fixes tailscale/corp#39858
Signed-off-by: Fran Bull <fran@tailscale.com>
In PR #19682, we introduced the traffic package which provides a
traffic.Scores.SortNodes method that uses rendezvous hashing to
break ties by equally distributing the “best” node for any given client.
This PR adds a fuzzer to make sure this algorithm is not wildly unfair.
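For readers unfamiliar with rendezvous (highest-random-weight) hashing, a generic sketch follows. It is not the traffic package's implementation: weight and pick are hypothetical names, and FNV-1a is an arbitrary hash choice for the example.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// weight hashes the (client, node) pair into a pseudo-random score.
func weight(client, node string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(client))
	h.Write([]byte{0}) // separator so ("ab","c") differs from ("a","bc")
	h.Write([]byte(node))
	return h.Sum64()
}

// pick returns the node with the highest weight for this client. Each
// client sees its own ordering, which spreads clients across tied nodes.
func pick(client string, nodes []string) string {
	var best string
	var bestW uint64
	for i, n := range nodes {
		w := weight(client, n)
		if i == 0 || w > bestW || (w == bestW && n < best) {
			best, bestW = n, w
		}
	}
	return best
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c"}
	for _, c := range []string{"client-1", "client-2", "client-3"} {
		fmt.Println(c, "->", pick(c, nodes))
	}
}
```

The choice depends only on the set of nodes, not their order, so every observer computes the same winner for a given client.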
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This NodeCapability works around the UDP GSO bugs introduced by
torvalds/linux@b10b446 (v7.0-rc1). These bugs were later fixed by
torvalds/linux@78effd8 and torvalds/linux@5f17ae0 (v7.1-rc5). These
Linux kernel bugs cause mangled UDP headers and UDP checksums, resulting
in high levels of packet loss.
The aforementioned bugs have already made their way downstream into
various distros, e.g. Ubuntu 26.04 LTS. Impacted users are now dealing
with poor UDP performance in tailscaled, and in any other software that
makes use of UDP GSO.
Not all users of the affected kernels are impacted as the relevant
kernel code path sits between kernel and netdev driver, and behaviors
vary by driver/device capability.
We cannot detect impact at runtime, as this would require gathering all
netdevs, and performing loopback tests. This is invasive and in many
cases impossible.
So, we are left to choose between disabling UDP GSO for all users on
affected kernels, whether they experience real impact or not, or try
and work around the bugs. Disabling UDP GSO for a user that is not
impacted can cut max throughput in half, and consume more CPU cycles.
This commit attempts to work around the bugs by avoiding UDP GSO when
batches are small, and injecting a 1-byte sentinel tail payload when
they are large. This tail payload is smaller than "GSO size", which
sidesteps the primary trigger of all fragments in a batch being
equal in length.
The end result is slightly increased payload and packet overhead, but
functional UDP GSO for all Linux 7.0-7.1.4 users, regardless of
netdev/driver.
Updates #19777
Signed-off-by: Jordan Whited <jordan@tailscale.com>
All StateStore implementations store a nil value in the cache map when WriteState is called with a nil byte slice instead of deleting the key. This causes ReadState to return (nil, nil) instead of (nil, ErrStateNotExist), since the key is still present in the map.
This breaks reset-auth on Windows, Linux, and Android, and the node can't log back in without manually editing the state file. (macOS uses a different state store.)
DeleteProfile, DeleteAllProfilesForUser, setUnattendedModeAsConfigured are impacted but don't seem to break because the deleted keys are not reread.
This deletes the key from the cache instead.
Fixes tailscale/corp#42477
Signed-off-by: kari-ts <kari@tailscale.com>
We have been reading the pool config from the app nodeattr, but it is
global config, not per app, so it needs to be its own thing.
Updates tailscale/corp#39999
Signed-off-by: Fran Bull <fran@tailscale.com>
If a user explicitly adds a non-ts.net (not a CertDomain domain) domain
like "foo.com" to their serve config as a web target that's also an allowed
funnel domain (using raw "tailscale serve set-config"), then use the new
ALPN cert fetching (from b553969b) to get certs for that domain.
This is just plumbing; there's no new product functionality to
actually enable this easily client-side, and it also has no visible
product surface to enable it server-side.
Updates tailscale/corp#41736
Change-Id: Ie2e421ac9611bce64bba3de6a454b2d505ea0e8a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The github-ci-vm machine that runs our self-hosted CI for this repo is
only designed for the `vm` job in test.yml. That uses a different cache
dir which is causing github-ci-vm's small disk to fill up. Switch to
ubuntu 24.04 like the rest of our CI for this repo that doesn't require
anything special.
Updates tailscale/corp#40465
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
If we dispatch a ping too early (after a later patch removes a 250ms
blockage) then the ping may be lost due to the peers not yet knowing
about each other. The ping is retained to set up and ensure a
WireGuard session before the test flow.
Updates #19822
Change-Id: I6cfea28931646a9387b6ffc2654e72cd846f4e55
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a "tailscale whoami" subcommand that is equivalent to running
"tailscale whois $(tailscale ip -4)" but more ergonomic. It supports
the --json flag just like whois, and shares the WhoIsResponse
rendering code with whois.
Fixes #19907
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I8f33ba7a5608bab7dffa8213303beb5f345936d3
Adds two tests exercising the HTTP/2-inbound -> plaintext HTTP/1.1 backend
path through serve's reverseProxy and through the full serveWebHandler
entry point (with a funnel serveHTTPContext).
Updates #19866
Signed-off-by: Brendan Creane <bcreane@gmail.com>