tsdial.Dialer.SetNetMap rebuilt an O(n peers) map of MagicDNS names on
every netmap change. As we move toward per-peer incremental deltas,
this becomes quadratic. This removes it and replaces it with
SetResolveMagicDNS, a callback into LocalBackend that looks up
hostnames from nodeBackend's new nodeByName index (populated alongside
nodeByAddr/nodeByKey on both full and delta paths). The index stores
both FQDNs and short names as keys.
This is the same treatment applied to netlog (8f210454d), wglog
(988b0905b), and drive (1d6989408): stop pushing *netmap.NetworkMap
into subsystems and instead have them pull from LocalBackend's live
data via callbacks.
Updates #12542
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I24557ab0c8a27636e08e4779bcfd3ec633db0a78
When recommending an exit node, suggestExitNodeLocked ranks candidates by
the latency to their home DERP region, taken from the most recent netcheck
report. But netcheck alternates between full reports, which probe every
region, and incremental reports, which only re-probe the home region and a
handful of the fastest regions. When the most recent report was incremental,
the suggestion fell back to a random choice for exit nodes that are far away.
Now we rank candidates against the best recent latency, tracked by the
`netcheck.Client` - the same data that is used to pick the preferred
DERP. It uses a history of measurements which includes a full netcheck
report, so should cover all DERP regions.
Updates tailscale/corp#17516
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The Logger previously took a *netmap.NetworkMap at Startup and on every
ReconfigNetworkMap call, denormalizing it into per-IP and self lookup
maps. That denormalization is O(n) over all peers and ran on every
netmap update, contributing to the broader quadratic behavior we want
to eliminate when a single peer is added or removed.
Instead, this makes netlog ask LocalBackend (well, nodeBackend) for
the info it needs, letting us remove the netmap.NetworkMap type
entirely from the netlog package.
This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.
Updates #12542
Change-Id: Ib5f2de96e788a667332c0a6f7ac833b3d0053b5c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
util/def: add def.Bool and def.Duration default parse helpers
Replace multiple instances of def.Bool and def.Duration with a new util/def
package.
Updates #20018
Co-authored-by: Bobby <boby@codelabs.co.id>
Co-authored-by: Simon Law <sfllaw@tailscale.com>
Signed-off-by: Bobby <boby@codelabs.co.id>
Signed-off-by: Simon Law <sfllaw@tailscale.com>
tsnet depends on logpolicy, which in turn depended on util/syspolicy
because of a single LogTarget policy setting it uses.
In this commit, we replace that dependency with a feature.Hook,
which only tailscaled or its platform-specific alternatives should set.
Updates #20031
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We don't need to log if the policy doesn't actually say that hardware
attestation must be enabled.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This adds support for Gokrazy GAF (Gokrazy Archive Format) zip
auto-updates, starting to wire up Tailscale's clientupdate mechanism
to Gokrazy's update mechanism.
Currently there's just a CLI command to update from a GAF URL,
with an --unsigned flag for use in a new natlab vmtest.
Next step would be publishing unstable track GAF files on
pkgs.tailscale.com, with detached signatures, and then making the
clientupdate mechanism also download those and check signatures.
Updates #20002
Change-Id: Ib03c56f17a57f8a4638398ef83549dac4813323d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cros-garcon NULL-derefs on cold-boot netlink enumeration when
tailscale0 is present, preventing the Crostini container and
ChromeOS Terminal from starting cleanly. This is an upstream
ChromiumOS bug in cros-garcon; tailscaled can work around it
by defaulting to userspace-networking mode on Crostini.
Tailscale SSH continues to work via tailscaled's netstack.
Users can override with --tun=tailscale0 on ChromeOS builds
where cros-garcon is fixed.
Crostini is detected via /opt/google/cros-containers/bin/garcon,
which is present in every Crostini penguin container.
ssh/tailssh extends the existing Debian default-PATH case to
cover Crostini, since Crostini is Debian-based and benefits
from the same SSH PATH defaults.
RELNOTE: Crostini now defaults to userspace-networking.
Fixes #19488
Updates #12090
Signed-off-by: ferrumclaudepilgrim <ferrumclaudepilgrim@users.noreply.github.com>
Introduce a new `tailscale routecheck` command which prints a report
of high-availability routers that are reachable.
This command rhymes with the `tailscale netcheck` command, but
instead of reporting on local network conditions, `routecheck` reports
on remote connectivity.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
In order to support a `tailscale routecheck` command, we introduce the
`/localapi/v0/routecheck` endpoint to the local API. This endpoint
returns the most recent report collected by the routecheck client.
If `force=true` is an argument in the query string, then this endpoint
will actively probe before returning the report.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The routecheck package parallels the netcheck package, where the
former checks routes and routers while the latter checks networks.
Like netcheck, it compiles reports for other systems to consume.
Historically, the client has never known whether a peer is actually
reachable. Most of the time this doesn’t matter, since the client will
want to establish a WireGuard tunnel to any given destination.
However, if the client needs to choose between two or more nodes,
then it should try to choose a node that it can reach.
Suggested exit nodes are one such example, where the client filters
out any nodes that aren’t connected to the control plane. Sometimes an
exit node will get disconnected from the control plane: when the
network between the two is unreliable or when the exit node is too
busy to keep its control connection alive. In these cases, Control
disables the Node.Online flag for the exit node and broadcasts this
across the tailnet. Arguably, the client should never have relied on
this flag, since it only makes sense in the admin console.
This patch implements an initial routecheck client that can probe
every node that your client knows about. You should not ping scan your
visible tailnet; this method is for debugging only.
This patch also introduces a new OnNetMapToggle hook, which fires when
the netmap transitions from nil to non-nil, or vice versa. This
happens either when the client receives its first MapResponse after
connecting to the control plane, or when it clears the netmap while it
is disconnecting. Routecheck uses this to wait for a valid netmap
so it knows which peers to probe.
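
A toggle hook of this kind can be sketched as follows. The `backend` and `netMap` types are hypothetical simplifications; only the nil/non-nil transition rule is taken from the description above.

```go
package main

import "fmt"

// netMap is a hypothetical placeholder for a network map.
type netMap struct{}

// backend holds the current netmap and the registered toggle hooks.
type backend struct {
	nm       *netMap
	onToggle []func(present bool)
}

// setNetMap stores nm and fires the hooks only when the netmap goes from
// nil to non-nil or from non-nil to nil. Replacing one non-nil netmap
// with another does not fire.
func (b *backend) setNetMap(nm *netMap) {
	was := b.nm != nil
	b.nm = nm
	if now := nm != nil; now != was {
		for _, f := range b.onToggle {
			f(now)
		}
	}
}

func main() {
	b := &backend{}
	b.onToggle = append(b.onToggle, func(present bool) {
		fmt.Println("netmap present:", present)
	})
	b.setNetMap(&netMap{}) // fires: nil -> non-nil
	b.setNetMap(&netMap{}) // does not fire
	b.setNetMap(nil)       // fires: non-nil -> nil
}
```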
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Commit e5a8cf3b1 added feature/runtimemetrics, which emits heap bytes
and total process memory as clientmetrics when the
NodeAttrEmitRuntimeMetrics capability is set. That subsumes the job of
the TS_DEBUG_MEMORY envknob, whose only effect is to prefix every log
line with Go heap+stack and Maxrss via logger.RusagePrefixLog.
Updates tailscale/corp#39434
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Emit runtime metrics as clientmetrics when the
NodeAttrEmitRuntimeMetrics NodeCapability is present.
We start small with just 2 metrics: heap bytes and total process memory.
Updates tailscale/corp#39434
Signed-off-by: Jordan Whited <jordan@tailscale.com>
In PR tailscale/corp#30448, we originally decided to break ties using
SHA256 for our rendezvous hashing algorithm. Now that we’ve had some
experience with it, we think that FNV-1a is a better choice. It
distributes bits evenly, it’s much faster, and it doesn’t need to be
cryptographically secure. The FNV designers recommend FNV-1a over the
deprecated FNV-1.
This PR makes the switch and updates the related tests, since changing
the algorithm changes which stable pick gets selected. As of 2026-05,
this is the best time to make this change, since there are almost no
clients in the wild with traffic steering enabled.
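
For context, rendezvous hashing with FNV-1a can be sketched with the standard hash/fnv package. The key layout (self ID, a zero byte, candidate ID) and the `pick` function are assumptions for illustration, not the exact scheme used here.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pick chooses one candidate by rendezvous (highest-random-weight)
// hashing: each candidate is hashed together with the picker's own ID
// using 64-bit FNV-1a, and the highest hash wins. The result does not
// depend on candidate order, and removing a losing candidate does not
// change the winner.
func pick(self string, candidates []string) string {
	var best string
	var bestHash uint64
	for i, c := range candidates {
		h := fnv.New64a()
		h.Write([]byte(self))
		h.Write([]byte{0}) // separator so ("ab","c") differs from ("a","bc")
		h.Write([]byte(c))
		if sum := h.Sum64(); i == 0 || sum > bestHash {
			best, bestHash = c, sum
		}
	}
	return best
}

func main() {
	fmt.Println(pick("node-1", []string{"exit-a", "exit-b", "exit-c"}))
}
```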
Updates #17366
Updates tailscale/corp#29964
Updates tailscale/corp#29966
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The traffic package contains helpers for evaluating traffic steering
scores and picking appropriate nodes. These were extracted from
ipnlocal.suggestExitNodeUsingTrafficSteering so they can be reused by
the new routecheck package to probe exit nodes in priority order.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The Engine watchdog wrapped every wgengine.Engine method call in a
goroutine with a 45s timeout and crashed the process on timeout. It
was added years ago to surface deadlocks during development, but the
underlying deadlocks have long since been fixed, and even when it did
fire it produced obscure stack traces (from inside the watchdog
goroutine, not the original caller) without buying much.
Audit of userspaceEngine's methods shows none have cyclic locking or
unbounded blocking now that ResetAndStop no longer loops waiting for
DERPs to drain (fa49009ee). The watchdog is dead weight; remove it
along with the TS_DEBUG_DISABLE_WATCHDOG escape hatch.
Updates #19759
Change-Id: Iba9d718fe1f8718a6631296e336b138c31b99ff1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
RouteCheck, which checks that overlapping routers are reachable, is
enabled by default for both tailscaled and tsnet.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
There are only a couple endpoints that check peer capabilities. Keeping
permission checks with the code that assumes they were performed, rather
than with the routing layer, feels easier to reason about.
Check that the caller is actually a peer and pass their capabilities via
a context value for handlers that want to check them.
Along with this, simplify the helper handler wrappers that are not
needed for most of the endpoints.
Updates #40851
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This drops an indirect dependency on the old github.com/docker/docker
(which was replaced with github.com/moby/moby) and fixes a couple recent
CVEs.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Fixes tailscale/corp#39422
Updates tailscale/certstore for proper macOS support and
builds the request signing support into macOS builds. iOS and builds
that do not use cGo are omitted.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through
the control client, noise transport, DERP, and wgengine layers so
that platforms like Android can inject user-installed CA certificates
into Go's TLS verification.
tlsdial.Config now honors base.RootCAs as additional trusted roots,
tried after system roots and before the baked-in LetsEncrypt fallback.
SetConfigExpectedCert gets the same treatment for domain-fronted DERP.
The Android client will set sys.ExtraRootCAs with a pool built from
x509.SystemCertPool + user-installed certs obtained via the Android
KeyStore API, replacing the current SSL_CERT_DIR environment variable
approach.
Updates #8085
Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Move the ipn/desktop blank import from cmd/tailscaled/tailscaled_windows.go
into feature/condregister/maybe_desktop_sessions.go, consistent with how
all other modular features are registered. tailscaled already imports
feature/condregister, so it still gets ipn/desktop on Windows.
Updates #12614
Change-Id: I92418c4bf0e67f0ab40542e47584762ac0ffa2b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Brings in a newer version of Gliderlabs SSH with added socket forwarding support.
Fixes #12409
Fixes #5295
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.
The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
slice/map literal with bad name field values
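
The core name check can be sketched as below. The exact character set in `badChars` is an assumption, as are both function names; the real analyzer also has to walk the AST to find t.Run calls, which is omitted here.

```go
package main

import (
	"fmt"
	"strings"
)

// badChars is an assumed set of characters that need shell quoting or
// regexp escaping in "go test -run": whitespace, quotes, and common
// regexp metacharacters.
const badChars = " \t\n\"'`\\^$*+?()[]{}|"

// badSubtestName reports whether name contains any flagged character.
func badSubtestName(name string) bool {
	return strings.ContainsAny(name, badChars)
}

// suggestName rewrites whitespace-separated words into a hyphenated name.
// It does not remove other flagged characters.
func suggestName(name string) string {
	return strings.Join(strings.Fields(name), "-")
}

func main() {
	for _, n := range []string{"no peers", "no-peers", "a|b"} {
		fmt.Printf("%q bad=%v suggestion=%q\n", n, badSubtestName(n), suggestName(n))
	}
}
```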
Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.
Updates #19242
Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
By polling RTM_GETSTATS via netlink. RTM_GETSTATS is a relatively
efficient and targeted (single device) polling method available since
Linux v4.7.
The tundevstats "feature" can be extended to other platforms in the
future, and it's trivial to add new rtnl_link_stats64 counters on
Linux.
Updates tailscale/corp#38181
Signed-off-by: Jordan Whited <jordan@tailscale.com>
Replace byte-at-a-time ReadByte loops with Peek+Discard in the DERP
read path. Peek returns a slice into bufio's internal buffer without
allocating, and Discard advances the read pointer without copying.
Introduce util/bufiox with a BufferedReader interface and ReadFull
helper that uses Peek+copy+Discard as an allocation-free alternative
to io.ReadFull.
- derp.ReadFrameHeader: replace 5× ReadByte with Peek(5)+Discard(5),
  reading the frame type and length directly from the peeked slice.
  Remove now-unused readUint32 helper.

      name                old ns/op  new ns/op  speedup
      ReadFrameHeader-8   24.2       12.4       ~2x
      (0 allocs/op in both)

- key.NodePublic.ReadRawWithoutAllocating: replace 32× ReadByte with
  bufiox.ReadFull. Addresses the "Dear future" comment about switching
  away from byte-at-a-time reads once a non-escaping alternative exists.

      name                            old ns/op  new ns/op  speedup
      NodeReadRawWithoutAllocating-8  140        43.6       ~3.2x
      (0 allocs/op in both)

- derpserver.handleFramePing: replace io.ReadFull with bufiox.ReadFull.
Updates tailscale/corp#38509
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
When a client starts up without being able to connect to control, it
sends its discoKey to other nodes it wants to communicate with over
TSMP. This disco key will be a newer key than the one control knows
about.
If the client that can connect to control gets a full netmap, ensure
that the disco key for the node not connected to control is not
overwritten with the stale key control knows about.
This is implemented by keeping track of mapSession and using it for
the discokey injection if it is available. This ensures that we are not
constantly resetting the wireguard connection when getting the wrong
keys from control.
This is implemented as:
- If the key is received via TSMP:
  - Set lastSeen for the peer to now()
  - Set online for the peer to false
- When processing new keys, only accept keys where either:
  - Peer is online
  - lastSeen is newer than existing last seen
If mapSession is not available, as in we are not yet connected to
control, punt down the disco key injection to magicsock.
Ideally, we will want to have mapSession be long lived at some point in
the near future so we only need to inject keys in one location and then
also use that for testing and loading the cache, but that is a yak for
another PR.
Updates #12639
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
Currently, the IP forwarding health check is done when sending MapRequests.
Move the IP forwarding check to the health service to gain the benefits
of the health tracker and periodic monitoring out of band from
the MapRequest path. ipnlocal now provides a closure to
the health service that reports whether forwarding is broken.
Removed `skipIPForwardingCheck` from controlclient/direct.go;
it wasn't being used as the comments describe it. That check
has moved to ipnlocal for the closure to the health tracker.
Updates #18976
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Introduce a datapathHandler that implements hooks that will
receive packets from the tstun.Wrapper. This commit does not wire
those up just yet.
Perform DNAT from Magic IP to Transit IP on outbound flows on clients,
and reverse SNAT in the reverse direction.
Perform DNAT from Transit IP to final destination IP on outbound flows
on connectors, and reverse SNAT in the reverse direction.
Introduce FlowTable to cache validated flows by 5-tuple for fast lookups
after the first packet.
Flow expiration is not covered, and is intended as future work before
the feature is officially released.
Fixes tailscale/corp#34249
Fixes tailscale/corp#35995
Co-authored-by: Fran Bull <fran@tailscale.com>
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
This makes tsnet apps not depend on x/crypto/ssh and locks that in with a test.
It also paves the way for tsnet apps to opt in to SSH support via a
blank feature import in the future.
Updates #12614
Change-Id: Ica85628f89c8f015413b074f5001b82b27c953a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
After we intercept a DNS response and assign magic and transit addresses
we must communicate the assignment to our connector so that it can
direct traffic when it arrives.
Use the recently added peerapi endpoint to send the addresses.
Updates tailscale/corp#34258
Signed-off-by: Fran Bull <fran@tailscale.com>
This commit adds `--json` output mode to dns debug commands.
It defines structs for the data that is returned from:
`tailscale dns status` and `tailscale dns query <DOMAIN>` and
populates that as it runs the diagnostics.
When all the information is collected, it is either serialised to JSON
or formatted as text, then returned to the user.
The structs are defined and exported so that Go consumers of this command
can use them for unmarshalling.
Updates #13326
Signed-off-by: Kristoffer Dalby <kristoffer@tailscale.com>
PR #18860 adds firewall rules in the mangle table to save outbound packet
marks to conntrack and restore them on reply packets before the routing
decision. When reply packets have their marks restored, the kernel uses
the correct routing table (based on the mark) and the packets pass the
rp_filter check.
This makes the risk check and reverse path filtering warnings unnecessary.
Updates #3310
Fixes tailscale/corp#37846
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
Using the new wait command from #18574, provide a tailscale-online.target
that has a similar usage model to the conventional
`network-online.target`.
Updates #3340
Updates #11504
Signed-off-by: James Tucker <james@tailscale.com>
The new version of app connector (conn25) needs to read DNS responses
for domains it is interested in and store and swap out IP addresses.
Add a hook to dns manager to enable this.
Give the conn25 updated netmaps so that it knows when to assign
connecting addresses and from what pool.
Assign an address when we see a DNS response for a domain we are
interested in, but don't do anything with the address yet.
Updates tailscale/corp#34252
Signed-off-by: Fran Bull <fran@tailscale.com>
This commit is based on ff0978ab, and extends #18497 to connect network map
caching to the LocalBackend. As implemented, only "whole" netmap values are
stored, and we do not yet handle incremental updates. As written, the feature must
be explicitly enabled via the TS_USE_CACHED_NETMAP envknob, and must be
considered experimental.
Updates #12639
Co-Authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I48a1e92facfbf7fb3a8e67cff7f2c9ab4ed62c83
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
bart has gained a bunch of purported performance and usability
improvements since the current version we are using (0.18.0,
from 1y ago).
Updates tailscale/corp#36982
Signed-off-by: Amal Bansode <amal@tailscale.com>
This updates the URL shown by systemd to the new URL used by the docs
after the recent migration.
Fixes #18646
Signed-off-by: Tim Walters <tim@tailscale.com>
Add new "webbrowser" and "colorable" feature tags so that the
github.com/toqueteos/webbrowser and mattn/go-colorable packages
can be excluded from minbox builds.
Updates #12614
Change-Id: Iabd38b242f5a56aa10ef2050113785283f4e1fe8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Package feature/conn25 is excludable from a build via the featuretag.
Test that it is excluded for minimal builds.
Updates #12614
Signed-off-by: Fran Bull <fran@tailscale.com>
We already had a featuretag for clientupdate, but the CLI wasn't using
it, making the "minbox" build (minimal combined tailscaled + CLI
build) larger than necessary.
Updates #12614
Change-Id: Idd7546c67dece7078f25b8f2ae9886f58d599002
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Running a command like `tailscale up --auth-key tskey-foo --auth-key tskey-bar` used to print
```
invalid value "tskey-bar" for flag -auth-key: flag provided multiple times
```
but now we print
```
invalid value "tskey-REDACTED" for flag -auth-key: flag provided multiple times
```
Fixes #18562
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>