Mappings from transit IPs to real IPs are stored ephemerally in the
connector, so they're lost on restart. When we send a packet to the
connector with a transit IP it does not recognize, it sends us a TSMP
message saying so (see #19883). If we (the client) know of such a
mapping, we now re-send it to the connector so that a connection can
proceed.
Fixes tailscale/corp#34256.
Signed-off-by: Naman Sood <mail@nsood.in>
util/def: add def.Bool and def.Duration default parse helpers
Replace multiple instances of def.Bool and def.Duration with a new util/def
package.
Updates #20018
Co-authored-by: Bobby <boby@codelabs.co.id>
Co-authored-by: Simon Law <sfllaw@tailscale.com>
Signed-off-by: Bobby <boby@codelabs.co.id>
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The hook fires when a flow is removed for any reason (LRU capacity eviction,
tuple-collision displacement, or idle-time expiry). The hook is invoked
exactly once per flow, after the flow table mutex is released, so callbacks
may safely acquire other locks.
We rename the IPMapper interface to Conn25Datapath, and add
ClientFlowCreated/ClientFlowRemoved methods so *Conn25 can keep client-side
address assignments alive while traffic is in flight. Those methods are
currently stubbed for future work.
Connector flows do not currently call these methods.
Updates tailscale/corp#38630
Updates tailscale/corp#43180
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
The returned error in the signature is left over from previous
implementations and was only returning nil.
If we know NewFlow will succeed we can fire a create hook (implemented
in a future commit) before NewFlow, which will prevent a remove hook for
a flow from firing before the create hook for the same flow.
Updates tailscale/corp#38630
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
Track lastSeen on each cached flow and add a sweeper goroutine
that periodically removes flows idle past the idle timeout.
Introduce tunables for idle timeout, maximum flows removed per sweep (to
limit mutex hold time), and the sweeper interval.
Also cap the previously-unlimited tables: 10k client flows, 100k
connector flows.
Updates tailscale/corp#38630
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
To avoid breaking downstream code, add deprecated aliases for all the
old names.
Updates tailscale/corp#37904
Change-Id: I86d0b0d7da371946440b181c665448f91c3ef8d2
Signed-off-by: Alex Chan <alexc@tailscale.com>
When a connector receives a packet from a client on a transit IP that it
can't find a real IP mapping for, it drops the packet. This commit
starts notifying the client of the drop over TSMP, so the client
can tell the connector to re-establish the transit IP-real IP binding.
Updates tailscale/corp#34256.
Signed-off-by: Naman Sood <mail@nsood.in>
In order to support a `tailscale routecheck` command, we introduce the
`/localapi/v0/routecheck` endpoint to the local API. This endpoint
returns the most recent report collected by the routecheck client.
If `force=true` is an argument in the query string, then this endpoint
will actively probe before returning the report.
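
To make the cached-versus-forced behavior concrete, here is a small sketch of a handler with that shape. Only the path and the `force=true` query argument come from this change; the checker and report types, the Probes field, and the use of GET are assumptions for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// report and checker are hypothetical stand-ins for the routecheck types.
type report struct {
	Probes int `json:"probes"`
}

type checker struct {
	mu   sync.Mutex
	last report
}

// probe simulates an active probe and updates the cached report.
func (c *checker) probe() report {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.last.Probes++
	return c.last
}

// lastReport returns the most recent report without probing.
func (c *checker) lastReport() report {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.last
}

// serveRouteCheck returns the cached report, probing first if force=true.
func (c *checker) serveRouteCheck(w http.ResponseWriter, r *http.Request) {
	var rep report
	if r.URL.Query().Get("force") == "true" {
		rep = c.probe()
	} else {
		rep = c.lastReport()
	}
	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(rep)
}

// get issues an in-memory request against h and returns the response body.
func get(h http.HandlerFunc, url string) string {
	rec := httptest.NewRecorder()
	h(rec, httptest.NewRequest(http.MethodGet, url, nil))
	return rec.Body.String()
}

func main() {
	c := &checker{}
	fmt.Print(get(c.serveRouteCheck, "/localapi/v0/routecheck"))            // {"probes":0}
	fmt.Print(get(c.serveRouteCheck, "/localapi/v0/routecheck?force=true")) // {"probes":1}
}
```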
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The routecheck package parallels the netcheck package, where the
former checks routes and routers while the latter checks networks.
Like netcheck, it compiles reports for other systems to consume.
Historically, the client has never known whether a peer is actually
reachable. Most of the time this doesn’t matter, since the client will
want to establish a WireGuard tunnel to any given destination.
However, if the client needs to choose between two or more nodes,
then it should try to choose a node that it can reach.
Suggested exit nodes are one such example, where the client filters
out any nodes that aren’t connected to the control plane. Sometimes an
exit node will get disconnected from the control plane: when the
network between the two is unreliable or when the exit node is too
busy to keep its control connection alive. In these cases, Control
disables the Node.Online flag for the exit node and broadcasts this
across the tailnet. Arguably, the client should never have relied on
this flag, since it only makes sense in the admin console.
This patch implements an initial routecheck client that can probe
every node that your client knows about. You should not ping scan your
visible tailnet; this method is for debugging only.
This patch also introduces a new OnNetMapToggle hook, which fires when
the netmap transitions from nil to non-nil, or vice versa. This
happens either when the client receives its first MapResponse after
connecting to the control plane, or when it clears the netmap while it
is disconnecting. Routecheck uses this to wait for a valid netmap
so it knows which peers to probe.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Currently we are picking a peer for the split dns routes when we get a
netmap. Use the new custom scheme resolvers, installed per app in the
config in the netmap, to allow us to choose which connector peer should
handle a DNS request at the time the request is made.
Fixes tailscale/corp#39858
Signed-off-by: Fran Bull <fran@tailscale.com>
All StateStore implementations store a nil value in the cache map when WriteState is called with a nil byte slice instead of deleting the key. This causes ReadState to return (nil, nil) instead of (nil, ErrStateNotExist), since the key is still present in the map.
This breaks reset-auth on Windows, Linux, and Android, and the node can't log back in without manually editing the state file. (macOS uses a different state store.)
DeleteProfile, DeleteAllProfilesForUser, setUnattendedModeAsConfigured are impacted but don't seem to break because the deleted keys are not reread.
This deletes the key from the cache instead.
Fixes tailscale/corp#42477
Signed-off-by: kari-ts <kari@tailscale.com>
We have been reading the pool config from the app nodeattr, but it is
global config, not per app, so it needs to be its own thing.
Updates tailscale/corp#39999
Signed-off-by: Fran Bull <fran@tailscale.com>
Add four control-plane node attributes that let us disable UDP GSO/GRO
on the magicsock UDP socket and UDP/TCP GRO on the Tailscale TUN
device.
These complement the pre-existing TS_DEBUG_DISABLE_UDP_{GRO,GSO} and
TS_TUN_DISABLE_{UDP,TCP}_GRO envknobs. They exist so we can mitigate
upstream Linux kernel regressions on a deployed fleet without
requiring a client release, after two incidents (#13041, #19777) where
buggy kernel patches landed upstream and the fix took an excessively
long time to reach downstream distros.
Knob changes are reacted to in setNetworkMapInternal / SetNetworkMap via
a comparison against a cached "last applied" value and only an actual
transition triggers work: magicsock Rebind()+ReSTUN for UDP,
ApplyGROKnobs for TUN. The TUN side is gated by buildfeatures.HasGRO and
is one-way (wireguard-go GRO disablement is sticky); re-enabling
requires a client restart.
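
A minimal sketch of the "compare against last applied, act only on a transition" logic, including the one-way (sticky) case. The knob type and its fields are hypothetical; the real code calls Rebind/ReSTUN or ApplyGROKnobs where this sketch calls apply.

```go
package main

import "fmt"

// knob is a hypothetical helper that tracks the last applied value of a
// boolean "disable" attribute and runs apply only on an actual transition.
type knob struct {
	lastDisabled bool
	sticky       bool // if true, a disable cannot be undone until restart
	apply        func(disabled bool)
}

// update reports whether a transition was applied.
func (k *knob) update(disabled bool) bool {
	if disabled == k.lastDisabled {
		return false // no change since last applied value: no work
	}
	if k.sticky && k.lastDisabled {
		return false // one-way knob: re-enabling needs a restart
	}
	k.lastDisabled = disabled
	k.apply(disabled)
	return true
}

func main() {
	udp := &knob{apply: func(d bool) { fmt.Println("udp: rebind, disabled =", d) }}
	udp.update(false) // no transition, nothing printed
	udp.update(true)  // prints once
	udp.update(true)  // repeated netmap with same value, nothing printed
	udp.update(false) // prints again

	tun := &knob{sticky: true, apply: func(d bool) { fmt.Println("tun: GRO disabled =", d) }}
	tun.update(true)               // prints once
	fmt.Println(tun.update(false)) // false: sticky, stays disabled
}
```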
Updates #13041
Updates #19777
Change-Id: I802993070afa659cc06809bb0bfbb7f8a0cdb273
Signed-off-by: James Tucker <james@tailscale.com>
We don't want addr assignments to be lost from the collection before
they can be returned to the IP pools, otherwise we will get orphan
addresses marked inUse in the pools that will never be returned.
Fixes tailscale/corp#39975
Signed-off-by: Fran Bull <fran@tailscale.com>
serveFilePut tracked outgoing-file progress through an unbuffered
progressUpdates channel whose close was owned by the request goroutine
while writers were spread across manifest parsing, the
progresstracking.Reader callback, singleFilePut failure paths, and the
success path. That writer-closes mismatch made the
send-on-closed-channel panic effectively unfixable in place.
Replace it with a request-scoped outgoingProgress reporter. Transfer
code reports state by method call; the reporter coalesces hot-path
updates and is flushed once via defer in serveFilePut. With no
producer channel to close, the panic is structurally impossible.
Fixes #19115
Fixes #19817
Change-Id: I8f00d982d2c79880dfc1f8104c5eed06e94b5a6c
Signed-off-by: James Tucker <james@tailscale.com>
Emit runtime metrics as clientmetrics when the
NodeAttrEmitRuntimeMetrics NodeCapability is present.
We start small with just 2 metrics: heap bytes and total process memory.
Updates tailscale/corp#39434
Signed-off-by: Jordan Whited <jordan@tailscale.com>
When we use assigned addresses in response to a DNS request, extend the
expiry on the assignment.
Updates tailscale/corp#39975
Signed-off-by: Fran Bull <fran@tailscale.com>
Previously we had two maps keyed on a direction-specific tuple, with
distinct values containing the data (action) for that direction.
Values pointed at each other across maps to ensure they were removed
at the same time in the case of tuple overwrite, but LRU eviction
was per-map. So if LRU was turned on, it was possible for one
direction's data (action) to be evicted and leave the other direction
dangling.
NewFlow replaces the two direction-specific flow constructors, and
lookups return the direction-specific PacketAction directly.
Now the values in each map point to the same element, with data for both
directions in the element. A linked list also points to the elements to
implement LRU. The previous flowtrack.Cache is removed.
The single LRU structure will allow us to implement idle time expiration
by walking the list backward starting with the least recently used flow, and
stopping after a fixed number of flows, or at the first non-expired flow.
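
As an illustration of the shared-element layout, the sketch below keeps two direction-specific maps whose values point at one list element, so eviction or tuple overwrite always removes both directions together. All names (tuple, flow, flowTable, lookupOut, lookupIn) are hypothetical and locking is omitted.

```go
package main

import (
	"container/list"
	"fmt"
)

type tuple struct{ src, dst string }

// flow holds the data for both directions in one element.
type flow struct {
	out, in             tuple
	outAction, inAction string
}

type flowTable struct {
	capacity int
	lru      *list.List // front = most recently used
	byOut    map[tuple]*list.Element
	byIn     map[tuple]*list.Element
}

func newFlowTable(capacity int) *flowTable {
	return &flowTable{
		capacity: capacity,
		lru:      list.New(),
		byOut:    make(map[tuple]*list.Element),
		byIn:     make(map[tuple]*list.Element),
	}
}

// remove drops one element and both of its map entries together.
func (t *flowTable) remove(e *list.Element) {
	f := t.lru.Remove(e).(*flow)
	delete(t.byOut, f.out)
	delete(t.byIn, f.in)
}

// newFlow inserts f, displacing any flow that collides on either tuple and
// evicting the least recently used flow when over capacity.
func (t *flowTable) newFlow(f *flow) {
	if old, ok := t.byOut[f.out]; ok {
		t.remove(old)
	}
	if old, ok := t.byIn[f.in]; ok {
		t.remove(old)
	}
	e := t.lru.PushFront(f)
	t.byOut[f.out] = e
	t.byIn[f.in] = e
	for t.lru.Len() > t.capacity {
		t.remove(t.lru.Back())
	}
}

// lookupOut returns the outbound action for tp and marks the flow as used.
func (t *flowTable) lookupOut(tp tuple) (string, bool) {
	e, ok := t.byOut[tp]
	if !ok {
		return "", false
	}
	t.lru.MoveToFront(e)
	return e.Value.(*flow).outAction, true
}

// lookupIn returns the inbound action for tp and marks the flow as used.
func (t *flowTable) lookupIn(tp tuple) (string, bool) {
	e, ok := t.byIn[tp]
	if !ok {
		return "", false
	}
	t.lru.MoveToFront(e)
	return e.Value.(*flow).inAction, true
}

func main() {
	t := newFlowTable(1)
	a := &flow{out: tuple{"c1", "magic1"}, in: tuple{"real1", "c1"}, outAction: "dnat-a", inAction: "snat-a"}
	b := &flow{out: tuple{"c2", "magic2"}, in: tuple{"real2", "c2"}, outAction: "dnat-b", inAction: "snat-b"}
	t.newFlow(a)
	t.newFlow(b) // evicts a in both directions
	_, okOut := t.lookupOut(a.out)
	_, okIn := t.lookupIn(a.in)
	fmt.Println(okOut, okIn) // false false
}
```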
We add commented-out unused placeholder fields for tracking the
"last seen" timestamp, and an on-removal hook, to document the intent for
the follow-up expiry work.
Updates tailscale/corp#38630
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
RouteCheck, which checks that overlapping routers are reachable, is
enabled by default for both tailscaled and tsnet.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Make it possible to remove the least recently used expired address
assignment from addrAssignments.
Before checking out a new address from the IP pools, return a handful of
expired addresses.
Updates tailscale/corp#39975
Signed-off-by: Fran Bull <fran@tailscale.com>
If a DNS query for a domain that should be routed through a connector
results in CNAME records in the response, collapse the CNAME chain to an
A/AAAA record for the domain -> magic IP.
Fixes tailscale/corp#39978
Signed-off-by: Fran Bull <fran@tailscale.com>
The `CreateStateForTest` helper reduces boilerplate in cases where the test
only cares about the trusted keys and not the disablement values (and makes
it more obvious where the disablement values are meaningful).
The `setupChonkStorage` helper reduces the boilerplate when creating on-disk
TKA storage in tests.
The `fakeLocalBackend` helper reduces the boilerplate when setting up a
`LocalBackend` instance in the IPN tests.
Updates #cleanup
Change-Id: Iacfba1be5f7fab208eec11e4369d63c7d7519da5
Signed-off-by: Alex Chan <alexc@tailscale.com>
Installed SplitDNS routes are always treated as wildcard domains,
so the domains that we pass to the local resolver should be normalized
and have any leading *. wildcard prefix removed.
When looking at DNS responses to see if the domain matches, we need to
consider both exact matches and wildcard matches. We now keep separate
maps of exact-match domains and wildcard domains, and when we match we
check to see if there's a match in the exact-match map, otherwise we
check against the wildcard match map until we find a match, removing
a label after each check.
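
The lookup order can be sketched as follows. The function names and the map value type (an app name) are assumptions, as is the choice that a wildcard entry also matches the bare domain itself.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize lowercases a configured domain and strips a trailing dot and any
// leading "*." wildcard prefix.
func normalize(domain string) string {
	d := strings.TrimSuffix(strings.ToLower(domain), ".")
	return strings.TrimPrefix(d, "*.")
}

// matchDomain checks the exact-match map first, then walks the wildcard map,
// removing the leftmost label after each failed check. In this sketch a
// wildcard entry "example.com" matches both "example.com" and any subdomain.
func matchDomain(name string, exact, wildcard map[string]string) (string, bool) {
	name = strings.TrimSuffix(strings.ToLower(name), ".")
	if app, ok := exact[name]; ok {
		return app, true
	}
	for name != "" {
		if app, ok := wildcard[name]; ok {
			return app, true
		}
		_, rest, found := strings.Cut(name, ".")
		if !found {
			break
		}
		name = rest
	}
	return "", false
}

func main() {
	exact := map[string]string{normalize("api.example.com"): "exact-app"}
	wildcard := map[string]string{normalize("*.Example.COM."): "wild-app"}

	fmt.Println(matchDomain("api.example.com.", exact, wildcard)) // exact-app true
	fmt.Println(matchDomain("a.b.example.com", exact, wildcard))  // wild-app true
	fmt.Println(matchDomain("example.org", exact, wildcard))      //  false
}
```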
Rather than looking for matching self-hosted domains (domains serviced
by the connector being run on the self-node), the apps that are being
serviced by the connector on the self-node are tracked instead. When
checking to see if a DNS response should be rewritten, it is ignored
if any of the matching apps for the domain are in the self-hosted apps set.
Fixes tailscale/corp#39272
Signed-off-by: George Jones <george@tailscale.com>
We have two sources of truth for configuration state: the node view
(from the netmap/policy) and prefs (the --advertise-connector option).
These come with two independent update paths: onSelfChange for node view
changes and profileStateChange for pref changes.
Centralize config on Conn25 so that onSelfChange and profileStateChange
can update their independent parts without bundling changes together.
The old bundled approach required read-modify-write, which opened the
door to potential TOCTOU bugs. The node view config is
stored as an atomic.Pointer[config] and the prefs-derived field
(advertise-connector) becomes an independent atomic.Bool. onSelfChange
creates a fresh config and stores it atomically. profileStateChange sets
the bool.
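
A minimal sketch of that split, assuming a config that only carries app names. The extension type and the isConnectorFor helper are hypothetical; only the use of atomic.Pointer[config], atomic.Bool, and the two update paths follows the description above.

```go
package main

import (
	"fmt"
	"slices"
	"sync/atomic"
)

// config is a simplified, immutable snapshot derived from the node view.
type config struct {
	apps []string
}

// extension is a hypothetical stand-in for the type holding configuration.
type extension struct {
	cfg                atomic.Pointer[config] // replaced wholesale by onSelfChange
	advertiseConnector atomic.Bool            // set independently from prefs
}

// onSelfChange builds a fresh config and swaps it in atomically. It never
// reads or modifies the prefs-derived field.
func (e *extension) onSelfChange(apps []string) {
	e.cfg.Store(&config{apps: slices.Clone(apps)})
}

// profileStateChange only touches the prefs-derived bool.
func (e *extension) profileStateChange(advertise bool) {
	e.advertiseConnector.Store(advertise)
}

// isConnectorFor reads both pieces without any read-modify-write step.
func (e *extension) isConnectorFor(app string) bool {
	cfg := e.cfg.Load()
	if cfg == nil {
		return false
	}
	return e.advertiseConnector.Load() && slices.Contains(cfg.apps, app)
}

func main() {
	e := &extension{}
	fmt.Println(e.isConnectorFor("app1")) // false: no config yet
	e.onSelfChange([]string{"app1"})
	fmt.Println(e.isConnectorFor("app1")) // false: not advertising
	e.profileStateChange(true)
	fmt.Println(e.isConnectorFor("app1")) // true
}
```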
This also establishes clearer lines of responsibility:
- Configuration state lives on Conn25. Methods that need to read
config (isConnectorDomain, mapDNSResponse, the IPMapper methods)
are on Conn25, and use the atomics for synchronization.
- "Active" state (address allocations, transit IP mappings) lives on
client and connector, and uses a mutex for synchronization on that
state, without conflicting with configuration synchronization.
It's fine for active state to be out of sync with config — e.g. a
transit IP allocated for an app should still be tracked, and gracefully
expired, even if the app is removed from the node view.
Removing config responsibility from client/connector makes these
cases clearer to handle.
- In cases where the client or connector does need access to
config-derived state, e.g. a client reconfiguring its IP pools from
the IPSets in the config, we can use closures for the
client or connector to get just the latest state it needs from the
config. See getIPSets() in this commit.
- As of this commit, the connector doesn't need config-derived state at
all.
Fixes tailscale/corp#40872
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
Add two narrower accessors alongside the existing
[LocalBackend.NetMap], with docs that distinguish their semantics:
- NetMapNoPeers: cheap (returns the cached *netmap.NetworkMap with
a possibly-stale Peers slice). For callers that only read non-Peers
fields like SelfNode, DNS, PacketFilter, capabilities.
- NetMapWithPeers: documented as returning an up-to-date Peers slice.
For callers that genuinely need to iterate Peers or call
PeerByXxx.
Mark the existing NetMap deprecated and point readers at the two new
accessors. NetMap, NetMapNoPeers, and NetMapWithPeers all currently
return the same value (b.currentNode().NetMap()): this commit is a
no-op behaviorally, just a renaming and migration of in-tree callers.
A subsequent change in the same series will switch
NetMapWithPeers to actually rebuild the Peers slice from the live
per-node-backend peers map (O(N) per call), at which point the
distinction between the two new accessors becomes load-bearing.
Migrate in-tree callers to the appropriate accessor based on what
fields they read:
- NetMapNoPeers (most common): localapi handlers, peerapi accept,
GetCertPEMWithValidity, web client noise request, doctor DNS
resolver check, tsnet CertDomains/TailscaleIPs, ssh/tailssh
SSH-policy/cap reads, several LocalBackend internals
(isLocalIP, allowExitNodeDNSProxyToServeName, pauseForNetwork
nil-check, serve config).
- NetMapWithPeers: writeNetmapToDiskLocked (persist full netmap to
disk for fast restart), PeerByTailscaleIP lookup.
Tests still call the legacy NetMap; they'll see the deprecation
warning but otherwise behave identically.
Also add two pieces of plumbing the next change in this series will
need, but which are already useful on their own:
- [client/local.GetDebugResultJSON]: a generic [Client.DebugResultJSON]
that decodes directly into a target type T, avoiding the
marshal/unmarshal roundtrip callers otherwise need.
- localapi "current-netmap" debug action: returns the current
netmap (with peers) as JSON. Documented as debug-only — the
netmap.NetworkMap shape is internal and may change without notice.
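
For readers unfamiliar with the pattern, a generic "decode straight into T" helper could look like the sketch below. fetchDebugJSON, getDebugResult, and miniNetmap are hypothetical; this does not reproduce the signature of GetDebugResultJSON, and the JSON payload is made up.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fetchDebugJSON is a hypothetical stand-in for a debug call that returns
// the raw JSON body for a named action.
func fetchDebugJSON(action string) ([]byte, error) {
	if action != "current-netmap" {
		return nil, fmt.Errorf("unknown debug action %q", action)
	}
	return []byte(`{"Name":"example.ts.net."}`), nil
}

// getDebugResult decodes the raw response directly into T, so callers do not
// have to re-marshal an untyped value and unmarshal it into their own type.
func getDebugResult[T any](action string) (T, error) {
	var zero T
	raw, err := fetchDebugJSON(action)
	if err != nil {
		return zero, err
	}
	var v T
	if err := json.Unmarshal(raw, &v); err != nil {
		return zero, err
	}
	return v, nil
}

// miniNetmap is a made-up target type for the example.
type miniNetmap struct {
	Name string
}

func main() {
	nm, err := getDebugResult[miniNetmap]("current-netmap")
	fmt.Println(nm.Name, err) // example.ts.net. <nil>
}
```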
This commit is part of a series breaking up a larger change for
review; on its own it is a no-op refactor.
Updates #12542
Change-Id: Idbb30707414f8da3149c44ca0273262708375b02
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Replace the UAPI text protocol-based wireguard configuration with
wireguard-go's new direct callback API (SetPeerLookupFunc,
SetPeerByIPPacketFunc, RemoveMatchingPeers, SetPrivateKey).
Instead of computing a trimmed wireguard config ahead of time upon
control plane updates and pushing it via UAPI, install callbacks so
wireguard-go creates peers on demand when packets arrive. This removes
all the LazyWG trimming machinery: idle peer tracking, activity maps,
noteRecvActivity callbacks, the KeepFullWGConfig control knob, and the
ts_omit_lazywg build tag.
For incoming packets, PeerLookupFunc answers wireguard-go's questions
about unknown public keys by looking up the peer in the full config.
For outgoing packets, PeerByIPPacketFunc (installed from
LocalBackend.lookupPeerByIP) maps destination IPs to node public keys
using the existing nodeByAddr index.
Updates tailscale/corp#12345
Change-Id: I4cba80979ac49a1231d00a01fdba5f0c2af95dd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Device posture checking can fail while enabled if tailscaled does not
have access to smbios. Previously, this was only observable by looking
in the tailscaled logs.
Fixes tailscale/corp#39314
Signed-off-by: Evan Lowry <evan@tailscale.com>
TestPackageDocs walked into directories starting with "." (such as
.claude worktrees) and only logged warnings on duplicate package docs
across files in a directory. Skip dot-directories (which covers the
old .git but also .claude), ignore files with "//go:build ignore" so
command files don't falsely trip the duplicate check, and promote the
duplicate-doc warning to a t.Errorf.
While here, deduplicate the package docs that were previously only
logged: drop the redundant comment from client/systray/startup-creator.go,
move the comprehensive taildrop doc into feature/taildrop/doc.go, and
remove a leftover doc fragment from feature/condlite/expvar/omit.go.
The tstest/integration/vms allowlist is no longer needed since the
//go:build ignore filter now handles its dns_tester.go and udp_tester.go
files generically.
Fixes #19526
Change-Id: Id794d96bd728826a1883a054e4a244f90fa05d3d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
And use it to allow overwrites of old address assignments in the conn25 client.
The magic and transit address pools from which the addresses come are limited
resources and we want to reuse them. This commit is a small part of that bigger
need.
We expect to follow soon:
* Extending expiry if assignments are still in use.
* Returning expired addresses back to the pools so they can be reallocated.
Updates tailscale/corp#39975
Signed-off-by: Fran Bull <fran@tailscale.com>
addrAssignments is a table of addrs with lookup indices, representing
the assignments of magic+destination+transit IP addresses the client has
made due to the domain being routed because of an app.
byConnKey is a map of node public key to prefixes of transit IPs, so it
is associated with that data, but is not the data itself, and can be its
own thing.
Updates tailscale/corp#39975
Signed-off-by: Fran Bull <fran@tailscale.com>
Currently, clientupdate.NewUpdater().Update() is called directly inside tailscaled, which fatals. There is also a failure that doesn't return, causing a panic.
This fix allows us to use the same approach as startAutoUpdate, which is to find tailscale.exe and run tailscale.exe --update, though since it's calling the updater library directly, we get progress messages.
Fixes tailscale/corp#40430
Signed-off-by: kari-ts <kari@tailscale.com>
modifying DNS responses for domains they are also connectors for
For Connectors 2025, determine if a client is configured as a
connector and what domains it is a connector for. When acting as a
client, don't install Split DNS routes to other connectors for those
domains, and don't alter DNS responses for those domains. The responses
are forwarded back to the original client, which in turn does the alteration,
swapping the real IP for a Magic IP.
A client is also a connector for a domain if it has tags that overlap
with tags in the configured policy, and --advertise-connector=true
in the prefs (not in the self-node Hostinfo from the netmap). We use the prefs
as the source of truth because control only gets a copy from the prefs, and
may drift. And the AppConnector field is currently zeroed out in the
self-node Hostinfo from control.
The extension adds a ProfileStateChange hook to process prefs changes,
and the config type is split into prefs and nodeview sub-configs.
Fixes tailscale/corp#39317
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
Move the ipn/desktop blank import from cmd/tailscaled/tailscaled_windows.go
into feature/condregister/maybe_desktop_sessions.go, consistent with how
all other modular features are registered. tailscaled already imports
feature/condregister, so it still gets ipn/desktop on Windows.
Updates #12614
Change-Id: I92418c4bf0e67f0ab40542e47584762ac0ffa2b2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a new "ipnbus" build feature tag so the watch-ipn-bus LocalAPI
endpoint can be independently controlled, rather than being gated
behind HasDebug || HasServe. Minimal/embedded builds that omit both
debug and serve were getting 404s on watch-ipn-bus, breaking
"tailscale up --authkey=..." and other CLI flows that depend on
WatchIPNBus.
In the CLI, check buildfeatures.HasIPNBus before attempting to watch
the IPN bus in "tailscale up"/"tailscale login", and exit early with
an informational message when the feature is omitted.
Also add the missing NewCounterFunc stub to clientmetric/omit.go,
which caused compilation errors when building with
ts_omit_clientmetrics and netstack enabled.
Fixes #19240
Change-Id: I2e3c69a72fc50fa02542b91b8a54859618a463d1
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.
The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
slice/map literal with bad name field values
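
The core name check can be approximated in a few lines. This is only a sketch of the string test, not the vet analyzer itself; the exact set of flagged characters and the helper names are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// problematic is an assumed set of characters that need quoting or escaping
// when a subtest name is passed to "go test -run": whitespace, quotes, and
// common regex metacharacters.
const problematic = " \t\"'`\\*+?()[]{}|^$"

// subtestNameOK reports whether name is non-empty and free of the characters
// in problematic.
func subtestNameOK(name string) bool {
	return name != "" && !strings.ContainsAny(name, problematic)
}

// suggestName replaces runs of whitespace with single hyphens. It does not
// attempt to fix other punctuation.
func suggestName(name string) string {
	return strings.Join(strings.Fields(name), "-")
}

func main() {
	for _, name := range []string{"valid-name", "has space", "regex(meta)", `quo"te`} {
		fmt.Printf("%q ok=%v\n", name, subtestNameOK(name))
	}
	fmt.Println(suggestName("returns error  on nil input")) // returns-error-on-nil-input
}
```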
Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.
Updates #19242
Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Install the previously uninstalled hooks for the filter and tstun
intercepts. Move the DNS manager hook installation into Init() with all
the others. Protect all implementations with a short-circuit if the node
is not configured to use Connectors 2025. The short-circuit pattern
replaces the previous pattern used in managing the DNS manager hook, of
setting it to nil in response to CapMap changes.
Fixes tailscale/corp#38716
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
The hook calls into the client assigned addresses to return a view of
the transit IPs associated with that connector.
Fixes tailscale/corp#38125
Signed-off-by: George Jones <george@tailscale.com>
The client needs to know the set of transit IPs that are assigned
to each connector, so when we register transit IPs with the connector
we also need to assign them to that connector in the addrAssignments.
We identify the connector by node public key to match the peer information
that is available when the ExtraWireguardAllowedIPs hook will be invoked.
Fixes tailscale/corp#38127
Signed-off-by: George Jones <george@tailscale.com>
conn25 needs to add routes to the operating system to direct handling
of the addresses in the magic IP range to the tailscale0 TUN and
tailscaled.
The way we do this for exit nodes and VIP services is that we add routes
to the Routes field of router.Config, and then the config is passed to
the WireGuard engine Reconfig.
conn25 is implemented as an ipnext.Extension and so this commit adds a
hook to ipnext.Hooks to allow any extension to provide routes to the
config. The hook, if provided, is called in routerConfigLocked, similarly
to exit nodes and VIP services.
Fixes tailscale/corp#38123
Signed-off-by: Fran Bull <fran@tailscale.com>
When the client of a connector assigns transit IP addresses for a
connector we need to let wireguard know that packets for the transit IPs
should be sent to the connector node. We do this by:
* keeping a map of node -> transit IPs we've assigned for it
* setting a callback hook within wireguard reconfig to ask us for these
extra allowed IPs.
* forcing wireguard to do a reconfig after we have assigned new transit
IPs.
And this commit is the last part: forcing the wireguard reconfig after a
new address assignment.
Fixes tailscale/corp#38124
Signed-off-by: Fran Bull <fran@tailscale.com>
By polling RTM_GETSTATS via netlink. RTM_GETSTATS is a relatively
efficient and targeted (single device) polling method available since
Linux v4.7.
The tundevstats "feature" can be extended to other platforms in the
future, and it's trivial to add new rtnl_link_stats64 counters on
Linux.
Updates tailscale/corp#38181
Signed-off-by: Jordan Whited <jordan@tailscale.com>