10605 Commits

Author SHA1 Message Date
Fran Bull
2f45a6a9d8 feature/conn25: return expired assignments to address pools
Make it possible to remove the least recently used expired address
assignment from addrAssignments.
Before checking out a new address from the IP pools, return a handful of
expired addresses.

Updates tailscale/corp#39975

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-05-08 14:33:06 -07:00
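
The reclaim step described in 2f45a6a9d8 can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: the `assignment` and `table` types, the `popLRUExpired` and `reclaim` names, and the addresses are all hypothetical stand-ins for addrAssignments and the IP pools.

```go
package main

import (
	"fmt"
	"net/netip"
	"time"
)

// assignment is a hypothetical record of one address handed out from a pool.
type assignment struct {
	addr     netip.Addr
	lastUsed time.Time
	expires  time.Time
}

// table is a hypothetical stand-in for addrAssignments.
type table struct {
	byAddr map[netip.Addr]assignment
}

// popLRUExpired removes and returns the expired assignment that was used
// least recently. It reports ok=false if nothing has expired as of now.
func (t *table) popLRUExpired(now time.Time) (lru assignment, ok bool) {
	for _, a := range t.byAddr {
		if a.expires.After(now) {
			continue // still live
		}
		if !ok || a.lastUsed.Before(lru.lastUsed) {
			lru, ok = a, true
		}
	}
	if ok {
		delete(t.byAddr, lru.addr)
	}
	return lru, ok
}

// reclaim frees up to max expired addresses, least recently used first.
func (t *table) reclaim(now time.Time, max int) []netip.Addr {
	var freed []netip.Addr
	for len(freed) < max {
		a, ok := t.popLRUExpired(now)
		if !ok {
			break
		}
		freed = append(freed, a.addr)
	}
	return freed
}

// demo builds a small table and returns the reclaimed addresses as strings.
func demo() []string {
	now := time.Unix(1000, 0)
	t := &table{byAddr: map[netip.Addr]assignment{}}
	add := func(ip string, usedAgo, ttl time.Duration) {
		a := netip.MustParseAddr(ip)
		t.byAddr[a] = assignment{addr: a, lastUsed: now.Add(-usedAgo), expires: now.Add(ttl)}
	}
	add("100.64.0.1", 30*time.Second, -time.Second) // expired, used recently
	add("100.64.0.2", 90*time.Second, -time.Second) // expired, used longest ago
	add("100.64.0.3", 60*time.Second, time.Minute)  // still live
	var out []string
	for _, a := range t.reclaim(now, 4) {
		out = append(out, a.String())
	}
	return out
}

func main() {
	fmt.Println(demo())
}
```

The linear scan keeps the sketch short; a real table that needs this often would more likely keep an ordered structure so the oldest expired entry is found without walking the whole map.
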
Fran Bull
82346f3882 feature/conn25: move addrAssignments to their own file
Updates tailscale/corp#39975

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-05-08 14:33:06 -07:00
Claus Lensbøl
469d356ed8 tstest/natlab/vmtest: add test for direct conn with cached netmap (#19660)
When a peer is not able to connect to control after a restart and is
using a cached netmap, that node should be able to connect to another
peer in its tailnet (given that the home DERP of that peer has not
changed in the meantime).

Add a test that starts two peers and connects them to a tailnet with
caching enabled. Then blackhole traffic to control from one peer and
restart it. Verify that the connection between the two ends up direct.

Adds facilities for expecting a certain path type between nodes.

Updates: #19597

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-05-08 16:57:27 -04:00
Fran Bull
ee2378b141 feature/conn25: follow CNAMEs when rewriting DNS response
If a DNS query for a domain that should be routed through a connector
results in CNAME records in the response, collapse the CNAME chain to an
A/AAAA record for the domain -> magic IP.

Fixes tailscale/corp#39978

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-05-08 08:12:24 -07:00
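
A rough sketch of the CNAME collapsing described in ee2378b141, assuming the response has already been parsed into simple maps. The `answer` type, the `collapse` function, the hop limit, and the sample names are illustrative assumptions, not the commit's actual data structures.

```go
package main

import "fmt"

// answer is a hypothetical flattened record: the queried name mapped
// straight to the magic IP, with the CNAME chain removed.
type answer struct {
	Name string
	Addr string
}

// collapse follows the CNAME chain that starts at query. If the chain ends
// at a name that carries an address record, it returns a single record for
// query pointing at magicIP. cnames maps an owner name to its CNAME target,
// and hasAddr is the set of names with A/AAAA records in the response.
func collapse(query string, cnames map[string]string, hasAddr map[string]bool, magicIP string) (answer, bool) {
	const maxHops = 8 // guard against CNAME loops in a malformed response
	cur := query
	for hop := 0; hop <= maxHops; hop++ {
		if hasAddr[cur] {
			return answer{Name: query, Addr: magicIP}, true
		}
		next, ok := cnames[cur]
		if !ok {
			break
		}
		cur = next
	}
	return answer{}, false
}

func main() {
	cnames := map[string]string{
		"app.example.com": "lb.example.net",
		"lb.example.net":  "edge.example.org",
	}
	hasAddr := map[string]bool{"edge.example.org": true}
	fmt.Println(collapse("app.example.com", cnames, hasAddr, "100.64.0.9"))
}
```
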
Brad Fitzpatrick
24eb157448 go.toolchain.rev: bump to Go 1.26.3
Updates tailscale/corp#41490

Change-Id: I35b67bdbcd71468fea03b033b17aeefe1319dc45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-07 15:33:05 -07:00
Alex Chan
d6ffc0d986 tka,ipn: reduce boilerplate in Tailnet Lock tests
The `CreateStateForTest` helper reduces boilerplate in cases where the test
only cares about the trusted keys and not the disablement values (and makes
it more obvious where the disablement values are meaningful).

The `setupChonkStorage` helper reduces the boilerplate when creating on-disk
TKA storage in tests.

The `fakeLocalBackend` helper reduces the boilerplate when setting up a
`LocalBackend` instance in the IPN tests.

Updates #cleanup

Change-Id: Iacfba1be5f7fab208eec11e4369d63c7d7519da5
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-07 21:49:27 +01:00
Fernando Serboncini
495d3acc7b tstest/natlab/vmtest: kill QEMU when test process dies (#19676)
Re-exec the test binary as a thin wrapper that holds a pipe inherited
from the test. When the test goes away (any reason, including SIGKILL,
panic, or OOM), the kernel closes the pipe write end; the wrapper sees
EOF and SIGKILLs itself, taking QEMU and its children with it.

Updates #13038

Change-Id: Ib2151098193551396c1d7bb51b07da3bd6b2cfb4

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-05-07 16:14:27 -04:00
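
The pipe-based liveness check in 495d3acc7b relies on the kernel closing the write end when the owning process dies. A minimal in-process sketch of the reading side follows; `waitForParentExit` and `demo` are hypothetical names, and the write end is closed by a goroutine here rather than by a dying test process.

```go
package main

import (
	"fmt"
	"io"
	"os"
)

// waitForParentExit blocks until the write end of the pipe is closed.
// io.Copy returns nil once it reads EOF, so a nil error means the writer
// has gone away.
func waitForParentExit(r io.Reader) error {
	_, err := io.Copy(io.Discard, r)
	return err
}

// demo simulates the parent going away by closing the write end in-process.
func demo() error {
	r, w, err := os.Pipe()
	if err != nil {
		return err
	}
	defer r.Close()
	go w.Close() // stands in for the kernel closing the fd on process exit
	return waitForParentExit(r)
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("writer gone; a real wrapper would now kill itself and its children")
}
```

The appeal of this design is that it needs no signal handling in the test process: the close happens even on SIGKILL, because it is the kernel that releases the descriptor.
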
Claus Lensbøl
76248a68b2 tstest/natlab/vnet: close gonet sockets when test is done (#19677)
Running all vmtests in tstest/natlab/vmtest locally was breaking later
tasks in the queue. The goroutine dump on timeout had goroutines hanging
around for 9 minutes, meaning that something was not getting cleaned up.

  goroutine 262 [select, 9 minutes]:
  gvisor.dev/gvisor/pkg/tcpip/adapters/gonet.commonRead({...})

Add a timeout of Now() to gonet TCP connections when the test ends
(inspired by ServeUnixConn()), and wait for them to shut down before
exiting the test.

Updates #13038

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-05-07 14:57:07 -04:00
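
The "deadline of Now()" trick from 76248a68b2 can be illustrated with any `net.Conn` that supports deadlines. The sketch below uses `net.Pipe` from the standard library as a stand-in for a gonet TCP connection; `readUnblocked` is a hypothetical helper name.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
	"time"
)

// readUnblocked starts a reader that would otherwise block forever, then
// sets a read deadline of time.Now() to force the Read to return. It reports
// whether the Read failed with a deadline-exceeded error.
func readUnblocked() bool {
	a, b := net.Pipe()
	defer a.Close()
	defer b.Close()

	done := make(chan error, 1)
	go func() {
		buf := make([]byte, 1)
		_, err := a.Read(buf) // nothing is ever written to b
		done <- err
	}()

	// A deadline in the past (or exactly now) wakes any pending Read.
	a.SetReadDeadline(time.Now())
	return errors.Is(<-done, os.ErrDeadlineExceeded)
}

func main() {
	fmt.Println("read unblocked by deadline:", readUnblocked())
}
```
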
Hazel T
33b9579c21 scripts/installer.sh: add openSUSE Slowroll as a Tumbleweed derivative (#19662)
Fixes: #14927

Signed-off-by: Hazel T <hazel@tailscale.com>
2026-05-07 12:43:55 +01:00
Erisa A
76712b32d9 .github: install ca-certificates on Kali to fix installer tests (#19673)
Updates #cleanup

Signed-off-by: Erisa A <erisa@tailscale.com>
2026-05-07 12:20:09 +01:00
James Tucker
0def0f19bd util/eventbus: extract SubscriberFunc.dispatch loop to a non-generic helper
The (*SubscriberFunc[T]).dispatch method body — a ~40-line select
loop with slow-subscriber timer, snapshot handling, ctx-cancel
draining, and a CI stack-dump branch — was previously fully
duplicated by the Go compiler for every distinct GC shape of T.
None of that body actually depends on T except for the type
assertion and the user callback invocation.

This change moves the loop body into a non-generic dispatchFunc()
helper, leaving (*SubscriberFunc[T]).dispatch as a tiny wrapper
that:

  - performs the vals.Peek().Event.(T) type assertion
  - spawns the callback goroutine via `go runFuncCallback(s.read,
    t, callDone)` — a regular generic function call, not a closure,
    so that `go` binds the args to the goroutine's frame instead of
    allocating a closure on the heap. This preserves the
    zero-extra-allocation behavior of the original
    (*SubscriberFunc[T]).runCallback method.
  - resolves T's name via reflect.TypeFor[T]().String() (cached on
    the stack rather than recomputed on each %T formatting)
  - calls dispatchFunc with the callDone channel

The %T formatting in the original logf calls is replaced with %s
on the resolved name string, removing per-T fmt instantiations.

A new BenchmarkBasicFuncThroughput is added alongside the existing
BenchmarkBasicThroughput so per-event allocation behavior on the
SubscribeFunc dispatch path is covered by the benchmark suite.

Measured impact (util/eventbus/sizetest):

  SubscriberFunc per-flow attribution:
    linux/amd64:  912.5 B/flow -> 840.8 B/flow  (-71.7 B/flow)
    linux/arm64:  917.5 B/flow -> 849.9 B/flow  (-67.6 B/flow)

The total per-flow size delta on amd64 dropped from 3,096.6 B to
3,039.2 B (-57 B/flow). The arm64 total stayed at 3,145.7 B
because the linker's page-aligned section sizing absorbed the
improvement on this binary; the symcost-attributed per-receiver
number is the real signal.

Behavior is unchanged: BenchmarkBasicThroughput stays at 0
allocs/op and BenchmarkBasicFuncThroughput holds at the same 2
allocs/op, 144 B/op as the prior eventbus implementation. All
eventbus tests pass.

Updates #12614

Change-Id: I85f933f50f58cd25bbfe5cc46bdda7aab22f0bf7
Signed-off-by: James Tucker <james@tailscale.com>
2026-05-06 18:56:09 -07:00
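
The general shape of the refactor in 0def0f19bd, a thin generic wrapper around a non-generic loop, can be sketched as below. This is a simplified illustration with hypothetical names (`subscriber`, `dispatchLoop`), not the eventbus code. In particular it passes a closure to the helper for brevity, which the commit deliberately avoids on its hot path to preserve allocation behavior.

```go
package main

import (
	"fmt"
	"reflect"
)

// dispatchLoop is the non-generic part. It is compiled once regardless of
// how many event types exist, and knows the event type only by name.
func dispatchLoop(typeName string, queue []any, deliver func(any) bool) (delivered int) {
	for _, ev := range queue {
		if deliver(ev) {
			delivered++
		}
	}
	fmt.Printf("subscriber for %s: delivered %d of %d\n", typeName, delivered, len(queue))
	return delivered
}

// subscriber is a hypothetical generic subscriber holding a typed callback.
type subscriber[T any] struct {
	read func(T)
}

// dispatch is the thin generic wrapper. Only the type assertion, the
// callback invocation, and the type-name lookup depend on T.
func (s *subscriber[T]) dispatch(queue []any) int {
	name := reflect.TypeFor[T]().String() // resolved once, formatted with %s
	return dispatchLoop(name, queue, func(ev any) bool {
		t, ok := ev.(T)
		if ok {
			s.read(t)
		}
		return ok
	})
}

func main() {
	sum := 0
	s := &subscriber[int]{read: func(v int) { sum += v }}
	s.dispatch([]any{1, "not an int", 2})
	fmt.Println("sum:", sum)
}
```
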
Brad Fitzpatrick
87a74c3aa2 tsnet: make workload identity federation opt-in
The tailscale.com/wif package brings in the AWS SDK
(github.com/aws/aws-sdk-go-v2/{config,sts,...} and github.com/aws/smithy-go)
to support fetching ID tokens from AWS IMDS for workload identity
federation. Until now, tsnet pulled this in unconditionally via
feature/condregister/identityfederation, costing ~70 unwanted deps for
every tsnet program whether or not it uses workload identity federation.

These AWS SDK deps were originally removed from tsnet on 2025-09-29 by
commit 69c79cb9f ("ipn/store, feature/condregister: move AWS + Kube
store registration to condregister"). They were then accidentally added
back on 2026-01-14 by commit 6a6aa805d ("cmd,feature: add identity
token auto generation for workload identity", PR #18373) when the new
wif package was wired into tsnet via feature/identityfederation.

Drop the blanket import. tsnet programs that want workload identity
federation now opt in with:

    import _ "tailscale.com/feature/identityfederation"

The hook lookup in resolveAuthKey already uses GetOk and degrades
gracefully when the feature isn't linked, so existing programs that
don't use workload identity federation see no behavior change. The
tailscale CLI still imports the condregister wrapper directly, so its
behavior is also unchanged.

Lock this in with TestDeps additions: tailscale.com/wif as a BadDep,
plus substring checks in OnDep that fail on any github.com/aws/ or
k8s.io/ dependency creeping back in.

Also, switch cmd/gitops-pusher from the condregister wrapper to a
direct import of feature/identityfederation: gitops-pusher's auth flow
calls HookExchangeJWTForTokenViaWIF directly, so it shouldn't be
subject to the ts_omit_identityfederation build tag.

Updates #12614

Change-Id: I70599f2bdd4d3666b26a859d5b76caa5d6b94507
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-06 18:43:45 -07:00
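
The "GetOk and degrade gracefully" behavior mentioned in 87a74c3aa2 can be sketched with a tiny optional-hook type. The `hook` type, `exchangeFunc`, and this `resolveAuthKey` signature are illustrative assumptions; the real hook and its function type are not shown in the log.

```go
package main

import "fmt"

// hook is a minimal stand-in for an optional feature hook: it is only set
// when the package providing the feature has been linked into the binary.
type hook[F any] struct {
	f   F
	set bool
}

func (h *hook[F]) Set(f F)          { h.f, h.set = f, true }
func (h *hook[F]) GetOk() (F, bool) { return h.f, h.set }

// exchangeFunc is a hypothetical signature for a token exchange.
type exchangeFunc func(idToken string) (authKey string, err error)

// resolveAuthKey uses the hook if present and otherwise returns the
// configured key unchanged, so programs without the feature see no change.
func resolveAuthKey(h *hook[exchangeFunc], configured string) (string, error) {
	exchange, ok := h.GetOk()
	if !ok {
		return configured, nil // feature not linked
	}
	return exchange(configured)
}

func main() {
	var h hook[exchangeFunc]
	fmt.Println(resolveAuthKey(&h, "key-1")) // hook unset: key passes through

	h.Set(func(id string) (string, error) { return "exchanged-" + id, nil })
	fmt.Println(resolveAuthKey(&h, "key-1")) // hook set: exchange is used
}
```
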
Adriano Sela Aviles
daddb14b8f control/controlhttp: use ws:// when HTTPSPort is NoPort in JS dialer
When HTTPS is explicitly disabled (HTTPSPort == NoPort), the JS WebSocket
dialer should use ws:// instead of wss://. This matches the behavior of
the non-JS client and fixes connections to development control servers
e.g. http://localhost:31544.

Updates tailscale/corp#40944

Signed-off-by: Adriano Sela Aviles <adriano@tailscale.com>
2026-05-06 15:58:58 -07:00
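
The scheme selection described in daddb14b8f amounts to a small branch. The sketch below is an assumption-laden illustration: the `noPort` sentinel value, the `wsURL` helper, and the `/ts2021` path are chosen for the example and may not match the real dialer.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
)

// noPort is an assumed sentinel meaning "HTTPS explicitly disabled".
const noPort = "none"

// wsURL picks ws:// on the HTTP port when HTTPS is disabled, and wss:// on
// the HTTPS port otherwise.
func wsURL(host, httpPort, httpsPort, path string) string {
	u := url.URL{Scheme: "wss", Host: net.JoinHostPort(host, httpsPort), Path: path}
	if httpsPort == noPort {
		u.Scheme = "ws"
		u.Host = net.JoinHostPort(host, httpPort)
	}
	return u.String()
}

func main() {
	fmt.Println(wsURL("localhost", "31544", noPort, "/ts2021"))
	fmt.Println(wsURL("control.example.com", "80", "443", "/ts2021"))
}
```
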
Brad Fitzpatrick
d06cc56987 wgengine/magicsock: add more docs, checks to Test32bitAlignment
Per recent chat with @raggi about all this, I went and looked at this
test again.

Updates #cleanup

Change-Id: Icb7d87b1ed2cebf481ee4e358a3aa603e63fb8a4
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-06 15:29:44 -07:00
Brad Fitzpatrick
15bb10dbce tsnet: ban awsstore and kubestore as deps in TestDeps
Commit 69c79cb9f (Sep 2025) moved awsstore and kubestore registration
behind condregister build tags so tsnet wouldn't pull in the AWS SDK
and Kubernetes client by default. The accompanying TestDeps BadDeps
entry was missed, so PR #19667 (which re-added those imports) wasn't
caught by the test.

Add the two packages to BadDeps so future regressions fail the test.

Updates #19667
Updates #12614

Change-Id: I903b7c976e5e122cc0c0b956dc73740f5d474fac
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-06 14:57:47 -07:00
Tom Proctor
b74eeda055 cmd/testwrapper: print unit for package duration (#19663)
Include the unit (s) when printing the time taken to test each package.

Updates #cleanup

Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
2026-05-06 22:31:48 +01:00
kari-ts
c721189cef ipn/ipnlocal: prefer one CGNAT route on Android (#19652)
Android rebuilds its VpnService interface when the VPN route
configuration changes, which tears down long-lived TCP connections
through the tunnel. Use the same automatic OneCGNATRoute behavior as
macOS on Android, and prefer the single CGNAT route when no other
interface is using the CGNAT range, falling back to fine-grained peer
routes otherwise.

Updates tailscale/tailscale#19591

Signed-off-by: kari <kari@tailscale.com>
2026-05-05 19:11:17 -07:00
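
The route choice in c721189cef can be sketched as a check of other interfaces' addresses against the CGNAT range (100.64.0.0/10). The function names and the inputs below are hypothetical; the real logic lives in ipn/ipnlocal and is not shown in the log.

```go
package main

import (
	"fmt"
	"net/netip"
)

// cgnat is the carrier-grade NAT range, 100.64.0.0/10.
var cgnat = netip.MustParsePrefix("100.64.0.0/10")

// parseAddrs is a small helper for building test inputs.
func parseAddrs(ss ...string) []netip.Addr {
	var out []netip.Addr
	for _, s := range ss {
		out = append(out, netip.MustParseAddr(s))
	}
	return out
}

// useOneCGNATRoute reports whether a single 100.64.0.0/10 route is safe,
// which this sketch takes to mean that no other interface has an address
// inside the CGNAT range.
func useOneCGNATRoute(otherIfaceAddrs []netip.Addr) bool {
	for _, a := range otherIfaceAddrs {
		if cgnat.Contains(a) {
			return false
		}
	}
	return true
}

// routes returns the one coarse route when possible, and otherwise falls
// back to the fine-grained per-peer routes.
func routes(peerRoutes []netip.Prefix, otherIfaceAddrs []netip.Addr) []netip.Prefix {
	if useOneCGNATRoute(otherIfaceAddrs) {
		return []netip.Prefix{cgnat}
	}
	return peerRoutes
}

func main() {
	peers := []netip.Prefix{
		netip.MustParsePrefix("100.101.102.103/32"),
		netip.MustParsePrefix("100.77.1.2/32"),
	}
	fmt.Println(routes(peers, parseAddrs("192.168.1.5")))
	fmt.Println(routes(peers, parseAddrs("100.100.1.1")))
}
```

A stable single route means peer churn no longer changes the VPN route configuration, which is the property the commit wants on Android.
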
Brad Fitzpatrick
f844c8bc32 util/winutil/gp: deflake TestGroupPolicyReadLockClose
The test goroutine read lockCnt immediately after Lock returned, racing
with Close: close(lk.closing) wakes lockSlow's select, whose deferred
Add(-2) on lockCnt can run before Close's CAS clears the LSB. When that
happens, lockCnt is briefly 1 (3 - 2) instead of 0 (1 + 2 - 2 - 1),
producing "lockCnt: got 1; want 0".

Move the lockCnt assertion into the main test goroutine, after Close
has returned and the Lock goroutine has finished, so both updates
have settled before we read.

Fixes #19647

Change-Id: Ia67036ff73a1beb528cbd621460db9048f3066ad
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-05 14:02:35 -07:00
Jonathan Nobels
872d79089e VERSION.txt: this is v1.99.0 (#19645)
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
v1.99.0-pre
2026-05-05 15:07:20 -04:00
Evan Lowry
aa21b0c008 client/systray: fix recommended exit node not showing as selected (#19627)
When an exit node was set before launching systray, the recommended row
in exit nodes rendered as not selected even when the active exit node
was at the same location.

This looks to be two different things:

- suggestExitNode takes its own suggestion into account, and not the
  user's active exit node. When a Mullvad city is reached via the picker
  rather than the recommended row, the suggester's pick and
  prefs.ExitNodeID end up as distinct peers in the same city, resulting
  in an ID-only equality check missing the match.
- Toggle state was constructed and mutated via .Check(), which for newly
  created elements may be cached (such as when launching systray, with
  an already active node).

Fixes #19626

Signed-off-by: Evan Lowry <evan@tailscale.com>
2026-05-05 10:49:38 -03:00
Alex Chan
eac531da8e cmd/tailscale/cli: unhide --report-posture flag in up
This was originally hidden during the beta period in both `up` and `set`,
then when device posture went GA we unhid the flag in `set` but not in
`up`.

This is confusing for users, because an error message can direct them to
run `tailscale up` with this flag if they've set it previously, but the
help text won't tell them what it does.

Updates #5902
Updates #17972

Change-Id: I9a31946f4b3bb411feed0f5a6449d7ff9a5ba9d3
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-05 10:12:36 +01:00
Brad Fitzpatrick
883d4fd2cd wgengine/netstack, net/ping: stop using pro-bing and use our net/ping instead
Fixes #19633
Fixes #13760

Change-Id: I0fa9423523a3a0fb1dfcde57de0f26e51723ff97
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-04 14:05:24 -07:00
Brad Fitzpatrick
81569e891f tstest/iosdeps: update import list to mirror ipn-go-bridge
The purpose of this package is to test the iOS dependency closure, but
it had drifted from the actual import list of the ipn-go-bridge package
in the corp repo (the Go side of the iOS / macOS app).

Update the imports to match ipn-go-bridge's GOOS=ios import list,
adding many missing packages including wgengine/netstack,
feature/{taildrop,syspolicy,condregister}, the util/syspolicy/*
subpackages, types/{key,lazy,logid,netmap}, tsd, safesocket,
util/{eventbus,must,set}, and several net/* and ipn/* packages.

Drop two now-stale BadDeps entries (for now!): database/sql/driver and
github.com/google/uuid are reached via wgengine/netstack ->
github.com/prometheus-community/pro-bing, which netstack imports on
darwin || ios for ICMP user-ping, so the iOS app already ships them.
But we should fix that later.

Updates #19633

Change-Id: Ic50779fdb195685a2e8ccd7c513eee91b0feeaf8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-04 14:05:24 -07:00
Brad Fitzpatrick
9bb7ca6116 cmd/vet/lowerell, drive/driveimpl: forbid variables named "l" or "I"
Add a new vet checker that rejects variables, parameters, named
return values, receivers, range/type-switch bindings, type
parameters, struct fields, and constants named "l" (lowercase ell)
or "I" (uppercase i). Both are hard to distinguish from the digit
"1" and from each other in too many fonts.

Rename the two pre-existing struct fields named "l" (both of type
net.Listener) in drive/driveimpl/drive_test.go to "ln", matching the
convention used elsewhere for net.Listener locals.

Rename the test-fixture struct fields "I" (single int label) to
"Int" in metrics/multilabelmap_test.go and util/deephash/deephash_test.go,
preserving the "first letters of types" convention used alongside
neighboring fields like I8/I16/U/U8.

Also teach pkgdoc_test.go to skip testdata/ directories, which
the go tool ignores; they are not real packages.

Fixes #19631

Change-Id: I71ad2fa990705f7a070406ebcdb8cefa7487d849
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-04 14:03:28 -07:00
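
A checker along the lines of 9bb7ca6116 can be approximated with the standard go/ast packages. The sketch below is a simplified standalone function, not the cmd/vet/lowerell analyzer: `badNames` is a hypothetical name, and because it inspects every `*ast.Field` it would also flag interface methods named "I", which a real checker may treat differently.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// badNames parses src and returns one "position: name" entry for each
// declared identifier named "l" or "I".
func badNames(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "x.go", src, 0)
	if err != nil {
		return nil, err
	}
	var out []string
	report := func(id *ast.Ident) {
		if id != nil && (id.Name == "l" || id.Name == "I") {
			out = append(out, fmt.Sprintf("%s: %q", fset.Position(id.Pos()), id.Name))
		}
	}
	ast.Inspect(f, func(n ast.Node) bool {
		switch n := n.(type) {
		case *ast.ValueSpec: // var and const declarations
			for _, id := range n.Names {
				report(id)
			}
		case *ast.Field: // params, results, receivers, struct fields, type params
			for _, id := range n.Names {
				report(id)
			}
		case *ast.AssignStmt: // short variable declarations and type-switch bindings
			if n.Tok == token.DEFINE {
				for _, lhs := range n.Lhs {
					if id, ok := lhs.(*ast.Ident); ok {
						report(id)
					}
				}
			}
		case *ast.RangeStmt: // for k, v := range ...
			if n.Tok == token.DEFINE {
				if id, ok := n.Key.(*ast.Ident); ok {
					report(id)
				}
				if id, ok := n.Value.(*ast.Ident); ok {
					report(id)
				}
			}
		}
		return true
	})
	return out, nil
}

func main() {
	src := "package p\nfunc f(l int) int { I := l; return I }\n"
	found, err := badNames(src)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, s := range found {
		fmt.Println(s)
	}
}
```
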
Andrew Lytvynov
0cf899610c util/linuxfw/linuxfwtest: remove unused package (#19520)
Added in 2022, this appears to be unused now.

Updates #cleanup

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-05-04 12:33:12 -07:00
License Updater
ca2317439d licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2026-05-04 10:34:27 -07:00
Jordan Whited
ce76f44df2 derp/derpserver: remove global rate limiter
The global rate limiter can be unfair around varying packet sizes.

Updates tailscale/corp#40962

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-05-04 09:41:14 -07:00
Fernando Serboncini
29122506be misc/git_hook: propagate shared HOOK_VERSION (#19476)
Move HOOK_VERSION into the githook package and export it as
githook.HookVersion, so tailscale/corp can reference it via
the shared-code bump instead of having to bump HOOK_VERSION
by hand.

New launcher.sh composes the wanted version from 2 sources:
the shared HOOK_VERSION and an optional repo local version,
misc/git_hook/HOOK_VERSION, for repo-specific config bumps.

Updates tailscale/corp#40381

Change-Id: I7cf16889ba53cb564cc2df7dfd7588748f542c55

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-05-04 12:38:28 -04:00
George Jones
290a6cc03c appc, feature/conn25: handle exact and wildcard domains correctly (#19202)
Installed SplitDNS routes are always treated as wildcard domains,
so the domains that we pass to the local resolver should be normalized
and have any leading *. wildcard prefix removed.

When looking at DNS responses to see if the domain matches, we need to
consider both exact matches and wildcard matches. We now keep separate
maps of exact-match domains and wildcard domains, and when we match we
check to see if there's a match in the exact-match map, otherwise we
check against the wildcard match map until we find a match, removing
a label after each check.

Rather than looking for matching self-hosted domains (domains serviced
by the connector being run on the self-node), the apps that are being
serviced by the connector on the self-node are tracked instead. When
checking to see if a DNS response should be rewritten, it is ignored
if any of the matching apps for the domain are in the self-hosted apps set.

Fixes tailscale/corp#39272

Signed-off-by: George Jones <george@tailscale.com>
2026-05-01 17:33:21 -04:00
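
The exact-then-wildcard lookup described in 290a6cc03c can be sketched with two maps. The `matcher` type and its methods are hypothetical, and the sketch assumes that a wildcard entry also matches the bare domain itself, which the commit message does not state explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize lowercases a pattern, drops a trailing dot, and strips a
// leading "*." wildcard prefix, reporting whether one was present.
func normalize(pattern string) (name string, wildcard bool) {
	name = strings.ToLower(strings.TrimSuffix(pattern, "."))
	if rest, ok := strings.CutPrefix(name, "*."); ok {
		return rest, true
	}
	return name, false
}

// matcher keeps separate maps from domain to app names for exact-match
// and wildcard domains.
type matcher struct {
	exact    map[string][]string
	wildcard map[string][]string
}

// newMatcher builds a matcher from a map of app name to domain patterns.
func newMatcher(appDomains map[string][]string) *matcher {
	m := &matcher{exact: map[string][]string{}, wildcard: map[string][]string{}}
	for app, patterns := range appDomains {
		for _, p := range patterns {
			name, wild := normalize(p)
			if wild {
				m.wildcard[name] = append(m.wildcard[name], app)
			} else {
				m.exact[name] = append(m.exact[name], app)
			}
		}
	}
	return m
}

// lookup checks the exact map first, then walks the wildcard map, removing
// the leftmost label after each failed check.
func (m *matcher) lookup(domain string) []string {
	d, _ := normalize(domain)
	if apps, ok := m.exact[d]; ok {
		return apps
	}
	for {
		if apps, ok := m.wildcard[d]; ok {
			return apps
		}
		_, rest, found := strings.Cut(d, ".")
		if !found {
			return nil
		}
		d = rest
	}
}

func main() {
	m := newMatcher(map[string][]string{
		"app1": {"a.example.com"},
		"app2": {"*.example.com"},
	})
	fmt.Println(m.lookup("a.example.com."))
	fmt.Println(m.lookup("x.y.example.com"))
	fmt.Println(m.lookup("example.org"))
}
```
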
Fran Bull
bdf3419e7d net/dns: add custom scheme resolvers
If another part of the client code registers a custom scheme with the
forwarder, the forwarder will check resolver addresses to see if they
match the scheme. If they do, the corresponding custom scheme handler
will be called to find the actual address for the resolver at this
moment. If the handler returns the empty string then that resolver will
be ignored.

This is useful if you want to dynamically determine where to send
certain DNS requests. It is being added to support new app connector
(conn25) work that would like to make sure it sends DNS requests to the
current connector peer in a high availability configuration.

Updates tailscale/corp#39858

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-05-01 14:01:10 -07:00
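
The custom scheme mechanism from bdf3419e7d can be sketched as a registry of handlers consulted for each resolver address. Everything named here (`forwarder`, `schemeHandler`, `register`, `resolve`, and the "connector" scheme) is a hypothetical illustration; the real forwarder API is not shown in the log.

```go
package main

import (
	"fmt"
	"strings"
)

// schemeHandler returns the actual resolver address to use right now, or
// the empty string if the resolver should be ignored.
type schemeHandler func(addr string) string

// forwarder is a hypothetical stand-in holding the registered schemes.
type forwarder struct {
	handlers map[string]schemeHandler
}

func (f *forwarder) register(scheme string, h schemeHandler) {
	f.handlers[scheme] = h
}

// resolve maps configured resolver addresses to the addresses to query.
// Addresses without a registered scheme pass through unchanged.
func (f *forwarder) resolve(resolvers []string) []string {
	var out []string
	for _, r := range resolvers {
		scheme, _, hasScheme := strings.Cut(r, "://")
		h, registered := f.handlers[scheme]
		if !hasScheme || !registered {
			out = append(out, r)
			continue
		}
		if actual := h(r); actual != "" {
			out = append(out, actual)
		}
	}
	return out
}

func main() {
	f := &forwarder{handlers: map[string]schemeHandler{}}
	f.register("connector", func(addr string) string {
		if addr == "connector://primary" {
			return "100.64.0.7:53" // the currently active peer, for illustration
		}
		return "" // no peer available: skip this resolver
	})
	fmt.Println(f.resolve([]string{"8.8.8.8", "connector://primary", "connector://standby"}))
}
```
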
Rollie Ma
78126c5d9f tailcfg: add node capability for services in desktop clients (#19605)
Add a node capability to help determine if the desktop clients should
show the services list/menu/section.

Updates: https://github.com/tailscale/corp/issues/40900

Change-Id: Ie34b3362f921d710173b2a0dd190354352bb26f0

Signed-off-by: Rollie Ma <rollie@tailscale.com>
2026-05-01 12:07:33 -07:00
Tom Meadows
ee10f9881c cmd/k8s-operator: add authkey reissuing to recorder reconciler (#19556)
Also fixes a memory leak with the authKeyReissuing map on ProxyGroup
reconciler authkey reissue.

Updates #19311

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2026-05-01 18:26:55 +01:00
Alex Chan
3ced30b0b6 tka: clarify that this limit is on disablement *values* not *secrets*
Values get written into TKA state; secrets don't.

Updates #cleanup

Change-Id: Ief9831dcb1102f584a33b2e71b611b38ca463724
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-01 18:25:39 +01:00
Andrew Lytvynov
f15a4f4416 client/web: move API permission checks into handlers (#19576)
There are only a couple endpoints that check peer capabilities. Keeping
permission checks with the code that assumes they were performed, rather
than with the routing layer, feels easier to reason about.

Check that the caller is actually a peer and pass their capabilities via
a context value for handlers that want to check them.

Along with this, simplify the helper handler wrappers that are not
needed for most of the endpoints.

Updates #40851

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-05-01 09:01:53 -07:00
Brad Fitzpatrick
bbcb8650d4 cmd/tailscale/cli: fetch netmap via current-netmap debug action
Stop opening an IPN bus subscription with NotifyInitialNetMap purely to
read the current netmap once. Use the LocalAPI debug current-netmap
action (added in 159cf8707) instead, which returns the current netmap
synchronously without subscribing to the bus.

Updates #12542

Change-Id: I8aa2096d65aaea4dfe62634f03ce06b5470e0e51
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-01 07:53:51 -07:00
Brad Fitzpatrick
4c3ed5ab32 all: migrate code off Notify.NetMap to Notify.SelfChange
Move tailscaled's in-tree reactive users off of IPN bus Notify.NetMap
updates to the narrower Notify.SelfChange signal introduced earlier in
this series. Consumers that need additional state (peers, DNS config,
etc.) fetch it on demand via the LocalAPI.

It is a step toward the larger goal of not fanning Notify.NetMap out
to every bus watcher on Linux/non-GUI hosts.

A future change stops sending Notify.NetMap entirely on Linux and
non-GUI platforms. (Eventually, once macOS/iOS/Windows migrate to the
upcoming new Notify APIs, we'll remove ipn.Notify.NetMap entirely.)

Updates #12542

Change-Id: I51ea9d86bdca1909d6ac0e7d5bd3934a3a4e8516
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-01 06:51:40 -07:00
Claus Lensbøl
ff9c3f0e00 tstest/natlab/vmtest: add test loading netmap cache from disk (#19598)
For testing the loading of netmap cache from disk, the cache needs to
exist. The simple solution is to start two nodes and connect them to
control, with the netmap caching capability set. Then cut the connection
to control, restart the nodes, and ping between them.

This tests that we can start from a cache and get to running state, but
also that we are able to establish a connection between the nodes.

For now this is not testing how the nodes are able to talk to each other
(DERP vs direct).

Updates #19597

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-05-01 09:46:19 -04:00
Brad Fitzpatrick
89a78dc9b7 client/local, ipn/localapi, ipn/ipnlocal: add PeerByID
Add a narrow LocalAPI accessor and matching client/LocalBackend method
to look up a single peer's current full [tailcfg.Node] by NodeID, in
O(1) time on the daemon side, without fetching the entire netmap.

Useful for callers that need the latest state of a single peer (e.g.
in response to a peer-mutation event on the IPN bus) without paying
for a full netmap fetch.

Updates #12542

Change-Id: I1cb2d350e6ad846a5dabc1f5368dfc8121387f7c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-05-01 06:20:46 -07:00
Alex Chan
cac94f51cc ipn/ipnlocal: don't compact TKA state on startup
Compacting on startup means nodes may compact at a different cadence
based on whether they're long-running or restarting frequently.

We already compact after every sync, which only occurs when the TKA
state has changed. Waiting for TKA changes to trigger compaction on
nodes means compaction will occur more consistently across a tailnet.

Updates tailscale/corp#33537

Change-Id: Ia0aa6d9e5e362e9ab08450fde69772841790d5b5
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-05-01 13:27:12 +01:00
Brad Fitzpatrick
a6c5d23742 ipn, ipn/ipnlocal: add Notify.SelfChange
Add a new bus signal that lets reactive consumers (containerboot, kube
agents, sniproxy, tsconsensus, etc.) react to self-node updates without
having to subscribe to the full netmap. Today those consumers either
watch Notify.NetMap (which on large tailnets is expensive to encode and
ship per watcher) or poll. SelfChange is a cheap, narrow alternative:
addresses, name, key expiry, capabilities, etc.

Consumers that need additional state can react to SelfChange and then
fetch the relevant bits on demand via existing LocalClient methods.

Producer-side, every netmap-bearing setControlClientStatus call now
also publishes SelfChange. Future changes will migrate individual
in-tree consumers off Notify.NetMap to this signal, and eventually
gate the legacy NetMap emission to platforms whose host GUIs still
require it.

Updates #12542

Change-Id: I4441650b0e085d663eb6bf26a03748b7d961ca49
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-30 14:47:03 -07:00
Brad Fitzpatrick
9f343fdc0c client/local, ipn/localapi, all: add CertDomains and DNSConfig accessors
Add two narrow LocalAPI accessors so callers don't have to subscribe to
the IPN bus and pull a full *netmap.NetworkMap just to read DNS-shaped
fields:

  - GET /localapi/v0/cert-domains returns DNS.CertDomains.
  - GET /localapi/v0/dns-config returns the full tailcfg.DNSConfig.

Migrate in-tree callers off the netmap-on-the-bus pattern:

  - kube/certs.waitForCertDomain still wakes on the IPN bus but now
    queries CertDomains via LocalClient.CertDomains rather than
    reading n.NetMap.DNS.CertDomains. The kube LocalClient interface
    and FakeLocalClient gain a CertDomains method.
  - cmd/tailscale dns status calls LocalClient.DNSConfig directly
    instead of opening a NotifyInitialNetMap watcher.
  - cmd/tailscale configure kubeconfig switches from a netmap watcher
    + serviceDNSRecordFromNetMap to LocalClient.DNSConfig +
    serviceDNSRecordFromDNSConfig.

This is part of a series moving callers away from depending on the
netmap traveling on the IPN bus, so the bus payload can shrink in a
later change.

Updates #12542

Change-Id: Ie10204e141d085fbac183b4cfe497226b670ad6c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-30 13:50:46 -07:00
Michael Ben-Ami
822299642b feature/conn25: centralize config on Conn25 with atomic access
We have two sources of truth for configuration state: the node view
(from the netmap/policy) and prefs (the --advertise-connector option).
These come with two independent update paths: onSelfChange for node view
changes and profileStateChange for pref changes.

Centralize config on Conn25 so that onSelfChange and profileStateChange
can update their independent parts without bundling changes together.
The old bundled approach required read-modify-write, which opened the
door to potential TOCTOU bugs. The node view config is
stored as an atomic.Pointer[config] and the prefs-derived field
(advertise-connector) becomes an independent atomic.Bool. onSelfChange
creates a fresh config and stores it atomically. profileStateChange sets
the bool.

This also establishes clearer lines of responsibility:

 - Configuration state lives on Conn25. Methods that need to read
   config (isConnectorDomain, mapDNSResponse, the IPMapper methods)
   are on Conn25, and use the atomics for synchronization.

 - "Active" state (address allocations, transit IP mappings) lives on
   client and connector, which use a mutex for synchronization on that
   state, without conflicting with configuration synchronization.
   It's fine for active state to be out of sync with config — e.g. a
   transit IP allocated for an app should still be tracked, and gracefully
   expired, even if the app is removed from the node view.
   Removing config responsibility from client/connector makes these
   cases clearer to handle.

 - In cases where the client or connector does need access to
   config-derived state, e.g. a client reconfiguring its IP pools from
   the IPSets in the config, we can use closures for the
   client or connector to get just the latest state it needs from the
   config. See getIPSets() in this commit.

 - As of this commit, the connector doesn't need config-derived state at
   all.

Fixes tailscale/corp#40872

Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-04-30 16:29:56 -04:00
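
The split between an atomically swapped config and an independent atomic flag, as described in 822299642b, can be sketched like this. The `config` fields and the `isConnectorFor` method are hypothetical; only the use of `atomic.Pointer[config]` and `atomic.Bool` comes from the commit message.

```go
package main

import (
	"fmt"
	"slices"
	"sync/atomic"
)

// config is a hypothetical snapshot derived from the node view. It is never
// mutated after being stored.
type config struct {
	apps []string
}

// conn25 holds the two independently updated pieces of configuration.
type conn25 struct {
	cfg                atomic.Pointer[config] // replaced wholesale by onSelfChange
	advertiseConnector atomic.Bool            // set by profileStateChange
}

// onSelfChange builds a fresh config and stores it atomically, so no
// read-modify-write of shared state is needed.
func (c *conn25) onSelfChange(apps []string) {
	c.cfg.Store(&config{apps: slices.Clone(apps)})
}

// profileStateChange updates only the prefs-derived flag.
func (c *conn25) profileStateChange(advertise bool) {
	c.advertiseConnector.Store(advertise)
}

// isConnectorFor reads both pieces without taking a lock.
func (c *conn25) isConnectorFor(app string) bool {
	cfg := c.cfg.Load()
	return cfg != nil && c.advertiseConnector.Load() && slices.Contains(cfg.apps, app)
}

func main() {
	c := &conn25{}
	fmt.Println(c.isConnectorFor("wiki")) // nothing configured yet
	c.onSelfChange([]string{"wiki", "git"})
	c.profileStateChange(true)
	fmt.Println(c.isConnectorFor("wiki"))
}
```

Because each update path writes only its own field, neither has to read the other's state first, which is what removes the read-modify-write window.
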
Brad Fitzpatrick
159cf8707a ipn/ipnlocal, all: split LocalBackend.NetMap into NetMapNoPeers / NetMapWithPeers
Add two narrower accessors alongside the existing
[LocalBackend.NetMap], with docs that distinguish their semantics:

  - NetMapNoPeers: cheap (returns the cached *netmap.NetworkMap with
    a possibly-stale Peers slice). For callers that only read non-Peers
    fields like SelfNode, DNS, PacketFilter, capabilities.
  - NetMapWithPeers: documented as returning an up-to-date Peers slice.
    For callers that genuinely need to iterate Peers or call
    PeerByXxx.

Mark the existing NetMap deprecated and point readers at the two new
accessors. NetMap, NetMapNoPeers, and NetMapWithPeers all currently
return the same value (b.currentNode().NetMap()): this commit is a
no-op behaviorally, just a renaming and migration of in-tree callers.
A subsequent change in the same series will switch
NetMapWithPeers to actually rebuild the Peers slice from the live
per-node-backend peers map (O(N) per call), at which point the
distinction between the two new accessors becomes load-bearing.

Migrate in-tree callers to the appropriate accessor based on what
fields they read:

  - NetMapNoPeers (most common): localapi handlers, peerapi accept,
    GetCertPEMWithValidity, web client noise request, doctor DNS
    resolver check, tsnet CertDomains/TailscaleIPs, ssh/tailssh
    SSH-policy/cap reads, several LocalBackend internals
    (isLocalIP, allowExitNodeDNSProxyToServeName, pauseForNetwork
    nil-check, serve config).
  - NetMapWithPeers: writeNetmapToDiskLocked (persist full netmap to
    disk for fast restart), PeerByTailscaleIP lookup.

Tests still call the legacy NetMap; they'll see the deprecation
warning but otherwise behave identically.

Also add two pieces of plumbing the next change in this series will
need, but which are already useful on their own:

  - [client/local.GetDebugResultJSON]: a generic [Client.DebugResultJSON]
    that decodes directly into a target type T, avoiding the
    marshal/unmarshal roundtrip callers otherwise need.
  - localapi "current-netmap" debug action: returns the current
    netmap (with peers) as JSON. Documented as debug-only — the
    netmap.NetworkMap shape is internal and may change without notice.

This commit is part of a series breaking up a larger change for
review; on its own it is a no-op refactor.

Updates #12542

Change-Id: Idbb30707414f8da3149c44ca0273262708375b02
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-30 11:14:06 -07:00
Brad Fitzpatrick
92179b1fc7 cmd/hello: split server into helloserver package
Move the template, request handler, and HTTP/HTTPS server wiring out
of package main and into a new cmd/hello/helloserver package so the
server can be embedded in other binaries. The main package now only
constructs a helloserver.Server with the production addresses and
calls Run.

While here, drop the -http, -https, and -test-ip flags along with the
dev-mode template and fake-data fallbacks they enabled; the binary is
only run in production.

Updates tailscale/corp#32398

Change-Id: Id1d38b981733334cafc596021130f36e1c1eed67
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-30 08:40:55 -07:00
David Bond
644c3224e9 cmd/{containerboot,k8s-operator}: don't return pointers to maps (#19593)
This commit modifies the usage of the `egressservices.Configs` type
within containerboot and the k8s operator.

Originally it was being passed around as a pointer, which is not
required because Go maps already behave as references under the hood.

Signed-off-by: David Bond <davidsbond93@gmail.com>
2026-04-30 16:11:00 +01:00
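
The point made in 644c3224e9 can be shown in a few lines. The `configs` type below is a hypothetical map type for illustration, not `egressservices.Configs` itself.

```go
package main

import "fmt"

// configs is a hypothetical named map type.
type configs map[string]int

// addPort mutates the caller's map even though c is passed by value,
// because a map value is a reference to shared underlying storage.
func addPort(c configs, name string, port int) {
	c[name] = port
}

func main() {
	c := configs{}
	addPort(c, "web", 443)
	fmt.Println(c["web"], len(c))
}
```

One caveat worth keeping in mind: a callee cannot initialize a caller's nil map this way, since writing to a nil map panics, so the map must be created before it is passed.
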
Brad Fitzpatrick
815bb291c9 cmd/tailscale/cli: allow tag without "tag:" prefix in 'tailscale up'
If a user passes --advertise-tags=foo,bar (with no colons in any
segment), automatically prepend "tag:" client-side so it goes on the
wire as "tag:foo,tag:bar". Segments that already contain a colon are
left untouched and must be fully-qualified ("tag:foo"), which keeps
the door open for future colon-bearing syntax.

This was originally added in cd07437ad (2020-10-28) and then reverted
in 1be01ddc6 (2020-11-10) over forward-compatibility concerns. But
on 2026-04-29 it was realized that this was always safe for future
extensibility anyway (tags can't contain colons, so tag:foo:bar is
already invalid per the 2020 CheckTag restrictions). So if we wanted
to add some hypothetical --advertise-tags=tagset:setfoo or "group:foo",
we'd still have syntax to do so, as it can't conflict with tag:group:foo.

Avery signed off on this on Slack: "Ok, I withdraw my objection to
auto-qualifying tag names in advertise-tags and I hope I won't regret
it :)"

Updates #861

Change-Id: I06935b0d3ae909894c95c9c2e185b7d6a219ff32
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-30 07:13:48 -07:00
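
The client-side prefixing rule from 815bb291c9 is small enough to sketch directly. `qualifyTags` is a hypothetical helper name, and the sketch does no validation of the tag names themselves.

```go
package main

import (
	"fmt"
	"strings"
)

// qualifyTags prepends "tag:" to every comma-separated segment that
// contains no colon, and leaves colon-bearing segments untouched.
func qualifyTags(s string) string {
	if s == "" {
		return ""
	}
	parts := strings.Split(s, ",")
	for i, p := range parts {
		if !strings.Contains(p, ":") {
			parts[i] = "tag:" + p
		}
	}
	return strings.Join(parts, ",")
}

func main() {
	fmt.Println(qualifyTags("foo,bar"))
	fmt.Println(qualifyTags("tag:foo,bar"))
}
```
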
Brad Fitzpatrick
f343b496c3 wgengine, all: remove LazyWG, use wireguard-go callback API for on-demand peers
Replace the UAPI text protocol-based wireguard configuration with
wireguard-go's new direct callback API (SetPeerLookupFunc,
SetPeerByIPPacketFunc, RemoveMatchingPeers, SetPrivateKey).

Instead of computing a trimmed wireguard config ahead of time upon
control plane updates and pushing it via UAPI, install callbacks so
wireguard-go creates peers on demand when packets arrive. This removes
all the LazyWG trimming machinery: idle peer tracking, activity maps,
noteRecvActivity callbacks, the KeepFullWGConfig control knob, and the
ts_omit_lazywg build tag.

For incoming packets, PeerLookupFunc answers wireguard-go's questions
about unknown public keys by looking up the peer in the full config.
For outgoing packets, PeerByIPPacketFunc (installed from
LocalBackend.lookupPeerByIP) maps destination IPs to node public keys
using the existing nodeByAddr index.

Updates tailscale/corp#12345

Change-Id: I4cba80979ac49a1231d00a01fdba5f0c2af95dd8
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 19:46:19 -07:00
Brad Fitzpatrick
b313bffbe7 control/tsp, tstest/integration/testcontrol: deflake TestMapAgainstTestControl
The test was flaky under stress with "AddRawMapResponse N: node not
connected" failures. The root cause was in testcontrol's addDebugMessage:
it conflated "no streaming poll registered" with "wake-up channel buffer
momentarily full". The single-slot updatesCh is just a lossy wake-up
signal, but the streaming serveMap loop has fast paths
(takeRawMapMessage and the hasPendingRawMapMessage continue) that don't
drain it. A stale notification could remain buffered, causing the next
sendUpdate to fail even though msgToSend had been queued and the
streaming poll would still pick it up.

Detect the real failure case (no streaming poll) by checking
s.updates[nodeID] directly, and treat sendUpdate's buffer-full result as
benign — the message is in msgToSend, which is the source of truth.

Also plumb an optional *health.Tracker through tsp.ClientOpts to the
underlying ts2021.Client and supply one in the tests, eliminating the
"## WARNING: (non-fatal) nil health.Tracker (being strict in CI)" stack
dumps emitted by controlhttp.(*Dialer).forceNoise443 under CI.

Fixes #19583

Change-Id: Ib2334376585e8d6562f000a0b71dea0117acb0ff
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-04-29 16:11:00 -07:00
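
The distinction drawn in b313bffbe7, between a lossy wake-up channel and the queue that is the real source of truth, can be sketched as follows. The `poll` type and its methods are hypothetical and much simpler than testcontrol's streaming map poll.

```go
package main

import (
	"fmt"
	"sync"
)

// poll is a hypothetical streaming poll with a message queue and a
// single-slot wake-up channel.
type poll struct {
	mu    sync.Mutex
	queue []string
	wake  chan struct{}
}

func newPoll() *poll {
	return &poll{wake: make(chan struct{}, 1)}
}

// send queues msg and then tries to signal the poller. A full wake channel
// is not an error: the message is already queued, and a wake-up is pending.
func (p *poll) send(msg string) (signaled bool) {
	p.mu.Lock()
	p.queue = append(p.queue, msg)
	p.mu.Unlock()
	select {
	case p.wake <- struct{}{}:
		return true
	default:
		return false // buffer momentarily full; benign
	}
}

// drain takes everything queued so far.
func (p *poll) drain() []string {
	p.mu.Lock()
	defer p.mu.Unlock()
	out := p.queue
	p.queue = nil
	return out
}

func main() {
	p := newPoll()
	fmt.Println(p.send("m1"), p.send("m2")) // second signal is dropped
	fmt.Println(p.drain())                  // both messages are still delivered
}
```
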
Claus Lensbøl
978b6a81b2 ipn/ipnlocal: always ReSTUN when starting up without a cache (#19586)
78627c1 introduced starting up and preserving the DERP server from
cache, but also changed it so the initial ReSTUN would not fire when
setting the DERPMap.

Change this so when not working from a cache, the ReSTUN will always
fire during startup.

Updates #19585

Signed-off-by: Claus Lensbøl <claus@tailscale.com>
2026-04-29 18:56:57 -04:00
Jordan Whited
c0a9728fe2 derp/derpserver: fix Server.UpdateRateLimits docs
As of 0e9f9e2bd it is possible to have an infinite per-client limit,
with finite global.

Updates tailscale/corp#40962

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-04-29 14:43:12 -07:00