Commit Graph

10851 Commits

Author SHA1 Message Date
Alex Chan
9169b206be Revert "control/controlclient: continue map poll during key expiry to receive extensions" (#20257)
* Revert "control/controlclient: continue map poll during key expiry to receive extensions"

This reverts commit 6a822dcc36. This commit
has caused test failures in the corp repo by unexpectedly changing the login
behaviour when nodes have a valid node key.

Updates tailscale/corp#43705
Updates #19326

Signed-off-by: Alex Chan <alexc@tailscale.com>

* Revert "tsnet: test key extension after server restart"

This reverts commit 317201375f. This test
relies on changes in 317201375f, which is
also being reverted because it causes test failures in corp.

Updates tailscale/corp#43705
Updates #19326

Signed-off-by: Alex Chan <alexc@tailscale.com>

---------

Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-06-25 15:24:12 -07:00
Tom Meadows
6e1de5b651 cmd/containerboot: refresh DNS config on SelfChange (#20236)
364b952d6 switched containerboot to partial netmap fetching, but
stopped refreshing `DNS.ExtraRecords`, so Tailscale Services created
after pod boot were invisible to resolveTailnetFQDN. To fix this, we watch
for SelfChange ipn bus notifies, and refetch dns-config via LocalAPI
to get a fresh set of `DNS.ExtraRecords`.

Fixes #20233

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
2026-06-25 14:50:25 +01:00
Alex Chan
9f92a4728e util/cmpver: add a test for comparing three-digit versions
No code changes needed; this is to rule out cmpver as the source of any
version-comparison issues.

Updates #20238

Change-Id: Ib8765dd042e994549d9e2c03859a5f769a856704
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-06-25 10:02:50 +01:00
Brad Fitzpatrick
dd1df38200 ipn/ipnlocal: pass capability set, not netmap, to two helpers
setWebClientAtomicBoolLocked and setDebugLogsByCapabilityLocked
each only need the node capabilities to decide what to do, so
take a set.Set[tailcfg.NodeCapability] directly as part of
getting rid of netmap.NetworkMap.

Updates #12542

Change-Id: If7c30b6354fd42dfe82ed6d2e2fe3439de401315
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-24 16:08:33 -07:00
Brad Fitzpatrick
87cb2a8d1e wgengine: replace Engine.SetNetworkMap with SetSelfNode
The engine only used the netmap to look up self addresses and the
self node's primary routes, so pass it the self node directly
rather than the whole netmap.

Updates #12542

Change-Id: I13c0028eed65d2177baf4cf6c449f5e441845a18
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-24 15:03:55 -07:00
Michael Ben-Ami
1b2062f3c1 net/tstun: invoke conn25 app connector hook on injected reads
The primary purpose is that return packets from the target app get
properly SNATed on connectors with --tun=userspace-networking, matching
the NAT behavior in the kernel tun path.

This is also necessary but not sufficient for clients of connectors in
userspace networking mode. The hook will DNAT MagicIPs, but won't
actually be sent MagicIPs until conn25 app connector DNS works with
userspace networking.

Fixes tailscale/corp#43201

Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
2026-06-24 16:59:58 -04:00
Brendan Creane
77d2c87b17 wgengine/router/osrouter,util/linuxfw: remove orphaned tailnet addrs (#20199)
Router.Set reconciled tailscale0's addresses only against the in-memory
r.addrs map, which starts empty each run. After a restart the kernel can
still hold the addresses a previous profile put on tailscale0. With no
record of them, Set never removed them, leaving two tailnets' CGNAT
addresses on the interface. That broke connectivity, because the kernel
could source traffic from the wrong IP.

Fix this by scanning the addresses actually on the interface and, after
reconciling the desired set, removing any in Tailscale's CGNAT/ULA ranges
that aren't in the config. Non-Tailscale addresses are never touched,
and IPv6 addresses are skipped when IPv6 is unavailable, since delAddress
no-ops there. To avoid a netlink dump on every Set, the scan runs only on
the first Set and when the desired address set changes.
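
The reconciliation step described above could be sketched roughly as follows. This is a minimal illustration, not the code from osrouter: orphanedAddrs and mustPrefixes are hypothetical names, the sketch assumes 100.64.0.0/10 and fd7a:115c:a1e0::/48 as the Tailscale ranges, and it leaves out the netlink scan, the IPv6-availability check, and the "only on first Set or on change" gating.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Address ranges assumed here to be the ones Tailscale assigns node addresses from.
var (
	cgnatRange = netip.MustParsePrefix("100.64.0.0/10")
	ulaRange   = netip.MustParsePrefix("fd7a:115c:a1e0::/48")
)

// orphanedAddrs (hypothetical helper) returns the prefixes found on the
// interface that sit in a Tailscale range but are missing from the desired
// set. Addresses outside those ranges are never reported, so a caller that
// deletes the result never touches non-Tailscale addresses.
func orphanedAddrs(onInterface, desired []netip.Prefix) []netip.Prefix {
	want := make(map[netip.Prefix]bool, len(desired))
	for _, p := range desired {
		want[p] = true
	}
	var orphans []netip.Prefix
	for _, p := range onInterface {
		inTailscaleRange := cgnatRange.Contains(p.Addr()) || ulaRange.Contains(p.Addr())
		if inTailscaleRange && !want[p] {
			orphans = append(orphans, p)
		}
	}
	return orphans
}

// mustPrefixes parses each string as a prefix, panicking on bad input (demo only).
func mustPrefixes(ss ...string) []netip.Prefix {
	out := make([]netip.Prefix, 0, len(ss))
	for _, s := range ss {
		out = append(out, netip.MustParsePrefix(s))
	}
	return out
}

func main() {
	// What a scan of the interface might return after a restart.
	onInterface := mustPrefixes("100.64.0.1/32", "100.100.5.5/32", "192.168.1.10/24", "fd7a:115c:a1e0::1/128")
	// What the current profile wants configured.
	desired := mustPrefixes("100.64.0.1/32")
	fmt.Println(orphanedAddrs(onInterface, desired))
}
```

In this sketch the 192.168.1.10/24 address is ignored because it is outside both ranges, which mirrors the "non-Tailscale addresses are never touched" rule.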

This also needs the iptables DelLoopbackRule to tolerate a missing rule:
an orphan left by a previous instance never went through AddLoopbackRule
here, and iptables (unlike nftables) errors when deleting an absent
rule, which would otherwise block the address delete.

Fixes #19974

Signed-off-by: Brendan Creane <bcreane@gmail.com>
2026-06-24 13:41:36 -07:00
Patrick O'Doherty
453c078baf .github: add zizmor GitHub Actions linting (#20243)
Add zizmor GitHub Actions linting on changes to .github/workflows.

Updates tailscale/corp#28760

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2026-06-24 13:14:54 -07:00
Brad Fitzpatrick
aefb1531d1 net/tsdial, ipn/ipnlocal: stop using netmap.NetworkMap in Dialer
tsdial.Dialer.SetNetMap rebuilt an O(n peers) map of MagicDNS names on
every netmap change. As we move toward per-peer incremental deltas,
this becomes quadratic. This removes it and replaces it with
SetResolveMagicDNS, a callback into LocalBackend that looks up
hostnames from nodeBackend's new nodeByName index (populated alongside
nodeByAddr/nodeByKey on both full and delta paths). The index stores
both FQDNs and short names as keys.
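
As a rough illustration of an index keyed by both FQDN and short name, a sketch might look like this. The type and method names (nameIndex, add, lookup) are hypothetical, and the lowercase and trailing-dot normalization is an assumption of this sketch rather than something the commit states.

```go
package main

import (
	"fmt"
	"strings"
)

// nodeID is a stand-in for whatever identifier the real index stores.
type nodeID int

// nameIndex maps normalized MagicDNS names to node IDs.
type nameIndex map[string]nodeID

// normalize lowercases a name and drops a trailing dot (assumed convention).
func normalize(name string) string {
	return strings.ToLower(strings.TrimSuffix(name, "."))
}

// add records a node under both its FQDN and its short (first-label) name.
func (ix nameIndex) add(fqdn string, id nodeID) {
	fqdn = normalize(fqdn)
	if fqdn == "" {
		return
	}
	ix[fqdn] = id
	short, _, _ := strings.Cut(fqdn, ".")
	ix[short] = id
}

// lookup resolves either form of the name in a single map access.
func (ix nameIndex) lookup(host string) (nodeID, bool) {
	id, ok := ix[normalize(host)]
	return id, ok
}

func main() {
	ix := nameIndex{}
	ix.add("laptop.example.ts.net.", 7)
	fmt.Println(ix.lookup("laptop"))
	fmt.Println(ix.lookup("laptop.example.ts.net"))
	fmt.Println(ix.lookup("unknown"))
}
```

Because the index is updated per node, a lookup callback built on it does not need the O(n peers) rebuild on each netmap change.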

This is the same treatment applied to netlog (8f210454d), wglog
(988b0905b), and drive (1d6989408): stop pushing *netmap.NetworkMap
into subsystems and instead have them pull from LocalBackend's live
data via callbacks.

Updates #12542

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I24557ab0c8a27636e08e4779bcfd3ec633db0a78
2026-06-24 13:14:45 -07:00
Brad Fitzpatrick
8dde9b725b tstest/natlab/vmtest: serialize ensureDebugSSHKey across parallel boots
Env.Start boots all VM nodes in parallel; each calls
createCloudInitISO -> ensureDebugSSHKey concurrently. When
/tmp/vmtest_key doesn't yet exist, the first goroutine creates it
with os.WriteFile, which opens with O_CREATE|O_TRUNC and briefly
leaves the file existing-but-empty between the open and the
subsequent write. A concurrent goroutine that hits that window
sees ReadFile succeed with zero bytes, then fails ssh.ParsePrivateKey
with "ssh: no key found", causing boot to fail with:

  boot: creating cloud-init ISO: parse /tmp/vmtest_key: ssh: no key found

Observed in CI on TestSiteToSite (3 nodes). Wrap the function in
a package-level Mutex so the first caller fully writes the key
before any other caller reads it.
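
The serialization described above can be sketched like this. It is an illustration under assumptions, not the vmtest code: ensureKey, keyMu, and countBadReads are hypothetical names, and a placeholder byte string stands in for real SSH key generation.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"sync"
)

// keyMu serializes ensureKey so no caller can observe a half-written file.
var keyMu sync.Mutex

// ensureKey returns the contents of path, creating the file from generate()
// if it does not exist yet. Holding keyMu across the read and the write
// closes the window in which the file exists but is still empty.
func ensureKey(path string, generate func() []byte) ([]byte, error) {
	keyMu.Lock()
	defer keyMu.Unlock()

	b, err := os.ReadFile(path)
	if err == nil {
		return b, nil
	}
	if !errors.Is(err, fs.ErrNotExist) {
		return nil, err
	}
	b = generate()
	if err := os.WriteFile(path, b, 0o600); err != nil {
		return nil, err
	}
	return b, nil
}

// countBadReads starts n concurrent ensureKey callers against a fresh temp
// file and reports how many got an error or anything other than the full key.
func countBadReads(n int) (int, error) {
	dir, err := os.MkdirTemp("", "ensurekey")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "key")

	const key = "placeholder key material"
	var (
		wg  sync.WaitGroup
		mu  sync.Mutex
		bad int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			b, err := ensureKey(path, func() []byte { return []byte(key) })
			if err != nil || string(b) != key {
				mu.Lock()
				bad++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return bad, nil
}

func main() {
	bad, err := countBadReads(8)
	fmt.Println("bad reads:", bad, "err:", err)
}
```

A package-level mutex only protects callers within one process, which matches the parallel-goroutine case described in the commit.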

Updates #20228

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ie6399dcba0c397bb8041931d3de1c6063a11c568
2026-06-24 09:22:28 -07:00
Brad Fitzpatrick
0bc0cb8131 tstest/natlab/vmtest: retry SSHExec on transient SSH failures
Add a retry loop with BatchMode=yes to absorb the race window
between Env.Start() returning (when tta reports the tailscale
backend as Running) and cloud-init finishing the user/SSH-key
setup. In CI, the second VM's tta agent has been observed
connecting only a few hundred milliseconds before the test SSHes
in, which is inside the window where /root/.ssh/authorized_keys
hasn't fully landed yet. SSH key auth then fails and ssh(1) falls
back to interactive password prompts (3x), wasting time and
producing a confusing "Permission denied (publickey,password)"
error.

BatchMode=yes makes the client fail fast on auth failure instead
of prompting, and the retry loop handles SSH transport-level
errors (exit code 255) for up to 30 seconds with 500ms backoff.
Remote command non-zero exits still pass through unchanged.
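
A retry loop of this shape might look like the sketch below. The names retry, isSSHTransportErr, and the fake errors are hypothetical, and the demo uses a short timeout and backoff rather than the 30 seconds and 500ms mentioned above. The only ssh(1) behaviour relied on is that it exits with status 255 for its own errors.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"time"
)

// isSSHTransportErr reports whether err is a process exit with status 255,
// which ssh(1) uses for its own errors rather than the remote command's.
func isSSHTransportErr(err error) bool {
	var ee *exec.ExitError
	return errors.As(err, &ee) && ee.ExitCode() == 255
}

// retry calls run until it succeeds, returns a non-retryable error, or the
// next attempt would start after the timeout. Non-retryable errors (such as
// a remote command's own non-zero exit) are returned unchanged.
func retry(run func() error, retryable func(error) bool, timeout, backoff time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		err := run()
		if err == nil || !retryable(err) {
			return err
		}
		if time.Now().Add(backoff).After(deadline) {
			return err
		}
		time.Sleep(backoff)
	}
}

// Demo-only stand-ins so the sketch runs without an SSH server.
var (
	errFakeTransport = errors.New("fake transport failure")
	errFakeRemote    = errors.New("fake remote command failure")
)

const (
	demoTimeout = 2 * time.Second
	demoBackoff = 10 * time.Millisecond
)

func isFakeTransport(err error) bool { return errors.Is(err, errFakeTransport) }

func main() {
	attempts := 0
	err := retry(func() error {
		attempts++
		if attempts < 3 {
			return errFakeTransport // e.g. authorized_keys not in place yet
		}
		return nil
	}, isFakeTransport, demoTimeout, demoBackoff)
	fmt.Println("attempts:", attempts, "err:", err)
}
```

With a real command, run would wrap something like exec.Command("ssh", "-o", "BatchMode=yes", ...).Run() and isSSHTransportErr would be passed as the retryable predicate.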

Fixes #20228

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I17f7422e9e27bf7b995f505c0184cbb2b230ed81
2026-06-24 09:22:28 -07:00
Alex Chan
281404e9e3 cmd/tailscale/cli: fix capitalisation of flags
Most of our flag descriptions start with a lowercase word (except proper
nouns); fix the handful which do not.

Fixes #20230

Change-Id: I00aaac171254c050ad0b75c2cf8746590c8c4d8f
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-06-24 16:56:49 +01:00
Amal Bansode
c33a55737b ipn/ipnlocal: reduce excessive logging of exit node suggestions (#20237)
The logging added in 12188c0 was generating excessive spam in
backend logs. This may have been exacerbated by
tailscale GUI<->backend architecture on certain platforms like
Windows, where the GUI polls for exit node suggestions rather
than listening on the IPN bus.

Change this to log on error or if the current suggestion differs
from the previous suggestion.

Updates tailscale/corp#43691
Updates #20194

Signed-off-by: Amal Bansode <amal@tailscale.com>
2026-06-24 08:40:23 -07:00
Brad Fitzpatrick
d4f2917c1b wgengine, ipn/ipnlocal: route PeerForIP through LocalBackend's live data
userspaceEngine.PeerForIP read from e.netMap.Peers and
e.lastCfgFull.Peers, both of which go stale when peers arrive via
netmap deltas (which skip Engine.SetNetworkMap and Engine.Reconfig).
Every PeerForIP caller (Engine.Ping, the TSMP disco-key handler,
pendopen diagnostics, tsdial.Dialer.UseNetstackForIP, and
LocalBackend.GetPeerEndpointChanges) would report "no matching peer"
for freshly-added peers.

Fix it the same way SetPeerByIPPacketFunc fixed the outbound packet
hot path: have LocalBackend install a callback that reads the live
nodeBackend. nb.NodeByAddr is built from both SelfNode and Peers
(updateNodeByAddrLocked), so a single lookup covers the common case
with IsSelf set when the matched node ID is SelfNode's. The subnet-
route / exit-node-default-route slow path goes through a new
Engine.PeerKeyForIP that exposes the engine's AllowedIPs BART table
(the same table the outbound packet hot path already consults, with
exit-node selection honored), and resolves the matched key back to a
NodeView via the live nodeBackend.

Updates #12542

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I0d4b0d8997c8e796b7367c46b49b61d4fdc717b0
2026-06-23 14:37:15 -07:00
Brad Fitzpatrick
e9ae398199 wgengine: drop userspaceEngine.peerSequence
Another baby step toward removing slices of peers from the engine.

getStatus iterated peerSequence (a key snapshot built in Reconfig
from cfg.Peers) and then asked wgdev for each peer's stats; peers
that weren't active in wgdev silently fell out. Iterate active wgdev
peers directly via RemoveMatchingPeers(returnFalse) instead.

Updates #12542

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I3abd348abc30db706db29b3a785179259e48abda
2026-06-23 14:19:22 -07:00
Jordan Whited
badd0c4f93 wgengine/magicsock: consider VNI as part of peer relay handshake suppression
Otherwise we may never handshake a new peer relay server endpoint
around remote client restarts and/or disco key rotation.

Updates #20215

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-06-23 13:09:52 -07:00
James Tucker
b7422fa873 .gitattributes: explicitly mark text files as such with eol
I'm not keen on us having to deal with the bad side effects of the
autocrlf default, but alas, if it makes things easier.

Fixes #16175
Closes #16176

Signed-off-by: James Tucker <james@tailscale.com>
2026-06-23 13:04:07 -07:00
Brad Fitzpatrick
49e060bbcb wgengine: add Engine.ProbeLocks, drop PeerForIP lock-probe overload
The watchdog (ipn/ipnlocal/watchdog.go) was abusing PeerForIP with an
invalid netip.Addr as a way to acquire and release the engine's
internal locks for deadlock detection. This does the TODO to break it out
into its own method like all the other similarly named methods.

Splitting this out as a prerequisite for a follow-up rewrite of
PeerForIP itself; not having to preserve the lock-probe overload in
the new implementation keeps that follow-up smaller.

Updates #12542
Updates #cleanup

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I25cbffd11aeb65600d9128845404c4918ef88ead
2026-06-23 12:02:49 -07:00
Patrick O'Doherty
72876a91d5 .github: pin govulncheck@1.3.0 (#20219)
Pin govulncheck to resolve panics in the most recent version.

Updates #cleanup

Signed-off-by: Patrick O'Doherty <patrick@tailscale.com>
2026-06-23 11:51:46 -07:00
Brad Fitzpatrick
d22bf51e57 util/cloudenv: detect Hetzner Cloud
Detect Hetzner via /sys/class/dmi/id/sys_vendor == "Hetzner" and wire
up Hetzner's public recursive DNS resolvers (185.12.64.1, 185.12.64.2)
for use as a cloud host resolver.

Fixes #20217

Change-Id: I24a4c51956adfdd5731f62c937e3c7a4a733ffc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-23 11:10:59 -07:00
Brad Fitzpatrick
1d69894084 ipn/ipnlocal, drive: stop using netmap.NetworkMap in Taildrive too
This applies the same treatment from PR #20162 (netlog) and
PR #20171 (wglog) to the local Taildrive filesystem wiring, ending the
per-netmap-update O(n) rebuild of the drive remotes list.

This moves the O(n peers) taildrive-remote list rebuild from every
peer change (which previously happened regardless of whether you were
even using taildrive) to instead happen only as needed.

That rebuild ran on every netmap update and was a contributor to the
broader quadratic behavior we want to eliminate when a single peer is
added or removed.

Instead, this introduces drive.RemoteSource, a small interface the
Taildrive filesystem pulls from lazily on incoming WebDAV requests,
and caches by a generation counter. ipn/ipnlocal installs a
driveRemoteSource once at NewLocalBackend time and bumps
LocalBackend.driveGen on the three events that can actually flip the
drive-capable peer set: full netmap installs (domain + self caps),
UpdateNetmapDelta (peer add/remove or per-peer address changes), and
updatePacketFilter (since PeerCapability values are derived from the
packet filter rules, not from peer.CapMap).

The hook itself is kept but narrowed: it no longer takes a
*netmap.NetworkMap and its only remaining job is to re-notify IPN bus
listeners of the current local shares list on full installs.

This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.

(Also add a bunch more tests)

Updates #12542

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I7e3d2f5b4a9c8e1d6f0a3b7c9e2d4f8a1b6c5e9d
2026-06-23 10:41:50 -07:00
Brad Fitzpatrick
988b0905bb wgengine/wglog: stop using netmap.NetworkMap here too
This applies the same treatment from 8f210454dd (netlog) to wglog,
ending use of netmap.NetworkMap and instead getting the canonical data
from LocalBackend/nodeBackend.

This is a prerequisite for removing the netmap.NetworkMap from
upstream callers, like wgengine.Engine in general.

Updates #12542

Change-Id: Icb5af0799322def048a6f594b49f7d11273f025d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-23 09:06:37 -07:00
Brad Fitzpatrick
295bf20cfd prober: deflake TestHTTPBandwidth
The test transferred only 64 KiB over loopback, which can complete
within a single clock tick on fast CI machines, causing
time.Since(start).Seconds() to return 0 and the
"transfer_time_seconds_total > 0" assertion to fail.

Increase the payload to 1 MiB so zero is genuinely implausible, and
retry up to 3 additional times. If the metric is still zero after 4
total attempts, fail hard — at that size it means the timing logic is
actually broken.

Fixes #20213

Change-Id: I3fab510ce8c567506fea5ad803d35acf40d65700
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-23 08:35:57 -07:00
Brad Fitzpatrick
af2f228a18 ipn/ipnlocal, types/netmap, tsnet: filter unsigned peers on delta path
aa5da2e5f2 (in the 1.99.x dev series, unstable) introduced some bugs,
only some of which were later fixed. This fixes another. As of that
change, tkaFilterNetmapLocked ran only on full netmaps through
LocalBackend.setClientStatusLocked and not peer upserts via new or
changed peers. The later ae743642d9 fixed a regression in the
Engine layer but didn't make the tkaFilter code re-run on
upserts.

This adds a tkaFilterDeltaMutsLocked pass before
nodeBackend.UpdateNetmapDelta. For each NodeMutationUpsert whose
peer fails the same signature check tkaFilterNetmapLocked applies,
rewrite the upsert in place into a NodeMutationRemove targeting the
same node ID, so magicsock's per-mutation dispatch and
nodeBackend.peers both drop the peer, matching the prior full-netmap
semantics.
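
The in-place rewrite can be pictured with a simplified sketch. The mutation struct below is a hypothetical, flattened stand-in for the real mutation types, and sigValid stands in for the result of the signature check; none of these names come from the actual code.

```go
package main

import "fmt"

type mutationKind int

const (
	kindUpsert mutationKind = iota
	kindRemove
)

// mutation is a hypothetical, flattened stand-in for a netmap delta entry.
type mutation struct {
	kind     mutationKind
	nodeID   int
	sigValid bool // result of the signature check; only meaningful for upserts
}

// filterUnsigned rewrites, in place, every upsert whose peer fails the
// signature check into a remove for the same node ID. The slice length and
// ordering are unchanged, so later per-mutation dispatch sees one entry per
// original mutation.
func filterUnsigned(muts []mutation) {
	for i := range muts {
		if muts[i].kind == kindUpsert && !muts[i].sigValid {
			muts[i] = mutation{kind: kindRemove, nodeID: muts[i].nodeID}
		}
	}
}

func main() {
	muts := []mutation{
		{kind: kindUpsert, nodeID: 1, sigValid: true},
		{kind: kindUpsert, nodeID: 2, sigValid: false},
	}
	filterUnsigned(muts)
	fmt.Printf("%+v\n", muts)
}
```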

New tsnet tests added:

  - TestTailnetLockFiltersUnsignedDeltaPeer covers the new-peer
    case.
  - TestTailnetLockFiltersUnsignedDeltaPeerReplacement covers the
    existing-peer-replacement case, to an empty signature.
  - TestTailnetLockFiltersDeltaPeerWithInvalidSignature like above
    but with a bogus signature.

Updates #12542
Updates tailscale/corp#43767

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ib35d0391541fee654867c26489847dbc5b7e2ae8
2026-06-23 08:12:36 -07:00
tsushanth
0b551986fe cmd/k8s-operator: scope HA Service hostname check per-tailnet (#20114)
The ProxyGroup HA Service reconciler's validateService scanned every
Service in the cluster with shouldExpose=true for duplicate hostnames.
With multi-tailnet (Tailnet CRD) support, that scan reaches across
tailnet boundaries:

  * A Service exposed via the single-proxy path (tailscale.com/expose)
    on the primary tailnet would block a ProxyGroup ingress Service
    for the same hostname on a secondary tailnet, even though the two
    live in different reconcilers and different tailnet DNS namespaces.

  * Two ProxyGroups joined to different tailnets via spec.tailnet
    would also block one another for shared hostnames, again despite
    living in separate DNS namespaces.

In both cases the ProxyGroup ingress Service was silently dropped
(IngressSvcInvalid event raised, queue cleared, ConfigMap never
written, ProxyGroup never serves the backend).

This change tightens the check in two ways:

  * Skip Services that aren't themselves managed by the ProxyGroup
    reconciler (use isTailscaleService instead of shouldExpose).
  * For ProxyGroup-managed Services attached to a different
    ProxyGroup, look up that ProxyGroup and skip the duplicate
    report when spec.Tailnet differs from the current one. Fall
    through and flag the collision on lookup failure so genuine
    duplicates are not silently allowed.

Adds regression tests covering both the single-proxy and the
different-tailnet cases. Updates the existing TestValidateService
expected error to reflect the rephrased message.

Updates #20069

Signed-off-by: tsushanth <78000697+tsushanth@users.noreply.github.com>
2026-06-23 14:25:11 +01:00
Brad Fitzpatrick
d6c8702e90 tstest/natlab/vnet: deflake TestPacketSideEffects and TestProtocolQEMU
Both tests started flaking after my 910735448 ("tstest/natlab/vnet:
send unsolicited IPv6 Router Advertisements") added background RA
traffic on v6-enabled networks.

TestPacketSideEffects races the periodic unsolicited-RA goroutine
against its synchronous packet-count assertions: when the multicast
RA fires after the test has registered its sinks, both sinks receive
it and "got 1 packet, want N" becomes "got N+2".

TestProtocolQEMU's reader was doing raw Read on the SOCK_STREAM unix
socket and comparing the whole result to the expected length-prefixed
packet. The kernel is free to coalesce the on-register RA frame and
the test packet into one Read, in which case bytes.Equal fails and
the entire chunk (including the test packet's bytes) gets discarded
as "unexpected", leading to a 5s i/o timeout. Parse the QEMU uint32
length-prefix framing with io.ReadFull instead so we read exactly one
frame per iteration regardless of how the kernel buffers them. The
SOCK_DGRAM path (TestProtocolUnixDgram) keeps the original raw Read
since datagram boundaries are preserved.
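
Reading exactly one frame per iteration from a stream socket can be sketched as below. This is an illustration, not the test's code: readFrame, appendFrame, and coalescedStream are hypothetical names, the big-endian byte order for the uint32 length prefix is an assumption of the sketch, and the maxFrame limit is an added safety check.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// maxFrame bounds the allocation for a single frame (arbitrary demo limit).
const maxFrame = 1 << 16

// readFrame reads one length-prefixed frame: a 4-byte length (assumed
// big-endian) followed by that many payload bytes. io.ReadFull keeps reading
// until the buffer is full, so the result does not depend on how the kernel
// splits or coalesces the byte stream.
func readFrame(r io.Reader) ([]byte, error) {
	var hdr [4]byte
	if _, err := io.ReadFull(r, hdr[:]); err != nil {
		return nil, err
	}
	n := binary.BigEndian.Uint32(hdr[:])
	if n > maxFrame {
		return nil, fmt.Errorf("frame of %d bytes exceeds limit %d", n, maxFrame)
	}
	buf := make([]byte, n)
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// appendFrame appends payload to dst with its length prefix.
func appendFrame(dst, payload []byte) []byte {
	dst = binary.BigEndian.AppendUint32(dst, uint32(len(payload)))
	return append(dst, payload...)
}

// coalescedStream returns a reader in which all frames arrive back to back,
// as if the kernel had merged them into a single read.
func coalescedStream(payloads ...string) io.Reader {
	var buf []byte
	for _, p := range payloads {
		buf = appendFrame(buf, []byte(p))
	}
	return bytes.NewReader(buf)
}

func main() {
	r := coalescedStream("router-advert", "test-packet")
	for {
		frame, err := readFrame(r)
		if err != nil {
			break // io.EOF once the stream is drained
		}
		fmt.Printf("frame: %q\n", frame)
	}
}
```

A raw Read on the same stream could return both frames in one chunk, which is the failure mode the commit describes.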

These were the top two flakes in oss on the flakes dashboards.

Updates #13038

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I32983656b692921a0f43a4a5e9a8a6ab2555ee49
2026-06-23 05:40:30 -07:00
Brad Fitzpatrick
e0677ccc76 net/tstun, wgengine/filter: track UDP flow state for injected packets
Outbound packets produced by netstack (used by tailscaled with
--tun userspace-networking, by tsnet, and by the SOCKS5/HTTP proxies)
enter the wrapper via InjectOutbound{,PacketBuffer} and take the
injectedRead path, which bypasses Filter.RunOut.

RunOut's side effect for UDP/SCTP is to insert the reverse-flow tuple
into the connection-tracking LRU so that Filter.RunIn admits inbound
replies that no explicit ACL rule covers. Skipping it on the injected
path meant a netstack-side dial of UDP would send fine but the reply
would be dropped as "no matching rule". The kernel-TUN path was
already fine because it goes through RunOut.
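
The reverse-flow bookkeeping can be illustrated with a small sketch. It uses a plain map where the text above describes an LRU, and udpTracker, noteOutbound, admitInbound, and addrPort are hypothetical names rather than the filter package's API.

```go
package main

import (
	"fmt"
	"net/netip"
)

// flow identifies one direction of a UDP conversation.
type flow struct{ src, dst netip.AddrPort }

// udpTracker remembers which inbound flows are replies to outbound traffic.
// A plain map is used for brevity; it never evicts entries.
type udpTracker struct{ replies map[flow]bool }

func newUDPTracker() *udpTracker {
	return &udpTracker{replies: make(map[flow]bool)}
}

// noteOutbound records the reverse tuple of an outbound packet, so that a
// reply from dst back to src is recognized later.
func (t *udpTracker) noteOutbound(src, dst netip.AddrPort) {
	t.replies[flow{src: dst, dst: src}] = true
}

// admitInbound reports whether an inbound packet matches a recorded reply flow.
func (t *udpTracker) admitInbound(src, dst netip.AddrPort) bool {
	return t.replies[flow{src: src, dst: dst}]
}

// addrPort parses an ip:port string, panicking on bad input (demo only).
func addrPort(s string) netip.AddrPort { return netip.MustParseAddrPort(s) }

func main() {
	t := newUDPTracker()
	local, remote := addrPort("100.64.0.1:40000"), addrPort("100.64.0.2:53")

	// If the injected path skips this call, the reply below is not admitted.
	t.noteOutbound(local, remote)

	fmt.Println("reply admitted:", t.admitInbound(remote, local))
	fmt.Println("unrelated admitted:", t.admitInbound(addrPort("100.64.0.3:53"), local))
}
```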

Fixes #14229
Fixes #20064

Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I816ef55c493a12ff4f561cd89c095559b5c2743b
2026-06-22 15:57:37 -07:00
Alex Valiushko
568c0bda24 go.mod: bump wireguard-go (#20203)
Fix leaking peers that failed to complete the handshake.

Updates #20183

Change-Id: I84f7ea0484f05b090d963a7d12c135a66a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
2026-06-22 13:45:50 -07:00
Anton Tolchanov
e9e209673e net/netcheck: ensure recent history has a full report
suggestExitNodeLocked now ranks exit node candidates using the per-region
latency tracked by the netcheck Client (RecentRegionLatency), which merges
the reports retained in c.prev. That history is only useful for far-away
regions if it contains a full netcheck report, since incremental reports
only re-probe the home region and a handful of the fastest ones.

The full-report cadence in GetReport and the c.prev retention window were
two independent 5-min constants - the way we schedule netchecks ensured
that the history always contained a full report, but it was not a strong
contract and we did not have any checks around this.

Now full report interval and retention window are driven by the same
var, and a test confirms that the history contains a full report.

Updates tailscale/corp#17516

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2026-06-22 12:28:09 +02:00
Anton Tolchanov
f442cda999 ipn/ipnlocal: consider all DERP regions for exit node recommendations
When recommending an exit node, suggestExitNodeLocked ranks candidates by
the latency to their home DERP region, taken from the most recent netcheck
report. But netcheck alternates between full reports, which probe every
region, and incremental reports, which only re-probe the home region and a
handful of the fastest regions. When the most recent report is incremental,
the suggestion fell back to a random choice for exit nodes that are far away.

Now we rank candidates against the best recent latency, tracked by the
`netcheck.Client` - the same data that is used to pick the preferred
DERP. It uses a history of measurements which includes a full netcheck
report, so should cover all DERP regions.

Updates tailscale/corp#17516

Signed-off-by: Anton Tolchanov <anton@tailscale.com>
2026-06-22 12:28:09 +02:00
Samy Djemaï
6a275c01db util/linuxfw: clamp MSS to PMTU in both forward directions (#20077)
ClampMSSToPMTU only added a rule matching the output interface (-o tun /
OIFNAME), which clamps the SYN forwarded out towards the tailnet peer but
not the SYN-ACK that arrives on tun and is forwarded back towards the
originating endpoint. As a result only one side of a forwarded handshake
had its MSS clamped; the endpoint on the other side of the proxy kept
advertising an MSS based on its own (larger) MTU.

When path MTU discovery is broken (e.g. proxies created by the Tailscale
Kubernetes operator, where tailscale0 has a 1280 MTU), the unclamped
endpoint's large segments exceed the tun MTU and are silently dropped,
causing TCP connections through proxy group pods to stall mid-stream on
large payloads. The earlier proxy-group fix (#19686) wired ClampMSSToPMTU
into the HA code paths but inherited this single-direction limitation, so
connections could still hang.

Add a second rule matching the input interface (-i tun / IIFNAME) in both
the iptables and nftables runners so both directions of the forwarded
handshake negotiate a PMTU-safe MSS.

Updates #19812

Signed-off-by: Samy Djemaï <53857555+SamyDjemai@users.noreply.github.com>
2026-06-22 11:25:15 +01:00
Mike O'Driscoll
59159d9180 prober: add HTTP bandwidth probe and dial-address override (#20185)
Add HTTPBandwidth/HTTPBandwidthWithDialAddr probe classes that download a
fixed number of bytes and record transfer time and bytes transferred as
Prometheus counters for bandwidth measurement, plus HTTPWithDialAddr and
the shared NewProbeTransport and HTTPBandwidthMetrics helpers.

The dial-address override lets a probe target a specific backend (e.g. a
single Funnel ingress node) while SNI, the Host header, and TLS cert
validation continue to derive from the URL host. HTTPBandwidthMetrics is
exported so other bandwidth probes (e.g. a receiver-reported upload probe)
emit an identical metric set and compare under a shared direction label.

Updates tailscale/corp#41587

Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
2026-06-19 15:33:29 -04:00
License Updater
07f63534b1 licenses: update license notices
Signed-off-by: License Updater <noreply+license-updater@tailscale.com>
2026-06-19 09:45:02 -07:00
Gesa Stupperich
53ef7f92cb sessionrecording: close idle connections after upload
If we don't close the connection between SSH server and recorder
explicitly once it's idle after the upload stream is closed, the
connection stays open and holds on to a port on the server. This
leads to port exhaustion on the server in the medium to long run.

To avoid this, close the idle connections explicitly. As an extra
step of precaution, set an idleConnTimeout of 30 seconds on both
the HTTP1 and HTTP2 recorder clients.
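
For the net/http case, the two measures could look roughly like this. It is a sketch under assumptions: newRecorderClient and idleTimeout are hypothetical helpers, only the standard library's http.Transport is shown, and the HTTP2 client configuration is not covered.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// recorderIdleTimeout is the backstop after which idle connections are
// dropped even if nobody closes them explicitly.
const recorderIdleTimeout = 30 * time.Second

// newRecorderClient builds a client whose transport times out idle connections.
func newRecorderClient() *http.Client {
	return &http.Client{
		Transport: &http.Transport{IdleConnTimeout: recorderIdleTimeout},
	}
}

// idleTimeout reports the idle timeout configured on c's transport, or zero
// if the transport is not an *http.Transport.
func idleTimeout(c *http.Client) time.Duration {
	if tr, ok := c.Transport.(*http.Transport); ok {
		return tr.IdleConnTimeout
	}
	return 0
}

func main() {
	c := newRecorderClient()

	// ... upload the recording with c, then close the upload stream ...

	// Explicitly release any connection that is now idle instead of waiting
	// for the timeout, so the port is freed right away.
	c.CloseIdleConnections()

	fmt.Println("idle timeout:", idleTimeout(c))
}
```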

Updates tailscale/corp#43742

Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
2026-06-19 13:42:14 +01:00
Brendan Creane
0861dafddf net/dns: restore SELinux context on /etc/resolv.conf after rename (#20167)
In direct mode we write resolv.conf via a temp file and rename(2), which
preserves the source's generic etc_t label instead of net_conf_t, causing
AVC denials when NetworkManager later manages the file. Run restorecon
after the rename (Linux, SELinux-enforcing, best effort) to restore the
policy-default label.

Fixes #20149

Signed-off-by: Brendan Creane <bcreane@gmail.com>
2026-06-18 16:36:56 -07:00
Jordan Whited
54005752a5 wgengine/magicsock: suppress TSMP disco advert when bestAddr is peer relay
Updates #20156

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-06-18 11:43:00 -07:00
Simon Law
00b9e8d8ce ipn: add fmt.Stringer support to NotifyWatchOpt (#20072)
This patch adds support for the fmt.Stringer interface to the
ipn.NotifyWatchOpt enum. This is useful when debugging these bitmasks.

For example:

	fmt.Printf("%s", ipn.NotifyPeerChanges | ipn.NotifyNoNetMap)
	// Output: (ipn.NotifyPeerChanges | ipn.NotifyNoNetMap)
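
A Stringer for a bitmask type can be written along the following lines. This is a sketch, not the ipn implementation: watchOpt and its constants are hypothetical, and the formatting choices (parentheses only when several bits are set, hex for unknown bits, "0" for the zero value) are assumptions of the sketch.

```go
package main

import (
	"fmt"
	"strings"
)

// watchOpt is a hypothetical bitmask type used only for this illustration.
type watchOpt uint64

const (
	optInitialState watchOpt = 1 << iota
	optNoPrivateKeys
	optRateLimit
)

// watchOptNames lists the known bits in ascending order.
var watchOptNames = []struct {
	bit  watchOpt
	name string
}{
	{optInitialState, "optInitialState"},
	{optNoPrivateKeys, "optNoPrivateKeys"},
	{optRateLimit, "optRateLimit"},
}

// String implements fmt.Stringer by naming every set bit.
func (o watchOpt) String() string {
	var parts []string
	rest := o
	for _, n := range watchOptNames {
		if o&n.bit != 0 {
			parts = append(parts, n.name)
			rest &^= n.bit
		}
	}
	if rest != 0 {
		// Convert to uint64 so formatting does not call String recursively.
		parts = append(parts, fmt.Sprintf("%#x", uint64(rest)))
	}
	switch len(parts) {
	case 0:
		return "0"
	case 1:
		return parts[0]
	}
	return "(" + strings.Join(parts, " | ") + ")"
}

func main() {
	fmt.Printf("%s\n", optInitialState|optRateLimit)
	fmt.Printf("%s\n", optNoPrivateKeys)
}
```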

Fixes #20066

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-18 10:27:16 -07:00
Alex Chan
c3c2aa7093 all: don't repeat the the word "the" unnecessarily
Updates #cleanup

Change-Id: Ic1f430cd5dbf6cc1a385c59074a5d5cabe6fca57
Signed-off-by: Alex Chan <alexc@tailscale.com>
2026-06-18 16:32:08 +01:00
BeckyPauley
35a1a413f9 cmd/{containerboot,k8s-operator}: add 4via6 support in singleton egress (#19983)
Add support for configuring egress to destinations reachable via 4via6
subnet routes, using either the synthesized 4via6 address or the MagicDNS
name (in the form <IPv4-with-hyphens>-via-<siteID>[.*]).

Also update the Connector to validate and advertise 4via6 subnet routes.

Export net/netutil.ValidateViaPrefix so it can be reused by the Connector
validation logic.

This change only affects standalone egress proxies — ProxyGroup egress
requires IPv6 support before it can use 4via6.

Updates #19334

Change-Id: I6faecd6eb61ab55fc0cd97fe417af6b6a12fe7fc

Signed-off-by: Becky Pauley <becky@tailscale.com>
2026-06-18 16:13:10 +01:00
Simon Law
e3b16135b2 util/set: add iterator support to Set[T] (#20159)
This patch adds:

- Set.All which returns an iter.Seq to complement Set.Slice.

- Set.AddSeq which adds an iter.Seq.

- Set.DeleteSeq which deletes an iter.Seq to complement Set.AddSeq
  and provide the missing method for deleting multiple elements.

- Set.DeleteSlice and Set.DeleteSet to complement AddSlice and AddSet.

Updates #cleanup

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-18 00:12:56 -07:00
Jordan Whited
be2f554dd3 control/controlknobs,wgengine/magicsock: disable TSMP disco advert if netmap caching is disabled
Updates #20081

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2026-06-17 18:45:38 -07:00
Brad Fitzpatrick
8f210454dd wgengine/netlog: stop using netmap.NetworkMap type, use LocalBackend
The Logger previously took a *netmap.NetworkMap at Startup and on every
ReconfigNetworkMap call, denormalizing it into per-IP and self lookup
maps. That denormalization is O(n) over all peers and ran on every
netmap update, contributing to the broader quadratic behavior we want
to eliminate when a single peer is added or removed.

Instead, this makes netlog ask LocalBackend (well, nodeBackend) for
the info it needs, letting us remove the netmap.NetworkMap type
entirely from the netlog package.

This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.

Updates #12542

Change-Id: Ib5f2de96e788a667332c0a6f7ac833b3d0053b5c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-06-17 15:11:57 -07:00
Simon Law
994b2c8459 tsnet: fix tests that have a ping that races its destination node (#20151)
In PR #17809, @bradfitz tried to fix tsnet_test.TestConn by making the
second tailscaled start after the first was fully set up. On slow
runners, the Ping for connectivity to the second server would race
against that server establishing a connection with its DERP home. If
the Ping arrived too soon, the DERP server would respond with
PeerGoneNotHome and the Ping would wait for its full timeout before
failing the test.

This patch introduces waitForHomeDERPConnected and makes startServer
block until the server’s home DERP has established its connection.

This patch also reduces the Ping timeout to 10 seconds for the tsnet
tests, which should be enough that a hung Ping is fast enough for
interactive debugging, but with enough headroom for a RekeyTimeout.

Fixes #12766

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-17 14:26:05 -07:00
Naman Sood
47333e9487 feature/conn25: recreate transit IP mappings when connector loses them
Mappings from transit IPs to real IPs are stored ephemerally in the
connector, so they're lost on restart. When we send a packet to the
connector with a transit IP it does not recognize, it sends us a TSMP
message saying so (see #19883). If we (the client) know of such a
mapping, we now re-send it to the connector so that a connection can
proceed.

Fixes tailscale/corp#34256.

Signed-off-by: Naman Sood <mail@nsood.in>
2026-06-17 13:50:51 -04:00
Simon Law
88f5206511 types/geo: add support for ScalarMarshaler and ScalarUnmarshaler (#20158)
Add support for the still pending encoding.ScalarMarshaler and
encoding.ScalarUnmarshaler interfaces, approved in golang/go#56235.

This patch deprecates geo.Point.MarshalUint64 in favour of
geo.Point.MarshalScalar and also adds an inline directive for go fix.
The same applies for the UnmarshalUint64 and UnmarshalScalar methods.

Updates #16583

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-16 16:36:43 -07:00
Simon Law
f0a1aa818f tailcfg: fix typo in doc comment for tailcfg.Node.DisplayNames (#20155)
Updates #cleanup

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-16 10:23:44 -07:00
James Tucker
26b2ed0a6a net/packet: clarify minFragBlks reuse for IPv6 and test chained ext header
Follow-up cleanups to the IPv6 fragment extension header support added in
the previous commit:

- Document that minFragBlks is sized for IPv4 but intentionally reused by
  decode6 for IPv6 fragments, where it is conservative (IPv6 fragments
  carry no per-fragment IP header) and only ever rejects more later
  fragments as Unknown, never fewer.

- Add a TestDecode case for a first fragment reachable only through a
  chained extension header (base Next Header = Hop-by-Hop Options, which
  chains to Fragment). decode6 only parses the Fragment header when it is
  the base header's immediate Next Header, so this must classify as
  Unknown. The test locks in that scoping decision.

Updates #20083
Updates #20140

Change-Id: Ibece03c6baf2385b0cc399f179819b08cbe921cc
Signed-off-by: James Tucker <james@tailscale.com>
2026-06-16 10:16:06 -07:00
Bobi Gunardi
ca20611d11 util: add parse fallback helpers (#20022)
util/def: add def.Bool and def.Duration default parse helpers

Replace multiple instances of def.Bool and def.Duration with a new util/def
package.
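
Parse-with-fallback helpers of this kind might look like the sketch below. The function names and signatures (boolOr, durationOr) are guesses made for illustration; the commit message does not show the actual util/def API.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// boolOr parses s as a bool and returns def if s is empty or malformed.
// Hypothetical helper; the real def.Bool signature may differ.
func boolOr(s string, def bool) bool {
	v, err := strconv.ParseBool(s)
	if err != nil {
		return def
	}
	return v
}

// durationOr parses s as a duration and returns def if s is empty or malformed.
// Hypothetical helper; the real def.Duration signature may differ.
func durationOr(s string, def time.Duration) time.Duration {
	v, err := time.ParseDuration(s)
	if err != nil {
		return def
	}
	return v
}

func main() {
	fmt.Println(boolOr("true", false), boolOr("not-a-bool", true))
	fmt.Println(durationOr("90s", time.Minute), durationOr("", time.Minute))
}
```

The point of such helpers is to replace the repeated "parse, check err, fall back" pattern at call sites that read optional settings.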

Updates #20018

Co-authored-by: Bobby <boby@codelabs.co.id>
Co-authored-by: Simon Law <sfllaw@tailscale.com>
Signed-off-by: Bobby <boby@codelabs.co.id>
Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-15 15:58:51 -07:00
James Scott
94fbb03352 logtail: add stateless generic UploadLogs (#20005)
Add UploadLogs, a stateless alternative to NewLogger for callers that
want to push a batch of log entries without the background uploader,
ring buffer, stderr echoing, or network-up gating that a Logger
provides. Entries are encoded, batched up to the server's maximum
upload size, and POSTed synchronously; unlike Logger it does not retry.

The Logger construction is split into a new unexported newLogger so the
connection/encode/upload machinery is shared without starting the
background goroutine.

Log entries are modeled as a generic LogEntry[T] whose Value is inlined
(via go-json-experiment) alongside the reserved "logtail" metadata
member. T may be a struct (or pointer), a map with a string key, or a
jsontext.Value; use jsontext.Value to mix differently-shaped payloads in
a single upload. UploadLogs fills in client_time/proc_id/proc_seq from
the Config where the caller leaves them zero.

Updates tailscale/corp#40908

Change-Id: Idbf23cd0eb8233082fbdb9abed0f6f153b9225ba

Signed-off-by: James Scott <jim@tailscale.com>
2026-06-15 13:27:49 -07:00
Simon Law
eddd019ee4 ipn/ipnlocal: protect populatePeerStatusLocked from nil Hostinfo (#20150)
ipnlocal.LocalBackend.populatePeerStatusLocked assumed that Hostinfo
was always valid, but that’s not always true, especially in tests.
ipnlocal.peerAPIPorts suffered from a similar assumption.

This patch checks for NodeView.Valid and Hostinfo.Valid; assuming the
zero value as a safe default.

Updates #8948
Updates #12542

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-06-15 13:14:12 -07:00