* Revert "control/controlclient: continue map poll during key expiry to receive extensions"
This reverts commit 6a822dcc36. This commit
has caused test failures in the corp repo by unexpectedly changing the login
behaviour when nodes have a valid node key.
Updates tailscale/corp#43705
Updates #19326
Signed-off-by: Alex Chan <alexc@tailscale.com>
* Revert "tsnet: test key extension after server restart"
This reverts commit 317201375f. This test
relies on changes in 6a822dcc36, which is
also being reverted because it causes test failures in corp.
Updates tailscale/corp#43705
Updates #19326
Signed-off-by: Alex Chan <alexc@tailscale.com>
---------
Signed-off-by: Alex Chan <alexc@tailscale.com>
tsdial.Dialer.SetNetMap rebuilt an O(n peers) map of MagicDNS names on
every netmap change. As we move toward per-peer incremental deltas,
this becomes quadratic. This removes it and replaces it with
SetResolveMagicDNS, a callback into LocalBackend that looks up
hostnames from nodeBackend's new nodeByName index (populated alongside
nodeByAddr/nodeByKey on both full and delta paths). The index stores
both FQDNs and short names as keys.
This is the same treatment applied to netlog (8f210454d), wglog
(988b0905b), and drive (1d6989408): stop pushing *netmap.NetworkMap
into subsystems and instead have them pull from LocalBackend's live
data via callbacks.
Updates #12542
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I24557ab0c8a27636e08e4779bcfd3ec633db0a78
Another baby step toward removing slices of peers from the engine.
getStatus iterated peerSequence (a key snapshot built in Reconfig
from cfg.Peers) and then asked wgdev for each peer's stats; peers
that weren't active in wgdev silently fell out. Iterate active wgdev
peers directly via RemoveMatchingPeers(returnFalse) instead.
Updates #12542
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I3abd348abc30db706db29b3a785179259e48abda
aa5da2e5f2 (in the 1.99.x dev series, unstable) introduced some bugs,
only some of which were later fixed. This fixes another. As of that
change, tkaFilterNetmapLocked ran only on full netmaps through
LocalBackend.setClientStatusLocked and not on peer upserts (new or
changed peers). The later ae743642d9 fixed a regression in the
Engine layer but didn't make the tkaFilter code re-run on
upserts.
This adds a tkaFilterDeltaMutsLocked pass before
nodeBackend.UpdateNetmapDelta. For each NodeMutationUpsert whose
peer fails the same signature check tkaFilterNetmapLocked applies,
rewrite the upsert in place into a NodeMutationRemove targeting the
same node ID, so magicsock's per-mutation dispatch and
nodeBackend.peers both drop the peer, matching the prior full-netmap
semantics.
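The in-place rewrite can be sketched roughly as below. The mutation
struct and the validSig callback are simplified, hypothetical stand-ins
and not the real NodeMutationUpsert/NodeMutationRemove types or the
real signature check.

```go
package main

import "fmt"

type nodeID int

// mutation is a hypothetical simplification of a netmap delta entry.
type mutation struct {
	remove bool   // true for a remove, false for an upsert
	id     nodeID // node the mutation targets
	sig    string // stand-in for the peer's key signature
}

// filterDeltaMuts rewrites, in place, every upsert whose signature fails
// validSig into a remove targeting the same node ID.
func filterDeltaMuts(muts []mutation, validSig func(string) bool) {
	for i, m := range muts {
		if !m.remove && !validSig(m.sig) {
			muts[i] = mutation{remove: true, id: m.id}
		}
	}
}

func main() {
	muts := []mutation{{id: 1, sig: "ok"}, {id: 2, sig: ""}, {id: 3, sig: "bogus"}}
	filterDeltaMuts(muts, func(s string) bool { return s == "ok" })
	for _, m := range muts {
		fmt.Printf("id=%d remove=%v\n", m.id, m.remove)
	}
}
```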
New tsnet tests added:
- TestTailnetLockFiltersUnsignedDeltaPeer covers the new-peer
case.
- TestTailnetLockFiltersUnsignedDeltaPeerReplacement covers the
existing-peer-replacement case, to an empty signature.
- TestTailnetLockFiltersDeltaPeerWithInvalidSignature is like the
above but with a bogus signature.
Updates #12542
Updates tailscale/corp#43767
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ib35d0391541fee654867c26489847dbc5b7e2ae8
Outbound packets produced by netstack (used by tailscaled with
--tun userspace-networking, by tsnet, and by the SOCKS5/HTTP proxies)
enter the wrapper via InjectOutbound{,PacketBuffer} and take the
injectedRead path, which bypasses Filter.RunOut.
RunOut's side effect for UDP/SCTP is to insert the reverse-flow tuple
into the connection-tracking LRU so that Filter.RunIn admits inbound
replies that no explicit ACL rule covers. Skipping it on the injected
path meant a netstack-side dial of UDP would send fine but the reply
would be dropped as "no matching rule". The kernel-TUN path was
already fine because it goes through RunOut.
Fixes #14229
Fixes #20064
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I816ef55c493a12ff4f561cd89c095559b5c2743b
When recommending an exit node, suggestExitNodeLocked ranks candidates by
the latency to their home DERP region, taken from the most recent netcheck
report. But netcheck alternates between full reports, which probe every
region, and incremental reports, which only re-probe the home region and a
handful of the fastest regions. When the most recent report was incremental,
the suggestion fell back to a random choice for exit nodes that are far away.
Now we rank candidates against the best recent latency, tracked by the
`netcheck.Client` - the same data that is used to pick the preferred
DERP. It uses a history of measurements which includes a full netcheck
report, so it should cover all DERP regions.
Updates tailscale/corp#17516
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
The Logger previously took a *netmap.NetworkMap at Startup and on every
ReconfigNetworkMap call, denormalizing it into per-IP and self lookup
maps. That denormalization is O(n) over all peers and ran on every
netmap update, contributing to the broader quadratic behavior we want
to eliminate when a single peer is added or removed.
Instead, this makes netlog ask LocalBackend (well, nodeBackend) for
the info it needs, letting us remove the netmap.NetworkMap type
entirely from the netlog package.
This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.
Updates #12542
Change-Id: Ib5f2de96e788a667332c0a6f7ac833b3d0053b5c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In PR #17809, @bradfitz tried to fix tsnet_test.TestConn by making the
second tailscaled start after the first was fully set up. On slow
runners, the Ping for connectivity to the second server would race
against that server establishing a connection with its DERP home. If
the Ping arrived too soon, the DERP server would respond with
PeerGoneNotHome and the Ping would wait for its full timeout before
failing the test.
This patch introduces waitForHomeDERPConnected and makes startServer
block until the server’s home DERP has established its connection.
This patch also reduces the Ping timeout to 10 seconds for the tsnet
tests, which should be short enough that a hung Ping fails quickly during
interactive debugging, but leaves enough headroom for a RekeyTimeout.
Fixes #12766
Signed-off-by: Simon Law <sfllaw@tailscale.com>
util/def: add def.Bool and def.Duration default parse helpers
Replace multiple instances of def.Bool and def.Duration with a new util/def
package.
Updates #20018
Co-authored-by: Bobby <boby@codelabs.co.id>
Co-authored-by: Simon Law <sfllaw@tailscale.com>
Signed-off-by: Bobby <boby@codelabs.co.id>
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The earlier aa5da2e5f2 made peer adds and removes go through a netmap
delta path that mutates only nodeBackend, on the assumption that
PeerForIP, lookupPeerByIP, the engine's wireguard config
(e.lastCfgFull), the engine BART, wgdev's PeerLookupFunc closure, and
the engine's cached netmap (e.netMap) would all stay correct without
further updates. They don't. I'd totally forgotten that
Engine.PeerForIP has its own alternate IP-to-peer lookup codepath.
Concretely, all of these failed for a peer that arrived via
[tailcfg.MapResponse.PeersChanged] (and never via a full
[tailcfg.MapResponse.Peers] list):
- [wgengine.Engine.PeerForIP] read from e.netMap and e.lastCfgFull
(neither updated on the delta path) and so missed the new
peer. The rando non-data-plane callers (Ping, TSMP, pendopen,
debug endpoints, tsdial.Dialer.UseNetstackForIP for tsnet and
onlyNetstack tailscaled) all returned "no matching peer".
- The engine BART (built from e.lastCfgFull) missed the new peer's
subnet routes / exit-node default routes.
- wgdev's [device.PeerLookupFunc] closure (rebuilt only inside
wgcfg.ReconfigDevice) didn't have the new peer's noise key, so
outbound encryption to the new peer dropped the packet even when
SetPeerByIPPacketFunc returned the right NodePublic.
- And nothing in the delta path triggered NodeMutationRemove to
flow through to authReconfig either, so the same stale state
pointed at removed peers indefinitely.
So just (functionally) revert it for now, to have something easily
cherry-pickable to the 1.100 release branch. Proper fixes can come later
for the next release.
This also adds three new tests:
- TestPingPeerLearnedViaDelta runs disco and TSMP subtests over a
delta-added peer with only self addresses. disco exercises the
cold PeerForIP path (magicsock); TSMP exercises the full data path
through wgdev encryption. Both fail without this fix.
- TestPingSubnetRouteOfDeltaPeer exercises a subnet-router peer
arriving via delta. With s1 in --accept-routes mode, an IP
inside the advertised CIDR must resolve to s2 and a TSMP ping
must round-trip. Hits the BART + lastCfgFull + wgdev staleness
in one go.
- TestPingSelfReturnsIsLocalIP is a regression guard for the
IsSelf early-out in Engine.Ping. Passes on main today; included
here so future refactors of PeerForIP can't regress self
handling without test breakage.
Updates tailscale/corp#43394
Change-Id: I7a049271359bd73e7147ae9e2554e85614c2b8d2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
tsnet depends on logpolicy, which in turn depended on util/syspolicy
because of a single LogTarget policy setting it uses.
In this commit, we replace that dependency with a feature.Hook,
which only tailscaled or its platform-specific alternatives should set.
Updates #20031
Signed-off-by: Nick Khyl <nickk@tailscale.com>
Introduce a new `tailscale routecheck` command which prints a report
of high-availability routers that are reachable.
This command rhymes with the `tailscale netcheck` command, but
instead of reporting on local network conditions, `routecheck` reports
on remote connectivity.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
In order to support a `tailscale routecheck` command, we introduce the
`/localapi/v0/routecheck` endpoint to the local API. This endpoint
returns the most recent report collected by the routecheck client.
If `force=true` is an argument in the query string, then this endpoint
will actively probe before returning the report.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This adds tsnet.Server.ListenSSH which, if the SSH feature is linked,
returns a net.Listener whose Accept yields *tailssh.Session values (as
net.Conn). This lets tsnet apps accept incoming SSH connections to
implement custom TUI applications.
Basic apps can use net.Conn directly (Read/Write/Close). Rich apps
import ssh/tailssh and type-assert for peer identity, PTY, signals,
etc. If feature/ssh isn't imported, ListenSSH returns an error.
Includes a demo guess-the-number game in tsnet/example/ssh-game.
Updates tailscale/corp#37839
Change-Id: I4e7c3c96afb030cdf4da8f2d8b2253820628129a
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
If we dispatch a ping too early (after a later patch removes a 250ms
blockage) then the ping may be lost due to the peers not yet knowing
about each other. The ping is retained in order to set up and ensure a
wireguard session prior to test flow.
Updates #19822
Change-Id: I6cfea28931646a9387b6ffc2654e72cd846f4e55
Signed-off-by: James Tucker <james@tailscale.com>
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously, sharding required tests to opt in by calling tstest.Shard,
which used a process-global counter to assign each test to a shard.
This had two problems: most tests didn't call it, so they ran on every
shard (defeating the purpose), and shard assignments were unstable
(depended on call order, so adding a test could reshuffle others).
Remove tstest.Shard and tstest.SkipOnUnshardedCI entirely. Instead,
have testwrapper implement sharding automatically for all tests: when
TS_TEST_SHARD=N/M is set, it uses "go list -json" (no compilation) to
find test source files, scans them for top-level Test/Benchmark/
Example/Fuzz function names, and filters by fnv32a(name) % M == N-1.
The filtered names are passed as an anchored -run regex to go test.
Using go list instead of "go test -list" avoids linking the test binary
twice (Go's build cache does not cache test binary linking).
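The selection rule can be sketched as follows. The function names
inShard and runRegex are hypothetical and this is not testwrapper's
actual code; it only shows fnv32a(name) % M == N-1 plus an anchored
-run pattern, using the standard hash/fnv package.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"regexp"
	"strings"
)

// inShard reports whether name belongs to 1-based shard n of m,
// using fnv32a(name) % m == n-1.
func inShard(name string, n, m uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(name))
	return h.Sum32()%m == n-1
}

// runRegex builds an anchored -run pattern for the tests in shard n of m.
// With no matching names it yields "^()$", which matches no real test.
func runRegex(names []string, n, m uint32) string {
	var keep []string
	for _, name := range names {
		if inShard(name, n, m) {
			keep = append(keep, regexp.QuoteMeta(name))
		}
	}
	return "^(" + strings.Join(keep, "|") + ")$"
}

func main() {
	names := []string{"TestConn", "TestDial", "TestListenService", "BenchmarkFoo"}
	for n := uint32(1); n <= 2; n++ {
		fmt.Printf("shard %d/2: %s\n", n, runRegex(names, n, 2))
	}
}
```

Because the assignment depends only on the test name, adding or
removing a test does not move any other test to a different shard.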
Fixes #19886
Change-Id: I62ab7b3d757324d4c5fd0b5de50c1e3742681791
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In PR tailscale/corp#30448, we originally decided to break ties using
SHA256 for our rendezvous hashing algorithm. Now that we’ve had some
experience with it, we think that FNV-1a is a better choice. It
distributes bits evenly, it’s much faster, and it doesn’t need to be
cryptographically secure. The FNV designers recommend FNV-1a over the
deprecated FNV-1.
This PR makes the switch and updates the related tests, since changing
the algorithm changes which stable pick gets selected. As of 2026-05,
this is the best time to make this change, since there are almost no
clients in the wild with traffic steering enabled.
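For readers unfamiliar with rendezvous hashing, a generic sketch is
below. It is not the traffic-steering code: the score and pick
functions, the 64-bit hash width, and the string inputs are all
assumptions made for illustration. Each (client, candidate) pair is
hashed with FNV-1a and the highest score wins.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// score hashes the (client, candidate) pair with 64-bit FNV-1a.
func score(client, candidate string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(client))
	h.Write([]byte{0}) // separator so "ab"+"c" differs from "a"+"bc"
	h.Write([]byte(candidate))
	return h.Sum64()
}

// pick returns the candidate with the highest score. Barring hash
// collisions, the result does not depend on the order of candidates.
func pick(client string, candidates []string) string {
	var best string
	var bestScore uint64
	for i, c := range candidates {
		if s := score(client, c); i == 0 || s > bestScore {
			best, bestScore = c, s
		}
	}
	return best
}

func main() {
	fmt.Println(pick("client-a", []string{"exit-1", "exit-2", "exit-3"}))
}
```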
Updates #17366
Updates tailscale/corp#29964
Updates tailscale/corp#29966
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
For large tailnets (~50k+ nodes) with frequent peer churn (ephemeral
GitHub Actions workers etc.), tailscaled used to rebuild the full
netmap and fan it out on the IPN bus on every MapResponse that
added or removed a peer. There were two O(N) costs per delta: the
full netmap rebuild + every Notify.NetMap encode to every bus watcher.
This change tackles both:
1. Plumb O(1) peer add/remove through the delta path. PeersChanged
and PeersRemoved no longer prevent the delta happy path; instead,
they mutate the per-node-backend peer map in place.
2. Restrict ipn.Notify.NetMap emission to the platforms whose host
GUIs still depend on it (Windows, macOS, iOS) and migrate
in-tree consumers off it everywhere else:
- Migrate reactive consumers (containerboot, kube agents,
sniproxy, tsconsensus, etc.) off Notify.NetMap to the
previously-added Notify.SelfChange signal so they no longer
have to subscribe to the full netmap.
- Add ipn.NotifyNoNetMap so GUI clients on "legacy-emit" platforms
that have already migrated can opt out of the per-watcher
NetMap encode.
- Gate Notify.NetMap emission on the producer side by a compile-
time GOOS check, so the supporting code is dead-code-eliminated
on Linux and other geese where no GUI consumer needs it.
Re-running BenchmarkGiantTailnet from tstest/largetailnet, which was
added along with baseline numbers on unmodified main in ad5436af0d,
the per-delta cost (one peer add+remove pair) is now ~O(1) regardless
of tailnet size N:
     N      no-watcher (ms/op)       bus-watcher (ms/op)
          before   now   factor    before   now   factor
 10000        32   0.11     300x      166   0.13    1300x
 50000       222   0.11    2000x      865   0.13    6700x
100000       504   0.12    4100x     1765   0.13   13400x
250000      1551   0.12   12500x     4696   0.15   32400x
Updates #12542
Change-Id: I94e34b37331d1a8ec74c299deffadf4d061fda9e
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The traffic package contains helpers for evaluating traffic steering
scores and picking appropriate nodes. These were extracted from
ipnlocal.suggestExitNodeUsingTrafficSteering so they can be reused by
the new routecheck package to probe exit nodes in priority order.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The tailscale.com/wif package brings in the AWS SDK
(github.com/aws/aws-sdk-go-v2/{config,sts,...} and github.com/aws/smithy-go)
to support fetching ID tokens from AWS IMDS for workload identity
federation. Until now, tsnet pulled this in unconditionally via
feature/condregister/identityfederation, costing ~70 unwanted deps for
every tsnet program whether or not it uses workload identity federation.
These AWS SDK deps were originally removed from tsnet on 2025-09-29 by
commit 69c79cb9f ("ipn/store, feature/condregister: move AWS + Kube
store registration to condregister"). They were then accidentally added
back on 2026-01-14 by commit 6a6aa805d ("cmd,feature: add identity
token auto generation for workload identity", PR #18373) when the new
wif package was wired into tsnet via feature/identityfederation.
Drop the blanket import. tsnet programs that want workload identity
federation now opt in with:
import _ "tailscale.com/feature/identityfederation"
The hook lookup in resolveAuthKey already uses GetOk and degrades
gracefully when the feature isn't linked, so existing programs that
don't use workload identity federation see no behavior change. The
tailscale CLI still imports the condregister wrapper directly, so its
behavior is also unchanged.
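The "set by a feature package, looked up with GetOk" pattern can be
sketched like this. The hook type, exchangeHook variable and the
resolveAuthKey signature here are simplified, hypothetical versions
and not the real feature.Hook API or tsnet code.

```go
package main

import "fmt"

// hook is a hypothetical, simplified optional-function slot.
type hook[F any] struct {
	f  F
	ok bool
}

// Set installs f; a feature package would call this from an init func.
func (h *hook[F]) Set(f F) { h.f, h.ok = f, true }

// GetOk returns the installed function and whether one was set.
func (h *hook[F]) GetOk() (F, bool) { return h.f, h.ok }

// exchangeHook stays unset unless the feature package is linked in.
var exchangeHook hook[func(idToken string) string]

// resolveAuthKey falls back to the static key when the hook is unset.
func resolveAuthKey(authKey, idToken string) string {
	if f, ok := exchangeHook.GetOk(); ok && idToken != "" {
		return f(idToken)
	}
	return authKey
}

func main() {
	fmt.Println(resolveAuthKey("tskey-static", "jwt")) // hook unset: static key
	exchangeHook.Set(func(idToken string) string { return "exchanged:" + idToken })
	fmt.Println(resolveAuthKey("tskey-static", "jwt")) // hook set: exchanged key
}
```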
Lock this in with TestDeps additions: tailscale.com/wif as a BadDep,
plus substring checks in OnDep that fail on any github.com/aws/ or
k8s.io/ dependency creeping back in.
Also, switch cmd/gitops-pusher from the condregister wrapper to a
direct import of feature/identityfederation: gitops-pusher's auth flow
calls HookExchangeJWTForTokenViaWIF directly, so it shouldn't be
subject to the ts_omit_identityfederation build tag.
Updates #12614
Change-Id: I70599f2bdd4d3666b26a859d5b76caa5d6b94507
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Commit 69c79cb9f (Sep 2025) moved awsstore and kubestore registration
behind condregister build tags so tsnet wouldn't pull in the AWS SDK
and Kubernetes client by default. The accompanying TestDeps BadDeps
entry was missed, so PR #19667 (which re-added those imports) wasn't
caught by the test.
Add the two packages to BadDeps so future regressions fail the test.
Updates #19667
Updates #12614
Change-Id: I903b7c976e5e122cc0c0b956dc73740f5d474fac
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
There are only a couple endpoints that check peer capabilities. Keeping
permission checks with the code that assumes they were performed, rather
than with the routing layer, feels easier to reason about.
Check that the caller is actually a peer and pass their capabilities via
a context value for handlers that want to check them.
Along with this, simplify the helper handler wrappers that are not
needed for most of the endpoints.
Updates #40851
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Add two narrower accessors alongside the existing
[LocalBackend.NetMap], with docs that distinguish their semantics:
- NetMapNoPeers: cheap (returns the cached *netmap.NetworkMap with
a possibly-stale Peers slice). For callers that only read non-Peers
fields like SelfNode, DNS, PacketFilter, capabilities.
- NetMapWithPeers: documented as returning an up-to-date Peers slice.
For callers that genuinely need to iterate Peers or call
PeerByXxx.
Mark the existing NetMap deprecated and point readers at the two new
accessors. NetMap, NetMapNoPeers, and NetMapWithPeers all currently
return the same value (b.currentNode().NetMap()): this commit is a
no-op behaviorally, just a renaming and migration of in-tree callers.
A subsequent change in the same series will switch
NetMapWithPeers to actually rebuild the Peers slice from the live
per-node-backend peers map (O(N) per call), at which point the
distinction between the two new accessors becomes load-bearing.
Migrate in-tree callers to the appropriate accessor based on what
fields they read:
- NetMapNoPeers (most common): localapi handlers, peerapi accept,
GetCertPEMWithValidity, web client noise request, doctor DNS
resolver check, tsnet CertDomains/TailscaleIPs, ssh/tailssh
SSH-policy/cap reads, several LocalBackend internals
(isLocalIP, allowExitNodeDNSProxyToServeName, pauseForNetwork
nil-check, serve config).
- NetMapWithPeers: writeNetmapToDiskLocked (persist full netmap to
disk for fast restart), PeerByTailscaleIP lookup.
Tests still call the legacy NetMap; they'll see the deprecation
warning but otherwise behave identically.
Also add two pieces of plumbing the next change in this series will
need, but which are already useful on their own:
- [client/local.GetDebugResultJSON]: a generic [Client.DebugResultJSON]
that decodes directly into a target type T, avoiding the
marshal/unmarshal roundtrip callers otherwise need.
- localapi "current-netmap" debug action: returns the current
netmap (with peers) as JSON. Documented as debug-only — the
netmap.NetworkMap shape is internal and may change without notice.
This commit is part of a series breaking up a larger change for
review; on its own it is a no-op refactor.
Updates #12542
Change-Id: Idbb30707414f8da3149c44ca0273262708375b02
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a Go benchmark that exercises a single tailnet client (a [tsnet.Server]
running in the test process) against a synthetic large initial netmap and
a stream of caller-driven peer add/remove deltas, all in-process.
The harness is split in two parts:
- tstest/largetailnet, a reusable package containing a [Streamer]
that hijacks the map long-poll on a [testcontrol.Server] via the new
AltMapStream hook, sends one initial MapResponse with N synthetic
peers, and forwards caller-supplied delta MapResponses on the same
stream. Helpers like MakePeer / AllocPeer build synthetic peers with
unique IDs and addresses derived from the Tailscale ULA range.
- tstest/largetailnet/largetailnet_test.go, BenchmarkGiantTailnet
(headless tailscaled workload, no IPN bus subscriber) and
BenchmarkGiantTailnetBusWatcher (GUI-client workload with one
Notify subscriber attached). Both are gated on
--actually-test-giant-tailnet (skipped by default), stand up an
in-process testcontrol + tsnet.Server, let Up block until the
initial N-peer netmap has been processed, then ResetTimer and run
add+remove pairs via b.Loop. Per-delta sync is via a test-only
[ipnlocal.LocalBackend.AwaitNodeKeyForTest] channel that closes
once the just-added peer key appears in the netmap (no-watcher
variant) or via bus-Notify drain (bus-watcher variant).
To support the hijack, [testcontrol.Server] grows an AltMapStream hook
and a small MapStreamWriter interface for benchmarks/stress tests that
need to drive a controlled MapResponse sequence; the normal serveMap
path is untouched when AltMapStream is nil. The streamer answers
non-streaming "lite" map polls (which controlclient issues before the
streaming long-poll to push HostInfo) with an empty MapResponse and
returns immediately, so the streaming poll that follows is the one
that gets the initial netmap.
The benchmark is intended for before/after comparisons of netmap- and
delta-handling changes targeted at large tailnets. CPU profiles on
unmodified main show the expected O(N) hotspots:
setControlClientStatusLocked / authReconfigLocked /
userspaceEngine.Reconfig / setNetMapLocked, plus JSON encoding of the
full Notify.NetMap to bus watchers (which dominates the BusWatcher
variant).
Median ms/op over 10 runs on unmodified main, by tailnet size N:
     N   no-watcher   bus-watcher
 10000           32           166
 50000          222           865
100000          504          1765
250000         1551          4696
Recommended invocation:
    go test ./tstest/largetailnet/ -run=^$ \
        -bench='BenchmarkGiantTailnet(BusWatcher)?$' \
        -benchtime=2000x -timeout=10m \
        --actually-test-giant-tailnet \
        --giant-tailnet-n=250000 \
        -cpuprofile=/tmp/giant.cpu.pprof
Updates #12542
Change-Id: I4f5b2bb271a36ba853d5a0ffe82054ef2b15c585
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This drops an indirect dependency on the old github.com/docker/docker
(which was replaced with github.com/moby/moby) and fixes a couple recent
CVEs.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
Adds a CI check to keep opted-in directories' README.md files in sync
with their package godoc. For now tsnet (and its sub-packages under
tsnet/example) is the only opted-in tree. The list of directories
lives in misc/genreadme/genreadme.go as defaultRoots, so CI and humans
both just run `./tool/go run ./misc/genreadme` with no arguments.
The check piggybacks on the existing go_generate job in test.yml and
fails if any README.md is out of date, pointing the user at the same
command.
Along the way:
- tempfork/pkgdoc now emits Markdown instead of plain text: headings
become level-2 with no {#hdr-...} anchors, and [Symbol] doc links
resolve to pkg.go.dev URLs, including for symbols in the current
package (which the default Printer would otherwise emit as bare
#Name fragments with no backing anchor in a README). Parsing no
longer uses parser.ImportsOnly, so doc.Package knows the package's
symbols and can resolve [Symbol] links at all.
- genreadme also emits a pkg.go.dev Go Reference badge at the top of
a library package's README; suppressed for package main.
- tsnet/tsnet.go's package godoc is expanded in idiomatic godoc
syntax — [Type], [Type.Method], reference-style [link]: URL
definitions — rather than Markdown-flavored [text](url) or
backtick-quoted identifiers, so that both pkg.go.dev and the
generated README.md render cleanly from a single source.
Fixes #19431
Fixes #19483
Fixes #19470
Change-Id: I8ca37e9e7b3bd446b8bfa7a91ac548f142688cb1
Co-authored-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Signed-off-by: Walter Poupore <walterp@tailscale.com>
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Fixes tailscale/corp#39422
Updates tailscale/certstore for proper macOS support and
builds the request signing support into macOS builds. iOS and builds
that do not use cgo are omitted.
Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
Add ExtraRootCAs *x509.CertPool to tsd.System and plumb it through
the control client, noise transport, DERP, and wgengine layers so
that platforms like Android can inject user-installed CA certificates
into Go's TLS verification.
tlsdial.Config now honors base.RootCAs as additional trusted roots,
tried after system roots and before the baked-in LetsEncrypt fallback.
SetConfigExpectedCert gets the same treatment for domain-fronted DERP.
The Android client will set sys.ExtraRootCAs with a pool built from
x509.SystemCertPool + user-installed certs obtained via the Android
KeyStore API, replacing the current SSL_CERT_DIR environment variable
approach.
Updates #8085
Change-Id: Iecce0fd140cd5aa0331b124e55a7045e24d8e0c2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Add a new vet analyzer that checks t.Run subtest names don't contain
characters requiring quoting when re-running via "go test -run". This
enforces the style guide rule: don't use spaces or punctuation in
subtest names.
The analyzer flags:
- Direct t.Run calls with string literal names containing spaces,
regex metacharacters, quotes, or other problematic characters
- Table-driven t.Run(tt.name, ...) calls where tt ranges over a
slice/map literal with bad name field values
Also fix all 978 existing violations across 81 test files, replacing
spaces with hyphens and shortening long sentence-like names to concise
hyphenated forms.
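The core check can be approximated as below. The exact character set
the analyzer uses is not stated here, so badChars is an assumed set
(whitespace, quotes, backslash, slash, and common regexp
metacharacters), and needsQuoting is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// badChars is an assumed set, not the analyzer's real list.
const badChars = " \t\"'`\\*+?()[]{}|^$/"

// needsQuoting reports whether a subtest name contains a character
// that would need quoting or escaping in "go test -run".
func needsQuoting(name string) bool {
	return strings.ContainsAny(name, badChars)
}

func main() {
	for _, name := range []string{"peer-removed", "peer removed", "what?"} {
		fmt.Printf("%q flagged=%v\n", name, needsQuoting(name))
	}
}
```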
Updates #19242
Change-Id: Ib0ad96a111bd8e764582d1d4902fe2599454ab65
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Prior to this change, closing multiple ServiceListeners concurrently
could result in failures as the independent close operations vie for the
attention of the Server's LocalBackend. The close operations would each
obtain the current ETag of the serve config and try to write new serve
config using this ETag. When one write invalidated the ETag of another,
the latter would fail. Exacerbating the issue, ServiceListener.Close
cannot be retried.
This change resolves the bug by using Server.mu to synchronize across
all ServiceListener.Close operations, ensuring they happen serially.
Fixes #19169
Signed-off-by: Harry Harpham <harry@tailscale.com>
This is a regression test for #19166, in which it was discovered that
after calling Server.ListenService for multiple Services, only the
Service from the most recent call would be advertised.
The bug was fixed in 99f8039101.
Updates #19166
Signed-off-by: Harry Harpham <harry@tailscale.com>
AppendTo returns the new slice but the result was discarded,
so only the newly added service was advertised.
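The class of bug looks like this in miniature. appendTo is a
hypothetical helper with append-like semantics, standing in for the
AppendTo method mentioned above; the advertise functions are
illustrative only.

```go
package main

import "fmt"

// appendTo mimics an AppendTo-style helper: like the append builtin,
// it returns the extended slice and leaves dst's length unchanged.
func appendTo(dst, src []string) []string { return append(dst, src...) }

// advertiseBuggy discards appendTo's result, so existing is lost.
func advertiseBuggy(existing []string, svc string) []string {
	out := []string{}
	appendTo(out, existing) // result discarded: out is still empty
	return append(out, svc)
}

// advertiseFixed keeps the returned slice.
func advertiseFixed(existing []string, svc string) []string {
	out := []string{}
	out = appendTo(out, existing)
	return append(out, svc)
}

func main() {
	existing := []string{"svc:a", "svc:b"}
	fmt.Println(advertiseBuggy(existing, "svc:c")) // [svc:c]
	fmt.Println(advertiseFixed(existing, "svc:c")) // [svc:a svc:b svc:c]
}
```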
Signed-off-by: Evan Champion <110177090+evan314159@users.noreply.github.com>
Prior to this change, closing the listener returned by
Server.ListenService would free system resources, but not clean up state
in the Server's local backend. With this change, the local backend state
is now cleaned on close.
Fixes tailscale/corp#35860
Signed-off-by: Harry Harpham <harry@tailscale.com>
TestListenService needs to set up state (capabilities, advertised routes,
ACL tags, etc.). It is imperative that this state propagates to all nodes
in the test tailnet before proceeding with the test. To achieve this,
TestListenService currently polls each node's local backend in a loop.
Using local.Client.WatchIPNBus improves the situation by blocking until
a new netmap comes in.
Fixes tailscale/corp#36244
Signed-off-by: Harry Harpham <harry@tailscale.com>
This helps us distribute tests across CI runners. Most tsnet tests call
tstest.Shard, but two recently added tests do not: tsnet.TestFunnelClose
and tsnet.TestListenService. This commit resolves the oversight.
Fixes tailscale/corp#36242
Signed-off-by: Harry Harpham <harry@tailscale.com>
Replace byte-at-a-time ReadByte loops with Peek+Discard in the DERP
read path. Peek returns a slice into bufio's internal buffer without
allocating, and Discard advances the read pointer without copying.
Introduce util/bufiox with a BufferedReader interface and ReadFull
helper that uses Peek+copy+Discard as an allocation-free alternative
to io.ReadFull.
- derp.ReadFrameHeader: replace 5× ReadByte with Peek(5)+Discard(5),
reading the frame type and length directly from the peeked slice.
Remove now-unused readUint32 helper.
    name               old ns/op  new ns/op  speedup
    ReadFrameHeader-8  24.2       12.4       ~2x
    (0 allocs/op in both)
- key.NodePublic.ReadRawWithoutAllocating: replace 32× ReadByte with
bufiox.ReadFull. Addresses the "Dear future" comment about switching
away from byte-at-a-time reads once a non-escaping alternative exists.
    name                            old ns/op  new ns/op  speedup
    NodeReadRawWithoutAllocating-8  140        43.6       ~3.2x
    (0 allocs/op in both)
- derpserver.handleFramePing: replace io.ReadFull with bufiox.ReadFull.
Updates tailscale/corp#38509
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
tsnet has a 5s sleep as part of its logic waiting to log successful auth.
Add an additional channel that will interrupt this sleep early if the
local backend's state changes before then. This is early enough in the
bootstrap logic that the local client has not been set up yet, so we
subscribe directly on the local backend in keeping with the rest of the
function, but it would be nice to port the whole function to the new
eventbus in a separate change.
Note this does not affect how quickly auth actually happens, it just
ensures we more responsively log the fact that auth state has changed.
Updates #16340
Change-Id: I7a28fd3927bbcdead9a5aad39f4a3596b5f659b0
Signed-off-by: Tom Proctor <tomhjp@users.noreply.github.com>
Updates #19050
When tsnet.Server.start() is called with both Hostname and Dir explicitly
set, os.Executable() failure should not prevent the server from starting.
Extend the existing ios fallback to also cover darwin, where the same
failure occurs when the Go runtime is embedded in a framework launched
via Xcode's debug launcher.
Signed-off-by: Prakash Rudraraju <prakashrj@yahoo.com>
When a client starts up without being able to connect to control, it
sends its discoKey to other nodes it wants to communicate with over
TSMP. This disco key will be a newer key than the one control knows
about.
If the client that can connect to control gets a full netmap, ensure
that the disco key for the node not connected to control is not
overwritten with the stale key control knows about.
This is implemented by keeping track of mapSession and using it for
the disco key injection if it is available. This ensures that we are not
constantly resetting the wireguard connection when getting the wrong
keys from control.
This is implemented as:
- If the key is received via TSMP:
- Set lastSeen for the peer to now()
- Set online for the peer to false
- When processing new keys, only accept keys where either:
- Peer is online
- lastSeen is newer than existing last seen
If mapSession is not available, as in we are not yet connected to
control, punt down the disco key injection to magicsock.
Ideally, we will want to have mapSession be long lived at some point in
the near future so we only need to inject keys in one location and then
also use that for testing and loading the cache, but that is a yak for
another PR.
Updates #12639
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
This commit adds a "fallback" mechanism to tsnet to allow
the consumer to set "TS_CONTROL_URL" to override the control server.
This allows tsnet applications to gain support for an alternative
control server by just updating without explicitly exposing the
ControlURL option.
Updates #16934
Signed-off-by: Kristoffer Dalby <kristoffer@dalby.cc>
This makes tsnet apps not depend on x/crypto/ssh and locks that in with a test.
It also paves the way for tsnet apps to opt in to SSH support via a
blank feature import in the future.
Updates #12614
Change-Id: Ica85628f89c8f015413b074f5001b82b27c953a9
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
After we intercept a DNS response and assign magic and transit addresses
we must communicate the assignment to our connector so that it can
direct traffic when it arrives.
Use the recently added peerapi endpoint to send the addresses.
Updates tailscale/corp#34258
Signed-off-by: Fran Bull <fran@tailscale.com>
I omitted a lot of the min/max modernizers because they didn't
result in clearer code.
Some of it's older "for x := range 123".
Also: errors.AsType, any, fmt.Appendf, etc.
Updates #18682
Change-Id: I83a451577f33877f962766a5b65ce86f7696471c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>