Commit Graph

1962 Commits

Author SHA1 Message Date
Simon Law
12188c0ade ipn/ipnlocal: log traffic steering scores and suggested exit nodes (#18681)
When traffic steering is enabled, some users are suggested an exit
node that is inappropriately far from their location. This seems to
happen right when the client connects to the control plane and the
client eventually fixes itself. But whenever an affected client
reconnects, its suggested exit node flaps, and this happens often
enough to be noticeable because connections drop whenever the exit
node is switched. This should not happen, since the map response that
contains the list of suggested exit nodes that the client picks from
also contains the scores for those nodes.

Since our current logging and diagnostic tools don’t give us enough
insight into what is happening, this PR adds additional logging when:
- traffic steering scores are used to suggest an exit node
- an exit node is suggested, no matter how it was determined

Updates: tailscale/corp#29964
Updates: tailscale/corp#36446

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2026-02-10 18:14:32 -08:00
Brad Fitzpatrick
dc1d811d48 magicsock, ipnlocal: revert eventbus-based node/filter updates, remove Synchronize hack
Restore synchronous method calls from LocalBackend to magicsock.Conn
for node views, filter, and delta mutations. The eventbus delivery
introduced in 8e6f63cf1 was invalid for these updates because
subsequent operations in the same call chain depend on magicsock
already having the current state. The Synchronize/settleEventBus
workaround was fragile and kept requiring more workarounds and
introducing new mystery bugs.

Since eventbus was added, we've learned more about when to use
eventbus, and this wasn't one of the cases.

We can take another swing at using eventbus for netmap changes in a
future change.

Fixes #16369
Updates #18575 (likely fixes)

Change-Id: I79057cc9259993368bb1e350ff0e073adf6b9a8f
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2026-02-10 07:32:05 -08:00
Jonathan Nobels
086968c15b net/dns, ipn/local: skip health warnings in dns forwarder when accept-dns is false (#18572)
fixes tailscale/tailscale#18436

Queries can still make their way to the forwarder when accept-dns is disabled.
Since we have not configured the forwarder if --accept-dns is false, this errors out
(correctly), but it also generates a persistent health warning. This forwards the
Pref setting all the way through the stack to the forwarder so that we can be more
judicious about when we decide that the forward path is unintentionally missing, vs
simply not configured.

Testing:
tailscale set --accept-dns=false (or from the GUI)
dig @100.100.100.100 example.com
tailscale status

No DNS-related health warnings should be surfaced.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-02-10 09:29:14 -05:00
KevinLiang10
5eaaf9786b tailcfg: add peerRelay bool to hostinfo
This commit adds a bool named PeerRelay to Hostinfo to indicate whether the host is acting as a peer relay.
Because the RelayServerPort number can be 0, I made this a bool instead of a port number. If the port
info is needed in the future, a separate bool also helps indicate whether the port was set to 0 (meaning any port in the
peer relay context).

Updates tailscale/corp#35862

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-02-06 18:25:40 -07:00
James Tucker
fe69b7f0e5 cmd/tailscale: add event bus queue depth debugging
Under extremely high load it appears we may have some retention issues
as a result of queue depth build up, but there is currently no direct
way to observe this. The scenario does not trigger the slow subscriber
log message, and the event stream debugging endpoint produces a
saturating volume of information.

Updates tailscale/corp#36904

Signed-off-by: James Tucker <james@tailscale.com>
2026-02-06 10:46:29 -08:00
Will Hannah
058cc3f82b ipn/ipnlocal: skip AuthKey use if profiles exist (#18619)
If any profiles exist and an AuthKey is provided via syspolicy, the
AuthKey is ignored on backend start, preventing re-auth attempts. This
is useful for one-time device provisioning scenarios, skipping AuthKey
use after initial setup when the AuthKey may no longer be valid.

updates #18618

Signed-off-by: Will Hannah <willh@tailscale.com>
2026-02-06 09:40:55 -05:00
Fernando Serboncini
5edfa6f9a8 ipn/ipnlocal: add wildcard TLS certificate support for subdomains (#18356)
When the NodeAttrDNSSubdomainResolve capability is present, enable
wildcard certificate issuance to cover all single-level subdomains
of a node's CertDomain.

Without the capability, only exact CertDomain matches are allowed,
so node.ts.net yields a cert for node.ts.net. With the capability,
we now generate wildcard certificates. Wildcard certs include both
the wildcard and base domain in their SANs, and ACME authorization
requests both identifiers. The cert filenames are still based on
the base domain with the wildcard prefix stripped, so we aren't
creating separate files. DNS challenges still use the base domain.

The checkCertDomain function is replaced by resolveCertDomain that
both validates and returns the appropriate cert domain to request.
Name validation is now moved earlier into GetCertPEMWithValidity()
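
The validate-and-resolve step described above can be sketched roughly as
follows. This is a minimal illustration, not the real resolveCertDomain:
the helper name certNames, its parameters, and the assumption that a
request for the base name also yields the wildcard cert when the
capability is present are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// certNames is a hypothetical helper, not the real resolveCertDomain.
// It validates requested against the node's cert domains and returns the
// SANs to request plus the base name used for the cert files.
func certNames(requested string, certDomains []string, wildcardCap bool) (sans []string, fileBase string, err error) {
	for _, base := range certDomains {
		if requested == base && !wildcardCap {
			// Without the capability, only exact matches are allowed.
			return []string{base}, base, nil
		}
		if !wildcardCap {
			continue
		}
		// Assumption: with the capability, the base name, the wildcard
		// itself, and any single-level subdomain share one wildcard cert.
		sub, ok := strings.CutSuffix(requested, "."+base)
		if requested == base || (ok && sub != "" && !strings.Contains(sub, ".")) {
			// Both the wildcard and the base domain go into the SANs;
			// the file name stays based on the base domain.
			return []string{"*." + base, base}, base, nil
		}
	}
	return nil, "", fmt.Errorf("invalid domain %q; must be one of %q", requested, certDomains)
}

func main() {
	domains := []string{"node.ts.net"}
	fmt.Println(certNames("node.ts.net", domains, false))
	fmt.Println(certNames("app.node.ts.net", domains, true))
	fmt.Println(certNames("a.b.node.ts.net", domains, true)) // two levels deep: rejected
}
```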

Fixes #1196

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-02-03 16:08:36 -05:00
M. J. Fromberger
14322713a5 ipn/ipnlocal/netmapcache: ensure cache updates preserve unchanged data (#18590)
Found by @cmol. When rewriting the same value into the cache, we were dropping
the unchanged keys, resulting in the cache being pruned incorrectly.
Also update the tests to catch this.
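
The class of bug described here can be illustrated with a small sketch.
The cache type and its Update method below are hypothetical and much
simpler than the real netmapcache code; the point is only that a key
must be marked as wanted even when its write is skipped because the
value is unchanged, otherwise the pruning pass removes it.

```go
package main

import "fmt"

// cache is a hypothetical in-memory stand-in for a keyed cache.
type cache struct {
	data   map[string]string
	writes int // number of values actually written
}

// Update makes the cache contain exactly the entries in next, skipping
// writes for values that have not changed.
func (c *cache) Update(next map[string]string) {
	keep := make(map[string]bool, len(next))
	for k, v := range next {
		// Mark the key as wanted before deciding whether to write it.
		// Doing this only on the write path is the bug being sketched.
		keep[k] = true
		if old, ok := c.data[k]; ok && old == v {
			continue // unchanged: skip the redundant write
		}
		c.data[k] = v
		c.writes++
	}
	// Prune keys that are no longer wanted.
	for k := range c.data {
		if !keep[k] {
			delete(c.data, k)
		}
	}
}

func main() {
	c := &cache{data: make(map[string]string)}
	c.Update(map[string]string{"a": "1", "b": "2"})
	c.Update(map[string]string{"a": "1", "b": "3"}) // "a" is unchanged
	fmt.Println(c.data, c.writes)
}
```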

Updates #12639

Change-Id: Iab67e444eb7ddc22ccc680baa2f6a741a00eb325
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-02-03 07:55:41 -08:00
KevinLiang10
03461ea7fb wgengine/netstack: add local tailscale service IPs to route and terminate locally (#18461)
* wgengine/netstack: add local tailscale service IPs to route and terminate locally

This commit adds the Tailscale Service IPs served locally to the OS routes and
intercepts their packets so that the traffic terminates locally without
affecting HA traffic.

Fixes tailscale/corp#34048

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* fix test

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* add ready field to avoid accessing lb before netstack starts

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* wgengine/netstack: store values from lb to avoid acquiring a lock

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* add active services to netstack on starts with stored prefs.

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* fix comments

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

* update comments

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>

---------

Signed-off-by: KevinLiang10 <37811973+KevinLiang10@users.noreply.github.com>
2026-01-30 16:46:03 -05:00
Fernando Serboncini
f48cd46662 net/dns,ipn/ipnlocal: add nodecap to resolve subdomains (#18258)
This adds a new node capability 'dns-subdomain-resolve' that signals
that all of a host's subdomains should resolve to the same IP address.
It allows wildcard matching on any node marked with this capability.

This change also includes a util/dnsname utility function that lets
us access the parent of a fully qualified domain name. MagicDNS takes
this function and recursively searches for a matching real node name.

One important thing to observe is that, in this context, a subdomain
can have multiple sub labels. This means that for a given node named
machine, both my.machine and be.my.machine will be a positive match.
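
A minimal sketch of the parent-walking lookup described above, under
simplifying assumptions: names are bare labels without the tailnet
suffix or trailing dot, and the node type, parent, and resolve are
hypothetical names rather than the util/dnsname or MagicDNS API.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical record for a tailnet node.
type node struct {
	ip         string
	subdomains bool // whether the node has the dns-subdomain-resolve capability
}

// parent returns name with its leftmost label removed, or "" if name
// has only one label.
func parent(name string) string {
	_, rest, _ := strings.Cut(name, ".")
	return rest
}

// resolve looks up name, then recursively tries each parent. An exact
// match always resolves; a parent match resolves only if that node has
// the capability.
func resolve(name string, nodes map[string]node, exact bool) (string, bool) {
	if name == "" {
		return "", false
	}
	if n, ok := nodes[name]; ok && (exact || n.subdomains) {
		return n.ip, true
	}
	return resolve(parent(name), nodes, false)
}

func main() {
	nodes := map[string]node{
		"machine": {ip: "100.64.0.1", subdomains: true},
		"plain":   {ip: "100.64.0.2"},
	}
	for _, q := range []string{"machine", "my.machine", "be.my.machine", "my.plain"} {
		ip, ok := resolve(q, nodes, true)
		fmt.Println(q, ip, ok)
	}
}
```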

Updates #1196

Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
2026-01-30 13:32:34 -05:00
M. J. Fromberger
99584b26ae ipn/ipnlocal/netmapcache: report the correct error for a missing column (#18547)
The file-based cache implementation was not reporting the correct error when
attempting to load a missing column key. Make it do so, and update the tests to
cover that case.

Updates #12639

Change-Id: Ie2c45a0a7e528d4125f857859c92df807116a56e
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-01-28 14:32:40 -08:00
Amal Bansode
6de5b01e04 ipn/localapi: stop logging "broken pipe" errors (#18487)
The Tailscale CLI has some methods to watch the IPN bus for
messages, say, the current netmap (`tailscale debug netmap`).
The Tailscale daemon supports this using a streaming HTTP
response. Sometimes, the client can close its connection
abruptly -- due to an interruption, or in the case of `debug netmap`,
intentionally after consuming one message.

If the server daemon is writing a response as the client closes
its end of the socket, the daemon typically encounters a "broken pipe"
error. The "Watch IPN Bus" handler currently logs such errors after
they're propagated by a JSON encoding/writer helper.

Since the Tailscale CLI nominally closes its socket with the daemon
in this slightly ungraceful way (viz. `debug netmap`), stop logging
these broken pipe errors as far as possible. This will help avoid
confounding users when they scan backend logs.

Updates #18477

Signed-off-by: Amal Bansode <amal@tailscale.com>
2026-01-26 16:41:03 -08:00
M. J. Fromberger
9385dfe7f6 ipn/ipnlocal/netmapcache: add a package to split and cache network maps (#18497)
This commit is based on part of #17925, reworked as a separate package.

Add a package that can store and load netmap.NetworkMap values in persistent
storage, using a basic columnar representation. This commit includes a default
storage interface based on plain files, but the interface can be implemented
with more structured storage if we want to later.

The tests are set up to require that all the fields of the NetworkMap are
handled, except those explicitly designated as not-cached, and check that a
fully-populated value can round-trip correctly through the cache.  Adding or
removing fields, either in the NetworkMap or in the cached representation, will
trigger either build failures (e.g., for type mismatch) or test failures (e.g.,
for representation changes or missing fields). This isn't quite as nice as
automatically updating the representation, which I also prototyped, but is much
simpler to maintain and less code.

This commit does not yet hook up the cache to the backend; that will be a
subsequent change.

Updates #12639

Change-Id: Icb48639e1d61f2aec59904ecd172c73e05ba7bf9
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-01-26 14:55:30 -08:00
Fran Bull
9d13a6df9c appc,ipn/ipnlocal: Add split DNS entries for conn25 peers
If conn25 config is sent in the netmap: add split DNS entries to use
appropriately tagged peers' PeerAPI to resolve DNS requests for those
domains.

This will enable future work where we use the peers as connectors for
the configured domains.

Updates tailscale/corp#34252

Signed-off-by: Fran Bull <fran@tailscale.com>
2026-01-26 08:10:38 -08:00
Will Norris
3ec5be3f51 all: remove AUTHORS file and references to it
This file was never truly necessary and has never actually been used in
the history of Tailscale's open source releases.

A Brief History of AUTHORS files
---

The AUTHORS file was a pattern developed at Google, originally for
Chromium, then adopted by Go and a bunch of other projects. The problem
was that Chromium originally had a copyright line only recognizing
Google as the copyright holder. Because Google (and most open source
projects) do not require copyright assignment for contributions, each
contributor maintains their copyright. Some large corporate contributors
then tried to add their own name to the copyright line in the LICENSE
file or in file headers. This quickly becomes unwieldy, and puts a
tremendous burden on anyone building on top of Chromium, since the
license requires that they keep all copyright lines intact.

The compromise was to create an AUTHORS file that would list all of the
copyright holders. The LICENSE file and source file headers would then
include that list by reference, listing the copyright holder as "The
Chromium Authors".

It also became cumbersome simply to keep the file up to date with a
high rate of new contributors. Plus it's not always obvious who the
copyright holder is. Sometimes it is the individual making the
contribution, but many times it may be their employer. There is no way
for the project maintainer to know.

Eventually, Google changed their policy to no longer recommend trying to
keep the AUTHORS file up to date proactively, and instead to only add to
it when requested: https://opensource.google/docs/releasing/authors.
They are also clear that:

> Adding contributors to the AUTHORS file is entirely within the
> project's discretion and has no implications for copyright ownership.

It was primarily added to appease a small number of large contributors
that insisted that they be recognized as copyright holders (which was
entirely their right to do). But it's not truly necessary, and not even
the most accurate way of identifying contributors and/or copyright
holders.

In practice, we've never added anyone to our AUTHORS file. It only lists
Tailscale, so it's not really serving any purpose. It also causes
confusion because Tailscalars put the "Tailscale Inc & AUTHORS" header
in other open source repos which don't actually have an AUTHORS file, so
it's ambiguous what that means.

Instead, we just acknowledge that the contributors to Tailscale (whoever
they are) are copyright holders for their individual contributions. We
also have the benefit of using the DCO (developercertificate.org) which
provides some additional certification of their right to make the
contribution.

The source file changes were purely mechanical with:

    git ls-files | xargs sed -i -e 's/\(Tailscale Inc &\) AUTHORS/\1 contributors/g'

Updates #cleanup

Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
2026-01-23 15:49:45 -08:00
M. J. Fromberger
ce12863ee5 ipn/ipnlocal: manage per-profile subdirectories in TailscaleVarRoot (#18485)
In order to better manage per-profile data resources on the client, add methods
to the LocalBackend to support creation of per-profile directory structures in
local storage. These methods build on the existing TailscaleVarRoot config, and
have the same limitation (i.e., if no local storage is available, it will
report an error when used).

The immediate motivation is to support netmap caching, but we can also use this
mechanism for other per-profile resources including pending taildrop files and
Tailnet Lock authority caches.

This commit only adds the directory-management plumbing; later commits will
handle migrating taildrop, TKA, etc. to this mechanism, as well as caching
network maps.

Updates #12639

Change-Id: Ia75741955c7bf885e49c1ad99f856f669a754169
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
2026-01-23 10:09:46 -08:00
Harry Harpham
3840183be9 tsnet: add support for Services
This change allows tsnet nodes to act as Service hosts by adding a new
function, tsnet.Server.ListenService. Invoking this function will
advertise the node as a host for the Service and create a listener to
receive traffic for the Service.

Fixes #17697
Fixes tailscale/corp#27200
Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-01-16 15:28:31 -07:00
Harry Harpham
1b88e93ff5 ipn/ipnlocal: allow retrieval of serve config ETags from local API
This change adds API to ipn.LocalBackend to retrieve the ETag when
querying for the current serve config. This allows consumers of
ipn.LocalBackend.SetServeConfig to utilize the concurrency control
offered by ETags. Previous to this change, utilizing serve config ETags
required copying the local backend's internal ETag calculation.

The local API server was previously copying the local backend's ETag
calculation as described above. With this change, the local API server
now uses the new ETag retrieval function instead. Serve config ETags are
therefore now opaque to clients, in line with best practices.

Fixes tailscale/corp#35857
Signed-off-by: Harry Harpham <harry@tailscale.com>
2026-01-16 15:28:31 -07:00
Jonathan Nobels
643e91f2eb net/netmon: move TailscaleInterfaceIndex out of netmon.State (#18428)
fixes tailscale/tailscale#18418

Both Serve and PeerAPI broke when we moved the TailscaleInterfaceName
into State, which is updated asynchronously and may not be
available when we configure the listeners.

This extracts the explicit interface name property from netmon.State
and adds it as a static struct with getters that have proper error
handling.

The bug is only found in sandboxed Darwin clients, where we
need to know the Tailscale interface details in order to set up the
listeners correctly (they must bind to our interface explicitly to escape
the network sandboxing that is applied by NECP).

Currently only sandboxed macOS and Plan9 set this, but it will
also be useful on Windows to simplify interface filtering in netns.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2026-01-16 14:53:23 -05:00
Tom Meadows
c3b7f24051 ipn,ipn/local: always accept routes for Tailscale Services (cgnat range) (#18173)
Updates #18198

Signed-off-by: chaosinthecrd <tom@tmlabs.co.uk>
Co-authored-by: James Tucker <raggi@tailscale.com>
2026-01-14 18:20:00 +00:00
Irbe Krumina
8c17d871b3 ipn/store/kubestore: don't load write replica certs in memory (#18395)
Fixes a bug where, for kube HA proxies, TLS certs for the replica
responsible for cert issuance were loaded in memory on startup,
although the in-memory store was not updated after renewal (to
avoid failing re-issuance for re-created Ingresses).
Now the 'write' replica always reads certs from the kube Secret.

Updates tailscale/tailscale#18394

Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
2026-01-13 12:43:17 +00:00
Naman Sood
480ee9fec0 ipn,cmd/tailscale/cli: set correct SNI name for TLS-terminated TCP Services (#17752)
Fixes #17749.

Signed-off-by: Naman Sood <mail@nsood.in>
2026-01-07 09:31:46 -05:00
Irbe Krumina
8ea90ba80d cmd/tailscaled,ipn/{ipnlocal,store/kubestore}: don't create attestation keys for stores that are not bound to a node (#18322)
Ensure that hardware attestation keys are not added to tailscaled
state stores that are Kubernetes Secrets or AWS SSM as those Tailscale
devices should be able to be recreated on different nodes, for example,
when moving Pods between nodes.

Updates tailscale/tailscale#18302

Signed-off-by: Irbe Krumina <irbekrm@gmail.com>
2026-01-06 11:29:46 +00:00
Andrew Lytvynov
68617bb82e cmd/tailscaled: disable state encryption / attestation by default (#18336)
TPM-based features have been incredibly painful due to the heterogeneous
devices in the wild, and many situations in which the TPM "changes" (is
reset or replaced). All of this leads to a lot of customer issues.

We hoped to iron out all the kinks and get all users to benefit from
state encryption and hardware attestation without manually opting in,
but the long tail of kinks is just too long.

This change disables TPM-based features on Windows and Linux by default.
Node state should get auto-decrypted on update, and old attestation keys
will be removed.

There's also tailscaled-on-macOS, but it won't have a TPM or Keychain
bindings anyway.

Updates #18302
Updates #15830

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-01-05 17:05:00 -08:00
Andrew Lytvynov
2e77b75e96 ipn/ipnlocal: don't fail profile unmarshal due to attestation keys (#18335)
Soft-fail on initial unmarshal and try again, ignoring the
AttestationKey. This helps in cases where something about the
attestation key storage (usually a TPM) is messed up. The old key will
be lost, but at least the node can start again.

Updates #18302
Updates #15830

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2026-01-05 16:58:59 -08:00
Nick Khyl
2917ea8d0e ipn/ipnauth, safesocket: defer named pipe client's token retrieval until ipnserver needs it
An error returned by net.Listener.Accept() causes the owning http.Server to shut down.
With the deprecation of net.Error.Temporary(), there's no way for the http.Server to test
whether the returned error is temporary / retryable or not (see golang/go#66252).

Because of that, errors returned by (*safesocket.winIOPipeListener).Accept() cause the LocalAPI
server (aka ipnserver.Server) to shut down, and the tailscaled process to exit.

While this might be acceptable in the case of non-recoverable errors, such as programmer errors,
we shouldn't shut down the entire tailscaled process for client- or connection-specific errors,
such as when we couldn't obtain the client's access token because the client attempts to connect
at the Anonymous impersonation level. Instead, the LocalAPI server should gracefully handle
these errors by denying access and returning a 401 Unauthorized to the client.

In tailscale/tscert#15, we fixed a known bug where Caddy and other apps using tscert would attempt
to connect at the Anonymous impersonation level and fail. However, we should also fix this on the tailscaled
side to prevent a potential DoS, where a local app could deliberately open the Tailscale LocalAPI named pipe
at the Anonymous impersonation level and cause tailscaled to exit.

In this PR, we defer token retrieval until (*WindowsClientConn).Token() is called and propagate the returned token
or error via ipnauth.GetConnIdentity() to ipnserver, which handles it the same way as other ipnauth-related errors.

Fixes #18212
Fixes tailscale/tscert#13

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-12-23 14:04:45 -06:00
Irbe Krumina
90b4358113 cmd/k8s-operator,ipn/ipnlocal: allow opting out of ACME order replace extension (#18252)
In dynamically changing environments where ACME account keys and certs
are stored separately, it can happen that the account key would get
deleted (and recreated) between issuances. If that is the case,
we currently fail renewals and the only way to recover is for users
to delete certs.
This adds a config knob to allow opting out of the replaces extension
and utilizes it in the Kubernetes operator where there are known
user workflows that could end up with this edge case.

Updates #18251

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-12-19 15:59:26 +00:00
Irbe Krumina
b73fb467e4 ipn/ipnlocal: log cert renewal failures (#18246)
Updates #cleanup

Signed-off-by: Irbe Krumina <irbe@tailscale.com>
2025-12-18 09:58:13 +00:00
Jonathan Nobels
3e89068792 net/netmon, wgengine/userspace: purge ChangeDelta.Major and address TODOs (#17823)
updates tailscale/corp#33891

Addresses several of the older TODOs in netmon. This removes the
Major flag and precomputes the ChangeDelta state, rather than making
consumers of ChangeDeltas sort that out themselves. We're also seeing
a lot of ChangeDeltas being flagged as "Major" when they are
not interesting, triggering rebinds in wgengine that are not needed. This
cleans that up and adds a host of additional tests.

The dependencies are cleaned up, notably removing the dependency on netmon
itself for calculating what is interesting and what is not. This includes letting
individual platforms set a bespoke global "IsInterestingInterface"
function.  This is only used on Darwin.

RebindRequired now roughly follows how "Major" was historically
calculated but includes some additional checks for various
uninteresting events such as changes in interface addresses that
shouldn't trigger a rebind. This significantly reduces thrashing (by
roughly half on Darwin clients when switching between NICs). The individual
values that we roll into RebindRequired are also exposed so that
components consuming netmap.ChangeDelta can ask more
targeted questions.

Signed-off-by: Jonathan Nobels <jonathan@tailscale.com>
2025-12-17 12:32:40 -05:00
Will Norris
0fd1670a59 client/local: add method to set gauge metric to a value
The existing client metric methods only support incrementing (or
decrementing) a delta value.  This new method allows setting the metric
to a specific value.

Updates tailscale/corp#35327

Change-Id: Ia101a4a3005adb9118051b3416f5a64a4a45987d
Signed-off-by: Will Norris <will@tailscale.com>
2025-12-16 14:11:33 -08:00
Raj Singh
65182f2119 ipn/ipnlocal: add ProxyProtocol support to VIP service TCP handler (#18175)
tcpHandlerForVIPService was missing ProxyProtocol support that
tcpHandlerForServe already had. Extract the shared logic into
forwardTCPWithProxyProtocol helper and use it in both handlers.

Fixes #18172

Signed-off-by: Raj Singh <raj@tailscale.com>
2025-12-12 02:53:21 +05:30
Brad Fitzpatrick
0df4631308 ipn/ipnlocal: avoid ResetAndStop panic
Updates #18187

Change-Id: If7375efb7df0452a5e85b742fc4c4eecbbd62717
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-12-11 09:07:45 -08:00
Alex Chan
378ee20b9a cmd/tailscale/cli: stabilise the output of tailscale lock status --json
This patch stabilises the JSON output, and improves it in the following
ways:

* The AUM hash in Head uses the base32-encoded form of an AUM hash,
  consistent with how it's presented elsewhere
* TrustedKeys are in the same format as the keys in `tailnet lock log --json`
* SigKind, Pubkey and KeyID are all presented consistently with other
  JSON output in NodeKeySignature
* FilteredPeers don't have a NodeKeySignature, because it will always
  be empty

For reference, here's the JSON output from the CLI prior to this change:

```json
{
  "Enabled": true,
  "Head": [
    196,
    69,
    63,
    243,
    213,
    133,
    123,
    46,
    183,
    203,
    143,
    34,
    184,
    85,
    80,
    1,
    221,
    92,
    49,
    213,
    93,
    106,
    5,
    206,
    176,
    250,
    58,
    165,
    155,
    136,
    11,
    13
  ],
  "PublicKey": "nlpub:0f99af5c02216193963ce9304bb4ca418846eddebe237f37a6de1c59097ed0b8",
  "NodeKey": "nodekey:8abfe98b38151748919f6e346ad16436201c3ecd453b01e9d6d3a38e1826000d",
  "NodeKeySigned": true,
  "NodeKeySignature": {
    "SigKind": 1,
    "Pubkey": "bnCKv+mLOBUXSJGfbjRq0WQ2IBw+zUU7AenW06OOGCYADQ==",
    "KeyID": "D5mvXAIhYZOWPOkwS7TKQYhG7d6+I383pt4cWQl+0Lg=",
    "Signature": "4DPW4v6MyLLwQ8AMDm27BVDGABjeC9gg1EfqRdKgzVXi/mJDwY9PTAoX0+0WTRs5SUksWjY0u1CLxq5xgjFGBA==",
    "Nested": null,
    "WrappingPubkey": "D5mvXAIhYZOWPOkwS7TKQYhG7d6+I383pt4cWQl+0Lg="
  },
  "TrustedKeys": [
    {
      "Key": "nlpub:0f99af5c02216193963ce9304bb4ca418846eddebe237f37a6de1c59097ed0b8",
      "Metadata": null,
      "Votes": 1
    },
    {
      "Key": "nlpub:de2254c040e728140d92bc967d51284e9daea103a28a97a215694c5bda2128b8",
      "Metadata": null,
      "Votes": 1
    }
  ],
  "VisiblePeers": [
    {
      "Name": "signing2.taila62b.unknown.c.ts.net.",
      "ID": 7525920332164264,
      "StableID": "nRX6TbAWm121DEVEL",
      "TailscaleIPs": [
        "100.110.67.20",
        "fd7a:115c:a1e0::9c01:4314"
      ],
      "NodeKey": "nodekey:10bf4a5c168051d700a29123cd81568377849da458abef4b328794ca9cae4313",
      "NodeKeySignature": {
        "SigKind": 1,
        "Pubkey": "bnAQv0pcFoBR1wCikSPNgVaDd4SdpFir70syh5TKnK5DEw==",
        "KeyID": "D5mvXAIhYZOWPOkwS7TKQYhG7d6+I383pt4cWQl+0Lg=",
        "Signature": "h9fhwHiNdkTqOGVQNdW6AVFoio6MFaFobPiK9ydywgmtYxcExJ38b76Tabdc56aNLxf8IfCaRw2VYPcQG2J/AA==",
        "Nested": null,
        "WrappingPubkey": "3iJUwEDnKBQNkryWfVEoTp2uoQOiipeiFWlMW9ohKLg="
      }
    }
  ],
  "FilteredPeers": [
    {
      "Name": "node3.taila62b.unknown.c.ts.net.",
      "ID": 5200614049042386,
      "StableID": "n3jAr7KNch11DEVEL",
      "TailscaleIPs": [
        "100.95.29.124",
        "fd7a:115c:a1e0::f901:1d7c"
      ],
      "NodeKey": "nodekey:454d2c8602c10574c5ec3a6790f159714802012b7b8bb8d2ab47d637f9df1d7b",
      "NodeKeySignature": {
        "SigKind": 0,
        "Pubkey": null,
        "KeyID": null,
        "Signature": null,
        "Nested": null,
        "WrappingPubkey": null
      }
    }
  ],
  "StateID": 16885615198276932820
}
```

Updates https://github.com/tailscale/corp/issues/22355
Updates https://github.com/tailscale/tailscale/issues/17619

Signed-off-by: Alex Chan <alexc@tailscale.com>

Change-Id: I65b58ff4520033e6b70fc3b1ba7fc91c1f70a960
2025-12-09 09:40:06 +00:00
Nick Khyl
da0ea8ef3e Revert "ipn/ipnlocal: shut down old control client synchronously on reset"
It appears (*controlclient.Auto).Shutdown() can still deadlock when called with b.mu held, and therefore the changes in #18127 are unsafe.

This reverts #18127 until we figure out what causes it.

This reverts commit d199ecac80.

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-12-08 15:37:08 -06:00
James 'zofrex' Sanderson
cf40cf5ccb ipn/ipnlocal: add peer API endpoints to Hostinfo on initial client creation (#17851)
Previously we only set this when it updated, which was fine for the first
call to Start(), but after that point future updates would be skipped if
nothing had changed. If Start() was called again, it would wipe the peer API
endpoints and they wouldn't get added back again, breaking exit nodes (and
anything else requiring peer API to be advertised).

Updates tailscale/corp#27173

Signed-off-by: James Sanderson <jsanderson@tailscale.com>
2025-12-05 13:33:47 +00:00
Peter A.
f4d34f38be cmd/tailscale,ipn: add Unix socket support for serve
Based on PR #16700 by @lox, adapted to current codebase.

Adds support for proxying HTTP requests to Unix domain sockets via
tailscale serve unix:/path/to/socket, enabling exposure of services
like Docker, containerd, PHP-FPM over Tailscale without TCP bridging.

The implementation includes reasonable protections against exposure of
tailscaled's own socket.

Adaptations from original PR:
- Use net.Dialer.DialContext instead of net.Dial for context propagation
- Use http.Transport with Protocols API (current h2c approach, not http2.Transport)
- Resolve conflicts with hasScheme variable in ExpandProxyTargetValue

Updates #9771

Signed-off-by: Peter A. <ink.splatters@pm.me>
Co-authored-by: Lachlan Donald <lachlan@ljd.cc>
2025-12-04 11:06:06 -08:00
Nick Khyl
557457f3c2 ipn/ipnlocal: fix LocalBackend deadlock when packet arrives during profile switch (#18126)
If a packet arrives while WireGuard is being reconfigured with b.mu held, such as during a profile switch,
calling back into (*LocalBackend).GetPeerAPIPort from (*Wrapper).filterPacketInboundFromWireGuard
may deadlock when it tries to acquire b.mu.

This occurs because a peer cannot be removed while an inbound packet is being processed.
The reconfig and profile switch wait for (*Peer).RoutineSequentialReceiver to return, but it never finishes
because GetPeerAPIPort needs b.mu, which the waiting goroutine already holds.

In this PR, we make peerAPIPorts a new syncs.AtomicValue field that is written with b.mu held
but can be read by GetPeerAPIPort without holding the mutex, which fixes the deadlock.
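
The pattern can be sketched with the standard library's atomic.Pointer
standing in for syncs.AtomicValue. The backend type, its field layout,
and the method names below are hypothetical; the point is that the read
path never takes the mutex, so it cannot block behind a goroutine that
holds it.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// backend is a hypothetical, pared-down stand-in for LocalBackend.
type backend struct {
	mu sync.Mutex
	// peerAPIPorts is written with mu held but may be read without it.
	peerAPIPorts atomic.Pointer[map[string]uint16]
}

// setPorts publishes a new port map. The caller must not modify ports
// after handing it over.
func (b *backend) setPorts(ports map[string]uint16) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.peerAPIPorts.Store(&ports)
}

// port reads the published map without acquiring mu.
func (b *backend) port(ip string) (uint16, bool) {
	m := b.peerAPIPorts.Load()
	if m == nil {
		return 0, false
	}
	p, ok := (*m)[ip]
	return p, ok
}

func main() {
	b := &backend{}
	b.setPorts(map[string]uint16{"100.64.0.1": 4242})

	// Simulate a reconfig that holds mu while a packet is processed.
	b.mu.Lock()
	p, ok := b.port("100.64.0.1") // does not block on mu
	b.mu.Unlock()
	fmt.Println(p, ok)
}
```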

There might be other long-term ways to address the issue, such as moving peer API listeners
from LocalBackend to nodeBackend so they can be accessed without holding b.mu,
but these changes are too large and risky at this stage in the v1.92 release cycle.

Updates #18124

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-12-04 10:13:13 -05:00
Nick Khyl
d199ecac80 ipn/ipnlocal: shut down old control client synchronously on reset
Previously, callers of (*LocalBackend).resetControlClientLocked were supposed
to call Shutdown on the returned controlclient.Client after releasing b.mu.
In #17804, we started calling Shutdown while holding b.mu, which caused
deadlocks during profile switches due to the (*ExecQueue).RunSync implementation.

We first patched this in #18053 by calling Shutdown in a new goroutine,
which avoided the deadlocks but made TestStateMachine flaky because
the shutdown order was no longer guaranteed.

In #18070, we updated (*ExecQueue).RunSync to allow shutting down
the queue without waiting for RunSync to return. With that change,
shutting down the control client while holding b.mu became safe.

Therefore, this PR updates (*LocalBackend).resetControlClientLocked
to shut down the old client synchronously during the reset, instead of
returning it and shifting that responsibility to the callers.

This fixes the flaky tests and simplifies the code.

Fixes #18052

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-12-03 20:35:25 -06:00
Brad Fitzpatrick
74ed589042 syncs: add means of declaring locking assumptions for debug mode validation
Updates #17852

Change-Id: I42a64a990dcc8f708fa23a516a40731a19967aba
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
2025-11-26 13:04:28 -08:00
Alex Chan
b7658a4ad2 tstest/integration: add integration test for Tailnet Lock
This patch adds an integration test for Tailnet Lock, checking that a node can't
talk to peers in the tailnet until it becomes signed.

This patch also introduces a new package `tstest/tkatest`, which has some helpers
for constructing a mock control server that responds to TKA requests. This allows
us to reduce boilerplate in the IPN tests.

Updates tailscale/corp#33599

Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-11-26 11:54:48 +00:00
Jordan Whited
824027305a cmd/tailscale/cli,ipn,all: make peer relay server port a *uint16
In preparation for exposing its configuration via ipn.ConfigVAlpha,
change {Masked}Prefs.RelayServerPort from *int to *uint16. This takes a
defensive stance against invalid inputs at JSON decode time.

'tailscale set --relay-server-port' is currently the only input to this
pref, and has always sanitized input to fit within a uint16.
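
A small sketch of what rejecting invalid input at JSON decode time looks like with the standard `encoding/json` package. The `prefs` struct and `decode` helper are hypothetical and only borrow the `RelayServerPort` field name; they are not the real ipn.Prefs.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// prefs is a hypothetical, cut-down stand-in for the real prefs struct.
type prefs struct {
	RelayServerPort *uint16 `json:",omitempty"`
}

// decode returns the decoded port, or an error if the JSON value does not
// fit in a uint16. A nil port with a nil error means the field was absent.
func decode(s string) (*uint16, error) {
	var p prefs
	if err := json.Unmarshal([]byte(s), &p); err != nil {
		return nil, err
	}
	return p.RelayServerPort, nil
}

func main() {
	inputs := []string{
		`{"RelayServerPort": 40000}`, // fits in uint16
		`{"RelayServerPort": 70000}`, // overflows uint16
		`{"RelayServerPort": -1}`,    // negative
		`{}`,                         // unset
	}
	for _, in := range inputs {
		port, err := decode(in)
		switch {
		case err != nil:
			fmt.Println("rejected:", in)
		case port == nil:
			fmt.Println("unset:", in)
		default:
			fmt.Println("port:", *port)
		}
	}
}
```

With `*int`, the out-of-range values would decode successfully and would need a separate range check later.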

Updates tailscale/corp#34591

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-11-25 19:40:17 -08:00
Sachin Iyer
53476ce872 ipn/serve: validate service paths in HasPathHandler
Fixes #17839

Signed-off-by: Sachin Iyer <siyer@detail.dev>
2025-11-25 16:27:37 -05:00
Alex Chan
b38dd1ae06 ipn/ipnlocal: don't panic if there are no suitable exit nodes
In suggestExitNodeLocked, if no exit node candidates have a home DERP or
valid location info, `bestCandidates` is an empty slice. This slice is
passed to `selectNode` (`randomNode` in prod):

```go
func randomNode(nodes views.Slice[tailcfg.NodeView], …) tailcfg.NodeView {
	…
	return nodes.At(rand.IntN(nodes.Len()))
}
```

An empty slice becomes a call to `rand.IntN(0)`, which panics.

This patch changes the behaviour: if we've filtered out all the
candidates before calling `selectNode`, we reset the list and then pick
from any of the available candidates.
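
A minimal sketch of that guard, using plain string slices instead of `views.Slice[tailcfg.NodeView]`. The `pickRandom` helper is hypothetical and is not the code in suggestExitNodeLocked:

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// pickRandom is a hypothetical helper. It prefers the filtered candidates,
// falls back to the full list when filtering removed everything, and
// reports false rather than calling rand.IntN(0), which panics.
func pickRandom(filtered, all []string) (string, bool) {
	candidates := filtered
	if len(candidates) == 0 {
		candidates = all // reset the list to every available candidate
	}
	if len(candidates) == 0 {
		return "", false
	}
	return candidates[rand.IntN(len(candidates))], true
}

func main() {
	// Filtering removed everything, so the pick comes from the full list.
	node, ok := pickRandom(nil, []string{"exit-a"})
	fmt.Println(node, ok)
}
```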

This patch also updates our tests to give us more coverage of `randomNode`,
so we can spot other potential issues.

Updates #17661

Change-Id: I63eb5e4494d45a1df5b1f4b1b5c6d5576322aa72
Signed-off-by: Alex Chan <alexc@tailscale.com>
2025-11-25 19:05:13 +00:00
Simon Law
848978e664 ipn/ipnlocal: test traffic-steering when feature is not enabled (#17997)
In PR tailscale/corp#34401, the `traffic-steering` feature flag does
not automatically enable traffic steering for all nodes. Instead, an
admin must add the `traffic-steering` node attribute to each client
node that they want opted-in.

For backwards compatibility with older clients, tailscale/corp#34401
strips out the `traffic-steering` node attribute if the feature flag
is not enabled, even if it is set in the policy file. This lets us
safely disable the feature flag.

This PR adds a missing test case for suggested exit nodes that have no
priority.

Updates tailscale/corp#34399

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2025-11-25 09:21:55 -08:00
Nick Khyl
7073f246d3 ipn/ipnlocal: do not call controlclient.Client.Shutdown with b.mu held
This fixes a regression in #17804 that caused a deadlock.

Updates #18052

Signed-off-by: Nick Khyl <nickk@tailscale.com>
2025-11-25 09:22:50 -06:00
Simon Law
9c3a2aa797 ipn/ipnlocal: replace log.Printf with logf (#18045)
Updates #cleanup

Signed-off-by: Simon Law <sfllaw@tailscale.com>
2025-11-24 17:42:58 -08:00
Jordan Whited
7426eca163 cmd/tailscale,feature/relayserver,ipn: add relay-server-static-endpoints set flag
Updates tailscale/corp#31489
Updates #17791

Signed-off-by: Jordan Whited <jordan@tailscale.com>
2025-11-24 16:37:15 -08:00
Andrew Dunham
698eecda04 ipn/ipnlocal: fix panic in driveTransport on network error
When the underlying transport returns a network error, the RoundTrip
method returns (nil, error). The defer was attempting to access resp
without checking if it was nil first, causing a panic. Fix this by
checking for nil in the defer.

Also changes driveTransport.tr from *http.Transport to http.RoundTripper
and adds a test.

Fixes #17306

Signed-off-by: Andrew Dunham <andrew@tailscale.com>
Change-Id: Icf38a020b45aaa9cfbc1415d55fd8b70b978f54c
2025-11-24 10:35:23 -05:00
Andrew Lytvynov
c679aaba32 cmd/tailscaled,ipn: show a health warning when state store fails to open (#17883)
With the introduction of node sealing, store.New fails in some cases due
to the TPM device being reset or unavailable. Currently it results in
tailscaled crashing at startup, which is not obvious to the user until
they check the logs.

Instead of crashing tailscaled at startup, start with an in-memory store
with a health warning about state initialization and a link to (future)
docs on what to do. When this health message is set, also block any
login attempts to avoid masking the problem with an ephemeral node
registration.

Updates #15830
Updates #17654

Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
2025-11-20 15:52:58 -06:00
Harry Harpham
ac74d28190 ipn/ipnlocal: add validations when setting serve config (#17950)
These validations were previously performed in the CLI frontend. There
are two motivations for moving these to the local backend:
1. The backend controls synchronization around the relevant state, so
   only the backend can guarantee many of these validations.
2. Doing these validations in the back-end avoids the need to repeat
   them across every frontend (e.g. the CLI and tsnet).

Updates tailscale/corp#27200

Signed-off-by: Harry Harpham <harry@tailscale.com>
2025-11-20 13:40:05 -06:00