The ProxyGroup HA Service reconciler's validateService scanned every
Service in the cluster with shouldExpose=true for duplicate hostnames.
With multi-tailnet (Tailnet CRD) support, that scan reaches across
tailnet boundaries:
* A Service exposed via the single-proxy path (tailscale.com/expose)
on the primary tailnet would block a ProxyGroup ingress Service
for the same hostname on a secondary tailnet, even though the two
live in different reconcilers and different tailnet DNS namespaces.
* Two ProxyGroups joined to different tailnets via spec.tailnet
would also block one another for shared hostnames, again despite
living in separate DNS namespaces.
In both cases the ProxyGroup ingress Service was silently dropped
(IngressSvcInvalid event raised, queue cleared, ConfigMap never
written, ProxyGroup never serves the backend).
This change tightens the check in two ways:
* Skip Services that aren't themselves managed by the ProxyGroup
reconciler (use isTailscaleService instead of shouldExpose).
* For ProxyGroup-managed Services attached to a different
ProxyGroup, look up that ProxyGroup and skip the duplicate
report when spec.Tailnet differs from the current one. Fall
through and flag the collision on lookup failure so genuine
duplicates are not silently allowed.
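
The tightened check can be sketched roughly as follows. This is a minimal
illustration only: the proxyGroup and service types, findDuplicate, and
demoLookup are hypothetical stand-ins, not the operator's real types or
function names.

```go
package main

import (
	"errors"
	"fmt"
)

// proxyGroup and service are simplified, hypothetical stand-ins for the
// operator's ProxyGroup and Service objects.
type proxyGroup struct {
	Name    string
	Tailnet string // mirrors spec.tailnet
}

type service struct {
	Name       string
	Hostname   string
	ProxyGroup string // empty when not managed by the ProxyGroup reconciler
}

// findDuplicate returns the name of a Service that genuinely collides with
// svc on hostname, or "" when there is none.
func findDuplicate(svc service, current proxyGroup, all []service,
	lookup func(name string) (proxyGroup, error)) string {
	for _, other := range all {
		if other.Name == svc.Name || other.Hostname != svc.Hostname {
			continue
		}
		if other.ProxyGroup == "" {
			continue // e.g. single-proxy path: another reconciler owns it
		}
		if other.ProxyGroup != current.Name {
			pg, err := lookup(other.ProxyGroup)
			if err == nil && pg.Tailnet != current.Tailnet {
				continue // different tailnet, so a separate DNS namespace
			}
			// Lookup failed or same tailnet: fall through and flag it.
		}
		return other.Name
	}
	return ""
}

// demoLookup is a toy in-memory lookup used only by this example.
func demoLookup(name string) (proxyGroup, error) {
	groups := map[string]proxyGroup{
		"pg-a": {Name: "pg-a", Tailnet: "primary"},
		"pg-b": {Name: "pg-b", Tailnet: "secondary"},
	}
	pg, ok := groups[name]
	if !ok {
		return proxyGroup{}, errors.New("ProxyGroup not found")
	}
	return pg, nil
}

func main() {
	cur := proxyGroup{Name: "pg-a", Tailnet: "primary"}
	svc := service{Name: "web", Hostname: "app", ProxyGroup: "pg-a"}
	others := []service{
		{Name: "legacy", Hostname: "app"},                       // single-proxy
		{Name: "web-b", Hostname: "app", ProxyGroup: "pg-b"},    // other tailnet
		{Name: "web-x", Hostname: "app", ProxyGroup: "pg-gone"}, // lookup fails
	}
	fmt.Printf("duplicate: %q\n", findDuplicate(svc, cur, others, demoLookup))
}
```

Only the Service whose ProxyGroup cannot be looked up is reported, which
matches the "flag the collision on lookup failure" rule above.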
Adds regression tests covering both the single-proxy and the
different-tailnet cases. Updates the existing TestValidateService
expected error to reflect the rephrased message.
Updates #20069
Signed-off-by: tsushanth <78000697+tsushanth@users.noreply.github.com>
When recommending an exit node, suggestExitNodeLocked ranks candidates by
the latency to their home DERP region, taken from the most recent netcheck
report. But netcheck alternates between full reports, which probe every
region, and incremental reports, which only re-probe the home region and a
handful of the fastest regions. When the most recent report is incremental,
the suggestion fell back to a random choice for exit nodes that are far away.
Now we rank candidates against the best recent latency, tracked by the
`netcheck.Client` - the same data that is used to pick the preferred
DERP. It uses a history of measurements which includes a full netcheck
report, so it should cover all DERP regions.
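
As a rough sketch of the idea (not the actual `netcheck.Client` code), the
best recent latency per region can be taken as the minimum across a history
of reports. The bestRecent, rankRegions, and sampleHistory helpers below are
hypothetical names invented for this illustration.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// bestRecent returns, per DERP region ID, the lowest latency seen across a
// history of reports. A region missing from an incremental report can still
// be covered by an earlier full report.
func bestRecent(history []map[int]time.Duration) map[int]time.Duration {
	best := make(map[int]time.Duration)
	for _, report := range history {
		for region, d := range report {
			if prev, ok := best[region]; !ok || d < prev {
				best[region] = d
			}
		}
	}
	return best
}

// rankRegions orders candidate regions from lowest to highest known
// latency. Regions with no measurement at all sort last.
func rankRegions(candidates []int, best map[int]time.Duration) []int {
	out := append([]int(nil), candidates...)
	sort.SliceStable(out, func(i, j int) bool {
		di, okI := best[out[i]]
		dj, okJ := best[out[j]]
		if okI != okJ {
			return okI
		}
		return okI && di < dj
	})
	return out
}

// sampleHistory is made-up data: one full report and one incremental one.
func sampleHistory() []map[int]time.Duration {
	return []map[int]time.Duration{
		{1: 20 * time.Millisecond, 2: 180 * time.Millisecond, 3: 95 * time.Millisecond},
		{1: 22 * time.Millisecond},
	}
}

func main() {
	best := bestRecent(sampleHistory())
	fmt.Println(rankRegions([]int{2, 4, 3, 1}, best))
}
```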
Updates tailscale/corp#17516
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
Add support for configuring egress to destinations reachable via 4via6
subnet routes, using either the synthesized 4via6 address or the MagicDNS
name (in the form <IPv4-with-hyphens>-via-<siteID>[.*]).
Also update the Connector to validate and advertise 4via6 subnet routes.
Export net/netutil.ValidateViaPrefix so it can be reused by the Connector
validation logic.
This change only affects standalone egress proxies — ProxyGroup egress
requires IPv6 support before it can use 4via6.
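
For illustration, the MagicDNS name form above can be parsed and mapped to a
synthesized address along these lines. This sketch assumes the 4via6 layout
from Tailscale's public documentation (fd7a:115c:a1e0:b1a::/64, then the
site ID, then the IPv4 address); parseViaName and viaAddr are hypothetical
helpers, not the operator's code.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
	"strconv"
	"strings"
)

// viaPrefix is the 4via6 range, assumed from Tailscale's public docs.
var viaPrefix = netip.MustParsePrefix("fd7a:115c:a1e0:b1a::/64")

// parseViaName splits "<IPv4-with-hyphens>-via-<siteID>[.*]" into its
// IPv4 address and site ID.
func parseViaName(name string) (netip.Addr, uint32, error) {
	label, _, _ := strings.Cut(name, ".") // drop any DNS suffix
	v4Part, sitePart, ok := strings.Cut(label, "-via-")
	if !ok {
		return netip.Addr{}, 0, fmt.Errorf("%q is not a 4via6 name", name)
	}
	v4, err := netip.ParseAddr(strings.ReplaceAll(v4Part, "-", "."))
	if err != nil || !v4.Is4() {
		return netip.Addr{}, 0, fmt.Errorf("bad IPv4 part in %q", name)
	}
	site, err := strconv.ParseUint(sitePart, 10, 32)
	if err != nil {
		return netip.Addr{}, 0, fmt.Errorf("bad site ID in %q", name)
	}
	return v4, uint32(site), nil
}

// viaAddr places the site ID in bytes 8..11 and the IPv4 address in
// bytes 12..15 of the 4via6 prefix.
func viaAddr(v4 netip.Addr, site uint32) netip.Addr {
	a := viaPrefix.Addr().As16()
	binary.BigEndian.PutUint32(a[8:12], site)
	b := v4.As4()
	copy(a[12:], b[:])
	return netip.AddrFrom16(a)
}

func main() {
	v4, site, err := parseViaName("10-1-1-16-via-7")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(viaAddr(v4, site)) // fd7a:115c:a1e0:b1a:0:7:a01:110
}
```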
Updates #19334
Change-Id: I6faecd6eb61ab55fc0cd97fe417af6b6a12fe7fc
Signed-off-by: Becky Pauley <becky@tailscale.com>
The Logger previously took a *netmap.NetworkMap at Startup and on every
ReconfigNetworkMap call, denormalizing it into per-IP and self lookup
maps. That denormalization is O(n) over all peers and ran on every
netmap update, contributing to the broader quadratic behavior we want
to eliminate when a single peer is added or removed.
Instead, this makes netlog ask LocalBackend (well, nodeBackend) for
the info it needs, letting us remove the netmap.NetworkMap type
entirely from the netlog package.
This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.
Updates #12542
Change-Id: Ib5f2de96e788a667332c0a6f7ac833b3d0053b5c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
util/def: add def.Bool and def.Duration default parse helpers
Replace multiple duplicated copies of the Bool and Duration default-parsing
helpers with a new util/def package.
Updates #20018
Co-authored-by: Bobby <boby@codelabs.co.id>
Co-authored-by: Simon Law <sfllaw@tailscale.com>
Signed-off-by: Bobby <boby@codelabs.co.id>
Signed-off-by: Simon Law <sfllaw@tailscale.com>
tailscale serve set-config now also accepts the legacy raw ipn.ServeConfig
format (as emitted by `tailscale serve status --json` and consumed via
TS_SERVE_CONFIG, which has no "version" field), so the common
serve-status-edit-set workflow stops failing. Only the services-oriented
content is applied; any node-level fields are skipped with a warning to
stderr pointing users at get-config to migrate.
Fixes tailscale/corp#39793
Signed-off-by: Brendan Creane <bcreane@gmail.com>
Bumps wireguard-go pin to include the roaming endpoints fix, and
two internal enhancements.
Pulls stock wireguard-go for non-tailscale simulation in tests,
to use its endpoint discovery mechanism.
Updates #20082
Change-Id: I2ff282cb7fe4ab099ce5e780a1d40ae86a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
When running under the macOS sandbox, "tailscale configure kubeconfig"
refused outright whenever $KUBECONFIG was set, assuming the path would
not be writable. Yet when $KUBECONFIG was unset it happily relied on the
home-relative-path entitlement to write to ~/.kube/config, so the two
paths made inconsistent assumptions about what the sandbox can reach.
Resolve the kubeconfig path first, then check whether the target file
(or the nearest existing parent directory) is actually writable. Only
report an error if it is not, and include macOS sandbox guidance in that
error since a path outside the home directory is the likely cause. This
lets a $KUBECONFIG that does point under the home directory work, rather
than being rejected unconditionally.
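
The "nearest existing parent directory" check might look roughly like this.
It is a sketch under assumptions: nearestExisting and checkWritable are
hypothetical helper names, and probing by creating a temporary file is one
possible way to test writability, not necessarily what the CLI does.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// nearestExisting returns path itself if it exists, otherwise the closest
// ancestor that does.
func nearestExisting(path string) (string, error) {
	p := filepath.Clean(path)
	for {
		_, err := os.Stat(p)
		if err == nil {
			return p, nil
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return "", err
		}
		parent := filepath.Dir(p)
		if parent == p {
			return "", fmt.Errorf("no existing ancestor of %q", path)
		}
		p = parent
	}
}

// checkWritable reports an error if neither the target file nor its
// nearest existing parent directory can be written to.
func checkWritable(path string) error {
	target, err := nearestExisting(path)
	if err != nil {
		return err
	}
	info, err := os.Stat(target)
	if err != nil {
		return err
	}
	if info.IsDir() {
		// Probe the directory by creating and removing a temporary file.
		f, err := os.CreateTemp(target, ".write-probe-*")
		if err != nil {
			return fmt.Errorf("%s is not writable: %w", target, err)
		}
		f.Close()
		return os.Remove(f.Name())
	}
	f, err := os.OpenFile(target, os.O_WRONLY, 0)
	if err != nil {
		return fmt.Errorf("%s is not writable: %w", target, err)
	}
	return f.Close()
}

func main() {
	dir, err := os.MkdirTemp("", "kubeconfig-demo")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)
	fmt.Println(checkWritable(filepath.Join(dir, ".kube", "config")))
}
```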
Fixes #20007
Change-Id: I9880363c38b981efaed7e97367851ddacf647be1
Signed-off-by: James Tucker <james@tailscale.com>
* cmd/k8s-operator: rework [unexpected] log lines
This commit modifies several places in the operator logs where we
prepend `[unexpected]` to instead use an appropriate logging level.
The `[unexpected]` prefix is intended to be used when the program
violates some internal invariant (or for example, a database has
become corrupted). Many of these cases were simply log lines that
then fell back to a default value/behaviour. These have been releveled
to warnings.
Some of these log lines also seemed extraneous, for example service
reconcilers logging when there is no proxy group annotation. As
far as I can tell we've never had any predicates for limiting the
services reconciled to ones with that annotation, so they can just
be removed to reduce log spam.
Fixes: #cleanup
Signed-off-by: David Bond <davidsbond93@gmail.com>
* Update cmd/k8s-operator/egress-services-readiness.go
Co-authored-by: BeckyPauley <64131207+BeckyPauley@users.noreply.github.com>
Signed-off-by: David Bond <davidsbond@users.noreply.github.com>
* Update cmd/k8s-operator/operator.go
Co-authored-by: BeckyPauley <64131207+BeckyPauley@users.noreply.github.com>
Signed-off-by: David Bond <davidsbond@users.noreply.github.com>
---------
Signed-off-by: David Bond <davidsbond93@gmail.com>
Signed-off-by: David Bond <davidsbond@users.noreply.github.com>
Co-authored-by: BeckyPauley <64131207+BeckyPauley@users.noreply.github.com>
Prevent tailscale ssh from automatically adding a username when
connecting to a server, only forward one if provided. The previous
behaviour prevented username overrides in the ssh configuration, since
the provided username takes precedence over the configured one.
This also keeps tailscale ssh a thin wrapper around ssh by not
adding any extra arguments unless required.
Fixes #19357
Signed-off-by: Örjan Fors <o@42mm.org>
This commit modifies the reconciler for the `Tailnet` custom resource
to allow referenced secrets to specify an `audience` field. If a
referenced secret contains both an `audience` and `client_id` we assume
the user's intention is to use workload identity.
In that case, we configure the tailscale API client to authenticate
using the Kubernetes token request API against the operator's service
account. This requires the operator to be aware of its own service
account name.
A small change has also been made to the messages added to the `Tailnet`
CRD's status field in the event that it is missing scopes to make it
clearer that certain scopes may not be applied.
Closes: #19090
Updates: #19471
Signed-off-by: David Bond <davidsbond93@gmail.com>
To avoid breaking downstream code, add deprecated aliases for all the
old names.
Updates tailscale/corp#37904
Change-Id: I86d0b0d7da371946440b181c665448f91c3ef8d2
Signed-off-by: Alex Chan <alexc@tailscale.com>
tsnet depends on logpolicy, which in turn depended on util/syspolicy
because of a single LogTarget policy setting it uses.
In this commit, we replace that dependency with a feature.Hook,
which only tailscaled or its platform-specific alternatives should set.
Updates #20031
Signed-off-by: Nick Khyl <nickk@tailscale.com>
We don't need to log if the policy doesn't actually say that hardware
attestation must be enabled.
Updates #cleanup
Signed-off-by: Andrew Lytvynov <awly@tailscale.com>
This patch adds examples for unmarshalling the JSON outputs of the
following commands:
    tailscale dns query --json
    tailscale dns status --json
It also adds an example usage of `tailscale dns` to both
jsonoutput.DNSQueryResult and jsonoutput.DNSStatusResult.
Updates #13326
Updates #18750
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This adds support for Gokrazy GAF (Gokrazy Archive Format) zip
auto-updates, starting to wire up Tailscale's clientupdate mechanism
to Gokrazy's update mechanism.
Currently there's just a CLI command to update from a GAF URL,
with an --unsigned flag for use in a new natlab vmtest.
Next step would be publishing unstable track GAF files on
pkgs.tailscale.com, with detached signatures, and then making the
clientupdate mechanism also download those and check signatures.
Updates #20002
Change-Id: Ib03c56f17a57f8a4638398ef83549dac4813323d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
SchemaVersion didn’t actually parse boolean values properly, so
calling `tailscale lock status --json=false` would fail with:
    invalid boolean value "false" for -json: invalid integer value passed to --json: "false"
This patch makes SchemaVersion.Set delegate to flag.FlagSet for its
argument parsing, with accompanying tests that ensure that both
boolean and integer values are parsed properly.
It also removes the restriction that prevented the flag from appearing
multiple times in the arguments list. Now, the final flag clobbers all
previous ones, aligning this behaviour with the standard flag package.
We also change the SchemaVersion.String output for the zero value to
"false", so that the default help message doesn’t change when we
switch other commands over from their boolean representations:
    user@host:~$ tailscale whoami --help
    FLAGS
      --json, --json=false
            output in JSON format (default false)
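
A flag.Value with this behaviour can be sketched as below. Note the
differences from the patch: this sketch parses with strconv directly rather
than delegating to flag.FlagSet, the lowercase schemaVersion type is a
stand-in, and treating "true" as version 1 is an assumption made for the
example.

```go
package main

import (
	"flag"
	"fmt"
	"strconv"
)

// schemaVersion is a stand-in for the real type; the IsSet and Version
// field names follow the jsonoutput package.
type schemaVersion struct {
	IsSet   bool
	Version int
}

var _ flag.Value = (*schemaVersion)(nil)

// String returns "false" for the zero value so help output stays stable.
func (v *schemaVersion) String() string {
	if v == nil || !v.IsSet {
		return "false"
	}
	return strconv.Itoa(v.Version)
}

// Set accepts boolean values first, then positive integers.
func (v *schemaVersion) Set(s string) error {
	if b, err := strconv.ParseBool(s); err == nil {
		*v = schemaVersion{IsSet: b}
		if b {
			v.Version = 1 // assumption: bare --json means version 1
		}
		return nil
	}
	n, err := strconv.ParseInt(s, 10, 0)
	if err != nil || n < 1 {
		return fmt.Errorf("invalid boolean or integer value %q", s)
	}
	*v = schemaVersion{IsSet: true, Version: int(n)}
	return nil
}

// IsBoolFlag lets the flag package accept a bare --json with no value.
func (v *schemaVersion) IsBoolFlag() bool { return true }

func main() {
	var v schemaVersion
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	fs.Var(&v, "json", "output in JSON format")
	// The final occurrence clobbers the earlier one.
	if err := fs.Parse([]string{"--json=false", "--json=2"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(v.IsSet, v.Version) // true 2
}
```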
Updates #17613
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Since we don’t think anyone has actually imported the jsonoutput
package yet, we still have a chance to rename its fundamental types:
1. Rename the JSONSchemaVersion struct to SchemaVersion because
it is a flag.Value that can represent any schema version.
2. Rename the JSONSchemaVersion.Value field to SchemaVersion.Version
so the struct reads better:
    if args.json.IsSet && args.json.Version == 1 {
        // ...
    }
Updates #17613
Signed-off-by: Simon Law <sfllaw@tailscale.com>
This patch:
1. Removes hardcoded mentions of a `--json` flag from the
documentation for JSONSchemaVersion, because the type could be used
for anything.
2. Removes `code` formatting because Go doc comments don’t support
this syntax.
3. Fixes [links] in doc comments so they link to the types’
online documentation.
4. Checks that JSONSchemaVersion satisfies the flag.Value interface.
5. Adds documentation examples for using both JSONSchemaVersion and
ResponseEnvelope.
Updates #17613
Updates #18750
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Per chat with an SSH client author who wanted a URL in the output.
And then make it more clear what parts are banners.
Updates #cleanup
Change-Id: If5033ad9dc0dba3d833f24ea39e117a455010492
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
cros-garcon NULL-derefs on cold-boot netlink enumeration when
tailscale0 is present, preventing the Crostini container and
ChromeOS Terminal from starting cleanly. This is an upstream
ChromiumOS bug in cros-garcon; tailscaled can work around it
by defaulting to userspace-networking mode on Crostini.
Tailscale SSH continues to work via tailscaled's netstack.
Users can override with --tun=tailscale0 on ChromeOS builds
where cros-garcon is fixed.
Crostini is detected via /opt/google/cros-containers/bin/garcon,
which is present in every Crostini penguin container.
ssh/tailssh extends the existing Debian default-PATH case to
cover Crostini, since Crostini is Debian-based and benefits
from the same SSH PATH defaults.
RELNOTE: Crostini now defaults to userspace-networking.
Fixes #19488
Updates #12090
Signed-off-by: ferrumclaudepilgrim <ferrumclaudepilgrim@users.noreply.github.com>
The Tailscale daemon only refreshed TLS certs as a side effect of inbound
TLS handshakes or "tailscale cert" CLI calls. A node that doesn't see
inbound traffic during the renewal window silently rolls past expiry.
Add a once-per-hour background loop on LocalBackend that enumerates Serve
and Funnel HTTPS hostnames (filtered against the netmap's CertDomains so
we don't poke ACME for other nodes' service hostnames) and calls the
existing GetCertPEM path. The renewal decision (ARI window, then 2/3
expiry fallback) is unchanged; the loop just guarantees it runs.
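
One pass of such a loop could be sketched as follows. The renewOnce and
runLoop functions and the getCert callback are hypothetical; the callback
stands in for the existing GetCertPEM path, and hostnames are matched
against CertDomains by simple equality for the sake of the example.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// renewOnce calls getCert for every hostname that also appears in
// certDomains and returns the hostnames it attempted.
func renewOnce(hostnames, certDomains []string, getCert func(string) error) []string {
	allowed := make(map[string]bool, len(certDomains))
	for _, d := range certDomains {
		allowed[d] = true
	}
	var tried []string
	for _, h := range hostnames {
		if !allowed[h] {
			continue // another node's service hostname: don't poke ACME
		}
		tried = append(tried, h)
		if err := getCert(h); err != nil {
			fmt.Printf("cert refresh for %s failed: %v\n", h, err)
		}
	}
	return tried
}

// runLoop invokes tick on every interval until ctx is cancelled.
func runLoop(ctx context.Context, every time.Duration, tick func()) {
	t := time.NewTicker(every)
	defer t.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-t.C:
			tick()
		}
	}
}

func main() {
	hosts := []string{"node.example.ts.net", "svc.example.ts.net"}
	domains := []string{"node.example.ts.net"}
	// A real caller would wrap this in runLoop(ctx, time.Hour, ...).
	fmt.Println(renewOnce(hosts, domains, func(string) error { return nil }))
}
```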
For visibility during initial issuance or restart with a long-expired
cached cert, add a "tls-cert-pending" health Warnable that's set while
ACME is in flight and no usable cached cert exists. Async renewal of a
still-valid cert intentionally doesn't fire it. And then make the CLI "cert"
subcommand print out a warning if it's blocking due to a cert fetch
in flight, using that health info.
Fixes #19911
Fixes #19912
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I144e46c40e957b2e879587decace32a523a6eade
When running `tailscale netcheck`, the reported timestamp used to be
in UTC and formatted according to RFC 3339 with a `T` to separate the
date from the time:
    sfllaw@h2co3:~$ tailscale netcheck | head -n3
    Report:
    * Time: 2026-06-01T21:12:32.252620138Z
This is machine-readable time leaking out to the user interface. Times
in normal commands are formatted for humans to read:
    sfllaw@h2co3:~$ date
    Mon 01 Jun 2026 02:39:14 PM PDT
    sfllaw@h2co3:~$ journalctl -t tailscaled | tail -n1
    Jun 01 14:35:21 h2co3 tailscaled[3328921]: wgengine: sending TSMP disco key advertisement to 100.90.144.102
    sfllaw@h2co3:~$ timedatectl show
    Timezone=America/Los_Angeles
    LocalRTC=no
    CanNTP=yes
    NTP=yes
    NTPSynchronized=yes
    TimeUSec=Mon 2026-06-01 14:38:32 PDT
    RTCTimeUSec=Mon 2026-06-01 14:38:32 PDT
    sfllaw@h2co3:~$ uptime --since
    2026-05-15 07:37:45
This PR makes the times printed by the CLI commands consistent:
- For `tailscale routecheck`, it now prints local time as
`2026-05-15 07:37:45-07:00`.
- For `netlogfmt`, it has always printed local time with a space,
but now includes the time zone.
- All machine-readable outputs continue to be standard RFC 3339 in
UTC, i.e. `--format=json`.
As part of a general cleanup, this PR also adds standard common
time.Format layouts as tstime constants.
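
The two styles can be illustrated with plain time.Format layouts. This is a
sketch only: humanLayout, humanTime, machineTime, and at are names made up
for the example and are not the tstime constants this PR adds.

```go
package main

import (
	"fmt"
	"time"
)

// humanLayout separates date and time with a space and appends the zone
// offset, e.g. "2026-05-15 07:37:45-07:00".
const humanLayout = "2006-01-02 15:04:05-07:00"

// humanTime formats t in its own location; callers pass a local time.
func humanTime(t time.Time) string { return t.Format(humanLayout) }

// machineTime formats t as RFC 3339 in UTC for machine-readable output.
func machineTime(t time.Time) string { return t.UTC().Format(time.RFC3339Nano) }

// at builds a time from Unix seconds in a fixed zone, for deterministic
// examples.
func at(unixSec int64, offsetSec int) time.Time {
	return time.Unix(unixSec, 0).In(time.FixedZone("", offsetSec))
}

func main() {
	t := at(1778855865, -7*3600)
	fmt.Println(humanTime(t))   // 2026-05-15 07:37:45-07:00
	fmt.Println(machineTime(t)) // 2026-05-15T14:37:45Z
	fmt.Println(humanTime(time.Now()))
}
```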
Fixes #19928
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Several packages built their HTTP transports with
    http.DefaultTransport.(*http.Transport).Clone()
The standard library only documents http.DefaultTransport as an
http.RoundTripper, so an application is free to replace it with a
RoundTripper that is not a *http.Transport (e.g. an instrumented or
tracing wrapper). When such an application embeds tsnet.Server, the
unchecked type assertion panics as soon as tsnet brings up its control
connection, DNS bootstrap, or log uploader.
Add netutil.NewDefaultTransport, which returns a clone of the global
when it is still the standard *http.Transport (preserving existing
behavior) and otherwise returns a fresh transport mirroring the stdlib
defaults. Route every clone site through it.
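
A minimal sketch of the approach, not the actual netutil code: the
transportFrom helper and tracingRT wrapper are hypothetical, and the
fallback values are copied from the defaults documented for net/http's
DefaultTransport.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// transportFrom clones rt when it is a *http.Transport and otherwise
// builds a fresh transport with stdlib-like defaults.
func transportFrom(rt http.RoundTripper) *http.Transport {
	if t, ok := rt.(*http.Transport); ok {
		return t.Clone()
	}
	return &http.Transport{
		Proxy: http.ProxyFromEnvironment,
		DialContext: (&net.Dialer{
			Timeout:   30 * time.Second,
			KeepAlive: 30 * time.Second,
		}).DialContext,
		ForceAttemptHTTP2:     true,
		MaxIdleConns:          100,
		IdleConnTimeout:       90 * time.Second,
		TLSHandshakeTimeout:   10 * time.Second,
		ExpectContinueTimeout: 1 * time.Second,
	}
}

// newDefaultTransport never panics, even if an application has replaced
// http.DefaultTransport with a non-*http.Transport RoundTripper.
func newDefaultTransport() *http.Transport {
	return transportFrom(http.DefaultTransport)
}

// tracingRT is a toy wrapper of the kind an application might install.
type tracingRT struct{ next http.RoundTripper }

func (t tracingRT) RoundTrip(r *http.Request) (*http.Response, error) {
	return t.next.RoundTrip(r)
}

func main() {
	fmt.Println(newDefaultTransport() != nil)
	wrapped := tracingRT{next: http.DefaultTransport}
	fmt.Println(transportFrom(wrapped).MaxIdleConns)
}
```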
Updates #19937
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Signed-off-by: Achille Roussel <achille.roussel@gmail.com>
This adds @alexwlchan's proposed "tailscale get" command that reads
current preference values, complementing "tailscale set". It uses the
same flag names as set.
    tailscale get                # show all settings as a table
    tailscale get all            # same
    tailscale get accept-dns     # show a single value
    tailscale get --json         # output as JSON object
    tailscale get --set-flags    # output as tailscale set argv
Fixes #11389
Fixes tailscale/corp#38702
Change-Id: Ie366f27f11ccc56c76fff9a94ed8a9de9c835bd0
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Introduce a new `tailscale routecheck` command which prints a report
of high-availability routers that are reachable.
This command rhymes with the `tailscale netcheck` command, but
instead of reporting on local network conditions, `routecheck` reports
on remote connectivity.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Previously, testwrapper only retried tests explicitly annotated with
flakytest.Mark. Authors don't pre-emptively mark tests that haven't
flaked yet, so the first flake of a brand-new test failed CI even
when a re-run would have passed.
testwrapper now retries every failing test within a per-test wall-clock
budget (default: 5 minute per-attempt timeout capped at 1.5x the first
failure duration, 10 minute total). A test that fails and then passes
on retry is reported as flaky; a test that never passes within the
budget remains a real failure (exit non-zero).
For flakeapp's existing log scraping, the wire format is preserved:
the "flakytest failures JSON:" line is now emitted only for tests
that ultimately flaked (passed on retry). Unmarked tests get a fake
issue URL of the form https://github.com/{owner}/{repo}/issues/UNKNOWN
where owner/repo is detected from GITHUB_REPOSITORY, the local git
remote, or falls back to tailscale/tailscale. A new "permanent test
failures JSON:" line is emitted for tests that never passed; flakeapp
ignores it for now (a follow-up can teach it to record real failures
separately).
flakytest.Mark stays as an opt-in API: still useful for tracking a
known-flaky test against a real issue and for TS_SKIP_FLAKY_TESTS.
Updates tailscale/corp#38960
Change-Id: I56dfc9b023486d239f60793a53e9690578ce8017
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In order to support a `tailscale routecheck` command, we introduce the
`/localapi/v0/routecheck` endpoint to the local API. This endpoint
returns the most recent report collected by the routecheck client.
If `force=true` is an argument in the query string, then this endpoint
will actively probe before returning the report.
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The routecheck package parallels the netcheck package, where the
former checks routes and routers while the latter checks networks.
Like netcheck, it compiles reports for other systems to consume.
Historically, the client has never known whether a peer is actually
reachable. Most of the time this doesn’t matter, since the client will
want to establish a WireGuard tunnel to any given destination.
However, if the client needs to choose between two or more nodes,
then it should try to choose a node that it can reach.
Suggested exit nodes are one such example, where the client filters
out any nodes that aren’t connected to the control plane. Sometimes an
exit node will get disconnected from the control plane: when the
network between the two is unreliable or when the exit node is too
busy to keep its control connection alive. In these cases, Control
disables the Node.Online flag for the exit node and broadcasts this
across the tailnet. Arguably, the client should never have relied on
this flag, since it only makes sense in the admin console.
This patch implements an initial routecheck client that can probe
every node that your client knows about. You should not ping-scan your
visible tailnet; this method is for debugging only.
This patch also introduces a new OnNetMapToggle hook, which fires when
the netmap transitions from nil to non-nil, or vice versa. This
happens either when the client receives its first MapResponse after
connecting to the control plane, or when it clears the netmap while it
is disconnecting. Routecheck uses this to wait for a valid netmap
so it knows which peers to probe.
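
The toggle semantics can be sketched like this. The netMap and
toggleTracker types are hypothetical and only illustrate "fire on nil to
non-nil transitions and back"; they are not the real hook plumbing.

```go
package main

import "fmt"

// netMap is a placeholder for the real network map type.
type netMap struct{}

// toggleTracker fires onToggle only when the netmap changes between nil
// and non-nil, not on every update.
type toggleTracker struct {
	had      bool
	onToggle func(present bool)
}

// set records the latest netmap and fires the hook on a transition.
func (t *toggleTracker) set(nm *netMap) {
	has := nm != nil
	if has == t.had {
		return
	}
	t.had = has
	if t.onToggle != nil {
		t.onToggle(has)
	}
}

func main() {
	tr := &toggleTracker{onToggle: func(present bool) {
		fmt.Println("netmap present:", present)
	}}
	tr.set(&netMap{}) // first MapResponse: fires with true
	tr.set(&netMap{}) // ordinary update: no event
	tr.set(nil)       // disconnecting: fires with false
}
```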
Updates #17366
Updates tailscale/corp#33033
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Add a "tailscale whoami" subcommand that is equivalent to running
"tailscale whois $(tailscale ip -4)" but more ergonomic. It supports
the --json flag just like whois, and shares the WhoIsResponse
rendering code with whois.
Fixes #19907
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I8f33ba7a5608bab7dffa8213303beb5f345936d3
When parsing the `tailscale up --exit-node=ARG` argument, we try to
resolve hostnames by searching the list of peers. However, at startup,
the peer list is empty, causing hostname lookups to trivially fail with
an unhelpful "invalid value" error.
Improve the error message when the peer list is empty to inform the user
that hostnames cannot be resolved during startup, and advise them to use
the exit node's Tailscale IP address instead.
Also, clarify that hostnames must be peer hostnames, not arbitrary
hostnames.
Fixes #19882
Change-Id: I9390a427c2863d657cf46c5e33b43cb3c5363764
Signed-off-by: Alex Chan <alexc@tailscale.com>
Single-pod ingress/egress proxies already called ClampMSSToPMTU when
setting up forwarding rules, but the proxy group (HA) code paths in
egressservices.go and ingressservices.go did not. This caused TCP
connections through proxy group pods to suffer from MSS/MTU mismatch
issues in environments where path MTU discovery is not working.
Add ClampMSSToPMTU calls in the egress sync loop (alongside the existing
EnsureSNATForDst call) and in addDNATRuleForSvc (alongside the existing
EnsureDNATRuleForSvc call), mirroring what the single-pod forwarding
rules already do.
Also add MSS clamping assertions to TestSyncIngressConfigs and track
ClampMSSToPMTU calls in FakeNetfilterRunner.
Fixes issue #19812 (https://github.com/tailscale/tailscale/issues/19812).
Tracking internal ticket TSS-86326.
Signed-off-by: Jay Tung <ltung@crusoeenergy.com>
Co-authored-by: Jay Tung <ltung@crusoeenergy.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Parallel subtests share *ipn.Notify pointers (e.g. runningNotify).
When multiple subtests reached the same phase concurrently, they
all wrote to the shared notify's InitialStatus field without
synchronization, triggering the race detector.
Fix by shallow-copying *ipn.Notify before setting InitialStatus,
so each test iteration works on its own copy.
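
The fix pattern, sketched with a hypothetical notify struct standing in for
ipn.Notify (the Version field and the string-typed InitialStatus are
invented for the example):

```go
package main

import "fmt"

// notify is a simplified stand-in for a shared notification value.
type notify struct {
	Version       string
	InitialStatus *string
}

// withInitialStatus shallow-copies n before setting InitialStatus, so the
// shared original is never written to.
func withInitialStatus(n *notify, status string) *notify {
	c := *n // copies the struct; pointer fields still alias the original's
	c.InitialStatus = &status
	return &c
}

func main() {
	shared := &notify{Version: "v1"}
	a := withInitialStatus(shared, "phase-a")
	b := withInitialStatus(shared, "phase-b")
	fmt.Println(shared.InitialStatus == nil, *a.InitialStatus, *b.InitialStatus)
}
```

Because each caller writes only to its own copy, concurrent subtests no
longer race on the shared pointer.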
Updates #19380
Change-Id: I9dd40037e02146166f006f4f7c1ddcc47adba191
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Previously, sharding required tests to opt in by calling tstest.Shard,
which used a process-global counter to assign each test to a shard.
This had two problems: most tests didn't call it, so they ran on every
shard (defeating the purpose), and shard assignments were unstable
(depended on call order, so adding a test could reshuffle others).
Remove tstest.Shard and tstest.SkipOnUnshardedCI entirely. Instead,
have testwrapper implement sharding automatically for all tests: when
TS_TEST_SHARD=N/M is set, it uses "go list -json" (no compilation) to
find test source files, scans them for top-level Test/Benchmark/
Example/Fuzz function names, and filters by fnv32a(name) % M == N-1.
The filtered names are passed as an anchored -run regex to go test.
Using go list instead of "go test -list" avoids linking the test binary
twice (Go's build cache does not cache test binary linking).
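
The shard filter and the anchored -run pattern can be sketched as below.
The parseShard, inShard, filterShard, and runRegex names are hypothetical;
only the fnv32a(name) % M == N-1 rule comes from the description above.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"regexp"
	"strings"
)

// parseShard parses a TS_TEST_SHARD-style "N/M" value.
func parseShard(s string) (n, m uint32, err error) {
	if _, err := fmt.Sscanf(s, "%d/%d", &n, &m); err != nil {
		return 0, 0, fmt.Errorf("bad shard %q: %w", s, err)
	}
	if m == 0 || n < 1 || n > m {
		return 0, 0, fmt.Errorf("bad shard %q: want 1 <= N <= M", s)
	}
	return n, m, nil
}

// inShard reports whether name belongs to shard n of m (n is 1-based).
func inShard(name string, n, m uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(name))
	return h.Sum32()%m == n-1
}

// filterShard keeps only the names assigned to shard n of m.
func filterShard(names []string, n, m uint32) []string {
	var out []string
	for _, name := range names {
		if inShard(name, n, m) {
			out = append(out, name)
		}
	}
	return out
}

// runRegex builds an anchored pattern matching exactly the given names.
func runRegex(names []string) string {
	quoted := make([]string, len(names))
	for i, name := range names {
		quoted[i] = regexp.QuoteMeta(name)
	}
	return "^(" + strings.Join(quoted, "|") + ")$"
}

func main() {
	names := []string{"TestDial", "TestListen", "BenchmarkSend", "FuzzParse"}
	n, m, err := parseShard("1/2")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("go test -run", runRegex(filterShard(names, n, m)))
}
```

Since the assignment depends only on the function name, adding a new test
does not move existing tests to a different shard.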
Fixes #19886
Change-Id: I62ab7b3d757324d4c5fd0b5de50c1e3742681791
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Some tests in another repo were broken by tailscale/tailscale#19607.
This fixes them, by finishing off the rest of the migration away from
netmap.NetworkMap on the IPN bus in containerboot.
Containerboot used to rebuild a full NetworkMap-shaped view while
reacting to IPN bus notifications. Now it instead has its own
netmapState type (immutable) of exactly what it needs to track, and
sends those immutable values around, making cheap edits of new
immutable values when an IPN bus edit arrives.
This should make cmd/containerboot scale to much larger tailnets now too.
Fixes #19852
Fixes tailscale/corp#42347
Updates #12542
Change-Id: I88adaf061f85f677f954a764935e6654329d75a6
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Commit e5a8cf3b1 added feature/runtimemetrics, which emits heap bytes
and total process memory as clientmetrics when the
NodeAttrEmitRuntimeMetrics capability is set. That subsumes the job of
the TS_DEBUG_MEMORY envknob, whose only effect is to prefix every log
line with Go heap+stack and Maxrss via logger.RusagePrefixLog.
Updates tailscale/corp#39434
Signed-off-by: Jordan Whited <jordan@tailscale.com>
tailscale-client-go-v2 natively supports identity federation authentication,
and #19010 started using the required authentication provider, but the manual
token exchange was never removed. As a result, we were exchanging the JWT for
an auth token and then trying to exchange that auth token once again.
This commit removes the legacy mechanism, fully relying on
tailscale-client-go-v2 to handle authentication.
Fixes #19844
Signed-off-by: Artem Leshchev <matshch@avride.ai>
findStaticEndpoints built its return slice by iterating nodes.Items in
the order returned by r.List, which is not guaranteed to be stable
across calls. When the resulting set of addresses already matched the
existing config Secret, the slice could still permute between
reconciles, making the marshalled config Secret differ byte-for-byte.
That tripped the DeepEqual check on the config Secret, which rewrote
the Secret, which fired a watch event, which re-enqueued the
ProxyGroup, looping forever.
Detect this case and return the existing currAddrs slice unchanged
when the resulting set is the same, preserving the "use the currently
used IPs first" intent without spurious writes.
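
A sketch of the set comparison, using plain strings in place of the real
address type; sameSet and stableAddrs are hypothetical helper names.

```go
package main

import "fmt"

// sameSet reports whether a and b contain the same elements, ignoring
// order.
func sameSet(a, b []string) bool {
	if len(a) != len(b) {
		return false
	}
	seen := make(map[string]int, len(a))
	for _, s := range a {
		seen[s]++
	}
	for _, s := range b {
		if seen[s] == 0 {
			return false
		}
		seen[s]--
	}
	return true
}

// stableAddrs returns curr unchanged when next holds the same set, so the
// marshalled config stays byte-for-byte identical between reconciles.
func stableAddrs(curr, next []string) []string {
	if sameSet(curr, next) {
		return curr
	}
	return next
}

func main() {
	curr := []string{"10.0.0.1:3478", "10.0.0.2:3478"}
	permuted := []string{"10.0.0.2:3478", "10.0.0.1:3478"}
	fmt.Println(stableAddrs(curr, permuted)) // keeps curr's order
}
```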
Fixes #19700
Signed-off-by: Jason Dillingham <jasonmdillingham@gmail.com>