The watchdog (ipn/ipnlocal/watchdog.go) was abusing PeerForIP with an
invalid netip.Addr as a way to acquire and release the engine's
internal locks for deadlock detection. This does the TODO to break it out
into its own method like all the other similarly named methods.
Splitting this out as a prerequisite for a follow-up rewrite of
PeerForIP itself; not having to preserve the lock-probe overload in
the new implementation keeps that follow-up smaller.
Updates #12542
Updates #cleanup
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I25cbffd11aeb65600d9128845404c4918ef88ead
Detect Hetzner via /sys/class/dmi/id/sys_vendor == "Hetzner" and wire
up Hetzner's public recursive DNS resolvers (185.12.64.1, 185.12.64.2)
for use as a cloud host resolver.
Fixes #20217
Change-Id: I24a4c51956adfdd5731f62c937e3c7a4a733ffc7
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
This applies the same treatment from PR #20162 (netlog) and
PR #20171 (wglog) to the local Taildrive filesystem wiring, ending the
per-netmap-update O(n) rebuild of the drive remotes list.
This moves the O(n peers) taildrive-remote list rebuild from every
peer change (which previously happened regardless of whether you were
even using taildrive) to instead happen only as needed.
That rebuild ran on every netmap update and was a contributor to the
broader quadratic behavior we want to eliminate when a single peer is
added or removed.
Instead, this introduces drive.RemoteSource, a small interface the
Taildrive filesystem pulls from lazily on incoming WebDAV requests,
and caches by a generation counter. ipn/ipnlocal installs a
driveRemoteSource once at NewLocalBackend time and bumps
LocalBackend.driveGen on the three events that can actually flip the
drive-capable peer set: full netmap installs (domain + self caps),
UpdateNetmapDelta (peer add/remove or per-peer address changes), and
updatePacketFilter (since PeerCapability values are derived from the
packet filter rules, not from peer.CapMap).
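The generation-counter caching described above can be sketched roughly
as follows. This is a minimal illustration, not the code in this
change: remoteSource, cachedRemotes, and fakeSource are hypothetical
names, and the real drive.RemoteSource interface may look different.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// remoteSource is a hypothetical stand-in for a source the filesystem
// pulls from lazily. Generation changes whenever the remotes may have
// changed.
type remoteSource interface {
	Generation() uint64
	Remotes() []string
}

// cachedRemotes rebuilds its list only when the generation it last saw
// differs from the source's current generation.
type cachedRemotes struct {
	src remoteSource

	mu      sync.Mutex
	valid   bool
	gen     uint64
	remotes []string
}

func (c *cachedRemotes) get() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	// Read the generation before rebuilding. If it is bumped during the
	// rebuild, the next call sees a newer generation and rebuilds again.
	if g := c.src.Generation(); !c.valid || g != c.gen {
		c.remotes = c.src.Remotes()
		c.gen = g
		c.valid = true
	}
	return c.remotes
}

// fakeSource counts how often the O(n) rebuild actually runs.
type fakeSource struct {
	gen      atomic.Uint64
	rebuilds int
	peers    []string
}

func (s *fakeSource) Generation() uint64 { return s.gen.Load() }

func (s *fakeSource) Remotes() []string {
	s.rebuilds++
	return append([]string(nil), s.peers...)
}

func main() {
	src := &fakeSource{peers: []string{"a"}}
	c := &cachedRemotes{src: src}
	c.get()
	c.get() // served from cache; no rebuild

	src.peers = append(src.peers, "b")
	src.gen.Add(1) // e.g. a peer was added
	got := c.get()
	fmt.Println(got, src.rebuilds) // [a b] 2
}
```

The point of the design is that callers who never issue a request
never pay for a rebuild, and bumping the counter is O(1).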
The hook itself is kept but narrowed: it no longer takes a
*netmap.NetworkMap and its only remaining job is to re-notify IPN bus
listeners of the current local shares list on full installs.
This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.
(Also add a bunch more tests)
Updates #12542
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I7e3d2f5b4a9c8e1d6f0a3b7c9e2d4f8a1b6c5e9d
This applies the same treatment from 8f210454dd (netlog) to wglog,
ending use of netmap.NetworkMap and instead getting the canonical data
from LocalBackend/nodeBackend.
This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.
Updates #12542
Change-Id: Icb5af0799322def048a6f594b49f7d11273f025d
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
The test transferred only 64 KiB over loopback, which can complete
within a single clock tick on fast CI machines, causing
time.Since(start).Seconds() to return 0 and the
"transfer_time_seconds_total > 0" assertion to fail.
Increase the payload to 1 MiB so zero is genuinely implausible, and
retry up to 3 additional times. If the metric is still zero after 4
total attempts, fail hard — at that size it means the timing logic is
actually broken.
Fixes #20213
Change-Id: I3fab510ce8c567506fea5ad803d35acf40d65700
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
aa5da2e5f2 (in the 1.99.x dev series, unstable) introduced some bugs,
only some of which were later fixed. This fixes another. As of that
change, tkaFilterNetmapLocked ran only on full netmaps through
LocalBackend.setClientStatusLocked and not on peer upserts for new or
changed peers. The later ae743642d9 fixed a regression in the
Engine layer but didn't make the tkaFilter code re-run on
upserts.
This adds a tkaFilterDeltaMutsLocked pass before
nodeBackend.UpdateNetmapDelta. For each NodeMutationUpsert whose
peer fails the same signature check tkaFilterNetmapLocked applies,
rewrite the upsert in place into a NodeMutationRemove targeting the
same node ID, so magicsock's per-mutation dispatch and
nodeBackend.peers both drop the peer, matching the prior full-netmap
semantics.
New tsnet tests added:
- TestTailnetLockFiltersUnsignedDeltaPeer covers the new-peer
case.
- TestTailnetLockFiltersUnsignedDeltaPeerReplacement covers the
existing-peer-replacement case, to an empty signature.
- TestTailnetLockFiltersDeltaPeerWithInvalidSignature is like the
above but with a bogus signature.
Updates #12542
Updates tailscale/corp#43767
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: Ib35d0391541fee654867c26489847dbc5b7e2ae8
The ProxyGroup HA Service reconciler's validateService scanned every
Service in the cluster with shouldExpose=true for duplicate hostnames.
With multi-tailnet (Tailnet CRD) support, that scan reaches across
tailnet boundaries:
* A Service exposed via the single-proxy path (tailscale.com/expose)
on the primary tailnet would block a ProxyGroup ingress Service
for the same hostname on a secondary tailnet, even though the two
live in different reconcilers and different tailnet DNS namespaces.
* Two ProxyGroups joined to different tailnets via spec.tailnet
would also block one another for shared hostnames, again despite
living in separate DNS namespaces.
In both cases the ProxyGroup ingress Service was silently dropped
(IngressSvcInvalid event raised, queue cleared, ConfigMap never
written, ProxyGroup never serves the backend).
This change tightens the check in two ways:
* Skip Services that aren't themselves managed by the ProxyGroup
reconciler (use isTailscaleService instead of shouldExpose).
* For ProxyGroup-managed Services attached to a different
ProxyGroup, look up that ProxyGroup and skip the duplicate
report when spec.Tailnet differs from the current one. Fall
through and flag the collision on lookup failure so genuine
duplicates are not silently allowed.
Adds regression tests covering both the single-proxy and the
different-tailnet cases. Updates the existing TestValidateService
expected error to reflect the rephrased message.
Updates #20069
Signed-off-by: tsushanth <78000697+tsushanth@users.noreply.github.com>
Both tests started flaking after my 910735448 ("tstest/natlab/vnet:
send unsolicited IPv6 Router Advertisements") added background RA
traffic on v6-enabled networks.
TestPacketSideEffects races the periodic unsolicited-RA goroutine
against its synchronous packet-count assertions: when the multicast
RA fires after the test has registered its sinks, both sinks receive
it and "got 1 packet, want N" becomes "got N+2".
TestProtocolQEMU's reader was doing raw Read on the SOCK_STREAM unix
socket and comparing the whole result to the expected length-prefixed
packet. The kernel is free to coalesce the on-register RA frame and
the test packet into one Read, in which case bytes.Equal fails and
the entire chunk (including the test packet's bytes) gets discarded
as "unexpected", leading to a 5s i/o timeout. Parse the QEMU uint32
length-prefix framing with io.ReadFull instead so we read exactly one
frame per iteration regardless of how the kernel buffers them. The
SOCK_DGRAM path (TestProtocolUnixDgram) keeps the original raw Read
since datagram boundaries are preserved.
These were the top two flakes in oss on the flakes dashboards.
Updates #13038
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I32983656b692921a0f43a4a5e9a8a6ab2555ee49
Outbound packets produced by netstack (used by tailscaled with
--tun userspace-networking, by tsnet, and by the SOCKS5/HTTP proxies)
enter the wrapper via InjectOutbound{,PacketBuffer} and take the
injectedRead path, which bypasses Filter.RunOut.
RunOut's side effect for UDP/SCTP is to insert the reverse-flow tuple
into the connection-tracking LRU so that Filter.RunIn admits inbound
replies that no explicit ACL rule covers. Skipping it on the injected
path meant a netstack-side dial of UDP would send fine but the reply
would be dropped as "no matching rule". The kernel-TUN path was
already fine because it goes through RunOut.
Fixes #14229
Fixes #20064
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
Change-Id: I816ef55c493a12ff4f561cd89c095559b5c2743b
Fix leaking peers that failed to complete the handshake.
Updates #20183
Change-Id: I84f7ea0484f05b090d963a7d12c135a66a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
suggestExitNodeLocked now ranks exit node candidates using the per-region
latency tracked by the netcheck Client (RecentRegionLatency), which merges
the reports retained in c.prev. That history is only useful for far-away
regions if it contains a full netcheck report, since incremental reports
only re-probe the home region and a handful of the fastest ones.
The full-report cadence in GetReport and the c.prev retention window were
two independent 5-min constants. The way we schedule netchecks ensured
that the history always contained a full report, but it was not a strong
contract and we did not have any checks around this.
Now the full-report interval and the retention window are driven by the
same var, and a test confirms that the history contains a full report.
Updates tailscale/corp#17516
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
When recommending an exit node, suggestExitNodeLocked ranks candidates by
the latency to their home DERP region, taken from the most recent netcheck
report. But netcheck alternates between full reports, which probe every
region, and incremental reports, which only re-probe the home region and a
handful of the fastest regions. When the most recent report was incremental,
the suggestion fell back to a random choice for exit nodes that are far away.
Now we rank candidates against the best recent latency, tracked by the
`netcheck.Client` - the same data that is used to pick the preferred
DERP. It uses a history of measurements which includes a full netcheck
report, so it should cover all DERP regions.
Updates tailscale/corp#17516
Signed-off-by: Anton Tolchanov <anton@tailscale.com>
ClampMSSToPMTU only added a rule matching the output interface (-o tun /
OIFNAME), which clamps the SYN forwarded out towards the tailnet peer but
not the SYN-ACK that arrives on tun and is forwarded back towards the
originating endpoint. As a result only one side of a forwarded handshake
had its MSS clamped; the endpoint on the other side of the proxy kept
advertising an MSS based on its own (larger) MTU.
When path MTU discovery is broken (e.g. proxies created by the Tailscale
Kubernetes operator, where tailscale0 has a 1280 MTU), the unclamped
endpoint's large segments exceed the tun MTU and are silently dropped,
causing TCP connections through proxy group pods to stall mid-stream on
large payloads. The earlier proxy-group fix (#19686) wired ClampMSSToPMTU
into the HA code paths but inherited this single-direction limitation, so
connections could still hang.
Add a second rule matching the input interface (-i tun / IIFNAME) in both
the iptables and nftables runners so both directions of the forwarded
handshake negotiate a PMTU-safe MSS.
Updates #19812
Signed-off-by: Samy Djemaï <53857555+SamyDjemai@users.noreply.github.com>
Add HTTPBandwidth/HTTPBandwidthWithDialAddr probe classes that download a
fixed number of bytes and record transfer time and bytes transferred as
Prometheus counters for bandwidth measurement, plus HTTPWithDialAddr and
the shared NewProbeTransport and HTTPBandwidthMetrics helpers.
The dial-address override lets a probe target a specific backend (e.g. a
single Funnel ingress node) while SNI, the Host header, and TLS cert
validation continue to derive from the URL host. HTTPBandwidthMetrics is
exported so other bandwidth probes (e.g. a receiver-reported upload probe)
emit an identical metric set and compare under a shared direction label.
Updates tailscale/corp#41587
Signed-off-by: Mike O'Driscoll <mikeo@tailscale.com>
If we don't close the connection between SSH server and recorder
explicitly once it's idle after the upload stream is closed, the
connection stays open and holds on to a port on the server. This
leads to port exhaustion on the server in the medium to long run.
To avoid this, close the idle connections explicitly. As an extra
step of precaution, set an idleConnTimeout of 30 seconds on both
the HTTP1 and HTTP2 recorder clients.
Updates tailscale/corp#43742
Signed-off-by: Gesa Stupperich <gesa@tailscale.com>
In direct mode we write resolv.conf via a temp file and rename(2), which
preserves the source's generic etc_t label instead of net_conf_t, causing
AVC denials when NetworkManager later manages the file. Run restorecon
after the rename (Linux, SELinux-enforcing, best effort) to restore the
policy-default label.
Fixes #20149
Signed-off-by: Brendan Creane <bcreane@gmail.com>
This patch adds support for the fmt.Stringer interface to the
ipn.NotifyWatchOpt enum. This is useful when debugging these bitmasks.
For example:
fmt.Printf("%s", ipn.NotifyPeerChanges | ipn.NotifyNoNetMap)
// Output: (ipn.NotifyPeerChanges | ipn.NotifyNoNetMap)
Fixes #20066
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Add support for configuring egress to destinations reachable via 4via6
subnet routes, using either the synthesized 4via6 address or the MagicDNS
name (in the form <IPv4-with-hyphens>-via-<siteID>[.*]).
Also update the Connector to validate and advertise 4via6 subnet routes.
Export net/netutil.ValidateViaPrefix so it can be reused by the Connector
validation logic.
This change only affects standalone egress proxies — ProxyGroup egress
requires IPv6 support before it can use 4via6.
Updates #19334
Change-Id: I6faecd6eb61ab55fc0cd97fe417af6b6a12fe7fc
Signed-off-by: Becky Pauley <becky@tailscale.com>
This patch adds:
- Set.All which returns an iter.Seq to complement Set.Slice.
- Set.AddSeq which adds an iter.Seq.
- Set.DeleteSeq which deletes an iter.Seq to complement Set.AddSeq
and provide the missing method for deleting multiple elements.
- Set.DeleteSlice and Set.DeleteSet to complement AddSlice and AddSet.
Updates #cleanup
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The Logger previously took a *netmap.NetworkMap at Startup and on every
ReconfigNetworkMap call, denormalizing it into per-IP and self lookup
maps. That denormalization is O(n) over all peers and ran on every
netmap update, contributing to the broader quadratic behavior we want
to eliminate when a single peer is added or removed.
Instead, this makes netlog ask LocalBackend (well, nodeBackend) for
the info it needs, letting us remove the netmap.NetworkMap type
entirely from the netlog package.
This is a prerequisite for removing the netmap.NetworkMap type from
upstream callers, like wgengine.Engine in general.
Updates #12542
Change-Id: Ib5f2de96e788a667332c0a6f7ac833b3d0053b5c
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
In PR #17809, @bradfitz tried to fix tsnet_test.TestConn by making the
second tailscaled start after the first was fully set up. On slow
runners, the Ping for connectivity to the second server would race
against that server establishing a connection with its DERP home. If
the Ping arrived too soon, the DERP server would respond with
PeerGoneNotHome and the Ping would wait for its full timeout before
failing the test.
This patch introduces waitForHomeDERPConnected and makes startServer
block until the server’s home DERP has established its connection.
This patch also reduces the Ping timeout to 10 seconds for the tsnet
tests, which is short enough that a hung Ping fails quickly during
interactive debugging, but leaves enough headroom for a RekeyTimeout.
Fixes #12766
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Mappings from transit IPs to real IPs are stored ephemerally in the
connector, so they're lost on restart. When we send a packet to the
connector with a transit IP it does not recognize, it sends us a TSMP
message saying so (see #19883). If we (the client) know of such a
mapping, we now re-send it to the connector so that a connection can
proceed.
Fixes tailscale/corp#34256
Signed-off-by: Naman Sood <mail@nsood.in>
Add support for the still pending encoding.ScalarMarshaler and
encoding.ScalarUnmarshaler interfaces, approved in golang/go#56235.
This patch deprecates geo.Point.MarshalUint64 in favour of
geo.Point.MarshalScalar and also adds an inline directive for go fix.
The same applies for the UnmarshalUint64 and UnmarshalScalar methods.
Updates #16583
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Follow-up cleanups to the IPv6 fragment extension header support added in
the previous commit:
- Document that minFragBlks is sized for IPv4 but intentionally reused by
decode6 for IPv6 fragments, where it is conservative (IPv6 fragments
carry no per-fragment IP header) and only ever rejects more later
fragments as Unknown, never fewer.
- Add a TestDecode case for a first fragment reachable only through a
chained extension header (base Next Header = Hop-by-Hop Options, which
chains to Fragment). decode6 only parses the Fragment header when it is
the base header's immediate Next Header, so this must classify as
Unknown. The test locks in that scoping decision.
Updates #20083
Updates #20140
Change-Id: Ibece03c6baf2385b0cc399f179819b08cbe921cc
Signed-off-by: James Tucker <james@tailscale.com>
util/def: add def.Bool and def.Duration default parse helpers
Replace multiple instances of def.Bool and def.Duration with a new util/def
package.
Updates #20018
Co-authored-by: Bobby <boby@codelabs.co.id>
Co-authored-by: Simon Law <sfllaw@tailscale.com>
Signed-off-by: Bobby <boby@codelabs.co.id>
Signed-off-by: Simon Law <sfllaw@tailscale.com>
Add UploadLogs, a stateless alternative to NewLogger for callers that
want to push a batch of log entries without the background uploader,
ring buffer, stderr echoing, or network-up gating that a Logger
provides. Entries are encoded, batched up to the server's maximum
upload size, and POSTed synchronously; unlike Logger it does not retry.
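The batching step can be sketched as below. This is an illustration
under simplifying assumptions, not the UploadLogs implementation:
batch is a hypothetical helper, sizes count payload bytes only (no
JSON framing overhead), and an oversized single entry is treated as
an error.

```go
package main

import "fmt"

// batch groups already-encoded entries, in order, into batches whose
// combined payload size does not exceed maxBytes.
func batch(entries [][]byte, maxBytes int) ([][][]byte, error) {
	var out [][][]byte
	var cur [][]byte
	size := 0
	for i, e := range entries {
		if len(e) > maxBytes {
			return nil, fmt.Errorf("entry %d is %d bytes, over the %d-byte limit", i, len(e), maxBytes)
		}
		// Flush the current batch if this entry would not fit in it.
		if size+len(e) > maxBytes {
			out = append(out, cur)
			cur, size = nil, 0
		}
		cur = append(cur, e)
		size += len(e)
	}
	if len(cur) > 0 {
		out = append(out, cur)
	}
	return out, nil
}

func main() {
	entries := [][]byte{[]byte("aaaa"), []byte("bbbb"), []byte("cc")}
	batches, err := batch(entries, 8)
	if err != nil {
		panic(err)
	}
	for i, b := range batches {
		fmt.Printf("batch %d: %d entries\n", i, len(b))
	}
}
```

Each resulting batch would then be sent as one synchronous request;
with no retry, the first failed request ends the upload.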
The Logger construction is split into a new unexported newLogger so the
connection/encode/upload machinery is shared without starting the
background goroutine.
Log entries are modeled as a generic LogEntry[T] whose Value is inlined
(via go-json-experiment) alongside the reserved "logtail" metadata
member. T may be a struct (or pointer), a map with a string key, or a
jsontext.Value; use jsontext.Value to mix differently-shaped payloads in
a single upload. UploadLogs fills in client_time/proc_id/proc_seq from
the Config where the caller leaves them zero.
Updates tailscale/corp#40908
Change-Id: Idbf23cd0eb8233082fbdb9abed0f6f153b9225ba
Signed-off-by: James Scott <jim@tailscale.com>
ipnlocal.LocalBackend.populatePeerStatusLocked assumed that Hostinfo
was always valid, but that’s not always true, especially in tests.
ipnlocal.peerAPIPorts suffered from a similar assumption.
This patch checks NodeView.Valid and Hostinfo.Valid, using the
zero value as a safe default.
Updates #8948
Updates #12542
Signed-off-by: Simon Law <sfllaw@tailscale.com>
The earlier aa5da2e5f2 made peer adds and removes go through a netmap
delta path that mutates only nodeBackend, on the assumption that
PeerForIP, lookupPeerByIP, the engine's wireguard config
(e.lastCfgFull), the engine BART, wgdev's PeerLookupFunc closure, and
the engine's cached netmap (e.netMap) would all stay correct without
further updates. They don't. I'd totally forgotten that
Engine.PeerForIP has its own alternate IP-to-peer lookup codepath.
Concretely, all of these failed for a peer that arrived via
[tailcfg.MapResponse.PeersChanged] (and never via a full
[tailcfg.MapResponse.Peers] list):
- [wgengine.Engine.PeerForIP] read from e.netMap and e.lastCfgFull
(neither updated on the delta path) and so missed the new
peer. The rando non-data-plane callers (Ping, TSMP, pendopen,
debug endpoints, tsdial.Dialer.UseNetstackForIP for tsnet and
onlyNetstack tailscaled) all returned "no matching peer".
- The engine BART (built from e.lastCfgFull) missed the new peer's
subnet routes / exit-node default routes.
- wgdev's [device.PeerLookupFunc] closure (rebuilt only inside
wgcfg.ReconfigDevice) didn't have the new peer's noise key, so
outbound encryption to the new peer dropped the packet even when
SetPeerByIPPacketFunc returned the right NodePublic.
- And nothing in the delta path triggered NodeMutationRemove to
flow through to authReconfig either, so the same stale state
pointed at removed peers indefinitely.
So just (functionally) revert it for now, to have something easily
cherry-pickable to the 1.100 release branch. Proper fixes can come later
for the next release.
This also adds three new tests:
- TestPingPeerLearnedViaDelta runs disco and TSMP subtests over a
delta-added peer with only self addresses. disco exercises the
cold PeerForIP path (magicsock); TSMP exercises the full data path
through wgdev encryption. Both fail without this fix.
- TestPingSubnetRouteOfDeltaPeer exercises a subnet-router peer
arriving via delta. With s1 in --accept-routes mode, an IP
inside the advertised CIDR must resolve to s2 and a TSMP ping
must round-trip. Hits the BART + lastCfgFull + wgdev staleness
in one go.
- TestPingSelfReturnsIsLocalIP is a regression guard for the
IsSelf early-out in Engine.Ping. Passes on main today; included
here so future refactors of PeerForIP can't regress self
handling without test breakage.
Updates tailscale/corp#43394
Change-Id: I7a049271359bd73e7147ae9e2554e85614c2b8d2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
decode6 didn't parse the IPv6 Fragment extension header (Next Header 44),
so any source-fragmented IPv6 packet was classified as an unknown protocol
and matched no ACL rule. The filter then silently dropped it and counted it
as an "acl" drop, even on allow-all tailnets, blackholing large UDP (DNS,
WebRTC, etc.) over a tailnet's IPv6 addresses. IPv4 fragments were already
handled by decode4.
Parse the fragment header the same way: read the first fragment's transport
ports so the filter matches it like an unfragmented packet, pass later
fragments through as ipproto.Fragment, and reject overlapping-fragment
offsets (RFC 1858) and first fragments too short to hold the transport
header as unknown.
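For reference, a minimal sketch of decoding the 8-byte IPv6 Fragment
extension header as laid out in RFC 8200. It only illustrates the
field layout; fragHeader and parseFragHeader are hypothetical names
and this is not the decode6 code, which additionally reads transport
ports and applies the offset checks described above.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fragHeader holds the fields of an IPv6 Fragment extension header.
type fragHeader struct {
	NextHeader uint8  // protocol of the fragmented payload, e.g. 17 for UDP
	Offset     uint16 // fragment offset in 8-byte units (13 bits)
	More       bool   // M flag: more fragments follow
	ID         uint32 // identification shared by all fragments of a packet
}

// parseFragHeader decodes b, which must start at the Fragment header.
// It reports false if b is shorter than the fixed 8-byte header.
func parseFragHeader(b []byte) (fragHeader, bool) {
	if len(b) < 8 {
		return fragHeader{}, false
	}
	// Bytes 2-3: 13-bit offset, 2 reserved bits, then the M flag.
	v := binary.BigEndian.Uint16(b[2:4])
	return fragHeader{
		NextHeader: b[0],
		Offset:     v >> 3,
		More:       v&1 != 0,
		ID:         binary.BigEndian.Uint32(b[4:8]),
	}, true
}

// isFirst reports whether this is the first fragment, the only one
// that carries the transport header.
func (h fragHeader) isFirst() bool { return h.Offset == 0 }

func main() {
	first, _ := parseFragHeader([]byte{17, 0, 0x00, 0x01, 0, 0, 0, 42})
	later, _ := parseFragHeader([]byte{17, 0, 0x00, 0xb8, 0, 0, 0, 42})
	fmt.Println(first.isFirst(), first.More) // true true
	fmt.Println(later.isFirst(), later.Offset) // false 23
}
```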
Fixes #20083
Signed-off-by: Steve Avery <hello@stevenavery.com>
Added in #20111, but it is too noisy under real load to be useful.
Updates #12542
Change-Id: Ib99a8966ade0bfa4281fccc057249819cdcdfe83
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
`go run` builds a manifest-less .exe, so Windows applies
installer-detection heuristics and requests admin privileges for
programs whose names contain "install", "setup", or "update". Rename
to dodge that.
Updates #20133
Change-Id: I144d3fcb076d7a02e4a3eb9fd079ee022a035c76
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
Add a workflow that requests review from @tailscale/k8s-devs on PRs
touching Kubernetes operator, kube libraries, container build, etc.
Also cleans up the checkout code in the k8s and dataplane workflows.
Updates #cleanup
Change-Id: I6fd7cacf71e1299f7e8f546ef52c4063fbf6bab8
Signed-off-by: Fernando Serboncini <fserb@tailscale.com>
tailscale serve set-config now also accepts the legacy raw ipn.ServeConfig
format (as emitted by `tailscale serve status --json` and consumed via
TS_SERVE_CONFIG, which has no "version" field), so the common
serve-status-edit-set workflow stops failing. Only the services-oriented
content is applied; any node-level fields are skipped with a warning to
stderr pointing users at get-config to migrate.
Fixes tailscale/corp#39793
Signed-off-by: Brendan Creane <bcreane@gmail.com>
Bumps wireguard-go pin to include the roaming endpoints fix, and
two internal enhancements.
Pulls stock wireguard-go for non-tailscale simulation in tests,
to use its endpoint discovery mechanism.
Updates #20082
Change-Id: I2ff282cb7fe4ab099ce5e780a1d40ae86a6a6964
Signed-off-by: Alex Valiushko <alexvaliushko@tailscale.com>
Package features/conn25 wires up the hooks directly on the tun wrapper
without needing to go through the userspace engine, so this codepath is
unused and not needed.
Updates #cleanup
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
Add an UpdatePeers method to the cache. This allows us to support netmap peer deltas,
by allowing just the peers to be updated in an existing cache. As a safety check, reject
an update if there was no base netmap data to apply a change to.
Then, when processing peer mutations in the backend, capture any changes that should
be applied to the cache and update it, if one is enabled.
Updates #12542
Change-Id: I2f8790a8fdc5e85fce6700ba4821a8cb10dddffa
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Since deltas are only (at present) received from the control plane, processing
a delta signifies we are no longer operating on a netmap fully loaded from
cache, even if most of the netmap is still in the same configuration.
Updates #12542
Change-Id: I84132c4bf2dde6e5c1c57144645edb986b051dca
Signed-off-by: M. J. Fromberger <fromberger@tailscale.com>
Flakeytest seems to not work on vmtest. We have a few PRs that will fix
the problem on these tests, so skip to unblock.
Updates #19843
Signed-off-by: Claus Lensbøl <claus@tailscale.com>
The hook fires when a flow is removed for any reason (LRU capacity eviction,
tuple-collision displacement, or idle-time expiry). The hook is invoked
exactly once per flow, after the flow table mutex is released, so callbacks
may safely acquire other locks.
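The "invoke the hook after releasing the mutex" pattern can be
sketched as follows. This is a simplified illustration, not the flow
table in this change; flowTable, onRemoved, and the string keys are
hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// flowTable is a toy table whose removal hook runs outside the lock.
type flowTable struct {
	mu        sync.Mutex
	flows     map[string]struct{}
	onRemoved func(key string) // called once per removed flow, unlocked
}

// remove deletes key and, if it was present, calls the hook after the
// mutex has been released.
func (t *flowTable) remove(key string) {
	t.mu.Lock()
	_, ok := t.flows[key]
	delete(t.flows, key)
	hook := t.onRemoved
	t.mu.Unlock()

	// Only the caller that actually found the flow fires the hook, so
	// it runs at most once per flow even with concurrent removers.
	if ok && hook != nil {
		hook(key)
	}
}

// size takes the table lock; it is safe to call from the hook because
// the hook runs after remove has unlocked.
func (t *flowTable) size() int {
	t.mu.Lock()
	defer t.mu.Unlock()
	return len(t.flows)
}

func main() {
	ft := &flowTable{flows: map[string]struct{}{"a": {}, "b": {}}}
	calls := 0
	ft.onRemoved = func(key string) {
		calls++
		fmt.Println("removed", key, "remaining", ft.size())
	}
	ft.remove("a")
	ft.remove("a") // already gone: hook does not fire again
	fmt.Println("hook calls:", calls) // hook calls: 1
}
```

Had the hook been called while holding the mutex, the call to size
inside it would deadlock, since sync.Mutex is not reentrant.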
We rename the IPMapper interface to Conn25Datapath, and add
ClientFlowCreated/ClientFlowRemoved methods so *Conn25 can keep client-side
address assignments alive while traffic is in flight. Those methods are
currently stubbed for future work.
Connector flows do not currently call these methods.
Updates tailscale/corp#38630
Updates tailscale/corp#43180
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
The returned error in the signature is left over from previous
implementations and was only returning nil.
If we know NewFlow will succeed we can fire a create hook (implemented
in a future commit) before NewFlow, which will prevent a remove hook for
a flow from firing before the create hook for the same flow.
Updates tailscale/corp#38630
Signed-off-by: Michael Ben-Ami <mzb@tailscale.com>
Adds tailscaled_serve_{inbound,outbound}_bytes_total, labeled by Tailscale
Service name, by wrapping the peer-facing conn in tcpHandlerForVIPService.
Per-service counters persist for the process lifetime rather than being
evicted on serve-config changes.
Fixes #19572
Signed-off-by: Raj Singh <raj@tailscale.com>
Co-authored-by: Ethan Smith <ethan.smith@grafana.com>
When running under the macOS sandbox, "tailscale configure kubeconfig"
refused outright whenever $KUBECONFIG was set, assuming the path would
not be writable. Yet when $KUBECONFIG was unset it happily relied on the
home-relative-path entitlement to write to ~/.kube/config, so the two
paths made inconsistent assumptions about what the sandbox can reach.
Resolve the kubeconfig path first, then check whether the target file
(or the nearest existing parent directory) is actually writable. Only
report an error if it is not, and include macOS sandbox guidance in that
error since a path outside the home directory is the likely cause. This
lets a $KUBECONFIG that does point under the home directory work, rather
than being rejected unconditionally.
Fixes #20007
Change-Id: I9880363c38b981efaed7e97367851ddacf647be1
Signed-off-by: James Tucker <james@tailscale.com>