Cover the non-RCE behaviour of the new JSON cache:
- round-trip: written file is valid JSON, re-read produces equivalent dict
- legacy pickle: a pre-fix pickle cache is treated as a cache miss, not
  a crash (upgrade path)
- expiry: caches older than 7 days are invalidated
- version skew: caches written by a different installed version are
  invalidated
- first run: a missing file is not an error
The version-check cache at $XDG_CACHE_HOME/glances/glances-version.db is
read at every Glances startup via pickle.load() — an execution-capable
deserialization format. Any process able to write that path (local user
on a shared host, sibling container on a shared volume, symlink race
during first run) could plant a pickle whose __reduce__ runs arbitrary
code as the Glances user, including root in typical deployments.
Switch to json for both load and save. The stored payload is trivial:
two strings plus a timestamp that round-trips via isoformat(). Any
unreadable, malformed, or legacy-pickle file is caught by the existing
exception handler and treated as a cache miss — the next PyPI refresh
overwrites it with a JSON file. No user-visible behaviour change.
The pickle module is removed from the imports — it has no other use in
this file.
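A minimal sketch of the JSON load/save described above, for illustration only. The function names, the payload key names and the exception list are assumptions, not the actual glances/outdated.py code; only the payload shape (two strings plus an isoformat() timestamp), the 7-day expiry and the "anything unreadable is a cache miss" rule come from this series.

```python
import json
from datetime import datetime, timedelta
from pathlib import Path
from typing import Optional

MAX_AGE = timedelta(days=7)  # expiry window covered by the tests


def save_cache(path: Path, installed: str, latest: str, refreshed: datetime) -> None:
    """Write the cache as plain JSON; the timestamp is stored as ISO 8601 text."""
    payload = {
        # Key names are hypothetical.
        "installed_version": installed,
        "latest_version": latest,
        "refresh_date": refreshed.isoformat(),
    }
    path.write_text(json.dumps(payload), encoding="utf-8")


def load_cache(path: Path, installed: str, now: datetime) -> Optional[dict]:
    """Return the cached payload, or None (cache miss) for anything unusable."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
        refreshed = datetime.fromisoformat(data["refresh_date"])
        if data["installed_version"] != installed:
            return None  # version skew
        if now - refreshed > MAX_AGE:
            return None  # expired
        return data
    except (OSError, ValueError, KeyError, TypeError):
        # Missing file, legacy pickle bytes, malformed JSON, or wrong shape.
        return None
```

A legacy pickle file lands in the except branch because its bytes are not valid JSON text, so nothing in it is ever executed.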
Mitigates CVE-2026-46607.
Regression test for GHSA-9837-48hr-q32j: glances/outdated.py reads its
version-check cache file via pickle.load(), a deserialization format
that executes arbitrary callables embedded via __reduce__.
The test plants a poisoned pickle at the cache path and asserts that
_load_cache() does NOT trigger the embedded callable. Against the
current (vulnerable) code the test fails: the payload fires during
pickle.load(), before the TypeError is raised on the unrelated dict
subscript.
The fix in the next commit replaces pickle with json, which is a passive
data format.
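To illustrate the difference between the two formats, here is a small standalone sketch (not the regression test itself). It uses a harmless os.mkdir call as a stand-in for an attacker-chosen callable; the class name and marker path are hypothetical.

```python
import json
import os
import pickle
import tempfile

marker = os.path.join(tempfile.mkdtemp(), "payload-fired")


class Poisoned:
    def __reduce__(self):
        # On load, pickle calls os.mkdir(marker): a harmless stand-in
        # for an attacker-chosen callable.
        return (os.mkdir, (marker,))


blob = pickle.dumps(Poisoned())

pickle.loads(blob)  # the embedded call runs here
fired_by_pickle = os.path.isdir(marker)
os.rmdir(marker)

try:
    json.loads(blob.decode("latin-1"))  # same bytes, parsed as JSON
except ValueError:
    pass  # rejected as malformed, which the cache treats as a miss
fired_by_json = os.path.isdir(marker)
```

The pickle load creates the marker directory as a side effect of deserialization, while the JSON parse of the same bytes only raises a decode error.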
The startup warning mirrors the existing REST/WebUI warning style. It
makes unprotected XML-RPC deployments visible to operators without
changing default behaviour (no enforcement).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add opt-in DNS rebinding protection to the XML-RPC server via a new
xmlrpc_allowed_hosts config key in [outputs]. When set, the handler
rejects requests whose Host header does not match any of the listed
patterns (fnmatch wildcards supported). Validation runs before
authentication so spoofed Host values are rejected regardless of
credentials.
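The matching rule can be sketched roughly as follows. This is an illustration under assumptions, not the code in glances/server.py: the function name is hypothetical, and stripping the ":port" suffix and lower-casing before matching are guesses about the intended behaviour. IPv6 literals are not handled here.

```python
from fnmatch import fnmatchcase


def host_allowed(host_header, allowed_patterns):
    """Return True if the Host header matches one of the allowed patterns.

    An empty allowlist means no filtering (the permissive default).
    """
    if not allowed_patterns:
        return True
    if not host_header:
        return False
    # Assumption: compare the hostname only, ignoring an optional ":port".
    host = host_header.strip().lower().rsplit(":", 1)[0]
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowed_patterns)
```

A handler would call such a check before looking at credentials and reject a non-matching request, so a spoofed Host value never reaches the authentication step.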
Default behaviour is unchanged (no allowlist = no filtering). A
startup warning is added in a follow-up commit to make unprotected
deployments visible to operators.
Mitigates CVE-2026-46611.
Adds a second test server bound to a config that enables xmlrpc_allowed_hosts,
plus the failing assertion that a spoofed Host header returns 400. The fix in
glances/server.py follows in the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This test passes on the unpatched server and proves the CVE-2026-46611
vulnerability exists today: a spoofed Host header is accepted.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Re-creates tests/test_xmlrpc.py (deleted symlink) with a pytest module
modelled on test_restful.py: subprocess-launched server and a helper
to POST XML-RPC calls with a controllable Host header. Restores the
existing 'make test-xmlrpc' Makefile target.
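A helper of this kind might look like the sketch below, built on the standard library only. The function names, the "/RPC2" path and the default timeout are assumptions; the real helper in tests/test_xmlrpc.py may differ.

```python
import http.client
import xmlrpc.client


def build_call(method, *params):
    """Serialize an XML-RPC method call to request-body bytes."""
    return xmlrpc.client.dumps(params, methodname=method).encode("utf-8")


def post_xmlrpc(address, port, method, *params, host_header=None, timeout=5):
    """POST an XML-RPC call and return (status, body), overriding Host if asked."""
    body = build_call(method, *params)
    conn = http.client.HTTPConnection(address, port, timeout=timeout)
    try:
        # skip_host=True stops http.client from adding its own Host header.
        conn.putrequest("POST", "/RPC2", skip_host=True)
        conn.putheader("Host", host_header or f"{address}:{port}")
        conn.putheader("Content-Type", "text/xml")
        conn.putheader("Content-Length", str(len(body)))
        conn.endheaders(body)
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()
```

Returning the raw status code rather than going through xmlrpc.client.ServerProxy lets a test assert directly on the HTTP response to a spoofed Host header.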
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Design document for CVE-2026-46611 patch: add opt-in Host header
validation to the XML-RPC server via a new xmlrpc_allowed_hosts
config key, with permissive default and startup warning (mirrors
the REST/WebUI mitigation pattern).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Investigation flagged 88 epoll wakes/sec inside the asyncio scheduler
loop — vs the ~5/sec expected from 10 plugins on a 2s refresh. Root
cause: ``fs.model_v5._grab_stats`` issued one
``await asyncio.to_thread(psutil.disk_usage, mnt)`` per partition.
Snap-heavy hosts routinely expose 70+ mountpoints, producing 70+
thread-pool submissions + completions per cycle, each generating a
loop wake. Coalesce the whole walk inside a single ``to_thread`` so
the asyncio loop sees exactly one wake per fs cycle. Drop:
epoll.poll 701/8s → 138/8s (~88/s → ~17/s)
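The before/after shape of the fs change can be sketched as below. This is a simplified illustration: it uses the standard library's shutil.disk_usage in place of psutil.disk_usage so that it runs without dependencies, and the function names are hypothetical.

```python
import asyncio
import shutil


def _walk_usage(mountpoints):
    """Blocking walk: runs entirely inside one worker thread."""
    usage = {}
    for mnt in mountpoints:
        try:
            usage[mnt] = shutil.disk_usage(mnt)
        except OSError:
            continue  # unreadable mountpoint: skip it
    return usage


async def grab_per_partition(mountpoints):
    """Before: one thread-pool submission (and loop wake) per mountpoint."""
    usage = {}
    for mnt in mountpoints:
        try:
            usage[mnt] = await asyncio.to_thread(shutil.disk_usage, mnt)
        except OSError:
            continue
    return usage


async def grab_coalesced(mountpoints):
    """After: a single submission covers the whole walk."""
    return await asyncio.to_thread(_walk_usage, mountpoints)
```

Both coroutines return the same mapping; the difference is only in how many times the event loop is woken to collect results.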
Two complementary base_v5 wins on top:
1) Hot-loop precomputation at __init__:
   ``_rate_fields``, ``_watched_fields`` and ``_allowed_field_names``
   are cached once. ``_compute_rates_in_dict``, ``_compute_levels_for_item``,
   ``_precompute_plugin_thresholds``, ``_scan_pk_override_fields`` and
   ``_remove_parameters`` now skip the per-call ``.items()`` + flag
   check. processlist with 580 procs × 17 fields × 4 hot loops used to
   incur ~40k pointless lookups per cycle.
2) Skip ``_snapshot_raw`` entirely when no field is a rate counter:
   the snapshot is only consulted by ``_transform_gauge``, which is a
   no-op for plugins without ``rate: True`` fields. processlist (no
   rate fields, 580 items) saves ~580 dict allocations per cycle.
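The two ideas above can be illustrated with a toy class. This is not the real base_v5 code: the class name, the ``transform`` method and the ``_rate`` output suffix are invented for the sketch; only the pattern (cache the rate-field list once, skip the snapshot when it is empty) reflects the change.

```python
class PluginSketch:
    """Illustrative only: not the real base_v5 class."""

    def __init__(self, fields_description):
        self.fields_description = fields_description
        # Computed once instead of scanning .items() on every call.
        self._rate_fields = tuple(
            name for name, spec in fields_description.items() if spec.get("rate")
        )
        self._previous = {}

    def _snapshot_raw(self, item):
        """Copy only the raw counters needed for the next rate computation."""
        return {name: item[name] for name in self._rate_fields}

    def transform(self, item_key, item, elapsed):
        if not self._rate_fields:
            return item  # no rate counters: skip the snapshot entirely
        previous = self._previous.get(item_key)
        self._previous[item_key] = self._snapshot_raw(item)
        out = dict(item)
        for name in self._rate_fields:
            out[name + "_rate"] = (
                None if previous is None else (item[name] - previous[name]) / elapsed
            )
        return out
```

For a plugin with no rate fields the early return means no per-item dict is allocated at all, which is the saving described for processlist.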
10 s scheduler run (10 plugins, 580 procs, 71 fs mountpoints, idle):
  before today:                             4.0% CPU
  after fs fix:                             3.7% CPU
  after pipeline:                           3.1% CPU
  v4 sim (raw psutil + engine, sequential): 2.6% CPU
→ v5↔v4 gap closes to ~0.5pp on a snap-heavy host.
Suite v5: 1370+30 green, lint clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Three orthogonal wins on the v5 hot path that together close most of the
v4↔v5 CPU gap observed when running both side-by-side with the
processlist plugin enabled:
1) alerts.ingest_plugin — skip min_duration lookup on stable observations
For processlist with 580 procs × 2 stable ok-levels = 1160 obs/cycle,
_reconcile already took a fast-path when observed == committed_level
— but only AFTER paying for _min_duration_for (up to 5 config reads
per call). Short-circuit before the lookup; book-keep has_committed
and fall through so the steady-state repeat-action dispatch still
runs for non-ok committed levels.
Saves ~5800 config reads/cycle.
2) TUI thread — Event.wait instead of polling sleep
_sleep_responsive was looping time.sleep(0.05) 20×/s just to check
the stop flag. threading.Event.wait(timeout) blocks until either the
timeout expires or stop() flips it — no polling, no kernel wakeups
between repaints. Same q/ESC latency (bounded by refresh_interval —
getch() is polled at the top of the loop, not during the sleep).
3) TUI refresh interval default = plugin refresh_time
Previously: tui_refresh_interval defaulted to 1.0s while plugin
refresh_time defaulted to 2.0s — the TUI repainted the same data
twice between every update. Default the TUI cadence to
[global] refresh_time. Operators who want a snappier UI can still
override via [outputs] tui_refresh_interval=.
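Items 2 and 3 can be sketched together as below. This is an assumption-laden illustration: it uses the standard library's configparser rather than the Glances config class, the function and class names are hypothetical, and the 2.0 s fallback is the plugin default quoted above.

```python
import configparser
import threading


def resolve_tui_interval(config):
    """Explicit [outputs] override first, else [global] refresh_time."""
    refresh_time = config.getfloat("global", "refresh_time", fallback=2.0)
    return config.getfloat("outputs", "tui_refresh_interval", fallback=refresh_time)


class TuiLoopSketch:
    """Illustrative repaint loop: not the real TUI thread."""

    def __init__(self, refresh_interval):
        self.refresh_interval = refresh_interval
        self._stop = threading.Event()
        self.repaints = 0

    def stop(self):
        self._stop.set()

    def run(self):
        while not self._stop.is_set():
            self.repaints += 1  # stand-in for the real repaint
            # Blocks until the timeout expires or stop() sets the event,
            # with no intermediate wakeups to poll a flag.
            self._stop.wait(timeout=self.refresh_interval)
```

Because Event.wait returns as soon as stop() is called, shutdown stays prompt even with a long refresh interval.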
Profile delta (10 cycles, 580 procs, full multi-plugin pipeline with
alerts ingest, 2s refresh):
  before: 126 ms/cycle
  after:  109 ms/cycle
plus halved TUI wake rate → less idle-CPU between cycles.
Suite v5: 1370+30 green, lint clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
processlist with ~580 processes burned ~80 ms/cycle in
``_compute_levels_for_item`` — 1158 calls to ``read_thresholds_categorical``
+ 9264 ``_parse_csv_tokens`` per cycle, all re-reading identical
plugin-wide configuration for every (item, field) pair.
base_v5._derived_parameters now precomputes plugin-level thresholds
(both numeric and categorical) once per cycle, then layers per-item
``<pk>_<field>_<level>`` overrides only when a section scan reports
such keys exist. The common case (no pk overrides — every plugin
except network in practice) reuses the precomputed mapping with a
single dict lookup per item.
GlancesConfigV5 grows ``section_keys(section)`` so the scanner can
introspect declared keys without forcing the read through
``as_dict()`` (which deep-copies the whole config).
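The precompute-then-override flow might look roughly like this. Everything here is a simplified stand-in: the function names, the three level names and the plain-dict "section" are assumptions made for the sketch, and only numeric thresholds are shown.

```python
LEVELS = ("careful", "warning", "critical")  # assumed level names


def plugin_thresholds(section, fields):
    """Read plugin-wide thresholds once per cycle: {field: {level: value}}."""
    return {
        field: {
            level: float(section[f"{field}_{level}"])
            for level in LEVELS
            if f"{field}_{level}" in section
        }
        for field in fields
    }


def has_pk_overrides(section_keys, fields):
    """Scan declared keys for any <pk>_<field>_<level> override."""
    suffixes = tuple(f"_{field}_{level}" for field in fields for level in LEVELS)
    return any(key.endswith(suffixes) for key in section_keys)


def thresholds_for_item(pk, base, section, pk_overrides_exist):
    """Reuse the precomputed mapping unless overrides exist for this section."""
    if not pk_overrides_exist:
        return base  # common case: shared mapping, no per-item work
    merged = {field: dict(levels) for field, levels in base.items()}
    for field in merged:
        for level in LEVELS:
            key = f"{pk}_{field}_{level}"
            if key in section:
                merged[field][level] = float(section[key])
    return merged
```

The key scan only needs the list of declared keys, which is the role ``section_keys(section)`` plays: it avoids copying the whole config just to check whether any override exists.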
Profile delta (10 cycles, 580 procs):
  ``_transform`` cumtime           91 ms/cycle → 25 ms/cycle (-73%)
  cycle total                      235 ms/cycle → 165 ms/cycle (-30%)
  ``read_thresholds_categorical``  1158 calls → 2 calls per cycle
Suite v5: 1370+30 green, lint clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>