The assertion "We should not be processing a client id twice per update"
can fail when a process has multiple file descriptors referencing the
same DRM client (e.g., via dup(), fork(), or DRM master operations).
The kcmp syscall filters duplicate file descriptions but not distinct
file descriptions that report the same underlying DRM client_id.
This change converts the debug assertion into a runtime check that
gracefully skips duplicate entries and frees any newly allocated
cache entries to prevent memory leaks.
Fixes the crash:
nvtop: ./src/extract_gpuinfo_amdgpu.c:964: parse_drm_fdinfo_amd:
Assertion `!cache_entry_check && "We should not be processing a
client id twice per update"' failed.
Applied to all affected drivers:
- AMDGPU
- Intel i915
- Intel Xe
- Qualcomm MSM (also fixed incorrect hash key usage)
- ARM Mali
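The runtime check described above can be sketched roughly as follows. This is an illustration only, not the nvtop code: `struct client_cache`, its fixed-size array, `cache_seen_or_add()` and `count_unique_clients()` are hypothetical stand-ins for the per-driver hash table lookup, and the sketch leaves out freeing the newly allocated cache entry.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_CLIENTS 64

/* Hypothetical per-update record of the client ids already processed. */
struct client_cache {
  unsigned ids[MAX_CLIENTS];
  size_t count;
};

/* Returns true if client_id was already seen during this update, in which
 * case the caller skips the fdinfo entry. Otherwise records the id (if
 * there is room) and returns false. */
static bool cache_seen_or_add(struct client_cache *cache, unsigned client_id) {
  for (size_t i = 0; i < cache->count; ++i) {
    if (cache->ids[i] == client_id)
      return true; /* duplicate: skip instead of asserting */
  }
  if (cache->count < MAX_CLIENTS)
    cache->ids[cache->count++] = client_id;
  return false;
}

/* Counts how many fdinfo entries would be processed for a list of client
 * ids in which the same id may appear more than once. */
static size_t count_unique_clients(const unsigned *client_ids, size_t n) {
  struct client_cache cache = {.count = 0};
  size_t processed = 0;
  for (size_t i = 0; i < n; ++i) {
    if (cache_seen_or_add(&cache, client_ids[i]))
      continue; /* same DRM client reached through another fd */
    ++processed;
  }
  return processed;
}
```

With this shape, a second file description reporting the same client id is ignored rather than aborting the process.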
The MEM display currently shows only VRAM (which on APUs is a carveout
of system memory). However, amdgpu uses VRAM and GTT interchangeably
on APUs, so the VRAM-only figure isn't really useful for most APU users.
Detect that an APU is in use and add the two together.
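A minimal sketch of the idea, assuming the driver already knows whether the device is an APU. The `struct mem_usage` layout and the two helper names are hypothetical and do not come from the nvtop sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the memory figures reported by amdgpu. */
struct mem_usage {
  uint64_t vram_used;
  uint64_t vram_total;
  uint64_t gtt_used;
  uint64_t gtt_total;
};

/* On an APU, report VRAM + GTT; on a discrete GPU, report VRAM only. */
static uint64_t displayed_mem_total(const struct mem_usage *m, bool is_apu) {
  return is_apu ? m->vram_total + m->gtt_total : m->vram_total;
}

static uint64_t displayed_mem_used(const struct mem_usage *m, bool is_apu) {
  return is_apu ? m->vram_used + m->gtt_used : m->vram_used;
}
```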
Closes: https://github.com/Syllo/nvtop/issues/399
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Introduces a new 'Effective Load' metric, calculated as GPU Load weighted by power consumption (Current Power / Max Power). This provides a more accurate representation of hardware throughput, especially in low P-States.
- Updates 'gpuinfo_dynamic_info' to store effective_load_rate.
- Implements calculation logic in 'extract_gpuinfo.c'.
- Adds 'Eff. Load' display to the device header in the ncurses interface.
- Adds 'Effective load rate' as a selectable metric in the Chart setup menu and handles config persistence.
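The weighting can be sketched as below. The function name, the milliwatt units, the clamping of the power ratio to 1, and the fallback when the maximum power is unknown are all assumptions made for illustration; only the formula (load multiplied by current power over max power) comes from the description above.

```c
#include <stdint.h>

/* Effective load = GPU load * (current power / max power), in percent.
 * Assumption: if the maximum power is unknown (0), fall back to the raw
 * load; if the draw exceeds the maximum, treat the ratio as 1. */
static unsigned effective_load_rate(unsigned gpu_load_percent,
                                    unsigned power_draw_mw,
                                    unsigned power_max_mw) {
  if (power_max_mw == 0)
    return gpu_load_percent;
  if (power_draw_mw > power_max_mw)
    power_draw_mw = power_max_mw;
  /* 64-bit intermediate avoids overflow for large milliwatt values. */
  return (unsigned)((uint64_t)gpu_load_percent * power_draw_mw / power_max_mw);
}
```

For example, a GPU reporting 80% load while drawing 50 W of a 200 W limit would show an effective load of 20%.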
Currently, GPU clocks and GPU memory clocks have
the same field width in the interface,
which is presumed to be in the thousands of MHz.
However, memory clocks reported by the GPU can be
in the tens of thousands range, presumably to
account for memory features such as PAM4 (like on
the RTX 4090). This causes the GPU memory clock
field to be one character short when 5 digit clocks
are reported, cutting the 'z' from MHz.
This commit fixes that by adding a new
device_field for the memory clock that's one char
longer than the device_field for the GPU clocks,
and makes the appropriate changes in usage and
calculations that rely on these values.
Implements comprehensive terminal state handling to fix display
corruption when reconnecting to tmux sessions:
- Add SIGCONT signal handler to detect process resumption
- Add KEY_RESIZE event handling for ncurses-detected state changes
- Both events trigger complete window reinitialization
- Eliminates garbage display issues after tmux disconnect/reconnect
This matches the behavior of well-behaved terminal applications
like htop that properly handle terminal state transitions.
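The SIGCONT part can be sketched as follows, assuming a flag-based handler that the main loop polls. The names `on_sigcont`, `install_sigcont_handler` and `consume_resume_flag` are hypothetical; the real change may structure this differently. In the main loop, a KEY_RESIZE return from ncurses' getch() would take the same reinitialization path.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Set from the signal handler, read and cleared from the main loop. */
static volatile sig_atomic_t resumed = 0;

/* Only sets a flag: redrawing from inside a handler is not async-signal-safe. */
static void on_sigcont(int signum) {
  (void)signum;
  resumed = 1;
}

/* Installs the SIGCONT handler; returns 0 on success, -1 on failure. */
static int install_sigcont_handler(void) {
  struct sigaction sa;
  memset(&sa, 0, sizeof sa);
  sa.sa_handler = on_sigcont;
  sigemptyset(&sa.sa_mask);
  sa.sa_flags = SA_RESTART;
  return sigaction(SIGCONT, &sa, NULL);
}

/* Returns true once per resumption; the caller then reinitializes windows. */
static bool consume_resume_flag(void) {
  if (resumed) {
    resumed = 0;
    return true;
  }
  return false;
}
```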
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
Implements keyboard shortcuts to manually refresh the display:
- F5: Refresh screen and redraw all interface elements
- Ctrl+L: Same as F5, common terminal refresh shortcut
Uses complete window reinitialization to ensure proper redraw
of all elements including window borders and frames.
Fixes #394
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
I originally moved gpuinfo_refresh_utilisation_rate() from Mali code
into src/extract_gpuinfo.c when I realised utilisation rate could be
calculated in a device-independent way simply by following the
percentage utilisation guidelines given in
Documentation/gpu/drm-usage-stats.rst
However, I forgot to replace the magic number '2', which stood for the
engine count in Mali GPUs, with a value that makes sense for different
devices.
Source the engine count from gpu_info's static information values.
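One plausible reading of the calculation, sketched under assumptions: the busy time is the sum of the per-engine drm-engine-* deltas in nanoseconds, and the function and parameter names are hypothetical rather than taken from gpuinfo_refresh_utilisation_rate().

```c
#include <stdint.h>

/* Utilisation in percent: summed engine busy time over the sampling
 * window, divided by (elapsed time * number of engines). The engine
 * count is a parameter instead of the hard-coded '2'. */
static unsigned utilisation_percent(uint64_t busy_delta_ns,
                                    uint64_t elapsed_ns,
                                    unsigned engine_count) {
  if (elapsed_ns == 0 || engine_count == 0)
    return 0;
  uint64_t pct = busy_delta_ns * 100 / (elapsed_ns * engine_count);
  return pct > 100 ? 100 : (unsigned)pct;
}
```

With one second of summed busy time over a one-second window, a two-engine device reads 50% while a four-engine device reads 25%, which is why a fixed divisor of 2 gives wrong results on other hardware.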
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Rockchip is a Chinese fabless semiconductor company based in Fuzhou, Fujian province.
This commit adds the load monitoring capabilities of Rockchip's NPU products to NVTOP.