Introduces a new 'Effective Load' metric, calculated as GPU Load weighted by the power ratio (Current Power / Max Power). This is meant to better reflect actual hardware throughput, especially in low-power P-States, where the raw load figure can look high while the GPU draws little power.
- Updates 'gpuinfo_dynamic_info' to store effective_load_rate.
- Implements calculation logic in 'extract_gpuinfo.c'.
- Adds 'Eff. Load' display to the device header in the ncurses interface.
- Adds 'Effective load rate' as a selectable metric in the Chart setup menu and handles config persistence.
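As a rough illustration of the weighting described above (a minimal sketch, not the code in 'extract_gpuinfo.c'; the function name, the integer percent units, the clamp to 100 and the fallback when the maximum power is unknown are all assumptions):

```c
#include <stdint.h>

/*
 * Hypothetical helper: weight the GPU load (0..100) by the ratio of the
 * current power draw to the maximum power draw. Both power values must use
 * the same unit (for example milliwatts).
 */
static unsigned effective_load_rate(unsigned gpu_load_pct, uint64_t power_draw, uint64_t power_draw_max)
{
  /* Assumption: with no known power limit, fall back to the raw load. */
  if (power_draw_max == 0)
    return gpu_load_pct;

  /* 64-bit arithmetic keeps the intermediate product from overflowing. */
  uint64_t scaled = (uint64_t)gpu_load_pct * power_draw / power_draw_max;

  /* Assumption: clamp in case the draw briefly exceeds the reported limit. */
  return scaled > 100 ? 100u : (unsigned)scaled;
}
```

With this sketch, a GPU reporting 100% load while drawing 50 W out of a 200 W limit would show an effective load of 25%.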
Currently, GPU clocks and GPU memory clocks have
the same field width in the interface,
which is presumed to be in the thousands of MHz.
However, memory clocks reported by the GPU can be
in the tens of thousands range, presumably to
account for memory features such as PAM4 (like on
the RTX 4090). This causes the GPU memory clock
field to be one character short when five-digit clocks
are reported, cutting the 'z' from MHz.
This commit fixes that by adding a new
device_field for the memory clock that's one char
longer than the device_field for the GPU clocks,
and makes the appropriate changes in usage and
calculations that rely on these values.
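The truncation can be reproduced outside of ncurses with a small sketch. The widths, the label text and the helper name below are illustrative assumptions, not the actual device_field values:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Illustrative widths only; the real device_field sizes may differ. */
enum {
  GPU_CLOCK_FIELD_WIDTH = 11, /* fits "GPU 9999MHz" */
  MEM_CLOCK_FIELD_WIDTH = 12, /* fits "MEM 99999MHz" */
};

/*
 * Hypothetical helper: format a clock into a field of `width` characters.
 * `out` must hold width + 1 bytes. Returns false when the text was cut.
 */
static bool format_clock_field(char *out, size_t width, const char *label, unsigned mhz)
{
  /* snprintf returns the length it would have written without the limit. */
  int needed = snprintf(out, width + 1, "%s %uMHz", label, mhz);
  return needed >= 0 && (size_t)needed <= width;
}
```

Formatting a 10501 MHz memory clock into the narrower field leaves "MEM 10501MH", whereas the field that is one character wider keeps the full unit.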
I originally moved gpuinfo_refresh_utilisation_rate() from Mali code
into src/extract_gpuinfo.c when I realised utilisation rate could be
calculated in a device-independent way simply by following the
percentage utilisation guidelines given in
Documentation/gpu/drm-usage-stats.rst.
However, I forgot to replace the magic number '2' which stood for the
engine count in Mali GPUs with a value that makes sense for different
devices.
Source the engine count from gpu_info's static information values.
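A minimal sketch of where the engine count enters the calculation, assuming the busy-cycle capacity of the device scales linearly with the number of engines. The helper name and its parameters are hypothetical and do not mirror gpuinfo_refresh_utilisation_rate():

```c
#include <stdint.h>

/*
 * Hypothetical helper: percentage utilisation from busy cycles.
 * busy_cycles_delta: busy cycles accumulated since the previous sample
 * max_freq_hz:       maximum engine frequency in Hz
 * elapsed_secs:      time between the two samples
 * engine_count:      number of engines (previously a hard-coded 2)
 */
static unsigned utilisation_percent(uint64_t busy_cycles_delta, uint64_t max_freq_hz, double elapsed_secs,
                                    unsigned engine_count)
{
  if (max_freq_hz == 0 || engine_count == 0 || elapsed_secs <= 0.0)
    return 0;

  /* Cycles the device could have executed if every engine ran flat out. */
  double capacity = (double)max_freq_hz * elapsed_secs * (double)engine_count;
  double pct = 100.0 * (double)busy_cycles_delta / capacity;

  return pct > 100.0 ? 100u : (unsigned)pct;
}
```

Under these assumptions the same cycle delta yields half the percentage on a two-engine device compared with a single-engine one, which is why a fixed '2' skews the result on other hardware.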
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
The level of code duplication between Panthor and Panfrost backends was
outrageous. Factorise all their shared functions and definitions into a
separate library that is only built for these two backends.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
According to the kernel documentation in
Documentation/gpu/drm-usage-stats.rst, drm-maxfreq-<keystr> and
drm-cycles-<keystr> can be used to perform a manual calculation of the
engine's percentage utilization, in cases where the underlying GPU doesn't
support providing this information through a native hardware interface.
However, the calculation needs the fdinfo drm-cycles values of every
process that has opened the device file, so a new gpu vendor struct
callback was added to run it once they have all been retrieved, right before
gpuinfo_fix_dynamic_info_from_process_info is invoked.
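The aggregation step that has to complete before the calculation can be sketched as follows. The struct and function names are hypothetical, and clamping a shrinking total to zero is an assumption made here to cope with processes that exit between two samples:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-process sample holding the parsed drm-cycles value. */
struct process_sample {
  uint64_t gpu_cycles;
};

/* Sum the busy cycles of every process that has the device file open. */
static uint64_t sum_process_cycles(const struct process_sample *procs, size_t count)
{
  uint64_t total = 0;
  for (size_t i = 0; i < count; i++)
    total += procs[i].gpu_cycles;
  return total;
}

/* Busy cycles since the previous refresh; never negative. */
static uint64_t busy_cycles_delta(uint64_t total_now, uint64_t total_prev)
{
  /* The total can drop when a process exits, so clamp instead of wrapping. */
  return total_now > total_prev ? total_now - total_prev : 0;
}
```

Running the sum from a callback that fires only after all processes have been scanned avoids computing a delta from a partial total.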
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Adds a new UI bar with some optional parameters like the number of shader
cores, number of execution engines and size of L2 cache.
Display of this bar is triggered with a program argument.
Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
GPUs that rely on fdinfo to expose process information
(AMDGPU/Intel) can register a callback function to parse an fdinfo file
open to the DRM driver.
The callback function returns a populated process info struct upon
successful parsing.
The processes in /proc are now scanned only once per main loop iteration,
instead of once per AMDGPU device as before.
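The registration scheme can be pictured with the sketch below. Every name in it is hypothetical, the fdinfo content is passed as a string rather than an open file to keep the example short, and the 'drm-cycles-demo' key stands in for whatever a real driver exposes:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical process info filled in by a vendor callback. */
struct process_info {
  unsigned long long gpu_cycles;
};

/* A callback returns true only when it recognised and parsed the fdinfo. */
typedef bool (*fdinfo_parse_cb)(const char *fdinfo_text, struct process_info *out);

#define MAX_FDINFO_CALLBACKS 8
static fdinfo_parse_cb fdinfo_callbacks[MAX_FDINFO_CALLBACKS];
static size_t fdinfo_callback_count;

/* Each vendor backend registers its parser once at start-up. */
static bool register_fdinfo_callback(fdinfo_parse_cb cb)
{
  if (cb == NULL || fdinfo_callback_count == MAX_FDINFO_CALLBACKS)
    return false;
  fdinfo_callbacks[fdinfo_callback_count++] = cb;
  return true;
}

/* Called for each fdinfo found during the single /proc scan. */
static bool dispatch_fdinfo(const char *fdinfo_text, struct process_info *out)
{
  for (size_t i = 0; i < fdinfo_callback_count; i++)
    if (fdinfo_callbacks[i](fdinfo_text, out))
      return true; /* the first backend that accepts the file wins */
  return false;
}

/* Example vendor parser for a made-up "demo" engine key. */
static bool parse_demo_fdinfo(const char *fdinfo_text, struct process_info *out)
{
  unsigned long long cycles;
  const char *line = strstr(fdinfo_text, "drm-cycles-demo:");
  if (line == NULL || sscanf(line, "drm-cycles-demo: %llu", &cycles) != 1)
    return false;
  out->gpu_cycles = cycles;
  return true;
}
```

Because the scan hands each file to all registered parsers, adding another GPU of the same vendor does not add another pass over /proc.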