Jobstats is an open-source tool that helps researchers and system administrators understand how efficiently their HPC jobs use resources on Slurm clusters. It collects detailed CPU, GPU, and memory usage for each job via Prometheus and displays it in Grafana dashboards. With Jobstats you can see in real time how a job is performing, compare the resources you requested with those you actually used, and receive automated efficiency recommendations. It also keeps a history of past jobs so you can spot trends and make better choices in future runs. For researchers, that means faster queues and fewer wasted resources; for administrators, it means better overall cluster efficiency and easier capacity planning. Jobstats turns raw usage data into clear, actionable insights that help everyone make the most of HPC resources.
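As an illustration, on clusters where the Jobstats command-line client is installed, a job's efficiency report can typically be pulled up by its Slurm job ID (the job ID below is a placeholder; command availability depends on your site's installation):

```shell
# Show the Jobstats efficiency report for a running or completed Slurm job.
# 12345678 is a hypothetical job ID; find yours with squeue or sacct.
jobstats 12345678
```

The report summarizes CPU, GPU, and memory efficiency for the job, which is the same data shown in the Grafana dashboards.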

Articles (5)

Effective usage of CPU

Discovery HPC Cluster CPU Usage Tips.

GPU Computing on Discovery Cluster

This article provides a practical overview of how GPU computing works on the Discovery cluster, helping researchers understand when and how to use GPUs effectively. It explains why GPUs do not automatically outperform CPUs, emphasizing key factors such as data-transfer overhead, workload size, and the need for GPU-enabled software.

Improving GPU Utilization on Discovery

This article explains when using a GPU is actually helpful. It shows that GPUs are not always faster, because data must be moved between the CPU and GPU and not all software supports GPUs. It also shows users how to request GPUs, check utilization, avoid common mistakes, and run jobs efficiently on Discovery.
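A minimal sketch of a Slurm batch script that requests a single GPU. The partition name and resource values below are assumptions for illustration; check Discovery's documentation for the actual partition names and limits:

```shell
#!/bin/bash
#SBATCH --job-name=gpu-test
#SBATCH --partition=gpu        # assumed GPU partition name on Discovery
#SBATCH --gres=gpu:1           # request one GPU
#SBATCH --cpus-per-task=4      # CPUs for data loading/preprocessing
#SBATCH --mem=16G              # RAM (illustrative value)
#SBATCH --time=01:00:00

# Confirm the GPU is visible to the job before launching GPU-enabled software
nvidia-smi

# Launch your GPU-enabled application here, e.g. with srun
```

After the job finishes, its GPU utilization can be reviewed in Jobstats to confirm the GPU was actually used.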

Memory Allocation for Slurm script

This page explains how to correctly request memory in Slurm scripts on the SDSU Discovery HPC cluster and how to diagnose memory-related errors. On Discovery, the word “memory” always means RAM, not file storage.
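For example, RAM can be requested per node with `--mem` or per allocated CPU with `--mem-per-cpu` (use one or the other, not both; the values below are illustrative, not Discovery-specific limits):

```shell
#!/bin/bash
#SBATCH --job-name=mem-demo
#SBATCH --mem=8G               # 8 GB of RAM for the whole allocation
##SBATCH --mem-per-cpu=2G      # alternative: 2 GB per allocated CPU
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00

# If the job fails with an OUT_OF_MEMORY / oom-kill error, compare the
# request against actual peak usage, e.g.:
#   sacct -j <jobid> --format=JobID,ReqMem,MaxRSS,State
srun ./my_program
```

Requesting somewhat more than the observed peak usage (but not vastly more) keeps jobs from being killed while avoiding wasted memory that slows queue times.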