Research Cyberinfrastructure

High-Performance Computing (HPC) Documentation.

Categories (7)

Innovator Cluster

The Innovator HPC cluster combines advanced CPU and GPU architectures for parallel computing and accelerated machine learning. Its NVIDIA A100 GPUs give the computational tasks of AI researchers and large-scale simulations a significant performance boost.

Slurm (Cluster Resource Manager)

Slurm is an open source, fault-tolerant, and highly scalable cluster management and job scheduling system for large and small Linux clusters. The Slurm resource manager has three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (normally a parallel job) on the set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.
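The allocation and execution steps described above come together in a batch script. Below is a minimal sketch; the job name, resource values, and commands are illustrative placeholders, and the partitions and limits on your cluster will differ:

```shell
#!/bin/bash
#SBATCH --job-name=demo          # name shown in the queue
#SBATCH --nodes=1                # number of compute nodes to allocate
#SBATCH --ntasks=4               # total tasks (processes) to launch
#SBATCH --mem=8G                 # memory per node
#SBATCH --time=01:00:00          # wall-clock limit (HH:MM:SS)
#SBATCH --output=slurm-%j.out    # output file (%j expands to the job ID)

# Commands below run on the allocated node(s)
srun hostname
```

Submit the script with `sbatch job.sh`, watch the queue with `squeue -u $USER`, and cancel a pending or running job with `scancel <jobid>`. The `#SBATCH` directives are how Slurm learns what resources to allocate before it starts, monitors, and (if necessary) queues the work.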

Software

Research software tutorials and information.

Workshops

Research Cyberinfrastructure workshop tutorials and other information.

Discovery Cluster

This MRI-funded high-performance computing (HPC) cluster is designed to accelerate advanced research in GPU-intensive fields such as machine learning, deep learning, data analytics, and scientific simulation. Featuring the latest NVIDIA H100 GPUs, the system provides researchers with cutting-edge capabilities for massively parallel workloads and AI model training. This resource supports the SDSU research community by enabling scalable, high-throughput computation for projects that demand significant GPU acceleration.

Jobstats (Job Performance Monitoring)

Jobstats is an open-source tool that helps researchers and system administrators understand how efficiently their HPC jobs use resources on Slurm clusters. It collects detailed information about CPU, GPU, and memory usage for each job using Prometheus and displays it through Grafana dashboards. With Jobstats, you can see in real time how your job is performing, compare what you requested against what you actually used, and get recommendations for right-sizing future requests. It also keeps a history of past jobs so you can spot trends and make smarter choices in future runs. For researchers, that means faster queues and fewer wasted resources; for admins, it means better overall cluster efficiency and easier planning. Jobstats turns raw usage data into clear, actionable insights that help everyone make the most of HPC resources.
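As a sketch of day-to-day use, assuming the `jobstats` command-line client is available on the login nodes (the exact invocation may differ on this cluster):

```shell
# Show the efficiency report for a single completed job
# (the job ID below is a placeholder)
jobstats 1234567

# The report summarizes CPU/GPU utilization and memory used versus
# requested, which is the "requested vs. actually used" comparison
# described above
```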

Articles (10)

Globus

Documentation on how to use Globus for data transfers.
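Globus transfers can also be driven from the Globus CLI. A minimal sketch, where the endpoint UUIDs and paths are placeholders you would replace with your own:

```shell
# Authenticate the Globus CLI (opens a browser window)
globus login

# Find a collection/endpoint by name
globus endpoint search "SDSU"

# Start an asynchronous, managed transfer between two endpoints
# (UUIDs and paths below are placeholders)
SRC_ID="<source-endpoint-uuid>"
DST_ID="<destination-endpoint-uuid>"
globus transfer "$SRC_ID:/home/user/data.tar" "$DST_ID:/project/data.tar" \
    --label "example-transfer"
```

Globus queues the transfer and retries on failure, so you can log off while it runs.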

Linux Command Line (Basic Commands)

Basic commands to help users navigate the Linux command line without a GUI.
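A few of the essentials, as a self-contained session you can try in any Linux shell:

```shell
pwd                       # print the current working directory
mkdir -p projects         # create a directory (no error if it exists)
cd projects               # change into it
echo "hello" > notes.txt  # create a small text file
cat notes.txt             # print the file's contents
ls -l                     # list files with details
cp notes.txt backup.txt   # copy a file
rm backup.txt             # delete a file
```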

Secure Shell (SSH) Connections

Article on how to SSH into RCI Linux systems.
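The basic connection pattern looks like the sketch below; the username and hostname are placeholders for your own account and the cluster's login node:

```shell
# Connect to a remote Linux system over SSH
ssh username@login.example.edu

# Generate an SSH key pair for key-based (passwordless) login
ssh-keygen -t ed25519

# Install your public key on the remote host
ssh-copy-id username@login.example.edu
```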

Transferring Files (FTP/SCP)

Article explaining how to use FTP and SCP to move files from one system to another.
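Typical SCP invocations follow one pattern: `scp <source> <destination>`, where either side may be remote. A sketch with placeholder host and paths:

```shell
# Copy a local file to a remote system
scp report.txt username@login.example.edu:/home/username/

# Copy a file from the remote system to the current local directory
scp username@login.example.edu:/home/username/results.csv .

# Recursively copy an entire directory
scp -r data/ username@login.example.edu:/home/username/data/
```

Unlike plain FTP, SCP runs over SSH, so the transfer is encrypted end to end.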

Virtual Network Connection (VNC)

Article showing how to use VNC in our research environment.
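VNC traffic is usually tunneled through SSH rather than exposed directly. A minimal sketch, assuming a VNC server on display `:1` (port 5901); the hostname and display number are placeholders:

```shell
# Forward the VNC port through SSH (display :1 maps to port 5901)
ssh -L 5901:localhost:5901 username@login.example.edu

# While the SSH session is open, point a VNC viewer on your own
# machine at the forwarded port:
#   localhost:5901
```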

DCV (NICE Desktop Cloud Visualization)

Step-by-step guide to using visualization on the cluster with DCV.

SDSU Open OnDemand

Details on how to log into Open OnDemand and access the Jupyter software suite.