Frontier · ORNL
1.353 EF/s LINPACK · #2 on the November 2024 TOP500 (behind El Capitan) with HPE Cray EX + AMD MI250X nodes.
Updated February 2025
hpctutorials tracks how top labs—from ORNL to Argonne—run clusters with thousands of nodes, GPUs, and impatient scientists. We distill their patterns into actionable runbooks so your team can stand up reliable, policy-compliant compute without the guesswork.
Every guide mirrors the thought-leader design system behind pranavkulkarni.org: single-column focus, ruthless clarity, and data pulled from the latest TOP500, MLPerf, and Slurm releases.
Maintained by Mandar Gurav & Pranav Kulkarni — operators who live inside Slurm, Flux, and exascale programs daily.
Long-form notes published weekly so operators, research software engineers, and leadership stay aligned.
From bastion policies to module stacks so new researchers ship jobs in under 30 minutes.
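A taste of the onboarding runbook: a minimal sketch of loading a module stack programmatically, assuming an Lmod site that exports LMOD_CMD; the module names here are hypothetical placeholders for whatever stack your site ships.

```python
import os
import subprocess

def module(*args):
    # Lmod's "python" shell mode prints Python statements that mutate
    # os.environ; exec-ing them applies the load to the current process.
    proc = subprocess.run(
        [os.environ["LMOD_CMD"], "python", *args],
        capture_output=True, text=True, check=True,
    )
    exec(proc.stdout)

module("load", "gcc", "openmpi")  # hypothetical stack for a first job
print(os.environ.get("LOADEDMODULES", "(nothing loaded)"))
```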
Modern Slurm patterns, job arrays, heterogeneous allocations, and QoS dashboards.
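As a preview, a minimal sketch of the job-array pattern, assuming a cluster with sbatch on the PATH; the QoS name and the ./simulate binary are placeholders.

```python
import subprocess
import textwrap

# sbatch reads the batch script from stdin when no file name is given.
script = textwrap.dedent("""\
    #!/bin/bash
    #SBATCH --job-name=sweep
    #SBATCH --array=0-9          # ten tasks, one per parameter index
    #SBATCH --time=00:10:00
    #SBATCH --qos=normal
    srun ./simulate --index "$SLURM_ARRAY_TASK_ID"
    """)

result = subprocess.run(["sbatch"], input=script, text=True,
                        capture_output=True, check=True)
print(result.stdout.strip())  # e.g. "Submitted batch job 123456"
```

One array submission like this replaces ten near-identical scripts, and Slurm schedules, throttles, and accounts for the tasks as a unit.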
How labs fuse MPI, CUDA, and inference with profiling and governance guardrails.
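And a glimpse of the hybrid-compute guide: a toy mpi4py reduction standing in for the MPI side of an MPI + CUDA pipeline. This assumes mpi4py and NumPy are installed and the script is launched with mpirun; the partial results here are placeholders for what GPU kernels would produce.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank contributes a partial result, as a CUDA kernel's output would;
# Allreduce fuses them so every rank sees the global sum.
local = np.array([float(rank + 1)])
total = np.empty_like(local)
comm.Allreduce(local, total, op=MPI.SUM)

if rank == 0:
    print(f"global sum across {comm.Get_size()} ranks: {total[0]}")
```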
Frontier · ORNL: 1.353 EF/s LINPACK · #2 on the November 2024 TOP500 with HPE Cray EX + AMD MI250X nodes.
Aurora · Argonne: 1.012 EF/s debut · Ponte Vecchio + Sapphire Rapids system entering production science.
El Capitan · LLNL: 1.742 EF/s debut · New #1 on the November 2024 TOP500, an MI300A-powered system with strict zero-trust data transfer baked in.
Source: November 2024 TOP500 plus SC24/ISC field briefings. We monitor MLPerf releases because every lab now spans simulation + AI.