We present an example of CAMx runtimes on the US EPA’s High Performance Computing (HPC) system (Atmos). The CAMx v6.40 configuration includes:
- Single domain over the eastern US, 225x225x25 grid at 12 km resolution
- CB6r2 gas-phase chemistry + CF aerosol chemistry
- PiG invoked for major point sources
- Source Apportionment: 9 regions x 1 sector, OSAT + PSAT (sulfur and nitrogen families), 220 total tracers
The plot below shows model speed for 1 simulation day using combinations of OMP and MPI parallelization and combinations of standard disk and solid state (RAM) I/O. Speed improves up to 512 cores:
- Multiple points are shown for each number of total cores; they result from different OMP/MPI combinations
- At 128 total cores, 128 MPI x 1 OMP is the slowest combination and 32 MPI x 4 OMP is the fastest
- Fast I/O (such as solid state drives) becomes important at large numbers of cores
- We recommend using OMP and MPI in combination
- Conduct tests to determine which OMP/MPI combinations work best for your model application and computer system
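
As a starting point for such tests, the short Python sketch below lists the OMP/MPI combinations that use a given number of total cores. It is only an illustration and not part of CAMx: the function name `mpi_omp_combinations` and the `max_omp_threads` parameter are hypothetical, and we assume that total cores = MPI processes x OMP threads per process. The useful thread limit depends on your hardware (for example, the number of cores per node).

```python
def mpi_omp_combinations(total_cores, max_omp_threads=None):
    """Return (mpi_processes, omp_threads) pairs whose product is total_cores.

    Hypothetical helper for planning timing tests; not part of CAMx.
    max_omp_threads optionally caps the OMP threads per MPI process.
    """
    if total_cores < 1:
        raise ValueError("total_cores must be a positive integer")
    if max_omp_threads is None:
        upper = total_cores
    else:
        upper = min(total_cores, max_omp_threads)
    combos = []
    for omp_threads in range(1, upper + 1):
        # Keep only thread counts that divide the core count evenly.
        if total_cores % omp_threads == 0:
            combos.append((total_cores // omp_threads, omp_threads))
    return combos


if __name__ == "__main__":
    # Candidate layouts for 128 total cores, assuming at most 16 threads per process.
    for mpi_processes, omp_threads in mpi_omp_combinations(128, max_omp_threads=16):
        print(f"{mpi_processes:4d} MPI x {omp_threads:2d} OMP")
```

For 128 total cores, the list includes both 128 MPI x 1 OMP and 32 MPI x 4 OMP, the slowest and fastest cases noted above. Timing one simulation day for each candidate on your own system indicates which combination to use.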