Nickel or Iron: Which Is Better for Your CONVERGE™ Job on Rescale's Platform?
CONVERGE™ from Convergent Science is one of the most popular computational fluid dynamics (CFD) programs for engine design and simulation. Its parallel processing is built on MPI, which speeds up jobs in multicore and many-core environments. You can run CONVERGE™ jobs on demand on Rescale’s cloud simulation platform using any of the available core types. In this blog post, I compare the performance and cost of the Nickel and Iron HPC core types. Hopefully this can serve as a core selection guide for running your own CONVERGE™ simulations on Rescale.
Test Environment
Specification | Nickel (HPC+) | Iron (HPC InfiniBand) |
Application | CONVERGE™ 2.2 for Linux | CONVERGE™ 2.2 for Windows |
MPI Flavor | hp-mpi for Linux | Microsoft MPI 4.2 |
CoreType | Nickel | Iron |
Compute | 6.75 CU | 6.75 CU |
Memory (GB/core) | 3.8 GB | 3.8 GB |
Storage (GB/core) | 32 GB | 32 GB |
Network | 10 Gb/s | RDMA InfiniBand (40 Gb/s) |
Price ($/core/hour) | 0.15 | 0.30 |
The first two rows of the table show the software environment we chose, and the remaining rows list the hardware specifications. Most of the hardware specifications are identical for Nickel and Iron; the noteworthy difference is the network. Nickel has 10 Gb/s of bandwidth, while Iron has 40 Gb/s RDMA InfiniBand, which is a significant advantage for jobs running across multiple nodes. I believe this is the primary reason Iron costs twice as much per core-hour as Nickel.
Benchmarking Job
The benchmark job we chose is provided by Convergent Science. It models the phenomenon of a curving shot on the soccer field, and is intended to show us what it takes to get the “bending” of the ball in mid-air (detailed description). The simulation models 0.1 seconds of the soccer ball’s movement using a time step of 0.001 seconds. The model initially consists of 81,576 nodes.
Small Cluster Performance-Cost Comparison
In the first round, we tested cluster performance on 16, 32, and 64 cores for both core types. The results are shown in the table below.
Core type | Metric | 16 cores | 32 cores | 64 cores |
Nickel (HPC+) | Time (s) | 2611.39 | 2086.38 | 1857.28 |
Nickel (HPC+) | Price ($/hour) | 2.40 | 4.80 | 9.60 |
Iron (HPC InfiniBand) | Time (s) | 3671.327635 | 2709.023048 | 2020.854733 |
Iron (HPC InfiniBand) | Price ($/hour) | 4.80 | 9.60 | 19.20 |
From the table, we see that at up to 64 cores a Nickel cluster is both faster than Iron and less expensive. So if you need to run a small job on a small cluster, Nickel is probably the better choice.
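To make the cost side of the comparison concrete, here is a minimal sketch that converts each measured runtime into a hardware cost per run. It assumes charges are prorated by the second and it ignores license fees; the names `RESULTS` and `job_cost` are illustrative and not part of any Rescale or CONVERGE™ tooling.

```python
# Wall-clock times (s) and cluster prices ($/hour) copied from the table above.
RESULTS = {
    "Nickel": {16: (2611.39, 2.40), 32: (2086.38, 4.80), 64: (1857.28, 9.60)},
    "Iron": {16: (3671.327635, 4.80), 32: (2709.023048, 9.60), 64: (2020.854733, 19.20)},
}


def job_cost(seconds, price_per_hour):
    """Hardware cost of one run in dollars.

    Assumption: billing is prorated by the second. License cost is excluded.
    """
    return seconds / 3600.0 * price_per_hour


for core_type, runs in RESULTS.items():
    for cores, (seconds, price) in sorted(runs.items()):
        cost = job_cost(seconds, price)
        print(f"{core_type:6s} {cores:3d} cores: {seconds:8.2f} s  ${cost:.2f}")
```

Under these assumptions, the 16-core Nickel run comes to about $1.74 and the 64-core Iron run to about $10.78, so on small clusters the extra cores buy a shorter runtime at a noticeably higher hardware cost per run.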
Mid-sized Cluster Performance-Cost Comparison
In the second round, we tested mid-sized cluster performance on 128- and 256-core clusters for both core types. The results are shown in the table below.
Core type | Metric | 128 cores | 256 cores |
Nickel (HPC+) | Time (s) | 2973.34 | N/A (job terminated) |
Nickel (HPC+) | Price ($/hour) | 19.20 | 38.40 |
Iron (HPC InfiniBand) | Time (s) | 1434.00 | 1277.43 |
Iron (HPC InfiniBand) | Price ($/hour) | 38.40 | 76.80 |
We can see that the runtime of the Nickel cluster increased drastically when the number of cores reached 128. For the 256-core case, I terminated the job once it had run longer than the 128-core case. This is caused by communication overhead combined with the slow interconnect. On the other hand, the performance of the Iron cluster improves steadily as cores are added. So Iron outperforms Nickel when running on 128 or more cores.
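The scaling behavior described above can be quantified with speedup and parallel efficiency. The sketch below takes each core type's own 16-core run as the baseline, which is a choice made here for illustration (the benchmark includes no single-core run); `TIMES` and `scaling` are illustrative names.

```python
# Wall-clock times (s) from both result tables. The terminated 256-core
# Nickel run is left out because it has no measured runtime.
TIMES = {
    "Nickel": {16: 2611.39, 32: 2086.38, 64: 1857.28, 128: 2973.34},
    "Iron": {16: 3671.327635, 32: 2709.023048, 64: 2020.854733,
             128: 1434.00, 256: 1277.43},
}


def scaling(times, baseline_cores=16):
    """Return (cores, speedup, efficiency) tuples relative to the baseline run.

    speedup    = baseline_time / time
    efficiency = speedup / (cores / baseline_cores)
    """
    base = times[baseline_cores]
    rows = []
    for cores in sorted(times):
        speedup = base / times[cores]
        efficiency = speedup / (cores / baseline_cores)
        rows.append((cores, speedup, efficiency))
    return rows


for core_type, times in TIMES.items():
    for cores, speedup, efficiency in scaling(times):
        print(f"{core_type:6s} {cores:3d} cores: "
              f"speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

With these numbers, the Nickel speedup drops below 1 at 128 cores (slower than its own 16-core run), while the Iron speedup keeps rising through 256 cores, although its efficiency also falls as cores are added.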
Conclusion
From the graph above, we can tell that for clusters of up to 64 cores, Nickel is faster for CONVERGE™ jobs, while for mid-sized clusters of 128 or more cores, Iron is the better choice. More importantly, running a job faster could also save you money on the on-demand license cost.
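The license point can be explored with a rough break-even calculation. The sketch below assumes, purely for illustration, that the on-demand license is billed at a flat hourly rate per job regardless of core count; the actual CONVERGE™ license pricing model is not described in this post and may differ. The function name `break_even_license_rate` is hypothetical.

```python
def break_even_license_rate(t_slow, hw_slow, t_fast, hw_fast):
    """Flat license rate ($/hour) at which two runs cost the same in total.

    Assumed cost model (hypothetical):
        total = runtime_hours * (hardware_rate + license_rate)
    Runtimes may be in any unit as long as both use the same one, because
    the unit cancels out. Above the returned rate, the faster run is the
    cheaper one overall.
    """
    if t_fast >= t_slow:
        raise ValueError("the 'fast' run must have the shorter runtime")
    return (t_fast * hw_fast - t_slow * hw_slow) / (t_slow - t_fast)


# Fastest Nickel run (64 cores) versus the 128-core Iron run, using the
# measured times (s) and cluster prices ($/hour) from the tables above.
rate = break_even_license_rate(1857.28, 9.60, 1434.00, 38.40)
print(f"Break-even license rate: ${rate:.2f}/hour")
```

Under this assumed model, the 128-core Iron run only becomes cheaper overall than the 64-core Nickel run once the license costs more than roughly $88 per hour. Below that rate the choice is a trade-off between time and money.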