SC24 Recap: Experts Share Insights on AI Surrogate Models, FP64 GPUs, and Digital Twins
Executive perspectives on balancing innovation, strategy, and operations in high-performance computing.
At SC24, supercomputing’s premier conference, Rescale hosted a panel discussion to dig into key high-performance computing (HPC) insights as AI continues to integrate into daily workflows.
Moderated by Rescale’s COO Matt McKee, the panel kicked off with introductions of:
- Brandon Haugh, VP of Modeling & Simulation at Kairos Power
- Doug Norton, Chief Marketing Officer for Inspire Semiconductor and President of the Society of HPC Professionals
- Srikanth Gubbala, Head of Global HPC IT Infrastructure for Applied Materials
With a wealth of experience from each panelist, we’re thrilled to share their insights with the simulation community. Let’s dive into the recap!
Surrogate Data
The panelists wasted no time jumping in. While the concept of synthetic data might seem like a recent trend, these mathematical techniques have been utilized for decades. With the rapid advancement of AI in simulation and modeling, leaders are now increasingly leveraging synthetic data to train, validate, and test models.
One question was likely on everyone’s mind: are surrogate models accurate?
Brandon, VP of Modeling & Simulation at Kairos Power, described some of the R&D challenges his company faces in developing a clean nuclear reactor. “We take real data, we take synthetic data, and we use advanced machine learning algorithms to sort through that, to sort what’s real and what’s not, and fit it to a physics-conformed model. When we look at what the AI and ML is doing, we’re kind of giving it some bounds of reality.”
Without constraints, surrogate models can generate data that fails to accurately represent real-world engineering challenges, making them less reliable and more difficult to generalize for practical applications. If surrogate models can prove accurate and reliable enough for nuclear applications – where precision, safety, and regulatory scrutiny are paramount – then they can hopefully be applied to many other engineering domains with confidence and trust.
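To make the idea of “bounds of reality” concrete, here is a minimal sketch of a constrained surrogate. It is not Kairos Power’s method; the quantities, data, and bounds are hypothetical, and the constraint is applied simply by clipping predictions to a physically plausible range.

```python
import numpy as np

# Hypothetical example: a surrogate for steady-state outlet temperature (deg C)
# as a function of power (MW). Real and synthetic samples are mixed, and
# predictions are kept inside simple physical bounds ("bounds of reality").
rng = np.random.default_rng(0)

# Mix of "real" measurements and noisier synthetic samples (illustrative numbers).
power_real = np.linspace(10, 100, 10)
temp_real = 550 + 1.2 * power_real + rng.normal(0, 2, power_real.size)
power_synth = rng.uniform(10, 100, 50)
temp_synth = 550 + 1.2 * power_synth + rng.normal(0, 10, power_synth.size)

power = np.concatenate([power_real, power_synth])
temp = np.concatenate([temp_real, temp_synth])

# Simple data-driven surrogate: a quadratic fit to the combined dataset.
coeffs = np.polyfit(power, temp, deg=2)

def surrogate(p):
    """Predict outlet temperature, constrained to stay physically plausible."""
    pred = np.polyval(coeffs, p)
    # Physics-informed constraints (hypothetical bounds): outlet temperature
    # cannot drop below the inlet temperature or exceed a material limit.
    t_inlet, t_limit = 550.0, 750.0
    return np.clip(pred, t_inlet, t_limit)

# Out-of-range inputs still produce bounded, physically sensible outputs.
print(surrogate(np.array([20.0, 80.0, 500.0])))
```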
FP64 GPUs
If they were listening closely, the SC24 audience caught a key insight from Doug. Wearing two hats as CMO of Inspire Semiconductor and President of the Society of HPC Professionals, he emphasized, “We need 64-bit double precision math for those foundation models so we can do good AI.” To put it more explicitly, CAE applications require high precision, and only GPUs with strong 64-bit floating-point (FP64) performance will meet these demands. It’s a subtle but crucial distinction.
Leading CAE applications like Abaqus, STAR-CCM+, ANSYS Fluent, LS-DYNA, and OpenFOAM rely on high-precision solvers to ensure accuracy and reliable convergence. If you’ve ever run a CFD simulation, you’ll know that small numerical errors can accumulate across iterative solver steps.
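A toy illustration of that accumulation (not drawn from any of the solvers above, and with made-up numbers): repeatedly adding a small increment to a large running value in single precision drifts noticeably, while double precision stays essentially exact.

```python
import numpy as np

# Rounding-error accumulation in a repeated update, float32 vs float64.
# This is one reason iterative CAE solvers favor FP64 arithmetic.
increment = 1e-4
steps = 1_000_000

total32 = np.float32(1000.0)
total64 = np.float64(1000.0)
for _ in range(steps):
    total32 += np.float32(increment)
    total64 += np.float64(increment)

exact = 1000.0 + increment * steps  # 1100.0
print(f"float32: {total32:.4f}  (error {abs(total32 - exact):.4f})")
print(f"float64: {total64:.4f}  (error {abs(total64 - exact):.2e})")
```

Run it and the float32 total lands tens of units away from the true 1100.0, while the float64 total is accurate to many decimal places.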
OpenFOAM, for example, began a gradual transition from 32-bit to 64-bit around 2008, and the shift continues as CAE vendors adopt high-precision NVIDIA GPUs. While the GPU market is saturated with lower-precision core types – well suited to other AI applications and edge computing – FP64 GPUs for CAE remain in high demand.
At Rescale, we understand this better than anyone and specialize in connecting customers with the best compute resources for their simulation needs. Rescale’s Coretype Portfolio provides the latest high-performance computing options, leveraging GPUs like the NVIDIA Volta V100 and Tesla P100, as well as energy-efficient Arm-based processors such as AWS Graviton3 and Google Cloud’s Tau T2A, to optimize performance for a wide range of HPC workloads.
Digital Twins
With the next moderator question, the conversation shifted from core types to applications. For critical systems like nuclear reactors, real-time predictive monitoring and maintenance are not optional – they’re essential. Brandon shared compelling insights on how machine learning has advanced the sophistication of their Digital Twins.
He explained, “We monitor and instrument the entire facility, and we look at our predictions. We look at how the twin is performing. We actually do prognostic and diagnostic machine learning.” Later in the panel, he revealed the impressive amount of data his company generates for their Digital Twins. Keep reading to find out!
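At its simplest, that prognostic and diagnostic loop compares what the twin predicts against what the instrumented facility actually measures, and flags channels that drift apart. The sketch below is a hypothetical illustration of that pattern, not Kairos Power’s implementation; the sensor names, values, and tolerances are invented.

```python
import numpy as np

# Minimal diagnostic check for a digital twin: compare predicted sensor values
# against live measurements and flag channels whose residuals exceed a tolerance.
# Sensor names, readings, and thresholds are hypothetical.
rng = np.random.default_rng(1)
sensors = ["coolant_flow", "outlet_temp", "pump_vibration"]

twin_prediction = np.array([120.0, 650.0, 0.02])                      # model outputs
measurement = twin_prediction + rng.normal(0, [1.0, 2.0, 0.05], 3)    # live readings
tolerance = np.array([5.0, 10.0, 0.03])                               # per-channel limits

residual = np.abs(measurement - twin_prediction)
for name, r, tol in zip(sensors, residual, tolerance):
    status = "OK" if r <= tol else "ALERT: investigate drift"
    print(f"{name:15s} residual={r:8.3f}  {status}")
```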
Digital Twins have been a hot topic in the simulation community, making it exciting to hear how effectively they’ve been implemented at Kairos Power to enhance public safety. Equally exciting is their potential to drive breakthroughs in precision medicine, manufacturing, and beyond.
Quantum Computing and Big Data
As big data grows in complexity and scale, quantum computing is poised to transform how we extract meaningful insights from massive datasets. All the panelists were enthusiastic about exploring its potential. Doug Norton stated, “Quantum is going to make us all think differently.”
Srikanth, as Head of Global HPC IT, agreed and emphasized his vision of HPC and quantum computing working together seamlessly. He described the explosion of disconnected data within his organization, originating from testing, IoT edge devices, and other sources. Discussing the future, he said, “But for all this to happen, there must be a strong focus on data management.”
Brandon joined in, stating, “We are generating hundreds of terabytes of data a month,” and humbly shared what many engineering leaders are discovering: it’s easy to generate data, but much harder to bring all your data sources together to add value.
At Rescale, we recognize the value of quantum computing and are adding partners like IONQ to our catalog of more than 1,200 turnkey R&D applications. While the panel discussed many promising technologies, including 3D silicon chip development and resistive memory, Rescale remains laser-focused on the most pressing issue for all: data management.
Rescale has been progressively launching new features to support your extensive data management needs, from Metadata Management to Cloud File Storage. And we are not slowing down.
Balancing AI and Existing HPC in Organizations
Shifting focus from technology, the panel also discussed people and budgets. Another key topic was balancing support for traditional HPC users while investing in AI, and the panelists shared how organizations can manage budgetary pressures so that both AI and HPC initiatives receive adequate resources.
Srikanth from Applied Materials explained how Rescale Compute enables a clear, usage-based chargeback system that helps his organization balance compute costs for their diverse workloads, spanning thermal, mechanical, plasma, and optical applications. This cost transparency enhances accountability and ensures optimal resource utilization across teams.
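For readers unfamiliar with chargeback, the mechanics are simple: roll up each job’s core-hours by team and apply a rate per core type. The sketch below is a hypothetical illustration of that accounting, not Rescale’s billing model; the teams, core types, and rates are made up.

```python
from collections import defaultdict

# Hypothetical usage-based chargeback: aggregate per-job core-hours by team
# and multiply by an illustrative rate for each core type.
rates_per_core_hour = {"cpu_hpc": 0.05, "gpu_fp64": 2.50}  # USD, made-up rates

jobs = [
    {"team": "thermal",    "core_type": "cpu_hpc",  "cores": 256, "hours": 12.0},
    {"team": "plasma",     "core_type": "gpu_fp64", "cores": 8,   "hours": 6.5},
    {"team": "mechanical", "core_type": "cpu_hpc",  "cores": 128, "hours": 30.0},
]

charges = defaultdict(float)
for job in jobs:
    core_hours = job["cores"] * job["hours"]
    charges[job["team"]] += core_hours * rates_per_core_hour[job["core_type"]]

for team, cost in sorted(charges.items()):
    print(f"{team:12s} ${cost:,.2f}")
```

Because every job carries its own usage record, each team sees exactly what its workloads cost, which is the transparency the panel highlighted.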
Conclusion
We hope you found the discussions from this expert panel both insightful and engaging, and that you learned something new.
If you want to dive deeper and hear our SC24 supercomputing experts share their advice in their own words, watch the full SC24 panel discussion below.
Sometimes, the most insightful discussions arise not from the questions we plan as moderators, but from the curiosity of our community. At SC24, HPC conversations extended far beyond the panel Q&A – into hallways, impromptu meetups, and late-night discussions – because for us, it’s more than a topic; it’s our passion.
Please see our events page for upcoming events. If you have any AI or HPC questions, we would love to hear them. Reach out to us here.
