
Rise of an Agile Cloud-Native Approach To Simulation Data Automation

A user-centric approach to capturing simulation metadata can finally help companies harness the value of their simulation data

As engineering and scientific computing continue to evolve with new technologies and methods, one rate-limiting step remains simulation data – how it’s generated, how it’s managed, and how it’s used. There are signs that the status quo is no longer tenable, and at Rescale we are doing our part to address this challenge.

I’m thrilled to share that Rescale Data is in private beta for organizations that are interested in considering a new approach to capturing value from their simulation data.

Rescale today serves the majority of the top ten companies in aerospace, automotive, life sciences, industrials, and many other industries. While industries vary in their R&D computing objectives and maturity, common trends exist. In general, product lifecycle management (PLM) systems are widely adopted and quite mature. This is not surprising, since PLM systems hold vital product “bill of materials” data, including everything from part numbers to pricing to component specs – information critical to manufacturing the company’s products. PLM systems have many users, are critical to business operations, and therefore need to be carefully controlled.

Quite the opposite is true of how simulations and simulation data are managed. A comparatively small number of people run simulations, and they want the latest simulation tools available. Simulation data is also often thought of as transactional: did the simulation pass? If not, run more iterations. If yes, record in a report (in PowerPoint, Word, or PDF) that the design is good, perhaps with images of useful visualizations, and then move on. Almost always, the report has no direct link to the simulation output or the meshes used, so the result is not easily reproducible.

While the concept of simulation process and data management (SPDM) has been around for many years, in general it has not yet seen much adoption. Commonly cited reasons include 1) the challenge of capturing the context and findings of simulations as they happen, 2) the complexity and rigidity of SPDM systems, and 3) the difficulty of getting users on board, since it entails changing how engineers and scientists work.

So companies run more simulations than ever before, creating an explosion of simulation data that often sits on company shared drives. Engineers and scientists are creating a ton of data that might be important, but no one will ever know because the data wasn’t properly labeled.

But this situation isn’t viable for long, for a few reasons.

First, data volumes are growing, and so are costs. Even though storage keeps getting cheaper, the price declines are being outpaced by data growth. That’s a lot of money to spend on data you can’t use.

Second, companies are recognizing the cost of rework and of not seeing the big picture across simulation activities. An executive at a global automotive company shared that they believe 30% of simulation activities are probably wasted due to lack of coordination. An executive at a leading energy company shared that if they could easily gain full visibility into all relevant simulation results across design candidates, they could make design decisions 70% faster.

Third, leading companies want to start using AI in engineering and scientific research, with everything from AI-powered surrogate models to physics-informed neural networks. As we all know, AI is only as good as the data it’s trained on, and that requires us to have a handle on our data.

Whose Problem Is It?

Two years ago, our product managers noticed that several Rescale users were using spreadsheets to track their activities as they ran simulations on our platform. “What are you tracking?” we asked. In one of these conversations, the simulation engineer told us: “Oh, I’m just tracking which of the simulation runs were helping me solve the engineering problem I’m investigating. There’s no issue with the platform. It’s not your problem.” This kind of spreadsheet, of course, grows stale over time, making it impossible for teammates to benefit from this information.  

This and several conversations like it helped us to realize that we were not doing our part to be a good citizen of the digital thread – to help customers tie simulation activity and results to engineering decisions. In a way, we decided it was our problem – a problem we were determined to solve. 

Later that year we released resource tags, followed by custom fields. The former allow users to add any label to any of their simulation activities, and the latter give engineering leaders the ability to capture any information they feel is required from their teams.

Today, we’re seeing a wide variety of use cases, from the single engineer who just wants to track what they are doing with their own labels, to large organizations that use our APIs to ensure that the right part numbers are referenced whenever a simulation is performed on a particular component, and that key simulation results are pushed to custom fields. This, combined with our enhanced job search capabilities (including saved searches), is making it easier than ever for engineering teams to gain visibility into engineering computing activities and results. None of this has required implementing complex new systems or dramatically changing how users run simulations.

But we shouldn’t stop there.  

Rescale is becoming a better citizen of the digital thread that organizations need in order to be more agile and data-driven. How do we supercharge the progress we’ve made to deliver an even greater impact?

An Agile Cloud-Native Approach

Given that we can capture the context and results of simulations as they happen, what else can we do to make life easier for engineers and scientists who depend on these simulations and the insights they yield? Our approach was to think about how Rescale, as a cloud-based control plane for compute automation, could apply our experience to data automation.

Rescale Data builds on the foundation Rescale created in our first 13 years as a company in cloud high-performance computing: automating the running of any simulation on any architecture on any cloud provider.

Specifically, our goal with Rescale Data is to:

  • Provide full flexibility on what data needs to be captured
  • Capture all data in-context as the simulation activity occurs
  • Ensure any data captured is always accessible & contextualized
  • Automate everything

The first goal, full flexibility on what data is captured, is addressed by the resource tags and custom fields described above. To capture all data in context, we’re introducing the concept of Studies in Rescale.

Rescale Studies organize the user’s experience around the purpose of the simulations being performed. Studies exist within Projects, and can include the jobs, workstations, and files that are related to the study. Because users run simulations from a study, all jobs inherit the study’s metadata, reducing the burden on users to add context manually. Analogous to Jira epics, Studies can be shared internally so that anyone within the organization can gain visibility into the results, the rationale, and the underlying simulations that led to the study findings.
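To make the inheritance idea concrete, here is a minimal sketch in plain Python. It is not the Rescale API: the `Study` and `Job` classes, the `run_job` method, and the field names and values are all hypothetical, chosen only to show how a job launched from a study could pick up the study’s metadata automatically.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Hypothetical job record; not a Rescale API object."""
    name: str
    study: str
    metadata: dict[str, str]


@dataclass
class Study:
    """Hypothetical study that hands its metadata to every job it launches."""
    name: str
    metadata: dict[str, str] = field(default_factory=dict)
    jobs: list[Job] = field(default_factory=list)

    def run_job(self, job_name: str, **job_metadata: str) -> Job:
        # Study-level metadata is applied first, so job-level values
        # can add to it or override it without touching the study.
        merged = {**self.metadata, **job_metadata}
        job = Job(name=job_name, study=self.name, metadata=merged)
        self.jobs.append(job)
        return job


# Illustrative values only.
study = Study("rotor-thermal", {"part_number": "PN-1234", "purpose": "thermal margin"})
job = study.run_job("run-001", mesh="fine")
print(job.metadata)
```

The engineer supplies only what is specific to the run (here, the mesh setting); the part number and purpose come along from the study.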

Another key component of Rescale Data is the integrated data lakehouse. With this feature, any simulation result coming out of a Rescale job is automatically pushed into the lakehouse. It’s analogous to taking a photo on your smartphone and having it instantly accessible from the cloud, searchable by date, location, and even the name of the subject. Using a Jupyter notebook, users can query all the findings from a study, or all the jobs performed by anyone who worked on a particular component. The data lakehouse becomes a source of truth for simulation results, and plays a key role in the digital thread.
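As a rough illustration of the two queries mentioned above, the sketch below uses an in-memory SQLite table as a stand-in for the lakehouse. The table name, columns, and rows are invented for the example, and the actual lakehouse interface may look quite different; the point is only the shape of the questions a notebook user might ask.

```python
import sqlite3

# In-memory stand-in for the lakehouse; schema and rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results ("
    "job TEXT, study TEXT, owner TEXT, part_number TEXT, peak_temp_c REAL)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?, ?)",
    [
        ("run-001", "rotor-thermal", "alice", "PN-1234", 412.0),
        ("run-002", "rotor-thermal", "alice", "PN-1234", 398.5),
        ("run-003", "pad-wear", "bob", "PN-1234", 305.0),
        ("run-004", "pad-wear", "bob", "PN-5678", 290.0),
    ],
)


def findings_for_study(study: str) -> list[tuple]:
    """All recorded results for one study."""
    cursor = conn.execute(
        "SELECT job, peak_temp_c FROM results WHERE study = ? ORDER BY job",
        (study,),
    )
    return cursor.fetchall()


def jobs_for_part(part_number: str) -> list[tuple]:
    """All jobs, by any user, that touched a given component."""
    cursor = conn.execute(
        "SELECT job, owner FROM results WHERE part_number = ? ORDER BY job",
        (part_number,),
    )
    return cursor.fetchall()


print(findings_for_study("rotor-thermal"))
print(jobs_for_part("PN-1234"))
```

The second query crosses study and user boundaries, which is the kind of visibility a spreadsheet kept by a single engineer cannot provide.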

Lastly, Rescale Data also includes our investment in computational pipelines. By automating simulations that are frequently run together, we can help engineers focus on analysis and decision-making instead of repeatedly running simulations.

Those who are experienced in simulation process and data management will notice that we are not focused on modeling the individual engineering processes that organizations have. Instead, we are focused only on the building blocks that our customers tell us will be most useful, while minimizing complexity: 1) capturing the context of simulations as they happen, 2) enabling any authorized user to query and analyze the data, and 3) providing flexible automation and analytics tools so customers can best harness the value of their simulation data.

Rescale Data Use Cases

Below are some of the use cases that have emerged from our customer discussions on data automation. If you are interested in learning more about the Rescale Data private beta, please sign up here, or contact your Rescale account executive.

Integrated Planning and Management

By defining a project’s studies up front, multi-disciplinary teams can work in a tightly coordinated way, with each member playing their role. This approach provides full transparency into the progress of each study and the data being used in each.

Data Governance

Engineering leaders can ensure that the important contextual information they want to retain is captured as simulation engineers run their jobs. Fields can be tailored to the project or the type of simulation being performed, and can be marked as required so that data doesn’t get missed. The data can be queried instantly in the simulation data lakehouse to help ensure data quality.
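A required-field check of this kind can be sketched in a few lines of Python. Everything here is hypothetical: the `REQUIRED_FIELDS` mapping, the simulation types, and the field names are placeholders, and this is not how Rescale custom fields are configured. It only shows the rule "required fields vary by simulation type, and a blank value counts as missing."

```python
# Hypothetical per-simulation-type requirements; names are placeholders.
REQUIRED_FIELDS = {
    "crash": {"part_number", "load_case"},
    "thermal": {"part_number", "ambient_temp_c"},
}


def is_blank(value) -> bool:
    """Treat None, empty strings, and whitespace-only strings as missing."""
    return value is None or str(value).strip() == ""


def missing_fields(simulation_type: str, metadata: dict) -> list[str]:
    """Return the required fields that are absent or blank, sorted by name."""
    required = REQUIRED_FIELDS.get(simulation_type, set())
    return sorted(name for name in required if is_blank(metadata.get(name)))


# A thermal job submitted without an ambient temperature would be flagged.
print(missing_fields("thermal", {"part_number": "PN-1234"}))
```

Running the check at submission time, rather than auditing afterwards, is what keeps gaps from accumulating in the first place.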

Model-Based Collaboration

A thermal engineer develops their method, runs thermal simulations, and identifies top candidate designs.  A structural engineer can then add another layer of data by performing peak stress analysis.

Digital Thread

An engineering lead identifies the final design after looking at all the data generated on a few leading design candidates, based on assessments from simulation engineers in different disciplines. If this decision needs to be revisited, the team has full traceability of the designs they previously evaluated. They can even add another layer of data before revising their design decision.

Workflow Automation

Understanding automotive brake squeal requires that we first analyze the contact between brake rotors and pads (nonlinear contact analysis), and then identify the vibration that contact implies (complex frequency extraction). On Rescale, computational pipelines can make this a highly automated and repeatable process.
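The two-step chain above can be sketched as a simple sequential pipeline in which each stage consumes the previous stage’s output. This is an illustration only, not Rescale’s pipeline feature: the stage functions are placeholders that return labeled strings instead of running solvers, and the names `run_pipeline`, `contact_analysis`, and `frequency_extraction` are assumptions made for the example.

```python
from typing import Callable

Stage = Callable[[dict], dict]


def contact_analysis(inputs: dict) -> dict:
    # Placeholder: a real stage would submit a nonlinear contact job
    # and return a reference to its results.
    return {**inputs, "contact_state": f"contact({inputs['model']})"}


def frequency_extraction(inputs: dict) -> dict:
    # Placeholder: a real stage would run complex frequency extraction
    # on the contact state produced by the previous stage.
    return {**inputs, "modes": f"modes({inputs['contact_state']})"}


def run_pipeline(stages: list[Stage], inputs: dict) -> dict:
    """Run stages in order, feeding each one the previous stage's output."""
    result = inputs
    for stage in stages:
        result = stage(result)
    return result


result = run_pipeline(
    [contact_analysis, frequency_extraction],
    {"model": "rotor_pad_v3"},  # hypothetical model name
)
print(result["modes"])
```

Because the stage order is declared once, rerunning the analysis for a new rotor or pad design means changing only the input, not repeating the manual hand-off between the two simulations.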

AI-Assisted Engineering with Copilot

With simulation data aggregated in the Rescale simulation data lakehouse, it becomes possible to use large language models (LLMs) to check specific datasets against requirements documents, including public regulatory guidelines. For example, an automotive engineer at ACME could use LLMs to assess how their designs perform against Euro NCAP standards.

AI Engineering with AI Physics

Rescale AI Physics powered by NVIDIA was announced at NVIDIA GTC earlier this year. The goal is to help our customers easily combine the fast-growing ecosystem of AI tools for modeling real-life physics with the computer simulations that serve as the data foundation. Key capabilities include process automation for AI Physics method development, and data provenance to version-control the AI Physics models being used.


Author

  • Edward Hsu

    Edward is responsible for product strategy, design, roadmap, and go-to-market, and for driving the commercial success of Rescale’s product portfolio. Prior to Rescale, Edward ran product and marketing at D2iQ (formerly Mesosphere), as well as product marketing at VMware. Earlier in his career, Edward worked as a consultant at McKinsey & Company, and served as an engineering lead in Oracle’s CRM division. Edward has master’s and bachelor’s degrees in Electrical Engineering and Computer Science from MIT, and an MBA from NYU Stern School of Business.