
Automating Scientific Workflows with Rescale Executor for Nextflow to Accelerate R&D Processes

Every computational discipline has its own idea of how to write scientific (or engineering) workflows, but that idea varies radically from bioinformatics to structural analysis to data science, shaped by the software, tooling, and standards of each field. We define a scientific workflow as a set of steps or software programs that must be executed, some in parallel and some sequentially (in other words, a graph), to achieve a scientific or engineering goal. When we talk about workflow, we tend to think about the science (the “work”) of assembling these steps. At any reasonable scale, however, a specialized language or framework is necessary to describe a scientific workflow in a way that handles parallelism, restart logic, and modularity for easy reuse and modification. The “work” of workflow is therefore shaped by this specialized framework, its capabilities, and the ease with which users can experiment with and contribute to existing workflows.
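To make the "workflow as a graph" definition concrete, here is a minimal Python sketch. It is an illustration only and does not use any workflow framework: the step names (`mesh`, `simulate_a`, `simulate_b`, `report`) and the `execution_batches` helper are hypothetical. It groups steps into batches, where every step in a batch has all of its dependencies satisfied and could therefore run in parallel.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each step maps to the set of steps it depends on.
workflow = {
    "mesh": set(),
    "simulate_a": {"mesh"},
    "simulate_b": {"mesh"},
    "report": {"simulate_a", "simulate_b"},
}


def execution_batches(graph):
    """Group steps into batches; steps in one batch have no unmet dependencies."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    batches = []
    while sorter.is_active():
        # Every step returned here is ready to run and independent of the others.
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        # Mark the whole batch as finished so its dependents become ready.
        sorter.done(*ready)
    return batches


batches = execution_batches(workflow)
print(batches)  # [['mesh'], ['simulate_a', 'simulate_b'], ['report']]
```

A real workflow framework adds what this sketch leaves out, such as restart logic and reusable modules.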

In bioinformatics, enormous data sets and the need for reproducibility and flexibility have driven workflows into the cloud. Nextflow from Seqera, a workflow description language that emerged from this domain, has many innovative features. Two are particularly powerful:

  1. You can separate the scientific logic from the platform on which it is executed by swapping out executors, which determine where a pipeline process is run and supervise its execution.
  2. You can mix and match any software, allowing scientists to innovate quickly by selecting the best software for a specific goal in a larger set of steps, and inserting their own tooling into established pipelines. 
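The first point, separating the scientific logic from where it runs, can be sketched in plain Python. This is not Nextflow's API or the Rescale executor: the `LocalExecutor` and `ThreadedExecutor` classes, the `pipeline` function, and the toy steps are all hypothetical names chosen for illustration. The pipeline is written once and produces the same result whichever executor is passed in.

```python
from concurrent.futures import ThreadPoolExecutor


# Toy "scientific" steps; stand-ins for real tools in a pipeline.
def square(x):
    return x * x


def add_one(x):
    return x + 1


class LocalExecutor:
    """Hypothetical executor that runs every task sequentially in-process."""

    def run(self, func, inputs):
        return [func(x) for x in inputs]


class ThreadedExecutor:
    """Hypothetical executor that runs tasks concurrently on a thread pool."""

    def __init__(self, workers=4):
        self.workers = workers

    def run(self, func, inputs):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            # pool.map returns results in input order.
            return list(pool.map(func, inputs))


def pipeline(executor, inputs):
    """The workflow logic: it never refers to how or where tasks execute."""
    squared = executor.run(square, inputs)
    return executor.run(add_one, squared)


local_result = pipeline(LocalExecutor(), [1, 2, 3])
threaded_result = pipeline(ThreadedExecutor(), [1, 2, 3])
print(local_result, threaded_result)  # [2, 5, 10] [2, 5, 10]
```

The design choice to note is that `pipeline` depends only on the `run` method, so a different execution backend can be substituted without touching the scientific steps.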

Rescale is a high-performance computing simulation platform built for the cloud that innovates in a complementary way. It is designed to make it easy for scientists and engineers to launch parallel applications (such as computational fluid dynamics, molecular dynamics simulations, and weather models) on large-scale cloud infrastructure and across multiple cloud providers. Rescale makes it easy to use the flexibility of cloud resources to speed up analyses and centralize data, so that scientists are not bound to their on-premises clusters. Until now, however, only proprietary products could orchestrate complex scientific workflows natively on the Rescale platform, which made it difficult for users to adopt new applications or program arbitrary logic to test new ideas. We have seen this kind of inflexibility hold innovation back in every growing field over the last few decades. Machine learning, for example, has been transformative in life sciences, weather, and engineering, but only where there is flexibility to mix new programs and algorithms with established methods.

We are pleased to announce our support for a Rescale executor for Nextflow. This brings Nextflow to the engineering and simulation workloads that run on Rescale, domains that have not traditionally used a dedicated workflow language. Using Nextflow on Rescale, you can run complex workflows at a previously unimagined scale under a single control plane: across cloud service providers, across execution paradigms, whether high-throughput computing (HTC) or high-performance computing (HPC), and across different software packages. In scientific and engineering scenarios with large design spaces, researchers can parallelize and replicate computing tasks to shorten time to answer, for example by evaluating many possible vehicle design points or precision drug formulations. Once processes are developed, they can be standardized, automated, and scaled. By bringing these technologies together, we are accelerating advancement and experimentation in and across new scientific domains.


Author

  • Tara Madhyastha

    Dr. Tara Madhyastha is an interdisciplinary scientist trained in high-performance computing (HPC), with contributions in the fields of computer science, education, psychology, and neuroscience. Recognizing the importance of cloud computing to science, she moved to industry in 2019 to help scientists use cloud and high-performance computing to advance their work.
