Rescale Cloud Filesystem (CFS)

Note: CFS is a beta product. Please get in touch with your Account Representative or [email protected] if you are interested in using CFS.

Managing siloed datasets across complex simulations is a common bottleneck in R&D. Rescale’s Cloud Filesystem (CFS) solves this by providing a high-performance, centralized, and persistent storage repository. It acts as a single source of truth for all critical HPC data, offering enhanced parallel file sharing that surpasses typical workload or team-specific storage solutions.

Data within CFS is stored in a filesystem-based hierarchical structure, providing an intuitive way to organize and access information. By functioning like a network drive that connects directly to Rescale jobs and workstations, CFS streamlines workflows, enhances collaboration, and helps control storage costs. As a sharable storage resource for all users of a workspace within an organization, it allows engineers, scientists, and researchers to easily organize, manage, and access common files, libraries, and project data in one place, ensuring everyone is working with the same information.

Key Benefits and Use Cases

CFS is designed to accelerate your R&D cycle by making data more accessible and manageable. It is ideal for organizations that need a secure, collaborative, and cost-effective way to handle their simulation data.

Benefits

  • Simplified Data Management: Centralize all your files by project in a single location, eliminating data silos and reducing redundant copies.
  • Accelerated Workflows: Avoid lengthy data uploads for each job and accelerate your workload with high-throughput storage. Mount the CFS to instantly access terabytes of data, moving from one simulation to the next in minutes, not hours.
  • Enhanced Collaboration: Share data seamlessly and securely across users and Rescale Projects. Since everyone accesses the same file repository, you can ensure consistency and improve teamwork.
  • Cost Optimization: Reduce your storage footprint by eliminating duplicate files. Data is kept in filesystem storage for active use and can be moved to Rescale Files for long-term storage.
  • Data Persistence: Files stored in CFS remain available indefinitely until you choose to delete them, providing a persistent home for all your critical project data across multiple jobs and analyses.

Common Use Cases

  • Large Multi-Job Simulations: A team running a series of related simulations (e.g., a Design of Experiments or a parameter sweep) can store the baseline input files in CFS. Each job can then access these files instantly without needing to re-upload them.
  • Collaborative Engineering Projects: An automotive design team can use CFS to store common CAD files, solver libraries, and simulation results. Engineers across the globe can access and add to this central repository, ensuring everyone is working on the latest design iteration.
  • Post-processing and Visualization: During or after running a large simulation, the output files are saved directly to CFS. A user can then launch a Rescale Workstation, attach the same CFS, and immediately begin post-processing or visualizing the results without any data transfer delay.

Functionality

CFS is designed to be powerful yet intuitive. Here’s how its core features work.

Data Flow for CFS-Mounted Jobs & Workstations

When CFS is mounted to a job or workstation, the workload reads and writes through the high-performance filesystem while it runs, and users can access the outputs directly from CFS. Once the job completes, its output data is written to standard Rescale Files storage so that it remains accessible in the long term. Similarly, files or jobs attached to a workstation become accessible from CFS once attached.

Data Flow in Cloud Filesystem

Direct File Upload

To improve the efficiency of managing simulation files, you can upload files directly to the Cloud Filesystem, eliminating the need for intermediate steps. This streamlines data organization, making files immediately available for your workloads and projects.

When a file is uploaded directly, it is placed in the following path, which includes the unique CFS ID: /enc/{username}/storage_ext_{cfs_id}/users/{username}/library/file.
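
As a minimal sketch, the path template above can be assembled programmatically. The function name and the example values are hypothetical, and the sketch assumes that the trailing "file" segment of the template stands for the uploaded file's name.

```python
from pathlib import PurePosixPath


def cfs_upload_path(username: str, cfs_id: str, filename: str) -> PurePosixPath:
    """Build the location of a directly uploaded file, following the template
    /enc/{username}/storage_ext_{cfs_id}/users/{username}/library/file."""
    return PurePosixPath(
        "/enc", username, f"storage_ext_{cfs_id}", "users", username, "library", filename
    )


# Hypothetical username, CFS ID, and file name, for illustration only.
print(cfs_upload_path("jdoe", "AbCdE", "mesh.cas"))
# /enc/jdoe/storage_ext_AbCdE/users/jdoe/library/mesh.cas
```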

You can add files directly to CFS using the Rescale CLI or API:

  • CLI command: rescale-cli upload -p -f --copy-to-cfs. Learn more in our CLI documentation.
  • API Endpoint: https://{platform-url}/api/v3/folders/copy-to-device/. Learn more in our API documentation.

Data Storage and Organization

Each CFS is identified by a unique ID and is mounted directly into your job or workstation. The mount point appears as a directory named after that ID, for example, storage_ext_{cfs_id}. Within this directory, the Rescale platform creates a predictable default hierarchy: dedicated directories for each user’s jobs (users/{username}/), a shared library space (library/), and a collaborative area for shared projects (projects/).

Users can also create their own sub-folders on top of this structure to organize data logically.

  • Mount Point: storage_ext_{cfs_id}
  • Example User-Created Structure (inside the mount point):
    • {username}/library/common_inputs/
    • {username}/library/2025/january/file_01.dat

Files and folders can be created, moved, or edited using standard command-line tools (e.g., cp, mv, mkdir) or through the file explorer in a Rescale Workstation desktop session.
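
The same operations can be scripted. The sketch below is a Python equivalent of mkdir -p followed by cp; it is an illustration rather than a Rescale-provided tool. The helper name stage_input is hypothetical, and a temporary directory stands in for the real mount point.

```python
import shutil
import tempfile
from pathlib import Path


def stage_input(mount: Path, relative_dir: str, source: Path) -> Path:
    """Copy `source` into `relative_dir` under the mount, creating any
    missing sub-folders first (like `mkdir -p` followed by `cp`)."""
    target_dir = mount / relative_dir
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target_dir))


# Demo against a temporary directory standing in for storage_ext_{cfs_id}.
with tempfile.TemporaryDirectory() as root:
    mount = Path(root) / "storage_ext_demo"
    mount.mkdir()
    source = Path(root) / "file_01.dat"
    source.write_text("sample input\n")

    staged = stage_input(mount, "library/common_inputs", source)
    print(staged.relative_to(mount))
```

On a real job or workstation, mount would instead point at the storage_ext_{cfs_id} directory.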

Mounting CFS to Jobs and Workstations

Once a CFS is provisioned and configured for your workspace, it is automatically mounted by default to any Rescale job or workstation and displayed on the Inputs page.

Job Mounting Cloud Filesystem

You may choose to unmount CFS by clicking on the X action button.

Once the job or workstation starts, CFS will be mounted at the storage_ext_{cfs_id} directory. You can read and write to this directory as if it were a local disk.
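
Because the mount is identified only by its storage_ext_ prefix and ID, a script may need to locate it at run time. The following sketch searches a base directory for matching entries. The base directory to search is an assumption, since the text above states only the name of the mount directory, and find_cfs_mounts is a hypothetical helper.

```python
from pathlib import Path


def find_cfs_mounts(base: Path) -> list[Path]:
    """Return directories directly under `base` whose names start with storage_ext_."""
    return sorted(p for p in base.glob("storage_ext_*") if p.is_dir())


# Assumption: the mount is visible from the current working directory.
for mount in find_cfs_mounts(Path.cwd()):
    print(mount)
```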

If you do not want your job to auto-mount the CFS, follow the instructions below:

| Method | How to Disable Auto-Mount | How to Enable Auto-Mount (Default) |
|--------|---------------------------|------------------------------------|
| UI | Deselect the CFS on the Input Files page. | Ensure the CFS appears on the Input Files page. |
| CLI | Set env var: #RESCALE_AUTO_ATTACH_CLOUDFILESYSTEM=false | Skip this in the request body since it’s the default, OR include in the request body: "autoAttachCloudfilesystem": true |
| API | In the request body, include: "autoAttachCloudfilesystem": false | Skip this in the request body since it’s the default, OR include in the request body: "autoAttachCloudfilesystem": true |
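
For the API method, the flag is a single boolean field in the job request body. The sketch below shows one way to set it explicitly before sending the request. Only the autoAttachCloudfilesystem field comes from the table above; the helper name and the "name" field are placeholders, so consult the API documentation for the actual job schema.

```python
import json


def with_cfs_auto_attach(body: dict, enabled: bool = True) -> dict:
    """Return a copy of a job request body with the auto-mount flag set explicitly."""
    updated = dict(body)
    updated["autoAttachCloudfilesystem"] = enabled
    return updated


# "name" is a placeholder field for illustration, not a documented schema.
payload = with_cfs_auto_attach({"name": "example-job"}, enabled=False)
print(json.dumps(payload))
# {"name": "example-job", "autoAttachCloudfilesystem": false}
```

Returning a copy leaves the original request body untouched, which helps when the same template is reused for jobs with different settings.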

Data Partitioning and Sharing

By default, data is organized in user-specific directories, and data owners have read/write permissions. Additional directories in CFS support sharing and collaboration.

A primary way to manage shared data is through Rescale Projects. When you assign a job to a Project, the Rescale platform creates a dedicated directory for that project within the CFS (e.g., storage_ext_{cfs_id}/projects/Project_Name-{project_id}/). To make the job’s data accessible, a symbolic link (symlink) is created within this project folder, pointing directly to the job’s actual working directory (e.g., in storage_ext_{cfs_id}/users/{username}/…).

This powerful mechanism allows all project members to access the job’s files through the shared project folder as if the data resided there directly, without creating duplicate copies. This ensures that all collaborators are working from a single source of truth.
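
The symlink mechanism can be illustrated locally. The sketch below recreates the layout in a temporary directory and shows that a file reached through the project folder is the same file that lives in the user's job directory. On the platform, Rescale creates the link itself; the directory names, the job folder name, and the helper link_job_into_project are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def link_job_into_project(job_dir: Path, project_dir: Path) -> Path:
    """Create project_dir/<job directory name> as a symlink pointing at job_dir."""
    project_dir.mkdir(parents=True, exist_ok=True)
    link = project_dir / job_dir.name
    link.symlink_to(job_dir, target_is_directory=True)
    return link


with tempfile.TemporaryDirectory() as root:
    mount = Path(root) / "storage_ext_demo"          # stand-in for the CFS mount
    job_dir = mount / "users" / "jdoe" / "job_123"   # hypothetical job directory
    job_dir.mkdir(parents=True)
    (job_dir / "results.dat").write_text("42\n")

    link = link_job_into_project(job_dir, mount / "projects" / "Demo-abc")
    # The file is reachable through the project folder but exists only once on disk.
    print((link / "results.dat").read_text().strip())
    print(os.path.samefile(link / "results.dat", job_dir / "results.dat"))
```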

Similarly, if a workstation is assigned to a project, the user can access all project directories they are a part of, allowing them to read and write project files directly from their workstation.

Usage Monitoring and Alerts

For workspaces using Cloud Filesystem (CFS), Workspace Administrators have a dedicated dashboard to track storage consumption at the user level.

Cloud Filesystem Usage Dashboard
  • Granular Visibility: Admins can view exactly how much CFS storage each specific user is consuming.
  • Proactive Management: Quickly identify high-utilization users to manage quotas or encourage cleanup before limits are reached.

To help manage storage capacity, CFS usage alerts are automatically sent to workspace administrators when the filesystem reaches 60% and 90% of its total storage capacity.
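
The alert levels can also be checked from a script, for example before launching a job that writes large outputs. The sketch below is an assumption-laden illustration, not how the platform computes its alerts: the helper names are hypothetical, and check_mount relies on the totals the operating system reports for the mounted path.

```python
import shutil

ALERT_THRESHOLDS_PERCENT = (60, 90)  # alert levels stated in the text above


def crossed_thresholds(used_bytes: int, capacity_bytes: int) -> list[int]:
    """Return the alert thresholds (in percent) that current usage has reached."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    # Integer arithmetic avoids floating-point rounding at the exact boundary.
    return [pct for pct in ALERT_THRESHOLDS_PERCENT
            if used_bytes * 100 >= pct * capacity_bytes]


def check_mount(path: str) -> list[int]:
    """Apply the same check to a mounted filesystem using OS-reported totals."""
    usage = shutil.disk_usage(path)
    return crossed_thresholds(usage.used, usage.total)


print(crossed_thresholds(used_bytes=7, capacity_bytes=10))  # [60]
```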

Deleting Data

Data stored in CFS is persistent, meaning it remains until a user explicitly deletes it. Deleting files or folders can be done through two primary methods:

  1. From a Live Workstation: Navigate and delete files using a workstation that has the CFS mounted. Files can be deleted through the workstation’s graphical interface or by using terminal commands (e.g., rm) via the built-in terminal or an SSH connection.
  2. (COMING SOON!) When Archiving a Job: Data in CFS will be deleted when a job or workstation that has CFS mounted is archived. The job archival process can be initiated from the Rescale UI, CLI, or API.

File Deletion from CFS when Job is Archived

⚠️ Note: Deleting data from CFS is a permanent action and cannot be undone. Always ensure you have backups of critical data before deleting it.

Supported Filesystem: Lustre

Rescale CFS is powered by the Lustre filesystem, an open-source, parallel filesystem designed for the massive-scale cluster computing required in High-Performance Computing (HPC).

Benefits of Lustre for HPC

  • High Throughput: Lustre is built to handle extremely high data rates, making it ideal for applications that need to read and write large volumes of data quickly. This is critical for I/O-intensive simulations where data access speed can be a major bottleneck.
  • Scalability: It is designed to scale to thousands of clients, petabytes of storage, and hundreds of gigabytes per second of I/O throughput. As your computational needs grow, Lustre can scale with you.
  • Parallel I/O: Lustre allows multiple clients to write to the same file in parallel. This capability is essential for large-scale HPC jobs running across many nodes, as it prevents data access from becoming a serialized bottleneck and significantly accelerates computation time.
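
The access pattern behind parallel I/O can be sketched on any POSIX system: several writers fill disjoint byte ranges of one shared file at the same time. The example below uses threads and os.pwrite purely for illustration. A real HPC job would typically use separate processes on separate nodes (for example through MPI-IO), and the sketch does not demonstrate Lustre-specific behavior such as striping. The function names and chunk size are hypothetical.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

CHUNK = 4  # bytes written by each worker; tiny for illustration


def write_chunk(path: str, index: int) -> None:
    """Write one worker's chunk at its own offset, leaving other regions untouched."""
    payload = str(index % 10).encode() * CHUNK
    fd = os.open(path, os.O_WRONLY)
    try:
        os.pwrite(fd, payload, index * CHUNK)
    finally:
        os.close(fd)


def parallel_write(path: str, workers: int) -> bytes:
    """Have `workers` threads fill disjoint regions of one shared file, then read it back."""
    with open(path, "wb") as f:
        f.truncate(workers * CHUNK)  # pre-size the shared file
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda i: write_chunk(path, i), range(workers)))
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as root:
    print(parallel_write(os.path.join(root, "shared.bin"), workers=4))
    # b'0000111122223333'
```

Because each writer owns a fixed offset range, the result does not depend on the order in which the writers finish.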

Please contact your Rescale representative about other filesystem options.

Configuration and Options

  • Workspace Limit: Only one Cloud Filesystem can be provisioned per Rescale workspace.
  • Sizing: CFS is provisioned with a fixed storage capacity. If your storage requirements exceed the provisioned size, please contact your Rescale representative to discuss available options for additional capacity.
  • Optional Backups: For enhanced data protection, backups that mirror the CFS data every two hours can be enabled. This provides a fallback option to restore business-critical data in the rare event of a filesystem issue.

Frequently Asked Questions (FAQs)

1. How is CFS different from Rescale Files?

Rescale Files is based on lower-performance, highly scalable storage intended for keeping data long-term. CFS, by contrast, is a high-performance, persistent, centralized storage space that exists independently and can be attached to any number of jobs or workstations, making it ideal for files needed across multiple runs.

2. What kind of performance can I expect from CFS?

CFS is optimized for bulk data access and throughput, making it well suited to reading large input decks and writing result files. It targets workloads that need a high-performance parallel filesystem for I/O-intensive computation, such as HPC R&D, and it also offers good interactive performance for general use. Contact your Rescale representative for more information about performance tiers.

3. Is the data stored in CFS secure?

Yes. All data in CFS is encrypted in transit and at rest using industry-standard AES-256 encryption. Access is controlled through your secure Rescale account credentials. Only CFS-configured users can mount the filesystem, thus preventing access from other platform users. This creates an additional layer of protection by isolating workspaces so that only trusted team members can access each team’s CFS data.