Run Hundreds of Free AI & HPC Applications in the NVIDIA NGC Catalog Accelerated on Rescale


Overview

The NVIDIA GPU Cloud (NGC) Catalog is a curated set of free and open-source GPU-optimized software for AI, HPC, and visualization. The NGC Catalog consists of containerized applications, pre-trained models, Helm charts for Kubernetes deployments, and industry-specific AI toolkits with software development kits (SDKs). Deploying NGC on Rescale makes it easier for engineers and scientists to get started with a variety of new research and development use cases on a single platform. Rescale automates the necessary hardware infrastructure, software, and workflow steps to make the latest computational tools more accessible and accelerated.

In this tutorial, we show you how to easily run the NGC Catalog on Rescale. NGC can be run either as a batch job using the Rescale command line or interactively using Rescale Workstations.

Batch Job

Batch Job Video Tutorial

Sample Job and Results

You can access and directly launch the sample job by clicking the Import Job Setup button or view the results by clicking the Get Job Results button below. 

In the above sample job, we run a Python script that builds a TensorFlow machine learning model on the MNIST dataset (Reference link).
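
The screenshots do not show the contents of test.py, but based on the referenced TensorFlow quickstart the script presumably looks roughly like the following minimal sketch (the exact contents of test.zip are an assumption):

# test.py - minimal sketch of an MNIST training script (assumed contents,
# adapted from the referenced TensorFlow quickstart)
import tensorflow as tf

# Load and normalize the MNIST dataset
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# Simple fully connected classifier
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax'),
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train on the GPU(s) and report test accuracy
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)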

To submit the batch job, the first step is to choose Basic Job Type.

Inputs

Upload the scripts to run as input files. In the sample job, the Python scripts are inside the compressed archive test.zip.


Software

Next choose the NVIDIA NGC Catalog tile.


In the Command box, we provide a pre-populated command to log in to your NGC account and to pull and run Docker containers.

Users can pull any container from the NGC Catalog. Here, we use the TensorFlow container as an example.


To pull the TensorFlow container, go to the NGC Catalog and click Pull Tag to copy the docker pull command.


In the Command box, paste the copied pull command and add the following commands to run the container, as in the sample job (the -v ${PWD}/test:/test option mounts the test directory extracted from test.zip into the container).

docker pull nvcr.io/nvidia/tensorflow:22.06-tf1-py3
docker run --gpus all -v ${PWD}/test:/test nvcr.io/nvidia/tensorflow:22.06-tf1-py3 \
    bash -c 'cd /test; python test.py'

Users can log in to their NGC Catalog account through the Rescale platform. Check the Use Existing License box in the License Options and input your API Key and Org Code, as shown in the figure below.


To get these values, log in to your NGC Catalog account and generate your own API Key and Org Code.


We provide a pre-populated command to log in to the NGC account; users only need to specify the container to pull and the command to run in the Command box.
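
Putting these pieces together, the Command box for the sample job ends up looking roughly like the sketch below. The two login lines are shown here only for illustration (they reappear in the Workstations section later); the exact pre-populated text may differ, and only the pull and run lines need to be edited for your own container:

# Pre-populated NGC login using the API Key and Org Code entered under License Options
echo -e "\n \n $(echo $NGC_ORG_ID) \n \n \n" | ngc config set
docker login -u '$oauthtoken' --password-stdin nvcr.io <<< $(echo $NGC_CLI_API_KEY)

# User-specified container to pull and command to run
docker pull nvcr.io/nvidia/tensorflow:22.06-tf1-py3
docker run --gpus all -v ${PWD}/test:/test nvcr.io/nvidia/tensorflow:22.06-tf1-py3 \
    bash -c 'cd /test; python test.py'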

Hardware

We provide a variety of architectures for your job, including different CPU and GPU core types. Here we choose 4 GPUs for the TensorFlow training.

Status

Click the Submit button to submit the job. Once the job is running, you can check process_output.log for the output.


Results

While the job is running, you can also open a terminal by clicking the Open in new window button to debug your script, check intermediate results, or run nvidia-smi to monitor GPU usage.
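
For example, from that terminal you might run the following while training is in progress (illustrative commands; process_output.log is the same log file mentioned above, assumed to be in the job's work directory):

nvidia-smi                   # check GPU utilization and memory
tail -f process_output.log   # follow the job output as it is written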


After the job is finished, all result files are saved to the Results tab, where you can open them and check the results.


In the process_output.log file, we can see that the NGC Catalog account was logged in successfully.


Here are the final output and results for the sample job.


Other use cases

NVIDIA Modulus

The input file is example.zip, downloaded from the official Modulus website. For this use case, the sample job can be loaded through this link and the results here.

The command below pulls the latest version of NVIDIA Modulus from NGC and runs the lid-driven cavity (LDC) flow case in the Modulus container with docker run. For this container, an NGC API key is not required.

docker pull nvcr.io/nvidia/modulus/modulus:22.03.1
docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    --runtime nvidia -v ${PWD}/examples:/examples nvcr.io/nvidia/modulus/modulus:22.03.1 \
    bash -c 'cd ldc; python ldc_2d.py'

Monitor the process_output.log and check the outputs.


Workstations

Workstations Video Tutorial

Sample Workstation

On the Rescale platform, users can also work interactively with Workstations. To submit a job, click the New workstation button. Here is a sample job for reference.


Configuration

For the Input, upload the scripts you need to run. Next, choose the NGC Catalog Interactive Workflow tile as the Software. Unlike batch jobs, a Workstation does not require a command script; instead, you interact with it through a GUI.


Check the Use Existing License box in the License Options to log in to your NGC account, and input your NGC API Key and Org Code here. The steps for generating the NGC API Key and the Hardware settings are the same as for batch jobs and are not repeated here.


Once the Workstation is ready, click the Connect button at the top, or choose Connect using local client (download and install the NICE DCV client) to log in to the Workstation.


In the Workstation, open a terminal window by clicking Activities in the top-left corner, finding Terminal, and launching it.


In the terminal we can now work interactively. Run the following two commands to log in to the NGC account.

$ echo -e "\n \n $(echo $NGC_ORG_ID) \n \n \n" | ngc config set
$ docker login -u '$oauthtoken' --password-stdin nvcr.io <<< $(echo $NGC_CLI_API_KEY)
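
If the login succeeds, docker prints Login Succeeded. As an optional sanity check, you can also display the active NGC configuration (assuming your NGC CLI version provides the config current subcommand):

$ ngc config current    # show the org/team the CLI is now configured with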


Then we are ready to pull containers from the NGC Catalog.

TensorFlow Container

In this example, we pull the TensorFlow container:

$ docker pull nvcr.io/nvidia/tensorflow:22.06-tf1-py3

Start the TensorFlow container with:

$ docker run --gpus all -it --rm -v ${PWD}/test:/test nvcr.io/nvidia/tensorflow:22.06-tf1-py3 bash


Run the following simple case in the container (Reference link):

$ python3
>>> import tensorflow as tf
>>> tf.enable_eager_execution()
>>> mnist = tf.keras.datasets.mnist
>>> (x_train, y_train), (x_test, y_test) = mnist.load_data()
>>> x_train, x_test = x_train / 255.0, x_test / 255.0
>>> model = tf.keras.models.Sequential([
...   tf.keras.layers.Flatten(input_shape=(28, 28)),
...   tf.keras.layers.Dense(128, activation='relu'),
...   tf.keras.layers.Dropout(0.2),
...   tf.keras.layers.Dense(10)
... ])
>>> predictions = model(x_train[:1]).numpy()
>>> predictions

If everything works, the predicted results are printed to the terminal.
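
Following the referenced tutorial, the returned values are logits; as a quick check they can be converted to class probabilities (the tutorial then continues by compiling and training the model):

>>> tf.nn.softmax(predictions).numpy()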


Modulus Container

Pull the latest version of NVIDIA Modulus from NGC.

$ docker pull nvcr.io/nvidia/modulus/modulus:22.03.1

Start the Modulus container, mounting the examples directory:

$ docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 \
    --runtime nvidia -it -v ${PWD}/examples:/examples \
    nvcr.io/nvidia/modulus/modulus:22.03.1 bash

The latest Modulus container loads successfully.


Run the lid-driven cavity case in the container:

$ cd ldc
$ python ldc_2d.py


While the above simulation is running, we can pull the ParaView GUI container for visualization.

ParaView GUI container

Open a new terminal window and pull the ParaView Docker container from NGC.

$ docker pull nvcr.io/nvidia-hpcvis/paraview:egl-py3-5.9.0
$ docker run --gpus all -p 8080:8080 -v ${PWD}:/ldc \
    nvcr.io/nvidia-hpcvis/paraview:egl-py3-5.9.0 \
    ./bin/pvpython share/paraview-5.9/web/visualizer/server/pvw-visualizer.py \
    --content share/paraview-5.9/web/visualizer/www/ \
    --port 8080 --data /ldc


Open ParaView in the DCV session's web browser (for example, Firefox) at http://localhost:8080/.

Then you can open the GUI and visualize the result files.


Conclusion

We hope this tutorial helped you get started with running the NGC Catalog on Rescale. As you can see, you can pull and run all of the containers available in the NGC Catalog with multiple GPUs through a Rescale batch job, or launch containers interactively on a Rescale Workstation to develop code, monitor training, and post-process results. Our platform provides various GPU architectures to choose from and all the assets you need to build your AI workflow. For additional information, you can visit the NVIDIA documentation page for NGC here.