About Paperspace Jobs
The Paperspace Job Runner is designed for executing code (such as training a deep neural network) on a cluster of GPUs without managing any infrastructure.
The Job Runner is part of a larger suite of tools that works seamlessly with Gradient Notebooks and our Core product; together they form a production-ready ML/AI pipeline.
A Job consists of:
- a collection of files (code, resources, etc.) from your local computer or GitHub
- a container (with code dependencies and packages pre-installed)
- a command to execute (e.g. `python main.py` or `nvidia-smi`)
Running a Job in Gradient
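From the CLI, submitting a Job looks something like the sketch below. This assumes the paperspace-node CLI and a hypothetical project directory; exact flag names may differ between CLI versions.

```bash
# A minimal sketch, assuming the paperspace-node CLI; exact flags may
# vary by version. This submits the current directory as the workspace,
# runs it in a stock TensorFlow GPU container on a P5000 machine, and
# executes main.py as the Job command.
paperspace jobs create \
  --machineType "P5000" \
  --container "tensorflow/tensorflow:latest-gpu" \
  --workspace "./" \
  --command "python main.py"
```

The flags map directly onto the three parts of a Job listed above: the workspace (your files), the container, and the command.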
Optional Job features
There are many features you will want to check out, such as outputting your model to the /artifacts directory, the persistent data layer at /storage, graphing with Job Metrics, sharing Jobs with the Public Jobs feature, and opening ports (e.g. for accessing TensorBoard).
Jobs can run on a variety of hardware types. Pricing and details for all available options can be found here.
A detailed guide to all data and storage options is available here. Here's an overview:
There is a persistent data layer at /storage that is automatically available across all of your Jobs as well as your Notebooks and VMs. This storage is backed by a standard filesystem (read/write).
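For instance, a training script can checkpoint to /storage in one Job, and a later Job (or a Notebook) can resume from it. A minimal sketch; the path and contents are hypothetical:

```python
# A minimal sketch of the persistent /storage layer; the checkpoint
# path and contents are hypothetical. /storage persists across Jobs,
# Notebooks, and VMs on your account.
import os

CHECKPOINT = "/storage/my-experiment/checkpoint.txt"  # hypothetical path
os.makedirs(os.path.dirname(CHECKPOINT), exist_ok=True)

# Write state in one Job...
with open(CHECKPOINT, "w") as f:
    f.write("step=1000")

# ...and read it back from any later Job or Notebook.
with open(CHECKPOINT) as f:
    print(f.read())  # -> step=1000
```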
Anything you add to /artifacts while your Job runs will be available in the CLI and web UI after the Job is complete. Once the Job finishes, artifacts are read-only.
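For example, a Job can export its results or trained model there before exiting. A minimal sketch; the filename and values are hypothetical placeholders:

```python
# A minimal sketch of exporting Job outputs via /artifacts; the file
# name and values are hypothetical. Anything written here is collected
# when the Job completes and surfaced in the CLI and web UI.
import json

results = {"final_loss": 0.042, "epochs": 20}  # placeholder results
with open("/artifacts/results.json", "w") as f:
    json.dump(results, f)
```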
Jobs also have access to a read-only directory mounted at /datasets, which includes a set of popular datasets.
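A quick way to see what is mounted is to list the directory from inside a Job:

```python
# A minimal sketch: inspect the read-only /datasets mount from a Job.
import os

for name in sorted(os.listdir("/datasets")):
    print(name)  # each entry is a pre-mounted dataset directory
```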