Skip to main content

Accelerate

Project description



License Documentation GitHub release Contributor Covenant

Run your *raw* PyTorch training script on any kind of device

Easy to integrate

🤗 Accelerate was created for PyTorch users who like to write the training loop of PyTorch models but are reluctant to write and maintain the boilerplate code needed to use multi-GPUs/TPU/fp16.

🤗 Accelerate abstracts exactly and only the boilerplate code related to multi-GPUs/TPU/fp16 and leaves the rest of your code unchanged.

Here is an example:

  import torch
  import torch.nn.functional as F
  from datasets import load_dataset
+ from accelerate import Accelerator

+ accelerator = Accelerator()
- device = 'cpu'
+ device = accelerator.device

  model = torch.nn.Transformer().to(device)
  optimizer = torch.optim.Adam(model.parameters())

  dataset = load_dataset('my_dataset')
  data = torch.utils.data.DataLoader(dataset, shuffle=True)

+ model, optimizer, data = accelerator.prepare(model, optimizer, data)

  model.train()
  for epoch in range(10):
      for source, targets in data:
          source = source.to(device)
          targets = targets.to(device)

          optimizer.zero_grad()

          output = model(source)
          loss = F.cross_entropy(output, targets)

-         loss.backward()
+         accelerator.backward(loss)

          optimizer.step()

As you can see in this example, by adding 5-lines to any standard PyTorch training script you can now run on any kind of single or distributed node setting (single CPU, single GPU, multi-GPUs and TPUs) as well as with or without mixed precision (fp16).

In particular, the same code can then be run without modification on your local machine for debugging or your training environment.

🤗 Accelerate even handles the device placement for you (which requires a few more changes to your code, but is safer in general), so you can even simplify your training loop further:

  import torch
  import torch.nn.functional as F
  from datasets import load_dataset
+ from accelerate import Accelerator

- device = 'cpu'
+ accelerator = Accelerator()

- model = torch.nn.Transformer().to(device)
+ model = torch.nn.Transformer()
  optimizer = torch.optim.Adam(model.parameters())

  dataset = load_dataset('my_dataset')
  data = torch.utils.data.DataLoader(dataset, shuffle=True)

+ model, optimizer, data = accelerator.prepare(model, optimizer, data)

  model.train()
  for epoch in range(10):
      for source, targets in data:
-         source = source.to(device)
-         targets = targets.to(device)

          optimizer.zero_grad()

          output = model(source)
          loss = F.cross_entropy(output, targets)

-         loss.backward()
+         accelerator.backward(loss)

          optimizer.step()

Want to learn more? Check out the documentation or have look at our examples.

Launching script

🤗 Accelerate also provides an optional CLI tool that allows you to quickly configure and test your training environment before launching the scripts. No need to remember how to use torch.distributed.launch or to write a specific launcher for TPU training! On your machine(s) just run:

accelerate config

and answer the questions asked. This will generate a config file that will be used automatically to properly set the default options when doing

accelerate launch my_script.py --args_to_my_script

For instance, here is how you would run the GLUE example on the MRPC task (from the root of the repo):

accelerate launch examples/nlp_example.py

Launching multi-CPU run using MPI

🤗 Here is another way to launch multi-CPU run using MPI. You can learn how to install Open MPI on this page. You can use Intel MPI or MVAPICH as well. Once you have MPI setup on your cluster, just run:

mpirun -np 2 python examples/nlp_example.py

Launching training using DeepSpeed

🤗 Accelerate supports training on single/multiple GPUs using DeepSpeed. to use it, you don't need to change anything in your training code; you can set everything using just accelerate config. However, if you desire to tweak your DeepSpeed related args from your python script, we provide you the DeepSpeedPlugin.

from accelerator import Accelerator, DeepSpeedPlugin

# deepspeed needs to know your gradient accumulation steps before hand, so don't forget to pass it
# Remember you still need to do gradient accumulation by yourself, just like you would have done without deepspeed
deepspeed_plugin = DeepSpeedPlugin(zero_stage=2, gradient_accumulation_steps=2)
accelerator = Accelerator(fp16=True, deepspeed_plugin=deepspeed_plugin)

# How to save your 🤗 Transformer?
accelerator.wait_for_everyone()
unwrapped_model = accelerator.unwrap_model(model)
unwrapped_model.save_pretrained(save_dir, save_function=accelerator.save, state_dict=accelerator.get_state_dict(model))

Note: DeepSpeed support is experimental for now. In case you get into some problem, please open an issue.

Launching your training from a notebook

🤗 Accelerate also provides a notebook_launcher function you can use in a notebook to launch a distributed training. This is especially useful for Colab or Kaggle notebooks with a TPU backend. Just define your training loop in a training_function then in your last cell, add:

from accelerate import notebook_launcher

notebook_launcher(training_function)

An example can be found in this notebook. Open In Colab

Why should I use 🤗 Accelerate?

You should use 🤗 Accelerate when you want to easily run your training scripts in a distributed environment without having to renounce full control over your training loop. This is not a high-level framework above PyTorch, just a thin wrapper so you don't have to learn a new library, In fact the whole API of 🤗 Accelerate is in one class, the Accelerator object.

Why shouldn't I use 🤗 Accelerate?

You shouldn't use 🤗 Accelerate if you don't want to write a training loop yourself. There are plenty of high-level libraries above PyTorch that will offer you that, 🤗 Accelerate is not one of them.

Installation

This repository is tested on Python 3.6+ and PyTorch 1.4.0+

You should install 🤗 Accelerate in a virtual environment. If you're unfamiliar with Python virtual environments, check out the user guide.

First, create a virtual environment with the version of Python you're going to use and activate it.

Then, you will need to install PyTorch: refer to the official installation page regarding the specific install command for your platform. Then 🤗 Accelerate can be installed using pip as follows:

pip install accelerate

Supported integrations

  • CPU only
  • multi-CPU on one node (machine)
  • multi-CPU on several nodes (machines)
  • single GPU
  • multi-GPU on one node (machine)
  • multi-GPU on several nodes (machines)
  • TPU
  • FP16 with native AMP (apex on the roadmap)
  • DeepSpeed support (experimental)

Project details


Download files

Download the file for your platform. If you're not sure which to choose, learn more about installing packages.

Source Distribution

accelerate-0.4.0.tar.gz (43.7 kB view details)

Uploaded Source

Built Distribution

accelerate-0.4.0-py3-none-any.whl (55.3 kB view details)

Uploaded Python 3

File details

Details for the file accelerate-0.4.0.tar.gz.

File metadata

  • Download URL: accelerate-0.4.0.tar.gz
  • Upload date:
  • Size: 43.7 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.7.9

File hashes

Hashes for accelerate-0.4.0.tar.gz
Algorithm Hash digest
SHA256 4f1c5153ab855943662b02a648e038f4fc9255525f03b2494534527e7339dd07
MD5 6226455c963bb4b61bad837ae5ee873b
BLAKE2b-256 6b39891c6f67027fb1e5dfe5f95f79bfc7ec8e3e4fffe94a17d0760712a0cb02

See more details on using hashes here.

File details

Details for the file accelerate-0.4.0-py3-none-any.whl.

File metadata

  • Download URL: accelerate-0.4.0-py3-none-any.whl
  • Upload date:
  • Size: 55.3 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.2.0 pkginfo/1.5.0.1 requests/2.24.0 setuptools/47.1.0 requests-toolbelt/0.9.1 tqdm/4.49.0 CPython/3.7.9

File hashes

Hashes for accelerate-0.4.0-py3-none-any.whl
Algorithm Hash digest
SHA256 08e37ce6d314e79aaf2ef01782564b7588f7163623fc01bfd2b57e226d579453
MD5 e858488bdf4823c9318cd2b37088c04d
BLAKE2b-256 b31034c6f17375b1be01a9aad4e11dfd60f4dae3341e7f69aaaea411bd7421ca

See more details on using hashes here.

Supported by

AWS AWS Cloud computing and Security Sponsor Datadog Datadog Monitoring Fastly Fastly CDN Google Google Download Analytics Microsoft Microsoft PSF Sponsor Pingdom Pingdom Monitoring Sentry Sentry Error logging StatusPage StatusPage Status page