Ray's Blog
PyTorch Distributed Data Parallel with Model Parallel in an HPC Environment

Objective

This tutorial covers:
  • how to split a model and place the parts on multiple GPUs,
  • how to train such a model in a distributed data parallel fashion,
  • how to use torch.distributed.launch and write a Slurm job script for an HPC environment.

Model Parallel (Pipelining)

When a model is too large to fit on a single GPU, we can cut it in half and put each part on a different GPU. To do this, we partition the model into a "head" and a "tail" and specify which device each part lives on. In the following toy example, we simply put the first part on the current GPU device and the second part on the next device.
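The excerpt stops just before the example it refers to. Below is a minimal sketch of the head/tail split described above; the class name, layer sizes, and device strings are illustrative assumptions, not the post's actual code.

```python
import torch
import torch.nn as nn

class ToyModelParallel(nn.Module):
    """A toy model cut in half: 'head' on one GPU, 'tail' on the next."""
    def __init__(self, dev0="cuda:0", dev1="cuda:1"):
        super().__init__()
        self.dev0, self.dev1 = dev0, dev1
        self.head = nn.Sequential(nn.Linear(1024, 512), nn.ReLU()).to(dev0)
        self.tail = nn.Linear(512, 10).to(dev1)

    def forward(self, x):
        x = self.head(x.to(self.dev0))
        # Move the intermediate activation over to the tail's device.
        return self.tail(x.to(self.dev1))

model = ToyModelParallel()
out = model(torch.randn(8, 1024))  # output (and any loss) lives on cuda:1
```

Per the objectives above, the full post goes on to train this two-device module with distributed data parallelism, launched via torch.distributed.launch from a Slurm job script; only the partitioning step is sketched here.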

  • Documentation
Thursday, December 12, 2019 | 5 minutes Read
Learning PyTorch Part I

Introduction

Currently, I am participating in the deep learning part1v2 course as an "international fellow". The course is taught by Jeremy Howard of fast.ai. It is not available to the public yet, but it will be in the future. During the course, Jeremy introduced PyTorch and the fastai package built on top of PyTorch. Before this, I had only used TensorFlow and Keras. PyTorch is quite different (in a good way). I am very impressed by the elegant and flexible design of PyTorch, and I would like to introduce some of the features I find interesting.
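The excerpt does not name those features, but one hallmark of the flexibility mentioned above is PyTorch's define-by-run autograd, where the computation graph is built as ordinary Python executes. A minimal sketch (my own illustration, not code from the post):

```python
import torch

x = torch.randn(3, requires_grad=True)
y = x * 2
# Data-dependent Python control flow; no separate graph-compilation step,
# unlike the static-graph TensorFlow of that era.
while y.norm() < 100:
    y = y * 2
y.sum().backward()
print(x.grad)  # gradients flow through however many loop iterations ran
```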

    Monday, November 13, 2017 | 3 minutes Read
    Contact me:
    • yren@bnl.gov
    • yhren
    • Yihui (Ray) Ren

    Liability Notice: This blog is for informational and educational purposes only. The content provided here represents personal opinions and is not intended as professional advice. Readers should not rely solely on this information and are responsible for their own actions and decisions. This blog is not liable for any damages or consequences resulting from the use of its content. The views expressed here are my own and do not reflect those of my employer or any funding agencies. © 2017-2025 Yihui Ren. All rights reserved.

