Deep learning in fields such as computer vision and natural language processing
has moved towards increasingly large neural networks, in both depth and
parameter count. This creates two major downsides for deep learning
researchers:
- Training these networks takes a long time, even on GPUs.
- Their memory footprint is so large that they cannot fit in the DRAM of a single typical GPU.
This research project aims to explore and develop algorithms for parallel deep
learning. We are working on improving both the time and the memory efficiency
of training large neural networks in a distributed setting, and we seek to
scale beyond the current state of the art to train even larger architectures.
The aim is to develop a robust, user-friendly deep learning framework that
makes it easy for the end user to train large neural networks in distributed
environments.
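As a rough illustration of the kind of distributed training the project targets, the sketch below simulates data parallelism in plain NumPy: each simulated "worker" holds a replica of the model, computes a gradient on its own shard of the batch, and the gradients are averaged (the role an all-reduce plays in a real framework) before every replica applies the same update. All names here are illustrative, and the single-process simulation stands in for real multi-GPU communication; it is not the project's actual implementation.

```python
import numpy as np

# Hypothetical single-process simulation of data-parallel training.
# Each "worker" keeps a full copy of the parameters and computes a
# gradient on its own shard; averaging the gradients mimics an all-reduce.

rng = np.random.default_rng(0)
n_workers, batch, dim = 4, 32, 8
w = np.zeros(dim)                      # replicated model parameters
w_true = rng.normal(size=dim)          # target weights for synthetic data

X = rng.normal(size=(batch, dim))
y = X @ w_true

def local_grad(w, X_shard, y_shard):
    """Mean-squared-error gradient computed on one worker's shard."""
    err = X_shard @ w - y_shard
    return 2 * X_shard.T @ err / len(y_shard)

lr = 0.05
for step in range(200):
    shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
    grads = [local_grad(w, Xs, ys) for Xs, ys in shards]
    g = np.mean(grads, axis=0)         # stand-in for an all-reduce average
    w -= lr * g                        # identical update on every replica
```

With equally sized shards, the averaged gradient equals the full-batch gradient, so every replica stays in sync; this is the same invariant that frameworks such as PyTorch's DistributedDataParallel maintain across real GPUs. It addresses training time, but not the memory downside, since each worker still stores the whole model; relieving that requires the model- or pipeline-parallel techniques this project also studies.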