Parallel and Distributed Deep Learning

This report explores ways to parallelize and distribute deep learning in multi-core and distributed settings. We empirically analyze the speedup from training a CNN on a conventional single-core CPU versus a GPU and provide practical suggestions for improving training times. In the distributed setting, we study synchronous and asynchronous weight-update algorithms (Parallel SGD, ADMM, and Downpour SGD) and derive worst-case asymptotic communication costs and computation times for each of these algorithms.
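As a rough illustration of the synchronous case, the sketch below shows parameter-averaging Parallel SGD on a toy linear-regression task: each worker runs plain SGD on its own data shard, and the shard-level weights are averaged into one model. The worker count, learning rate, and task are illustrative assumptions, not details from the report.

```python
# Minimal sketch of synchronous parameter averaging ("Parallel SGD").
# Hyperparameters and the toy regression task are assumptions for illustration.
import numpy as np

def local_sgd(X, y, w, lr=0.01, epochs=5):
    """Plain SGD on one worker's shard of the data."""
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            grad = 2.0 * (xi @ w - yi) * xi   # gradient of squared error
            w = w - lr * grad
    return w

def parallel_sgd(X, y, num_workers=4):
    """Shard the data, train each shard independently, then average weights."""
    shards_X = np.array_split(X, num_workers)
    shards_y = np.array_split(y, num_workers)
    w0 = np.zeros(X.shape[1])
    local_weights = [local_sgd(Xs, ys, w0.copy())
                     for Xs, ys in zip(shards_X, shards_y)]
    return np.mean(local_weights, axis=0)     # synchronous averaging step

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 3))
    true_w = np.array([1.5, -2.0, 0.5])
    y = X @ true_w + 0.01 * rng.normal(size=400)
    print("averaged weights:", parallel_sgd(X, y))
```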

#distributed-learning, #machine-learning, #split-learning