
Daily Data Science Tip #10

Why do we batch the dataset before training?


As a Machine Learning practitioner, you’ve probably wondered why it is standard practice to batch training data before feeding it to a neural network.

A straightforward answer is that training data is batched mainly for memory optimisation. Placing a whole dataset, for example all 60,000 images of the MNIST training set, in a GPU’s memory at once is very expensive: the forward and backward passes must hold activations and gradients for every example simultaneously. You would probably run into the infamous “RuntimeError: CUDA error: out of memory”.
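To make the cost concrete, here is a minimal, illustrative sketch in PyTorch (the framework behind the CUDA error quoted above). The sizes are rough back-of-the-envelope assumptions, not a precise memory profile:

```python
import torch

# Illustrative estimate only: a full-batch pass must hold every
# example's inputs, activations, and gradients at the same time.
batch = torch.randn(60_000, 1, 28, 28)  # the entire MNIST training set
size_mb = batch.element_size() * batch.nelement() / 1e6
print(f"Raw inputs alone: {size_mb:.0f} MB")
# ~188 MB just for the raw inputs; the intermediate activations of
# each layer multiply this further, which is what typically triggers
# "RuntimeError: CUDA error: out of memory" on a full-batch pass.
```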

To avoid memory issues when training a neural network, large datasets are split into batches of, say, 16, 32, or 128 examples. The right batch size depends on the memory capacity of your compute resource.
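As a sketch of what this looks like in practice (assuming PyTorch and torchvision, which this tip doesn’t prescribe), a DataLoader streams the dataset to the model one batch at a time:

```python
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Load MNIST and wrap it in a DataLoader that yields mini-batches
# instead of the full 60,000-image dataset at once.
train_set = datasets.MNIST(root="data", train=True, download=True,
                           transform=transforms.ToTensor())
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

for images, labels in train_loader:
    # images has shape (32, 1, 28, 28); only one batch occupies
    # memory at a time, so GPU usage stays roughly constant.
    pass  # forward pass, loss, and backward pass would go here
```

Because only one batch is resident at a time, the same loop works whether the dataset has 60,000 examples or 60 million; only the batch size needs to fit in memory.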



Written by Richmond Alake

Machine Learning Content Creator with 1M+ views | Computer Vision Engineer. Interested in gaining and sharing knowledge on Technology and Finance
