Henry I want to understand what early stopping means in deep learning. How does it help prevent overfitting during the training process? Can someone also explain how to determine the optimal point to stop training?