Reduce Training Time and Cost with SageMaker Debugger [Level 300]

June 9, 2021
Manual debugging is a common productivity drain in the machine learning lifecycle. Identifying underperforming training jobs requires constant developer attention and deep domain expertise. Just as unit tests boost productivity in traditional software development, an automated ML debugging library can save time and money. Amazon SageMaker Debugger is an ML feature that provides a set of rules that automatically identify model- and performance-related issues and stop underperforming training jobs. Debugger automatically captures relevant data during training and evaluation and presents it for online and offline inspection. In this session, we walk through how to use the real-time training metrics and set up alerts so you can reduce troubleshooting time, lower training costs, and improve model quality.
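
To make the idea of a rule that flags and stops an underperforming job concrete, here is a minimal sketch in plain Python. It is not the SageMaker Debugger API: the function name `first_stalled_step`, its `patience` and `min_delta` parameters, and the sample loss values are all hypothetical, chosen only to illustrate a "loss not decreasing" style of check.

```python
from typing import Iterable, Optional


def first_stalled_step(
    losses: Iterable[float],
    patience: int = 3,
    min_delta: float = 1e-3,
) -> Optional[int]:
    """Return the step at which the rule would fire, or None if it never fires.

    Hypothetical rule: fire once the loss has failed to improve on its best
    value by at least `min_delta` for `patience` consecutive steps.
    """
    best = float("inf")
    stale = 0  # consecutive steps without a meaningful improvement
    for step, loss in enumerate(losses):
        if loss < best - min_delta:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return step  # a real system would stop the job or alert here
    return None


# Illustrative loss curves (made-up values, not real training data).
healthy = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]
stalled = [1.0, 0.8, 0.8, 0.81, 0.8, 0.79]

print(first_stalled_step(healthy))
print(first_stalled_step(stalled))
```

On these sample curves the check never fires for the steadily improving series and fires at step 4 for the stalled one. Stopping at that point instead of running to completion is where the time and cost savings described above would come from.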

Speaker: Nathalie Rauschmayr, AWS Applied Scientist