Standardize and Automate your Feature Engineering Workflows with SageMaker Feature Store [Level 300]

June 9, 2021
As a data scientist, you certainly spend a lot of time crafting feature engineering code. Indeed, given the experimental nature of this work, even a small project can lead to multiple iterations. Thus, you’ll often run the same feature engineering code again and again, wasting time and compute resources on repeating the same operations. In large organizations, this may cause an even greater loss of productivity, as different teams often run identical jobs, or even write duplicate feature engineering code because they have no knowledge of prior work. As models are trained on engineered datasets, it’s also imperative that you apply the same transformations to data used for prediction. This often means rewriting your feature engineering code (sometimes in a different language), integrating it into your prediction workflow, and running it at prediction time. This whole process is not only time-consuming but can also introduce inconsistencies, as even the tiniest variation in your data transforms can have a large impact on predictions. In this hands-on session, you’ll learn how to solve all these problems with Amazon SageMaker Feature Store, and how to use it with both the SageMaker Studio user interface and the SageMaker SDK. You’ll also see how it works together with SageMaker Data Wrangler to simplify your end-to-end data preparation workflows.

Speaker: Julien Simon, AWS Principal Advocate, ML/AI
Previous Video
Reduce Training Time and Cost with SageMaker Debugger [Level 300]

In this session, we walk through how to use the real-time training metrics and set up alerts so you can red...

Next Video
Scale your Large Distributed Training Jobs with Data and Model Parallelism Optimized for Amazon SageMaker [Level 300]

In this session, explore how to choose the proper instance for ML training and inference based on model siz...