Deploy ML models for inference at high performance and low cost

High-performance, cost-effective model deployment is critical to maximizing the return on your machine learning (ML) investments. Amazon SageMaker provides a broad, deep set of fully managed deployment features that help you achieve optimal inference performance and cost while reducing the operational burden of deploying and managing models in production. In this session, learn how to use SageMaker inference capabilities to quickly deploy ML models in production at scale. Discover SageMaker deployment options, including infrastructure choices; real-time, serverless, asynchronous, and batch inference; single-model, multi-model, and multi-container endpoints; auto scaling; SageMaker Inference Recommender; model monitoring; and SageMaker MLOps integration. The session also covers how to validate the performance of new ML models against production models to prevent costly outages.
