Achieve high performance and cost-effective model deployment

To maximize your ML investments, you need high-performance, cost-effective inference techniques, including real-time, asynchronous, and batch inference, to scale your model deployments.

This session discusses the different inference options available in Amazon SageMaker, such as multi-container endpoints, inference pipelines, and multi-model endpoints, along with frameworks such as TensorFlow and PyTorch and both Python-based and C++/Go-based backend servers. Learn how to pick the best inference option for your ML use case so you can scale to thousands of models across your business.
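As a rough sketch of how these options differ, the helper below maps workload characteristics to an inference style. The function name and thresholds are illustrative, not an AWS API; it simply encodes the common rule of thumb that real-time endpoints suit small, latency-sensitive synchronous requests, asynchronous inference suits large or long-running requests, and batch transform suits offline scoring of whole datasets.

```python
def choose_inference_option(payload_mb: float,
                            needs_sync_response: bool,
                            latency_sensitive: bool) -> str:
    """Illustrative heuristic for choosing a SageMaker inference style.

    - "real-time": small payloads, synchronous, low-latency requirements.
    - "asynchronous": large payloads or requests that can tolerate queuing.
    - "batch-transform": offline scoring of an entire dataset, no live caller.
    """
    # No caller waiting and no latency requirement: score offline in bulk.
    if not needs_sync_response and not latency_sensitive:
        return "batch-transform"
    # Large payloads or relaxed latency: queue the request asynchronously.
    # (6 MB is used here as an illustrative cutoff for large payloads.)
    if payload_mb > 6 or not latency_sensitive:
        return "asynchronous"
    # Small payload, caller waiting, tight latency: a real-time endpoint.
    return "real-time"

# Example classifications
print(choose_inference_option(0.5, True, True))    # small, synchronous, low-latency
print(choose_inference_option(100, False, False))  # offline dataset scoring
print(choose_inference_option(50, True, False))    # large payload, caller can wait
```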

Click here to download the presentation deck and learn more!

Previous Video
Implementing MLOps practices with Amazon SageMaker

Explore the features in Amazon SageMaker Pipelines that help you increase automation, track data lineage, c...

Next Video
Easily deploy models for the best performance and cost using Amazon SageMaker

Learn how Amazon SageMaker Inference Recommender auto-selects the compute instance type, instance count, co...

Questions about AI/ML?

Get in touch »