Easily deploy models for the best performance and cost using Amazon SageMaker

Optimizing cloud resources to achieve the best cost and performance for your ML model is critical.

In this session, learn how Amazon SageMaker Inference Recommender automatically selects the compute instance type, instance count, container parameters, and model optimizations for inference to maximize performance and minimize cost.
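A "Default" Inference Recommender job is what produces these automatic instance recommendations. The sketch below builds the request payload for the real boto3 call, `sagemaker_client.create_inference_recommendations_job(**request)`; the job name, role ARN, and model package ARN are hypothetical placeholders, and the call itself is left as a comment so the sketch runs without AWS credentials.

```python
def build_default_recommendation_request(job_name, role_arn, model_package_arn):
    """Build kwargs for create_inference_recommendations_job (Default mode)."""
    return {
        "JobName": job_name,
        "JobType": "Default",  # instance recommendations, no custom load test
        "RoleArn": role_arn,
        "InputConfig": {
            # A versioned model package registered in SageMaker Model Registry
            "ModelPackageVersionArn": model_package_arn,
        },
    }

request = build_default_recommendation_request(
    "my-recommender-job",                            # hypothetical job name
    "arn:aws:iam::123456789012:role/SageMakerRole",  # hypothetical role ARN
    "arn:aws:sagemaker:us-east-1:123456789012:model-package/my-model/1",
)
# With credentials configured, you would then run:
# boto3.client("sagemaker").create_inference_recommendations_job(**request)
print(request["JobType"])  # → Default
```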

You can then deploy your model to one of the recommended instances or run a fully managed load test on a set of instance types you choose without worrying about testing infrastructure.
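The managed load test corresponds to an "Advanced" job, where you supply the candidate instance types, a traffic ramp, and stopping conditions yourself. A minimal sketch of that payload follows, assuming illustrative ARNs, instance choices, and thresholds; again only the request dict is built, and the actual boto3 call is noted in a comment.

```python
def build_advanced_load_test_request(job_name, role_arn, model_package_arn,
                                     instance_types, p95_ms, max_invocations):
    """Kwargs for create_inference_recommendations_job (Advanced mode):
    a fully managed load test over caller-chosen instance types."""
    return {
        "JobName": job_name,
        "JobType": "Advanced",
        "RoleArn": role_arn,
        "InputConfig": {
            "ModelPackageVersionArn": model_package_arn,
            # Candidate instance types to load-test
            "EndpointConfigurations": [
                {"InstanceType": t} for t in instance_types
            ],
            # Ramp synthetic traffic up in phases
            "TrafficPattern": {
                "TrafficType": "PHASES",
                "Phases": [
                    {"InitialNumberOfUsers": 1,
                     "SpawnRate": 1,
                     "DurationInSeconds": 120},
                ],
            },
        },
        # Stop testing an instance once latency or invocation limits are hit
        "StoppingConditions": {
            "MaxInvocations": max_invocations,
            "ModelLatencyThresholds": [
                {"Percentile": "P95", "ValueInMilliseconds": p95_ms},
            ],
        },
    }

request = build_advanced_load_test_request(
    "my-load-test",                                  # hypothetical job name
    "arn:aws:iam::123456789012:role/SageMakerRole",  # hypothetical role ARN
    "arn:aws:sagemaker:us-east-1:123456789012:model-package/my-model/1",
    ["ml.c5.xlarge", "ml.m5.xlarge", "ml.g4dn.xlarge"],
    p95_ms=100, max_invocations=500,
)
# boto3.client("sagemaker").create_inference_recommendations_job(**request)
```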

You can review the results of the load test in SageMaker Studio and evaluate the tradeoffs between latency, throughput, and cost to select the optimal deployment configuration for your use case.
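The same tradeoff evaluation can also be done programmatically against the job's results. The sketch below uses a made-up, simplified snapshot of the recommendations a job returns (real results come from `describe_inference_recommendations_job`; the instance types, costs, and latency values here are fabricated for illustration, with latency assumed in milliseconds) and picks the cheapest instance that meets a latency budget.

```python
# Illustrative, simplified shape of a job's recommendations (values made up)
sample_recommendations = [
    {"EndpointConfiguration": {"InstanceType": "ml.c5.xlarge"},
     "Metrics": {"CostPerHour": 0.204, "ModelLatency": 180}},
    {"EndpointConfiguration": {"InstanceType": "ml.g4dn.xlarge"},
     "Metrics": {"CostPerHour": 0.736, "ModelLatency": 45}},
    {"EndpointConfiguration": {"InstanceType": "ml.m5.xlarge"},
     "Metrics": {"CostPerHour": 0.230, "ModelLatency": 95}},
]

def cheapest_within_latency(recommendations, max_latency_ms):
    """Return the cheapest recommendation whose latency meets the budget."""
    ok = [r for r in recommendations
          if r["Metrics"]["ModelLatency"] <= max_latency_ms]
    return min(ok, key=lambda r: r["Metrics"]["CostPerHour"]) if ok else None

best = cheapest_within_latency(sample_recommendations, max_latency_ms=100)
print(best["EndpointConfiguration"]["InstanceType"])  # → ml.m5.xlarge
```

This encodes one reasonable policy (latency as a hard constraint, cost as the objective); a throughput-first use case would filter and rank on different metrics.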

Download the presentation deck to learn more.