Seamlessly Integrating SageMaker Inference Endpoint with API Gateway REST API
Introduction
Amazon SageMaker provides powerful inference endpoints that can be invoked directly or exposed through API Gateway REST APIs. This guide demonstrates how to set up such an integration using API Gateway's Integration Request feature, eliminating the need for AWS Lambda functions.
The integration process outlined here does not involve AWS Lambda. Instead, we directly connect the API Gateway to SageMaker’s inference endpoint for streamlined operations.
Finding SageMaker Inference Endpoint
To locate your SageMaker inference endpoint:
Navigate to the SageMaker console and go to Endpoint summary > URL.
The endpoint format is as follows:
```
https://runtime.sagemaker.<ENDPOINT_REGION>.amazonaws.com/endpoints/<ENDPOINT_NAME>/invocations
```
An Authorization header (a SigV4 signature) is required for the endpoint to work correctly. Read more in the AWS documentation.
Endpoints are scoped to an individual account, and are not public. The URL does not contain the account ID, but Amazon SageMaker determines the account ID from the authentication token that is supplied by the caller.
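As a sketch, the invocation URL can be assembled from the region and endpoint name. Note that direct calls are usually made through the SageMaker Runtime API (for example, boto3's `invoke_endpoint`), which handles request signing; the helper below only builds the URL and is an illustration, not an official API:

```python
def build_invocation_url(region: str, endpoint_name: str) -> str:
    """Assemble the SageMaker runtime invocation URL for an endpoint.

    Illustrative helper only; real calls normally go through the
    sagemaker-runtime client (e.g. boto3's invoke_endpoint), which
    also takes care of SigV4 signing.
    """
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )

# Example:
print(build_invocation_url("ap-northeast-1", "my-endpoint"))
# → https://runtime.sagemaker.ap-northeast-1.amazonaws.com/endpoints/my-endpoint/invocations
```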
Building REST API Integrated with SageMaker Inference Endpoint
Step 1: Create a REST API
Select REST API in the API Gateway console.
Assign a name to your API.
Step 2: Create a Method
Select Actions -> Create Method.
Choose the HTTP method type (POST is used in this example).
Step 3: Configure Integration Request
Provide the following details:
| Field | Value |
|---|---|
| Integration type | AWS Service |
| AWS Service | SageMaker Runtime (NOT SageMaker) |
| HTTP method | POST |
| Action Type | Use path override |
| Path override | endpoints/<ENDPOINT_NAME>/invocations |
| Execution role | IAM role for API Gateway (must allow the sagemaker:InvokeEndpoint action) |
| Content Handling | Passthrough |
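The execution role needs two things: a trust policy that lets the `apigateway.amazonaws.com` service principal assume it (via `sts:AssumeRole`), and a permissions policy that allows invoking the endpoint. A minimal sketch of the permissions policy, with placeholder region, account ID, and endpoint name:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sagemaker:InvokeEndpoint",
      "Resource": "arn:aws:sagemaker:<REGION>:<ACCOUNT_ID>:endpoint/<ENDPOINT_NAME>"
    }
  ]
}
```

Scoping the `Resource` to a specific endpoint ARN, rather than `*`, keeps the role limited to the one model this API fronts.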
Step 4: Configure Binary Media Types (Optional)
If your model accepts binary input (e.g., images), add the relevant MIME type (e.g., image/*) under Binary Media Types.
Without this configuration, you may encounter the following error:
```json
{
    "ErrorCode": "CLIENT_ERROR_FROM_MODEL",
    "LogStreamArn": "arn:aws:logs:ap-northeast-1:xxxxxxxxxxxx:log-group:/aws/sagemaker/Endpoints/<ENDPOINT_NAME>",
    "Message": "Received client error (400) from primary with message \"unable to evaluate payload provided\". See https://ap-northeast-1.console.aws.amazon.com/cloudwatch/home?region=ap-northeast-1#logEventViewer:group=/aws/sagemaker/Endpoints/<ENDPOINT_NAME> in account xxxxxxxxxxxx for more information.",
    "OriginalMessage": "unable to evaluate payload provided",
    "OriginalStatusCode": 400
}
```
Deployment
Deploy the API:
Select Deploy API in the API Gateway console.
Assign a stage for the deployment.
Once deployed, the API endpoint will be available for testing.
Testing
You can test the deployed API using the following curl command:
```shell
curl --location '<API_ENDPOINT>' \
  --header 'Content-Type: image/jpeg' \
  --header 'Accept: application/json' \
  --data-binary '@/path/to/image.jpg'
```
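The same request can be sketched in Python using only the standard library. The URL below is a placeholder (deployed REST APIs follow the `execute-api` invoke URL format), and if the method requires IAM authorization the request would additionally need SigV4 signing:

```python
import urllib.request

# Placeholder invoke URL for the deployed stage.
API_ENDPOINT = "https://<API_ID>.execute-api.<REGION>.amazonaws.com/<STAGE>/<RESOURCE>"

def make_request(url: str, image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request mirroring the curl command above."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "image/jpeg", "Accept": "application/json"},
        method="POST",
    )

# To actually send it:
# with open("/path/to/image.jpg", "rb") as f:
#     resp = urllib.request.urlopen(make_request(API_ENDPOINT, f.read()))
#     print(resp.read())
```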
Conclusion
Integrating SageMaker inference endpoints with API Gateway REST APIs is a cost-effective, low-code solution. However, if complex request or response transformations are needed, consider using AWS Lambda for greater flexibility.
By following this guide, you can effectively bridge the gap between SageMaker and API Gateway without additional compute layers.
Happy Coding! 🚀