Seamlessly Integrating SageMaker Inference Endpoint with API Gateway REST API

Takahiro Iwasa
3 min read
API Gateway SageMaker

Introduction

Amazon SageMaker provides powerful inference endpoints that can be exposed to clients through an API Gateway REST API. This guide demonstrates how to set up that integration using API Gateway's Integration Request feature, connecting API Gateway directly to the SageMaker inference endpoint and eliminating the need for an intermediate AWS Lambda function.

Diagram Overview

Finding SageMaker Inference Endpoint

To locate your SageMaker inference endpoint:

Navigate to the SageMaker console and go to Endpoint summary > URL.

Endpoint Example

The endpoint format is as follows:

https://runtime.sagemaker.<ENDPOINT_REGION>.amazonaws.com/endpoints/<ENDPOINT_NAME>/invocations

Endpoints are scoped to an individual account and are not public. The URL does not contain the account ID; Amazon SageMaker determines the account ID from the authentication token supplied by the caller.
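As a quick sketch, the invocation URL can be assembled from the endpoint's region and name. The region and endpoint name below are placeholders; substitute your own values:

```shell
# Placeholders: replace with your endpoint's region and name.
ENDPOINT_REGION="ap-northeast-1"
ENDPOINT_NAME="my-endpoint"

# SageMaker runtime invocation URL, following the format above.
INVOCATION_URL="https://runtime.sagemaker.${ENDPOINT_REGION}.amazonaws.com/endpoints/${ENDPOINT_NAME}/invocations"
echo "${INVOCATION_URL}"
```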

Building REST API Integrated with SageMaker Inference Endpoint

Step 1: Create a REST API

Select REST API in the API Gateway Console.

Assign a name to your API.

API Name Setup

Step 2: Create a Method

Select Actions -> Create Method.

Choose the HTTP method type (POST is used in this example).

Method Type

Step 3: Configure Integration Request

Provide the following details:

Integration type: AWS Service
AWS Service: SageMaker Runtime (NOT SageMaker)
HTTP method: POST
Action Type: Use path override
Path override: endpoints/<ENDPOINT_NAME>/invocations
Execution role: IAM role for API (must include the sagemaker:InvokeEndpoint action)
Content Handling: Passthrough

Integration Request Configuration
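The execution role above needs permission to call the endpoint. A minimal identity policy might look like the following; the Resource ARN uses placeholders, so scope it to your own endpoint:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sagemaker:InvokeEndpoint",
      "Resource": "arn:aws:sagemaker:<ENDPOINT_REGION>:<ACCOUNT_ID>:endpoint/<ENDPOINT_NAME>"
    }
  ]
}
```

The role's trust policy must also allow apigateway.amazonaws.com to assume it, so that API Gateway can use the role when invoking the endpoint.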

Step 4: Configure Binary Media Types (Optional)

If your model accepts binary input (e.g., images), add the relevant MIME type (e.g., image/*) under Binary Media Types.

Binary Media Configuration

Without this configuration, you may encounter the following error:

{
  "ErrorCode": "CLIENT_ERROR_FROM_MODEL",
  "LogStreamArn": "arn:aws:logs:ap-northeast-1:xxxxxxxxxxxx:log-group:/aws/sagemaker/Endpoints/<ENDPOINT_NAME>",
  "Message": "Received client error (400) from primary with message \"unable to evaluate payload provided\". See https://ap-northeast-1.console.aws.amazon.com/cloudwatch/home?region=ap-northeast-1#logEventViewer:group=/aws/sagemaker/Endpoints/<ENDPOINT_NAME> in account xxxxxxxxxxxx for more information.",
  "OriginalMessage": "unable to evaluate payload provided",
  "OriginalStatusCode": 400
}

Deployment

Deploy the API:

Select Deploy API in the API Gateway console.

Assign a stage for the deployment.

Deployment Steps

Once deployed, the API endpoint will be available for testing.

API Endpoint Example
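The deployed invoke URL follows API Gateway's standard format. As a sketch, with the API ID, region, stage, and resource path below all being placeholder values:

```shell
API_ID="a1b2c3d4e5"      # placeholder REST API ID
REGION="ap-northeast-1"  # placeholder region
STAGE="prod"             # placeholder stage name
RESOURCE="invocations"   # placeholder resource path

# Standard API Gateway invoke URL format.
API_ENDPOINT="https://${API_ID}.execute-api.${REGION}.amazonaws.com/${STAGE}/${RESOURCE}"
echo "${API_ENDPOINT}"
```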

Testing

You can test the deployed API using the following curl command:

curl --location '<API_ENDPOINT>' \
  --header 'Content-Type: image/jpeg' \
  --header 'Accept: application/json' \
  --data-binary '@/path/to/image.jpg'

Conclusion

Integrating SageMaker inference endpoints with API Gateway REST APIs is a cost-effective, low-code solution. However, if complex request and response transformations are needed, consider using AWS Lambda for greater flexibility.

By following this guide, you can effectively bridge the gap between SageMaker and API Gateway without additional compute layers.

Happy Coding! 🚀

Takahiro Iwasa

Software Developer at KAKEHASHI Inc.
Involved in the requirements definition, design, and development of cloud-native applications using AWS. Currently building a new prescription data collection platform at KAKEHASHI Inc. Japan AWS Top Engineers 2020-2023.