Seamlessly Integrating SageMaker Inference Endpoint with API Gateway REST API
Introduction
Amazon SageMaker provides powerful inference endpoints that can be invoked directly or exposed through API Gateway REST APIs. This guide demonstrates how to set up such an integration using API Gateway's Integration Request feature, eliminating the need for AWS Lambda functions.
The integration process outlined here does not involve AWS Lambda. Instead, we directly connect the API Gateway to SageMaker’s inference endpoint for streamlined operations.
Finding SageMaker Inference Endpoint
To locate your SageMaker inference endpoint:
Navigate to the SageMaker console and go to Endpoint summary > URL.
The endpoint format is as follows:
```
https://runtime.sagemaker.<ENDPOINT_REGION>.amazonaws.com/endpoints/<ENDPOINT_NAME>/invocations
```
An Authorization header (a SigV4 signature) is required for the endpoint to work correctly. Read more in the AWS documentation.
Endpoints are scoped to an individual account, and are not public. The URL does not contain the account ID, but Amazon SageMaker determines the account ID from the authentication token that is supplied by the caller.
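As a sketch, the invocation URL can be assembled from the region and endpoint name. Note that direct calls are usually made through the SageMaker Runtime API (for example, boto3's `invoke_endpoint`), which handles request signing; the helper below only builds the URL and is an illustration, not an official API:

```python
def build_invocation_url(region: str, endpoint_name: str) -> str:
    """Assemble the SageMaker runtime invocation URL for an endpoint.

    Illustrative helper only; real calls normally go through the
    sagemaker-runtime client (e.g. boto3's invoke_endpoint), which
    also takes care of SigV4 signing.
    """
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )

# Example:
print(build_invocation_url("ap-northeast-1", "my-endpoint"))
# → https://runtime.sagemaker.ap-northeast-1.amazonaws.com/endpoints/my-endpoint/invocations
```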
Building REST API Integrated with SageMaker Inference Endpoint
Step 1: Create a REST API
Select REST API in the API Gateway console.
Assign a name to your API.
Step 2: Create a Method
Select Actions -> Create Method.
Choose the HTTP method type (POST is used in this example).
Step 3: Configure Integration Request
Provide the following details:
| Field | Value |
|---|---|
| Integration type | AWS Service |
| AWS Service | SageMaker Runtime (NOT SageMaker) |
| HTTP method | POST |
| Action Type | Use path override |
| Path override | endpoints/<ENDPOINT_NAME>/invocations |
| Execution role | IAM role for API Gateway (must allow the sagemaker:InvokeEndpoint action) |
| Content Handling | Passthrough |
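The execution role needs two things: a trust policy that lets the `apigateway.amazonaws.com` service principal assume it (via `sts:AssumeRole`), and a permissions policy that allows invoking the endpoint. A minimal sketch of the permissions policy, with placeholder region, account ID, and endpoint name:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "sagemaker:InvokeEndpoint",
      "Resource": "arn:aws:sagemaker:<REGION>:<ACCOUNT_ID>:endpoint/<ENDPOINT_NAME>"
    }
  ]
}
```

Scoping the `Resource` to a specific endpoint ARN, rather than `*`, keeps the role limited to the one model this API fronts.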
Step 4: Configure Binary Media Types (Optional)
If your model accepts binary input (e.g., images), add the relevant MIME type (e.g., image/*) under Binary Media Types.
Without this configuration, you may encounter the following error:
```json
{
    "ErrorCode": "CLIENT_ERROR_FROM_MODEL",
    "LogStreamArn": "arn:aws:logs:ap-northeast-1:xxxxxxxxxxxx:log-group:/aws/sagemaker/Endpoints/<ENDPOINT_NAME>",
    "Message": "Received client error (400) from primary with message \"unable to evaluate payload provided\". See https://ap-northeast-1.console.aws.amazon.com/cloudwatch/home?region=ap-northeast-1#logEventViewer:group=/aws/sagemaker/Endpoints/<ENDPOINT_NAME> in account xxxxxxxxxxxx for more information.",
    "OriginalMessage": "unable to evaluate payload provided",
    "OriginalStatusCode": 400
}
```
Deployment
Deploy the API:
Select Deploy API in the API Gateway console.
Assign a stage for the deployment.
Once deployed, the API endpoint will be available for testing.
Testing
You can test the deployed API using the following curl command:
```shell
curl --location '<API_ENDPOINT>' \
  --header 'Content-Type: image/jpeg' \
  --header 'Accept: application/json' \
  --data-binary '@/path/to/image.jpg'
```
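The same request can be sketched in Python using only the standard library. The URL below is a placeholder (deployed REST APIs follow the `execute-api` invoke URL format), and if the method requires IAM authorization the request would additionally need SigV4 signing:

```python
import urllib.request

# Placeholder invoke URL for the deployed stage.
API_ENDPOINT = "https://<API_ID>.execute-api.<REGION>.amazonaws.com/<STAGE>/<RESOURCE>"

def make_request(url: str, image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request mirroring the curl command above."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": "image/jpeg", "Accept": "application/json"},
        method="POST",
    )

# To actually send it:
# with open("/path/to/image.jpg", "rb") as f:
#     resp = urllib.request.urlopen(make_request(API_ENDPOINT, f.read()))
#     print(resp.read())
```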
Conclusion
Integrating SageMaker inference endpoints with API Gateway REST APIs is a cost-effective, low-code solution. However, if complex request or response transformations are needed, consider using AWS Lambda for greater flexibility.
By following this guide, you can effectively bridge the gap between SageMaker and API Gateway without additional compute layers.
Happy Coding! 🚀