Integrating SageMaker Inference Endpoint with API Gateway REST API

Takahiro Iwasa
(岩佐 孝浩)
API Gateway SageMaker

Amazon SageMaker serves deployed models through inference endpoints. Clients can invoke an endpoint directly, or indirectly through an API Gateway REST API by configuring an AWS service integration in the Integration Request.


The example below uses no AWS Lambda functions.

Finding SageMaker Inference Endpoint

You can find your inference endpoint in the SageMaker console by navigating to Endpoint summary > URL.

The endpoint format is https://runtime.sagemaker.<ENDPOINT_REGION>.amazonaws.com/endpoints/<ENDPOINT_NAME>/invocations. The URL does not include your AWS account ID. This works because every request must carry a valid Authorization header, from which SageMaker identifies the account.

Endpoints are scoped to an individual account, and are not public. The URL does not contain the account ID, but Amazon SageMaker determines the account ID from the authentication token that is supplied by the caller.
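
As a small illustration of the URL format described above, the following Python sketch assembles the invocation URL from a region and an endpoint name. The function name and the sample values are hypothetical and chosen only for this example.

```python
def sagemaker_invocations_url(region: str, endpoint_name: str) -> str:
    """Build the SageMaker runtime URL for invoking an endpoint.

    Note that the account ID is not part of the URL; it is derived
    from the caller's credentials at request time.
    """
    return (
        f"https://runtime.sagemaker.{region}.amazonaws.com"
        f"/endpoints/{endpoint_name}/invocations"
    )


# Hypothetical region and endpoint name, for illustration only.
print(sagemaker_invocations_url("ap-northeast-1", "my-endpoint"))
```

Calling this URL directly still requires a signed request (Signature Version 4), which is one reason to put API Gateway in front of it.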

Building REST API Integrated with SageMaker Inference Endpoint

Select REST API.

Specify a name for the API.

Select Actions -> Create Method.

Select a method type. POST is used in the example below.

Specify the required parameters according to the following table.

Integration type: AWS Service
AWS Service: SageMaker Runtime (NOT SageMaker)
Action Type: Use path override
Path override (optional): endpoints/<ENDPOINT_NAME>/invocations
Execution role: IAM role for API (the sagemaker:InvokeEndpoint action must be authorized)
Content Handling: Passthrough
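
The execution role needs two things: a trust policy that lets API Gateway assume the role, and a permissions policy that allows sagemaker:InvokeEndpoint. The following Python sketch builds both documents as one possible shape. The function name, the account ID 123456789012, and the endpoint name are placeholders assumed for this example, not values from the original setup.

```python
import json


def execution_role_policies(endpoint_arn: str) -> dict:
    """Return trust and permissions policy documents for the API's role."""
    # Lets the API Gateway service assume this role.
    trust_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "apigateway.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    # Grants only the action the integration needs, on one endpoint.
    permissions_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "sagemaker:InvokeEndpoint",
                "Resource": endpoint_arn,
            }
        ],
    }
    return {"trust": trust_policy, "permissions": permissions_policy}


# Placeholder ARN: replace region, account ID, and endpoint name with yours.
policies = execution_role_policies(
    "arn:aws:sagemaker:ap-northeast-1:123456789012:endpoint/my-endpoint"
)
print(json.dumps(policies, indent=2))
```

Restricting Resource to a single endpoint ARN is a design choice in this sketch; a broader resource pattern would also work but grants more than the integration requires.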

The method has now been created.


If your model expects binary data as input, add its MIME type, such as image/*, to Binary Media Types.

If the MIME type is not added, the API returns an error response like the following.

{
    "LogStreamArn": "arn:aws:logs:ap-northeast-1:xxxxxxxxxxxx:log-group:/aws/sagemaker/Endpoints/<ENDPOINT_NAME>",
    "Message": "Received client error (400) from primary with message \"unable to evaluate payload provided\". See<ENDPOINT_NAME> in account xxxxxxxxxxxx for more information.",
    "OriginalMessage": "unable to evaluate payload provided",
    "OriginalStatusCode": 400
}
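
A client can recognize this failure by inspecting the response body. The sketch below is a hypothetical helper, not part of any AWS SDK, that checks the OriginalStatusCode and OriginalMessage fields shown above. The message text comes from the model container, so other models may report a different string.

```python
import json


def is_payload_error(body: str) -> bool:
    """Return True if the body looks like the 400 payload error shown above."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        return False
    if not isinstance(parsed, dict):
        return False
    return (
        parsed.get("OriginalStatusCode") == 400
        and "unable to evaluate payload" in str(parsed.get("OriginalMessage", ""))
    )


# Abbreviated sample body containing only the two fields that are checked.
sample = json.dumps(
    {
        "OriginalMessage": "unable to evaluate payload provided",
        "OriginalStatusCode": 400,
    }
)
print(is_payload_error(sample))
```

If this check matches for an image request, a missing Binary Media Types entry is a likely cause worth ruling out first.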


Deploy the API.

After deployment, the console shows the URL of the API endpoint.


Access the API endpoint with the following command.

curl --location '<API_ENDPOINT>' \
  --header 'Content-Type: image/jpeg' \
  --header 'Accept: application/json' \
  --data-binary '@/path/to/image.jpg'
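
The same request can be sent from Python using only the standard library. This is a minimal sketch that mirrors the curl command above; the function names are hypothetical, and it assumes the method requires no extra authorization headers.

```python
import urllib.request


def build_invoke_request(api_endpoint: str, image_bytes: bytes) -> urllib.request.Request:
    """Build a POST request carrying raw JPEG bytes, like curl --data-binary."""
    return urllib.request.Request(
        api_endpoint,
        data=image_bytes,
        method="POST",
        headers={"Content-Type": "image/jpeg", "Accept": "application/json"},
    )


def invoke(api_endpoint: str, image_path: str) -> str:
    """Send the image file to the API endpoint and return the response body."""
    with open(image_path, "rb") as f:
        request = build_invoke_request(api_endpoint, f.read())
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


# Usage (replace both placeholders with real values before running):
# print(invoke("<API_ENDPOINT>", "/path/to/image.jpg"))
```

Reading the file in binary mode matters here: the bytes must reach the model unchanged, just as they do with --data-binary.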
Takahiro Iwasa (岩佐 孝浩)

Software Developer at iret, Inc.
Architecting and developing cloud native applications mainly with AWS. Japan AWS Top Engineers 2020-2023