Counting Objects with SageMaker's Built-In Object Detection Algorithm


Takahiro Iwasa
8 min read
Ground Truth Object Detection SageMaker

Introduction

Object detection algorithms are highly effective for counting objects in images. SageMaker provides a suite of built-in algorithms to make this process efficient and scalable. This guide walks you through:

  • Labeling images using Ground Truth
  • Training and deploying your models
  • Performing inference

Overview

SageMaker inference endpoints should not be exposed directly to end users in production because of potential security risks. They are, however, convenient for testing, as demonstrated in this post.

Labeling with Ground Truth

Ground Truth offers a robust feature set for labeling datasets. Detailed information can be found in the Ground Truth documentation.

Creating a Labeling Job

Creating Labeling Workforces

To start labeling, you first need to set up a labeling workforce. This post uses a private workforce, whose members can authenticate through Amazon Cognito or OIDC.

Once the workforce is created, an invitation email containing the URL of the labeling portal is sent to the workers. You can also retrieve this URL in the SageMaker management console under Private workforce summary > Labeling portal sign-in URL.
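If you prefer to script this step, a minimal boto3 sketch follows. It creates a Cognito-backed private work team and retrieves its portal URL; the team name, user pool, group, and client ID are placeholders.

import boto3

sagemaker = boto3.client('sagemaker')

# All identifiers below are placeholders.
sagemaker.create_workteam(
    WorkteamName='<YOUR_TEAM_NAME>',
    Description='Private work team for object counting',
    MemberDefinitions=[
        {
            'CognitoMemberDefinition': {
                'UserPool': '<COGNITO_USER_POOL_ID>',
                'UserGroup': '<COGNITO_USER_GROUP>',
                'ClientId': '<COGNITO_APP_CLIENT_ID>',
            },
        },
    ],
)

# The SubDomain attribute is the labeling portal sign-in URL.
workteam = sagemaker.describe_workteam(WorkteamName='<YOUR_TEAM_NAME>')
print(workteam['Workteam']['SubDomain'])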

Signing Up and Signing In to the Labeling Portal

Workers must follow the instructions in the invitation email to sign up and access the labeling portal.

An example of the invitation email is as follows:

Hi,

You are invited by jane.doe@xyz.com from <COMPANY> to work on a labeling project.

Click on the link below to log into your labeling project.
"https://<LABELING_PORTAL_URL>"

You will need the following username and temporary password provided below to login for the first time.
User name: <USER_NAME>
Temporary password: <PASSWORD>

Once you log in with your temporary password, you will be required to create a new password for your account.
After creating a new password, you can log into your private team to access your labeling project.

If you have any questions, please contact us at jane.doe@xyz.com.

After accessing the URL, workers must enter the username and temporary password from the invitation email.

They will then be prompted to change their temporary password to a new one.

Upon successful login, workers are redirected to the top page of the labeling portal. Any assigned labeling jobs will appear on this page.

Creating the Labeling Job

Navigate back to the SageMaker management console and create a new labeling job, filling in the required fields.

Make sure to click Complete data setup to finalize the process.

For complex labeling tasks, consider specifying a longer value for the Task timeout parameter.
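If you script the job instead of using the console, the task timeout corresponds to TaskTimeLimitInSeconds. A minimal boto3 sketch for a bounding-box job follows; every name, ARN, and S3 URI is a placeholder, and the AWS-managed Lambda ARNs shown are the documented us-east-1 ones for bounding-box tasks (use your region's equivalents).

import boto3

sagemaker = boto3.client('sagemaker')

sagemaker.create_labeling_job(
    LabelingJobName='counting-objects',
    LabelAttributeName='counting-objects',  # must not end with '-metadata'
    InputConfig={
        'DataSource': {
            'S3DataSource': {'ManifestS3Uri': 's3://<YOUR_BUCKET>/input.manifest'}
        }
    },
    OutputConfig={'S3OutputPath': 's3://<YOUR_BUCKET>/output/'},
    RoleArn='arn:aws:iam::<ACCOUNT_ID>:role/<SAGEMAKER_EXECUTION_ROLE>',
    LabelCategoryConfigS3Uri='s3://<YOUR_BUCKET>/label-categories.json',
    HumanTaskConfig={
        'WorkteamArn': 'arn:aws:sagemaker:us-east-1:<ACCOUNT_ID>:workteam/private-crowd/<YOUR_TEAM_NAME>',
        'UiConfig': {'UiTemplateS3Uri': 's3://<YOUR_BUCKET>/template.liquid'},
        # AWS-managed Lambdas for bounding-box tasks (us-east-1 shown).
        'PreHumanTaskLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:PRE-BoundingBox',
        'AnnotationConsolidationConfig': {
            'AnnotationConsolidationLambdaArn': 'arn:aws:lambda:us-east-1:432418664414:function:ACS-BoundingBox'
        },
        'TaskTitle': 'Draw bounding boxes',
        'TaskDescription': 'Draw a box around every object in the image',
        'NumberOfHumanWorkersPerDataObject': 1,
        'TaskTimeLimitInSeconds': 3600,  # raise this for complex tasks
    },
)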

Start Labeling

After signing in to the labeling portal, the labeling job you just created should appear. Click the Start working button to begin.

Follow the job instructions to label the dataset. Below is an example of a labeled dataset.

Once all workers have completed their tasks, stop the labeling job.
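Stopping the job can also be done programmatically; a one-line sketch, assuming the hypothetical job name used above:

import boto3

# Stop the labeling job once every worker has finished (job name is a placeholder)
boto3.client('sagemaker').stop_labeling_job(LabelingJobName='counting-objects')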

Labeling Output

Directories

After the labeling job is stopped, the final output will be available in the specified S3 bucket. For object detection tasks, the file manifests/output/output.manifest is particularly important.

Refer to the SageMaker Ground Truth documentation for more details.

<YOUR_OUTPUT_PATH>/
|-- annotation-tool/
|-- annotations/
|   |-- consolidated-annotation/
|   `-- worker-response/
|-- manifests/
|   |-- intermediate/
|   `-- output/
|       `-- output.manifest
`-- temp/

Ground Truth produces labeling results in Augmented Manifest format. For additional details, check the official documentation.
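Each line of output.manifest is a self-contained JSON object. For illustration only, a bounding-box entry might look like the following (wrapped across lines for readability; the attribute names depend on your job name):

{
  "source-ref": "s3://<YOUR_BUCKET>/images/0001.jpg",
  "counting-objects": {
    "image_size": [{"width": 640, "height": 480, "depth": 3}],
    "annotations": [
      {"class_id": 0, "left": 45, "top": 120, "width": 110, "height": 130}
    ]
  },
  "counting-objects-metadata": {
    "objects": [{"confidence": 0.9}],
    "class-map": {"0": "object"},
    "type": "groundtruth/object-detection",
    "human-annotated": "yes",
    "creation-date": "2024-01-01T00:00:00.000000",
    "job-name": "labeling-job/counting-objects"
  }
}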

Training with SageMaker

Training

Once the labeling process is complete, proceed to train your model using the SageMaker console. Configure the training job as follows:

  • Job Settings

    • Job Name: Use a unique value (e.g., uuidgen | tr "[:upper:]" "[:lower:]").
    • Algorithm Source: Amazon SageMaker built-in algorithm.
    • Algorithm: Vision - Object Detection (MXNet).
    • Input Mode: Pipe.
  • Resource Configuration

    • Instance Type: Use a GPU instance such as ml.p2.xlarge; the built-in object detection algorithm supports only GPU instances for training.
  • Hyperparameters

    • num_classes: Set to the number of object classes (e.g., 1 in this post).
    • num_training_samples: Set to the number of lines in output.manifest, since each line is one training sample (e.g., wc -l output.manifest).
  • Input Data Configuration

    • Training Channel
      • Channel Name: train
      • Input Mode: Pipe
      • Content Type: application/x-recordio
      • Record Wrapper: RecordIO
      • Data Source: S3 (Augmented Manifest File)
        • Include attributes like source-ref and bounding box data keys.
        • Specify the S3 URI for the training data manifest file.
    • Validation Channel
      • Channel Name: validation
  • Output Data Configuration

    • Specify the S3 URI for storing model artifacts.

By using the Augmented Manifest format, you can train in Pipe input mode with the RecordIO record wrapper without creating separate RecordIO files.

Refer to the Object Detection documentation for more details:

The augmented manifest format enables you to do training in pipe mode using image files without needing to create RecordIO files.

When using Object Detection with Augmented Manifest, the value of parameter RecordWrapperType must be set as “RecordIO”.
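Putting these settings together, a minimal boto3 sketch of the training job might look like the following. The training image URI is region-specific (it can be looked up with the SageMaker Python SDK's image_uris.retrieve), and all names, ARNs, and S3 URIs are placeholders.

import boto3

sagemaker = boto3.client('sagemaker')

sagemaker.create_training_job(
    TrainingJobName='counting-objects-training',
    AlgorithmSpecification={
        'TrainingImage': '<OBJECT_DETECTION_IMAGE_URI_FOR_YOUR_REGION>',
        'TrainingInputMode': 'Pipe',
    },
    RoleArn='arn:aws:iam::<ACCOUNT_ID>:role/<SAGEMAKER_EXECUTION_ROLE>',
    HyperParameters={
        'num_classes': '1',
        'num_training_samples': '<LINES_IN_MANIFEST>',
    },
    InputDataConfig=[
        {
            'ChannelName': 'train',
            'ContentType': 'application/x-recordio',
            'RecordWrapperType': 'RecordIO',
            'InputMode': 'Pipe',
            'DataSource': {
                'S3DataSource': {
                    'S3DataType': 'AugmentedManifestFile',
                    'S3Uri': 's3://<YOUR_BUCKET>/manifests/output/output.manifest',
                    'S3DataDistributionType': 'FullyReplicated',
                    # The label attribute name matches the labeling job above.
                    'AttributeNames': ['source-ref', 'counting-objects'],
                },
            },
        },
        # A 'validation' channel can be configured the same way.
    ],
    OutputDataConfig={'S3OutputPath': 's3://<YOUR_BUCKET>/models/'},
    ResourceConfig={'InstanceType': 'ml.p2.xlarge', 'InstanceCount': 1, 'VolumeSizeInGB': 50},
    StoppingCondition={'MaxRuntimeInSeconds': 86400},
)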

Inference

Creating Model from Training Job

To create a model from the completed training job, click Create model in the SageMaker console.

Deploying the Model

After creating the model, deploy it by clicking Create endpoint. For infrequent usage, consider a serverless endpoint, which is more cost-efficient.

Serverless Inference is a cost-effective option if you have an infrequent or unpredictable traffic pattern. During times when there are no requests, Serverless Inference scales your endpoint down to 0, helping you to minimize your costs.
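The same deployment can be scripted with boto3. A minimal sketch follows; the model, config, and endpoint names are placeholders, and the inference image URI is region-specific.

import boto3

sagemaker = boto3.client('sagemaker')

sagemaker.create_model(
    ModelName='counting-objects-model',
    PrimaryContainer={
        'Image': '<OBJECT_DETECTION_IMAGE_URI_FOR_YOUR_REGION>',
        'ModelDataUrl': 's3://<YOUR_BUCKET>/models/<TRAINING_JOB>/output/model.tar.gz',
    },
    ExecutionRoleArn='arn:aws:iam::<ACCOUNT_ID>:role/<SAGEMAKER_EXECUTION_ROLE>',
)

# ServerlessConfig replaces instance type and count for serverless endpoints.
sagemaker.create_endpoint_config(
    EndpointConfigName='counting-objects-config',
    ProductionVariants=[
        {
            'VariantName': 'AllTraffic',
            'ModelName': 'counting-objects-model',
            'ServerlessConfig': {'MemorySizeInMB': 2048, 'MaxConcurrency': 5},
        },
    ],
)

sagemaker.create_endpoint(
    EndpointName='counting-objects-endpoint',
    EndpointConfigName='counting-objects-config',
)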

Making Requests

Using SageMaker Inference Endpoint

Locate the SageMaker inference endpoint on the endpoint detail page. This endpoint can be accessed using tools like curl, Postman, or custom applications.

Example: Postman Configuration

Use the following parameters for authentication with AWS Signature V4:

  • AccessKey
  • SecretKey
  • Session Token: Supply this when using temporary credentials, which are preferred over long-lived access keys.
  • AWS Region: The region of your SageMaker endpoint.
  • Service Name: sagemaker

Set the Accept: application/json header in the request.

Since the trained model expects binary image input, ensure your image is provided in binary format.
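For example, a recent curl (7.75 or later) can sign requests with SigV4 natively. A sketch of such a request, where the region, endpoint name, and credentials are placeholders:

curl "https://runtime.sagemaker.<AWS_REGION>.amazonaws.com/endpoints/<YOUR_ENDPOINT_NAME>/invocations" \
  --aws-sigv4 "aws:amz:<AWS_REGION>:sagemaker" \
  --user "$AWS_ACCESS_KEY_ID:$AWS_SECRET_ACCESS_KEY" \
  -H "x-amz-security-token: $AWS_SESSION_TOKEN" \
  -H "Content-Type: application/x-image" \
  -H "Accept: application/json" \
  --data-binary "@/path/to/image.jpg"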

Example: Using AWS SDK (boto3)

You can also perform inference programmatically using the boto3 invoke_endpoint API. Below is an example script:

import json

import boto3

# Initialize the SageMaker runtime client
runtime = boto3.client('sagemaker-runtime')

# Define endpoint and input details
endpoint_name = '<YOUR_ENDPOINT_NAME>'
content_type = 'application/x-image'

# Read the image file in binary mode
with open('/path/to/image.jpg', 'rb') as f:
    payload = f.read()

# Invoke the endpoint
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=payload
)

# Parse and display the response
predictions = json.loads(response['Body'].read().decode())
print(json.dumps(predictions, indent=2))

# Save the response to a file for later visualization
with open('./response.json', 'w') as f:
    json.dump(predictions, f, indent=2)

Inference Response

Checking the Response

The response is returned in JSON format. Each element of the prediction array describes one detection as [class_id, score, xmin, ymin, xmax, ymax], containing:

  • Class Label ID
  • Confidence Score
  • Bounding Box Coordinates, normalized to [0, 1] relative to the image width and height

For more details, refer to the official documentation.

{
  "prediction": [
    [
      0.0,
      0.9953756332397461,
      0.3821756839752197,
      0.007661208510398865,
      0.525381863117218,
      0.19436971843242645
    ],
    [
      0.0,
      0.9928023219108582,
      0.3435703217983246,
      0.23781903088092804,
      0.5533013343811035,
      0.6385164260864258
    ],
    [
      0.0,
      0.9911478757858276,
      0.15510153770446777,
...
      0.9990172982215881
    ]
  ]
}

Visualizing Inference Response

To interpret the inference results visually, you can use a Jupyter notebook with matplotlib. The following Python script overlays bounding boxes and count annotations on the input image.

import json

import matplotlib.patches as patches
import matplotlib.pyplot as plt
from PIL import Image

# Configure the plot
fig, axes = plt.subplots()

# Read and display the input image
im = Image.open('/path/to/image.jpg')
axes.imshow(im)

# Read the SageMaker inference predictions saved earlier
with open('response.json') as f:
    predictions = json.load(f)['prediction']

# Running count of detected objects
count = 0

# Draw a rectangle for each sufficiently confident detection
for prediction in predictions:
    score = prediction[1]
    if score < 0.2:
        # Skip low-confidence detections
        continue

    count += 1

    # Convert normalized [xmin, ymin, xmax, ymax] to pixel coordinates
    x = prediction[2] * im.width
    y = prediction[3] * im.height
    width = prediction[4] * im.width - x
    height = prediction[5] * im.height - y

    rect = patches.Rectangle((x, y), width, height, linewidth=1, edgecolor='r', facecolor='none')
    axes.annotate(count, (x + width / 2, y + height / 2), color='yellow',
                  weight='bold', fontsize=18, ha='center', va='center')
    axes.add_patch(rect)

# Render the annotated image
plt.show()

The script reads the prediction response, filters out low-confidence detections, converts the normalized bounding box coordinates to pixels, and draws a rectangle around each detected object, annotating it with a running count.

Conclusion

Machine learning with object detection is a versatile approach for tasks like object counting. SageMaker provides a comprehensive ecosystem to streamline this process, from labeling and training to inference.

I hope this guide helps you build effective solutions for similar tasks.

Happy Coding! 🚀

Takahiro Iwasa

Software Developer at KAKEHASHI Inc.
Involved in the requirements definition, design, and development of cloud-native applications using AWS. Currently building a new prescription data collection platform at KAKEHASHI Inc. Japan AWS Top Engineers 2020-2023.