Comprehensive Guide to Edge AI Inference with AWS IoT Greengrass V2 and SageMaker
Introduction
I recently hosted a study session titled Introduction to Edge AI Inference with AWS IoT Greengrass V2/SageMaker, where I discussed the integration of AWS IoT Greengrass V2 and SageMaker for edge AI inference. In this blog post, I will share the presentation content, aiming to provide insights and practical steps to explore edge inference capabilities with AWS.
You can access the sample code used in this guide from my GitHub repository.
Prerequisites
Target Readers
This post is intended for readers who:
- Are interested in edge AI inference on AWS.
- Have a basic understanding of machine learning and AWS services.
Goals
- Understand the fundamentals of edge AI inference with AWS.
- Learn to utilize tools like IoT Greengrass V2, SageMaker, SageMaker Neo, and SageMaker Edge Manager.
What Is Edge Computing
Edge computing is a distributed computing paradigm that processes data closer to its source, resulting in reduced latency and bandwidth usage.
Benefits
- Low Latency: Ideal for real-time applications like autonomous vehicles.
- Enhanced Security: Limits data exposure during transit.
- Reduced Communication Costs: Decreases dependency on centralized data centers.
Challenges
- Scaling: Vertical scaling is less flexible than in the cloud.
- Infrastructure Management: Requires capacity planning and maintenance.
AWS IoT Greengrass V2
Overview
AWS IoT Greengrass is an open-source IoT edge runtime and cloud service designed for edge application deployment and management.
Features
- Cross-platform support for Linux and Windows.
- Compatibility with both x86 and ARM architectures.
- Integrated Lambda functions and Deep Learning Runtime (DLR).
Concepts
The IoT Greengrass V2 architecture revolves around the following key components:
- Greengrass Core Devices:
  - Serve as the main compute devices running Greengrass Core software at the edge.
  - Are registered as AWS IoT Things.
  - Facilitate communication with AWS cloud services.
- Greengrass Client Devices:
  - Connect to Greengrass Core Devices via the MQTT protocol.
  - Are registered as AWS IoT Things.
  - Can communicate with other client devices when a Greengrass Core Device acts as the message broker.
- Greengrass Components:
  - Represent software modules running on Greengrass Core Devices.
  - Are customized and registered by users to extend functionality.
- Deployment:
  - Comprises instructions managed by AWS for distributing configurations and components to Greengrass Core Devices.
This modular approach enables scalable and flexible edge computing, making IoT Greengrass V2 suitable for diverse IoT applications. For more information, refer to the Key concepts for AWS IoT Greengrass documentation.
Amazon SageMaker and Its Ecosystem
SageMaker Overview
Amazon SageMaker is a fully managed service for building, training, and deploying machine learning models. It supports major deep learning frameworks such as TensorFlow and PyTorch.
SageMaker Neo
Neo optimizes machine learning models for edge devices, ensuring compatibility and enhanced performance.
SageMaker Edge Manager
This service manages and monitors edge-deployed machine learning models, offering capabilities like model optimization and security.
Implementing Edge AI Inference
In this example, we demonstrate how to use an EC2 instance as an edge device to perform edge inference. The process involves a series of steps to set up, train, deploy, and test an edge inference pipeline.
Steps to Implement Edge Inference
- Setting Up
  - Prepare AWS Resources: Set up the necessary IAM roles, S3 buckets, and other AWS resources.
  - Implement Training Script: Write a script to train your model.
  - Implement Inference Script: Develop a script for model inference at the edge.
- Using SageMaker
  - Train with SageMaker: Leverage SageMaker to train the model.
  - Compile Model with SageMaker Neo: Optimize the trained model for edge devices.
  - Package Model with SageMaker Edge Manager: Prepare the model for deployment to Greengrass Core.
- Configuring Greengrass
  - Set Up Greengrass Core: Install and configure Greengrass Core on the EC2 instance.
  - Register Greengrass Component: Create a Greengrass Component for edge inference and register it.
  - Deploy Greengrass Component: Deploy the component to the Greengrass Core device.
- Testing
  - Convert Test Data: Prepare input data in a format compatible with the edge model (e.g., NumPy arrays).
  - Deploy Test Data: Transfer the test data to the designated folder on the Greengrass Core device.
  - Check Results: Observe inference results in the Greengrass Core logs.
Preparing AWS Resources
Prepare the following AWS resources beforehand. For specific details, refer to the CloudFormation template in the GitHub repository.
Resource | Name | Description |
---|---|---|
IAM User | greengrass-core-setup-user | For setting up Greengrass Core |
IAM Role | sagemaker-execution-role | SageMaker execution role |
IAM Role | GreengrassV2TokenExchangeRole | Greengrass Core role |
S3 | sagemaker-ml-model-artifacts-{account_id}-{region} | Bucket for ML models |
Run the following command to create these resources:
aws cloudformation deploy --template-file ./cfn.yaml --stack-name greengrass-sample --capabilities CAPABILITY_NAMED_IAM
Implementing Training Script
Installing Dependencies
This example uses PyTorch’s pre-trained VGG16 model. Install it using:
pip install torch torchvision
Writing the Script
Save the following script as training.py to execute in SageMaker:
import argparse
import os
from datetime import datetime

import torch
from torchvision import models


def fit(model: torch.nn.modules.Module) -> None:
    # Add training code here
    pass


def save(model: torch.nn.modules.Module, path: str) -> None:
    suffix = datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
    path = os.path.join(path, f'model-{suffix}.pt')
    # If you use `model.state_dict()`, SageMaker compilation will fail.
    torch.save(model, path)


def parse_args() -> argparse.Namespace:
    # Hyperparameters sent by the client are passed as command-line arguments to the script.
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_dir', type=str)
    parser.add_argument('--sm_model_dir', type=str, default=os.environ.get('SM_MODEL_DIR'))
    parser.add_argument('--train', type=str, default=os.environ.get('SM_CHANNEL_TRAIN'))
    parser.add_argument('--test', type=str, default=os.environ.get('SM_CHANNEL_TEST'))
    args, _ = parser.parse_known_args()
    return args


if __name__ == '__main__':
    args = parse_args()
    vgg16 = models.vgg16(pretrained=True)
    fit(vgg16)
    save(vgg16, args.sm_model_dir)
Refer to the official SageMaker documentation for additional details on runtime arguments, environment variables, and data handling.
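To make the argument handling in parse_args concrete, here is a minimal sketch of how a command-line hyperparameter and the SM_MODEL_DIR environment variable end up in the parsed namespace. The `--epochs` hyperparameter and the simulated environment are assumptions added for illustration; they are not part of the training script above.

```python
import argparse
from typing import Mapping, Sequence


def parse_args(argv: Sequence[str], env: Mapping[str, str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Hypothetical hyperparameter, e.g. hyperparameters={'epochs': 3} on the estimator.
    parser.add_argument('--epochs', type=int, default=1)
    # Falls back to the environment variable when the flag is not given.
    parser.add_argument('--sm_model_dir', type=str, default=env.get('SM_MODEL_DIR'))
    # parse_known_args tolerates flags that the script does not declare.
    args, _ = parser.parse_known_args(argv)
    return args


# Simulated container environment; /opt/ml/model is the usual SM_MODEL_DIR value.
simulated_env = {'SM_MODEL_DIR': '/opt/ml/model'}
args = parse_args(['--epochs', '3', '--undeclared', 'x'], simulated_env)
print(args.epochs, args.sm_model_dir)
```

Passing argv and the environment explicitly keeps the sketch easy to run locally; the real script reads sys.argv and os.environ instead.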
Implementing Inference Script
Installing Dependencies
Install Deep Learning Runtime (DLR) for model inference:
pip install dlr
Writing the Script
Save the following script as inference.py to run on your Greengrass Core:
import argparse
import glob
import json
import os
import time
from typing import Iterator

import numpy as np
from dlr import DLRModel


def load_model() -> DLRModel:
    # Directory where the packaged model component is stored on the core device.
    return DLRModel('/greengrass/v2/work/vgg16-component')


def load_labels() -> dict:
    path = os.path.dirname(os.path.abspath(__file__))
    # See https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
    path = os.path.join(path, 'imagenet_class_index.json')
    with open(path, 'r') as f:
        return json.load(f)


def iter_files(path: str) -> Iterator[str]:
    # Yield every .npy file directly under the given directory.
    path = path[:-1] if path.endswith('/') else path
    files = glob.glob(f'{path}/*.npy')
    for file in files:
        yield file


def predict(model: DLRModel, image: np.ndarray) -> np.ndarray:
    return model.run(image)[0]


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument('--test_dir', type=str)
    parser.add_argument('--interval', type=int, default=300)
    args, _ = parser.parse_known_args()
    return args


def start(model: DLRModel, path: str, labels: dict) -> None:
    for file in iter_files(path):
        image = np.load(file)
        y = predict(model, image)
        index = int(np.argmax(y))
        label = labels.get(str(index), '')
        print(f'Prediction result of {file}: {label}')


if __name__ == '__main__':
    args = parse_args()
    print(f'args: {args}')
    model = load_model()
    labels = load_labels()
    if args.interval == 0:
        # Run a single pass over the test directory and exit.
        start(model, args.test_dir, labels)
    else:
        while True:
            start(model, args.test_dir, labels)
            print(f'Sleeping for {args.interval} seconds...')
            time.sleep(args.interval)
Note that PyTorch models take input in torch.Tensor format, whereas models compiled by SageMaker Neo require numpy.ndarray. Refer to the PyTorch pre-trained models documentation for input shape details.
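Because the compiled model takes a numpy.ndarray rather than a tensor, it can help to validate each array before passing it to the model. The following is a rough sketch, not part of the original script: the helper name as_model_input, the expected shape (1, 3, 224, 224), and the float32 conversion are assumptions that should match whatever input shape you give the Neo compilation job.

```python
import numpy as np

# Assumed to match the input shape given to the Neo compilation job.
EXPECTED_SHAPE = (1, 3, 224, 224)


def as_model_input(array, expected_shape=EXPECTED_SHAPE) -> np.ndarray:
    """Return a float32, C-contiguous copy or view suitable for model.run()."""
    if not isinstance(array, np.ndarray):
        raise TypeError(f'expected numpy.ndarray, got {type(array).__name__}')
    if array.shape != tuple(expected_shape):
        raise ValueError(f'expected shape {tuple(expected_shape)}, got {array.shape}')
    # float32 is assumed because the usual ToTensor/Normalize pipeline yields float32.
    return np.ascontiguousarray(array, dtype=np.float32)


batch = as_model_input(np.zeros(EXPECTED_SHAPE))
print(batch.shape, batch.dtype)
```

In inference.py, a call such as `predict(model, as_model_input(np.load(file)))` would surface a malformed file as a clear error instead of a failure inside the runtime.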
Registering an Inference Component
Upload your inference script and files as a zip file to an S3 bucket:
cd vgg16-inference-component
zip vgg16-inference-component-1.0.0.zip inference.py imagenet_class_index.json
aws s3 cp vgg16-inference-component-1.0.0.zip s3://{YOUR_BUCKET}/artifacts/
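The same packaging step can be scripted in Python if you prefer to keep it alongside the rest of the pipeline. This is a sketch under assumptions: the helper names build_artifact and upload_artifact are made up for illustration, and the upload relies on boto3's S3 upload_file call with credentials already configured.

```python
import zipfile
from pathlib import Path
from typing import Sequence


def build_artifact(src_dir: str, files: Sequence[str], zip_path: str) -> str:
    """Zip the given files at the archive root, like the zip command above."""
    with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as archive:
        for name in files:
            # arcname keeps the files at the top level of the archive.
            archive.write(Path(src_dir) / name, arcname=name)
    return zip_path


def upload_artifact(zip_path: str, bucket: str, prefix: str = 'artifacts/') -> str:
    """Upload the archive and return its S3 URI (requires AWS credentials)."""
    import boto3  # imported lazily so build_artifact works without the AWS SDK

    key = prefix + Path(zip_path).name
    boto3.client('s3').upload_file(zip_path, bucket, key)
    return f's3://{bucket}/{key}'
```

Keeping the files at the archive root matters because Greengrass unpacks a ZIP artifact into a folder named after the archive, which is the path the component recipe later points to.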
Training with SageMaker
Install the SageMaker Python SDK:
pip install sagemaker
To queue a training job, create a script (training_job.py) to define the job:
from sagemaker.pytorch import PyTorch

AWS_ACCOUNT_ID = '123456789012'
S3_BUCKET = f's3://sagemaker-ml-model-artifacts-{AWS_ACCOUNT_ID}-ap-northeast-1'

if __name__ == '__main__':
    pytorch_estimator = PyTorch(
        entry_point='training.py',
        source_dir='./',
        role='sagemaker-execution-role',
        instance_count=1,
        instance_type='ml.m5.large',
        framework_version='1.10.0',
        py_version='py38',
        output_path=f'{S3_BUCKET}/models/trained',
        hyperparameters={}
    )
    pytorch_estimator.fit()
The trained model is stored in your S3 bucket as output/model.tar.gz. In the next step, it is compiled and optimized with SageMaker Neo.
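The compilation step needs the full S3 URI of model.tar.gz. SageMaker places the archive under the estimator's output_path, in a folder named after the training job, so the URI can be derived as in the sketch below. The helper name and the job name in the example are hypothetical.

```python
def model_artifact_uri(output_path: str, job_name: str) -> str:
    """Build the S3 URI of the model archive written by a training job."""
    return f"{output_path.rstrip('/')}/{job_name}/output/model.tar.gz"


# Hypothetical job name; the real one is generated when the job is queued.
uri = model_artifact_uri(
    's3://sagemaker-ml-model-artifacts-123456789012-ap-northeast-1/models/trained',
    'pytorch-training-2022-02-16-05-00-00-000',
)
print(uri)
```

If you are still in the session that ran fit(), the estimator's model_data attribute should report the same location without any string handling.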
Compiling Models with SageMaker Neo
Start a SageMaker Neo compilation job. In this example, the job completed in approximately 4 minutes.
Define the input configuration as follows:
Field | Value |
---|---|
Artifact | S3 URI of model.tar.gz |
Input shape | Model input shape |
Framework | PyTorch |
Framework version | 1.8 |
For input shape details, the official documentation specifies:
All pre-trained models expect input images normalized in the same way, i.e., mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224.
Specify the output configuration based on your preferences.
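For readers who prefer the API over the console, the input and output configuration can be expressed as a request for boto3's create_compilation_job. This is a sketch, not the method used in the original walkthrough: the helper names, the 900-second stopping condition, and the input name `input0` with shape [1, 3, 224, 224] are assumptions, and the target device is left as a parameter because it depends on your hardware.

```python
import json
from typing import Sequence


def build_compilation_request(job_name: str, role_arn: str, model_s3_uri: str,
                              output_s3_uri: str, target_device: str,
                              input_shape: Sequence[int] = (1, 3, 224, 224),
                              framework_version: str = '1.8') -> dict:
    """Assemble keyword arguments for sagemaker.create_compilation_job."""
    return {
        'CompilationJobName': job_name,
        'RoleArn': role_arn,  # full ARN of the SageMaker execution role
        'InputConfig': {
            'S3Uri': model_s3_uri,
            # PyTorch models are described as {"<input name>": [N, C, H, W]}.
            'DataInputConfig': json.dumps({'input0': list(input_shape)}),
            'Framework': 'PYTORCH',
            'FrameworkVersion': framework_version,
        },
        'OutputConfig': {
            'S3OutputLocation': output_s3_uri,
            'TargetDevice': target_device,
        },
        # Assumed upper bound; adjust to your own limits.
        'StoppingCondition': {'MaxRuntimeInSeconds': 900},
    }


def start_compilation(request: dict) -> dict:
    """Submit the job (requires AWS credentials and the boto3 package)."""
    import boto3  # imported lazily so the builder can be used offline

    return boto3.client('sagemaker').create_compilation_job(**request)
```

Separating the request builder from the API call makes it possible to review or log the exact configuration before anything is submitted.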
Note: If you select rasp4b as the target device, the models will be compiled for a 64-bit architecture. Consequently, these models cannot be loaded on a 32-bit operating system, such as the 32-bit Raspberry Pi OS. In this scenario, use a 64-bit OS instead.
Although this detail is not mentioned in the official AWS documentation, it has been discussed in the AWS Forum. At the bottom of the page, you may observe the following:
The library libdlr.so compiled by SageMaker Neo with target rasp4b returns "ELF-64 bit LSB pie executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked".
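You can run a similar check yourself without the file utility by reading the ELF header directly: the fifth byte (EI_CLASS) is 1 for 32-bit and 2 for 64-bit objects. The sketch below is illustrative only; the helper names are made up, and the location of libdlr.so inside your compiled model is something you would need to fill in.

```python
import struct


def elf_class(path: str) -> str:
    """Return 'ELF32' or 'ELF64' based on the EI_CLASS byte of an ELF file."""
    with open(path, 'rb') as f:
        header = f.read(5)
    if len(header) < 5 or header[:4] != b'\x7fELF':
        raise ValueError(f'{path} is not an ELF file')
    return {1: 'ELF32', 2: 'ELF64'}.get(header[4], 'unknown')


def interpreter_bits() -> int:
    """Pointer size of the running Python, which reflects a 32- or 64-bit userland."""
    return struct.calcsize('P') * 8


# Example usage on the device (path is a placeholder, not a known location):
# print(elf_class('<path to libdlr.so>'), interpreter_bits())
```

If the library reports ELF64 while the interpreter reports 32 bits, the model cannot be loaded on that OS image.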
If no specific configurations are required, you can proceed with the default settings.
Packaging Model with SageMaker Edge Manager
To package your model for deployment, create a SageMaker Edge Packaging Job.
Enter the SageMaker Neo compilation job name to proceed.
If you choose the Greengrass V2 component as the deployment preset, the compiled model will be:
- Registered as a Greengrass V2 component by SageMaker Edge.
- Saved to /greengrass/v2/work/vgg16-component/ on the Greengrass Core device.
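The packaging job can also be described as a request for boto3's create_edge_packaging_job. Treat this as a sketch under assumptions: the helper names, the model name, and the version strings are placeholders, and you should confirm the preset configuration keys against the current API reference before relying on them.

```python
import json


def build_packaging_request(job_name: str, compilation_job_name: str,
                            model_name: str, model_version: str,
                            role_arn: str, output_s3_uri: str,
                            component_name: str = 'vgg16-component',
                            component_version: str = '1.0.0') -> dict:
    """Assemble keyword arguments for sagemaker.create_edge_packaging_job."""
    preset_config = {
        'ComponentName': component_name,
        'ComponentVersion': component_version,  # assumed version for illustration
    }
    return {
        'EdgePackagingJobName': job_name,
        'CompilationJobName': compilation_job_name,
        'ModelName': model_name,
        'ModelVersion': model_version,
        'RoleArn': role_arn,
        'OutputConfig': {
            'S3OutputLocation': output_s3_uri,
            # Registers the packaged model as a Greengrass V2 component.
            'PresetDeploymentType': 'GreengrassV2Component',
            'PresetDeploymentConfig': json.dumps(preset_config),
        },
    }


def start_packaging(request: dict) -> dict:
    """Submit the job (requires AWS credentials and the boto3 package)."""
    import boto3  # imported lazily so the builder can be used offline

    return boto3.client('sagemaker').create_edge_packaging_job(**request)
```

The component name chosen here is the one the inference component later lists as a dependency, so the two must match.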
Setting Up Greengrass Core
Set up Greengrass Core on your edge device. In this example, an EC2 instance running Ubuntu 20.04.3 is used. For detailed installation instructions, refer to the AWS Greengrass Core documentation.
Please note that MQTT over TLS requires port 8883. If this port is not open, follow the manual setup guide.
Install JDK
sudo apt install default-jdk
java -version
Add a User and Group for Greengrass Core
sudo useradd --system --create-home ggc_user
sudo groupadd --system ggc_group
Configure AWS Credentials
# Set the credentials for greengrass-core-setup-user provisioned by CloudFormation
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
Install Greengrass Core
curl -s https://d2s8p88vqu9w66.cloudfront.net/releases/greengrass-nucleus-latest.zip > greengrass-nucleus-latest.zip
unzip greengrass-nucleus-latest.zip -d GreengrassInstaller && rm greengrass-nucleus-latest.zip
sudo -E java -Droot="/greengrass/v2" -Dlog.store=FILE \
-jar ./GreengrassInstaller/lib/Greengrass.jar \
--aws-region ap-northeast-1 \
--thing-name MyGreengrassCore \
--thing-group-name MyGreengrassCoreGroup \
--thing-policy-name GreengrassV2IoTThingPolicy \
--tes-role-name GreengrassV2TokenExchangeRole \
--tes-role-alias-name GreengrassCoreTokenExchangeRoleAlias \
--component-default-user ggc_user:ggc_group \
--provision true \
--setup-system-service true
Check the Greengrass Core Service
Ensure the device has at least 2GB of memory for optimal performance.
% sudo systemctl status greengrass
● greengrass.service - Greengrass Core
Loaded: loaded (/etc/systemd/system/greengrass.service; enabled; vendor preset: enabled)
Active: active (running) since Wed 2022-02-16 05:09:16 UTC; 1 day 2h ago
Main PID: 1454 (sh)
Tasks: 51 (limit: 2197)
Memory: 734.2M
CGroup: /system.slice/greengrass.service
Automatic Resource Provisioning
This guide uses automatic resource provisioning. If preferred, resources can also be manually provisioned using the manual provisioning guide.
Resource | Name |
---|---|
Thing | MyGreengrassCore |
Thing Group | MyGreengrassCoreGroup |
Thing Policy | GreengrassV2IoTThingPolicy |
Token Exchange Role | GreengrassV2TokenExchangeRole |
Role Alias | GreengrassCoreTokenExchangeRoleAlias |
Registering Greengrass Component for Edge Inference
Creating Component Recipe
To enable edge inference, create a recipe.yaml file to register a Greengrass Component. For more details on component recipes, refer to the official documentation.
RecipeFormatVersion: '2020-01-25'
ComponentName: vgg16-inference-component
ComponentVersion: 1.0.0
ComponentDescription: Inference component for VGG16
ComponentPublisher: Iret
# Arguments to be passed
ComponentConfiguration:
  DefaultConfiguration:
    Interval: 60
# Dependencies which will be installed with this component
ComponentDependencies:
  variant.DLR:
    VersionRequirement: ">=1.6.5 <1.7.0"
    DependencyType: HARD
  vgg16-component:
    VersionRequirement: ">=1.0.0"
    DependencyType: HARD
Manifests:
  - Name: Linux
    Platform:
      os: linux
    Lifecycle:
      Run:
        RequiresPrivilege: true
        Script: |
          . {variant.DLR:configuration:/MLRootPath}/greengrass_ml_dlr_venv/bin/activate
          python3 -u {artifacts:decompressedPath}/vgg16-inference-component-1.0.0/inference.py --interval {configuration:/Interval} --test_dir {work:path}/images/
    Artifacts:
      - Uri: s3://sagemaker-ml-model-artifacts-123456789012-ap-northeast-1/artifacts/vgg16-inference-component-1.0.0.zip
        Unarchive: ZIP
In this example, Interval specifies the time in seconds between inference runs.
Specifying Component Dependencies
Dependencies are listed in ComponentDependencies. For this example, the following components are required:
- variant.DLR: Required for loading models compiled by SageMaker Neo. It includes a Python virtual environment located at /greengrass/v2/work/variant.DLR/greengrass_ml/greengrass_ml_dlr_venv. More details are available in the official documentation.
- vgg16-component: The model compiled by SageMaker Neo and registered by SageMaker Edge Manager.
Creating the Component
Once recipe.yaml is complete, create the Greengrass component.
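If you would rather register the component from a script than from the console, boto3's greengrassv2 client accepts the recipe as bytes through create_component_version. The sketch below is an assumption-laden illustration: the helper names are made up, and the key check is a rough textual scan rather than real YAML validation.

```python
REQUIRED_KEYS = ('RecipeFormatVersion', 'ComponentName', 'ComponentVersion', 'Manifests')


def load_recipe(path: str) -> bytes:
    """Read a recipe file and do a rough check for its top-level keys."""
    with open(path, 'rb') as f:
        recipe = f.read()
    # Collect the names of unindented "Key: value" lines.
    top_level = {
        line.split(':', 1)[0]
        for line in recipe.decode('utf-8').splitlines()
        if line and not line[0].isspace() and ':' in line
    }
    missing = [key for key in REQUIRED_KEYS if key not in top_level]
    if missing:
        raise ValueError(f'recipe is missing top-level keys: {missing}')
    return recipe


def register_component(recipe_path: str) -> str:
    """Create the component version and return its ARN (requires AWS credentials)."""
    import boto3  # imported lazily so load_recipe works without the AWS SDK

    client = boto3.client('greengrassv2')
    response = client.create_component_version(inlineRecipe=load_recipe(recipe_path))
    return response['arn']
```

The check only catches obviously incomplete files; the service performs the authoritative validation when the component version is created.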
Deploying Greengrass Component
Configure Deployment
Press the Create button.
Enter vgg16-inference-deployment in the Name field and press the Next button.
Select Components to Deploy.
- My Components:
  - vgg16-component: The VGG16 model packaged by SageMaker Edge Manager.
  - vgg16-inference-component: The inference component.
- Public Components:
  - variant.DLR: Required for loading models.
  - aws.greengrass.Nucleus: Core functionality for Greengrass.
Configure Components
Press Next without making configuration changes.
Review and Deploy
After reviewing the deployment configuration, press Deploy to start deploying the components.
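The console steps above correspond to a single create_deployment call on boto3's greengrassv2 client. Here is a minimal sketch of the request. The helper names, the target ARN, and every version number in the example are placeholders; use the versions that actually exist in your account.

```python
from typing import Mapping


def build_deployment_request(target_arn: str,
                             component_versions: Mapping[str, str],
                             name: str = 'vgg16-inference-deployment') -> dict:
    """Assemble keyword arguments for greengrassv2.create_deployment."""
    return {
        'targetArn': target_arn,  # ARN of the thing or thing group to deploy to
        'deploymentName': name,
        'components': {
            component: {'componentVersion': version}
            for component, version in component_versions.items()
        },
    }


def deploy(request: dict) -> str:
    """Create the deployment and return its ID (requires AWS credentials)."""
    import boto3  # imported lazily so the builder can be used offline

    return boto3.client('greengrassv2').create_deployment(**request)['deploymentId']


# Placeholder ARN and versions, for illustration only.
request = build_deployment_request(
    'arn:aws:iot:ap-northeast-1:123456789012:thinggroup/MyGreengrassCoreGroup',
    {'vgg16-component': '1.0.0', 'vgg16-inference-component': '1.0.0'},
)
print(sorted(request['components']))
```

Components declared as dependencies in the recipe are resolved by Greengrass, so listing them explicitly is mainly useful when you want to pin a version.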
Testing
To test inference on your Greengrass Core, follow these steps:
- Prepare Input Data: Pre-trained PyTorch models require a 4-dimensional tensor (N, C, H, W) as input. Convert your image into a NumPy array. For more details, refer to the official documentation.
- Transfer Data to Greengrass Core: Move the converted data to the directory /greengrass/v2/work/vgg16-inference-component/images/ on your Greengrass Core device.
- View Inference Logs: Check the file /greengrass/v2/logs/vgg16-inference-component.log for inference results on your Greengrass Core device.
Python Script to Convert Images to Numpy Array
You can use the following Python script to prepare images for inference:
import argparse
import os

from PIL import Image
import numpy as np
import torch
from torchvision import transforms


def load_image_to_tensor(path: str) -> torch.Tensor:
    preprocess = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])
    # Force 3-channel RGB so grayscale or RGBA images do not break Normalize.
    img = Image.open(path).convert('RGB')
    tensor_3d = preprocess(img)
    # Add the batch dimension: (C, H, W) -> (1, C, H, W).
    return torch.unsqueeze(tensor_3d, 0)


def save(tensor: torch.Tensor, path: str) -> None:
    np.save(path, tensor.numpy())


def parse_args() -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    parser.add_argument('image', type=str)
    args, _ = parser.parse_known_args()
    return args


if __name__ == '__main__':
    args = parse_args()
    image = args.image
    tensor = load_image_to_tensor(image)
    save(tensor, os.path.basename(image) + '.npy')
Run the Script
Use the script to convert an image to Numpy array format:
python convert_img_to_npy.py <YOUR_IMAGE>
Transfer Data
Transfer the converted Numpy array to your Greengrass Core device:
scp xxx.jpg.npy <GREENGRASS_HOST>:/greengrass/v2/work/vgg16-inference-component/images/
Check Inference Logs
SSH into the Greengrass Core device and check inference logs:
ssh <GREENGRASS_HOST>
tail -f /greengrass/v2/logs/vgg16-inference-component.log
Example Inference Results
Below are examples of inference results logged in /greengrass/v2/logs/vgg16-inference-component.log:
2022-02-19T21:32:21.993Z [INFO] (Copier) vgg16-inference-component: stdout. Prediction result of /greengrass/v2/work/vgg16-inference-component/images/keyboard.jpg.npy: ['n03085013', 'computer_keyboard']. {scriptName=services.vgg16-inference-component.lifecycle.Run.Script, serviceName=vgg16-inference-component, currentState=RUNNING}
2022-02-19T21:32:22.257Z [INFO] (Copier) vgg16-inference-component: stdout. Prediction result of /greengrass/v2/work/vgg16-inference-component/images/pen.jpg.npy: ['n03388183', 'fountain_pen']. {scriptName=services.vgg16-inference-component.lifecycle.Run.Script, serviceName=vgg16-inference-component, currentState=RUNNING}
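If you want to collect these results programmatically, the log lines can be parsed with a regular expression. The sketch below is written against the two sample lines above; the pattern and the Prediction type are assumptions for illustration and may need adjusting if your log format differs.

```python
import ast
import re
from typing import NamedTuple, Optional

PREDICTION_RE = re.compile(
    r'^(?P<timestamp>\S+) \[\w+\] .*?stdout\. Prediction result of '
    r'(?P<path>\S+): (?P<label>\[.*?\])\. \{'
)


class Prediction(NamedTuple):
    timestamp: str
    path: str
    wordnet_id: str
    name: str


def parse_prediction(line: str) -> Optional[Prediction]:
    """Extract one prediction from a component log line, or None if absent."""
    match = PREDICTION_RE.search(line)
    if match is None:
        return None
    # The label is printed as a Python list literal, e.g. ['n03085013', '...'].
    wordnet_id, name = ast.literal_eval(match.group('label'))
    return Prediction(match.group('timestamp'), match.group('path'), wordnet_id, name)


sample = (
    "2022-02-19T21:32:21.993Z [INFO] (Copier) vgg16-inference-component: stdout. "
    "Prediction result of /greengrass/v2/work/vgg16-inference-component/images/keyboard.jpg.npy: "
    "['n03085013', 'computer_keyboard']. "
    "{scriptName=services.vgg16-inference-component.lifecycle.Run.Script, "
    "serviceName=vgg16-inference-component, currentState=RUNNING}"
)
result = parse_prediction(sample)
print(result.name)
```

Lines that do not contain a prediction, such as the startup message that prints the arguments, simply return None.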
The predicted labels for the two test images are as follows:
- keyboard.jpg: computer_keyboard
- pen.jpg: fountain_pen
Conclusion
The integration of AWS IoT Greengrass V2 and SageMaker empowers developers to bring intelligent machine learning capabilities to edge devices efficiently. Throughout this guide, we explored the lifecycle of edge AI inference, from model training and optimization to deployment and testing on Greengrass Core devices.
Happy Coding! 🚀