In my last post, I detailed setting up a simple web page, powered by AWS Cognito, to make predictions against an AWS Machine Learning hosted model. While the UX on that page is not exactly delightful, it is functional. Since this runs under my own personal account, I want to make sure that the costs of a solution exposed to the entire internet stay managed. Costs for AWS ML are relatively minor: the current toy OSHA model costs approximately one cent an hour to run, and each individual prediction costs 1/100th of a cent. That is hardly going to make a dent in the household budget, yet the concepts applied to this problem carry over to potentially much larger environments (large EMR clusters, big GPU-enabled EC2 instances, etc.). This post discusses a small CloudWatch and Lambda solution that monitors use of the prediction endpoint and shuts down the environment when it is idle. As usual, code is provided to follow along.

My first move in managing this solution was to leave the endpoint in a shutdown state by default. When a visitor comes to the osha-challenge site, the AWS JavaScript SDK checks the status of the endpoint. If the endpoint is not in an available state, the prediction functions are disabled and the button to start the endpoint is enabled. Clicking the start button sends an API request to start the real-time prediction endpoint and sets an auto-refresh on the page.
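The page itself does this through the AWS JavaScript SDK, but the two calls involved are easy to sketch in Python with boto3. This is a minimal sketch, assuming a hypothetical model ID; the real ID is baked into the deployed site.

```python
import boto3

ml = boto3.client("machinelearning")
MODEL_ID = "ml-EXAMPLE123"  # hypothetical; the real ID lives in the site

def endpoint_status(model_id):
    """Return the real-time endpoint status: NONE, READY, UPDATING, or FAILED."""
    model = ml.get_ml_model(MLModelId=model_id)
    return model.get("EndpointInfo", {}).get("EndpointStatus", "NONE")

def start_endpoint(model_id):
    """Request creation of the real-time prediction endpoint."""
    resp = ml.create_realtime_endpoint(MLModelId=model_id)
    return resp["RealtimeEndpointInfo"]["EndpointStatus"]

if endpoint_status(MODEL_ID) != "READY":
    start_endpoint(MODEL_ID)  # the page then auto-refreshes until READY
```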

Once the prediction endpoint is enabled, any visitor can fill in the parameters and try out the prediction feature. A CloudWatch alarm is set on the Predict metric in the AWS/ML namespace and fires when the number of predictions requested over the past 45 minutes is zero. Once triggered, a message is sent to SNS, which forwards it to a Python AWS Lambda function (code available in this gist). This small script calls the AWS ML API and shuts down the endpoint.
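For reference, here is roughly what that alarm looks like when created through boto3 rather than Terraform. This is a sketch under a few assumptions: the metric name (PredictCount), the dimension, the model ID, and the SNS topic ARN are all placeholders rather than the values from my actual configuration.

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical identifiers; the real ones live in the Terraform config.
MODEL_ID = "ml-EXAMPLE123"
SNS_TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:shutdown-endpoint"

cloudwatch.put_metric_alarm(
    AlarmName="osha-endpoint-idle",
    Namespace="AWS/ML",
    MetricName="PredictCount",      # assumed name for the Predict metric
    Dimensions=[{"Name": "MLModelId", "Value": MODEL_ID}],
    Statistic="Sum",
    Period=2700,                    # 45 minutes
    EvaluationPeriods=1,
    Threshold=1,
    ComparisonOperator="LessThanThreshold",
    TreatMissingData="breaching",   # no data points means no predictions
    AlarmActions=[SNS_TOPIC_ARN],
)
```

The Lambda function itself is only a few lines. A minimal sketch along the lines of the gist, with the model ID hard-coded (more on that below):

```python
import boto3

MODEL_ID = "ml-EXAMPLE123"  # hard-coded at deploy time; hypothetical value

def lambda_handler(event, context):
    """Triggered via SNS when the idle alarm fires; tears down the endpoint."""
    ml = boto3.client("machinelearning")
    # The SNS payload is not inspected; the alarm firing is signal enough.
    ml.delete_realtime_endpoint(MLModelId=MODEL_ID)
    return "Requested shutdown of endpoint for {}".format(MODEL_ID)
```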

The entire infrastructure is configured and maintained with Terraform, with the code available in this public GitHub repository. I originally wanted to use a serverless framework such as Apex to provide environment variable injection, but I hit a roadblock with builds appearing to be broken under Windows. For a small example like this, rolling a Lambda ZIP artifact by hand is a trivial exercise; the only compromise was hard-coding the ML model parameter into the deployed script rather than injecting it as an environment variable, which tools such as Apex make much easier.
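Rolling the artifact by hand amounts to zipping the single handler file before Terraform uploads it. A sketch, assuming the handler lives in shutdown_endpoint.py (a hypothetical filename):

```python
import zipfile

# Package the single-file handler into the artifact Terraform deploys.
with zipfile.ZipFile("lambda.zip", "w", zipfile.ZIP_DEFLATED) as zf:
    zf.write("shutdown_endpoint.py")  # hypothetical handler filename
```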

I’ve used Lambda a few times in the past, mainly for infrastructure-type tasks such as creating custom CloudFormation resources and relaying SNS messages to Slack. Using Lambda to help manage costs is a very common pattern. I’m still collecting my thoughts on the serverless frameworks and hope to do another post soon covering my experiences. For now, I’m happy to have the basic environment up and managed, allowing me to get back to RStudio for a while and work with the data without having to worry about infrastructure.