AWS Lambda batch jobs

posted: August 20, 2022

tl;dr: AWS’s Lambda service makes it easy to write, test, and deploy batch jobs...

I laugh every time I contemplate doing it. It seems like a complete misapplication of technology. And yet I cannot help myself: it is so easy, and works so well, that I just go ahead and do it. “It” is using Amazon Web Services’ Lambda function-as-a-service to do what is basically just a batch job: a job that runs at a fixed time of day and performs some basic housekeeping tasks.

In the punch card era of the 1960s and 1970s, which predates me, all jobs were batch jobs. A programmer handed over a deck of punch cards containing the program to the mainframe operator, who would run the cards through the computer at some later point in time. Often, for undergraduate students at the bottom of the university hierarchy, this was in the middle of the night. A programmer would stop by the computer room the next morning to pick up the printout produced by the job, and to retrieve the deck. If the programmer made a syntax error, the printout would say so, and there’d be an opportunity that night to fix it and try again. Hence the debugging cycle time was 24 hours.

Lambda is one of AWS’s most innovative services, because it allows code to be run in the cloud with hardly any concern about server setup or configuration. The main limitation is that the job has to finish within 15 minutes, which is the maximum run time of a single Lambda invocation. You may have to fine-tune the memory size and raise the timeout from its default of 3 seconds, perhaps up to the 15-minute limit. You also have to set up an IAM role for the Lambda, and give it permissions to interact with the other AWS services and objects needed by the job, but that’s pretty easily done.
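One way to live with the time limit is to have the job watch its own clock and stop early, leaving the rest of the work for the next run. Here is a minimal sketch of that idea. It relies on the get_remaining_time_in_millis() method that Lambda's Python context object provides; the function name process_until_deadline, the handle callback, and the 30-second safety margin are my own illustrative choices, not anything Lambda prescribes.

```python
def process_until_deadline(items, context, handle, safety_ms=30_000):
    """Handle items in order, stopping while there is still time to wrap up.

    Returns (done, remaining) so leftover work can be picked up by a later run.
    `context` is the object Lambda passes to the handler; `handle` is a
    hypothetical callback that does the real work for one item.
    """
    items = list(items)
    done = []
    for index, item in enumerate(items):
        # Ask Lambda how many milliseconds are left before the timeout hits.
        if context.get_remaining_time_in_millis() < safety_ms:
            return done, items[index:]
        handle(item)
        done.append(item)
    return done, []
```

Inside a handler this would be called as process_until_deadline(work_items, context, do_one_item), with the remaining items saved somewhere for the next scheduled run.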

A deck of computer punch cards bound by a rubber band, with writing in red on the side of the deck

What does AWS Lambda have in common with this deck of punch cards?

A Lambda has to be triggered by something, and one of the possible triggers is a time-of-day timer. Lambdas themselves are stateless, but they can change the state of other things in AWS and they can do input and output. It is also possible for a Lambda to change its own environment variables through the Lambda API; this can be a quick-and-dirty way to store a job’s bookmark. AWS’s Simple Notification Service can be used to send an email from a Lambda to the subscribers of a topic. So by triggering the Lambda with a timer, and sending an email at the conclusion of the job, you get the equivalent of a 1960s-era mainframe batch job, with the email functioning as the job printout.
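To make that concrete, here is a rough sketch of what such a job might look like, not my actual code. It assumes an SNS topic with an email subscription whose ARN is in a hypothetical REPORT_TOPIC_ARN environment variable, a hypothetical BOOKMARK variable for the bookmark, and an IAM role that allows sns:Publish, lambda:GetFunctionConfiguration, and lambda:UpdateFunctionConfiguration. The timer itself is configured outside the code, as a scheduled trigger on the function.

```python
import datetime
import os


def build_report(results, now=None):
    """Format the job's results as plain text, like a batch printout."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    lines = [f"Job finished at {now:%Y-%m-%d %H:%M} UTC", ""]
    lines += [f"{name}: {status}" for name, status in sorted(results.items())]
    return "\n".join(lines)


def send_report(sns_client, topic_arn, report):
    """Publish the report; SNS emails it to the topic's email subscribers."""
    return sns_client.publish(
        TopicArn=topic_arn,
        Subject="Nightly batch job report",
        Message=report,
    )


def save_bookmark(lambda_client, function_name, value):
    """Store a bookmark in the function's own environment variables."""
    config = lambda_client.get_function_configuration(FunctionName=function_name)
    variables = dict(config.get("Environment", {}).get("Variables", {}))
    variables["BOOKMARK"] = value  # hypothetical variable name
    # The update replaces the whole variable map, so pass the merged copy.
    # The new value is seen by later invocations, not the one running now.
    lambda_client.update_function_configuration(
        FunctionName=function_name,
        Environment={"Variables": variables},
    )
    return variables


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above can be tested without AWS.
    import boto3

    last_run = os.environ.get("BOOKMARK", "never")
    results = {"housekeeping": "ok", "previous run": last_run}  # placeholder work
    send_report(
        boto3.client("sns"), os.environ["REPORT_TOPIC_ARN"], build_report(results)
    )
    now = datetime.datetime.now(datetime.timezone.utc).isoformat()
    save_bookmark(boto3.client("lambda"), context.function_name, now)
    return {"status": "done"}
```

Keeping the report building and the AWS calls in separate small functions is a design choice on my part: it lets the formatting be tried out locally with fake clients before anything is deployed.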

This is not at all the primary use case for Lambdas. They are typically triggered by events other than a timer, such as something else happening in another AWS service. Lambdas provide great scalability, and multiple instances can easily run concurrently, whereas a simple batch job needs just a single instance. But because of their power and flexibility, Lambdas can also be used to run simple batch jobs.

Some of the tasks I have used Lambda batch jobs for:

Wake up in the middle of the night, see if certain expected things happened during the work day, and perhaps send appropriately worded emails to notify people
Wake up in the middle of the night, check an S3 bucket for new directories and files, clean up and move the files, and notify people via email
Wake up in the middle of the night, check on the status of various subsystems, and send an email summarizing where things stand
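As an illustration of the S3 task, here is a simplified sketch of the "check for new files and move them" step. The bucket layout (an incoming/ prefix and a processed/ prefix) and the function name move_new_files are assumptions made up for the example; the boto3 calls used are the S3 client's list_objects_v2 paginator, copy_object, and delete_object. S3 has no real move operation, so a move is a copy followed by a delete.

```python
def move_new_files(s3, bucket, src_prefix="incoming/", dst_prefix="processed/"):
    """Move every object under src_prefix to dst_prefix; return the new keys.

    `s3` is a boto3 S3 client, e.g. boto3.client("s3"). The prefixes are
    hypothetical; adjust them to the bucket's real layout.
    """
    moved = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=src_prefix):
        # A page with no matching objects has no "Contents" key at all.
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):
                continue  # skip the empty placeholder keys that mimic directories
            new_key = dst_prefix + key[len(src_prefix):]
            # copy_object handles objects up to 5 GB, plenty for housekeeping.
            s3.copy_object(
                Bucket=bucket,
                Key=new_key,
                CopySource={"Bucket": bucket, "Key": key},
            )
            s3.delete_object(Bucket=bucket, Key=key)
            moved.append(new_key)
    return moved
```

The returned list of keys is handy for the notification email: it can go straight into the body of the message.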

AWS Lambda supports a variety of languages, but for devops housekeeping tasks I find Python the easiest to use. The Lambda Python runtime environment comes with AWS’s excellent, richly featured boto3 library built in; there is no need to package it with your code into the Lambda. boto3 lets you do nearly anything you’d want to programmatically do in AWS. Between boto3 and the rich Python standard library, with its support for operating system functions, file management (e.g. ZIP and CSV), and many other features, I usually don’t have to package anything at all into the Lambda. If I do need to use a third-party library from PyPI, such as the requests HTTP client, it is not too hard to package it and use Layers to bring it into the Lambda runtime. This keeps the code in the Lambda to the bare minimum: it is just the Python script with the desired imports at the top.
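For a sense of how far the standard library alone goes, here is a small sketch that turns a list of rows into a zipped CSV entirely in memory, using only csv, io, and zipfile. The function name rows_to_zipped_csv and the sample field names are invented for the example.

```python
import csv
import io
import zipfile


def rows_to_zipped_csv(rows, fieldnames, csv_name="report.csv"):
    """Write dict rows to CSV and return a ZIP archive of it as bytes."""
    text = io.StringIO(newline="")
    writer = csv.DictWriter(text, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)

    archive = io.BytesIO()
    with zipfile.ZipFile(archive, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(csv_name, text.getvalue())
    return archive.getvalue()


# Example: the bytes could then be uploaded to S3 as the job's output.
payload = rows_to_zipped_csv(
    [{"name": "backup", "status": "ok"}], fieldnames=["name", "status"]
)
```

Nothing here needs to be packaged or put in a Layer; the imports at the top are all that is required.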

This makes it easy to incrementally develop the Python script. I write it in a text editor (Visual Studio Code), copy/paste it into the code window for the AWS Lambda using keyboard shortcuts, press the Deploy button, wait a second or two, then press the Test button to run it. So it’s just the tiniest bit harder and slower than testing a script locally in the Python REPL, but you can still do rapid-fire incremental development and testing.

When I’m done, I just let the timer take over. The job can be monitored by checking for the email produced by the job, or by looking in AWS CloudWatch, or by having the job produce some other type of notification, such as a Slack post. It’s hard to imagine how it could be much easier.
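The Slack option also needs nothing beyond the standard library. Below is a minimal sketch that assumes a Slack incoming webhook has been set up; the webhook URL shown in the usage note is a made-up placeholder, and the function names are my own. Incoming webhooks accept a JSON body with a "text" field sent by HTTP POST.

```python
import json
import urllib.request


def build_slack_request(webhook_url, text):
    """Build the POST request for a Slack incoming webhook."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_to_slack(webhook_url, text, timeout=10):
    """Send the message and return the HTTP status code."""
    request = build_slack_request(webhook_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

At the end of the handler this would be called as post_to_slack(os.environ["SLACK_WEBHOOK_URL"], "Nightly job finished"), where SLACK_WEBHOOK_URL is a hypothetical environment variable holding something like https://hooks.slack.com/services/T000/B000/XXXX.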