Increase Data Processing Throughput of a Step Function with Dynamic Parallelism

Is your data growing beyond what a single Lambda function in your Step Function can handle? Are your AWS Step Functions executions failing because a Lambda function hit its 900-second timeout while processing data? Maybe your data is large, but not large enough to justify using Apache Spark yet. While you can add retry logic, retries do not make your processing any faster. One way to process data faster and get around the 900-second timeout is to incorporate Dynamic Parallelism into your AWS Step Functions State Machine. The particular pattern to use when building the State Machine is the fan-out pattern.

How does it work?

In a state machine without Dynamic Parallelism, only one instance of the Lambda function runs. With fan-out Dynamic Parallelism, you can have multiple instances of your Lambda function running in parallel, i.e. the function fans out. Imagine you have a Lambda function that batch processes logs daily. If there are too many logs or records, the Lambda function will time out at 900 seconds (15 minutes). What if you have determined that a single instance, even with retry logic, would need 150 minutes to get through the data? With the fan-out method, you can have 10 instances of the Lambda function running at the same time. Note that 10 instances sharing 150 minutes of work come to about 15 minutes each, which sits right at the timeout, so in practice you would split the data a little further to leave some headroom. This ends up looking like how Spark works. Each Lambda instance, like a worker in a Spark cluster, processes a subset of the data in parallel, effectively reducing the time needed to process all of the data.
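The sizing arithmetic above can be sketched in a few lines of Python. This is only a rough estimate under stated assumptions: it assumes the work splits evenly and that processing time scales linearly with the amount of data. The function name workers_needed and the 720-second per-instance budget are hypothetical choices for illustration, not AWS settings.

```python
import math

LAMBDA_TIMEOUT_SECONDS = 900  # Lambda's maximum timeout (15 minutes)


def workers_needed(total_seconds, budget_seconds=720):
    """Estimate how many parallel Lambda instances keep each one under budget.

    budget_seconds is an assumed per-instance target (80% of the timeout),
    picked here only to leave headroom. It is not an AWS parameter.
    """
    if not 0 < budget_seconds <= LAMBDA_TIMEOUT_SECONDS:
        raise ValueError("budget must be between 1 and 900 seconds")
    # Round up so that no instance is asked to do more than its budget.
    return max(1, math.ceil(total_seconds / budget_seconds))


total = 150 * 60  # the 150-minute job from the example, in seconds

# With no headroom, 10 instances each land exactly on the 900-second limit.
print(workers_needed(total, budget_seconds=900))  # 10

# With a 720-second budget, the estimate rises to 13 instances.
print(workers_needed(total))  # 13
```

Real workloads are rarely perfectly even, so treating the result as a lower bound and measuring actual run times is the safer approach.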

General Steps Needed to Implement Dynamic Parallelism:

To use Dynamic Parallelism, you need to use the Map state in your Step Function State Machine. With the Map state, you can set up an iterator that passes parameters to each of your Lambda instances, telling it which subset of the data to process. One way to determine those subsets is to add a step before the Map step. In that step, another Lambda function inspects your data and returns a list of partitions as its output. The Map step then runs one instance of your processing Lambda function per partition and passes each instance the partition it needs to process.
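A minimal sketch of these steps, written in Python, might look like the following. Everything named here is an assumption made for illustration: the handler names, the event fields ("keys", "partitions"), the state names, and the placeholder ARNs are all hypothetical. The state machine definition is built as a Python dictionary and printed as JSON; it uses the Map state's ItemsPath, MaxConcurrency, and Iterator fields from the Amazon States Language.

```python
import json

# Placeholder ARNs (hypothetical); replace with your own functions.
PARTITION_LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:partition-data"
PROCESS_LAMBDA_ARN = "arn:aws:lambda:REGION:ACCOUNT_ID:function:process-partition"


def partition_handler(event, context):
    """Step before the Map step: split the input keys into partitions."""
    keys = event["keys"]
    wanted = event.get("partitions", 10)
    # Ceiling division gives the chunk size; at least 1 so range() is valid.
    size = max(1, -(-len(keys) // wanted))
    chunks = [keys[i:i + size] for i in range(0, len(keys), size)]
    return {"partitions": [{"keys": chunk} for chunk in chunks]}


def process_handler(event, context):
    """Runs once per partition inside the Map state."""
    keys = event["keys"]
    # Real processing of each key would go here.
    return {"processed": len(keys)}


STATE_MACHINE = {
    "Comment": "Fan-out sketch: partition the data, then process in parallel",
    "StartAt": "CreatePartitions",
    "States": {
        "CreatePartitions": {
            "Type": "Task",
            "Resource": PARTITION_LAMBDA_ARN,
            "Next": "ProcessPartitions",
        },
        "ProcessPartitions": {
            "Type": "Map",
            "ItemsPath": "$.partitions",  # the list returned by the first step
            "MaxConcurrency": 10,  # at most 10 iterations at once
            "Iterator": {
                "StartAt": "ProcessOnePartition",
                "States": {
                    "ProcessOnePartition": {
                        "Type": "Task",
                        "Resource": PROCESS_LAMBDA_ARN,
                        "End": True,
                    }
                },
            },
            "End": True,
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(STATE_MACHINE, indent=2))
```

Each element of the partitions list becomes the input event of one process_handler invocation, which is how each Lambda instance learns which subset of the data is its own. Locally, the same flow can be imitated by calling partition_handler and then looping over its output with process_handler.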

More information on Dynamic Parallelism can be found here.