Luca Cavazzana



Min $: Optimizing Spark Costs on AWS with AI-Powered Tuning

Big data performance tuning on cloud infrastructure involves complex trade-offs. We use AI techniques to identify the optimal ones.

In this session, we’ll showcase how we used AI-powered techniques to cut the Amazon EMR (Elastic MapReduce) costs of running batch jobs on an Apache Spark big data implementation. The target is a business intelligence application for the video-on-demand industry. The intervention resulted in cost savings of over 40%.
Performance tuning for big data frameworks can be challenging. The sheer number of parameters on different layers (e.g., the Spark framework, the JVM, and YARN) and their interdependencies make predicting and optimizing performance immensely complex.
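
As a small illustration of one such interdependency, the sketch below checks whether a Spark executor fits inside a YARN container. It assumes Spark's documented default memory overhead on YARN (10% of executor memory, with a 384 MiB floor); the function names and the example values are hypothetical and are not taken from the study.

```python
# Minimal sketch: a Spark setting (spark.executor.memory) interacts with a
# YARN setting (yarn.scheduler.maximum-allocation-mb).
# Assumption: default overhead is max(384 MiB, 10% of executor memory).


def yarn_container_mb(executor_memory_mb: int,
                      overhead_factor: float = 0.10,
                      min_overhead_mb: int = 384) -> int:
    """Memory YARN must grant for one executor: heap plus off-heap overhead."""
    overhead_mb = max(min_overhead_mb, int(executor_memory_mb * overhead_factor))
    return executor_memory_mb + overhead_mb


def fits_in_yarn(executor_memory_mb: int, yarn_max_allocation_mb: int) -> bool:
    """True if the executor's container request stays within YARN's limit."""
    return yarn_container_mb(executor_memory_mb) <= yarn_max_allocation_mb


if __name__ == "__main__":
    # Hypothetical values: a 4 GiB executor does not fit a 4 GiB YARN limit,
    # because the overhead pushes the request above the executor heap size.
    print(yarn_container_mb(4096))
    print(fits_in_yarn(4096, 4096))
    print(fits_in_yarn(4096, 8192))
```

Raising executor memory without also checking the YARN limit is one way a change on one layer can silently invalidate a setting on another.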

Running big data applications on the cloud adds further complexity, because the cluster configuration itself introduces more options, such as instance family, size, and count.

As a result, teams have to rely on vendor guidelines and generic rules of thumb, which can leave much of an expensive cluster's potential unused.

Our approach uses automation and AI techniques to iteratively identify optimal stack configurations, regardless of the stack's complexity. In this study, we tuned both Apache Spark parameters and EC2 cluster size, finding an optimal trade-off between resource allocation and execution time that minimizes the overall cost.
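
The trade-off can be made concrete with a toy cost model, sketched here under stated assumptions: cost is taken as instance count times an all-in hourly price times run time, and every number below is invented for illustration. The `Trial` class and `cheapest` function are hypothetical names, and the exhaustive comparison only shows the objective being minimized; it is not the optimization technique used in the study.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Trial:
    """One measured run of the batch job on a given cluster size."""
    instances: int
    hourly_price_usd: float  # assumed all-in price per instance-hour
    runtime_hours: float

    @property
    def cost_usd(self) -> float:
        # More instances raise the hourly bill but may shorten the run.
        return self.instances * self.hourly_price_usd * self.runtime_hours


def cheapest(trials: Iterable[Trial]) -> Trial:
    """Return the trial with the lowest total cost."""
    return min(trials, key=lambda t: t.cost_usd)


# Hypothetical measurements: run time does not shrink linearly with size,
# so neither the smallest nor the largest cluster is the cheapest.
trials = [
    Trial(instances=4, hourly_price_usd=0.50, runtime_hours=3.0),
    Trial(instances=8, hourly_price_usd=0.50, runtime_hours=1.25),
    Trial(instances=16, hourly_price_usd=0.50, runtime_hours=0.75),
]

best = cheapest(trials)
print(best.instances, best.cost_usd)
```

In this made-up data the mid-sized cluster wins: doubling from 4 to 8 instances more than halves the run time, while doubling again to 16 does not.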

About Luca

Luca is a Software Engineer at Akamas, a company dedicated to developing autonomous performance optimization solutions. Before Akamas, he was a Technical Solutions Engineer at ContentWise, where he worked to deploy analytics and personalization solutions for some of the largest media and streaming brands worldwide. Luca’s background is in DevOps and cloud infrastructures.
He holds a Master’s degree in Computer Engineering from Politecnico di Milano.