Evaluating Apache Spark and Alluxio for Data Analytics

Benchmarking Recommendations and Results

This whitepaper details how to evaluate Alluxio’s data orchestration platform as a distributed cache for Apache Spark, whether in a public cloud or on premises. We discuss best practices and present benchmarking results obtained with standard industry benchmark suites, such as TPC-DS and HiBench, running against cloud storage. The guide serves as a reference for reproducing similar experiments in your own environment as part of a Proof of Concept (PoC) to evaluate Alluxio with Apache Spark.
