Best Practice in Accelerating Data Applications with Spark+Alluxio
October 12, 2021
By David Zhu

Apache Spark and Alluxio both began as research projects in UC Berkeley’s AMPLab. As an open source data orchestration platform, Alluxio connects seamlessly to different data sources and accelerates access to them, improving the efficiency and fault tolerance of Spark’s big data workloads.

Alluxio has been deployed at large scale, managing petabyte-level data in the production environments of companies such as Microsoft, TikTok, Tencent, Singapore Development Bank, and China Unicom.

This talk shares the designs and use cases of integrated Alluxio and Spark solutions, as well as best practices and “what not to do” when designing and implementing Alluxio distributed systems.

ALLUXIO DAY VI 2021
