On-Demand Videos

video

Building an Efficient AI Training Platform at bilibili with Alluxio

In this talk, Lei Li and Zifan Ni share the experience of applying Alluxio in their AI platform to increase training efficiency at bilibili. The talk also includes technical architecture and specific issues addressed.

Watch now

video

Architecting a Heterogeneous Data Platform Across Clusters, Regions, and Clouds

Data platform teams are increasingly challenged with accessing multiple data stores that are separated from compute engines, such as Spark, Presto, TensorFlow or PyTorch. Whether your data is distributed across multiple datacenters and/or clouds, a successful heterogeneous data platform requires efficient data access. Alluxio enables you to embrace the separation of storage from compute and use Alluxio data orchestration to simplify adoption of the data lake and data mesh paradigms for analytics and AI/ML workloads.

Join Alluxio’s Sr. Product Mgr., Adit Madan, to learn:

Key challenges with architecting a successful heterogeneous data platform
How data orchestration can overcome data access challenges in a distributed, heterogeneous environment
How to identify ways to use Alluxio to meet the needs of your own data environment and workload requirements

Watch now

video

Industrial Bank’s Alluxio Deployment

ALLUXIO DAY IX 2022 January 21, 2022 Video: Presentation Slides: Industrial Bank's Alluxio Deployment from Alluxio, Inc. ‍

Watch now

video

Vipshop Offline Data Cache Acceleration System – Alluxio Integration

ALLUXIO DAY IX 2022 January 21, 2022 Video: Presentation Slides: Vipshop Offline Data Cache Acceleration System – Alluxio Integration from Alluxio, Inc. ‍

Watch now

video

The Evolution of an Open Data Platform with Alluxio

ALLUXIO DAY IX 2022 January 21, 2022 Video: Presentation Slides: The Evolution of an Open Data Platform with Alluxio from Alluxio, Inc. ‍ ‍

Watch now

video

Alluxio + Spark: Accelerating Auto Data Tagging in WeRide

Feifei Cai & Hao Zhu from WeRide provide an overview of Alluxio + Spark use case, which has been deployed and running in production to accelerate auto data tagging in the autonomous driving development.

Watch now

video

Building an Open Data Platform with Apache Iceberg

This talk will introduce Apache Iceberg and its place in a modern and open data platform. It will cover the motivation for creating Iceberg at Netflix, as well as the data architecture that Iceberg makes possible.

Watch now

video

Iceberg + Alluxio for Fast Data Analytics

This talk provides an overview of the read-after-write data consistent mechanism in the Alluxio system. Alluxio Core Maintainer and Presto Committer share their recent work on Alluxio and Apache Iceberg integration, as well as some recent work from the Presto community on Iceberg connector.

Watch now

video

Best Practice in Accelerating Data Applications with Spark+Alluxio

Apache Spark and Alluxio were both born in UC Berkeley’s AMPLab as research projects. As an open source data orchestration platform, Alluxio is able to achieve seamless docking and acceleration of different data sources, and improve the efficiency and fault tolerance of Spark’s big data computing business.

Alluxio has been deployed and running on a large scale managing petabytes level data in the production environment of companies such as Microsoft, Tiktok, Tencent, Singapore Development Bank, China Unicom, etc.

This talk shares the designs and use cases of the Alluxio and Spark integrated solutions, as well as the best practice and “what not to do” in designing and implementing Alluxio distributed systems.

Watch now

video

Apache Hudi : The Path Forward

In this talk, we will provide a complete picture of the Hudi platform components, along with their unique design choices. We will then deep dive into two important areas of active development going forward – table metadata management and caching. Specifically, we will discuss gaps in the data lake ecosystem around these aspects and provide strawman design approaches for Hudi aims to solve them going forward.

Watch now

video

Enabling Presto Caching at Uber with Alluxio

This talk discusses the opportunities and problems when Uber meets Alluxio. Zhongting from Uber will provide an overview of Uber traffic, cloud, distribution, invalidation, and consistent hashing. Beinan from Alluxio will provide a deep dive of metadata and monitoring metrics.

Watch now

video

Improve Presto Architectural Decisions with Shadow Cache

This talk describes the design of shadow cache, a lightweight component to track the working set size of Alluxio cache. Shadow cache can keep track of the working set size over the past window dynamically, and is implemented by a series of bloom filters. We’ve deployed the shadow cache in Facebook Presto and leverage the result to understand the system bottleneck and help with routing design decisions.

Watch now

Alluxio Enterprise AI

Alluxio Enterprise Data

On-Demand Videos

Sign-up for a Live Demo or Book a Meeting with a Solutions Engineer