Alluxio Community Newsletter

October 2023

Introducing Alluxio Enterprise AI

Alluxio Enterprise AI brings together performance, data accessibility, scalability, and cost-efficiency to fuel next-generation data-intensive applications. As the celebration continues for the release of Alluxio Enterprise AI, hear what our team members have to say on what they are most excited about for this launch!

Watch Announcement Video | Read Press Release

Download Free Trial

Product Vision Blog | Introducing Alluxio Enterprise AI and A Vision Beyond Unintelligent Storage

Alluxio’s Senior Solutions Engineer Roland Theron shares how Alluxio benefits model training workflows by reducing data loading times, allowing for better utilization of your compute resources.

Read Now

Product Engineering Blog | Introducing DORA: The Next-generation Alluxio Architecture

Alluxio’s Chief Architect Bin Fan and Senior Staff Engineer Beinan Wang discuss the development of the DORA architecture, including our motivation, design decisions, and implementation.

Read Now

Product Demo | End-to-End Machine Learning Pipeline with Alluxio

Watch the Alluxio Enterprise AI end-to-end ML pipeline demo, and see for yourself the significant performance improvements as well as increased GPU utilization! Alluxio’s Solution Engineer Tarik Bennett walks through a short end-to-end machine learning pipeline with Alluxio provisioned or mounted as a local folder for the PyTorch DataLoader.

Watch Now

Product Demo | Solving the Data Loading Challenge for Machine Learning with Alluxio

Alluxio’s Senior Solutions Engineer Roland Theron shares how Alluxio benefits model training workflows by reducing data loading times, allowing for better utilization of your compute resources.

Watch Now

AI Infra Day Recap

Thank you for joining us for AI Infra Day 2023! Check out insightful presentations from Uber, Meta, Intel and Alluxio on the challenges and various approaches to building scalable AI infrastructure.

Watch On-demand

Mini Videos

Words From Alluxio’s CEO Haoyuan Li

Alluxio’s Founder and CEO Haoyuan Li reveals how Alluxio Enterprise AI fits into the journey of Alluxio and what the future holds for Alluxio.

What’s New in Alluxio Enterprise AI

Alluxio’s Director of Product Management Adit Madan shares product and feature highlights of Alluxio Enterprise AI, including enhancements in performance and resource efficiency with POSIX API.

Customer Success at Alluxio

Alluxio’s SVP of Customer Success Omid Razavi describes what customer success looks like at Alluxio and how we continuously work with our customers to ensure their success.

DORA: Alluxio’s Next-Gen Architecture for AI

Alluxio’s Founding Engineer & VP of Open Source Bin Fan and Architect Beinan Wang introduce Alluxio’s new architecture and the driving force behind revamping it.

We release new videos every two weeks. Subscribe to our channel and stay tuned!

GOOD READS

ITOpsTimes | GPUs Are Fast, I/O is Your Bottleneck

Modern GPUs enable faster model training. However, I/O speed is an often-overlooked bottleneck. If data cannot be fed to the GPU at a rate that matches its compute throughput, GPU cycles are wasted waiting for data. This article raises awareness of I/O bottlenecks and provides architectural guidance for optimizing the data pipeline to maximize GPU value.

Read Now
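
To make the bottleneck concrete, here is a back-of-the-envelope sketch in Python. It assumes a deliberately simplified steady-state model in which the GPU can only work as fast as data arrives; the function name and the example rates are hypothetical and are not taken from the article.

```python
def gpu_utilization(load_rate: float, consume_rate: float) -> float:
    """Estimate the fraction of time a GPU stays busy.

    load_rate:    samples/second the data pipeline can deliver (assumed figure).
    consume_rate: samples/second the GPU could process if it were never starved.

    Simplifying assumption: in steady state the GPU is limited by whichever
    rate is lower, so utilization is load_rate / consume_rate, capped at 1.0.
    """
    if load_rate <= 0 or consume_rate <= 0:
        raise ValueError("rates must be positive")
    return min(1.0, load_rate / consume_rate)


if __name__ == "__main__":
    # Hypothetical numbers: the pipeline delivers 300 samples/s,
    # while the GPU could consume 1000 samples/s.
    print(f"{gpu_utilization(300, 1000):.0%}")
```

Under this model, any speedup in data loading translates directly into higher GPU utilization until the loader catches up with the GPU. Real pipelines with prefetching and overlap will behave less linearly.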

A Deep Dive into Caching in Presto

Caching avoids expensive disk or network trips to refetch data by storing frequently accessed data in memory or on fast local storage, speeding up overall query execution. In this article, we provide a deep dive into Presto’s caching mechanisms and how you can use them to boost query speeds and reduce costs.

Read Now
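
As a rough illustration of the general idea, the sketch below shows a small read-through cache with least-recently-used eviction in Python. It is not Presto's implementation; the class name, the `fetch` callback, and the capacity are assumptions made up for this example.

```python
from collections import OrderedDict
from typing import Callable


class ReadThroughCache:
    """Keep recently used values in memory; call `fetch` only on a miss."""

    def __init__(self, fetch: Callable[[str], bytes], capacity: int = 2):
        self._fetch = fetch          # hypothetical slow read (disk or network)
        self._capacity = capacity    # maximum number of cached entries
        self._items = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> bytes:
        if key in self._items:
            # Cache hit: mark the entry as most recently used.
            self._items.move_to_end(key)
            self.hits += 1
            return self._items[key]
        # Cache miss: pay for the slow read once, then remember the result.
        self.misses += 1
        value = self._fetch(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            # Evict the least recently used entry.
            self._items.popitem(last=False)
        return value
```

Repeated reads of the same key are served from memory, so the cost of the slow read is paid only when a key is new or has been evicted.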

Upcoming Events

Ray Meetup @ PingCAP | Wednesday, November 1, 5:30 PM @ Sunnyvale

Beinan Wang, Architect of Alluxio, will be speaking at the upcoming in-person Ray meetup in Sunnyvale on Wednesday, November 1st. He will share how Alluxio distributed caching can integrate with Ray in single-region/cloud and multi-region/cloud scenarios.

Register Now

Past Events On-Demand

Webinar On-Demand | Efficient Data Loading for Model Training on AWS

In this webinar, Greg Palmer discusses best practices for efficient data loading during model training on AWS. He demonstrates how to use Alluxio on EKS as a distributed cache to accelerate PyTorch training jobs that read datasets from S3. This architecture raises GPU utilization from 30% to 90%+, achieves ~5x faster training, and lowers cloud storage costs.

Watch Now

Got a tech question for the Alluxio Community? Chat with us on Slack!

Be our stargazers on GitHub ⭐

If you like our product, please give it a star on GitHub, and share the goodness!

WHITEPAPERS

“Zero-Copy” Hybrid Bursting with no App Changes

Alluxio Architecture and Data Flow

Evaluating Apache Spark and Alluxio for Data Analytics: Benchmarking Recommendations and Results

Spark with Alluxio Overview – Pair Spark with Alluxio to Modernize Your Data Platform

Presto with Alluxio Overview – Architecture Evolution for Interactive Queries

Accelerating Machine Learning / Deep Learning in the Cloud: Architecture and Benchmark

HOT JOBS

We currently have 30+ opportunities across the globe! Learn more about our job openings on the Customer Success, Sales, Product, and Engineering teams. Interested, or know someone to refer? Check out the full list of opportunities and apply here.

Senior Account Support Engineer (San Mateo, California)

Senior Solutions Engineer (San Mateo, California)

Senior Account Executive (San Mateo, California)

Software Engineering Manager (San Mateo, California)

Slack is our main hub for technical support as you use Alluxio and for staying up to date with our latest news and events.

Join Slack

We host monthly in-person and online events. Come meet the Alluxio team and join technical discussions with data and AI/ML enthusiasts.

See upcoming events

We welcome you to contribute to the Alluxio Open Source project!

Contribute

Sign up for a Live Demo or Book a Meeting with a Solutions Engineer

Request a demo