New Alluxio Enterprise AI Innovations Accelerate GPUs Anywhere with 97%+ GPU Utilization

July 9, 2024

Features new native integration with Python ecosystem and expanded cache management

SAN MATEO, CA – July 9, 2024 – Alluxio, the developer of the open-source data platform, today announced the immediate availability of the latest enhancements in Alluxio Enterprise AI. Version 3.2 demonstrates the platform's ability to use GPU resources wherever they are available, improved I/O performance, and end-to-end performance competitive with HPC storage. It also introduces a new Python interface and more sophisticated cache management features. These advancements help organizations make full use of their AI infrastructure, with a focus on performance, cost-effectiveness, flexibility and manageability.

AI workloads face several challenges, including the mismatch between data access speed and GPU computation, which leads to underutilized GPUs due to slow data loading in frameworks like Ray, PyTorch and TensorFlow. Alluxio Enterprise AI 3.2 addresses this by enhancing I/O performance and achieving over 97% GPU utilization. Additionally, while HPC storage provides good performance, it demands significant infrastructure investments. Alluxio Enterprise AI 3.2 offers comparable performance using existing data lakes, eliminating the need for extra HPC storage. Lastly, managing complex integrations between compute and storage is challenging, but the new release simplifies this with a Pythonic filesystem interface, supporting POSIX, S3, and Python, making it easily adoptable by different teams.
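To make the relationship between data loading and GPU utilization concrete, the following is a minimal back-of-the-envelope sketch, not Alluxio's benchmark methodology. It assumes a simplified model in which data loading is not overlapped with computation, so every second spent waiting on I/O is a second the GPU sits idle. The function names and the timings in the usage note are hypothetical.

```python
def gpu_utilization(compute_s: float, io_wait_s: float) -> float:
    """Fraction of a training step the GPU spends computing.

    Simplifying assumption: I/O wait is fully serialized with compute.
    """
    if compute_s < 0 or io_wait_s < 0:
        raise ValueError("times must be non-negative")
    total = compute_s + io_wait_s
    if total == 0:
        raise ValueError("step time must be positive")
    return compute_s / total


def max_io_wait_for_target(compute_s: float, target: float) -> float:
    """Largest I/O wait per step that still reaches the target utilization.

    Solves compute / (compute + wait) >= target for wait.
    """
    if not 0 < target <= 1:
        raise ValueError("target must be in (0, 1]")
    return compute_s * (1 - target) / target
```

Under this model, a hypothetical step with 0.97 s of compute reaches 97% utilization only if the GPU waits on data for no more than about 0.03 s per step, which is why faster data access translates directly into higher utilization.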

"At Alluxio, our vision is to serve data to all data-driven applications, including the most cutting-edge AI applications," said Haoyuan Li, Founder and CEO, Alluxio. "With our latest Enterprise AI product, we take a significant leap forward in empowering organizations to harness the full potential of their data and AI investments. We are committed to providing cutting-edge solutions that address the evolving challenges in the AI landscape, ensuring our customers stay ahead of the curve and unlock the true value of their data."

Alluxio Enterprise AI includes the following key features:

Leverage GPUs Anywhere for Speed and Agility - Alluxio Enterprise AI 3.2 empowers organizations to run AI workloads wherever GPUs are available, ideal for hybrid and multi-cloud environments. Its intelligent caching and data management bring data closer to GPUs, ensuring efficient utilization even with remote data. The unified namespace simplifies access across storage systems, enabling seamless AI execution in diverse and distributed environments, allowing for scalable AI platforms without data locality constraints.
Comparable Performance to HPC Storage - MLPerf benchmarks show Alluxio Enterprise AI 3.2 matches HPC storage performance, utilizing existing data lake resources. In tests like BERT and 3D U-Net, Alluxio delivers comparable model training performance on various A100 GPU configurations, proving its scalability and efficiency in real production environments without needing additional HPC storage infrastructure.
Higher I/O Performance and 97%+ GPU Utilization - Alluxio Enterprise AI 3.2 enhances I/O performance, achieving up to 10 GB/s throughput and 200K IOPS with a single client, scaling to hundreds of clients. This performance fully saturates 8 A100 GPUs on a single node, showing over 97% GPU utilization in large language model training benchmarks. New checkpoint read/write support optimizes the training of recommendation engines and large language models, preventing G
New Filesystem API for Python Applications - Version 3.2 introduces the Alluxio Python FileSystem API, an FSSpec implementation, enabling seamless integration with Python applications. This expands Alluxio's interoperability within the Python ecosystem, allowing frameworks like Ray to easily access local and remote storage systems.
Advanced Cache Management for Efficiency and Control - The 3.2 release offers advanced cache management features, providing administrators precise control over data. A new RESTful API facilitates seamless cache management, while an intelligent cache filter optimizes disk usage by caching hot data selectively. The cache free command offers granular control, improving cache efficiency, reducing costs, and enhancing data management flexibility.
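For readers unfamiliar with FSSpec, the sketch below illustrates the general idea behind a filesystem-agnostic Python interface: application code is written once against an fsspec-style object and works with whichever backend is plugged in. This is an illustration under stated assumptions, not the Alluxio API itself. The helper `read_head` and the demo path are hypothetical, and the demo uses the in-memory backend that ships with the third-party `fsspec` package as a stand-in; the protocol name and connection options for the Alluxio implementation are not given in this release and should be taken from the Alluxio documentation.

```python
def read_head(fs, path, n=16):
    """Return the first n bytes of `path` through any fsspec-style filesystem.

    `fs` only needs an `open(path, mode)` method, which fsspec filesystems provide.
    """
    with fs.open(path, "rb") as f:
        return f.read(n)


def demo():
    """Write a small object to fsspec's in-memory backend and read it back."""
    # Imported here so the helper above stays usable without fsspec installed.
    import fsspec

    fs = fsspec.filesystem("memory")  # stand-in backend, not Alluxio
    with fs.open("/demo/sample.bin", "wb") as f:
        f.write(b"hello from a cache")
    return read_head(fs, "/demo/sample.bin", 5)
```

Because `read_head` depends only on the shared interface, swapping the in-memory backend for another FSSpec implementation would not require changes to the calling code.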

"The latest release of Alluxio Enterprise AI is a game-changer for our customers, delivering unparalleled performance, flexibility, and ease of use," said Adit Madan, Director of Product at Alluxio. "By achieving comparable performance to HPC storage and enabling GPU utilization anywhere, we're not just solving today's challenges – we're future-proofing AI workloads for the next generation of innovations. With the introduction of our Python FileSystem API, Alluxio empowers data scientists and AI engineers to focus on building groundbreaking models without worrying about data access bottlenecks or resource constraints."

“We have successfully deployed a secure and efficient data lake architecture built on Alluxio. This strategic initiative has significantly enhanced the performance of our compute engines and simplified data engineering workflows, making data processing and analysis seamless and more efficient,” said Hu Zhicheng, Data Architect at Geely (parent company of Volvo). “We are honored to collaborate with Alluxio in creating an industry-leading data and AI platform, driving the future of data-driven intelligent development.”

Availability

Alluxio Enterprise AI version 3.2 is immediately available for download here: https://www.alluxio.io/download/.

Supporting Resources

Download a trial version: https://www.alluxio.io/download/
Product announcement blog: https://www.alluxio.io/blog/whats-new-in-3-2/
Webinar registration link: https://us06web.zoom.us/webinar/register/WN_Hg7hQoBBTHObfbH8dTI3Hw#/registration
Documentation: https://docs.alluxio.io/ee-ai/user/stable/en/Overview.html
GPU utilization rate testing tool: https://www.alluxio.io/gpu-test-tool/

About Alluxio

Alluxio, a leading provider of the high performance data platform for analytics and AI, accelerates time-to-value of data and AI initiatives and maximizes infrastructure ROI. Uniquely positioned at the intersection of compute and storage systems, Alluxio has a universal view of workloads on the data platform across stages of a data pipeline. This enables Alluxio to provide high performance data access regardless of where the data resides, simplify data engineering, optimize GPU utilization, and reduce cloud and storage costs. With Alluxio, organizations can achieve orders of magnitude faster model training and serving without the need for specialized storage, and build AI infrastructure on existing data lakes. Backed by leading investors, Alluxio powers technology, internet, financial services, and telecom companies, including 9 out of the top 10 internet companies globally. To learn more, visit www.alluxio.io.

Media Contact:
Beth Winkowski
Winkowski Public Relations, LLC for Alluxio
978-649-7189
beth@alluxio.com
