Highlights
Mengyu Hu and Chengkun Jia, both from Zhihu’s data platform team, discuss their evolution from HDFS to Alluxio as a high-performance data access layer for LLM training and serving. Alluxio has accelerated model training by 2x to 3x, increased GPU utilization to 90%, and enabled model deployment every minute instead of hours or days.
Alluxio Enterprise Release | 2.10 is here!
We are thrilled to announce the release of Alluxio Enterprise 2.10! We have made significant progress in improving high availability and reducing resource consumption. You can now scale Alluxio with improved speed at lower cost.
Great Things with Great Tech Podcast | Fast and Efficient Hybrid Data Access with Alluxio
Listen to this latest podcast to learn how the field of analytics and AI has been changing, the key challenges data platforms face, and the different approaches to addressing their needs.
TDWI article | Executive Q&A on Controlling Cloud Egress Costs
As enterprises move to the cloud, many are getting sticker shock from out-of-control cloud costs. This Q&A covers several best practices that can help reduce egress costs as businesses scale their cloud usage and continue to evolve their data platform architecture.
Mini Video Series
We have new videos releasing every 2 weeks. Subscribe to our channel and stay tuned!
Getting Started with Alluxio on Kubernetes
Getting Started with Alluxio on Kubernetes is complete! Learn about the architecture, deployment and best practices of Alluxio on Kubernetes with Shawn Sun, Software Engineer at Alluxio.
- Part I | Alluxio on Kubernetes Architecture
- Part II | Deploy Alluxio on Kubernetes
- Part III | Alluxio on Kubernetes Best Practice
Expedia Group’s User Journey with Alluxio
We have the latest mini video series coming out! Explore Expedia Group’s data landscape, see why data replication was not the right solution, and learn how Expedia Group reduced egress costs by unifying cross-region access in the cloud.
- Part I | Explore Expedia Group’s Data Landscape
Past events on-demand
On-demand Webinar | Maximize GPU Utilization for Model Training
When training models on ultra-large datasets, one of the biggest challenges is low GPU utilization. These powerful processors are often underutilized due to inefficient I/O and data access. This mismatch between computation and storage leads to wasted GPU resources, low performance, and high cloud storage costs. The rise of generative AI and the scarcity of GPUs are only making this problem worse.
In this webinar, Tarik and Beinan discuss strategies for turning idle GPUs into fully utilized ones, with a focus on cost-effective management of ultra-large datasets for AI and analytics.
June was a month packed full of talks! Take a look at what our team has been up to:
- Presto Con Day
- Speeding Up Presto in ByteDance – Shengxuan Liu, ByteDance & Beinan Wang, Alluxio
- Presto on ARM – Chunxu Tang & Jiaming Mai, Alluxio
- Trino Fest | Trino Optimization With Distributed Caching on Data Lake – Beinan Wang & Hope Wang, Alluxio
- Data + AI Summit | Data Caching Strategies for Data Analytics and AI – Beinan Wang & Chunxu Tang
Upcoming Events
[New Weekly Event] | Alluxio PR Power Hour
We have a new WEEKLY event called Alluxio PR Power Hour! Get live feedback on your GitHub PRs/Issues or join to learn about what others are working on. Every Thursday 8pm PDT // Friday 11am CST. You can find more details and past meeting notes here.
By David Loshin | President of Knowledge Integrity
In today’s competitive landscape, companies are eager to harness the power of AI for competitive advantage. However, efforts to access and use GPUs effectively often require extensive data engineering to manage data copies or specialized storage, which leads to out-of-control cloud and infrastructure costs. Join us for this TDWI webinar to learn more about the infrastructure hurdles associated with AI/ML model training and deployment and how to overcome these challenges. Topics include:
- The challenges of AI and model training
- GPU utilization in machine learning and the need for specialized hardware
- Managing data access and maintaining a source of truth in data lakes
- Best practices for optimizing ML training
Got a tech question for the Alluxio Community? Chat with us on Slack!
WHITEPAPERS
Be our stargazers on GitHub ⭐
If you like our product, please give it a star on GitHub, and share the goodness!
HOT JOBS
We currently have 30+ opportunities across the globe! Learn more about our job openings on the Customer Success, Sales, Product, and Engineering teams. Are you a great fit, or do you know someone to refer? Check out the full list of opportunities and apply here.
Senior Account Support Engineer (San Mateo, California)
Senior Solutions Engineer (San Mateo, California)
Senior Account Executive (San Mateo, California)
Software Engineering Manager (San Mateo, California)