On September 13th, we held our first New York City Alluxio Meetup! Work-Bench generously hosted the meetup in Manhattan. This was the first US Alluxio meetup outside of the Bay Area, so it was extremely exciting to meet Alluxio enthusiasts on the East Coast!

The meetup focused on users running Alluxio with different applications, such as Hive and Presto. As an introduction, Haoyuan Li (creator and founder of Alluxio) and Bin Fan (founding engineer of Alluxio) gave an overview of Alluxio and the features and enhancements in the new v1.8.0 release.

Next, Tao Huang and Bing Bai from JD.com, one of the largest e-commerce companies in China, shared how they have been running Presto and Alluxio in production for almost a year. Their big data platform runs Alluxio on over 100 machines and can achieve speedups of over 10x. They also discussed their open source contributions to the Alluxio community and their plans for future work.

Thai Bui from Bazaarvoice, a digital marketing company in Texas, presented how they effectively cache S3 data with Alluxio for Hive queries. By using Alluxio to serve their S3 data, they saw 5x-10x speedups in their Hive queries.

The talk slides are online:
- Alluxio: An overview and what's new in 1.8 (Haoyuan Li, Bin Fan)
- Using Alluxio as a fault-tolerant pluggable optimization component of JD.com's compute frameworks (Tao Huang and Bing Bai)
- Hybrid collaborative tiered-storage with Alluxio (Thai Bui)
We had a great time learning more about Alluxio use cases and interacting with Alluxio users on the East Coast! We look forward to holding another NYC Alluxio meetup!