What's New in Alluxio 2.4

October 21, 2020

Lu Qiu

We are extremely excited to announce the release of Alluxio 2.4.0!

Alluxio 2.4.0 focuses on features critical to large-scale production deployments in cloud and hybrid cloud environments. Enterprises leverage Alluxio at enormous scale in many dimensions, including number of files, total volume of data, requests per second, and number of concurrent clients.

Downloads can be found here. Join thousands of members in our Slack channel to ask any questions and provide your feedback! Thank you to everyone who contributed to this release!

Meeting the Scale of Hybrid Cloud

Alluxio’s ability to enable zero-copy bursting of compute to the cloud allows organizations to successfully adopt the cloud. Alluxio 2.4 aims to address some of the common challenges of scaling up these deployments.

Features such as highly scalable metadata journaling, aggregate cluster metrics monitoring, and automated detection of JVM pauses further improve Alluxio’s suitability for demanding workloads. DevOps tools are also key for triaging issues when they occur, so Alluxio 2.4 further improves the cluster-wide log collection framework. Finally, Alluxio is continually expanding its state-of-the-art integrations with frameworks and storage systems. Alluxio 2.4 introduces and improves integrations with Kubernetes, Azure Data Lake Storage, and Apache Ozone. Alluxio 2.4 is also the first Alluxio release to support Java 11.

Expanded Metadata Service

At the core of the Alluxio Data Orchestration Platform is a metadata service: a scalable, distributed service for managing data across multiple sources, such as traditional on-premises Hadoop-based data lakes and modern cloud-based data lakes. Because this service is used to unify data lakes at enormous scale, both in data size and number of files, Alluxio has expanded it to support billions of files while removing third-party system dependencies. By breaking away from dependencies on traditional Hadoop components, Alluxio has bolstered its support for cloud-native and container-based deployments. Lifecycle management of Alluxio’s metadata service now also supports automatic backups that do not impact the live system, further reducing platform management overhead.

Cloud-native deployment

Spawning analytics clusters in AWS and GCP is now easier than ever. Using Terraform, Alluxio now makes it easy to launch pre-configured clusters programmatically with a single command. Alluxio has also been featured as a recommended data lake partner for Google Cloud's data lake modernization solution, including the ability to launch an Alluxio-enabled cluster using the Dataproc component exchange console.

Simplified DevOps and system monitoring

Alluxio 2.4 adds several system enhancements to simplify and improve cluster management and maintenance. The system provides an aggregated cluster view of key performance metrics, such as I/O throughput and metadata request rate, through the UI and programmatic monitoring endpoints. Internal monitoring for failures and system slowdowns has been added, further improving the operator's view of the health and performance of the system.

Support for Java 11

Java 11 is the latest long-term support version of Java. Alluxio 2.4 provides compatibility with Java 11 while maintaining support for Java 8. Users looking to move their compute engines or Alluxio systems to Java 11 can now do so without any concerns.

More Info

You can find more information in the 2.4.0 official release notes.
Have questions? Come join the Community Slack Channel.

Read the latest product blog on the new Data Orchestration Hub to learn about how the management console makes it easy to manage an analytics cluster and connect it with multiple data sources to unify data lakes.
