What's New in Alluxio 2.9: Multi-Alluxio Synchronization, Kubernetes Operator and Flexible S3 Access Control
November 16, 2022
By Adit Madan

Today, we are thrilled to announce that Alluxio 2.9 is generally available (GA) for both the free, open source Alluxio Community Edition and Alluxio Enterprise Edition! With GA, you can expect stability, support, and enterprise-readiness from Alluxio. In this blog post, we explore how Alluxio is enabling growth and agility for analytics and AI applications at the world’s leading companies, often running across regions, compute engines, and storage systems.

Alluxio 2.9 supports a scale-out, multi-tenant architecture through a new cross-cluster synchronization feature, improves manageability with significantly better tooling and guidelines for deploying Alluxio on Kubernetes, and strengthens security and performance in the S3 API.

Alluxio enables a compute- and storage-agnostic multi-cloud data platform. Alluxio can be used with Spark, Presto, Trino, PyTorch, and TensorFlow, among others, on various cloud platforms such as AWS, GCP, and Azure, and also on Kubernetes across private data centers or public clouds.

Alluxio Community Edition Highlights

The following features are included in both the Alluxio Community and Enterprise editions.

Master Health Status

The Alluxio master now periodically checks a combination of resource usage, including CPU and memory usage, and several performance-critical internal data structures to infer the overall state of the system. The possible statuses, which can be retrieved by inspecting the master.system.status metric, are:

  • IDLE
  • ACTIVE
  • STRESSED
  • OVERLOADED

To get started, view the documentation for more information about this monitoring heuristic.
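As a rough illustration of how the four statuses above might feed an alerting rule, here is a minimal Python sketch. It assumes the value of master.system.status reaches your monitoring code as one of the status names in text form, and that the order listed above runs from least to most loaded; neither assumption comes from the Alluxio documentation, and the helper names (MasterStatus, parse_status, should_alert) are hypothetical.

```python
from enum import IntEnum


class MasterStatus(IntEnum):
    """The four statuses named in the post.

    Assumption: the listed order reflects increasing load.
    """
    IDLE = 0
    ACTIVE = 1
    STRESSED = 2
    OVERLOADED = 3


def parse_status(raw: str) -> MasterStatus:
    """Convert a status name (assumed to arrive as text) into a MasterStatus.

    Raises KeyError for a name outside the four known statuses.
    """
    return MasterStatus[raw.strip().upper()]


def should_alert(raw: str, threshold: MasterStatus = MasterStatus.STRESSED) -> bool:
    """Return True when the reported status is at or above the threshold."""
    return parse_status(raw) >= threshold


if __name__ == "__main__":
    for name in ("IDLE", "ACTIVE", "STRESSED", "OVERLOADED"):
        print(name, should_alert(name))
```

How the metric value is actually fetched and encoded depends on your metrics setup, so treat the string input as a placeholder for whatever your monitoring system returns.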

Paging Storage on Workers (Experimental)

The new release supports a fine-grained, page-level (e.g., 1MB) storage representation for caching on Alluxio workers, as an alternative to the existing block-based (e.g., 64MB) storage.

This feature promises to improve caching efficiency and performance by reducing read amplification, that is, the extra data fetched from the underlying storage when applications access it for the first time.

To get started, view the documentation here.
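To make the read amplification argument concrete, the following Python sketch compares the two example unit sizes from this section under a deliberately simplified model. It assumes a cold cache, that every cache unit overlapping a read is fetched whole from the underlying storage, and that the file is at least as large as the unit. This is an illustration only, not a description of how Alluxio workers actually schedule reads, and the function names are hypothetical.

```python
MB = 1024 * 1024


def bytes_fetched(offset: int, length: int, unit_size: int) -> int:
    """Bytes loaded when every cache unit overlapping the read is fetched whole."""
    if length <= 0:
        raise ValueError("length must be positive")
    first_unit = offset // unit_size
    last_unit = (offset + length - 1) // unit_size
    return (last_unit - first_unit + 1) * unit_size


def amplification(offset: int, length: int, unit_size: int) -> float:
    """Ratio of bytes fetched from storage to bytes the application asked for."""
    return bytes_fetched(offset, length, unit_size) / length


if __name__ == "__main__":
    # Hypothetical workload: a single 4 KB read at a 10 MB offset.
    offset, length = 10 * MB, 4 * 1024
    for label, unit in (("64MB block", 64 * MB), ("1MB page", 1 * MB)):
        print(label, bytes_fetched(offset, length, unit),
              amplification(offset, length, unit))
```

Under these assumptions, the 4 KB read pulls 64MB with block-based storage (an amplification factor of 16384) but only 1MB with page-based storage (a factor of 256).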

Alluxio Enterprise Edition Highlights

The following features are part of the Alluxio Enterprise Edition only.

Multi-Cluster Synchronization

Tenant isolation rigorously prevents different teams from competing for access to shared data lake storage. With the new cross-cluster synchronization feature, Alluxio 2.9 improves scalability when deploying multiple Alluxio clusters across tenants in Kubernetes or across environments.

Federation of multiple Alluxio clusters makes one instance of Alluxio aware of another by actively synchronizing metadata with a stream of update events. This feature is particularly useful when adopting a satellite architecture with data producers updating data lake storage with isolation from data consumers.

To get started, view the documentation here.

Manageability with a New Kubernetes Operator

Running Alluxio on Kubernetes helps standardize deployment methodologies to make the data stack portable to any environment. This new release introduces an Alluxio Operator, which simplifies deploying and managing multiple Alluxio clusters.

Administrators can now deploy and manage Alluxio using a CRD (Custom Resource Definition). Using the Alluxio Operator reduces the burden of managing multiple instances of Alluxio.

To get started, view the documentation here.

Enhanced S3 API Security

Authentication and access control policies can be centrally managed using a unified namespace via the Alluxio S3 API to provide a unified security experience across heterogeneous storage, whether on-premises or in the cloud.

By adopting the open authentication protocol for the S3 API, Alluxio verifies user identities before their requests are processed. This new feature allows connections to identity management systems, such as PingFederate, and lets deployments leverage Single Sign-On (SSO).

To get started, view the documentation here.

If you’d like to speak with a solutions engineer to learn more about the latest in Alluxio 2.9, you can book a meeting directly here.

More Info

For an exhaustive list of major features and bug fixes of Alluxio 2.9, please refer to the Community Edition release notes and Enterprise Edition release notes.  

Free downloads of the Alluxio 2.9 open source Community Edition and trials of Alluxio Enterprise Edition are immediately available here: https://www.alluxio.io/download/. Join 9000+ members in our community Slack channel to ask questions and provide feedback.
