Enabling Data Location Awareness for Optimized Performance and Lower Cost With Alluxio Tiered Locality

March 12, 2019

Caching frequently used data in memory is not a new computing technique, however it is a concept that Alluxio has taken to the next level with the ability to aggregate data from multiple storage systems in a unified pool of memory. Alluxio capabilities extend further to intelligently managing the data within that virtual data layer. Tiered locality uses awareness of network topology and configurable policies to manage data placement for performance and cost optimizations. This feature is particularly useful with cloud deployments across multiple availability zones. It can also be useful for cost savings in environments where cross-zone or cross-location traffic is more expensive than intra-zone data traffic.

Here is a simple scenario where Alluxio will use network topology information to prefer more local reads and writes with Alluxio workers in two different AWS Availability Zones.

Using this setup with m5.xlarge EC2 instances, the application gets different read performance depending on which worker the data is read from.

Unsurprisingly, performance is fastest when reading from the local Alluxio worker and slows when read from a non-local worker. The performance difference between worker 2 and worker 3 is due to the difference in bandwidth between Availability Zones (AZs). Worker 2 is in the same AZ as the application with about 10 gigabits per second of bandwidth. Reading from worker 3 is slower because the bandwidth across AZs is only about 5 gigabits per second.

Without tiered locality, the application is just as likely to read from worker 2 as worker 3 if both workers have the data cached. Configuring tiered locality gives a preference for worker 2 and faster performance. The situation is similar for writes. When applications write data through Alluxio, they will prefer to write to more-local workers for improved performance.

Configuration

To enabled tiered locality, and the associated performance benefit, every actor (clients and workers) must be configured to know its tiered identity. Tiered identity is a mapping from locality tier (e.g. Availability Zone) to the value for that tier (e.g. us-east-1a). For the above cluster setup example, the tiered identities would be:

ActorTiered IdentityApplication"node=worker1,az=us-east-1a"Alluxio Worker 1"node=worker1,az=us-east-1a"Alluxio Worker 2"node=worker2,az=us-east-1a"Alluxio Worker 3"node=worker3,az=us-east-1b"Alluxio Worker 4"node=worker4,az=us-east-1b"

Configure with alluxio-site.properties

The simplest way to configure tiered locality is to use alluxio-site.properties to set it like any other. Please refer to: https://www.alluxio.org/docs/1.8/en/Configuration-Settings.html?q=configuration

Properties for Application, Worker 1, and Worker 2

alluxio-site.properties
alluxio.locality.az="us-east-1a"alluxio.locality.order="node,az" # custom locality hierarchy

Properties for Worker 3 and Worker 4

alluxio-site.properties
alluxio.locality.az="us-east-1b"alluxio.locality.order="node,az" # custom locality hierarchy

We set "alluxio.locality.order" to introduce the "az" locality tier and show its order in the locality hierarchy. By default the locality tiers are "node,rack". Note that we don't need to explicitly configure "node" identity because it is determined automatically via localhost lookup.

Configure with alluxio-locality.sh

When the cluster is set up automatically, or there are many workers, it can be convenient to set the locality information via script instead of using a static value in alluxio-site.properties. If a script exists at ${ALLUXIO_HOME}/conf/alluxio-locality.sh, it will be executed to determine tiered identity.

alluxio-locality.sh
#!/bin/bashecho "az=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)"

The same script can be set on every node to look up and report the availability zone. The script name can be configured by setting the alluxio.locality.script property key.

Custom Tiers

In this example we used the tiers "node" and "az", but the tier configuration is fully customizable, so you can use whatever tiers make sense for your deployment, e.g. "rack", "zone", "region", etc. Just make sure each region is contained within the next in "alluxio.locality.order", since locality decisions prefer to match in the earliest tier possible.

Interaction with Location Policies

In addition to tiered locality, Alluxio has a concept of using a location policy during reads and writes to help select the worker to use. The policies interact with tiered locality in different ways.

LocalFirstPolicy: Choose a worker that matches in the most local tier possible
LocalFirstAvoidEvictionPolicy: Like LocalFirstPolicy, but avoiding eviction is given a higher priority than locality.
MostAvailableFirstPolicy: Unaffected by locality information
RoundRobinPolicy: Unaffected by locality information
SpecificHostPolicy: Unaffected by locality information

Cluster Partitioning with Strict Locality

Alluxio Enterprise Edition adds an additional capability: strict locality tiers. Users can define a tier to be "strict", meaning that no traffic is allowed between actors that don't match in the tier. This will partition the cluster into zones which handle data independently, sharing only metadata. For the availability zone example, this would mean that applications in us-east-1a will only communicate with workers in us-east-1a. Even if data is cached on a worker in us-east-1b, an application in us-east-1a will prefer to read the data from the UFS, caching it to a worker in us-east-1a.

Strict locality is necessary when cross-zone traffic is impossible, e.g. to comply with security or other data policies.

To configure strict locality for the "az" tier in enterprise edition, set

alluxio.locality.az.strict=true

Conclusion

When clusters have non-uniform networking capabilities, it makes sense to configure Alluxio tiered locality to improve performance. It can also save on cost in environments where cross-zone network transfer is more expensive than intra-zone data transfer.

Share this post

Blog

New Features in Alluxio Enterprise AI 3.5

With the new year comes new features in Alluxio Enterprise AI! Just weeks into 2025 and we are already bringing you exciting new features to better manage, scale, and secure your AI data with Alluxio. From advanced cache management and improved write performance to our Python SDK and S3 API enhancements, our latest release of Alluxio Enterprise AI delivers more power and performance to your AI workloads. Without further ado, let’s dig into the details.

‍

Alluxio Enterprise for Data Analytics Scales to New Heights

We are thrilled to announce the general availability of Alluxio Enterprise for Data Analytics 3.2! With data volumes continuing to grow at exponential rates, data platform teams face challenges in maintaining query performance, managing infrastructure costs, and ensuring scalability. This latest version of Alluxio addresses these challenges head-on with groundbreaking improvements in scalability, performance, and cost-efficiency.

Introducing Rapid Alluxio Deployer On AWS: Experience The Benefits Of Alluxio Enterprise AI In A Few Clicks

We’re excited to introduce Rapid Alluxio Deployer (RAD) on AWS, which allows you to experience the performance benefits of Alluxio in less than 30 minutes. RAD is designed with a split-plane architecture, which ensures that your data remains secure within your AWS environment, giving you peace of mind while leveraging Alluxio’s capabilities.

Sign-up for a Live Demo or Book a Meeting with a Solutions Engineer

Request a demo