Scalable and Highly-available Distributed File System Metadata Service Using gRPC, RocksDB and RAFT
April 7, 2020
By 
Bin Fan

Alluxio (alluxio.io) is an open-source data orchestration system that provides a single namespace federating multiple external distributed storage systems. It is critical for Alluxio to be able to store and serve the metadata of all files and directories from all mounted external storage both at scale and at speed.

This talk shares our design, implementation, and optimization of Alluxio metadata service (master node) to address the scalability challenges. Particularly, we will focus on how to apply and combine techniques including tiered metadata storage (based on off-heap KV store RocksDB), fine-grained file system inode tree locking scheme, embedded state-replicate machine (based on RAFT), exploration and performance tuning in the correct RPC frameworks (thrift vs gRPC) and etc. As a result of the combined above techniques, Alluxio 2.0 is able to store at least 1 billion files with a significantly reduced memory requirement, serving 3000 workers and 30000 clients concurrently.

In this Office Hour, we will go over how to:

  • Metadata storage challenges
  • How to combine different open source technologies as building blocks
  • The design, implementation, and optimization of Alluxio metadata service
ALLUXIO COMMUNITY OFFICE HOUR

Alluxio (alluxio.io) is an open-source data orchestration system that provides a single namespace federating multiple external distributed storage systems. It is critical for Alluxio to be able to store and serve the metadata of all files and directories from all mounted external storage both at scale and at speed.

This talk shares our design, implementation, and optimization of Alluxio metadata service (master node) to address the scalability challenges. Particularly, we will focus on how to apply and combine techniques including tiered metadata storage (based on off-heap KV store RocksDB), fine-grained file system inode tree locking scheme, embedded state-replicate machine (based on RAFT), exploration and performance tuning in the correct RPC frameworks (thrift vs gRPC) and etc. As a result of the combined above techniques, Alluxio 2.0 is able to store at least 1 billion files with a significantly reduced memory requirement, serving 3000 workers and 30000 clients concurrently.

In this Office Hour, we will go over how to:

  • Metadata storage challenges
  • How to combine different open source technologies as building blocks
  • The design, implementation, and optimization of Alluxio metadata service

Video:

Slides:

Scalable and High available Distributed File System Metadata Service Using gRPC, RocksDB and RAFT from Alluxio, Inc.

Complete the form below to access the full overview:

Videos

Sign-up for a Live Demo or Book a Meeting with a Solutions Engineer