Announcing Alluxio Data Orchestration Hub

November 3, 2020

Adit Madan

We’re pleased to announce the general availability of Alluxio Data Orchestration Hub, your single pane of glass to orchestrate data for analytics and AI. The data ecosystem is complex with the separation of storage and compute across data centers and cloud providers. With this release we’ve made great strides towards simplifying data access and management across multiple environments.

Data Orchestration Hub, or the Hub, is a management console that makes it easy to manage an analytics cluster and connect it with multiple data sources to unify data lakes. The service provides an easy to use unified management view for configuration and monitoring, and wizard based curation of deployment workflows.

Connect Your Data Sources: Connect Alluxio to data storage and catalogs across multiple clouds, single cloud or on-premises using guided wizards.
Monitor Your Alluxio Cluster: Monitor your Alluxio cluster.
Manage Configuration: Set and distribute configuration for a cluster.

Alluxio Data Orchestration Hub is available immediately for all Alluxio deployment scenarios with compute engines like Presto, Spark and Tensorflow. The Hub is ready to use out of the box with Amazon EMR and Google Dataproc. Other platforms are also available for use. Please visit the documentation here for more information to try out the Hub.

When to Use

Connecting to data sources across regions

The Hub provides self-guided wizards to allow users to connect to data sources and catalogs in the same or remote data centers. A user is guided through the required configuration steps along with validation of the connection.

These wizards are applicable for multiple scenarios including: hybrid cloud, cross-data center, single cloud or private data center deployments. Manage your compute clusters with Alluxio using these easy-to-use wizards.

*Connect Alluxio to all your data sources across multiple clouds, single cloud or on-premises using self-guided wizards.*

Managing an Alluxio cluster

The Hub can be used to view a dashboard to monitor the state of processes on the cluster, as well as update configuration and restart processes. This is especially useful for cloud deployments without access to SSH for configuration and process management.

*Monitor the status of an Alluxio cluster anywhere. You can start or stop cluster components from an intuitive UI.*

What’s Next

To start using Alluxio Data Orchestration Hub, simply launch Alluxio enabled clusters in your on-premises or cloud deployment. Further changes and monitoring of the cluster is managed can now be managed using the Hub:

Process Management: Monitor status of each process part of the Alluxio cluster, and start / stop processes.
Connect Data Storage: Connect Alluxio to your data sources, such as HDFS / S3 / GCS, across a hybrid cloud, single cloud or on-premises.
Connect Data Catalog: Configure structured data catalogs for OLAP engines like Presto on Alluxio. Connect to existing catalog definitions to prevent re-definition of table metadata.
Advanced Configuration: Customize your Alluxio cluster with advanced options for setting and distributing configuration from the central console.

If you would like more information on Data Orchestration Hub and the supported toolset please read the release notes.

Have questions? Come join the Community Slack Channel.

Read the Alluxio 2.4 release product blog to learn more about the expanded features and capabilities to advance analytics and AI in the cloud.

Share this post

Blog

Uptycs Chooses Alluxio to Power GenAI Natural Language Analytics at Terabyte Scale

Suresh Kumar Veerapathiran and Anudeep Kumar, engineering leaders at Uptycs, recently shared their experience of evolving their data platform and analytics architecture to power analytics through a generative AI interface. In their post on Medium titled Cache Me If You Can: Building a Lightning-Fast Analytics Cache at Terabyte Scale, Veerapathiran and Kumar provide detailed insights into the challenges they faced (and how they solved them) scaling their analytics solution that collects and reports on terabytes of telemetry data per day as part of Uptycs Cloud-Native Application Protection Platform (CNAPP) solutions.

AI/ML Infra Meetup at Uber Seattle: Tackling Scalability Challenges of AI Platforms

Insights from from Uber, Snap, and Alluxio on LLM training, fine-tuning, deployment, designing scalable architectures, GPU optimization, and building recommendations systems.

New Features in Alluxio Enterprise AI 3.5

With the new year comes new features in Alluxio Enterprise AI! Just weeks into 2025 and we are already bringing you exciting new features to better manage, scale, and secure your AI data with Alluxio. From advanced cache management and improved write performance to our Python SDK and S3 API enhancements, our latest release of Alluxio Enterprise AI delivers more power and performance to your AI workloads. Without further ado, let’s dig into the details.

‍

Sign-up for a Live Demo or Book a Meeting with a Solutions Engineer

Request a demo