Customers can improve performance and reduce cost with location-aware data management -- new object storage optimizations ease migration from HDFS to cloud
SAN MATEO, Calif., July 31, 2018 (GLOBE NEWSWIRE) -- Alluxio, developer of the world's first software system that unifies data at memory speed, today announced the release of Alluxio 1.8 to accelerate cloud adoption for analytics and machine learning workloads. Location-aware data management tools provide a wide range of policy-based control within hybrid cloud environments and within cloud availability zones. Optimizations for object storage, and each major cloud provider, close semantic differences with the Hadoop Distributed File System (HDFS) and ease application portability between cloud platforms. The new Filesystem in Userspace (FUSE) interface enables machine learning frameworks to access cloud data as if it were in a local file system.
Additionally, developers now have superior insight into data metrics within an Alluxio cluster as well as expanded visibility into the application and persistent storage layer. Increased metrics coverage within the Alluxio cluster, the application layer and underlying storage systems makes it easier to connect and manage multiple data sources. A new dashboard in the UI provides overall health and utilization metrics with accompanying command line interface (CLI) tools for live cluster statistics. All remote procedure call (RPC) requests are recorded, providing a detailed set of machine-consumable metrics with API-based statistics generation for third-party tools such as Grafana and Prometheus. Developers can quickly diagnose storage system performance issues with new tools that include latency histograms and capacity utilization. The application configuration checker provides a one-click integration check for third-party applications such as Hive, MapReduce, Spark, and more.
"Innovative companies are looking for new ways to interact with data in a complex ecosystem with a wide variety of application frameworks, heterogeneous storage systems, and hybrid cloud environments," said Haoyuan Li, co-founder and CEO of Alluxio. "Alluxio is innovating rapidly and leveraging the open source model to give developers new capabilities to extract value from their data and build new services as enterprises navigate the digital transformation."
Developing for cloud environments (public, private, and hybrid) is paramount for modern application deployment. New data management features in Alluxio include location-aware data capabilities so that companies can set policies, control data placement across availability zones, and simplify tiered data placement. These features help organizations ensure business continuity and also boost performance by putting data close to compute resources for cloud workloads. Department isolation can be enforced, and expensive data transfers between zones can be avoided via a single persistent data source. Alluxio can be deployed in a container and is certified with Kubernetes container orchestration.
This release includes optimizations that apply object storage semantics to the interfaces for major public cloud providers. A standard Amazon Web Services (AWS) S3 interface ensures broad support for independent storage vendors. Applications connect to Alluxio and can access data from cloud storage without code changes. The standard interface also ensures application portability across cloud vendors.
Object storage optimizations ease migration from expensive HDFS storage solutions to more cost-effective object storage. This also helps decouple compute and storage in Big Data deployments by making it easier to move live datasets to HDFS when required and to put the rest of the data in object storage. Performance optimizations include intelligent metadata services to speed up critical big data operations such as directory listing ('ls') and 'rename'. Alluxio implements POSIX-style security, including Access Control Lists (ACLs), to maintain compatibility with existing frameworks when moving to the cloud.
The new Filesystem in Userspace (FUSE) interface brings the power of Alluxio to data scientists and analysts without involving IT operations while providing improved insight and control for both developers and administrators. Data sources mount like a local filesystem, a key feature simplifying self-service data access and particularly useful for applications like TensorFlow.
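As a minimal sketch of what "mount like a local filesystem" means for an application, the Python below lists and reads files using only standard file APIs. The mount path /mnt/alluxio-fuse and the helper names are hypothetical and used for illustration only; nothing here is Alluxio-specific, which is the point: once a FUSE mount exists, ordinary file I/O is all an application needs.

```python
import tempfile
from pathlib import Path

# Hypothetical mount point; the real path depends on how the FUSE mount is set up.
EXAMPLE_MOUNT = "/mnt/alluxio-fuse"


def list_files(mount_point, suffix=".csv"):
    """Return sorted paths of regular files under mount_point ending in suffix."""
    root = Path(mount_point)
    return sorted(p for p in root.rglob("*") if p.is_file() and p.name.endswith(suffix))


def read_first_line(path):
    """Read the first line of a file using plain local-file I/O."""
    with open(path, "r", encoding="utf-8") as f:
        return f.readline().rstrip("\n")


if __name__ == "__main__":
    if Path(EXAMPLE_MOUNT).is_dir():
        for path in list_files(EXAMPLE_MOUNT):
            print(path, "->", read_first_line(path))
    else:
        print("No mount found at", EXAMPLE_MOUNT, "(path is an illustrative assumption)")
```

A framework that accepts local file paths could be pointed at the same directory in the same way, without a storage-specific client library.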
To help developers and data scientists take advantage of Alluxio faster, the latest release of Alluxio includes a new starter kit. The kit includes:
- Alluxio pre-built binaries
- How-to guide: Install Alluxio on a local machine
- How-to guide: Install, plus mount an S3 bucket and accelerate remote reads
- Video: walk-through of install through accelerating remote reads
- How to: Running Spark on Alluxio
- Learn more: Architecture overview
To improve usability, the new release simplifies cluster configuration management. Centralized configuration settings can be applied at the master and propagated automatically through the cluster. Different client applications, such as Spark jobs, can initialize their configuration by retrieving the defaults from the master. Improved journaling and snapshots provide guaranteed data consistency, faster restart, and disaster recovery support.
Alluxio solutions help customers in a wide range of use cases to maximize the value of their data. Alluxio enables a flexible data infrastructure that meets the volume, variety and velocity challenges of data-driven enterprises with a scalable virtual data layer in the cloud, on-premises or hybrid. With Alluxio, customers can scale beyond petabytes across storage silos, geographic locations and cloud providers, allowing concurrent access to shared data sources without modifying applications. Alluxio provides standard access to multiple object or file data sources concurrently to deliver data at memory speed regardless of physical location.
About Alluxio
Alluxio, a leading provider of the high performance data platform for analytics and AI, accelerates time-to-value of data and AI initiatives and maximizes infrastructure ROI. Uniquely positioned at the intersection of compute and storage systems, Alluxio has a universal view of workloads on the data platform across stages of a data pipeline. This enables Alluxio to provide high performance data access regardless of where the data resides, simplify data engineering, optimize GPU utilization, and reduce cloud and storage costs. With Alluxio, organizations can achieve orders of magnitude faster model training and serving without the need for specialized storage, and build AI infrastructure on existing data lakes. Backed by leading investors, Alluxio powers technology, internet, financial services, and telecom companies, including 9 out of the top 10 internet companies globally. To learn more, visit www.alluxio.io.
Media Contact:
Beth Winkowski
Winkowski Public Relations, LLC for Alluxio
978-649-7189
beth@alluxio.com