Tech Talk: From limited Hadoop compute capacity to increased data scientist efficiency

October 17, 2019

Using “zero-copy” hybrid bursting with Spark to solve capacity problems

Want to leverage your existing investments in Hadoop, keep your data on-premises, and still benefit from the elasticity of the cloud?

Like many Hadoop users, you are probably running very large, busy clusters in which compute capacity is the main constraint. Bursting HDFS data to the cloud brings its own challenges: network latency hurts performance, copying data with DistCp means maintaining duplicate data, and you may have to change applications to accommodate the use of S3.

“Zero-copy” hybrid bursting with Alluxio keeps your data on-prem and syncs data to compute in the cloud so you can expand compute capacity, particularly for ephemeral Spark jobs.

In this tech talk, we’ll discuss:

Approaches to burst data to the cloud
How Alluxio can enable “zero-copy” bursting of Spark workloads to cloud data services like EMR and Dataproc
How DBS Bank uses Alluxio to solve for limited on-prem compute capacity by zero-copy bursting Spark workloads to AWS EMR
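
As a rough illustration of why this style of bursting can avoid application rewrites, the sketch below maps an existing HDFS path to the equivalent Alluxio URI, so that a Spark job running in the cloud could read the same data through Alluxio instead of through a copied bucket. It assumes the on-prem HDFS namespace is mounted into Alluxio at `mount_point`. The helper name `to_alluxio_uri` and the host names are hypothetical and are not taken from the talk; 19998 is the default Alluxio master RPC port.

```python
from urllib.parse import urlsplit

ALLUXIO_DEFAULT_RPC_PORT = 19998  # default Alluxio master RPC port


def to_alluxio_uri(hdfs_uri: str, alluxio_master: str,
                   mount_point: str = "/",
                   port: int = ALLUXIO_DEFAULT_RPC_PORT) -> str:
    """Map an hdfs:// URI to an alluxio:// URI (hypothetical helper).

    Assumes the HDFS root is mounted in Alluxio at `mount_point`.
    """
    parts = urlsplit(hdfs_uri)
    if parts.scheme != "hdfs":
        raise ValueError(f"expected an hdfs:// URI, got {hdfs_uri!r}")
    # Normalize the mount point to "/" or "/name" with no trailing slash.
    prefix = "/" + mount_point.strip("/")
    path = parts.path or "/"
    full_path = path if prefix == "/" else prefix + path
    return f"alluxio://{alluxio_master}:{port}{full_path}"


if __name__ == "__main__":
    # Host names below are placeholders for illustration only.
    print(to_alluxio_uri("hdfs://namenode:8020/warehouse/events",
                         "alluxio-master", mount_point="/onprem"))
```

In a Spark job, the resulting string would be passed wherever the HDFS path was used before, for example `spark.read.parquet(uri)`, assuming the Alluxio client is available on the Spark classpath.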

Presentation Slides:

Tech Talk: From limited Hadoop compute capacity to increased data scientist efficiency from Alluxio, Inc.
