On-Demand Video

Model Training Across Regions and Clouds – Challenges, Solutions and Live Demo

AI training workloads running on compute engines like PyTorch, TensorFlow, and Ray require consistent, high-throughput access to training data to maintain high GPU utilization. However, with compute decoupled from storage in today’s hybrid and multi-cloud landscape, AI platform and data infrastructure teams struggle to cost-effectively deliver the high-performance data access that AI workloads need at scale.

Join Tom Luckenbach, Alluxio Solutions Engineering Manager, to learn how Alluxio enables high-speed, cost-effective data access for AI training workloads in hybrid and multi-cloud architectures, while eliminating the need to manage data copies across regions and clouds.

What Tom will share:

  • AI data access challenges in cross-region, cross-cloud architectures.
  • The architecture and integration of Alluxio with frameworks like PyTorch, TensorFlow, and Ray using POSIX, REST, or Python APIs across AWS, GCP, and Azure.
  • A live demo of an AI training workload accessing cross-cloud datasets through Alluxio's distributed cache, unified namespace, and policy-driven data management.
  • MLPerf and FIO benchmark results and cost-savings analysis.

Tom Luckenbach is an ardent technologist and thought leader with experience at some of the Bay Area's top companies, including MongoDB, Oracle, DataStax, and NetApp. Currently, he is a Solutions Engineering Manager at Alluxio, a leading data orchestration solution for data analytics in the era of AI. Tom is passionate about creating solutions to help customers solve complex problems.