Data scientists or platform engineers often face the following challenge when the input data for machine learning jobs are stored in remote storage like NFS or cloud storage like S3. Making direct data access is slow, unstable and expensive; manually duplicating data to the training clusters also introduces large overhead, complicated data curation and often requires engineers to build ETL pipelines.
This talk will guide the audience on how Alluxio can greatly simplify the data preparation phase in with remote and possibly multiple data sources. We will share the lessons and benchmark from Bill Zhao an engineer led in Apple when building a Machine Learning platform using Tensorflow, NFS, DC/OS and Alluxio.
In this online meetup, you will learn about:
- When Alluxio can help for machine learning platform;
- How to setup and create POSIX endpoint for Alluxio service to unify the file system data access to S3, HDFS and Azure blob storage;
- How to run TensorFlow to train models backed by accessing remote input data like access local file system.
feat. Apple Case Study using Tensorflow, NFS, DC/OS, and Alluxio
ALLUXIO ONLINE MEETUP
Data scientists or platform engineers often face the following challenge when the input data for machine learning jobs are stored in remote storage like NFS or cloud storage like S3. Making direct data access is slow, unstable and expensive; manually duplicating data to the training clusters also introduces large overhead, complicated data curation and often requires engineers to build ETL pipelines.
This talk will guide the audience on how Alluxio can greatly simplify the data preparation phase in with remote and possibly multiple data sources. We will share the lessons and benchmark from Bill Zhao an engineer led in Apple when building a Machine Learning platform using Tensorflow, NFS, DC/OS and Alluxio.
In this online meetup, you will learn about:
- When Alluxio can help for machine learning platform;
- How to setup and create POSIX endpoint for Alluxio service to unify the file system data access to S3, HDFS and Azure blob storage;
- How to run TensorFlow to train models backed by accessing remote input data like access local file system.
Video:
Slides:
Data scientists or platform engineers often face the following challenge when the input data for machine learning jobs are stored in remote storage like NFS or cloud storage like S3. Making direct data access is slow, unstable and expensive; manually duplicating data to the training clusters also introduces large overhead, complicated data curation and often requires engineers to build ETL pipelines.
This talk will guide the audience on how Alluxio can greatly simplify the data preparation phase in with remote and possibly multiple data sources. We will share the lessons and benchmark from Bill Zhao an engineer led in Apple when building a Machine Learning platform using Tensorflow, NFS, DC/OS and Alluxio.
In this online meetup, you will learn about:
- When Alluxio can help for machine learning platform;
- How to setup and create POSIX endpoint for Alluxio service to unify the file system data access to S3, HDFS and Azure blob storage;
- How to run TensorFlow to train models backed by accessing remote input data like access local file system.
Videos:
Presentation Slides:
Complete the form below to access the full overview:
![](https://cdn.prod.website-files.com/66d9c7ee21a87e61fe5d0e22/66f52d24efdfb829fb980070_rainbow-vortex%201%20(1).png)