Running TensorFlow on Alluxio FUSE
With its Unified Namespace feature, Alluxio serves as a single access point for all your TensorFlow training data and connects transparently to your existing storage systems.
When co-located with TensorFlow applications, Alluxio caches remote data locally for future access, providing data locality. Without Alluxio, slow remote storage can become an I/O bottleneck and leave GPU resources underutilized.
Why TensorFlow + Alluxio
With the Alluxio POSIX API, users can access training data transparently through Alluxio FUSE without rewriting their applications. This greatly simplifies development, because there is no need to maintain a separate integration setup and credential configuration for each under storage.
When TensorFlow applications are co-located with Alluxio workers, Alluxio uses caching strategies tailored to the I/O patterns of AI/ML workloads to keep remote data local for future access, providing 2x performance improvements through data locality.
When data is read from storage that is remote from the computation, slow remote storage often becomes an I/O bottleneck and leaves GPU resources underutilized. By caching frequently used data, Alluxio eliminates the I/O stall so that your GPUs are continuously fed with data, increasing GPU utilization to 97%+.
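To illustrate the POSIX access described above, here is a minimal sketch of how a training script might read data through a FUSE mount as if it were a local directory. The mount point `/mnt/alluxio-fuse`, the `*.tfrecord` file layout, and the helper names `list_training_files` and `make_dataset` are all assumptions made for illustration; they are not taken from the Alluxio documentation.

```python
from pathlib import Path

# Hypothetical mount point: substitute the directory where Alluxio FUSE
# is actually mounted on your host.
ALLUXIO_FUSE_ROOT = Path("/mnt/alluxio-fuse")


def list_training_files(root, pattern="*.tfrecord"):
    """Return the sorted paths of all files under `root` matching `pattern`.

    Only ordinary filesystem calls are used, so the same code works on a
    local directory or on a FUSE mount.
    """
    return sorted(str(p) for p in Path(root).rglob(pattern) if p.is_file())


def make_dataset(files, batch_size=32):
    """Build a tf.data input pipeline over plain file paths."""
    # Imported lazily so that listing files does not require TensorFlow.
    import tensorflow as tf

    dataset = tf.data.TFRecordDataset(files)
    # Prefetching overlaps reading with training so the GPU is not left idle.
    return dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)
```

A script would then call something like `make_dataset(list_training_files(ALLUXIO_FUSE_ROOT / "training-data"))`, where `training-data` is again a hypothetical directory name. The point of the sketch is that nothing in it is specific to the under storage: the application only ever sees file paths.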