On-Demand Videos
Scaling experimentation in digital marketplaces is crucial for driving growth and enhancing user experiences. However, varied methodologies and a lack of experiment governance can hinder the impact of experimentation, leading to inconsistent decision-making, inefficiencies, and missed opportunities for innovation.
At Poshmark, we developed a homegrown experimentation platform, Lightspeed, that allowed us to make reliable and confident reads on product changes, which led to a 10x growth in experiment velocity and positive business outcomes along the way.
This session will provide a deep dive into the best practices and lessons learned from successful implementations of large-scale experiments. We will explore why experimentation matters, how to overcome scalability challenges, and which frameworks and technologies enable effective testing.
In the rapidly evolving world of e-commerce, visual search has become a game-changing technology. Poshmark, a leading fashion resale marketplace, has developed Posh Lens – an advanced visual search engine that revolutionizes how shoppers discover and purchase items.
Under the hood of Posh Lens lies Milvus, a vector database enabling efficient product search and recommendation across our vast catalog of over 150 million items. However, with such an extensive and growing dataset, maintaining high-performance search capabilities while scaling AI infrastructure presents significant challenges.
In this talk, Mahesh Pasupuleti shares:
- The architecture and strategies to scale Milvus effectively within the Posh Lens infrastructure
- Key considerations, including optimizing vector indexing, managing data partitioning, and ensuring query efficiency amid large-scale data growth
- Distributed computing principles and advanced indexing techniques to handle the complexity of Poshmark’s diverse product catalog
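To make the Milvus piece concrete, here is a minimal sketch of a vector similarity search using the pymilvus client. The collection name, field name, vector dimension, and search parameters are illustrative assumptions, not Poshmark's actual Posh Lens schema:

```python
# Hypothetical sketch of a Milvus similarity search; names and parameters
# are illustrative, not the actual Posh Lens schema.
from pymilvus import connections, Collection

connections.connect(host="localhost", port="19530")

collection = Collection("listings")  # pre-built collection of item embeddings
collection.load()                    # load segments into memory for querying

# In practice the query vector would come from an image-encoder model;
# the 512-dim zero vector here is just a placeholder.
query_vector = [[0.1] * 512]

results = collection.search(
    data=query_vector,
    anns_field="embedding",
    param={"metric_type": "L2", "params": {"nprobe": 16}},  # IVF search width
    limit=10,                        # top-10 nearest items
)
for hit in results[0]:
    print(hit.id, hit.distance)
```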
As machine learning and deep learning models grow in complexity, AI platform engineers and ML engineers face significant challenges with slow data loading and low GPU utilization, often leading to costly investments in high-performance computing (HPC) storage. However, this approach can result in overspending without addressing the core issues of data bottlenecks and infrastructure complexity.
A better approach is to add a data caching layer, such as Alluxio, between compute and storage, which offers a cost-effective alternative to HPC storage. In this webinar, Jingwen will explore how Alluxio's caching solutions optimize AI workloads for performance, user experience, and cost-effectiveness.
What you will learn:
- The I/O bottlenecks that slow down data loading in model training
- How Alluxio's data caching strategy optimizes I/O performance for training and GPU utilization, and significantly reduces cloud API costs
- The architecture and key capabilities of Alluxio
- How to use the Rapid Alluxio Deployer to install Alluxio and run benchmarks in AWS in just 30 minutes
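As a rough illustration of the caching-layer idea, the sketch below reads training data through an Alluxio FUSE mount from PyTorch, so repeated epochs are served from Alluxio's cache rather than by fresh cloud storage API calls. The mount point and directory layout are assumptions for the example:

```python
# Minimal sketch: a PyTorch Dataset reading training files through an Alluxio
# FUSE mount (assumed at /mnt/alluxio). After the first access, reads are
# served from Alluxio's cache instead of issuing cloud storage API calls.
import os
from torch.utils.data import Dataset, DataLoader

class AlluxioBackedDataset(Dataset):
    def __init__(self, root="/mnt/alluxio/training-data"):  # illustrative path
        self.files = [os.path.join(root, f) for f in os.listdir(root)]

    def __len__(self):
        return len(self.files)

    def __getitem__(self, idx):
        # A plain POSIX read; no Alluxio-specific client code is needed.
        with open(self.files[idx], "rb") as f:
            return f.read()

loader = DataLoader(AlluxioBackedDataset(), batch_size=32, num_workers=8)
```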
ALLUXIO DAY IX 2022 (January 21, 2022): Vipshop Offline Data Cache Acceleration System – Alluxio Integration
ALLUXIO DAY IX 2022 (January 21, 2022): Industrial Bank's Alluxio Deployment
This talk provides an overview of the read-after-write data consistency mechanism in the Alluxio system. An Alluxio core maintainer and Presto committer shares their recent work on the Alluxio and Apache Iceberg integration, as well as some recent work from the Presto community on the Iceberg connector.
This talk will introduce Apache Iceberg and its place in a modern and open data platform. It will cover the motivation for creating Iceberg at Netflix, as well as the data architecture that Iceberg makes possible.
Feifei Cai & Hao Zhu from WeRide provide an overview of an Alluxio + Spark use case, which has been deployed and is running in production to accelerate auto data tagging in autonomous driving development.
This talk describes the design of the shadow cache, a lightweight component that tracks the working set size of the Alluxio cache. The shadow cache dynamically tracks the working set size over a past time window and is implemented as a series of Bloom filters. We have deployed the shadow cache in Presto at Facebook and leverage the results to understand system bottlenecks and inform routing design decisions.
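The abstract above names the core mechanism; the toy sketch below shows one way a Bloom-filter-based shadow cache can estimate working set size over a sliding window. Filter sizes, hash counts, and the rotation scheme are simplified assumptions, not Alluxio's actual implementation:

```python
# Toy sketch of a sliding-window shadow cache built from a ring of Bloom
# filters; parameters are illustrative, not Alluxio's real implementation.
import hashlib
import math

class BloomFilter:
    def __init__(self, m=1 << 20, k=4):
        self.m, self.k = m, k          # m bits, k hash functions
        self.bits = bytearray(m // 8)

    def _positions(self, key):
        # derive k bit positions from slices of a single SHA-256 digest
        digest = hashlib.sha256(key.encode()).digest()
        for i in range(self.k):
            yield int.from_bytes(digest[i * 4:(i + 1) * 4], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

class ShadowCache:
    """Estimate the working set over the last `subwindows` rotation periods."""

    def __init__(self, subwindows=4):
        self.filters = [BloomFilter() for _ in range(subwindows)]

    def access(self, key):
        self.filters[-1].add(key)      # record in the current sub-window

    def rotate(self):
        # advance time: drop the oldest sub-window, open a fresh one
        self.filters.pop(0)
        self.filters.append(BloomFilter())

    def working_set_estimate(self):
        # union the sub-window filters, then apply the standard Bloom filter
        # cardinality estimate: n ≈ -(m/k) * ln(1 - X/m), X = number of set bits
        m, k = self.filters[0].m, self.filters[0].k
        union = bytearray(m // 8)
        for f in self.filters:
            for i, b in enumerate(f.bits):
                union[i] |= b
        x = sum(bin(b).count("1") for b in union)
        return -(m / k) * math.log(1 - x / m)

sc = ShadowCache()
for key in ("a", "b", "a"):            # "a" accessed twice, counted once
    sc.access(key)
print(round(sc.working_set_estimate()))  # ≈ 2 distinct keys
```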
This talk discusses the opportunities and problems that arise when Uber meets Alluxio. Zhongting from Uber will provide an overview of Uber's traffic, cloud, distribution, invalidation, and consistent hashing; Beinan from Alluxio will provide a deep dive into metadata and monitoring metrics.
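Consistent hashing, one of the topics above, is commonly used to route a given file or cache key to a stable worker even as the fleet changes. A minimal illustrative ring (worker names and virtual-node count are made up, not Uber's setup) might look like:

```python
# Minimal consistent-hashing ring; names and parameters are illustrative.
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, workers, vnodes=100):
        self.ring = []  # sorted list of (hash, worker)
        for w in workers:
            for v in range(vnodes):  # virtual nodes smooth the distribution
                bisect.insort(self.ring, (self._hash(f"{w}#{v}"), w))

    @staticmethod
    def _hash(key):
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def worker_for(self, key):
        # route a key to the first worker clockwise on the ring
        idx = bisect.bisect(self.ring, (self._hash(key), "")) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["worker-1", "worker-2", "worker-3"])
print(ring.worker_for("s3://bucket/part-00001.parquet"))
```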
In this talk, we will provide a complete picture of the Hudi platform components, along with their unique design choices. We will then take a deep dive into two important areas of active development: table metadata management and caching. Specifically, we will discuss gaps in the data lake ecosystem around these aspects and provide strawman design approaches for how Hudi aims to solve them going forward.
Apache Spark and Alluxio were both born in UC Berkeley's AMPLab as research projects. As an open source data orchestration platform, Alluxio can seamlessly integrate with and accelerate different data sources, and improve the efficiency and fault tolerance of Spark big data workloads.
Alluxio has been deployed at large scale, managing petabytes of data in the production environments of companies such as Microsoft, TikTok, Tencent, Singapore Development Bank, and China Unicom.
This talk shares the designs and use cases of integrated Alluxio and Spark solutions, as well as best practices and "what not to do" in designing and implementing Alluxio distributed systems.
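For readers unfamiliar with the integration pattern, a minimal PySpark sketch of reading and writing through Alluxio is shown below; the master hostname, port, and dataset paths are placeholders, and the Alluxio client jar is assumed to be on Spark's classpath:

```python
# Minimal sketch of the Spark-on-Alluxio pattern: Spark reads and writes via
# the alluxio:// scheme, so hot data is served from Alluxio workers instead
# of the slower under-store. Hostname and paths are placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("alluxio-example").getOrCreate()

df = spark.read.parquet("alluxio://alluxio-master:19998/datasets/events")
df.groupBy("event_type").count().write.parquet(
    "alluxio://alluxio-master:19998/output/event_counts"
)
```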
Alluxio's capabilities as a data orchestration framework have encouraged users to onboard more of their data-driven applications to an Alluxio-powered data access layer. Driven by strong interest from our open-source community, the Alluxio core team began re-designing an efficient and transparent way for users to leverage data orchestration through the POSIX interface. This effort has made significant progress through collaboration with engineers from Microsoft, Alibaba, and Tencent. In particular, we have introduced a new JNI-based FUSE implementation to support POSIX data access, created a more efficient way to integrate Alluxio with the FUSE service, and made many improvements to related data operations, such as a more efficient distributedLoad and optimizations for listing or calculating directories with a massive number of files, both common in model training. We will also share our engineering lessons and the roadmap for future releases to support machine learning applications.
Driven by strong interest from our open source community, the Alluxio core engineering team re-designed the POSIX integration to provide a more efficient and transparent way for users to leverage data orchestration through the POSIX interface. This enables much better performance for ML workloads that access data via the POSIX interface.
In this 20-minute community session, you'll hear from Lu Qiu, one of Alluxio's lead engineers on the POSIX implementation project.
In this session, you’ll learn:
- How Alluxio’s new JNI-based FUSE implementation supports more efficient POSIX data access
- How improvements to multiple data operations, including a more efficient distributedLoad and optimizations for listing or calculating directories with a massive number of files, improve performance in model training
- How these latest enhancements improve performance on TensorFlow and PyTorch training workloads, even with GPU-based training and compute
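As an illustration of the POSIX access pattern described above, the sketch below builds a tf.data input pipeline that reads TFRecord files through an assumed Alluxio FUSE mount at /mnt/alluxio, with no Alluxio-specific client code in the training job:

```python
# Minimal sketch: a tf.data pipeline reading TFRecords through an Alluxio
# FUSE mount (assumed at /mnt/alluxio); TensorFlow sees ordinary POSIX paths.
import tensorflow as tf

files = tf.data.Dataset.list_files("/mnt/alluxio/train/*.tfrecord")

dataset = (
    tf.data.TFRecordDataset(files, num_parallel_reads=tf.data.AUTOTUNE)
    .shuffle(10_000)                   # record-level shuffle buffer
    .batch(256)
    .prefetch(tf.data.AUTOTUNE)        # overlap I/O with GPU compute
)

for batch in dataset.take(1):
    print(batch.shape)                 # (256,) serialized example strings
```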
ALLUXIO DAY V 2021 (August 27, 2021)