Zipline: Airbnb's data management platform for machine learning

Case source: Airbnb
Venue: Shanghai
Session time: 2019-05-18 16:50-17:50

Patrick Yoon

Airbnb Software Engineer

Patrick Yoon is a software engineer on the Machine Learning Infrastructure Team at Airbnb, working on tools and frameworks for building and productionizing machine learning models. Previously, he worked on translations infrastructure and email notifications infrastructure on Airbnb’s Growth Team, and on the ads bidding platform at TellApart. He holds a bachelor’s and a master’s degree in computer science from the University of Pennsylvania.

Session Overview

Zipline is Airbnb’s data management platform designed specifically for machine learning use cases. Users define features in an easy-to-use configuration language, and Zipline provides the following capabilities:
- Resource-efficient, point-in-time correct training set backfills and scheduled updates
- Batch correction with a lambda architecture that combines offline batch sources with online streaming sources and exposes a single data source in the online scoring environment
- Feature visualizations and automatic data quality monitoring
- Collaboration on and sharing of features, plus data ownership and management
Apache Spark powers many of Zipline’s features, especially the offline tasks for efficient training set backfills and feature computation, while Apache Flink powers stream processing of online scoring data. Zipline is widely used at Airbnb and has reduced the time ML practitioners spend collecting and developing a reliable dataset from months to days. Although feature management is a widespread need in ML, no open source software addresses these problems, so we intend to open source our work.
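
To make "point-in-time correct" concrete, here is a minimal sketch in plain Python. It is not Zipline's API or configuration language; the function name `point_in_time_counts`, the event layout, and the sample data are all assumptions made for illustration. The idea it shows is that a training row labeled at time t may only use feature events that happened strictly before t, so no information from the future leaks into the training set.

```python
from bisect import bisect_left
from collections import defaultdict


def point_in_time_counts(events, queries):
    """For each (key, query_ts), count that key's events with ts strictly before query_ts.

    events:  iterable of (key, event_ts) pairs (hypothetical layout)
    queries: iterable of (key, query_ts) pairs, e.g. the times training labels were observed
    """
    # Group event timestamps by key and sort them once, so each query is a binary search.
    by_key = defaultdict(list)
    for key, ts in events:
        by_key[key].append(ts)
    for ts_list in by_key.values():
        ts_list.sort()

    # bisect_left returns the number of timestamps strictly less than query_ts.
    return [
        (key, query_ts, bisect_left(by_key.get(key, []), query_ts))
        for key, query_ts in queries
    ]


# Hypothetical sample data: (user_id, timestamp) events and label times.
events = [("u1", 10), ("u1", 20), ("u1", 30), ("u2", 15)]
queries = [("u1", 25), ("u1", 10), ("u2", 100), ("u3", 50)]
rows = point_in_time_counts(events, queries)
```

A naive join that counted all of a user's events regardless of time would give "u1" a count of 3 at time 25, which includes an event that had not happened yet. A production system would run the same kind of time-bounded aggregation at scale, for example in Spark, rather than in memory.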

Audience Takeaways

- Explore the architecture of Zipline, Airbnb’s data management platform designed specifically for ML use cases.
- Learn the main problems that Zipline solves.
- Understand how to generate training data with point-in-time correctness, keep features consistent for online scoring, collaborate on training data, and manage data.
