Session + Live Q&A
Microservices to Async Processing Migration at Scale
Netflix creates and analyzes operational and analytical data associated with playback of thousands of titles by over 200 million members worldwide. The data powers product features such as members’ ability to see and manage their viewing history. It also feeds the core business analytics as well as the personalization and recommendation engines.
Previously built systems used microservices to ingest playback data synchronously, which could propagate any intermittent back pressure to the edge, and sometimes all the way to the clients on member devices. By adopting an asynchronous processing model for the playback data, with a durable queue to absorb intermittent back pressure, we migrated the systems seamlessly, with no interruption to, and no changes required from, either the upstream or the downstream services.
We share our experience from the migration along with our design and implementation choices. Asynchronous processing at scale requires attention to managing data loss with highly available infrastructure, elasticity to handle bursts without high latency, fault tolerance with graceful degradation, and handling of out-of-order and duplicate data. Specifically, we share the lessons learned from migrating the viewing history subsystem, which provides the product feature and also powers the personalization and recommendation engines.
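As a rough illustration of what handling out-of-order and duplicate data can involve, the sketch below keeps the latest playback position per member and title. It is a minimal example under stated assumptions, not the design presented in the talk: the names PlaybackEvent, ViewingHistory, event_id, and event_time are hypothetical, and the in-memory set and dictionary stand in for whatever durable store a real system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybackEvent:
    # All field names are hypothetical, chosen only for this illustration.
    event_id: str      # unique id, used to detect redelivered events
    member_id: str
    title_id: str
    event_time: int    # timestamp assigned at the source, in milliseconds
    position_sec: int  # playback position reported by the event


class ViewingHistory:
    """Keeps the latest playback position per (member, title)."""

    def __init__(self):
        self._seen_ids = set()  # ids of events already processed
        self._latest = {}       # (member_id, title_id) -> newest event

    def apply(self, event):
        """Apply one event and return True if the stored state changed."""
        # Duplicate delivery: the same event id was already processed.
        if event.event_id in self._seen_ids:
            return False
        self._seen_ids.add(event.event_id)

        key = (event.member_id, event.title_id)
        current = self._latest.get(key)
        # Out-of-order delivery: an event older than the stored one is ignored.
        if current is not None and current.event_time >= event.event_time:
            return False
        self._latest[key] = event
        return True

    def position(self, member_id, title_id):
        """Return the latest known position, or None if nothing is stored."""
        event = self._latest.get((member_id, title_id))
        return None if event is None else event.position_sec


if __name__ == "__main__":
    history = ViewingHistory()
    older = PlaybackEvent("e1", "m1", "t1", 1000, 30)
    newer = PlaybackEvent("e2", "m1", "t1", 2000, 90)
    # Deliver the newer event first, then the older one, then a duplicate.
    for event in (newer, older, newer):
        history.apply(event)
    print(history.position("m1", "t1"))
```

Because applying the same event twice, or applying an older event after a newer one, leaves the state unchanged, a consumer written this way can tolerate the redelivery and reordering that a durable queue may introduce.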
Main Takeaways
1 How we approached migrating our playback data processing service seamlessly from a synchronous model to an asynchronous model.
2 Design choices and trade-offs to consider when processing durable-queue-based data at scale.
3 Strategies for testing and validation for a seamless migration, without impacting the rest of the data pipeline ecosystem.
Speaker
Sharma Podila
Software Engineer @Netflix
Software engineering leader, system builder, collaborator, mentor. Deep expertise in cloud resource management, distributed systems, and data infrastructure. Proven track record of delivering impactful large-scale distributed systems of cross-functional scope.
From the same track
Building & Operating High-Fidelity Data Streams
Monday Nov 8 / 11:10AM EST
The world we live in today is fed by data. From self-driving cars and route planning to fraud prevention, to content and network recommendations, to ranking and bidding, our world not only consumes low-latency data streams, it adapts to changing conditions modeled by that data. While...
Sid Anand
Chief Architect @Datazoom, PMC @ApacheAirflow
Protecting User Data via Extensions on Metadata Management Tooling
Monday Nov 8 / 01:10PM EST
In a world where data collection is ever-increasing and new and expanded data protection laws like GDPR and CCPA are introduced yearly, metadata management, the act of storing contextual information about collected and stored data, has become a required staple for many companies. This talk gives...
Alyssa Ransbury
Security Engineer @Square
Managing Data at Scale
Monday Nov 8 / 02:10PM EST
Since the advent of the internet, the need for reliable, low latency access to data has grown at a rapid pace. Data Infrastructure, which was once a single monolithic database, has evolved into a tapestry of point solutions tied together by data movement infrastructure (e.g. data replication...
Mark Grover
Co-founder @Stemma_ai & co-creator of Amundsen
Shirshanka Das
Founder of LinkedIn DataHub, Apache Gobblin, Acryl Data
Chris Riccomini
Distinguished Engineer @WePay