PANEL DISCUSSION + Live Q&A
Managing Data at Scale
Since the advent of the internet, the need for reliable, low-latency access to data has grown rapidly. Data infrastructure, once a single monolithic database, has evolved into a tapestry of point solutions tied together by data movement infrastructure (e.g. data replication streams). What was once the domain of DBAs is now accessed by engineers, analysts, ops, and often non-technical folks as well. A simple set of tables has become a complex latticework of data sets, streams, batch jobs, and the like. With this increase in complexity come new challenges and concerns.
Some of the concerns we will tackle:
- How do companies manage the ever-growing complexity in modern data ecosystems?
- How does data operations keep track of tens of thousands of daily job executions, and of failures in particular?
- How do the security, governance, and compliance folks ensure that the right people have access to the right data fields in order to preserve end-user privacy?
- What are the contracts between data producers & data consumers & how are they enforced?
- How do data producers shield data consumers from breaking changes in schemas?
- How do data consumers find the data sets they need and how are they notified if those data sets are end-of-life’d?
Speaker

Mark Grover
Co-founder @Stemma_ai & co-creator of Amundsen
Mark is the co-founder of Stemma. He is the co-creator of the leading open-source data catalog, Amundsen, used by Lyft, Instacart, Square, ING, Snap and many more! Mark was previously a developer on Apache Spark at Cloudera and is a committer and PMC member on a few open-source Apache...
Speaker

Shirshanka Das
Founder of LinkedIn DataHub, Apache Gobblin, Acryl Data
Shirshanka is co-founder and CEO of Acryl Data, the company which is commercializing the open source DataHub project, a real-time metadata platform used by LinkedIn, Expedia, Saxo Bank, Klarna, Viasat, and many others. Prior to founding Acryl, he was the overall architect for...
Speaker

Chris Riccomini
Distinguished Engineer @WePay
Chris Riccomini is a software engineer, startup investor, and advisor with more than a decade of experience at major tech companies such as PayPal, LinkedIn, and WePay. He has been involved in open source throughout his career and is the author of Apache Samza. He's recently written The...
From the same track
Building & Operating High-Fidelity Data Streams
Monday Nov 8 / 11:10AM EST
The world we live in today is fed by data. From self-driving cars and route planning to fraud prevention, to content and network recommendations, to ranking and bidding, our world not only consumes low-latency data streams, it adapts to changing conditions modeled by that data. While...

Sid Anand
Chief Architect @Datazoom, PMC @ApacheAirflow
Microservices to Async Processing Migration at Scale
Monday Nov 8 / 12:10PM EST
Netflix creates and analyzes operational and analytical data associated with playback of thousands of titles by over 200 Million members worldwide. The data powers product features such as members’ ability to see and manage their viewing history. The data also feeds into the core business...

Sharma Podila
Software Engineer @Netflix
Protecting User Data via Extensions on Metadata Management Tooling
Monday Nov 8 / 01:10PM EST
In a world where data collection is ever-increasing and new and expanded data protection laws like GDPR and CCPA are introduced yearly, metadata management, the act of storing contextual information about collected and stored data, has become a required staple for many companies. This talk gives...

Alyssa Ransbury
Security Engineer @Square