
Session + Live Q&A

Netflix Drive: Building a Cloud Native Filesystem for Media Assets

Netflix Studios produces hundreds to thousands of movies, shows, trailers, and other forms of media content each year, amounting to hundreds of petabytes of storage and billions of media assets. These assets are created, edited, managed, encoded, and rendered by artists around the globe, working on a multitude of workstation environments and operating systems such as Windows, Linux, and macOS. The assets need to be immediately available to different globally distributed teams within Netflix Studios. What kind of architecture can work at this scale and provide artists with a secure, performant, and seamless storage interface?

In this talk, we present Netflix Drive, a generic cloud drive for storing and retrieving media assets, i.e., a collection of media files and folders in Netflix. Netflix Drive ties together disparate data stores (such as AWS S3, Ceph Storage, Google Cloud Storage, and others) and metadata stores (such as DynamoDB, RDS, Redis, CockroachDB, and others) in a cogent form for creating, cataloging, and serving these assets to applications and workflows.

Main Takeaways

1 Learn about Netflix Drive: what it is, what it does, and how it differs from other storage providers.

2 Find out some of the plans they have for the future, including open sourcing it.


What is the focus of your work these days?

The focus of my work these days is something called Netflix Drive, and that's the topic of my presentation as well. Just a brief background on it: Netflix Drive is the paved path for storing, retrieving, and managing tons of media assets that are generated by Netflix Studios and streaming platforms. You can think of it as Google Drive. But on Google Drive you have files and folders. Netflix Drive is something that artists and studios will use for backing up their data and globally distributing it in a performant and scalable manner. So that's the focus of what I do right now and of my talk.

How would you describe the persona and the level of the target audience?

Netflix Drive is a generic file system for storage, and any software engineer or architect who is looking to use cloud and on-premises storage, with any sort of backend, can use Netflix Drive. We plan to open source it soon. It is a framework where different types of data storage and metadata storage can be plugged in. So you can imagine it having S3, DynamoDB, MongoDB, or other types of databases plugged in. Our target audience is any architect who is working on file or asset storage.
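To make the idea of pluggable data and metadata stores concrete, here is a minimal sketch in Python. It is not Netflix Drive's actual interface, which has not been published in this interview; the names `DataStore`, `MetadataStore`, `Drive`, and the in-memory backends are all hypothetical, chosen only to show how a drive layer could stay independent of whichever backends are plugged in.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Dict, List


class DataStore(ABC):
    """Hypothetical interface for a pluggable data backend (e.g. an object store)."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...


class MetadataStore(ABC):
    """Hypothetical interface for a pluggable metadata backend (e.g. a database)."""

    @abstractmethod
    def record(self, path: str, key: str) -> None: ...

    @abstractmethod
    def lookup(self, path: str) -> str: ...

    @abstractmethod
    def list_paths(self, prefix: str) -> List[str]: ...


class InMemoryDataStore(DataStore):
    """Stand-in backend for illustration; a real one would call a storage service."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self._blobs[key] = data

    def get(self, key: str) -> bytes:
        return self._blobs[key]


class InMemoryMetadataStore(MetadataStore):
    """Stand-in backend for illustration; maps file paths to data keys."""

    def __init__(self) -> None:
        self._index: Dict[str, str] = {}

    def record(self, path: str, key: str) -> None:
        self._index[path] = key

    def lookup(self, path: str) -> str:
        return self._index[path]

    def list_paths(self, prefix: str) -> List[str]:
        return [p for p in self._index if p.startswith(prefix)]


class Drive:
    """Drive layer that only talks to the two abstract interfaces."""

    def __init__(self, data: DataStore, meta: MetadataStore) -> None:
        self.data = data
        self.meta = meta

    def write(self, path: str, content: bytes) -> None:
        # Assumption: content is keyed by its SHA-256 digest.
        key = hashlib.sha256(content).hexdigest()
        self.data.put(key, content)
        self.meta.record(path, key)

    def read(self, path: str) -> bytes:
        return self.data.get(self.meta.lookup(path))

    def listdir(self, prefix: str) -> List[str]:
        return sorted(self.meta.list_paths(prefix))


drive = Drive(InMemoryDataStore(), InMemoryMetadataStore())
drive.write("/show/ep1/shot1.exr", b"pixels")
print(drive.read("/show/ep1/shot1.exr"))
```

In a design like this, swapping the in-memory classes for ones backed by a real object store or database would not require changes to `Drive`, which is the property the framework description above points at.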

You mentioned some of the more traditional consumer cloud services for storage. What will be the main difference between the Netflix Drive and any of the other cloud services?

When you use cloud services, the number one reason that you would want to use something like Netflix Drive is that you can configure Netflix Drive using REST endpoints. So it is not just a file system. It also has APIs built into it. And artists who work on images or photos need to have the ability to configure their workflows. So you can imagine, let's say there's a big movie being created and there are a bunch of artists working on different parts of the movie. You want these artists to have access only to the parts that they care about and not the entire corpus of data. Netflix Drive can enable that seamlessly with the different workflows and pipelines that are run in Netflix Studios today. That is what differentiates it. The other thing is that security, scalability, and latency are first-class citizens in Netflix Drive. Netflix Drive intelligently places the data across on-premises, local, and cloud systems. So it's a hybrid system that can work in conjunction with multiple backends. The other part is that because it's a framework, we are exposing it to the audience so that they can plug in any sort of database or any sort of data store on the backend. There may be folks who work with multiple cloud providers, and folks who do not work with the cloud at all. They can still get the benefits of Netflix Drive without being tied to any cloud provider.
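The point about giving each artist access to only part of a movie's data can be illustrated with a small Python sketch. This is an assumption-laden illustration and not Netflix Drive's REST API, which the interview does not describe; `WorkspaceConfig`, `visible_paths`, and the example paths are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class WorkspaceConfig:
    """Hypothetical per-artist configuration: which path prefixes are exposed."""

    artist: str
    # Prefixes end with "/" so that "/movie/reel1/" does not match "/movie/reel10/".
    allowed_prefixes: Tuple[str, ...]


def visible_paths(config: WorkspaceConfig, all_paths: Iterable[str]) -> List[str]:
    """Return only the paths that fall under the artist's allowed prefixes."""
    # str.startswith accepts a tuple of prefixes; an empty tuple matches nothing.
    return sorted(p for p in all_paths if p.startswith(config.allowed_prefixes))


corpus = [
    "/movie/reel1/shot001.exr",
    "/movie/reel1/shot002.exr",
    "/movie/reel10/shot001.exr",
    "/movie/reel2/shot001.exr",
]
config = WorkspaceConfig(artist="artist-a", allowed_prefixes=("/movie/reel1/",))
print(visible_paths(config, corpus))
```

A real system would enforce such a scope on the server side with authentication and authorization; the sketch only shows the filtering idea.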

And you mentioned something about general availability, open sourcing it. Are there any plans for that?

We do plan to do it pretty soon. We are working on our end to make it more robust. And then we would definitely announce that. We also have a blog post on Netflix Drive on the Netflix technology blog, where we talk more about the design and our plans to open source it.


Speaker

Tejas Chopra

Senior Software Engineer, Data Storage Platform team @Netflix

Tejas Chopra is a Senior Software Engineer, working in the Data Storage Platform team at Netflix, where he is responsible for architecting storage solutions to support Netflix Studios and Netflix Streaming Platform. Prior to Netflix, Tejas was working on designing and implementing the storage...


Date

Wednesday Nov 3 / 11:10AM EDT (40 minutes)

Track

The Cloud Operating Model

Topics

Cloud Native, Cloud Computing, Storage, DevOps, Infrastructure


From the same track

Session + Live Q&A Cloud Computing

Optimizing Efficiency & Capacity Management at Web Scale on the Cloud

Wednesday Nov 3 / 12:10PM EDT

Managing capacity demands while maintaining efficiency for a web-scale workload running on a public cloud is a challenging task. In this talk, Molly will share insight on how Pinterest optimizes their use of the cloud, concurrently maintaining demands across key domains of security, availability,...

Molly Junck

Technical Program Manager & Infra Governance and Cloud Vendor Management @Pinterest

Session + Live Q&A Cloud Computing

K8s: Rampant Pragmatism in the Cloud at Starling Bank

Wednesday Nov 3 / 01:10PM EDT

Starling Bank’s back end is made up of 40 - 50 individual services. These were all being deployed to the cloud as a monolith, an approach that was slowing us down. We migrated our delivery pipelines to deliver these services in six separate groups, and are now starting to use Kube. But why...

Jason Maude

Lead Engineer @StarlingBank

PANEL DISCUSSION + Live Q&A Cloud Computing

Panel: Kubernetes at Web Scale on the Cloud

Wednesday Nov 3 / 02:10PM EDT

Although many architectural and design similarities exist between large scale Kubernetes footprints whether on-prem or in the cloud, there are also key differences. Gaining insight on these will help ensure your k8s scaling exercise in the cloud can be accelerated through application of best...

Harry Zhang

Tech Lead of the Cloud Runtime Team @Pinterest

Ramya Krishnan

Staff Site Reliability Engineer @Airbnb

Ashley Kasim

Tech Lead of the Compute team @Lyft
