You only get one shot at a first impression – being able to show off your listing in an engaging video can improve buyer confidence. Today, eBay announced a new video platform called REEL, which will integrate videos into the eBay Motors App. The feature allows sellers to upload one or more videos — in addition to adding photos of the car they are listing — which improves the shopping experience. Through video, sellers will have an easier time capturing the condition and usage of the car.
Building a video platform that can scale across eBay’s 1.5 billion listings requires the following:
● High throughput with low latency for both upload and download services.
● Resilient video storage at a very low storage cost. This will require transcoding the original video into smaller but perceptually good quality videos.
● Durable and reliable storage. Any loss of data or missing chunks for a big video file can spoil the experience.
● Appropriate Service Level Agreements (SLAs) for ecommerce-specific use cases of short-duration (~1 minute) videos.
● Smooth playback across multiple devices and unreliable networks.
● Efficient monitoring and visualization of the system’s state and performance.
● Visualization for business metrics.
● Simplicity and maintainability. Building on a standard stack with existing tools for deployments and configurations makes it easier for development teams and operations to monitor and evolve.
REEL Architecture and In-House Technologies
REEL’s architecture consists of the following pipelines, in order of user/system interaction:
● Video Upload pipeline for ingesting large video files into the system using REEL’s resumable APIs.
● Video Processing pipeline for producing smaller sized versions of the source video with similar perceived quality.
● Video Delivery pipeline to enable fast streaming on players running on heterogeneous devices across the world.
To start the process of uploading a video, the seller signs into the eBay Motors App and starts to create an item listing for sale. Then, they upload the video stored on their local device to the app. The app internally uses REEL APIs to upload the video to our platform. Once the whole video is received on the server end, it is broken into smaller chunks and stored in our backend storage. This triggers an async processing pipeline, where the uploaded video is transcoded to a lower resolution that retains high perceptual quality. Transcoded bytes are statically stored and retrieved during the video’s playback using the DASH protocol. REEL relies on multiple levels of caching for faster delivery.
Video Upload Pipeline
HTTP servers are public-facing services that are responsible for receiving video uploads through resumable APIs. If a video fails to upload due to a network fault, the resumable protocol lets sellers resume the incomplete upload at a later time, a feature that is particularly useful for large files. This saves sellers the extra bandwidth and time needed to re-upload larger files from scratch. The service application is an orchestrator between the storage and metadata services, and serves as the producer of the video “upload complete” event for the async transcoding of video. We currently use Kafka as our message bus, which carries video IDs for videos once they are persisted in the backend. Upload APIs require OAuth tokens for access and are not public yet; they are only available for consumption by eBay applications. The REEL API currently accepts seller-provided metadata, like tags and a short description, along with a link to previously uploaded video thumbnail images. Current upload SLAs are aligned with search indexing.
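The resume behaviour described above can be illustrated with a small in-memory model. This is only a sketch: the REEL upload API is not public, so the `UploadSession` class, its `append` method and the `upload` helper are hypothetical names, not the real endpoints. The idea shown is that the server tracks how many bytes it has received, and after a fault the client asks for that offset and sends only the remainder.

```python
class UploadSession:
    """Hypothetical server-side state for one resumable upload."""

    def __init__(self, total_size):
        self.total_size = total_size
        self.received = bytearray()

    @property
    def offset(self):
        # Number of bytes the server has received so far.
        return len(self.received)

    @property
    def complete(self):
        return self.offset == self.total_size

    def append(self, offset, data):
        # Reject writes that do not start where the server left off.
        if offset != self.offset:
            raise ValueError(f"expected offset {self.offset}, got {offset}")
        if self.offset + len(data) > self.total_size:
            raise ValueError("upload exceeds declared size")
        self.received.extend(data)
        return self.offset


def upload(session, payload, part_size, fail_after_parts=None):
    """Send payload in parts, starting from the server's current offset.

    fail_after_parts simulates a network fault after that many parts.
    Returns the number of bytes sent during this call.
    """
    sent = 0
    parts = 0
    offset = session.offset  # ask the server where to resume
    while offset < len(payload):
        if fail_after_parts is not None and parts == fail_after_parts:
            raise ConnectionError("simulated network fault")
        part = payload[offset:offset + part_size]
        offset = session.append(offset, part)
        sent += len(part)
        parts += 1
    return sent
```

Calling `upload` a second time on the same session after a `ConnectionError` sends only the bytes the server has not yet seen, which is where the bandwidth saving comes from.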
Uploaded bytes are stored in Binary Large Object (BLOB) storage, which was built in-house to run on low-cost commodity hardware and provides random reads at very low latencies with extremely high durability (a probability of roughly 10^-24 of losing an object). It is a generic object store and is also used for use cases other than images and videos. The key drivers for choosing it were resilience and high performance.
We chunk the uploaded video and store the chunked bytes as individual objects in storage, with corresponding metadata in a database store. Metadata includes a video’s original dimensions, size, bitrate, duration and similar properties, user-provided details, and peripheral details such as storage object IDs. This enables metadata to be independently retrieved by eBay’s internal applications or archived for analytics purposes. Currently, we use a homegrown, fault-tolerant, geo-distributed datastore. All operations on this datastore are fronted by an Entity service layer.
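As a rough illustration of this split between chunk objects and metadata, the sketch below stores each chunk under its own object ID and keeps the ordered list of IDs in a metadata record. Everything here is an assumption made for the example: a plain dictionary stands in for the BLOB store, and the `VideoMetadata` fields and the object ID scheme are invented, not REEL’s actual schema.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class VideoMetadata:
    """Hypothetical metadata record; the real schema is not public."""
    video_id: str
    size_bytes: int
    chunk_object_ids: list = field(default_factory=list)


def store_chunks(video_id, data, chunk_size, blob_store):
    """Split data into fixed-size chunks, store each one, record IDs in order."""
    meta = VideoMetadata(video_id=video_id, size_bytes=len(data))
    for index, start in enumerate(range(0, len(data), chunk_size)):
        chunk = data[start:start + chunk_size]
        # Invented ID scheme: video ID, chunk index, short content hash.
        digest = hashlib.sha256(chunk).hexdigest()[:12]
        object_id = f"{video_id}/{index:05d}-{digest}"
        blob_store[object_id] = chunk
        meta.chunk_object_ids.append(object_id)
    return meta


def read_video(meta, blob_store):
    """Reassemble the original bytes from the chunk IDs in the metadata."""
    return b"".join(blob_store[oid] for oid in meta.chunk_object_ids)
```

Because the metadata record holds only IDs and descriptive fields, it can be read or archived without touching the video bytes themselves.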
Before reaching backend servers, upload requests are proxied through eBay’s Points of Presence (PoPs). We leverage PoPs for eBay’s own UFES (Unified Front End Services) deployments as software edge proxies between a user and our origin. This proxy connects clients to eBay’s network, bringing a customer’s request onto eBay’s internet backbone with increased bandwidth and primed connections, resulting in lower latencies. It also helps with SSL termination and other TCP tuning to achieve maximum performance.
Video Processing Pipeline
Video files are big, and they can require a lot of space for storage and bandwidth to download or stream. Files need to be compressed while maintaining high quality and keeping the size as small as possible. Transcoding is the process of compressing the original source file. We can achieve different levels of compression by altering FPS, video dimensions, aspect ratio and audio/video bitrates. Videos are transcoded with an appropriate codec to compress the original uploaded file for the best perceived quality. Reducing the size of individual video files significantly lowers the cost of distribution.
The transcoding compute engine consumes the previously mentioned “upload complete” event (containing the video ID) from Kafka, which the upload pipeline pushes once the upload is complete, and starts the transcoding pipeline. Transcoding is designed around the DASH protocol. Adaptive Bitrate Streaming (ABR) requires a video to be divided into chunks, ideally 1 to 10 seconds long, so that the client device can pick the chunk that best fits its environment. This is important if the network degrades, for instance. Enabling ABR requires an encoder that encodes a single video source into multiple bitrate adaptations, as required per output resolution. Our individual target chunks are three seconds long. Each chunk is stored as an individual object in our backend storage and can be referenced as a separate entity in the metadata store. Video transcoding is highly computationally intensive, so we use SKUs with many CPU cores to perform it. REEL has a workflow that orchestrates video processing work, storage and metadata persistence; this way, it can parallelize the expensive workload of video transcoding. This workflow is managed in a Kubernetes cluster.
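To make the chunking arithmetic concrete, the sketch below plans one unit of work per three-second segment per output rendition. The three-second segment length comes from the text; the `LADDER` values and the `plan_tasks` helper are hypothetical and do not reflect REEL’s real encoding settings. It only plans the work: an actual encoder also has to align keyframes with segment boundaries.

```python
import math
from dataclasses import dataclass

SEGMENT_SECONDS = 3  # target chunk length stated in the text


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int
    video_kbps: int


# Hypothetical ladder for illustration only; not REEL's real settings.
LADDER = [
    Rendition("low", 360, 800),
    Rendition("medium", 720, 2500),
    Rendition("high", 1080, 5000),
]


def plan_tasks(duration_seconds, ladder=LADDER, segment_seconds=SEGMENT_SECONDS):
    """Return one (rendition, segment index, start, end) task per segment per rendition."""
    segment_count = math.ceil(duration_seconds / segment_seconds)
    tasks = []
    for rendition in ladder:
        for index in range(segment_count):
            start = index * segment_seconds
            # The final segment may be shorter than the target length.
            end = min(start + segment_seconds, duration_seconds)
            tasks.append((rendition.name, index, start, end))
    return tasks
```

For a 60-second video and this three-rung ladder, `plan_tasks(60)` yields 20 segments per rendition, so 60 tasks in total that a workflow engine could spread across workers.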
With the myriad devices and camera settings, we standardized on our acceptable inputs and outputs for our initial use cases. The table below provides a summary:
Video Delivery Pipeline
Typically, a player on a device would be playing a video associated with an item. A video is divided and stored as logical chunks as we saw in the video processing pipeline. When the backend HTTP server receives a request for a video, it fetches all the metadata required for serving the requested portion of the video. Since storage IDs are part of metadata, the server can fetch the required video files from storage and statically serve them.
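The lookup step can be sketched as a mapping from a requested time range to the storage IDs of the segments that cover it. The metadata layout (a dictionary from rendition name to an ordered list of object IDs) and the `segments_for_range` helper are assumptions made for this example; only the three-second segment length comes from the processing pipeline described earlier.

```python
import math

SEGMENT_SECONDS = 3  # segment length used by the processing pipeline


def segments_for_range(meta, rendition, start_s, end_s):
    """Return the storage IDs of the segments covering [start_s, end_s).

    meta is a hypothetical mapping: rendition name -> ordered object IDs.
    """
    if start_s < 0 or end_s <= start_s:
        raise ValueError("invalid time range")
    first = int(start_s // SEGMENT_SECONDS)
    last = math.ceil(end_s / SEGMENT_SECONDS)  # exclusive upper index
    return meta[rendition][first:last]
```

With five stored segments, a request for seconds 4 to 10 resolves to segments 1 through 3, which together cover seconds 3 to 12.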
Adaptive Bitrate Streaming
For delivery, we make use of Adaptive Bitrate Streaming (ABR). ABR is designed to deliver video and audio streams efficiently over large distributed HTTP networks. REEL delivers playbacks using Dynamic Adaptive Streaming over HTTP (DASH) streams and supports MPEG-DASH and HTTP Live Streaming with fMP4 segments. Below is a brief overview of how ABR and DASH work.
The client streaming a video has a good picture of the resources available on the device, such as bandwidth, CPU and screen size, and of what the device is capable of. Clients supporting DASH can use this information to choose the best resolution or bitrate they can currently support for smooth playback with minimal stalls or re-buffering. A client can switch between bitrates on a per-segment basis. As the above diagram explains, based on changing network bandwidth conditions on the device’s end, it can request a higher- or lower-resolution segment for the next fetch. All the critical information needed to switch between different encodings is stored in a Media Presentation Description (MPD) file or in HLS manifest files.
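A minimal sketch of the per-segment decision follows. It uses a simple throughput-based rule: pick the highest advertised bitrate that fits within a safety fraction of the measured bandwidth. This rule, the `choose_bitrate` name and the 0.8 safety factor are assumptions for illustration; real players typically combine throughput with buffer level and other signals.

```python
def choose_bitrate(available_kbps, measured_kbps, safety=0.8):
    """Pick the highest bitrate within safety * measured throughput.

    Falls back to the lowest available bitrate when none fits.
    """
    budget = measured_kbps * safety
    candidates = [b for b in sorted(available_kbps) if b <= budget]
    return candidates[-1] if candidates else min(available_kbps)


def plan_playback(available_kbps, throughput_samples_kbps):
    """Choose a bitrate for each upcoming segment from throughput samples."""
    return [choose_bitrate(available_kbps, t) for t in throughput_samples_kbps]
```

With bitrates of 800, 2500 and 5000 kbps advertised in the manifest, a client whose measured throughput drops from 6500 to 900 kbps would step down from the 5000 kbps rendition to the 800 kbps one, and step back up once bandwidth recovers.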
To move videos closer to the user, REEL leverages two layers of caching. Caching reduces repeated requests to origin, which improves the customer experience by bringing latencies down from a few seconds to a few milliseconds. Manifest files (MPD/HLS) and the requested video segments are cached on their way back to the user.
Third-party Content Delivery Networks (CDNs) serve as the first layer closest to the user. At the CDN layer, we have enabled Tiered Distribution, which sends the cache misses from edge servers to smaller sets of larger-capacity, second-tier servers. This greatly increases cache hit percentage on CDN.
eBay has several PoPs across the globe for edge computing. REEL leverages some of these PoPs to serve as a second layer of caching and the PoPs host Apache Traffic Server (ATS). We provide a large amount of dedicated storage to ATS to reduce chances of cache eviction before object TTLs. This helps with the long tail cache misses from the CDNs.
We leverage Unified Front End Services (UFES) deployments as software edge proxies between CDN and our origin caches as well.
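The layered lookup can be modelled in a few lines. This is a simplified sketch, not the behaviour of any real CDN or of Apache Traffic Server: the `TtlCache` class, the `fetch` helper and the TTL values are hypothetical, eviction is modelled only as TTL expiry, and the clock is passed in explicitly so the example is deterministic.

```python
class TtlCache:
    """Toy cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}  # key -> (value, expires_at)

    def get(self, key, now):
        entry = self.entries.get(key)
        if entry is None or entry[1] <= now:
            # Missing or expired: drop it and report a miss.
            self.entries.pop(key, None)
            return None
        return entry[0]

    def put(self, key, value, now):
        self.entries[key] = (value, now + self.ttl)


def fetch(key, now, cdn, pop, origin, stats):
    """Look up key in the CDN layer, then the PoP layer, then origin."""
    value = cdn.get(key, now)
    if value is not None:
        stats["cdn_hit"] += 1
        return value
    value = pop.get(key, now)
    if value is None:
        stats["origin_fetch"] += 1
        value = origin[key]
        pop.put(key, value, now)
    else:
        stats["pop_hit"] += 1
    # Populate the layer closest to the user on the way back.
    cdn.put(key, value, now)
    return value
```

In this model, a segment that has dropped out of the CDN layer is still served from the longer-lived PoP layer, so only the very first request reaches origin.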
As you can see, there are lots of moving pieces. A robust logging and monitoring system is required for visibility across the systems. We use eBay’s standard observability platform, the Unified Monitoring Platform (UMP), which provides logging, monitoring and graphing capabilities. Since we chunk and segment a video at several places and multiple components are involved, distributed tracing is necessary to follow a request end-to-end.
UMP provides Prometheus support for metrics and alerting. Currently, we push metrics on latencies, counts and failures for each component, along with system and custom metrics per component. This gives us the visibility required to understand the current utilization of the system and to catch issues in a timely manner. A sample dashboard for some of our video upload metrics is below:
In this blog, we went deeper into the architecture of the video platform by detailing how all of the components in the platform are loosely coupled and perform a specific business capability. With our low latency storage, big compute engines, adaptive bitrate streaming and caching, we can lower our SLAs and provide smooth playbacks. As more diverse use cases are onboarded to REEL, our scale will grow even further and will require a peripheral ecosystem to support growing needs. We want to move more toward making our video processing more efficient and extending our infrastructure to work with eBay’s Artificial Intelligence and Machine Learning platform.