Zhihao Hong; Staff Software Engineer | Emma Adams; Sr. Software Engineer | Jeremy Muhia; Sr. Software Engineer | Blossom Yin; Software Engineer II | Melissa He; Sr. Software Engineer
Video content has emerged as a popular format for people to find inspiration at Pinterest. In this blog post, we’ll outline recent improvements to Adaptive Bitrate (ABR) video performance, as well as their positive impact on user engagement.
Terms:
- ABR: Short for Adaptive Bitrate streaming.
- HLS: HTTP Live Streaming (HLS) is an ABR protocol developed by Apple that supports both live and on-demand streaming.
- DASH: Dynamic Adaptive Streaming over HTTP (DASH) is another ABR protocol that works similarly to HLS.
ABR streaming is a widely adopted technique in the industry for delivering video content. It involves encoding the content at multiple bitrates and resolutions, producing multiple renditions of the same video. During playback, players improve the user experience by selecting the best possible quality and dynamically adjusting it based on network conditions.
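To make the selection step concrete, here is a minimal sketch of throughput-based rendition selection. It is a simplification for illustration only: real players such as AVPlayer and ExoPlayer use their own, more sophisticated heuristics, and the `Rendition` type, the `selectRendition` function, and the 0.8 safety factor are all assumptions rather than anything from our codebase.

```kotlin
// Hypothetical model of one rendition in an ABR ladder.
data class Rendition(val name: String, val bitrateBps: Long)

// Pick the highest-bitrate rendition that fits within a fraction of the
// measured throughput. If nothing fits, fall back to the lowest rendition.
fun selectRendition(
    ladder: List<Rendition>,
    measuredThroughputBps: Long,
    safetyFactor: Double = 0.8, // assumed headroom, not a player default
): Rendition {
    require(ladder.isNotEmpty()) { "ladder must not be empty" }
    val budgetBps = measuredThroughputBps * safetyFactor
    return ladder
        .filter { it.bitrateBps <= budgetBps }
        .maxByOrNull { it.bitrateBps }
        ?: ladder.minByOrNull { it.bitrateBps }!!
}
```

Calling `selectRendition` again whenever the throughput estimate changes is what makes the quality "adaptive" in this simplified model.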
At Pinterest, we use HLS and DASH for video streaming on iOS and Android, respectively. HLS is supported through iOS’s AVPlayer, while DASH is supported by Android’s ExoPlayer. Currently, HLS accounts for about 70% of video playback sessions on iOS apps, and DASH accounts for around 55% of video sessions on Android.
Startup latency is a key metric for evaluating video performance at Pinterest, so we’ll first examine the steps involved in starting an ABR video. To initiate playback, clients must obtain both the manifest files and partial media files from the CDN through network requests. Manifest files are typically small but provide essential metadata about the video streams, including their resolutions and bitrates. In contrast, the actual video and audio content is encoded in the media files. Downloading both types of files takes time and is the primary contributor to users’ perceived latency.
Our work aims to reduce latency in manifest loading (highlighted in Figure 1), thereby improving overall video startup performance. Although manifest files are small, downloading them adds a non-trivial amount of overhead to video startup. This is particularly noticeable with HLS, where each video contains multiple manifests, resulting in multiple network round trips. As illustrated in Figure 1, after receiving video URLs from the API endpoint, players begin the streaming process by downloading the main manifest playlist from the CDN. The content of the main manifest provides URLs for the rendition-level manifest playlists, prompting players to send subsequent requests to download them. Finally, based on network conditions, players download the appropriate media files to start playback. Acquiring manifests is a prerequisite for downloading media bytes, so reducing this latency leads to faster loading times and a better viewing experience. In the next section, we’ll share our solution to this problem.
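The dependency between the main manifest and the rendition-level manifests can be seen by extracting the rendition playlist URIs from an HLS main playlist. The sketch below relies only on the standard `#EXT-X-STREAM-INF` tag and its `BANDWIDTH` attribute. The `Variant` type and the `parseVariants` function are hypothetical names, and this is a simplified parser for illustration, not what AVPlayer does internally.

```kotlin
// One variant entry from an HLS main playlist (hypothetical model).
data class Variant(val bandwidth: Long, val uri: String)

// Matches BANDWIDTH=... but not AVERAGE-BANDWIDTH=...
private val bandwidthPattern = Regex("""(?:^|,)BANDWIDTH=(\d+)""")

fun parseVariants(mainPlaylist: String): List<Variant> {
    val lines = mainPlaylist.lines().map { it.trim() }.filter { it.isNotEmpty() }
    val variants = mutableListOf<Variant>()
    for (i in lines.indices) {
        val line = lines[i]
        if (!line.startsWith("#EXT-X-STREAM-INF:")) continue
        val bandwidth = bandwidthPattern
            .find(line.substringAfter(":"))
            ?.groupValues?.get(1)?.toLongOrNull()
            ?: continue
        // The rendition playlist URI is the next line that is not a tag.
        val uri = lines.drop(i + 1).firstOrNull { !it.startsWith("#") } ?: continue
        variants += Variant(bandwidth, uri)
    }
    return variants
}
```

Each URI returned here points to another playlist that has to be fetched before any media bytes can be requested, which is where the extra round trips come from.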
At a high level, our solution eliminates the round-trip latency associated with fetching multiple manifests by embedding them in the API response. When clients request Pin metadata through API endpoints, the manifest file bytes are serialized and included in the response payload along with other metadata. During playback, players can quickly access the manifest information locally and proceed directly to the final media download stage. Players thus bypass the manifest downloads entirely, which reduces video startup latency.
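As a rough client-side illustration of this idea, the sketch below models an API response that carries manifest contents keyed by manifest URL, and a lookup that prefers the embedded copy over the network. The payload shape and every name here (`VideoMetadata`, `embeddedManifests`, `ManifestSource`, `resolveManifest`) are assumptions made for the example. The actual response format is not described in this post.

```kotlin
// Hypothetical shape of the video portion of an API response.
data class VideoMetadata(
    val videoUrl: String,
    // Assumed layout: manifest URL -> manifest contents embedded by the backend.
    val embeddedManifests: Map<String, String> = emptyMap(),
)

// Where the player should get a manifest from.
sealed interface ManifestSource {
    data class Local(val contents: String) : ManifestSource
    data class Network(val url: String) : ManifestSource
}

// Serve the embedded manifest when present; otherwise fall back to the CDN.
fun resolveManifest(metadata: VideoMetadata, manifestUrl: String): ManifestSource =
    metadata.embeddedManifests[manifestUrl]
        ?.let { ManifestSource.Local(it) }
        ?: ManifestSource.Network(manifestUrl)
```

Keeping a network fallback in the sketch means playback would still work for responses that carry no embedded manifests.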
Now let’s dive into some of the implementation details and the challenges we encountered along the way.
One of the main hurdles we faced was the overhead imposed on the API endpoint. To process API requests, the backend must retrieve video manifest files, which increases the overall latency of API responses. This issue is particularly prominent for HLS, where numerous manifest playlists are needed for each video, resulting in a significant increase in latency due to multiple network calls. Although an initial attempt to parallelize the network calls provided some relief, the latency regression persisted.
We tackled this issue by incorporating a Memcache layer into the manifest serving process. On a cache hit, Memcache provides significantly lower latency than the network calls needed to fetch manifests, and it is effective for platforms like Pinterest, where popular content is repeatedly served to many clients, resulting in a high cache hit rate. After we introduced Memcache, the API overhead was effectively under control.
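The caching behavior described above follows the familiar cache-aside pattern, sketched below. An in-memory map stands in for Memcache, and `fetchFromOrigin` is a hypothetical function representing the slower remote fetch. The sketch is single-threaded and has no expiry or eviction, so it illustrates the pattern rather than our backend implementation.

```kotlin
// Cache-aside lookup for manifests. The map is a stand-in for Memcache.
class ManifestCache(
    // Hypothetical fetcher for the slower remote path.
    private val fetchFromOrigin: (String) -> String,
) {
    private val entries = mutableMapOf<String, String>()

    fun get(manifestKey: String): String {
        // Cache hit: return without touching the origin.
        entries[manifestKey]?.let { return it }
        // Cache miss: fetch once, then store for later requests.
        val manifest = fetchFromOrigin(manifestKey)
        entries[manifestKey] = manifest
        return manifest
    }
}
```

With popular videos requested repeatedly, most lookups take the hit path, which is why the hit rate matters so much for API latency.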
We also iterated on the backend mechanism for retrieving manifests, comparing fetching from the CDN with fetching directly from the origin, S3. We ultimately settled on S3 fetching as the optimal solution: compared with CDN fetching, it offers better performance and cost efficiency.
After several iterations, the final backend flow looks like this:
AVPlayer
On the player side, we use the AVAssetResourceLoaderDelegate APIs on iOS to customize the loading process for manifest files. These APIs allow applications to incorporate their own logic for managing resource loading.
Figure 4 illustrates the manifest loading process between AVPlayer and our code. Manifest files are delivered from the backend and stored on the client. When playback is requested, AVPlayer sends multiple AVAssetResourceLoadingRequests to obtain information about the manifest files. In response to these requests, we locate the serialized data from the API response and supply it accordingly. The loading requests for media files are redirected to the CDN address as usual.
ExoPlayer
Android takes a similar approach by consolidating the loading requests in ExoPlayer. The process begins with getting the contents of the .mpd manifest file from the API. These contents are stored in the tag property of MediaItem.LocalConfiguration.
// MediaItemTag.kt
data class MediaItemTag(
  val manifest: String?,
)

// VideoManager.kt
val mediaItemTag = MediaItemTag(
  manifest = apiResponse.dashManifest,
)

val mediaItem = MediaItem.Builder()
  .setTag(mediaItemTag)
  .setUri(...) // example: "https://example.com/some/video/url.mpd"
  .build()
After the MediaItem is set and we’re ready to call prepare, the notable customization begins when ExoPlayer creates a MediaSource. Specifically, our Android client implements the MediaSource.Factory interface, which specifies a createMediaSource method.
In createMediaSource, we can retrieve the DASH manifest that was stored in the tag property of the MediaItem and transform it into an ExoPlayer primitive using DashManifestParser:
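// CustomMediaSourceFactory.kt
override fun createMediaSource(mediaItem: MediaItem): MediaSource {
  val mediaItemTag = mediaItem.localConfiguration!!.tag
  // The tag's manifest is nullable; this snippet fails fast when it is missing.
  val manifest = checkNotNull((mediaItemTag as MediaItemTag).manifest)

  val dashManifest = DashManifestParser()
    .parse(
      mediaItem.localConfiguration!!.uri,
      manifest.byteInputStream()
    )
  // ... dashManifest is then passed to DashMediaSource.Factory (next snippet)
}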
With the contents of the manifest available in memory, using DashMediaSource.Factory’s API for side-loading a manifest is the last step required:
class PinterestMediaSourceFactory(
  private val dataSourceFactory: DataSource.Factory,
): MediaSource.Factory {
  // Other MediaSource.Factory overrides are omitted for brevity.
  override fun createMediaSource(mediaItem: MediaItem): MediaSource {
    // dashManifest is parsed from the MediaItem tag, as in the previous snippet.
    val dashChunkSourceFactory = DefaultDashChunkSource
      .Factory(dataSourceFactory)

    val dashMediaSourceFactory = DashMediaSource
      .Factory(
        /* chunkSourceFactory = */ dashChunkSourceFactory,
        /* manifestDataSourceFactory = */ null,
      )

    return dashMediaSourceFactory.createMediaSource(
      dashManifest,
      mediaItem
    )
  }
}
Now, instead of having to first fetch the .mpd file from the CDN, ExoPlayer can skip that request and immediately begin fetching media.
As of this writing, both the Android and iOS platforms have fully implemented the above solution. This has led to a significant improvement in video performance metrics, particularly startup latency. As a result, user engagement on Pinterest increased thanks to the faster viewing experience.
With the ability to manipulate the manifest loading process, clients can make local adjustments to achieve more fine-grained video quality control based on the UI surface. By removing unwanted bitrate renditions from the original manifest file before providing it to the players, we can limit the renditions available to the player and thereby gain more control over playback quality. For instance, on a large UI surface such as full screen, high-quality video is preferable. By removing the lower-bitrate renditions from the original manifest, we can ensure that players only play high-quality video without preparing multiple sets of manifests on the backend.
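A minimal sketch of this kind of client-side filtering for an HLS main playlist is shown below. It drops every variant whose `BANDWIDTH` is below a threshold, together with the URI line that follows it. The function name `dropLowBitrateVariants` and the threshold parameter are hypothetical, and the sketch does not guard against removing every variant, which would leave an unplayable playlist.

```kotlin
// Remove variants below minBandwidthBps from an HLS main playlist.
fun dropLowBitrateVariants(mainPlaylist: String, minBandwidthBps: Long): String {
    // Matches BANDWIDTH=... but not AVERAGE-BANDWIDTH=...
    val bandwidthPattern = Regex("""(?:^|,)BANDWIDTH=(\d+)""")
    val output = mutableListOf<String>()
    var skipNextUri = false
    for (line in mainPlaylist.lines()) {
        val trimmed = line.trim()
        when {
            trimmed.startsWith("#EXT-X-STREAM-INF:") -> {
                val bandwidth = bandwidthPattern
                    .find(trimmed.substringAfter(":"))
                    ?.groupValues?.get(1)?.toLongOrNull()
                // Keep variants whose bandwidth cannot be read rather than guess.
                skipNextUri = bandwidth != null && bandwidth < minBandwidthBps
                if (!skipNextUri) output += line
            }
            skipNextUri && trimmed.isNotEmpty() && !trimmed.startsWith("#") -> {
                // This URI belonged to the removed variant, so drop it too.
                skipNextUri = false
            }
            else -> output += line
        }
    }
    return output.joinToString("\n")
}
```

A full-screen surface could, for example, call this with a higher threshold than a grid surface before handing the manifest to the player, without any change on the backend.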
We would like to extend our sincere gratitude to Liang Ma and Sterling Li for their significant contributions to this project and exceptional technical leadership. Their expertise and dedication were instrumental in overcoming numerous challenges and driving the project to success. We’re deeply appreciative of their efforts and the positive impact they’ve had on this initiative.
To learn more about engineering at Pinterest, check out the rest of our Engineering Blog and visit our Pinterest Labs site. To explore and apply to open roles, visit our Careers page.