April 24, 2024
  • We’re sharing how Meta delivers high-quality audio at scale with the xHE-AAC audio codec.
  • xHE-AAC has already been deployed on Facebook and Instagram to provide enhanced audio for features like Reels and Stories.

At Meta, we serve every media use case imaginable for billions of people across the world, from short-form, user-generated content, such as Reels, to premium video on demand (VOD) and live broadcasts. Given this, we need a next-generation audio codec that supports a wide range of operating points with excellent compression efficiency and modern, system-level audio features.

To address these needs now and into the future, Meta has embraced xHE-AAC as the vehicle for delivering high-quality audio at scale.

The advantages of xHE-AAC

xHE-AAC is the latest member of the MPEG AAC audio codec family. The Fraunhofer Institute for Integrated Circuits IIS played a substantial role in the development of xHE-AAC and the MPEG-D DRC standard.

Today, xHE-AAC is already providing a superior audio experience on Facebook and Instagram, including on Reels and Stories, and has a number of valuable features.

Loudness management

With hundreds of millions of uploads per day across Facebook and Instagram, we receive audio tracks with loudness levels ranging from silence to full scale, and everything in between.

When people play these videos sequentially, they can perceive some audio as being too loud or too quiet. This creates listener fatigue from having to constantly adjust the volume.

xHE-AAC’s built-in loudness management system solves for loudness inconsistency while meticulously preserving creator intent. It brings the average loudness of all sessions to the same target level and manages the dynamic range of each session to fit the playback environment.

Instead of burning in a specific target level and dynamic range compression (DRC) profile during encoding, xHE-AAC allows us to leave the original audio characteristics untouched and delegate loudness management processing to the client via loudness metadata, for the optimal audio experience based on context.
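As a rough illustration of delegating loudness processing to the client, the sketch below derives a playback gain from metadata measured at encode time. It is not the MPEG-D DRC algorithm: the `LoudnessMetadata` fields, the function names, and the -1 dBFS peak ceiling are assumptions made for this example, and a real decoder would typically rely on DRC or limiting rather than simply capping the boost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoudnessMetadata:
    """Hypothetical container for values measured at encode time."""
    program_loudness_lufs: float  # average loudness of the whole program
    true_peak_dbfs: float         # highest peak found in the program


def normalization_gain_db(meta: LoudnessMetadata, target_lufs: float,
                          peak_ceiling_dbfs: float = -1.0) -> float:
    """Gain in dB that moves the program toward the target level.

    Attenuation is applied in full. A boost is capped by the headroom
    between the measured peak and the ceiling (an assumed safeguard).
    """
    gain = target_lufs - meta.program_loudness_lufs
    if gain <= 0:
        return gain
    headroom = peak_ceiling_dbfs - meta.true_peak_dbfs
    return min(gain, headroom)


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10.0 ** (gain_db / 20.0)


# The same bitstream can be rendered at different targets on the client.
loud_clip = LoudnessMetadata(program_loudness_lufs=-10.0, true_peak_dbfs=-0.5)
quiet_clip = LoudnessMetadata(program_loudness_lufs=-30.0, true_peak_dbfs=-12.0)
for clip in (loud_clip, quiet_clip):
    print(clip, normalization_gain_db(clip, target_lufs=-16.0))
```

Because the gain is computed at playback time, the encoded audio itself stays untouched, and the target level can differ per device or context.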

Thanks to xHE-AAC’s loudness management, people can spend more time immersed in their favorite content and less time fiddling with the volume control.

Adaptive bit rate audio

Most people who use our apps consume media on mobile devices and expect the highest audio quality without interruption. This presents a challenge for streaming media because connection quality varies on mobile and can result in a very uneven user experience.

To optimize quality under dynamic bandwidth constraints, we produce multiple video and audio qualities to match varying network conditions at playback time. Though we produce multiple audio lanes, we have historically only employed adaptive bit rate (ABR) algorithms to switch video qualities during playback because it is difficult to enable adaptive bit rate audio without compromising quality during lane transitions.

To enable seamless audio ABR, xHE-AAC introduces the concept of immediate playout frames (IPFs), which contain all the data necessary to start playing a new audio lane without relying on data from other frames. By placing an IPF at the beginning of each Dynamic Adaptive Streaming over HTTP (DASH) segment and aligning the segment durations of each lane, we can seamlessly switch between audio lanes during playback to provide the highest-quality audio at any available bandwidth while avoiding playback stalls.
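The switching idea can be sketched in a few lines: if every lane shares the same segment boundaries and each segment starts with an IPF, a player can pick a lane independently for every segment. The code below is a minimal sketch under those assumptions. The function names and the 0.8 safety factor are invented for this example and do not describe Meta's ABR algorithm.

```python
from typing import List, Sequence


def boundaries_aligned(lanes: Sequence[Sequence[float]]) -> bool:
    """True when every lane uses the same list of segment durations."""
    first = list(lanes[0])
    return all(list(lane) == first for lane in lanes[1:])


def pick_lane(bitrates_bps: Sequence[int], throughput_bps: float,
              safety: float = 0.8) -> int:
    """Index of the highest-bitrate lane that fits a share of the throughput."""
    budget = throughput_bps * safety
    fitting = [i for i, rate in enumerate(bitrates_bps) if rate <= budget]
    if not fitting:
        # Nothing fits: fall back to the lowest-bitrate lane to avoid a stall.
        return min(range(len(bitrates_bps)), key=lambda i: bitrates_bps[i])
    return max(fitting, key=lambda i: bitrates_bps[i])


def plan_playback(bitrates_bps: Sequence[int],
                  throughput_per_segment_bps: Sequence[float]) -> List[int]:
    """Choose one lane per segment; switches happen only at segment starts."""
    return [pick_lane(bitrates_bps, t) for t in throughput_per_segment_bps]


lane_bitrates = [24_000, 64_000, 128_000]
segment_durations = [[2.0, 2.0, 2.0, 2.0]] * len(lane_bitrates)
if boundaries_aligned(segment_durations):
    print(plan_playback(lane_bitrates, [200_000, 50_000, 90_000, 20_000]))
```

Restricting switches to segment starts is what makes the IPF placement matter: the first frame the decoder sees after a switch is always one it can decode on its own.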

After launching audio ABR on Facebook for Android, we were able to improve user experience by reducing the number of sessions where playback stalls.

How we deployed xHE-AAC

We generate xHE-AAC bitstreams using an encoder SDK provided by the Fraunhofer Institute for Integrated Circuits IIS, and then prepare the resulting audio files for DASH streaming with shaka-packager. The xHE-AAC encoder’s two-pass encoding mode is used to measure the input loudness envelope and average program loudness on the first pass and perform the actual audio data compression on the second pass. As an added benefit, two-pass encoding allows us to use loudness range control (LRAC) DRC, which mitigates pumping artifacts otherwise introduced by single-pass DRC algorithms.
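The two-pass structure can be illustrated with a toy example. This is only a sketch of the control flow: it uses a plain RMS level in place of a standardized loudness measurement, the "second pass" stores the samples unchanged instead of compressing them, and none of the names below come from the Fraunhofer encoder SDK.

```python
import math
from typing import Dict, List, Sequence, Tuple


def rms_dbfs(samples: Sequence[float]) -> float:
    """RMS level of samples in [-1, 1], in dB relative to full scale."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10.0 * math.log10(mean_square) if mean_square > 0 else float("-inf")


def first_pass(samples: Sequence[float], block: int) -> Tuple[float, List[float]]:
    """Pass one: measure the whole-program level and a per-block envelope."""
    envelope = [rms_dbfs(samples[i:i + block])
                for i in range(0, len(samples), block)]
    return rms_dbfs(samples), envelope


def second_pass(samples: Sequence[float], program_level_db: float,
                envelope_db: List[float]) -> Dict[str, object]:
    """Pass two: stand-in for compression that attaches the measurements."""
    return {
        "payload": list(samples),           # a real encoder would compress here
        "program_level_db": program_level_db,
        "envelope_db": envelope_db,
    }


audio = [0.5, -0.5, 0.5, -0.5, 0.0, 0.0, 0.0, 0.0]
level, envelope = first_pass(audio, block=4)
stream = second_pass(audio, level, envelope)
print(stream["program_level_db"], stream["envelope_db"])
```

Measuring the full program before encoding is what lets the second pass work with knowledge of the whole signal rather than only the audio seen so far.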

To prepare an xHE-AAC audio adaptation set for ABR delivery, IPFs are inserted at constant time intervals, audio configuration parameters such as sample rate and channel configuration are kept constant, and unique stream identifiers are selected for each lane in the audio adaptation set.
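These three constraints lend themselves to a simple pre-publication check. The sketch below is hypothetical: the `AudioLane` record and `validate_adaptation_set` function are invented for illustration and are not part of shaka-packager or the encoder SDK.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AudioLane:
    """Hypothetical description of one lane in an audio adaptation set."""
    stream_id: int
    bitrate_bps: int
    sample_rate_hz: int
    channel_count: int
    ipf_times_s: Sequence[float]  # presentation times of the IPFs, in seconds


def validate_adaptation_set(lanes: Sequence[AudioLane],
                            tol_s: float = 1e-6) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    if not lanes:
        return ["adaptation set is empty"]
    problems: List[str] = []
    ref = lanes[0]
    # Every lane needs its own stream identifier.
    if len({lane.stream_id for lane in lanes}) != len(lanes):
        problems.append("stream identifiers are not unique")
    for lane in lanes:
        # Sample rate and channel configuration must match across lanes.
        if (lane.sample_rate_hz, lane.channel_count) != (
                ref.sample_rate_hz, ref.channel_count):
            problems.append(
                f"lane {lane.stream_id}: audio configuration differs from the first lane")
        # IPFs must be spaced at a constant interval within a lane.
        gaps = [b - a for a, b in zip(lane.ipf_times_s, lane.ipf_times_s[1:])]
        if any(abs(g - gaps[0]) > tol_s for g in gaps):
            problems.append(f"lane {lane.stream_id}: IPF interval is not constant")
    return problems


lanes = [
    AudioLane(1, 24_000, 48_000, 2, (0.0, 2.0, 4.0)),
    AudioLane(2, 64_000, 48_000, 2, (0.0, 2.0, 4.0)),
]
print(validate_adaptation_set(lanes))
```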

At playback time, we custom-fit the audio to the listening environment by configuring a target loudness level and DRC effect type based on context. Thanks to the embedded loudness metadata, we can adapt a single xHE-AAC bitstream to a variety of audio consumption use cases, from headphones to device speakers and various levels of background noise. Finally, if the client is starved for data or bandwidth is abundant, audio ABR will automatically switch audio qualities to ensure that the highest audio quality is played without interrupting the playback session.
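One way to picture the context-based configuration is a small lookup from listening context to decoder settings. Everything in this sketch is an assumption for illustration: the contexts, the target levels, and the effect labels are placeholders, not the values Meta uses or the identifiers of any particular decoder API.

```python
from dataclasses import dataclass
from enum import Enum


class Output(Enum):
    HEADPHONES = "headphones"
    DEVICE_SPEAKER = "device_speaker"


@dataclass(frozen=True)
class PlaybackConfig:
    target_loudness_lufs: float  # level the decoder normalizes to
    drc_effect: str              # label for the requested DRC effect type


def choose_config(output: Output, noisy_environment: bool) -> PlaybackConfig:
    """Pick decoder settings from the listening context (illustrative values)."""
    if output is Output.DEVICE_SPEAKER:
        # Small speakers: louder target and a narrower dynamic range.
        return PlaybackConfig(-16.0, "limited_playback_range")
    if noisy_environment:
        # Headphones in a noisy place: keep quiet passages audible.
        return PlaybackConfig(-16.0, "noisy_environment")
    # Quiet headphone listening: leave the dynamics as the creator intended.
    return PlaybackConfig(-24.0, "none")


print(choose_config(Output.HEADPHONES, noisy_environment=True))
```

The point of the sketch is that the bitstream never changes; only the configuration handed to the decoder does.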

Where can you experience xHE-AAC today?

You can experience xHE-AAC audio on Facebook for iOS and Android, as well as on targeted surfaces on Instagram, such as Reels and Stories. We encourage you to install the latest version of the Facebook and Instagram apps on iOS 13+ and Android 9+ to ensure that you can experience it.


This work is the collective result of the entire Video Infrastructure and Instagram Media Platform teams at Meta in collaboration with the Fraunhofer Institute for Integrated Circuits IIS. The author would like to extend special thanks to Abhishek Gera, Tim Harris, Arun Kotiedath, Edward Li, Meng Li, Srinivas Lingutla, Denise Noyes, Mohanish Penta, David Ronca, Haixia Shi, Mike Starr, Cosmin Stejerean, Jithin Parayil Thomas, Simha Venkataramaiah, Juehui Zhang, Runshen Zhu, and the engineering team at the Fraunhofer Institute for Integrated Circuits IIS.