Loudness Problem in Streaming World

We live in an age where artificial intelligence writes the news and self-driving cars are on the road, yet the longstanding problem of inconsistent loudness in audio has made little progress. If anything, it seems to be getting worse.

For example, many of you have probably experienced an ear-splitting jump in volume between one YouTube video and the next on your mobile device, which sends you reaching for the volume control. This is a real problem, but one we easily overlook because we have grown used to it. It also shows how essential volume buttons are: Apple could not get rid of them on the iPhone X even after removing the haptic home button.

The loudness problem gets worse when you switch from one app to another. For instance, when switching from Netflix to Spotify, you may feel uncomfortable, even anxious, because you have to change the volume manually each time while bracing for an unexpectedly loud sound.

Some of you may have wondered, “Why can’t the smartphone handle this in a smart way?”

So why is this still an ongoing problem? As noted, the issue is longstanding: it dates back to the dawn of media history.

First of all, there is a common belief that “the louder the sound, the better the quality” (which is, unfortunately, actually true), and it leads most sound producers to raise the level as much as possible. In other media such as TV broadcasts and cinema, this tendency has been held in check by regulation. In the smartphone domain, however, there is no rule yet that addresses the loudness problem. On top of that, it is an extremely challenging problem to solve, even with highly advanced AI.

Let’s dive in.

The volume controls on the iPhone show how essential they are.

First, why does the loudness problem occur?

You may have heard of the ‘audible frequency range’. Human ears can hear frequencies from around 20 Hz to 20,000 Hz, but not every frequency in this range is heard at the same loudness. We can hardly hear very high frequencies close to 20,000 Hz or very low ones close to 20 Hz; in other words, we hear them only if the sound pressure is very high. Psychoacoustic research has characterized this relationship between frequency and perceived sound level statistically, in what is called the ‘equal-loudness contour’.

For deep bass and high treble to be fully heard, the sound must be loud. Music seems to sound better at clubs and concerts because it is louder there. Because of this, many music producers purposely mix their songs louder than their competitors’. In the music industry this phenomenon is called the “Loudness War”, and it is still ongoing. Music has grown progressively louder from the songs of the 70s to the Billboard hits of 2020, and it keeps pushing the limits. Films and TV programs are no different. They have joined this war too, but there regulations keep the peace.

Waveform of an older pop song (The Beatles, “Yesterday”), -17 LUFS. It looks the way we expect a music signal waveform to look.

Waveform of a recent K-pop song (Red Velvet, “Psycho”), -5 LUFS. As above, white is the background and black is the signal.

When advertisements come into play, the loudness issue gets even worse. To catch the attention of viewers and listeners, advertisers aim for levels even louder than the main programs.

In the U.S., the TV industry resolved this issue through the CALM Act (Commercial Advertisement Loudness Mitigation Act). The CALM Act, together with ATSC Recommended Practice A/85, requires TV commercials and programs to be kept at -24 LUFS. Similar standards have been adopted and enforced for TV broadcasts in Korea, the EU, Australia, Japan, and elsewhere, with the result that there is little to no loudness problem on TV.

Loudness is relatively easy to regulate and maintain on TV platforms, since there are only a few content providers (broadcasters) and all of them have to comply with the regulations in order to operate. Even OTT services such as Netflix, which technically do not have to comply with the law, stream their content at a level compatible with terrestrial TV programs so that switching between the two is smooth.

The smartphone world, however, is totally different. From the point of view of sound sources, it is a new world with a multitude of platforms (apps) and a multitude of content providers. While the Loudness War rages among music streaming services, podcasts, and gaming apps, the OTTs that come from the TV world complicate the issue further, because their target loudness (-24 LUFS) is far from that of the others (-6 to -13 LUFS).

When viewers turn the volume up while watching Netflix because of its low loudness and then switch to a music streaming app, the result is ear-splitting. Unless loudness is normalized across apps, the problem will remain.

The picture below shows research Gaudio Lab has done on the loudness distribution of several popular apps. Ranging from a maximum of -5 LUFS to a minimum of -39 LUFS, loudness differs considerably from app to app and even within a single app. Amid all this, you can see that Netflix sits apart from the other, mobile-centric services. Unlike TV viewing at home, listening on a mobile device (smartphone) often takes place in noisy surroundings such as the subway. The TV-standard target loudness of -24 LUFS is therefore not appropriate for mobile. In fact, there are numerous complaints in user communities about the low volume of Netflix’s audio. There is only a small chance that Netflix will fix this, because the majority of its viewers still watch on TVs. (It seems Netflix doesn’t know about Gaudio Lab’s solution, which plays content on TVs and smartphones at different target loudnesses without requiring different encodings of the content.😊)

The reality of the loudness problem, easily found in the services we use every day.
(The gap between the maximum of -5 LUFS and the minimum of -39 LUFS is equivalent to pressing the smartphone’s volume key 11 times. No joke🙄)
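
To put that gap in numbers, here is a minimal sketch of the arithmetic. A difference in LUFS is expressed in LU, and 1 LU equals 1 dB. The step size per volume-key press is an assumption made only for illustration (about 3 dB, chosen because it is consistent with the 11 presses mentioned above); real devices use different volume curves.

```python
# Loudness extremes taken from the measurements described above.
loudest_lufs = -5.0
quietest_lufs = -39.0

# A difference between two LUFS values is expressed in LU (1 LU = 1 dB).
difference_lu = loudest_lufs - quietest_lufs

# The same gap as a linear amplitude ratio: ratio = 10^(dB / 20).
amplitude_ratio = 10 ** (difference_lu / 20)

# ASSUMPTION: one press of the volume key changes the level by about 3 dB.
# This varies by device and OS; it is not a measured value.
ASSUMED_DB_PER_VOLUME_STEP = 3.0
volume_key_presses = round(difference_lu / ASSUMED_DB_PER_VOLUME_STEP)

print(f"{difference_lu:.0f} LU gap, amplitude ratio of about {amplitude_ratio:.0f}x, "
      f"roughly {volume_key_presses} volume-key presses")
```

In other words, the loudest content reaches the ear with roughly fifty times the signal amplitude of the quietest content at the same volume setting.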

Second, why is it not being solved?

If you have read this far, you may have concluded that loudness normalization is hard to sustain without regulation. Couldn’t it be solved the way TV broadcasting did, by setting a loudness standard? At first, I also thought regulation would be the answer. But the more I thought about it, the more I wondered whether borderless services can be controlled by laws that stop at national borders. Loudness aside, it is difficult for any country to regulate OTT services on any basis.

If the law doesn’t work, then what about technology? The easy way is to control the loudness when the content is produced. But since anyone can be a creator, it is hardly possible to place professional sound engineers in every workflow to control the loudness.

Alternatively, the loudness can be normalized on the server of the OTT platform where content is aggregated. It is a smart approach, but it has two difficulties. First, many OTTs support N-screen services, so to have a different target loudness for each target device such as TV, PC, and smartphone, different versions of each audio stream must be prepared. That can waste a great deal of storage and adds complexity. Second, if you go ahead anyway, the original is inevitably degraded in the process of normalizing and re-encoding the audio signal to level the loudness. In other words, normalization introduces tandem-coding artifacts, which should be avoided if possible. The amount of computation inevitably increases as well. Moreover, some content creators strictly prohibit any alteration of their work.

Lastly, the problem can be handled on the client app side. This is the approach known as AGC (Automatic Gain Control), which easily causes annoying sound distortions such as pumping and breathing, because the processor does not know what signal will come next.
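
To illustrate why this is fragile, here is a minimal sketch of a naive block-based AGC. It is not any particular product's algorithm: the function names, block size, target level, and smoothing factor are all hypothetical, and plain RMS is used in place of a proper loudness measure. The point is only that the gain keeps moving within a single piece of content as the level changes, and that moving gain is what listeners hear as pumping.

```python
import math

def block_rms_db(block):
    """RMS level of a block in dB relative to full scale (amplitude 1.0)."""
    mean_square = sum(x * x for x in block) / len(block)
    return 10.0 * math.log10(max(mean_square, 1e-12))

def naive_agc_gains(samples, block_size, target_db, smoothing=0.5):
    """Per-block gain (in dB) that a simple AGC would apply.

    Each block only sees the signal up to now, so the gain chases the level.
    """
    gains = []
    gain_db = 0.0
    for start in range(0, len(samples) - block_size + 1, block_size):
        block = samples[start:start + block_size]
        wanted_db = target_db - block_rms_db(block)
        # One-pole smoothing so the gain does not jump instantly.
        gain_db = smoothing * gain_db + (1.0 - smoothing) * wanted_db
        gains.append(gain_db)
    return gains

def sine(amplitude, n, period=48):
    """Test tone: n samples of a sine wave with the given amplitude."""
    return [amplitude * math.sin(2 * math.pi * i / period) for i in range(n)]

# A quiet passage followed by a loud passage within the same content.
signal = sine(0.05, 4800) + sine(0.8, 4800)
gains = naive_agc_gains(signal, block_size=480, target_db=-20.0)

print(f"gain swings from {max(gains):+.1f} dB to {min(gains):+.1f} dB")
```

The quiet passage is boosted and the loud passage is pulled down, so the intended dynamics of the content are flattened and the transition between the two is audible as the gain moves.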

Then, is there no way to solve it?

As a heavy user of OTT and music streaming services, I ran into these loudness problems every day and really wanted to solve them. After much struggle, Gaudio Lab introduced a server-client architecture for loudness normalization. The server only measures the loudness of each input content according to the international standard ITU-R BS.1770 (i.e., in LUFS) and generates metadata for it. There is therefore no need to alter the original content by normalizing and re-encoding it, and no need to store multiple versions of audio streams for different target loudnesses. It is the client app that performs the actual loudness normalization. Because this happens on the client side, the target loudness can easily be set differently for each target device, and even for different circumstances (such as a noisy subway or a quiet house).
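
The division of labor described above can be sketched as follows. This is an illustration under stated assumptions, not Gaudio Lab's actual metadata format or implementation: `LoudnessMetadata`, `TARGET_LUFS`, the target values, and the function names are all hypothetical, and the measurement step itself (ITU-R BS.1770) is assumed to have already run on the server.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LoudnessMetadata:
    """Hypothetical per-content metadata, generated once on the server."""
    content_id: str
    integrated_lufs: float  # assumed to be measured per ITU-R BS.1770

# ASSUMPTION: illustrative targets only. -24 LUFS is the TV figure cited in
# this article; the smartphone value is a made-up example.
TARGET_LUFS = {"tv": -24.0, "smartphone": -16.0}

def playback_gain_db(meta: LoudnessMetadata, device: str) -> float:
    """Client side: gain needed to bring this content to the device target."""
    return TARGET_LUFS[device] - meta.integrated_lufs

def apply_gain(samples, gain_db):
    """Scale decoded samples by a static gain.

    The clamp to [-1, 1] is only a crude safeguard against overflow; a real
    player would need proper peak limiting when the gain is positive.
    """
    factor = 10.0 ** (gain_db / 20.0)
    return [max(-1.0, min(1.0, s * factor)) for s in samples]

# One stored stream per title; the client picks the gain at playback time.
quiet_movie = LoudnessMetadata("movie-001", integrated_lufs=-27.0)
loud_song = LoudnessMetadata("song-001", integrated_lufs=-5.0)

for meta in (quiet_movie, loud_song):
    for device in ("tv", "smartphone"):
        print(meta.content_id, device, f"{playback_gain_db(meta, device):+.1f} dB")
```

Because the gain is a single static value per content item, the original dynamics are preserved, and the stored stream is never re-encoded.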

The loudness problems described so far no longer exist on platforms that have adopted the Gaudio Loudness Normalization solution. However, end users do not stay within a single app. The ideal future will arrive when all apps and services with sound capabilities encode and decode the metadata in a unified way. That is our next challenge, and to this end we are now working on standardizing the technologies and metadata.

Although I have not mentioned it yet, our ears are a kind of mechanical device with a finite lifespan. Hearing loss due to excessive volume is a serious problem facing all of us in the generation of smartphones and earphones. The WHO (World Health Organization) recently issued a ‘safe listening standard’ to warn of the risk of hearing loss from loud sounds. For the safety of your ears, and for freedom from the hassle of volume settings, please keep paying attention to this matter.


CEO/CO-FOUNDER

Henney is the CEO and Co-Founder of Gaudio Lab, with 23 years of experience in audio. He holds over 1,300 audio patents worldwide, most of which are international standard patents.
