What you think you know about bit-depth is probably wrong

In the modern age of audio, you can’t move for mentions of “Hi-Res” and 24-bit “Studio Quality” music. If you haven’t spotted the trend in high-end smartphones, Sony’s LDAC Bluetooth codec, and streaming services like Tidal, then you really need to start reading this site more.

The promise is simple: superior listening quality thanks to more data, aka bit-depth. That’s 24 bits of digital ones and zeroes versus the puny 16-bit hangover from the CD era. Of course, you’ll have to pay extra for these higher quality services, but more bits are surely better, right?

“Low res” audio is often shown off as a staircase waveform. This is not how audio sampling works and isn’t what audio looks like coming out of a device.

Not necessarily. The need for higher and higher bit-depths isn’t based in scientific reality, but rather in a twisting of the truth and the exploitation of a lack of consumer awareness about the science of sound. Ultimately, companies marketing 24-bit audio have far more to gain in profit than you do in superior playback quality.

Stair-stepping isn’t a thing

To suggest that 24-bit audio is a must-have, companies (and too many others who try to explain this topic) trot out the very familiar audio quality stairway to heaven. The 16-bit example always shows a bumpy, jagged reproduction of a sine wave or other signal, while the 24-bit equivalent looks beautifully smooth and higher resolution. It’s a simple visual aid, but one that relies on ignorance of the topic and the science to lead consumers to the wrong conclusions.

Before someone bites my head off, technically speaking these stair-step examples do somewhat accurately portray audio in the digital domain. However, a stem plot/lollipop chart is a more accurate way to visualize audio sampling than these stair-steps. Think about it this way: a sample contains an amplitude at a very specific point in time, not an amplitude held for a specific length of time.

Quantization Stairs vs Stem Graphs

Using stair graphs is deliberately misleading when stem charts provide a more accurate representation of digital audio. These two graphs plot the same data points, but the stair plot appears much less accurate.

However, it’s correct that an analog to digital converter (ADC) has to fit an infinitely variable analog audio signal into a finite number of bits. A sample value that falls between two levels has to be rounded to the closest approximation, and the difference is known as quantization error or quantization noise. (Remember this, as we’ll come back to it.)
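The rounding step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular ADC is built: the `quantize` helper, the [-1.0, 1.0] signal range, and the evenly spaced levels are all assumptions made for the example.

```python
def quantize(x, bits):
    """Round x (expected in [-1.0, 1.0]) to the nearest of 2**bits evenly spaced levels.

    Hypothetical helper for illustration only.
    """
    levels = 2 ** bits
    step = 2.0 / (levels - 1)               # spacing between adjacent levels
    index = round((x + 1.0) / step)         # index of the nearest level
    index = max(0, min(levels - 1, index))  # clamp out-of-range input
    return index * step - 1.0


sample = 0.1
for bits in (2, 4, 8):
    q = quantize(sample, bits)
    # The quantization error is the input minus its rounded value.
    print(f"{bits}-bit: quantized = {q:+.5f}, error = {sample - q:+.5f}")
```

The rounding error can never be larger than half a step, so each extra bit roughly halves the worst-case error.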

However, if you look at the audio output of any audio digital to analog converter (DAC) built this century (and probably well before then), you won’t see any stair-steps. Not even if you output an 8-bit signal. So what gives?

10kHz sine wave output captured with an oscilloscope

An 8-bit, 10kHz sine wave output captured from a low-cost Pixel 3a smartphone. We can see some noise but none of the noticeable stair-steps so often portrayed by audio companies.

First, what these stair-step diagrams describe, if we apply them to an audio output, is something called a zero-order-hold DAC. This is a very simple and cheap DAC technology where a signal is switched between various levels every new sample to give an output. This is not used in any professional or half-decent consumer audio products. You might find it in a $5 microcontroller, but certainly not anywhere else. Misrepresenting audio outputs in this way implies a distorted, inaccurate waveform, but this isn’t what you’re getting.

In reality, a modern ∆Σ DAC output is an oversampled 1-bit PDM signal (right), rather than a zero-hold signal (left). The PDM signal produces a lower noise analog output when filtered.

Audio-grade ADCs and DACs are predominantly based on delta-sigma (∆Σ) modulation. Components of this caliber include interpolation and oversampling, noise shaping, and filtering to smooth out and reduce noise. Delta-sigma DACs convert audio samples into a 1-bit stream (pulse-density modulation) with a very high sample rate. When filtered, this produces a smooth output signal with noise pushed well out of audible frequencies.
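To get a feel for how a 1-bit stream can carry a smooth waveform, here is a rough Python sketch of a first-order delta-sigma modulator followed by a moving-average filter. Real audio DACs use higher-order modulators and proper reconstruction filters, so treat this as a toy: the function names, the 64x oversampling figure, and the moving average standing in for the analog filter are all assumptions for illustration.

```python
import math


def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator: samples in [-1, 1] become a +1/-1 stream."""
    integrator = 0.0
    feedback = 0.0
    stream = []
    for x in samples:
        integrator += x - feedback                      # accumulate the running error
        feedback = 1.0 if integrator >= 0.0 else -1.0   # 1-bit quantizer
        stream.append(feedback)
    return stream


def moving_average(values, width):
    """Crude low-pass filter standing in for the DAC's reconstruction filter."""
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= width:
            total -= values[i - width]
        out.append(total / min(i + 1, width))
    return out


OVERSAMPLE = 64                  # assumed oversampling ratio for this sketch
N = 64 * OVERSAMPLE
signal = [0.5 * math.sin(2 * math.pi * 4 * i / N) for i in range(N)]  # 4 cycles

stream = delta_sigma_1bit(signal)               # only ever +1.0 or -1.0
recovered = moving_average(stream, OVERSAMPLE)  # smooth again after filtering
```

The density of +1 values in `stream` tracks the input level, which is why averaging the stream over a short window gives back a close, slightly delayed copy of the sine wave rather than a staircase.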

In a nutshell: modern DACs don’t output rough-looking jagged audio samples. They output a bit stream that is noise filtered into a very accurate, smooth output. This stair-stepping visualization is wrong because of something called “quantization noise.”

Understanding Quantization noise

In any finite system, rounding errors happen. It’s true that a 24-bit ADC or DAC will have a smaller rounding error than a 16-bit equivalent, but what does that actually mean? More importantly, what do we actually hear? Is it distortion or fuzz? Are details lost forever?

It’s actually a little bit of both, depending on whether you’re in the digital or analog realm. But the key concept for understanding both is getting to grips with the noise floor, and how it improves as bit-depth increases. To demonstrate, let’s step back from 16 and 24 bits and look at very small bit-depth examples.

The difference between 16 and 24 bit-depths is not the accuracy in the shape of a waveform, but the available limit before digital noise interferes with our signal.

There are quite a few things to look at in the example below, so first a quick explanation of what we’re looking at. We have our input (blue) and quantized (orange) waveforms in the top charts, with bit depths of 2, 4, and 8 bits. We’ve also added a small amount of noise to our signal to better simulate the real world. At the bottom, we have a graph of the quantization error or rounding noise, which is calculated by subtracting the quantized signal from the input signal.

Quantization noise example between 2 bits, 4 bits, and 8 bits.

Quantization noise increases the smaller the bit depth is, by way of rounding errors.

Increasing the bit depth clearly makes the quantized signal a better match for the input signal. However, that’s not what’s important. Observe the much larger error/noise signal for the lower bit-depths. The quantized signal hasn’t removed data from our input, it has actually added in that error signal. Additive synthesis tells us that a signal can be reproduced by the sum of any other two signals, including out of phase signals that act as subtraction. That’s how noise cancellation works. So these rounding errors are introducing a new noise signal.
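The comparison in the charts can be roughly reproduced in Python. This is only a sketch under assumed values: the sine amplitude, the amount of added noise, the random seed, and the `quantize` and `rms` helpers are all made up for the example and are not the data behind the article's graphs.

```python
import math
import random


def quantize(x, bits):
    """Round x in [-1.0, 1.0] to the nearest of 2**bits evenly spaced levels."""
    levels = 2 ** bits
    step = 2.0 / (levels - 1)
    index = max(0, min(levels - 1, round((x + 1.0) / step)))
    return index * step - 1.0


def rms(values):
    """Root-mean-square level of a list of numbers."""
    return math.sqrt(sum(v * v for v in values) / len(values))


random.seed(0)  # fixed seed so the run is repeatable
n = 2000
clean = [0.8 * math.sin(2 * math.pi * 3 * i / n) for i in range(n)]
noisy = [s + random.gauss(0.0, 0.01) for s in clean]  # small added noise

error_rms = {}
for bits in (2, 4, 8):
    quantized = [quantize(x, bits) for x in noisy]
    error = [x - q for x, q in zip(noisy, quantized)]  # input minus quantized
    error_rms[bits] = rms(error)
    print(f"{bits}-bit: RMS quantization error = {error_rms[bits]:.5f}")
```

The quantized signal is exactly the input minus this error signal, so a larger error at low bit depths means more added noise, not missing detail.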

This isn’t just theoretical: you can actually hear more and more noise in lower bit-depth audio files. To understand why, examine what’s happening in the 2-bit example with very small signals, such as before 0.2 seconds. Very small changes in the input signal produce big changes in the quantized version. This is the rounding error in action, which has the effect of amplifying small-signal noise. So once again, noise becomes louder as bit-depth decreases.

Quantization doesn’t remove data from our input, it actually adds in a noisy error signal.

Think about this in reverse too: it’s not possible to capture a signal smaller than the size of the quantization step, ironically known as the least significant bit. Small signal changes have to jump up to the nearest quantization level. Larger bit depths have smaller quantization steps and thus smaller levels of noise amplification.

Most importantly though, note that the quantization noise amplitude remains consistent, regardless of the amplitude of the input signals. This demonstrates that noise happens at all the different quantization levels, so there’s a consistent level of noise for any given bit-depth. Larger bit-depths produce less noise. We should, therefore, think of the differences between 16 and 24 bit-depths not as the accuracy in the shape of a waveform, but as the available limit before digital noise interferes with our signal.
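A quick way to check the "consistent noise level" claim is to quantize quiet, medium, and loud versions of the same lightly noisy sine wave at a fixed bit depth and compare the error levels. The sketch below assumes 8 bits, a [-1.0, 1.0] signal range, and an arbitrary noise level and seed; none of these values come from the article's charts.

```python
import math
import random

BITS = 8
LEVELS = 2 ** BITS
STEP = 2.0 / (LEVELS - 1)  # size of one quantization step


def quantize(x):
    """Round x in [-1.0, 1.0] to the nearest 8-bit level."""
    index = max(0, min(LEVELS - 1, round((x + 1.0) / STEP)))
    return index * STEP - 1.0


def error_rms(amplitude, n=5000):
    """RMS quantization error for a lightly noisy sine wave of the given amplitude."""
    total = 0.0
    for i in range(n):
        x = amplitude * math.sin(2 * math.pi * 7.3 * i / n) + random.gauss(0.0, 0.01)
        total += (x - quantize(x)) ** 2
    return math.sqrt(total / n)


random.seed(1)  # fixed seed so the run is repeatable
results = {amp: error_rms(amp) for amp in (0.1, 0.5, 0.9)}
for amp, err in results.items():
    print(f"amplitude {amp}: RMS error = {err:.5f}")
```

In this sketch the error level barely moves as the signal gets louder, which is what you would expect if the noise is set by the step size rather than by the music.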

Can you hear this noise?

Now that we are talking about bit-depth in terms of noise, let’s go back to our above graphics one last time. Note how the 8-bit example looks like an almost perfect match for our noisy input signal. This is because its 8-bit resolution is actually sufficient to capture the level of the background noise. In other words: the quantization step size is smaller than the amplitude of the noise, or the signal-to-noise ratio (SNR) is better than the background noise level.

The equation 20 × log10(2^n), where n is the bit-depth, gives us the SNR. An 8-bit signal has an SNR of 48dB, 12 bits is 72dB, 16-bit hits 96dB, and 24 bits a whopping 144dB. This is important because we now know that we only need a bit-depth with enough SNR to accommodate the dynamic range between our background noise and the loudest signal we want to capture in order to reproduce audio as perfectly as it appears in the real world. It gets a little tricky moving from the relative scales of the digital realm to the sound pressure-based scales of the physical world, so we’ll try to keep it simple.
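If you want to check those figures yourself, the formula is a one-liner in Python. The `snr_db` name is just a label chosen for this example.

```python
import math


def snr_db(bits):
    """Theoretical SNR in dB for a given bit depth: 20 * log10(2 ** bits)."""
    return 20 * math.log10(2 ** bits)


for bits in (8, 12, 16, 24):
    print(f"{bits}-bit: {snr_db(bits):.1f} dB")
```

Each extra bit adds roughly 6dB, which is why the numbers climb in such tidy steps.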

We need a bit-depth with enough SNR to accommodate our background noise in order to capture our audio as perfectly as it sounds in the real world.

Your ear has a sensitivity ranging from 0dB (silence) to about 120dB (painfully loud sound), and it can typically discern volume differences of just 1dB. So the dynamic range of your ear is about 120dB, or close to 20 bits.

However, you can’t hear all this at once, as the tympanic membrane, or eardrum, tightens to reduce the amount of volume actually reaching the inner ear in loud environments. You’re also not going to be listening to music anywhere near this loud, because you’ll go deaf. Furthermore, the environments you and I listen to music in are not as silent as healthy ears can hear. A well-treated recording studio may take us down to below 20dB for background noise, but listening in a bustling living room or on the bus will obviously worsen the conditions and reduce our need for a high dynamic range.

The human ear has a huge dynamic range, just not all at one time. Masking and the ear’s own hearing protection reduce its effectiveness.

On top of all that: as loudness increases, higher frequency masking takes effect in your ear. At low volumes of 20 to 40dB, masking doesn’t occur except for sounds close in pitch. However, at 80dB sounds below 40dB will be masked, while at 100dB sounds below 70dB are impossible to hear. The dynamic nature of the ear and listening material makes it hard to give a precise number, but the real dynamic range of your hearing is likely in the region of 70dB in an average environment, down to just 40dB in very loud environments. A bit depth of just 12 bits would probably have most people covered, so 16-bit CDs give us plenty of headroom.
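Turning a dynamic range in dB back into a bit depth is just the SNR formula in reverse. Here is a small sketch, with `bits_needed` as a made-up helper name, that rounds up to the next whole bit.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # roughly 6.02 dB per bit


def bits_needed(dynamic_range_db):
    """Smallest whole number of bits whose theoretical SNR covers the given range."""
    return math.ceil(dynamic_range_db / DB_PER_BIT)


for label, db in (("full range of the ear", 120),
                  ("average environment", 70),
                  ("very loud environment", 40)):
    print(f"{label}: {db} dB needs about {bits_needed(db)} bits")
```

This lines up with the figures above: around 20 bits for the ear's full 120dB range and 12 bits for a 70dB listening environment.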

Human Hearing Masking Patterns

High-frequency masking occurs at loud listening volumes, limiting our perception of quieter sounds. (Image: HyperPhysics)

Most instruments and recording microphones introduce noise too (especially guitar amps), even in very quiet recording studios. There have also been a few studies into the dynamic range of different genres, including one that shows a typical 60dB dynamic range. Unsurprisingly, genres with a greater affinity for quiet parts, such as choir, opera, and piano, showed maximum dynamic ranges around 70dB, while “louder” rock, pop, and rap genres tended towards 60dB and below. Ultimately, music is only produced and recorded with so much fidelity.

You might also be familiar with the music industry “loudness wars”, which certainly defeat the purpose of today’s Hi-Res audio formats. Heavy use of compression (which boosts noise and attenuates peaks) reduces dynamic range. Modern music has considerably less dynamic range than albums from 30 years ago. Theoretically, modern music could be distributed at lower bit-rates than old music. You can check out the dynamic range of many albums here.

A photo of a stack of CDs on a wooden table.

CD quality may be “only” 16-bit, but it’s overkill for quality.

16 bits is all you need

This has been quite a journey, but hopefully, you’ve come away with a much more nuanced picture of bit-depth, noise, and dynamic range than those misleading staircase examples you so often see.

Bit-depth is all about noise: the more bits of data you have to store audio, the less quantization noise will be introduced into your recording. By the same token, you’ll also be able to capture smaller signals more accurately, helping to drive the digital noise floor below that of the recording or listening environment. That’s all we need bit-depth to do. There’s no benefit in using huge bit-depths for audio masters.

Surprisingly, 12 bits is probably enough for a decent sounding music master and to cater to the dynamic range of most listening environments. However, digital audio transports more than just music, and examples like speech or environmental recordings for TV can make use of a wider dynamic range than most music does. Plus a little headroom for separation between loud and quiet never hurt anyone.

On balance, 16 bits (96dB of dynamic range, or 120dB with dithering applied) accommodates a wide range of audio types, as well as the limits of human hearing and typical listening environments. The perceptual increases in 24-bit quality are highly debatable if not simply a placebo, as I hope I’ve demonstrated. Plus, the increase in file sizes and bandwidth makes them unnecessary. The type of compression used to shrink down the file size of your music library or stream has a much more noticeable impact on sound quality than whether it’s a 16 or 24-bit file.
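To put a rough number on that file size increase, uncompressed PCM data rate is simply sample rate times bit depth times channel count. The sketch below assumes CD-style 44.1kHz stereo for both cases, which is an assumption for the sake of comparison: Hi-Res files often use higher sample rates as well, which would widen the gap further.

```python
def pcm_bitrate(sample_rate_hz, bits, channels):
    """Uncompressed PCM data rate in bits per second."""
    return sample_rate_hz * bits * channels


SAMPLE_RATE = 44_100  # assumed: CD-standard sample rate
CHANNELS = 2          # assumed: stereo

rate_16 = pcm_bitrate(SAMPLE_RATE, 16, CHANNELS)
rate_24 = pcm_bitrate(SAMPLE_RATE, 24, CHANNELS)

print(f"16-bit: {rate_16 / 1000:.1f} kbps")
print(f"24-bit: {rate_24 / 1000:.1f} kbps")
print(f"24-bit is {rate_24 / rate_16:.1f}x the size of 16-bit")
```

At the same sample rate, a 24-bit file carries 50% more data than its 16-bit equivalent before any compression is applied.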
