
Advanced equalization for headphones

It’s time to talk about equalization, a vast and complex subject that is unfortunately often the victim of stereotypes.
Let’s start by analyzing the main one: that equalizing worsens sound quality and increases distortion. This misconception comes from the wrong approach that many people take to the subject. Equalizing does not mean turning the tone knobs on a preamp and expecting miracles, nor does it mean applying the equalization presets in your playback software, let alone playing with a fixed-band equalizer and expecting noteworthy improvements.
As with any aspect of audio reproduction, to master a tool you need to study a little. The lazy audiophile often does not want to do that, and limits himself to a quick test before drawing erroneous conclusions (yet the distortion of tubes and the RIAA equalization of vinyl are fine 😛).

Let’s start with the limits of equalization: it can’t work miracles, but it can almost always greatly improve the performance of a headphone.
How freely you can apply an equalization curve depends a lot on the drivers your headphones use. Dynamic drivers are the most critical: among the various driver types they have the highest average distortion, so if you go too far with the equalization you risk making things worse. On the other hand, planar, electrostatic, BA and ribbon drivers have less intrinsic distortion and allow more aggressive curves.
In this guide I will not cover the basics of using a parametric equalizer. Instead I will show, step by step, the path I take myself to reach a satisfying sonic result. Mind that this is the HARD way of doing EQ, but by far the best for results (imo).

Step 1: Get a Measurement of Your Headphone’s Frequency Response

Visit one of these two links and search for your model: 
https://github.com/jaakkopasanen/AutoEq
https://www.reddit.com/r/oratory1990/wiki/index

For over-ear headphones, the measurements are generally reliable and the proposed equalizations are a good starting point. Measurements above 10 kHz, however, should not be considered reliable.

For in-ear headphones, the measurements must always be taken with a grain of salt: the tips used and the insertion depth of the IEM can change the measurement significantly, especially at high frequencies and at the lowest frequencies.

Consequently, my process is to always start with the proposed equalization curve and then modify it, with numerous attempts, by ear.

It is extremely important to take plenty of breaks during this process so that your ears do not get used to the sound. This way you will avoid being disappointed in your work the following day.

Step 2: Prepare the equalization curve

To prepare the equalization curve I use the RePhase software. Let’s start with the curve I created for my Tin Hifi P1:

The curve consists of: 2 low-shelving filters, 1 high-shelving filter, 4 peak filters and 1 Linkwitz-Riley low-pass filter.

As you can guess, such a curve can only be approximated with a parametric equalizer.
I will not discuss the differences between the various filter types: it would make this guide extremely verbose, and the literature on the subject is easy to find on the internet (read it!).
ALWAYS remember that you MUST EQ by subtraction. First build your curve, then lower the output gain (“General” tab) until the whole curve sits below 0 dB; otherwise the boosted frequencies can clip.
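To make the "EQ by subtraction" rule concrete, here is a minimal Python sketch that estimates how much the output must be lowered so that the combined curve stays at or below 0 dB. It is only an illustration: it models peak filters with the well-known RBJ "Audio EQ Cookbook" biquad formulas, which is an assumption and not necessarily what RePhase does internally, and the filter values in the example are hypothetical.

```python
import cmath
import math

def peaking_biquad(f0, gain_db, q, fs):
    """RBJ cookbook peaking filter; returns (b, a) normalized so that a[0] == 1."""
    a_lin = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    b = [1 + alpha * a_lin, -2 * math.cos(w0), 1 - alpha * a_lin]
    a = [1 + alpha / a_lin, -2 * math.cos(w0), 1 - alpha / a_lin]
    return [x / a[0] for x in b], [x / a[0] for x in a]

def magnitude_db(b, a, f, fs):
    """Magnitude response of one biquad at frequency f, in dB."""
    z1 = cmath.exp(-1j * 2 * math.pi * f / fs)  # this is z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20 * math.log10(abs(num / den))

def required_preamp_db(filters, fs=48000, points=500):
    """filters: list of (f0_hz, gain_db, q) peak filters.

    Returns the (negative or zero) gain that brings the highest point
    of the combined curve down to 0 dB, checked on a log grid 20 Hz..20 kHz.
    """
    freqs = [20 * 1000 ** (i / (points - 1)) for i in range(points)]
    peak = max(
        sum(magnitude_db(*peaking_biquad(f0, g, q, fs), f, fs) for f0, g, q in filters)
        for f in freqs
    )
    return -max(peak, 0.0)

# Hypothetical curve, not the Tin Hifi P1 one from this article
example_filters = [(60, 4.0, 0.7), (3000, -3.0, 2.0), (8000, 5.0, 3.0)]
print(f"Lower the output by {required_preamp_db(example_filters):.1f} dB")
```

A curve made only of cuts needs no extra attenuation, which is why the function never returns a positive value.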

To export your filter you’ll need to set a few parameters:

  • Number of taps: for short to medium-length filters (sample rates up to 88200/96000 Hz) go for >= 65536 taps. For longer and more complex filters (up to 384 kHz) go with 131072 or higher.
  • Optimization: > moderate, -120dB or higher
  • Rate: the rate at which you want to apply the filter. You can’t apply a 96000Hz filter to a 44100Hz file! This must match the input file!
  • Format: it depends on the software you use to apply the EQ; more on this in the next step.
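To get a feeling for what the number of taps means in practice, the short Python sketch below computes two rule-of-thumb figures for a filter: the delay it adds and its approximate frequency resolution. It assumes a linear-phase (symmetric) FIR, where the delay is half the filter length; the function name is made up for this example.

```python
def fir_stats(taps, sample_rate):
    """Return (latency_seconds, resolution_hz) for a linear-phase FIR filter.

    latency: a symmetric FIR delays the signal by (taps - 1) / 2 samples.
    resolution: rough frequency spacing the filter can resolve, sample_rate / taps.
    """
    latency = (taps - 1) / 2 / sample_rate
    resolution = sample_rate / taps
    return latency, resolution

# The tap counts and rates mentioned in the list above
for taps, fs in [(65536, 88200), (65536, 96000), (131072, 384000)]:
    latency, resolution = fir_stats(taps, fs)
    print(f"{taps} taps @ {fs} Hz: ~{latency * 1000:.0f} ms delay, ~{resolution:.2f} Hz resolution")
```

The same number of taps gives a coarser resolution at a higher sample rate, which is why higher rates call for more taps.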

Step 3: Apply your filter using a convolution engine


On-the-fly EQ:

  • On Windows: system-wide with EqualizerAPO (it accepts up to 64-bit stereo .wav). Individual players like Roon, JRiver and Foobar2000 can also do convolution.
  • On macOS: I’m less informed here. I know there are some AU/VST plugins that you can use to apply convolution.
  • On Android: JamesDSP or Viper4Android. Both accept up to 32/64-bit stereo .wav, and the file needs to be 48 kHz.

Offline EQ:

Another option is to apply your equalization directly to your music files. I personally do it this way since my DAP does not have an integrated equalizer (at least not a decent one).

For this you can use SoX (up to 32-bit .txt files). On Windows you can use a .bat script to convert multiple files at once, for example:

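rem Work from the folder that contains this script and coeffs.txt
cd /d "%~dp0"
if not exist output mkdir output
rem Resample each dropped file to 88.2 kHz, apply the FIR filter, write 24-bit FLAC
FOR %%A IN (%*) DO sox -V4 -S %%A -t flac -C 8 -b 24 "output/%%~nA.flac" rate -vIn 88200 fir coeffs.txt
pause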

This script will upsample all the input files to 24/88.2 and will apply the filter in “coeffs.txt”. Save it, then just drag and drop your input files on the batch script.
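If you are not on Windows, the same batch job can be sketched in Python. This is only a rough equivalent under a few assumptions: SoX is installed and on your PATH, "coeffs.txt" sits in the current folder, and the function names are made up for this example. The SoX arguments are copied unchanged from the batch script above.

```python
import subprocess
from pathlib import Path

def build_sox_command(input_file, output_dir="output", coeffs="coeffs.txt", rate=88200):
    """Build the same SoX call as the batch script, as an argument list."""
    out_file = Path(output_dir) / (Path(input_file).stem + ".flac")
    return [
        "sox", "-V4", "-S", str(input_file),
        "-t", "flac", "-C", "8", "-b", "24", str(out_file),
        "rate", "-vIn", str(rate), "fir", coeffs,
    ]

def convert_all(files, output_dir="output"):
    """Run SoX once per input file; stops at the first failure."""
    Path(output_dir).mkdir(exist_ok=True)
    for input_file in files:
        subprocess.run(build_sox_command(input_file, output_dir), check=True)

# Example call (hypothetical file names):
# convert_all(["track01.wav", "track02.flac"])
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces.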

Conclusions

This is not meant to be a technical or detailed guide, which is why I avoided going into too much depth on certain topics. The main goal is to arouse the curiosity of the reader (yes, you) so that you try it, ask questions, and eventually dig deeper into the topic.

F.A.Q.

When I shared this article, a lot of people asked me to elaborate on and better explain some steps. I have copied some of their questions below:

Q: I find step 2 not clear; I presume this is due to never having used RePhase (?).
In particular, what’s most unclear to me is why and how you mention sample-rate-related aspects there, given that once you are “done” with the job and have generated a “list of filters” to be applied with a convolution PEQ (I use the one inside UAPP, for instance), those will not differ depending on the input file resolution (or will / should they)?

A: First of all, a filter created with RePhase and the EQ done by UAPP (as an example) are two different kinds of filter. RePhase creates a FIR (finite impulse response) filter, while UAPP uses an IIR (infinite impulse response) filter. At the end of this answer I’ll share a link if you want to understand the difference in depth (a bit of math and signal processing is involved), but let’s simplify and say that the main difference is in the time domain.
In an IIR filter (like a common PEQ), the higher the Q factor, the greater the phase shift (= delay), which means that the frequencies you EQ will literally be delayed compared to the frequencies you don’t EQ. In a linear-phase FIR filter the delay is constant across ALL frequencies. This is a HUGE pro, because no frequencies are delayed relative to others, but also a HUGE con: the entire audio is delayed (more taps = more precise filter = more delay). And why is that a con? Well, try watching a film with a long FIR filter ahahah.
The second con is computational power: an IIR filter is very lightweight, a FIR a little less so (though this is only noticeable if you use it for multichannel convolution or for digital crossovers with multiple ways; it’s not a problem for a stereo audio file).
About matching the sample rate of the FIR filter to the input file, here is an example. Take a brickwall filter at 1 kHz (so everything above 1 kHz is erased). If you apply a 44.1 kHz filter to an 88.2 kHz file, the brickwall will be at 2 kHz. If you apply a 176.4 kHz filter to the same 88.2 kHz file, the brickwall will be at 500 Hz.
https://community.sw.siemens.com/s/article/introduction-to-filters-fir-versus-iir
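The sample-rate mismatch in the answer above comes down to one ratio. A FIR filter is just a list of samples, so playing it back at a different rate stretches or compresses it in time, and every frequency in its response moves by the same factor. A minimal Python sketch (the function name is made up for this example):

```python
def effective_frequency(design_hz, filter_rate, file_rate):
    """Where a feature designed at design_hz ends up when a FIR filter
    exported for filter_rate is applied to audio sampled at file_rate."""
    return design_hz * file_rate / filter_rate

# The brickwall example from the answer above
print(effective_frequency(1000, 44100, 88200))   # 44.1 kHz filter on an 88.2 kHz file
print(effective_frequency(1000, 176400, 88200))  # 176.4 kHz filter on an 88.2 kHz file
print(effective_frequency(1000, 88200, 88200))   # matching rates: nothing moves
```

This is why the exported rate must match the input file: the whole equalization curve would otherwise slide up or down the spectrum.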

Q1: In your post you mention “en passant” upsampling to 24-bit/88.2 kHz. May I ask you to write a separate informative post on why this is a good idea, why exactly those values and not others, and the pros/cons of how to do it?
Q2: What I do not understand is the need to resample to 88.2 kHz. By Nyquist’s theorem, 44.1 kHz already captures the frequencies audible to us, since the highest frequency component of most songs is around 22 kHz.

A: When you do signal processing such as applying a FIR filter, the software converts the input to a higher bit depth, usually 32-bit or 64-bit floating point. If you output at 16 bits there are two consequences:
1- Truncating the audio down to 16 bits causes distortion artifacts that need to be covered with dither. If you output at a higher bit depth you don’t need it.
2- Since equalizing is always done by subtraction, doing it in 16 bits will certainly lose dynamic range. If you apply a -6 dB gain to a 16-bit file you effectively get a 15-bit file (as an example), and that information is lost; using a higher bit depth avoids the loss.
About sample rate: of course the problem is not the audible frequencies, it is how digital processing and digital-to-analog converters work. Processing audio at a higher sample rate avoids foldback aliasing (harmonics above the Nyquist frequency fold back into the audible range), as predicted by the Nyquist theorem. Another problem is that you must filter out high-frequency noise above the audible range. At 44.1 kHz you have roughly 2 kHz to do it (from 20 kHz to 22.05 kHz), which requires a pretty damn steep filter, and the steeper the filter, the more ringing. To avoid ringing you can use a gentler filter, but it will reach into the audio band and attenuate the high frequencies; with upsampling you can use a gentler filter without affecting the audible range. As pointed out in a paper by Dan Lavry, the sweet spot is to sample at about 60 kHz (http://lavryengineering.com/pdfs/lavry-sampling-theory.pdf), which is the point of diminishing returns. Going too high with the sample rate causes other problems, such as intermodulation distortion. Some DAC chips already upsample your file, some give you the choice, and others just process the input as it is. Since I use different DACs, some of which upsample and some of which don’t, I always use the next synchronous sample rate: in the worst case nothing happens, in the best case I avoid some artifacts.
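Two of the numbers in this answer are easy to check with a few lines of Python. The sketch below assumes the usual rule that one bit corresponds to about 6.02 dB in a fixed-point format and takes 20 kHz as the upper limit of hearing; the function names are made up for this example.

```python
import math

DB_PER_BIT = 20 * math.log10(2)  # about 6.02 dB per bit

def bits_lost(gain_db):
    """Bits of resolution given up when attenuating by gain_db in fixed point."""
    return max(0.0, -gain_db) / DB_PER_BIT

def transition_band_hz(sample_rate, audible_limit=20000):
    """Room between the audible limit and the Nyquist frequency."""
    return sample_rate / 2 - audible_limit

print(f"-6 dB on a 16-bit file leaves about {16 - bits_lost(-6):.1f} bits")
for fs in (44100, 88200):
    print(f"{fs} Hz: {transition_band_hz(fs):.0f} Hz available for the filter slope")
```

At 88.2 kHz the filter has more than ten times the room it has at 44.1 kHz, which is what allows a gentler slope.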

Q: [about oversampling] Go as hard and as deep as you can, potential benefits abound. Oversampling is a legit nice way to get better sound quality and you can verify the math theoretically for any real input. The OP has made a really good post. A few things though. More taps is always better. Always process and output at possibly 192 kHz/176 kHz or the max your DAC can support when doing these actions. As said above, the theory supports this approach.

A: I totally agree that more taps is better, but as always there’s a point of diminishing returns. The numbers I specified are, in my experience, the recommended starting point for a really good filter; you can always go higher, but the disadvantages will outweigh the advantages. More taps = more CPU usage and a longer filter, and a longer filter = more latency: with too many taps, when you switch songs you’ll have to wait 1, 2, 3 seconds for the song to change. Like everything in IT, it’s always a compromise between performance and user experience.
As for sample rates, it is true that in the digital domain a higher sample rate can always be better during signal processing. But the next step is audio reproduction, and when you leave the digital domain and enter the analog domain, higher frequencies mean that some circuit components can lose linearity; if that happens you get IMD. As with taps, there’s a point of diminishing returns.
