Let’s talk about 32-bit vs. 64-bit audio processing in VST plugins. And not just talk: I also have [3 plugins to perform audio tests]!
1. 32-bit single precision floating point format (a.k.a. float)
Used in most DAWs. The main format for VST plugins to process (AudioEffect::processReplacing).
– Sign bit: 1 bit
– Exponent width: 8 bits
– Significand precision: 24 bits (23 explicitly stored)
The sample values for 0 dBFS are between -1.0 and 1.0, so it’s effectively 24 bits per sample.
2. 64-bit double precision floating point format (a.k.a double)
An optional format for VST plugins to process (AudioEffect::processDoubleReplacing, available since VST 2.4). Some DAWs declare that if all plugins in the chain use 64-bit processing, the whole signal chain runs at 64-bit.
– Sign bit: 1 bit
– Exponent width: 11 bits
– Significand precision: 53 bits (52 explicitly stored)
The sample values for 0 dBFS are between -1.0 and 1.0, so it’s effectively 53 bits per sample.
There are 2 main questions to answer before using double instead of float:
- How big is the performance degradation?
- Is the audio quality improvement noticeable?
My answers to these 2 questions are:
For question 1:
1a) Long mathematical calculations (not DSP) run ~2 times faster with single precision.
1b) Typical (my) plugin code, with a lot of boundary conditions and some calculations between them, actually works faster with double precision! I don’t know whether this is related to branch prediction, instruction reordering in the CPU, data alignment, or the compiler optimizer (Visual C++ 2005 with full optimizations on).
1c) SSE/SSE2 with manual assembly coding for bulk data processing (such as convolution) runs 2 times faster with single precision, because SSE can process 4 floats instead of 2 doubles simultaneously.
For question 2:
To my ears there is a noticeable difference in sound! But what is the compromise between speed and quality? I think the compromise is to use dithering!
Check your ears and monitoring!
These are [3 versions of the VST clipper plugin (x86, x64)] to check. It’s the same plugin as in my last post, but I added a new parameter, “Ovr.mode” – Signal/GR. If “Signal” is set, the whole signal is upsampled, processed, and downsampled. If “GR” is set, only the Gain Reduction signal is oversampled. To hear the difference better, use the “Signal” oversampling mode.
Version 1 [double]. Upsampling and downsampling use 64-bit double-precision processing.
Version 2 [float]. Upsampling and downsampling use 32-bit single-precision processing.
Version 3 [float-dither]. Upsampling and downsampling use 32-bit single-precision processing, but with dithering.
TEST 1. Process the same audio sample with all 3 versions of the plugin and try to hear the difference. Use “Signal” mode. Set all other parameters to taste, but they must be the same for all of the tests! (For example, I used 0 dB gain, -9 dB threshold, “A” shape, hardclip “Off”.)
What is my result? I used a 32-bit float wave file for output, so even in the case of internal 64-bit audio processing I heard the output in 32-bit. But there is a difference! And the difference is in depth: the “double” version sounds just amazing, the “float” version lacks back planes in the sound, and the “float-dither” version sounds much better than the regular “float”.
TEST 2. Now use “GR” mode with the same audio samples and the same settings.
In “GR” mode, for performance reasons, if gain reduction is not needed for a relatively long period, the signal bypasses oversampling. That’s why in this mode the difference is hardly noticeable. But I think the “float” version sounds bad, while both the “double” and “float-dither” versions sound quite good! So my choice was “float-dither” (the clipper plugin in my last post had exactly these settings hardcoded).