Almost 10 years ago, in the first days of December 2010, I was sitting at my then temporary workplace in the basement of MSM Studios Munich, a mastering studio for which I had done plenty of audio restoration work on the great Sonic Solutions NoNoise system in the late ’90s. One audio problem that was still impossible to repair digitally was wow and flutter, an often frustrating situation. The only thing you could do (and it is still the best method today) was to make a proper transfer and try to track the wow and flutter through the bias tone on the tape. Plangent Processes invented and perfected that method. Unfortunately, it doesn’t deal with pitch problems spanning multiple generations (the problem has to have been caused on the tape being transferred), and since there is no bias tone on records and cylinders, it can’t do anything useful there either. The biggest problem to this day, however, is that transfers are often simply done badly and the original medium is trashed afterwards. So a purely digital method was sorely missed.
MSM studio owner Stefan Bock and I started talking during a coffee break about their current restoration project, which was impossible to release because of its wow problems. I got curious and started doing experiments with Melodyne, which were only half successful.
As I knew Celemony founder Peter Neubäcker personally from earlier conversations about experimental sound design, I approached him. I was lucky: the problem interested him. A stunning two days later he came up with a working solution, based of course on his decades of research and development on the Melodyne algorithm, which had just led to its ground-breaking polyphonic version, the patented DNA (Direct Note Access). Through DNA he split the polyphonic music material into as many monophonic elements as possible and compared them. If the algorithm finds pitch movement common to all elements, it must be the pitch error and needs correction. If there is pitch movement in only one of the elements, it is, for example, the vibrato of a solo instrument and needs to be left untouched.
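The idea of separating common pitch movement from individual pitch movement can be illustrated with a toy sketch. This is not Celemony’s algorithm, which is proprietary; it only assumes that each monophonic element has already been reduced to a track of pitch deviations in cents, and it uses a per-frame median as a stand-in for "movement common to all elements". The function name and the test signals are made up for illustration.

```python
import math
from statistics import median


def split_common_pitch_drift(tracks):
    """Split pitch tracks into a shared drift and per-element residuals.

    tracks: list of equal-length lists, one per monophonic element,
            each holding pitch deviation in cents per analysis frame.
    Returns (common, residuals): the per-frame median across elements,
    and each track with that median subtracted.
    """
    if not tracks:
        raise ValueError("need at least one pitch track")
    n = len(tracks[0])
    if any(len(t) != n for t in tracks):
        raise ValueError("all pitch tracks must have the same length")
    # Movement shared by most elements is treated as the speed error.
    common = [median(t[i] for t in tracks) for i in range(n)]
    # What remains belongs to the individual element (e.g. vibrato).
    residuals = [[t[i] - common[i] for i in range(n)] for t in tracks]
    return common, residuals


# Hypothetical test data: 2 seconds at 100 analysis frames per second.
frames = 200
wow = [20.0 * math.sin(2 * math.pi * 0.5 * i / 100) for i in range(frames)]
vibrato = [15.0 * math.sin(2 * math.pi * 6.0 * i / 100) for i in range(frames)]

bass = list(wow)                                  # only the tape error
piano = list(wow)                                 # only the tape error
violin = [w + v for w, v in zip(wow, vibrato)]    # tape error plus vibrato

common, residuals = split_common_pitch_drift([bass, piano, violin])
```

In this constructed case `common` recovers the wow curve, which would be corrected, while the violin’s residual keeps its vibrato, which would be left alone. Real material is far messier: elements start and stop, and pitch tracking itself is the hard part.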
The birth of Capstan
Because the core algorithm was such a straightforward step from the Melodyne development, it needed only a few more months of refinement and user interface design, in which I was heavily involved. Capstan 1.0 was proudly released and presented at the AES convention in London in May 2011.
This has been our product video ever since:
Spreading the word
Among our early supporters and multipliers were, next to MSM Studios and many others, notably Andrew Rose from Pristine Classical and, above all, John Polito and Ellis Burmann from Audio Mechanics, an LA-based restoration studio which had previously been working on wow and flutter solutions of its own. They not only provided valuable input for further product development but also helped greatly in getting the word out, especially by giving an extensive presentation at the 2012 AMIA “The Reel Thing” symposium. I can’t thank them enough.
Today Capstan is the industry standard in a small niche market in which we thought we’d be alone. Interestingly enough, we had sudden competition from the very beginning. Cedar Audio was developing its own “Respeed” plugin for the Cambridge hardware at the same time and released it hastily, just a little before us, after they heard of our efforts. Fortunately, my tests showed that it doesn’t work all that well (despite its price). Unfortunately, though, quite a few people bought it and had no funds left when we came out with Capstan.
Izotope RX8 Advanced Audio Repair
What becomes clear pretty soon is that Izotope’s plugin is conceived for periodic pitch errors; it doesn’t allow for manual editing of the speed curve like Capstan does. For correcting wow you can guide the algorithm to the speed of the wow (slow/medium/fast) and adjust its intensity, while for flutter you can only adjust intensity. That’s it.
One major difference in workflow is that Capstan analyzes the material once, when the audio file is opened, and every later edit or setting change takes effect almost instantaneously. In RX8 you have to render a preview for every new setting you want to hear, and that can take a very long time, especially in Flutter mode. (There is definitely room for optimization, as the algorithm only uses a single thread.)
I tested first with the tutorial examples which we created for Capstan. They are prepared examples, for copyright reasons, but the speed error curves are taken from real restoration material, so the error is real. And don’t forget: as simple as it is to introduce wow and flutter into an audio file, it is hard to get rid of it again. I also tested with actual client restoration material, with similar results; I just can’t publish that for NDA reasons.
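To give an idea of how simple the "introducing" direction is, here is a minimal sketch that plays a signal back along a time-varying speed curve using linear interpolation. It is not how the Capstan tutorial files were made; the function name, sample rate and wow parameters are assumptions chosen purely for illustration.

```python
import math


def apply_speed_curve(samples, speed):
    """Resample `samples` along a time-varying playback speed.

    speed[k] is the playback rate used for output sample k
    (1.0 = nominal, 1.01 = 1 % too fast). Linear interpolation is
    used between input samples; output stops when the input runs out.
    """
    out = []
    pos = 0.0  # fractional read position in the input
    for s in speed:
        i = int(pos)
        if i + 1 >= len(samples):
            break
        frac = pos - i
        out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
        pos += s
    return out


# Hypothetical example: a 440 Hz tone with a 0.5 Hz wow of +/- 1 %.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(2 * sample_rate)]
speed = [1.0 + 0.01 * math.sin(2 * math.pi * 0.5 * n / sample_rate)
         for n in range(len(tone))]
wowed = apply_speed_curve(tone, speed)
```

Going the other way would mean resampling with the inverse curve, which is trivial only if the curve is known. With a real recording it is not, and estimating it from the music alone is the actual problem.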
Capstan vs. Izotope comparison tests
The first example is a periodic medium to fast wow, which both programs have little difficulty getting rid of.
The second example shows a flutter problem. Flutter isn’t easy for Capstan, unfortunately. In fact, most of my user support conversations deal with really fast flutter, and most of the time I have to tell users that Capstan can’t cure it. To my relief as a sound restoration engineer, RX8 excels here. Many problems I can’t tackle with Capstan can now be tackled with RX8! More on that later.
Example 3 presents a pitch drop of the kind that often happens at a sticky tape splice. These problems can be repaired easily with Capstan. RX8, being optimized for periodic problems, can’t do anything useful here. I experimented a little with its other methods (such as manipulating or drawing a speed curve), and it seems possible to achieve at least something if there is no access to Capstan, but it is a far less workable solution and left audible errors every time.
Example 4 is a really bad but real file which Capstan can correct beautifully. RX8 modulates to a wrong key after the first chord, and I don’t see any way to steer RX8 into correcting this glitch. The result is unusable.
Then I used real client material, a snippet from a Furtwängler Beethoven recording which we were also allowed to use during the AES 2011 show. It presents a combination of wow, pitch drop and flutter problems. Capstan can deal with most of it but leaves some residual flutter behind. RX8 gets rid of this flutter entirely but can’t deal with the coarse problems. In the end, as so often in audio restoration, it is a combination of tools that brings the best result: use Capstan to remove the wow and pitch drops, and afterwards RX8 to remove the flutter.
The journey continues
My Capstan journey was and still is a fascinating and inspiring one. I have had so many interesting conversations, made friends and learned about wonderfully surprising curiosities of audio recording history. And then there are those special occasions on which I can be of help with very unusual problems. For example, I remember very well how I once built a custom MAX/MSP patch to synchronize perfect audio to badly aligned cameras, the material being the Beatles rehearsing in the studio and trying to agree on ideas. Or I simply enjoyed the new transfers of the first audiophile productions by Everest Records (1958-61), re-issued by Countdown Media. I’m sure this Capstan journey won’t cease to surprise me in the future.
My fondest memory, however, is sitting on the large square in Bologna at night after a hot day at the FIAF symposium 2016. “Modern Times” by Charlie Chaplin was being shown with a live orchestra. How beautiful that was. Nothing beats live.