You’d Think By Now We’d Have Solved This Audio Issue


A lot of tech questions involve trial and error to answer, mostly because those particular questions are ones that don’t get asked all that frequently, or exist in a realm of specialists who can hash out solutions in bare-metal terms that evade the grasp of someone who wants things to “just work”.

One problem I’ve run across on several occasions is that of audio streams. Normally we don’t care about audio streams because we mostly send audio to a single output, like desktop speakers or a set of headphones. It’s rare that we need to split output, but there are situations where being able to direct audio to different destinations has its uses.

Case in point: podcasting. A lot of the advice on how to get multiple inputs focuses on people being in the same room, which is the optimal configuration, but a lot of people who podcast aren’t even in the same country as one another, never mind the same room. It seems that the most basic method people use is to find a way to get everyone’s voice together (Skype, Teamspeak, etc.), and simply pipe it through to a recording app (Audacity or similar) that records whatever audio the PC is pumping out. This is absolutely fine if you’re aiming for a stream-of-consciousness recording with very little leeway for editing together a decent product, one with as much dead air and as many “yeah”s and “um”s removed as you can possibly excise. Those who want more freedom to edit, though, agree on some contortionist setup whereby each person records his or her part locally while also on voice comms, and then sends his or her files to the editor. This allows the editor to import each participant’s part into a mixing app, edit out the dead space and superfluous garbage, normalize individual volumes, and clean each track individually.
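To show why separate tracks matter for something like normalizing individual volumes, here is a minimal Python sketch of peak normalization applied per track. The track names and sample values are made up for illustration, and samples are assumed to be floats between -1.0 and 1.0; a real editor would do this on actual audio files, likely with a smarter loudness measure.

```python
def peak_normalize(samples, target_peak=0.9):
    """Scale one track so its loudest sample reaches target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent or empty track: nothing to scale
    gain = target_peak / peak
    return [s * gain for s in samples]


# Hypothetical per-person tracks: Mike recorded quietly, Jane recorded hot.
tracks = {
    "mike": [0.10, -0.20, 0.05],
    "jane": [0.60, -0.90, 0.30],
}

# Each track gets its own gain, which is only possible because they are separate.
normalized = {name: peak_normalize(samples) for name, samples in tracks.items()}
```

With a single combined recording there is only one gain to apply, so a quiet speaker stays quiet relative to a loud one.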

What a pain in the ass, circa 2016. You’d think that at this point, where podcasting has been A Thing for many years now and voice comms have come such a very long way in both quality and ubiquity, we’d have either a dedicated podcast studio app with a master switchboard that records individual incoming remote streams as different files, or comprehensive, clear instructions on how to accomplish the same thing.

 

[Image: OneAppMultiChannels]

 

Above is an “ideal” situation. The app that you use to get your friends together would allow for individual outputs, either to files (so you can save each person’s voice and only his or her voice) or would come built in with “virtual outputs” that applications like Audacity would recognize and separate into individual tracks.

 

[Image: MultiAppMultiChannels]

Barring that, being able to have multiple instances of the same VoIP app running, and then muting everyone in each app except for a single user (giving Mike, Jane, and Sara their own application output), and then saving the output or piping it into Audacity as individual tracks should also work.

This is the point where someone is growing hoarse screaming at their monitor about “virtual audio cables”. I have looked into this, at least on paper if not yet in practice, but I’m not sure that it’s an actual solution, because of this line:

If more than one applications are sending audio to Virtual Cable device, VAC mixes all streams together. If more than one applications are receiving audio from Virtual Cable device, VAC distributes the same audio data among all targets.

While VAC could allow diagram 2 to work, it would still end up mixing all of the inputs into a single output, which would end up as a single track. It would be the same as if the VoIP app spat out all voices through the single pipe that PCs provide and you recorded whatever came out of the speakers.
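To make the mixing problem concrete, here is a rough Python sketch. It assumes that mixing is plain sample-by-sample addition, which is a simplification of whatever a real virtual cable does, and the sample values are invented. The point is only that once voices are summed, the result no longer tells you who contributed what.

```python
def mix(streams):
    """Sum streams sample by sample (a simplified stand-in for a shared cable)."""
    return [sum(frame) for frame in zip(*streams)]


# Hypothetical voices, as a few samples each.
mike = [0.5, 0.0, -0.25]
jane = [-0.5, 0.25, 0.25]

# A completely different pair of voices...
other_mike = [0.25, 0.0, 0.0]
other_jane = [-0.25, 0.25, 0.0]

mixed = mix([mike, jane])

# ...produces exactly the same mixed track, so the mix alone is ambiguous.
same_mix = mixed == mix([other_mike, other_jane])
```

That ambiguity is why a single mixed track can’t be cleanly split back into per-person tracks after the fact.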

 

[Image: VBAudioVirtualAudioDevice]

 

However, another VAC solution actually provides some screenshots (like the one above), which suggest that things aren’t as hopeless as that quote leads me to believe. This image shows a potentially unlimited number of virtual audio pipelines. If this is true, then the setup using multiple VoIP applications, each one set to output to a different VAC, and each VAC streaming to its own track in a recording application, would work. Sadly, it doesn’t seem that Audacity can do multi-input recording on a per-track basis, but their instruction manual points to another app called Alis which may be able to record individual audio streams, such as from a VAC. Another app that can do multitrack recording, mentioned on some random forum somewhere, was Kristal Audio Engine in conjunction with VAC.
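The rule this setup depends on is “one VoIP instance per virtual cable”, since any cable fed by two instances would mix them. Here is a small Python sketch that checks a routing plan against that rule. The instance and cable names are hypothetical labels, not real device names, and nothing here talks to an actual audio driver.

```python
from collections import defaultdict


def shared_cables(routing):
    """Return the cables fed by more than one VoIP instance (these would be mixed)."""
    by_cable = defaultdict(list)
    for instance, cable in routing.items():
        by_cable[cable].append(instance)
    return {cable: users for cable, users in by_cable.items() if len(users) > 1}


# Hypothetical plan: Jane and Sara were accidentally pointed at the same cable.
routing = {
    "voip-mike": "Cable A",
    "voip-jane": "Cable B",
    "voip-sara": "Cable B",
}

conflicts = shared_cables(routing)
```

An empty result would mean every participant has a cable, and therefore a track, to themselves.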

So almost 800 words, three images, and at least seven name-dropped software packages later, I’m still confounded as to why there isn’t a single, out-of-the-box solution that does all of this without jury-rigging something from parts found in far-flung corners of the Internet. If I were a better developer I’d take a crack at it myself, but I’m not, and the fact that most of the VAC terminology being used is way over my head leads me to believe that this is a case of waiting for someone with actual skills to realize an app like this could be a real boon for remote groups.
