Use Cases

Ideas extracted from the Feature Requests page.

VoIP Recording - Combine Speaker & Mic Devices as Recording Source, especially USB sets

 * VoIP is becoming extremely popular, especially with Google Voice, Betamax products like Voipbuster, Skype, and many web conference providers.
 * It is easy to set up a separate VoIP device (headset) on a PC using a very inexpensive USB audio device (less than $2 on eBay).
 * These devices can be used as a separate audio device for your VoIP headset, BUT they do not have advanced features like “Stereo Mix”.
 * And thus you can only select Microphone as recording source.
 * And thus you can only record your side of the VoIP conversation.
 * Hopefully it is not too difficult to combine Speaker & Mic Devices as a single source for recording. There are programs that can do this with audio connections; however, USB headsets seem to be difficult to master.
 * Add to “Audacity Preferences” > “Recording” > “Playthrough” options.
 * Example Uses:
 * Build evidence to deal with phone solicitors, collection agencies, etc.
 * Record conversations with loved ones, so as to remember their voices once they are gone.
 * Record interviews done over Skype for podcasts etc.
 * During scientific conversations many important details can be so quickly discussed that it's impossible to keep track and much important information is lost after the conversation. Recording and playback would be a great advantage.

Scheduled Recording
In the broadcast industry it is often necessary to record sections of audio off-air at predetermined intervals for confidence and/or affidavit purposes. The functionality described below would make Audacity a useful tool for that purpose:
 * Scheduler to define recording start and end times
 * A repeat option in the scheduler so that, for instance, a day's schedule can be repeated daily, weekly, monthly, etc.
 * Automatic saving of recordings to a user-specified location, with the file names containing the recording time & date.
 * User specification of the file type (aiff, wav etc) and sample rate.
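
The scheduling and naming rules in this list could be sketched roughly as follows. This is only an illustration and not Audacity code: `ScheduledRecording`, `occurrences`, `output_name` and the `offair_` file prefix are all hypothetical names chosen for the example. Repeats are modelled as a fixed `timedelta`, which covers daily and weekly schedules but not calendar-monthly ones.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from pathlib import Path
from typing import Iterator, Optional


@dataclass
class ScheduledRecording:
    """One entry in a hypothetical recording schedule."""
    start: datetime                      # first start time
    duration: timedelta                  # end time is start + duration
    repeat: Optional[timedelta] = None   # e.g. timedelta(days=1); None = once
    folder: Path = Path("recordings")    # user-specified save location
    extension: str = "wav"               # user-specified file type

    def occurrences(self, count: int) -> Iterator[datetime]:
        """Yield up to `count` start times for this entry."""
        when = self.start
        for _ in range(count):
            yield when
            if self.repeat is None:
                return
            when += self.repeat

    def output_name(self, when: datetime) -> Path:
        """Build a file name that embeds the recording date and time."""
        stamp = when.strftime("%Y-%m-%d_%H-%M-%S")
        return self.folder / f"offair_{stamp}.{self.extension}"
```

For example, an entry starting at 06:00 with `repeat=timedelta(days=1)` yields one start time per day, and each recording gets its own dated file name in the chosen folder.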

Wildlife
A lot of wildlife enthusiasts utilise Audacity for recording and editing animal sounds (e.g. birds, bats) and generally work from the spectrum rather than the waveform. The following would extend the sound analysis capabilities of the software and make the spectrum feature easier to use:
 * Threshold settings to remove the displayed background noise - i.e. only sound over a certain amplitude is displayed
 * Time marks displayed along x-axis, more frequency marks along y-axis
 * User-defined spectrum colours
 * Settings saved upon exit (e.g. fit vertically, spectrum, threshold)
 * Possibility of saving setting-profiles, i.e. different settings depending on the animal.
 * Automatic saving of each segment as it is recorded, i.e. when stopping recording, with a timestamp incorporated into the filename

Audio Cleaning
Cleaning up audio is a common task. The features for it and interface could be improved:


 * Port the cleaning effects from Gnome Wave Cleaner. Particularly decrackle and remove click which are useful for cleaning LPs.  Make them into LADSPA effects so that they can be used in other programs too?
 * Wrap Declip as a LADSPA effect. Also see the discussion tab of this article.
 * Use VAMP to identify clicks, then an enhanced label track to allow applying a de-click effect by clicking on a label - or selecting ten and applying to ten at a time.
 * Wizard to help people with cleaning LPs.
 * Zoomed in view to help in locating clicks.
 * Picker for 'ambient noise' samples used to repair sounds during silence (more useful in vocal tracks than in music).
 * Waveform over spectrograph.
 * Enhanced Audio preview that allows you to hear the effect of different parameters and so adjust them better. The critical problem to solve is 'independence' of parameters.  It is almost impossible to set parameters well when there are more than two and changing one affects the others.  A preview that allows parameter changes to be heard in isolation could help.
 * Threshold on wavetrack (backported from Audacity-Extra).
 * Erasing certain frequencies from the spectrogram to eliminate "mouth noise", such as saliva clicks.
 * Here's the current method I use for this task:
 * Find the click in the spectrogram. It's usually a bright red spot in the higher frequencies.  Select the click.
 * Open the equalizer and draw a curve to filter out the click, without removing the "good" frequencies. The "Preview" button is very useful here.
 * Once I get as close as I think I can get, I apply the filter, then select a larger portion of the clip to hear it in context. At this point, I'll often discover that my filter was too drastic, and I have to do the whole process all over again.
 * If there was a tool to paint/erase frequencies directly on the spectrogram, that whole process would be much easier, especially since I could listen to a larger region than just what I'm working on.
 * Non-modal effect dialogues. It would be very useful if the 'Play' button still worked when an effect dialogue was open so I could compare the preview to the unmodified audio.  It would also allow previewing multiple effects to easily compare them.

Education (General)

 * Ability to use X-Windows sound. This is particularly important in Canada, where schools often use 'thin clients'.  That is, the classroom has computers which are little more than screen, keyboard, mouse.  Disk drives, memory and processor are on a central blade server.  This is more cost-effective and easier to administer, but does require that Audacity use X-Windows sound.
 * Methods for hiding advanced / optional features so as to make the interface less confusing. Each school may have a different definition of what is standard and what is advanced.
 * A version of this is already available. See Simplifying Audacity.


 * Method for easily resetting options to some default saved value. For example, to return the system to a sampling rate and number of channels which works.  Pupils may be allowed to tamper with the settings, but we want to be able to get back to known good ones easily.

Languages

 * Vowel target practice based on code from the project.
 * There is a vowel-quadrilateral display we can take from Audacity-Extra. It uses Voronoi regions to show the nearest vowel match to an utterance.
 * Other kinds of voice analysis, using VAMP plug-ins (e.g. for the two kinds of 's' sound in the Polish language).
 * Programmed playback, e.g. using script.
 * Elaborations to the label track so that users can search audio lessons via text more easily.
 * We also have some info on 'audio icons' that make tree-based searching of audio, using audio only, faster.

The ideal is to create a small ecosystem of programs around Audacity that help with language learning. Some code could run on PDAs and mobile phones (Java) with the lessons being prepared on Audacity.


 * Can integrate with dictionaries from freedict and language resources from Open Commons.

See also: Proposal Structured Audio, Proposal Languages Ecosystem, Language Learning

Looping and Dictation Aids
These will let someone work through a dictation, adjusting and advancing the looping region during playback, sentence by sentence or verse by verse. (At least two people want all of these.)


 * When the user hits right-bracket, set the end of looping region to the current position and go back to beginning immediately. (Currently need to stop and restart looping play for it to take effect.)
 * When the user hits left-bracket, set the beginning of the looping play to the current play position immediately. (Currently need to stop and restart looping play for it to take effect.)
 * When the user hits shift-right-bracket, set the end of the looping region to the end of the track immediately. (New key, new feature.)
 * When the user hits shift-left-bracket, set the beginning of the looping region to the existing end of the looping region and move the play head to this new location. Also set the end of the looping region to the end of the track.  (New key, new feature.)
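
As a rough model of the four key bindings above, the looping region can be treated as a small piece of state that each key updates immediately. This is a sketch under my own assumptions, not a description of Audacity's internals: `LoopRegion` and its method names are hypothetical, and times are plain seconds.

```python
from dataclasses import dataclass


@dataclass
class LoopRegion:
    """Hypothetical model of a looping region; all times are in seconds."""
    start: float
    end: float
    track_end: float
    position: float = 0.0   # current play head

    def right_bracket(self) -> None:
        """']': end the loop at the play head, then jump to the loop start."""
        self.end = self.position
        self.position = self.start

    def left_bracket(self) -> None:
        """'[': begin the loop at the current play position."""
        self.start = self.position

    def shift_right_bracket(self) -> None:
        """Shift+']': extend the loop to the end of the track."""
        self.end = self.track_end

    def shift_left_bracket(self) -> None:
        """Shift+'[': move on to the passage after the current loop."""
        self.start = self.end          # new start is the old loop end
        self.position = self.start     # play head moves there too
        self.end = self.track_end      # loop now runs to the end of the track
```

In a dictation session the user would typically press right-bracket at the end of a sentence, loop it until it is written down, and then press shift-left-bracket to advance to the next one.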

See also: Proposal Transcription Editor

Note: See Samwyse for patch progress information.

"Books on tape", field recordings, etc.
This is a collection of features that will enhance converting recordings made "live" or on physical tapes into digital form. My specific use-case is my attempt to turn foreign language instructional cassettes into MP3 files, but it should be equally useful in many other cases. (BTW, I've tried diving into the source code to implement these myself, but quickly got lost. Any assistance in getting started would be appreciated.)


 * Recordings such as these frequently feature a lot of metadata that is spoken by the presenter, things like "Unit 3" or the title of the next segment, which I want to capture as labels. While either recording or playing a project, you need "hot keys" that instantly create labels according to a template.  The template should allow the inclusion of auto-incrementing numbers.  Currently, I have to quickly press a sequence like [Ctrl-M], "u", "3" and [Enter], and then post-process the .aup file to expand the labels.  If several events occur in rapid succession, this is especially hard to do accurately.  Finally, it would be nice if the label were inserted a bit before the time of the key-press, to allow for reaction times.
 * Use '<' and '>' to adjust the tempo of a recording during playback. The "Change Tempo" effect does this, but I do not want to permanently change my recording.
 * Have a key that, during play-back, jumps backwards one or two seconds. Especially when scanning speech, this would reduce the time required to label everything. Note: implemented in current Audacity (left and right arrow).
 * Even with compensation for reaction times, it is unlikely that a label will be exactly where you'd want it to be. I'd like to be able to do two things to help with this.  First, use the mouse to sweep a region and then have Audacity adjust the selection boundaries, for instance to move the selection's start point forward until it's a fixed time prior to the end of silence.  (Ideally, this would use a new type of Nyquist extension.)  Second, given a selection that covers exactly one label, adjust that label so that it matches the selection.
 * Finally, I'd like to see label-based navigation of tracks, things like "scroll to the {first,next,previous,last} label". Note: you can already click in a label then tab and SHIFT - tab back and forth between them. Thanks for the info, I didn't realize you could do that.
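
The templated hot key requested in the first item of this list might behave something like the sketch below. `LabelHotKey`, its `{n}` placeholder and the half-second default are hypothetical choices made for the example. The output uses the tab-separated start, end, text layout of Audacity's exported label files, as far as I understand that format.

```python
class LabelHotKey:
    """Hypothetical hot key that stamps labels from a template."""

    def __init__(self, template: str, first: int = 1, reaction: float = 0.5):
        self.template = template   # e.g. "Unit {n}"
        self.counter = first       # auto-incrementing number
        self.reaction = reaction   # seconds to back-date each label

    def press(self, time_now: float) -> str:
        """Return one label line for a key press at `time_now` seconds."""
        # Place the label slightly before the key press to allow for
        # reaction time, but never before the start of the track.
        start = max(0.0, time_now - self.reaction)
        text = self.template.format(n=self.counter)
        self.counter += 1
        # A point label has identical start and end times.
        return f"{start:.6f}\t{start:.6f}\t{text}"
```

With the template `"Unit {n}"` and `first=3`, successive key presses would produce "Unit 3", "Unit 4" and so on, with no typing and no post-processing of the .aup file.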

Taken together, these features would allow someone to perform a mostly-unattended capture of a conference presentation or a jam session, play it back at high-speed while marking points of interest, and easily revisit those points to adjust the labels.

Theatre

 * Create labels in real time from external source during recording: It would be nice to be able to create labels immediately during the recording process, using either remote calls or listening on a socket. I currently fake this with a little Python script that logs the intended labels and (hopefully) the correct time-since-start into an external file, which I then load with "Import Labels". It would be nice if there were some way to do this directly, since Audacity always has the correct "current" time-in-recording. We use this for pulling in GPS-sourced time, scene change hints, and some other stuff.
 * "Waterfall"/"voiceprint" spectrum display option (instead of, or above/below waveform display)
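
The external logging workaround described in the first item could look something like this. It is a guess at the general shape rather than the contributor's actual script: `LabelLogger` is a hypothetical name, and the clock has to be started by hand at the moment recording begins, which is exactly the weakness the request is trying to remove.

```python
import time
from typing import Callable, List


class LabelLogger:
    """Collects labels stamped with the time since recording started."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self.clock = clock
        self.started = clock()        # create this as recording starts
        self.lines: List[str] = []

    def mark(self, text: str) -> None:
        """Record a point label at the current elapsed time."""
        elapsed = self.clock() - self.started
        self.lines.append(f"{elapsed:.6f}\t{elapsed:.6f}\t{text}")

    def save(self, path: str) -> None:
        """Write the labels as tab-separated text, one label per line."""
        with open(path, "w", encoding="utf-8") as handle:
            handle.write("\n".join(self.lines) + "\n")
```

The saved file can then be loaded with "Import Labels". Any drift between the script's clock and the true recording start shows up as a constant offset in every label.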

Fixing a Jam Session

 * Multiple takes of the same 'session'. Allow the user to mix and match. With two takes the idea here is to have an interface where you can choose which fragment to take for each moment, so a bum note in one track that was OK in the other will be fine.  Particularly useful for long takes.  A necessary step is to align the two tracks so that equivalent sounds correspond.  This entails expanding/contracting one or both waveforms in various places.  A variant of this uses self-comparison of the same track.  This is useful where a singer sings a phrase again that they got wrong the first time.  Alignment can match up the repeated parts and allow easy removal of the unwanted sound.
 * Spot every cough from the audience in a recording (irregular spacing of a repeated pattern). Needs an interface for defining a 'cough' which will have one or more samples, each with parameters controlling the allowed variations.  For example, we can set how precisely the volume of the cough must be matched - ignoring coughs that are in the distance is less likely to mess up audio that by chance matches the pattern.
 * De-Mix the audio, e.g. take the vocals away from the instrumentals, or separate the percussion from the wind instruments onto two different tracks. In amateur recording, fewer mics than necessary may have been used, or there may have been bleed between the different sources.  Separating them cleanly allows for different noise cleaning and effects to be applied to each, before remixing again.

Algorithms for doing aspects of this are discussed on Audio Diff Notes as regards mix and match and comparison, and on Source Separation Notes for de-mixing audio.

Audio as a measuring tool

 * Graph the speed of an engine from its sound (regular spacing of a repeated pattern, spacing not initially known)
 * Measure the speed of a bullet from the time delay between the sound of firing and of the target being hit - and the known speed of sound in air. (There's a website with pictures of Audacity being used to do just that).
 * Gather evidence of the noise pollution (pneumatic drill, heavy lorries) from building works nearby over a ten-day-long recording. (The user who asked for this is planning to take those responsible to court.) For this one we need not only a way of classifying sound (advanced VAMP plug-ins) but also an improved time track and a way of converting the matches to a printable log.  For litigation there may be additional requirements such as ways to indicate that the data has not been tampered with.  We may want to run two widely separated microphones, or parabolic and non-parabolic, to confirm the direction of the sound source.
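
For the bullet measurement, the arithmetic is short enough to write out. Assuming the microphone is at the firing position and the target is at a known distance d, the delay t between the two sounds is the bullet's flight time plus the time the impact sound takes to travel back, so t = d/v + d/c and v = d / (t - d/c). The function name and the default of 343 m/s for the speed of sound (roughly its value in air at 20 °C) are my assumptions for the example.

```python
def bullet_speed(distance_m: float, delay_s: float,
                 sound_speed: float = 343.0) -> float:
    """Average bullet speed in m/s, for a microphone at the firing point.

    distance_m  -- distance from the firing point to the target
    delay_s     -- time between the firing sound and the impact sound,
                   as read off the recorded waveform
    sound_speed -- assumed speed of sound in air, in m/s
    """
    # Remove the time the impact sound needed to travel back.
    flight_time = delay_s - distance_m / sound_speed
    if flight_time <= 0:
        raise ValueError("delay is too short for the given distance")
    return distance_m / flight_time
```

For instance, a target 343 m away and a measured delay of 1.5 s leaves 0.5 s of flight time, giving an average speed of 686 m/s.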

Algorithms for doing aspects of this are discussed on Audio Diff Notes.

Azimuth Setter
Azimuth refers to the angle between the tape head(s) and the tape medium. Sometimes tape azimuth is easy enough to set, sometimes not. Having to set it for every tape is a chore though.

Azimuth setter would display the azimuth setting on screen in real time, showing ideal setting in the middle and showing actual setting.

Azimuth setter would:
 * 1) look at the delay between the 2 channels, which contain a lot of mono content (slightly delayed when azimuth is out of line)
 * 2) graphically display the azimuth setting
 * 3) update display in real time, enabling easy ideal setting by user.

This would:
 * speed up the job
 * reduce errors
 * improve accuracy and thus bandwidth & signal to noise ratio
 * If it ran during recording, it would show when azimuth changed, enabling the user to readjust to the new perfect setting during recording.
 * It would also show all users the azimuth issue in a simple clear non-technical way, and result in better recordings by end users.

How might the code work?

This is just a thought, and maybe there are better options I don't know about. I'm not a computer programmer.

The idea is to take a brief sample and look for mono content, repeating with assorted time delays between the 2 channels.

One time shift will give the greatest mono content, and the amount of timeshift then gives you the degree and sign of azimuth error.

A little intelligence can presumably be applied to home in on the right delay figure while minimising the number of delay settings tried.
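
A minimal sketch of that search is given below. It uses plain cross-correlation between the two channels as the measure of "mono content", which is my substitution for whatever measure a real implementation would choose, and it simply tries every delay in a range rather than homing in intelligently. `best_delay` is a hypothetical name.

```python
from typing import Sequence


def best_delay(left: Sequence[float], right: Sequence[float],
               max_lag: int) -> int:
    """Return the delay, in samples, that best lines up the two channels.

    A positive result means the right channel lags the left one;
    a negative result means it leads. Zero means no azimuth error.
    """
    def correlation(lag: int) -> float:
        # Sum of products of the left channel with the shifted right one.
        total = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                total += sample * right[j]
        return total

    # Try every delay in the range and keep the best-correlated one.
    return max(range(-max_lag, max_lag + 1), key=correlation)
```

The sign of the result gives the direction of the azimuth error and its size gives the degree, so a display could simply plot this number on a centred scale and refresh it for each new block of samples.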

Note the display scale should be flippable (left to right). The computer has no way to know which way round the scale should be presented to make adjustment easy and intuitive.

If my coding skills weren't 25 years out of date I'd probably give it a go.

If you want to vote for Azimuth setter: please do so on the other recording enhancements section of Feature Requests

Other features
Other features mentioned on Feature Requests which may be useful in their own right are also useful for cassette recording, in particular:
 * 8 bit recording
 * SmartEQ (which tries to equalise the audio according to a known frequency spectrum).
 * "Automatically set recording settings" for sample rate and bit depth based on a sample of recorded audio

Third party applications controlling Audacity
From a previous Forum discussion:

Harshal wrote: I have a Java application that I use for playing karaoke tracks with synchronized lyrics. I would like to have the ability to start recording a microphone input track in Audacity at the same time when I play the karaoke track in my Java app. How can I achieve this? Is there a way for my Java app to invoke controls or commands in Audacity? Can Audacity receive commands such as Start/Stop recording on a TCP/IP port?

Kozikowski wrote: Linux allows you to run applications on any mounted machine using Telnet. Log in, push the buttons and go. You can do that with slightly more difficulty in OS-X (because it's a UNIX-based system) in /Utilities/Terminal and if you're really into pain, you can get Windows to do it with Remote Access Services, RAS.

Be advised in Windows you need to be exactly the same person and account on both machines. In some cases, you need administrative access to both machines--and the same administrator.

Prokoudine wrote: I think that the sanest way to implement this would be adding support for OSC via libosc.

Richard Ash wrote: No, there is no TCP support in Audacity at all. All kozikowski is saying is that you can run X11 (graphical) applications remotely over a network connection. You still have to transmit button presses to the application, so I don't see that it helps you much if at all.

A VST plug-in isn't going to help at all. Audacity only supports VST plug-ins for off-line effects processing, and then via a wrapper process thanks to the restrictive VST license.

There is an experimental scripting plug-in available which uses Windows named pipes to pass control messages to and fro, but it's not included in release builds yet. This builds on a generic binary plug-in interface that allows pretty much any part of Audacity to be over-ridden by external code. This would be a logical way to implement OSC support.

Morpheus wrote: Maybe a good idea would be to make this scripting plug-in usable from the shell? It is probably not very difficult, and it could be used the same way on Linux, Windows and Mac.