More GSoC Ideas from past years


Audio 'Diff'

Possible Mentors:

Description:

The ability to compare and align two sound sequences, just as one compares text using diff, would be a powerful new feature in Audacity. It would greatly facilitate combining sounds from multiple 'takes' of the same track. It would also be of use to people looking to identify particular known sounds, e.g. repeated themes in birdsong, in very long recordings.

The implementation idea is conceptually simple. The spectra of the two sounds being compared are computed at regular spacings, using existing Audacity code. A metric for spectral similarity is then written; in the first incarnation it can be a correlation function.
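
As a purely illustrative sketch (the function name and the choice of a plain Pearson correlation are assumptions for this page, not existing Audacity code), such a similarity metric might look like this in C++:

  // Minimal sketch of a correlation-based similarity score between two
  // magnitude spectra.  Names and normalisation choices are illustrative.
  #include <algorithm>
  #include <cmath>
  #include <cstddef>
  #include <vector>

  // Pearson correlation of two spectra; returns a value in [-1, 1],
  // where 1 means the spectral shapes match exactly.
  double SpectralSimilarity(const std::vector<double>& a,
                            const std::vector<double>& b)
  {
      const std::size_t n = std::min(a.size(), b.size());
      if (n == 0)
          return 0.0;

      double meanA = 0.0, meanB = 0.0;
      for (std::size_t i = 0; i < n; ++i) {
          meanA += a[i];
          meanB += b[i];
      }
      meanA /= n;
      meanB /= n;

      double cov = 0.0, varA = 0.0, varB = 0.0;
      for (std::size_t i = 0; i < n; ++i) {
          const double da = a[i] - meanA;
          const double db = b[i] - meanB;
          cov  += da * db;
          varA += da * da;
          varB += db * db;
      }
      const double denom = std::sqrt(varA * varB);
      return denom > 0.0 ? cov / denom : 0.0;
  }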

The alignment (diff) of the two sounds is computed using standard least-distance algorithms, driven by the spectral similarity score and an adjustable parameter that sets the penalty for stretching the sound.
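
Again as a hedged illustration only, the least-distance computation could follow a dynamic-programming scheme in the style of dynamic time warping; the names, the cost convention (1 minus similarity) and the single stretch-penalty parameter below are assumptions, not a prescribed design:

  // Illustrative least-distance alignment over frame-similarity scores.
  // sim[i][j] is the similarity of frame i of track A to frame j of track B,
  // e.g. from SpectralSimilarity above.  Returns the total alignment cost;
  // recovering the actual alignment path (the 'diff') would need a backtrace.
  #include <algorithm>
  #include <cstddef>
  #include <limits>
  #include <vector>

  double AlignmentCost(const std::vector<std::vector<double>>& sim,
                       double stretchPenalty)
  {
      const std::size_t rows = sim.size();
      if (rows == 0 || sim[0].empty())
          return 0.0;
      const std::size_t cols = sim[0].size();

      const double INF = std::numeric_limits<double>::infinity();
      // cost[i][j]: best cost of aligning the first i frames of A
      // with the first j frames of B.
      std::vector<std::vector<double>> cost(rows + 1,
          std::vector<double>(cols + 1, INF));
      cost[0][0] = 0.0;

      for (std::size_t i = 0; i <= rows; ++i) {
          for (std::size_t j = 0; j <= cols; ++j) {
              if (i > 0 && j > 0) {
                  // Match frame i-1 with frame j-1: pay 1 - similarity.
                  cost[i][j] = std::min(cost[i][j],
                      cost[i - 1][j - 1] + (1.0 - sim[i - 1][j - 1]));
              }
              if (i > 0)   // Skip a frame of A (stretch B): pay the penalty.
                  cost[i][j] = std::min(cost[i][j], cost[i - 1][j] + stretchPenalty);
              if (j > 0)   // Skip a frame of B (stretch A): pay the penalty.
                  cost[i][j] = std::min(cost[i][j], cost[i][j - 1] + stretchPenalty);
          }
      }
      return cost[rows][cols];
  }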

The GUI for presenting the alignment could reuse the existing code that allows a track to be split into smaller chunks that can be shifted around, augmented with a 2D similarity 'plot'. If there is time, an enhanced interface that caters more directly to the two use cases could be provided.

Skills:

  • wxWidgets and C++
  • Maths

Early spinoffs from this work:

  • A method for scoring the similarity of two spectra built into Audacity.
  • A 2D graphical display that will show the similarity of two spectra across the different frequencies.


Computed Automation Tracks

Possible Mentors:

Description: In many ways Audacity is just a specialised multi-track chart recorder. This project is to add a new type of track: one which shows multiple computed automation variables. Rather than being stored, these values are computed on demand. The immediate application is to give more flexibility in segmenting speech: the track can give feedback on where the existing algorithms propose to segment, allowing the parameters to be fine-tuned by adjusting the threshold.
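
A minimal sketch of the compute-on-demand idea, assuming a hypothetical track class that derives its values from windowed RMS energy and a threshold (none of these names or choices come from Audacity's actual code):

  // Instead of storing automation values, the track derives them from the
  // audio each time a region needs to be drawn.
  #include <cmath>
  #include <cstddef>
  #include <vector>

  class ComputedAutomationTrack
  {
  public:
      ComputedAutomationTrack(const std::vector<float>& samples,
                              std::size_t windowSize, float threshold)
          : mSamples(samples), mWindow(windowSize), mThreshold(threshold) {}

      // Called by the display code for just the visible range of windows;
      // nothing is cached or stored in the project file.
      std::vector<float> Evaluate(std::size_t firstWindow, std::size_t numWindows) const
      {
          std::vector<float> values;
          values.reserve(numWindows);
          for (std::size_t w = 0; w < numWindows; ++w) {
              const std::size_t start = (firstWindow + w) * mWindow;
              // 1.0 where the windowed RMS exceeds the threshold (a proposed
              // speech segment), 0.0 elsewhere.
              values.push_back(Rms(start) > mThreshold ? 1.0f : 0.0f);
          }
          return values;
      }

      void SetThreshold(float t) { mThreshold = t; }  // re-tune, then redraw

  private:
      float Rms(std::size_t start) const
      {
          double sum = 0.0;
          std::size_t count = 0;
          for (std::size_t i = start; i < start + mWindow && i < mSamples.size(); ++i, ++count)
              sum += double(mSamples[i]) * mSamples[i];
          return count ? float(std::sqrt(sum / count)) : 0.0f;
      }

      const std::vector<float>& mSamples;
      std::size_t mWindow;
      float mThreshold;
  };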

If there is time, the computed automation tracks could be used to control parameters in one or more other effects, not just for segmenting audio.

Another direction this could be taken is improving the transcription processing in Audacity.

Skills:

  • wxWidgets and C++

Early spinoff from this work:

  • Implementation of a thresholding automation track, using an existing threshold slider GUI element backported from Audacity-extra and adding the new compute-on-demand code.


Bridges

Possible Mentors:

  • TBD; Federico Grau (for Rivendell); Bartek Golenzo (Octave).

Description: One way to grow the feature set of Audacity and at the same time to avoid re-inventing the wheel is to build compatibility 'bridges' between Audacity and other Open Source programs. This is an example of making Connected Open Source a reality. Two examples of bridges for Audacity are:

  • Bridge to Octave - Octave is an Open Source program for mathematical analysis and is useful in digital signal processing (DSP). Audacity could have a bridge to Octave that allows Octave to apply effects to Audacity waveforms and to annotate Audacity waveforms with labels (one possible shape for such a bridge is sketched after this list).
  • Bridge to Rivendell - Rivendell is an Open Source program for radio station management. The Rivendell bridge already allows Audacity and Rivendell to exchange play-lists; work on it would improve the integration.
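
As a rough illustration of what a very small Octave bridge might look like (the file names, the script convention and the use of std::system are assumptions for this sketch; a real bridge would integrate with Audacity's block files and manage the Octave process properly):

  // Write the selected samples to a file, have Octave run a user script on
  // them, then read the processed samples back into the track.
  #include <cstdlib>
  #include <fstream>
  #include <string>
  #include <vector>

  std::vector<double> RunOctaveEffect(const std::vector<double>& samples,
                                      const std::string& scriptPath)
  {
      // 1. Dump the selection as one sample per line.
      std::ofstream out("audacity_bridge_in.txt");
      for (double s : samples)
          out << s << "\n";
      out.close();

      // 2. Ask Octave to run the user's script, which is expected to read
      //    audacity_bridge_in.txt and write audacity_bridge_out.txt.
      const std::string cmd = "octave --quiet " + scriptPath;
      if (std::system(cmd.c_str()) != 0)
          return samples;   // on failure, leave the audio unchanged

      // 3. Read the processed samples back.
      std::vector<double> result;
      std::ifstream in("audacity_bridge_out.txt");
      double v;
      while (in >> v)
          result.push_back(v);
      return result;
  }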

A proposal for a new bridge should go into some detail as to what features of the other program will be bridged. Generally the plan should avoid extensive work on the other program, since the point of the project is to extend Audacity.

Skills:

  • wxWidgets and C++
  • Familiarity with Octave or Rivendell or other software being bridged to.

Early spinoff from this work:

  • A restricted bridge which exposes a smaller part of the functionality.


Feature Completion

Possible Mentors:

  • TBD (depends on details)

Description: Identify a feature of Audacity which is in development CVS but is in some way incomplete. The project proposal should describe how the feature would be much improved and brought to release-candidate readiness.

Some features we have in CVS that are not yet ready for our stable builds include:

  • Transcription ToolBar - the algorithms for finding word boundaries are rather buggy and slow.
  • Themes - too difficult to use as they stand; not all items are covered, and theming of backgrounds is not yet possible.


Skills:

  • wxWidgets and C++

Early spinoffs from this work:

  • To be specified in the proposal.


Render-On-Demand

Possible Mentors:

Description: Audacity effects are currently 'batch' oriented: you apply an effect and wait for it to render. This is appropriate for Audacity, which often runs on low-spec machines. Only the simplest effects can reliably be applied fast enough to play them as they render.

Render-On-Demand would add an option to return immediately when applying an effect, giving greater responsiveness; the rendering would then happen later, as the effect was played. There are several complications to achieving this, including tracking the state of effects that have been partially rendered and dealing gracefully with running out of CPU resources. A project proposal would need to demonstrate a clear strategy for dealing with these issues.
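
One of those complications, remembering which parts of a pending effect have already been rendered, could be handled with a structure along these lines (the class and its sample-range granularity are assumptions for illustration, not an existing Audacity design):

  // Playback consults this before each buffer to decide whether it can use
  // the rendered audio or must render that region first.
  #include <algorithm>
  #include <cstdint>
  #include <utility>
  #include <vector>

  class RenderState
  {
  public:
      // Mark [start, end) as rendered, merging with any overlapping ranges.
      void MarkRendered(int64_t start, int64_t end)
      {
          std::vector<Range> merged;
          for (const Range& r : mRanges) {
              if (r.end < start || r.start > end) {
                  merged.push_back(r);          // disjoint: keep as-is
              } else {                          // overlapping: absorb into the new range
                  start = std::min(start, r.start);
                  end   = std::max(end, r.end);
              }
          }
          merged.push_back({start, end});
          std::sort(merged.begin(), merged.end(),
                    [](const Range& a, const Range& b) { return a.start < b.start; });
          mRanges = std::move(merged);
      }

      // Is [start, end) fully rendered?
      bool IsRendered(int64_t start, int64_t end) const
      {
          for (const Range& r : mRanges)
              if (r.start <= start && end <= r.end)
                  return true;
          return false;
      }

  private:
      struct Range { int64_t start, end; };
      std::vector<Range> mRanges;
  };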

Skills:

  • wxWidgets and C++

Early spinoff from this work:

  • A visual display of the rendering state of an effect, alongside the affected track, rather than a progress indicator in a dialog. This would be used before any effects had been made 'render on demand'. You'd have restricted access to Audacity menus and buttons during a render: for example, you'd be able to examine the waveform for clipping, but not to edit sound or queue additional effects until the render completes.