GSoC Ideas 2008

From Audacity Wiki
Revision as of 19:58, 19 March 2007

The Audacity Developer Team isn't part of Google Summer of Code 2007, but we want to be in 2008 and we are watching what happens in 2007 closely.

http://code.google.com/summerofcode.html

Please help us to get ready for the next round. It is never too early. Your ideas and enthusiasm will help us make it happen next year. You can contact us at this email address: [email protected]

If you're looking for a 2007 GSoC project on Audacity: wxWidgets and Xiph.org are both participating in GSoC 2007, and we use code from both, so work on those projects may indirectly benefit Audacity. If you have an idea for a project of that kind that you would like to do, we'd be keen to hear from you and discuss it at the address above.


Other pages related to GSoC 2008 are:

  • Our proposed application to Google for mentoring status for 2008.
  • Our proposed application form for Google Summer of Code 2008 Students.


Ideas

Below is a list of potential projects, but feel free to suggest your own ideas as well.

  • 1. Audio 'Diff' (suggested by James Crook)

The ability to compare and align two sound sequences, just as one compares text using diff, would be a powerful new feature in Audacity. It would greatly facilitate combining sounds from multiple 'takes' of the same track. It would also be of use to people looking to identify particular known sounds, e.g. repeated themes in birdsong, in very long recordings.

The implementation idea is conceptually simple. The spectra of the two sounds being compared are computed at regular spacings, using existing Audacity code. A metric for spectral similarity is then written; in the first incarnation it can be a correlation function.

The alignment (diff) of the two sounds is computed using standard least-distance algorithms, driven by the spectral similarity score and an adjustable parameter that sets the penalty for stretching the sound.
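The least-distance alignment described above can be sketched as a small dynamic-programming routine, in the style of dynamic time warping. This is a minimal illustration, not Audacity code; the function names and the choice of a Pearson correlation for the similarity metric are our assumptions:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Pearson-style correlation between two spectra (equal-length
// magnitude vectors).  Returns a similarity in [-1, 1]; 1 means the
// spectra have an identical shape.
double SpectralSimilarity(const std::vector<double>& a,
                          const std::vector<double>& b) {
    size_t n = a.size();
    double ma = 0, mb = 0;
    for (size_t i = 0; i < n; ++i) { ma += a[i]; mb += b[i]; }
    ma /= n; mb /= n;
    double num = 0, da = 0, db = 0;
    for (size_t i = 0; i < n; ++i) {
        num += (a[i] - ma) * (b[i] - mb);
        da  += (a[i] - ma) * (a[i] - ma);
        db  += (b[i] - mb) * (b[i] - mb);
    }
    if (da == 0 || db == 0) return 0;   // flat spectrum: no correlation
    return num / std::sqrt(da * db);
}

// Least-distance alignment of two sequences of spectra.
// 'stretchPenalty' is the adjustable cost of stretching either sound
// by one frame.  Returns the total alignment cost; lower = better fit.
double AlignCost(const std::vector<std::vector<double>>& s1,
                 const std::vector<std::vector<double>>& s2,
                 double stretchPenalty) {
    size_t n = s1.size(), m = s2.size();
    const double INF = 1e30;
    std::vector<std::vector<double>> d(n + 1,
                                       std::vector<double>(m + 1, INF));
    d[0][0] = 0;
    for (size_t i = 1; i <= n; ++i) {
        for (size_t j = 1; j <= m; ++j) {
            // Cost of matching frame i-1 with frame j-1 = dissimilarity.
            double cost = 1.0 - SpectralSimilarity(s1[i-1], s2[j-1]);
            d[i][j] = cost + std::min({ d[i-1][j-1],                  // match
                                        d[i-1][j] + stretchPenalty,   // stretch s2
                                        d[i][j-1] + stretchPenalty}); // stretch s1
        }
    }
    return d[n][m];
}
```

The stretch penalty is what makes the parameter adjustable: a high penalty forces a near-linear alignment, while a low one lets the algorithm absorb tempo differences between takes.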

The GUI for presenting the alignment could reuse the existing code that allows a track to be split into smaller chunks that can be shifted around, augmented with a 2D similarity 'plot'. If there is time, an enhanced interface that caters more directly to the two use cases could be provided.

Early spinoffs from this work:

  • A method for scoring the similarity of two spectra, built into Audacity.
  • A 2D graphical display that will show the similarity of two spectra across the different frequencies.




  • 2. Computed Automation Tracks (suggested by )

In many ways Audacity is just a specialised multi-track chart recorder. This project is to add a new type of track: one that shows multiple computed automation variables. Rather than being stored, these are computed on demand. The immediate application is to give more flexibility in segmenting speech: such tracks can give feedback on where the existing algorithms propose to segment, allowing fine tuning of the parameters by adjusting the threshold.
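A compute-on-demand thresholding track might look something like the sketch below. The class and its methods are hypothetical, not part of Audacity's actual track API; windowed RMS against a threshold stands in for whatever segmentation measure the existing algorithms use:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of a compute-on-demand automation track.  Values are derived
// from the audio whenever a region is displayed; nothing is stored in
// the project, so adjusting the threshold slider is cheap - there is
// no cached data to invalidate.
class ThresholdAutomationTrack {
public:
    ThresholdAutomationTrack(const std::vector<float>& samples,
                             size_t window, float threshold)
        : mSamples(samples), mWindow(window), mThreshold(threshold) {}

    void SetThreshold(float t) { mThreshold = t; }

    // 1.0 where the windowed RMS exceeds the threshold (candidate
    // speech), 0.0 elsewhere (candidate segment boundary).
    float ValueAt(size_t frame) const {
        size_t start = frame * mWindow;
        double sum = 0;
        size_t n = 0;
        for (size_t i = start;
             i < start + mWindow && i < mSamples.size(); ++i) {
            sum += double(mSamples[i]) * mSamples[i];
            ++n;
        }
        double rms = n ? std::sqrt(sum / n) : 0.0;
        return rms > mThreshold ? 1.0f : 0.0f;
    }

private:
    const std::vector<float>& mSamples;  // audio, not owned or copied
    size_t mWindow;                      // samples per automation frame
    float mThreshold;                    // the user-adjustable parameter
};
```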

If there is time, the computed automation tracks could be used to control parameters in one or more other effects, not just for segmenting audio.

Another direction this could be taken is improving the transcription processing in Audacity.

Early spinoff from this work:

  • Implementation of a thresholding automation track, using an existing threshold slider GUI element backported from Audacity-extra and adding the new compute-on-demand code.




  • 3. Postfish Integration (suggested by )

Postfish is a pthreads-based Linux audio application with uncompromising quality and some unique effects, such as 'deverb', which removes unwanted reverb from a track. Currently it can't be used from within Audacity.

Integrating it into Audacity will be a challenge indeed, particularly on the Windows side, where we anticipate problems with a simple translation to wxThreads. The student will need to be prepared to dive into such code.

We also have to be mindful that Audacity will usually be run on machines with far less processing power than the ideal for Postfish; most Audacity users do not have dual-Xeon 3 GHz systems. With single effects and on powerful machines, Postfish will be able to run in real time. However, for its integration into Audacity we want to fall back gracefully to a non-real-time mode when necessary.

We have some novel ideas for a progress-indicator interface that will allow this to happen gracefully. We want to decouple the advance buffering of an effect from playback and, to illustrate it, provide a progress indicator alongside the wave track. We'll then have more graceful degradation of behaviour when effects become too expensive to compute in real time: a user can set an effect rendering and, whilst it continues to render, start playing the rendered audio, with the 'read head' gradually catching up with the 'effect'.
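The decoupled buffering described above can be sketched as follows. This is a single-threaded illustration of the data flow only; a real implementation would need locking or atomics between the render and playback threads, and the class name is our invention:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Sketch of decoupling effect rendering from playback.  The renderer
// advances a 'rendered' watermark; the playback read head may consume
// audio only up to that watermark, so playback can begin before
// rendering has finished and gradually catch up with it.
class RenderAheadBuffer {
public:
    explicit RenderAheadBuffer(size_t totalSamples)
        : mBuffer(totalSamples, 0.0f), mRendered(0), mReadHead(0) {}

    // Called from the (possibly slow) effect-rendering thread.
    void Render(const float* data, size_t count) {
        count = std::min(count, mBuffer.size() - mRendered);
        std::copy(data, data + count, mBuffer.begin() + mRendered);
        mRendered += count;
    }

    // Called from the playback thread; returns how many samples were
    // actually delivered.  Never reads past the rendered watermark.
    size_t Read(float* out, size_t count) {
        size_t avail = std::min(count, mRendered - mReadHead);
        std::copy(mBuffer.begin() + mReadHead,
                  mBuffer.begin() + mReadHead + avail, out);
        mReadHead += avail;
        return avail;
    }

    // Fraction rendered so far, for drawing the progress indicator
    // alongside the wave track.
    double Progress() const {
        return double(mRendered) / double(mBuffer.size());
    }

private:
    std::vector<float> mBuffer;
    size_t mRendered;   // samples rendered so far
    size_t mReadHead;   // samples played so far (always <= mRendered)
};
```

When an effect is too expensive for real time, `Read` simply delivers fewer samples than requested and playback stalls at the watermark rather than glitching, which is the graceful degradation the paragraph describes.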

Early spinoffs from this work:

  • Demonstration of a simple real-time echo effect using the same thread structure as planned for Postfish integration.
  • Demonstration of existing offline effects rendering with a progress indicator alongside the track, rather than a 'standard' progress indicator. In particular it must be possible to zoom in and see the render progress in greater detail whilst it is in progress.




  • 4. Intuitive cross-fading (suggested by Matt Brubeck )

One of the most common operations people want to do when mixing audio is to smoothly transition between two sound clips. This is commonly called a cross-fade. This operation is technically possible in Audacity now, but it is very clunky, requiring multiple steps and offering no editability short of undoing the actions and starting again. We are looking for someone to implement a clean, intuitive, nondestructive cross-fade for Audacity. Audacity already has all of the infrastructure necessary to support implementing this operation nondestructively, and we already have a clear plan for how it should work. The following webpage has a mockup of what we think the GUI might look like:

http://limpet.net/mbrubeck/temp/cross-fade/

This feature, while seemingly small, would represent a huge boost in usability for Audacity. This feature is intimately related to several other UI enhancements that we have proposed: for example, one element of this proposed GUI is that clips "stick" to each other or "snap" into place when you push them together. Such a snap-to behavior would be great in several other circumstances, for example having a track stick to t=0, or to a point that lines up with another track.
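The nondestructive mix in the overlap region could be computed at render time roughly like this. The sketch assumes an equal-power gain curve; the actual curve would be a design decision, and the function name is ours:

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Sketch of a nondestructive cross-fade: the overlapping region of
// two clips is mixed at render time, leaving both source clips
// untouched, so the fade stays editable afterwards.
std::vector<float> CrossFade(const std::vector<float>& outgoing,
                             const std::vector<float>& incoming) {
    size_t n = outgoing.size();   // both vectors: the overlap region only
    std::vector<float> mixed(n);
    const double kPi = 3.14159265358979323846;
    for (size_t i = 0; i < n; ++i) {
        double t = n > 1 ? double(i) / (n - 1) : 1.0;
        // Equal-power curves: gainOut^2 + gainIn^2 == 1 at every
        // point, avoiding the level dip a linear cross-fade produces.
        double gainOut = std::cos(t * kPi / 2);
        double gainIn  = std::sin(t * kPi / 2);
        mixed[i] = float(gainOut * outgoing[i] + gainIn * incoming[i]);
    }
    return mixed;
}
```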

Early spinoffs from this work:

  • Ability for all effects to be faded in/out automatically. This can avoid clicks in some circumstances.
  • Label tracks to stick to the track above, so that they edit together.





  • 5. Play-Back Enhancements (First two parts suggested by James Crook)


Audacity lags behind commercial audio software in a number of details of its playback behaviour. Specific enhancements we would like a summer student to provide are:

  • No 'click' on start/stop/loop; At the moment there usually is an audible click when Audacity starts or stops playing a sound, or in iterations of playing a loop. A very short fade-in and fade-out applied only to playback should fix this.
  • Loop play adjusts dynamically to boundaries being moved; Finding the precise boundaries of a sound, for example an unwanted sound to be fixed, can be difficult with Audacity as it currently is. The location of the sound isn't obvious from the waveform. The new option would allow playing the sound in a loop, adjusting the boundaries to find out exactly where it starts and stops.
  • Vari-speed playback; Fast playback of sound allows sections of audio to be located more rapidly. Slow playback allows precise location (on timeline) of sound to be determined more accurately.
  • Drag-playback-cursor whilst playing; This requires changes to both playback and GUI. It is an extension of vari-speed playback and would make locating sound more rapid.
  • Play all 'labels' on the selected label tracks; Labels can be placed on the audio both manually and automatically. This has many uses, one being the possibility of previewing a recording whilst skipping over periods of silence.
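The first enhancement, click-free start/stop, amounts to a very short gain ramp applied to the playback buffer only. The sketch below assumes a linear ramp and leaves the ramp length to the caller; the function name is ours:

```cpp
#include <cstddef>
#include <vector>

// Sketch of 'no click on start/stop/loop': fade the first and last
// rampLen samples of the playback buffer.  Only the playback copy is
// modified; the track data itself is untouched.  A ramp of a few
// hundred samples is typically inaudible as a fade but removes the
// step discontinuity that causes the click.
void ApplyEdgeFades(std::vector<float>& playbackBuffer, size_t rampLen) {
    size_t n = playbackBuffer.size();
    if (rampLen > n / 2) rampLen = n / 2;   // don't let fades overlap
    for (size_t i = 0; i < rampLen; ++i) {
        float gain = float(i) / float(rampLen);  // 0 -> 1
        playbackBuffer[i] *= gain;               // fade in at the start
        playbackBuffer[n - 1 - i] *= gain;       // fade out at the end
    }
}
```

For loop play, applying the same ramps at each loop boundary would remove the click between iterations as well.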

Early spinoffs from this work:

  • TBD