More GSoC Ideas from past years

From Audacity Wiki
Revision as of 17:32, 12 March 2008 by Vaughan (talk | contribs)

Ideas

Possible Mentors: These are, in order, the most likely or best mentors for this project idea.
Skills: For nearly all project ideas, wxWidgets and C++ programming are essential. Some projects need additional skills too.
Difficulty: 'Easy', 'Moderate' or 'Hard'.

Hard tasks can be made easier by solving a simpler problem. That's a decision that needs to be made early on. 'Easy' problems are lower risk. They are better suited to being done in separate smaller pieces if the going gets tough. So take the 'difficulty' grading with a grain of salt. It's only a guideline of how hard we think the problem is.

Early Spinoffs: We regard it as vital that projects have early spinoffs that can be completed well within the time. These early spinoffs help to ensure that the code is useful to us. We don't want to end up with 'almost complete' code that we cannot use!


Below is a list of more potential projects.

Feel free to suggest your own ideas as well.

  • When the idea is well defined and mentors have been found, it can be moved to the GSoC Ideas page.

Command Scripting Support

Possible Mentors:

  • TBD (Existing work by James Crook, proposed for GSoC by Richard Ash)

Description: Audacity's core audio processing facilities would be useful for many purposes besides the normal use as an audio editor. For many of these, some means of controlling Audacity from another application will be needed. This can also be used for creating advanced scripts using Audacity to perform complex sequences of operation. Rather than trying to incorporate a full scripting language into Audacity, we want to develop a command interface so that a script written in any language (perl, python, javascript, bash ....) can "drive" Audacity by sending commands to it and receiving feedback. Currently an experimental implementation exists for Windows only using named pipes. This project would aim to:

  • Provide a cross-platform implementation, using appropriate IPC structures such as ptys, and possibly TCP/IP on all platforms
  • Make other changes in Audacity to move the code from experimental to production status.

The main issue in the latter category is handling of errors in Audacity. Currently most errors result in an error dialog being displayed directly. This is neither technically possible nor desirable when Audacity is the back-end of another process. Three modes of operation are possible: Normal Audacity, where the error is in the main thread; a graphical scripting client where the error is in a different thread to the user interface; and a non-graphical scripting client where there is no user interface to display a dialog. A means is needed to replace the dialog in such cases and handle the error, along the lines of:

  • error in main thread: show dialog normally
  • error in script thread, GUI mode: post a message to the main thread to show the dialog.
  • error in script thread, non-GUI mode: return a text string indicating an error.
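
The three cases above could be routed through a single reporting function. The following is a minimal C++ sketch, not Audacity code: `ErrorContext`, `ErrorSinks` and `ReportError` are hypothetical names, and the two callbacks stand in for whatever dialog and event-posting mechanisms a real implementation would use (in a wxWidgets build, the second callback might wrap an event posted to the main thread). The "ERROR: " prefix on the returned string is also an assumption.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical names throughout; none of these come from Audacity's code base.
enum class ErrorContext { MainThread, ScriptThreadGui, ScriptThreadNoGui };

struct ErrorSinks {
    // Shows a dialog immediately (only safe on the main thread).
    std::function<void(const std::string&)> showDialog;
    // Queues a request for the main thread to show the dialog later.
    std::function<void(const std::string&)> postToMainThread;
};

// Returns an empty string when the error was handed to the GUI,
// or a text line for the script client when there is no GUI.
std::string ReportError(ErrorContext ctx, const std::string& message,
                        const ErrorSinks& sinks)
{
    switch (ctx) {
    case ErrorContext::MainThread:
        assert(sinks.showDialog != nullptr);
        sinks.showDialog(message);
        return {};
    case ErrorContext::ScriptThreadGui:
        assert(sinks.postToMainThread != nullptr);
        sinks.postToMainThread(message);
        return {};
    case ErrorContext::ScriptThreadNoGui:
        break;
    }
    // Non-GUI mode: the caller sends this string back to the script.
    return "ERROR: " + message;
}
```

Keeping the decision in one place means individual error sites do not need to know which of the three modes Audacity is running in.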

Skills:

  • wxWidgets and C++.
  • Helpful to already have knowledge of IPC on Windows and at least one other platform - but a well-honed ability to find technical information on the internet will suffice if not.

Difficulty:

  • Moderate.

Early spinoffs from this work:

Either

  • Scripting compiles and is usable on more than one platform

Or

  • scripting works in all three modes on Windows.


Multi-Channel Audio support

Possible Mentors:

  • TBD (project was proposed by Richard Ash)

Description: Audacity is currently only designed with stereo and mono audio in mind. There is a mechanism for exporting tracks from a project to separate channels in a file, but there is no mechanism for panning audio between more than two channels. Given the widespread use of multiple channel audio (5.1 surround, ambisonics etc) it would be good if Audacity could support working with these formats. Adding this support requires a number of separate changes, each of which could be quite far-reaching in the existing Audacity code base:

  • Allow arbitrary sized groups of tracks to be linked, rather than just pairs of tracks. This is currently very badly abstracted in the code, so will mean a lot of fixing of existing code to use the new track group interface.
  • Create a multi-channel capable panning module to replace the current pan control in feeding mono tracks to a multi-channel output. This must cope with a number of different multi-channel formats, and be extensible in the future for more. Cinelerra may have something suitable for 5.1 as a starting point.
  • Provide multi-channel playback in Audacity. Some simple implementations of this exist, but do not use multi-channel mixing of content. The Audacity mixer needs to become multi-channel capable if it is not already, and be linked up to the sound card. Some system for coping if the sound device does not support the Audacity format will be needed (more on platforms with inflexible audio APIs than on those with a plug-in architecture like ALSA where this can probably be done outside Audacity)
  • Provide multi-channel export from Audacity. Mixing shared with above, but may need work on exporter modules to provide things like channel mapping control if needed by the format.
  • Enable track group support in multi-channel importers, so that multi-channel files come in as a track group not lots of mono tracks as is currently the case. This is probably the simplest section of the work.
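
As an illustration of what the panning module in the list above has to compute, here is a minimal C++ sketch of pairwise constant-power panning of a mono source across a ring of speakers. It is only one possible approach and rests on several assumptions: the function name `PanGains`, the representation of a layout as sorted speaker angles in degrees, and the choice of the cosine/sine pan law are all made up for the example. It does not handle an LFE channel, speaker elevation or ambisonic formats.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// speakerAngles: speaker positions in degrees within [0, 360), sorted ascending.
// Returns one gain per speaker. At most two gains are non-zero, and their
// squares sum to 1, so perceived power stays roughly constant while panning.
std::vector<double> PanGains(const std::vector<double>& speakerAngles,
                             double sourceAngle)
{
    assert(std::is_sorted(speakerAngles.begin(), speakerAngles.end()));
    const std::size_t n = speakerAngles.size();
    std::vector<double> gains(n, 0.0);
    if (n == 0) return gains;
    if (n == 1) { gains[0] = 1.0; return gains; }

    // Wrap the source angle into [0, 360).
    double angle = std::fmod(sourceAngle, 360.0);
    if (angle < 0.0) angle += 360.0;

    const double halfPi = std::acos(0.0);
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t j = (i + 1) % n;            // next speaker around the ring
        double span = speakerAngles[j] - speakerAngles[i];
        if (span <= 0.0) span += 360.0;               // pair that crosses 0 degrees
        double offset = angle - speakerAngles[i];
        if (offset < 0.0) offset += 360.0;
        if (offset <= span) {
            const double t = offset / span;           // 0 at speaker i, 1 at speaker j
            gains[i] = std::cos(t * halfPi);
            gains[j] = std::sin(t * halfPi);
            return gains;
        }
    }
    return gains; // not reached for sorted, distinct angles
}
```

For example, with speakers at 0, 90, 180 and 270 degrees, a source at 45 degrees is shared equally between the first two speakers and is silent in the other two.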

Skills:

  • wxWidgets and C++
  • Some portions may need knowledge of psychoacoustics for doing 3-D panning control
  • access to some multi-channel audio hardware would be advantageous.

Difficulty:

  • Moderate.

Early spinoffs from this work:

  • Track grouping could be done independently of the rest. The same may also be true of a better panning control, and cleaning up the current odd implementation of mono/left/right/stereo tracks.

Smart Help Infrastructure

Possible Mentors:

  • TBD

Description:

(a) The preferences panel in Audacity is becoming more difficult for new and experienced users as more preferences are added. There are conflicting forces.

  • We want to keep the descriptions on the dialogs short so that the dialogs are not cluttered.
  • Simultaneously we want more explanatory text so that people can find out what the preferences actually do.

The proposal is to link a wxHTML help window with the preference panel. The preference panel will have short descriptions. The HTML help window will have longer descriptive text. The link will be two-way, and be at the level of static boxes. The HTML window will highlight text, and if needed scroll, when the static boxes are clicked on. Conversely, clicking on an icon in the HTML text will highlight it in the text, move the preferences dialog to the right page and highlight the appropriate static box.

(b) We also look for ways to build help screenshots directly from the program. We already have a built-in screenshot tool. This could be augmented to automatically collect all the screenshots needed from an Audacity build. This considerably reduces the work when there are changes in the interface. It also paves the way for a smaller distribution. Images in the help files are not needed since they can be generated on the user's machine. An additional advantage is that the images are custom to the OS on which Audacity is running.

(c) Lists of the available effects and what they do can be built up by the program, so that help generates a custom html page describing the effects actually installed. Similarly the key bindings, mouse bindings and menu bindings can be 'walked' to generate some of the html help files.

It is intended that this new code be released under the wxWidgets license, which is compatible with us releasing Audacity under GPL, so that it can be used in other wxProjects too as part of the Application Framework initiative.

Skills:

  • wxWidgets and C++

Difficulty:

  • Moderate. Each part not too technically challenging, but there is a lot to do.

Early spinoffs from this work:

  • Either (a) or (b) essentially completed at the half way stage.


Audio 'Diff'

Possible Mentors:

Description:

The ability to compare and align two sound sequences, just as one compares text using diff, would be a powerful new feature in Audacity. It would greatly facilitate the combining of sounds from multiple 'takes' of the same track. It would also be of use to people looking to identify particular known sounds, e.g. repeated themes in birdsong, in very long recordings.

The implementation idea is conceptually simple. The spectra in two sounds being compared are computed at regular spacings - using existing Audacity code. A metric for spectral similarity is written. In the first incarnation it can be a correlation function.

The alignment (diff) of the two sounds is computed using standard least-distance algorithms, using an adjustable parameter which is the penalty for stretching the sound and the spectral similarity score.
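
To make the idea concrete, here is a minimal C++ sketch of one way such an alignment score could be computed. It assumes the spectra have already been extracted as equal-length vectors, uses Pearson correlation as the similarity metric, and uses a dynamic-programming recurrence in the style of dynamic time warping as the "least-distance" algorithm. That choice, and the names `Spectrum`, `SpectralCorrelation` and `AlignmentCost`, are assumptions made for the example, not a description of existing Audacity code.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

using Spectrum = std::vector<double>;

// Pearson correlation of two equal-length spectra, in [-1, 1].
// Returns 0 when either spectrum is flat (correlation undefined).
double SpectralCorrelation(const Spectrum& a, const Spectrum& b)
{
    assert(a.size() == b.size());
    const std::size_t n = a.size();
    if (n == 0) return 0.0;
    double meanA = 0.0, meanB = 0.0;
    for (std::size_t i = 0; i < n; ++i) { meanA += a[i]; meanB += b[i]; }
    meanA /= static_cast<double>(n);
    meanB /= static_cast<double>(n);
    double cov = 0.0, varA = 0.0, varB = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double da = a[i] - meanA, db = b[i] - meanB;
        cov += da * db; varA += da * da; varB += db * db;
    }
    const double denom = std::sqrt(varA * varB);
    return denom > 0.0 ? cov / denom : 0.0;
}

// Least total cost of aligning sequence x with sequence y.
// Matching two frames costs (1 - correlation); repeating a frame of either
// sequence (stretching) additionally costs stretchPenalty.
double AlignmentCost(const std::vector<Spectrum>& x,
                     const std::vector<Spectrum>& y,
                     double stretchPenalty)
{
    const double inf = std::numeric_limits<double>::infinity();
    const std::size_t n = x.size(), m = y.size();
    if (n == 0 || m == 0) return inf;
    std::vector<std::vector<double>> cost(n, std::vector<double>(m, inf));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            const double local = 1.0 - SpectralCorrelation(x[i], y[j]);
            if (i == 0 && j == 0) { cost[i][j] = local; continue; }
            double best = inf;
            if (i > 0 && j > 0) best = std::min(best, cost[i - 1][j - 1]);
            if (i > 0) best = std::min(best, cost[i - 1][j] + stretchPenalty);
            if (j > 0) best = std::min(best, cost[i][j - 1] + stretchPenalty);
            cost[i][j] = local + best;
        }
    }
    return cost[n - 1][m - 1];
}
```

Raising `stretchPenalty` makes the alignment prefer a frame-for-frame match, while lowering it lets one sound be stretched more freely against the other. Recovering the actual alignment path, rather than only its cost, would require keeping back-pointers in the table.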

The GUI for presenting the alignment could use the existing code that allows a track to be split into smaller chunks that can be shifted around, augmented with a 2D similarity 'plot'. If there is time, an enhanced interface that caters more directly to the two use cases could be provided.

Skills:

  • wxWidgets and C++
  • Audio DSP

Difficulty:

  • Hard. Only suitable for a student who has some familiarity with this kind of problem - and possibly already doing research on something related.

Early spinoffs from this work:

  • A method for scoring the similarity of two spectra built into Audacity.
  • A 2D graphical display that will show the similarity of two spectra across the different frequencies.


Computed Automation Tracks

Possible Mentors:

Description: In many ways Audacity is just a specialised multi-track chart recorder. This project is to add a new type of track, a track which shows multiple computed automation variables. Rather than being stored, these are computed on demand. The immediate application for these is to give more flexibility in segmenting speech. They can give feedback on where the existing algorithms are proposing to segment a track, allowing fine tuning of the parameters by adjusting the threshold.
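
A minimal C++ sketch of the compute-on-demand idea follows. The class `ThresholdAutomation` and its methods are hypothetical, and the use of a per-window RMS level as the automation variable is an assumption for the example. Nothing derived from the samples is stored, so changing the threshold takes effect the next time values are requested.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: levels and segment boundaries are recomputed from the
// samples every time they are asked for; no derived values are cached.
class ThresholdAutomation {
public:
    ThresholdAutomation(std::vector<float> samples, std::size_t windowSize,
                        float threshold)
        : mSamples(std::move(samples)), mWindow(windowSize), mThreshold(threshold)
    {
        assert(mWindow > 0);
    }

    void SetThreshold(float threshold) { mThreshold = threshold; }

    std::size_t WindowCount() const
    {
        return (mSamples.size() + mWindow - 1) / mWindow;
    }

    // RMS level of one window (the last window may be shorter than the rest).
    float Level(std::size_t window) const
    {
        const std::size_t begin = window * mWindow;
        if (begin >= mSamples.size()) return 0.0f;
        const std::size_t end = std::min(begin + mWindow, mSamples.size());
        double sumSquares = 0.0;
        for (std::size_t i = begin; i < end; ++i)
            sumSquares += static_cast<double>(mSamples[i]) * mSamples[i];
        return static_cast<float>(
            std::sqrt(sumSquares / static_cast<double>(end - begin)));
    }

    bool IsAbove(std::size_t window) const { return Level(window) > mThreshold; }

    // Indices of windows where the signal crosses the threshold, i.e. the
    // places where a segmenting algorithm would propose a boundary.
    std::vector<std::size_t> Boundaries() const
    {
        std::vector<std::size_t> result;
        for (std::size_t w = 1; w < WindowCount(); ++w)
            if (IsAbove(w) != IsAbove(w - 1))
                result.push_back(w);
        return result;
    }

private:
    std::vector<float> mSamples;
    std::size_t mWindow;
    float mThreshold;
};
```

A track display could call `Level` only for the windows currently visible, and redraw the proposed boundaries whenever the user moves the threshold slider.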

If there is time, the computed automation tracks could be used to control parameters in one or more other effects, not just used for segmenting audio.

Another direction this could be taken in is in improving the transcription processing in Audacity.

Skills:

  • wxWidgets and C++

Difficulty:

  • Moderate.

Early spinoff from this work:

  • Implementation of a thresholding automation track, using an existing threshold slider GUI element backported from Audacity-extra and adding the new compute-on-demand code.


Bridges

Possible Mentors:

  • TBD; Federico Grau (for Rivendell); Bartek Golenzo (Octave).

Description: One way to grow the feature set of Audacity and at the same time to avoid re-inventing the wheel is to build compatibility 'bridges' between Audacity and other Open Source programs. This is an example of making Connected Open Source a reality. Two examples of bridges for Audacity are:

  • Bridge to Octave - Octave is an Open Source program for mathematical analysis and is useful in digital signal processing (DSP). Audacity could have a bridge to Octave that allows Octave to apply effects to Audacity waveforms and to annotate Audacity waveforms with labels.
  • Bridge to Rivendell - Rivendell is an Open Source program for radio station management. The Rivendell bridge already allows Audacity and Rivendell to exchange play-lists. Work on it would improve the integration.

A proposal for a new bridge should go into some detail as to what features of the other program will be bridged. Generally the plan should avoid extensive work on the other program, since the point of the project is to extend Audacity.

This is part of the Connected Open Source initiative, aiming to build more bridges between Open Source projects.

Skills:

  • wxWidgets and C++
  • Familiarity with Octave or Rivendell or other software being bridged to.

Difficulty:

  • Moderate to Hard. Student has to get familiar with two programs.

Early spinoff from this work:

  • A restricted bridge which exposes a smaller part of the functionality.


Feature Completion

Possible Mentors:

  • TBD (depends on details)

Description: Identify a feature of Audacity which is in development CVS but is in some way incomplete. Project proposal should describe how the feature would be much improved and brought to release candidate readiness.

Some features we have in CVS that are not yet ready for our stable builds include:

  • Transcription ToolBar - The algorithms for finding word boundaries are rather buggy and slow.
  • Themes - Too difficult to use as they stand, not all items are covered, does not allow theming of backgrounds yet.

Skills:

  • wxWidgets and C++

Difficulty:

  • Moderate.

Early spinoffs from this work:

  • To be specified in the proposal.


Application Framework Extraction

Possible Mentors:

  • TBD (depends on details).

Description: Enhance Audacity code through splitting generic code from application specific code. Make the generic code more generic and more useful to Audacity at the same time.

Why is this valuable?

  • Cleanly separating the generic code makes the code easier to work with. This benefits everyone working on Audacity.
  • Audacity is already an excellent way to learn how to use wxWidgets. Extracting general application code further increases the value to programmers of learning Audacity's code. That's because the framework can be reused with little change in other GPL programs that have nothing to do with audio.

The work on refactoring highlights opportunities for making minor features of Audacity more general, particularly on the GUI side. For example, we already have a simple GUI class for matching channels. It is Audio specific. This could be made into a general purpose widget for 'matching'. Improved graphics and flexibility would benefit our application at the same time as making it useful to other projects.

We'd expect a student to propose some specific classes to work on, to say how they can refactor them and how in doing that they can add general purpose functionality that is directly valuable to us in our application. What's presented here is only an outline of possible directions. A successful student will need to convince us with sufficient detail.

Skills:

  • wxWidgets and C++
  • Very strong awareness of software reuse.

Difficulty:

  • Moderate to Hard depending on completeness.

Early spinoff from this work:

  • It is envisaged that refactoring and making code more generic would be done in stages, each stage useful in itself, rather than progressing all planned changes in tandem. The easiest way to get a visible spin off early is to focus on a particular GUI component. For example, a focus on the graphs could give us graphs that are not just tied to a single audio channel. This could be used to demonstrate overlaying a waveform graph over a spectral plot - easy with a more generic component, but not possible with our current arrangement.


Render-On-Demand

Possible Mentors:

Description: Audacity effects are currently 'batch' oriented. You apply an effect and wait for it to render. This is appropriate for Audacity which often runs on low spec machines. Only the simplest effects can reliably be applied fast enough to play them as they render.

Render-On-Demand would add an option to return immediately when applying an effect, giving greater responsiveness. The render would be applied later as the effect was played. There are several complications to achieving this. They include tracking the state of effects that have been partially rendered and dealing gracefully with running out of CPU resources. A project proposal would need to demonstrate a clear strategy for dealing with these issues.
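
One of the complications named above, tracking which parts of an effect have been rendered, can be sketched in a few lines of C++. The class `LazyEffect` and its block-based layout are hypothetical and much simpler than Audacity's real track storage. The sketch ignores threading and the problem of running out of CPU time during playback.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical sketch: an effect that returns immediately when "applied" and
// renders each block of audio the first time that block is needed.
class LazyEffect {
public:
    using BlockFn = std::function<void(std::vector<float>&)>;

    LazyEffect(std::vector<std::vector<float>> blocks, BlockFn effect)
        : mBlocks(std::move(blocks)),
          mRendered(mBlocks.size(), false),
          mEffect(std::move(effect))
    {
        assert(mEffect != nullptr);
    }

    // Returns the processed block, rendering it first if nobody has yet.
    const std::vector<float>& Block(std::size_t i)
    {
        assert(i < mBlocks.size());
        if (!mRendered[i]) {
            mEffect(mBlocks[i]);   // render in place, exactly once
            mRendered[i] = true;
        }
        return mBlocks[i];
    }

    bool IsRendered(std::size_t i) const { return mRendered.at(i); }

    // Number of blocks rendered so far; could drive a per-track display.
    std::size_t RenderedCount() const
    {
        return static_cast<std::size_t>(
            std::count(mRendered.begin(), mRendered.end(), true));
    }

private:
    std::vector<std::vector<float>> mBlocks;
    std::vector<bool> mRendered;
    BlockFn mEffect;
};
```

The per-block flags are the state that a visual display of rendering progress would read, and a background pass could call `Block` on the remaining indices when the machine is idle.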

Skills:

  • wxWidgets and C++

Difficulty:

  • Easy to Moderate.

Early spinoff from this work:

  • A visual display of the rendering state of an effect, alongside the affected track, rather than a progress indicator in a dialog. This would be used before any effects had been made 'render on demand'. You'd have restricted access to Audacity menus and buttons during a render. For example during a render you'd have the ability to examine the waveform for clipping, but not to edit sound or queue additional effects until the render completes.


Nyquist update

Possible Mentors:

  • Roger Dannenberg, TBD

Description: Upgrade Nyquist support to latest version.

Nyquist is a version of the LISP language built into Audacity.

Skills:

  • C++.
  • Developing on Linux and Windows.

Difficulty:

  • Moderate.

Early spinoff from this work:

  • Demonstrate the upgrade on one of the two platforms.
  • If there are no unanticipated hitches, the project will be easily achievable in the 8 weeks. A good project proposal based on this idea should include other improvements for the Audacity Nyquist user too. For inspiration, look at the archives of the Audacity Nyquist list at SourceForge. These improvements would happen in the second half of the project, after the basic upgrade.

LV2 support

Possible Mentors:

  • Vaughan Johnson, TBD

Description: Add support for the plugin architecture LV2, the new, improved descendant of LADSPA.

The plan for the project could look something like this:

1. Implement basic support for LV2 plugins, with functionality equivalent to the current LADSPA support. This shouldn't take too long, and can be considered an "early spin off".

2. Implement the relevant parts of the LV2 core spec that are not in LADSPA. This includes

  • scalePoints - labeled values of control parameters. For example a balance control could have a scalePoint at value -1 with the label "Left" and another at value 1 with label "Right", or a linear gain control could have one scalePoint at 1 with label "0 dB" and one at 0 with label "-∞ dB". These labels would be displayed in the plugin dialog. Some heuristics would be needed to decide which labels to show if there are many of them.
  • categories - the LV2 spec defines a hierarchy of categories that plugins can belong to (filters, distortions, reverbs etc) and plugins may also define their own categories. These would ideally be presented as submenus of the "Effects" menu, so you could go "Effects" -> "Filter" -> "Equaliser" -> "My cool EQ plugin" instead of "Effects" -> "Plugins 151-165" -> "My cool EQ plugin".
  • i18n - this is strictly not a new feature in LV2 since a LADSPA plugin can set its port names and plugin names to different values depending on the current locale environment settings. It's more explicit in LV2 though since the RDF format used for descriptive plugin data has i18n built in for any string value.

3. Once the infrastructure for categorised plugins is in place, it could be used for other plugin formats as well. Particularly, there is an extension library for LADSPA called LRDF that can be used to access RDF data about LADSPA plugins (provided that the plugin author has written such RDF data), containing among other things plugin categories that map pretty well to the LV2 categories.

4. Implement support for some useful existing LV2 extensions (not parts of the core spec), such as port groups (which could allow for nicer presentations of the plugin control parameters when there are lots of them), fixed-buffer-size extensions (useful for plugins with algorithms that work most efficiently with fixed-size blocks of data, such as FFTs) and icons (lets the plugin author associate an SVG icon to an LV2 plugin, could be nice to display e.g. in the "Effects" menu).

All these things would require adding some more dependencies to Audacity, at the very least a library for parsing and querying RDF (most likely Redland, since it seems to be the free RDF library) and probably also libslv2 and liblrdf. Since not everyone will want to install these libraries, all the code that depends on them should be optional and possible to turn off with options to the configure script.

Skills:

  • C++

Difficulty:

  • Moderate.

Early spinoff from this work:

  • Basic support for LV2 plugins, with functionality equivalent to the current LADSPA support.

VAMP support

Possible Mentors:

  • Chris Cannam, TBD

Description: Add support for VAMP (http://www.vamp-plugins.org/) to Audacity.

Skills:

  • C++.
  • Developing on Linux and Windows.

Difficulty:

  • Moderate.



More ideas at GSoC Ideas; the Feature Requests page may also suggest ideas for a project proposal.