GSoC 2009 - D1-1/GSoC Progress

From Audacity Wiki
Revision as of 15:04, 19 April 2009 by D1-1 (talk | contribs) (Responding to comments)

General Information

See D1-1/GSoC_Information for an overview of the project.

Progress Charts

This page has the latest charts and status reports: External Progress Page

In particular, see the Gantt chart for the approximate order in which I propose to tackle things.

Details of Planned Work

Here is a summary of the work I intend to carry out (not necessarily in chronological order), along with estimated time required.

Scripting Module

  • Provide a way for the script thread to delegate certain GUI actions (including the display of error and progress dialogs) to the main thread.
  • Provide a way for error messages to be returned to the script when necessary.
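One way the two points above could fit together is sketched below. It is only an illustration under stated assumptions: `MainThreadDispatcher`, `RunOnMainThread` and `ProcessPending` are hypothetical names, not existing Audacity or wxWidgets APIs, and the sketch uses only the C++ standard library. The script thread queues a task and blocks until the main thread has run it; the task's return value doubles as the error message passed back to the script.

```cpp
#include <cassert>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical helper, not part of Audacity: lets a script thread hand
// GUI work to the main thread and wait for an error string in return.
class MainThreadDispatcher {
public:
   // A task returns an error message; an empty string means success.
   using Task = std::function<std::string()>;

   // Construct on the main (GUI) thread so that its id can be recorded.
   MainThreadDispatcher() : mMainThread(std::this_thread::get_id()) {}

   // Called from the script thread. Blocks until the main thread has run
   // the task, then returns the task's error message to the caller.
   std::string RunOnMainThread(Task task) {
      assert(task);
      if (std::this_thread::get_id() == mMainThread)
         return task();  // already on the main thread: avoid deadlock
      std::packaged_task<std::string()> job(std::move(task));
      std::future<std::string> result = job.get_future();
      {
         std::lock_guard<std::mutex> lock(mMutex);
         mPending.push(std::move(job));
      }
      return result.get();
   }

   // Called from the main thread, e.g. from an idle or timer handler.
   // Runs all queued tasks and returns how many were executed.
   int ProcessPending() {
      int count = 0;
      for (;;) {
         std::packaged_task<std::string()> job;
         {
            std::lock_guard<std::mutex> lock(mMutex);
            if (mPending.empty())
               break;
            job = std::move(mPending.front());
            mPending.pop();
         }
         job();  // run outside the lock so tasks may queue more work
         ++count;
      }
      return count;
   }

private:
   std::thread::id mMainThread;
   std::mutex mMutex;
   std::queue<std::packaged_task<std::string()>> mPending;
};
```

In this sketch the main thread would call `ProcessPending()` periodically from its event loop, while the script thread wraps each GUI action (such as showing an error dialog) in a task passed to `RunOnMainThread()`. Blocking the script thread keeps commands sequential, which is a design choice made for the sketch rather than a requirement of the proposal.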

JC: Mentors have confirmed that if this project goes ahead they do want this and by the mid term deliverables date.

Track Views/Find Notes

  • Refactor TrackArtist so that different WaveTrack views are drawn by separate classes, and provide a composite view which allows alternative views of the same track to be shown alongside one another.
  • Optimise track drawing methods - for example, ensuring that buffered drawing is used and that analysis data is cached where it is beneficial - with the aim of making the more computationally intensive track views usable on less powerful machines.

JC: I'm assuming this means that you would keep about a screen's worth of buffered image around so that scrolling can be done with a bitblt. If so, I'd like to see at some time the option of smooth scrolling (waveform scrolls by with the cursor staying fixed) for fast enough machines. With our current paged scrolling, navigation is less good than it could be. Please comment on whether that would follow naturally from the proposed change, and if not, clarify what the proposed change is.

DH: Yes, automatic scrolling was something I had in mind when I proposed this, and I think the changes would be sufficient to allow this, at least on the drawing side. The changes I've been thinking about would be more general: the tracks would keep track (ha) of which regions are buffered already and which need redrawing. Policy concerning which pieces are most useful to cache could be determined afterwards, when experimentation is easier. Another possibility is a 'level of detail' system whereby a low-resolution rendering of, say, the whole track is stored to allow fast scrolling without using too much memory. I'd quite like to reduce Audacity's idle CPU usage. I suspect it's higher than it needs to be, and that this is because of the drawing. Of course, this needs further investigation/profiling to say for sure.
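The bookkeeping described here, where a track records which regions are already buffered and which need redrawing, might look roughly like the following. This is a minimal sketch, not a design taken from the Audacity source: the class name `BufferedRegions`, its methods, and the choice of half-open pixel-column ranges are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <map>
#include <utility>
#include <vector>

// Hypothetical helper: records which half-open pixel-column ranges
// [start, end) of a track already have an up-to-date buffered image.
class BufferedRegions {
public:
   using Range = std::pair<int, int>;

   // Record that [start, end) has been drawn, merging with neighbours.
   void MarkBuffered(int start, int end) {
      assert(start <= end);
      if (start == end)
         return;
      auto it = mRanges.lower_bound(start);
      // Step back if the previous range touches or overlaps start.
      if (it != mRanges.begin() && std::prev(it)->second >= start)
         --it;
      // Absorb every range that touches or overlaps [start, end).
      while (it != mRanges.end() && it->first <= end) {
         start = std::min(start, it->first);
         end = std::max(end, it->second);
         it = mRanges.erase(it);
      }
      mRanges[start] = end;
   }

   // Record that [start, end) is stale, e.g. after the audio was edited.
   void Invalidate(int start, int end) {
      assert(start <= end);
      if (start == end)
         return;
      auto it = mRanges.lower_bound(start);
      if (it != mRanges.begin() && std::prev(it)->second > start)
         --it;
      while (it != mRanges.end() && it->first < end) {
         const Range old = *it;
         it = mRanges.erase(it);
         if (old.first < start)
            mRanges[old.first] = start;  // keep the part before the hole
         if (old.second > end)
            mRanges[end] = old.second;   // keep the part after the hole
      }
   }

   // List the parts of [start, end) that still need to be redrawn.
   std::vector<Range> Missing(int start, int end) const {
      assert(start <= end);
      std::vector<Range> gaps;
      int cursor = start;
      auto it = mRanges.lower_bound(start);
      if (it != mRanges.begin() && std::prev(it)->second > start)
         --it;
      for (; it != mRanges.end() && it->first < end && cursor < end; ++it) {
         if (it->first > cursor)
            gaps.emplace_back(cursor, it->first);
         cursor = std::max(cursor, it->second);
      }
      if (cursor < end)
         gaps.emplace_back(cursor, end);
      return gaps;
   }

private:
   std::map<int, int> mRanges;  // start -> end, disjoint and sorted
};
```

With a structure like this, a paint handler would ask `Missing()` for the visible range, redraw only those gaps, and call `MarkBuffered()` afterwards. When scrolling, most of the visible range would already be buffered, so only the newly exposed strip would need drawing. Which ranges to keep or discard remains a separate policy question, as noted above.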

  • Improve the Find Notes algorithm to make it useful for the purpose of musical transcription. It should work well enough to be able to identify notes and chords in a solo piano recording with reasonable accuracy, and to display this to the user clearly. Amongst other things, it will need to be able to vary the number of notes detected adaptively.

JC: Please discuss whether any changes to the find-notes display will be made. For example, is it useful to show the algorithm's time-varying 'level of confidence' in the analysis?

DH: This is open to discussion. I think such a quantity would implicitly require a definition for 'the perfect output' with which to compare the actual output, and I'm not convinced there is an objective one. (For instance, what should the output be given white noise as input?) I think the only real option is to assume the user knows roughly what they want to get as output, and give them control over the analysis to ensure they get it. One change that would definitely be useful would be drawing a keyboard instead of, or as well as, the frequency scale. (I think the note track drawing code already does this, but it'd need a bit of modification.) Making the amplitude variations clearer (with colours) might also help, and setting the initial zoom level sensibly would probably also be good.
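Drawing a keyboard next to the frequency scale requires mapping each detected frequency to a key. The sketch below shows the standard equal-temperament conversion, assuming A4 = 440 Hz = MIDI note 69 and the convention that MIDI note 60 is written C4. The function names are hypothetical and are not taken from Audacity's note track code.

```cpp
#include <cassert>
#include <cmath>
#include <string>

// Nearest MIDI note number for a frequency in Hz, assuming equal
// temperament with A4 = 440 Hz = MIDI note 69.
int FrequencyToMidiNote(double hz) {
   assert(hz > 0.0);
   return static_cast<int>(std::lround(69.0 + 12.0 * std::log2(hz / 440.0)));
}

// Note name with octave number, where MIDI note 60 is written as C4.
std::string MidiNoteName(int note) {
   assert(note >= 0);
   static const char *const names[12] = {"C",  "C#", "D",  "D#", "E",  "F",
                                         "F#", "G",  "G#", "A",  "A#", "B"};
   return std::string(names[note % 12]) + std::to_string(note / 12 - 1);
}

// True if the note falls on a black key of a piano keyboard.
bool IsBlackKey(int note) {
   assert(note >= 0);
   switch (note % 12) {
   case 1: case 3: case 6: case 8: case 10:
      return true;
   default:
      return false;
   }
}
```

For example, `MidiNoteName(FrequencyToMidiNote(440.0))` yields `"A4"`. A keyboard view could use `IsBlackKey()` to choose the key colour for each row, and the note names could also serve any later export of detected notes.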

  • Fix the related bugs on the release checklist (e.g. P4 - Minimum and Maximum frequency settings don't work for Spectrum log(f) view).


  • P3 Jack problems
  • P3 Opening a second file while the first is playing
  • P4 Linux build fails with EXPERIMENTAL_SCOREALIGN defined
  • P3 Preferences dialog - clicking OK can sometimes cause a crash
  • Some failed assertions I encountered, including
    • Track.cpp(209) when deleting a label track
    • A wxWidgets problem in BatchProcessDialog
  • P3 Desynchronisation problem when pasting with audio and a label track
  • P2 Labels should move appropriately when timeline-changing effects are applied
  • P3 Ensure tooltips are updated when language is changed in preferences

JC: One of the GSoC requirements is (at the end) to submit a 'tarball' of code written. Please keep a folder with all your patches somewhere to make this step easy.

JC: If bugs on this list are solved by someone else, please note and add a bug of similar difficulty to the end of the list.

Optional Extras

  • Solve more of the issues from the release checklist or the 'not aiming' list.
  • Further extend the capabilities of the scripting module, for example to allow automatic taking of screenshots.
  • Refactor or optimise other areas of the code, if it becomes apparent that this would be useful.
  • Allow transfer of detected notes to a MIDI track, export of note names, or of a MIDI file.
  • Provide a general way of allowing alternative colourings to be used in spectrum view and other places.

Midterm Status

Due to the time constraints imposed by my exams, the bulk of the work would necessarily be completed in the period after the midterm evaluation. My main deliverable for the midterm period would be enhancements to the scripting module. I would provide code for a module which works on at least the Windows platform, allows a Perl script to control import and export of WAV files as well as the application of effects, and returns an error message to the script if a command is not recognised. In addition, I would provide fixes for at least three items from the release checklist.

Preliminary Work

The two main patches I submitted were:

  • A modification to TimeTextCtrl to ensure that the first non-zero digit is focused initially, rather than the first digit
  • A larger patch which refactored the generator effects and resolved a problem when the duration was zero

Additional Comments

My primary development platform is Arch Linux, but I can also test/develop on Windows XP if necessary. I have no way of tackling any Mac-only bugs, or other issues which I cannot reproduce for whatever reason.