Talk:GSoC 2009 - D1-1/GSoC Progress


Further Script Work

JC: A script that exercises each effect, reports the time taken and compares the result to a reference waveform would be useful. It would give people a working example of a script to modify when tracking down live issues. For end users, a function that reports the length of audio in a track and a function callable from script that sets the selection start and end by sample count would make scripting a lot more useful - consider fade in and fade out, which need exactly that information.
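Something along these lines would be a starting point. In this sketch do_command() is a stand-in for however commands actually reach Audacity's scripting interface, and the command names are placeholders rather than the real ones:

  import time
  import wave

  def do_command(cmd):
      # Placeholder: send one command to Audacity's scripting interface and
      # wait for its response.  The transport and the command names used
      # below are assumptions, not the actual scripting API.
      print("would send: %s" % cmd)

  def same_waveform(path_a, path_b):
      # Compare two WAV files sample-for-sample.
      a = wave.open(path_a, "rb")
      b = wave.open(path_b, "rb")
      try:
          return (a.getnframes() == b.getnframes() and
                  a.readframes(a.getnframes()) == b.readframes(b.getnframes()))
      finally:
          a.close()
          b.close()

  for effect in ("Amplify", "FadeIn", "FadeOut"):
      do_command("SelectAll")
      start = time.time()
      do_command(effect)
      print("%s took %.2f s" % (effect, time.time() - start))
      do_command("ExportWav: out_%s.wav" % effect)
      do_command("Undo")
      # ...then compare out_<effect>.wav against a stored reference with
      # same_waveform() and report any mismatch.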


Optimised Track Drawing

JC: I'm assuming this means that you would keep about a screen's worth of buffered image around so that scrolling can be done with a bitblt. If so, I'd like to see, at some point, the option of smooth scrolling (the waveform scrolls by while the cursor stays fixed) for fast enough machines. With our current paged scrolling, navigation is less good than it could be. Please comment on whether that would follow naturally from the proposed change, and if not, clarify what the proposed change is.

DH: Yes, automatic scrolling was something I had in mind when I proposed this, and I think the changes would be sufficient to allow this at least on the drawing side.

The changes I've been thinking about would be more general - the tracks would keep track (ha) of which regions are buffered already and which need redrawing. Policy concerning which pieces are most useful to cache could be decided afterwards, when experimentation is easier. Another possibility is a 'level of detail' system whereby a low-resolution rendering of, say, the whole track is stored to allow fast scrolling without using too much memory.

I'd also quite like to reduce Audacity's idle CPU usage - I suspect it's higher than it needs to be, and that the drawing is the reason. Of course this needs further investigation/profiling to say for sure.
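Roughly the kind of bookkeeping I have in mind - the tile width and the rendered 'image' objects here are placeholders, not how the real code would look:

  TILE_WIDTH = 256   # pixels per tile; an arbitrary placeholder value

  class TrackRenderCache:
      # Splits a track's waveform display into fixed-width tiles and
      # remembers which tiles already have a rendered image, so only
      # newly exposed or invalidated regions need redrawing.
      def __init__(self):
          self.tiles = {}          # tile index -> rendered image

      def invalidate(self, first_px, last_px):
          # Drop every tile that overlaps the changed pixel range.
          for i in range(first_px // TILE_WIDTH, last_px // TILE_WIDTH + 1):
              self.tiles.pop(i, None)

      def get_tile(self, i, render):
          # Return the cached image for tile i, rendering it on demand.
          if i not in self.tiles:
              self.tiles[i] = render(i * TILE_WIDTH, (i + 1) * TILE_WIDTH)
          return self.tiles[i]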

JC: One neat caching strategy, used by some ARM processors, is the 'random drop cache'. The idea is to have a fixed-size cache and drop a random item from it whenever a new item is added. This gives graceful degradation of cache performance, whereas a least-recently-used cache can end up performing no better than no cache at all in the worst case. A random drop cache would be my preferred starting point, as it is also easy to implement.
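A minimal sketch of the idea (what the cached items actually are doesn't matter):

  import random

  class RandomDropCache:
      # Fixed-capacity cache that evicts a randomly chosen entry when full,
      # so performance degrades gracefully instead of collapsing on access
      # patterns that happen to defeat least-recently-used eviction.
      def __init__(self, capacity):
          self.capacity = capacity
          self.items = {}

      def get(self, key):
          return self.items.get(key)

      def put(self, key, value):
          if key not in self.items and len(self.items) >= self.capacity:
              victim = random.choice(list(self.items.keys()))
              del self.items[victim]
          self.items[key] = value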

DH: Definitely sounds like a good starting point. I suppose the key is working out what the average case is, which is tricky without statistics on how people actually use it. I'd guess there's quite a bit of jumping around which is effectively random, but with a tendency to stay close to the current position. There might not be enough of a benefit to justify any extra complexity, though.

Find Notes View

JC: Please discuss whether any changes to the find-notes display will be made. For example, is it useful to show the algorithm's time-varying 'level of confidence' in the analysis?

DH: This is open to discussion - I think such a quantity would implicitly require a definition of 'the perfect output' with which to compare the actual output, and I'm not convinced there is an objective one. (For instance, what should the output be given white noise as input?) I think the only real option is to assume the user knows roughly what they want as output, and give them control over the analysis to ensure they get it. One change that would definitely be useful is drawing a keyboard instead of, or as well as, the frequency scale. (I think the note track drawing code already does this, but it would need a bit of modification.) Maybe also making the amplitude variations clearer (with colours). Setting the initial zoom level sensibly would probably also be good.

JC: I'd expect a quality measure to drop out fairly naturally. If you give the note-finding flute music - very pure tones - the found notes will account for nearly 100% of the signal. If you give it white noise, the found notes will account for nearly 0%. For piano music it depends on how closely you model the piano's notes and reverb, but accounting for 50% of the signal when there are few or no chords should be doable without it becoming a major research project all on its own. If a quality measure doesn't come out naturally, then displaying one - at least for a GSoC project - is not a good idea after all.
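Roughly the sort of measure I mean - a slow, back-of-envelope sketch that fits a sine and cosine at each found note frequency and reports the fraction of the signal's energy those notes account for:

  import math

  def fraction_explained(signal, note_freqs, rate):
      # Rough quality measure: fit a sine and cosine at each found note
      # frequency, subtract the fit, and return the fraction of the signal's
      # energy the notes account for.  White noise should score near 0, a
      # pure flute tone near 1; overtones the note-finder missed count
      # against the score.
      n = len(signal)
      residual = list(signal)
      for f in note_freqs:
          w = 2.0 * math.pi * f / rate
          c = sum(residual[t] * math.cos(w * t) for t in range(n)) * 2.0 / n
          s = sum(residual[t] * math.sin(w * t) for t in range(n)) * 2.0 / n
          for t in range(n):
              residual[t] -= c * math.cos(w * t) + s * math.sin(w * t)
      total = sum(x * x for x in signal)
      if total == 0:
          return 0.0
      return 1.0 - sum(x * x for x in residual) / total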

If you add a dialog to control the analysis - and that is entirely optional - then please have at most one continuously adjustable numerical parameter. This is a classic problem with analysis code from a user's perspective: finding the right combination of parameters can be like searching for a needle in a haystack. To fix that you then need more complex, ideally visual, feedback on the different stages of analysis. There is not time for that level of sophistication within GSoC - though it would be very welcome later on. So my advice is to keep it simple enough to be usable by end users within the GSoC time frame. We need what you do for GSoC to be usable by users. Given the other changes, probably no numerical parameter at all, but if there is one, at most one.

Happy to argue/discuss, as a lot of what I say is educated guesswork based on analogy. I'm not quite as dogmatic as I may sound.

DH: Thanks. I agree that simplicity is the order of the day - my concern with the quality measure was that, to be useful, it might have to be complex. With an instrument that has strong overtones, if the note-finding doesn't take this into account (quite likely, given the time available), it may just add extra notes at those frequencies. It would then account for more of the signal, but be less correct. That said, if a useful quality measure does appear naturally then I have no objection to displaying it.

I think one parameter will be plenty - probably just a threshold value of some sort. If it turns out that even that isn't needed then all the better.