Proposal Waterfall Spectrograms

From Audacity Wiki
Proposal pages help us get from feature requests into actual plans. This proposal is for "waterfall" or perspective style views of spectrograms.
Proposal pages are used on an ongoing basis by the Audacity development team and are open to edits from visitors to the wiki. They are a good way to get community feedback on a proposal.


  • Note: Proposals for Google Summer of Code projects are significantly different in structure, are submitted via Google's web app and may or may not have a corresponding proposal page.

Proposed Feature

A waveform is a two-dimensional plot of level against time, but a spectrogram is a three-dimensional plot of power against time and frequency. The usual view uses color as a third axis. An alternative is to use perspective. The result is a "landscape."
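To make the geometry concrete, here is a minimal sketch of one way such a perspective view could map a spectrogram point to screen coordinates. It is an illustration under assumptions, not the prototype's code: the class `WaterfallView`, its fields, and the exact meaning given to the two perspective parameters `slope` and `height` are all hypothetical. Each frequency row gets its own baseline, higher rows are shifted sideways by the slant, and power lifts the point above its baseline.

```python
from dataclasses import dataclass


@dataclass
class WaterfallView:
    # All names are hypothetical; none are Audacity identifiers.
    left: float               # screen x of time t0 on the lowest baseline
    bottom: float             # screen y of the lowest frequency baseline
    track_height: float       # pixels spanned by the frequency axis
    pixels_per_second: float  # horizontal zoom
    t0: float                 # time at the left edge, in seconds
    slope: float              # sideways shift, in pixels per pixel of baseline rise
    height: float             # elevation in pixels for full-scale power

    def project(self, t, f_frac, p_frac):
        """Map (time, frequency position 0..1, power 0..1) to screen (x, y).

        Screen y grows downward, so higher frequencies and higher power
        both give smaller y values.
        """
        rise = f_frac * self.track_height
        x = self.left + (t - self.t0) * self.pixels_per_second + self.slope * rise
        y = self.bottom - rise - self.height * p_frac
        return x, y


view = WaterfallView(left=0, bottom=200, track_height=100,
                     pixels_per_second=50, t0=0, slope=0.5, height=40)
print(view.project(1.0, 0.0, 0.0))  # lowest frequency, no power
print(view.project(1.0, 0.5, 1.0))  # mid frequency, full power
```

With `slope` set to zero and `height` set to zero this collapses to the ordinary flat layout, which is one reason to treat the two values as independent parameters.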

Is there any clamor for this from the users? Will this really be useful to anyone in editing practice? Hard for me to say. I just know it looks super cool, and has so far been easier to implement than it looks.

Developer/QA backing

  • Paul Licameli

First Prototypes

Some images of the work in progress.

Opaque waterfall view: Waterfall view of about two seconds of speech in "Opaque" or hidden-line removal style, with gridlines at octaves. The "slope" and "height" perspective parameters are exaggerated. Mel scale.

Translucent waterfall view: The same except for "translucent" or wireframe style. Crests and grid lines obscured from view in the previous image now appear faded.

Solid waterfall view: The same with less exaggerated perspective and in "Solid" view and a different choice of grid lines: "31 bands" corresponding to the graphic equalization sliders.
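How the prototype decides which crests are hidden is not described on this page. As an assumption, the following sketch uses the classic "floating horizon" approach to hidden-line removal: rows are visited from nearest to farthest, and a point counts as visible only if it rises above everything already drawn in its screen column. The function name `visible_mask` is hypothetical, and the sketch assumes each row has already been resampled onto the same screen columns.

```python
def visible_mask(rows):
    """Return, for each point, whether it is visible (floating horizon).

    rows: list of rows ordered nearest first; each row is a list of screen
    y values, one per screen column. Smaller y means higher on screen.
    """
    n_cols = len(rows[0])
    # Highest point drawn so far in each column (smallest y seen).
    horizon = [float("inf")] * n_cols
    mask = []
    for row in rows:
        row_mask = []
        for col, y in enumerate(row):
            visible = y < horizon[col]
            row_mask.append(visible)
            if visible:
                horizon[col] = y  # this point becomes the new horizon
        mask.append(row_mask)
    return mask


rows = [
    [100, 80, 100],  # nearest row, with a crest in the middle column
    [90, 90, 90],    # flat row behind it
    [95, 70, 85],    # farthest row
]
for row_mask in visible_mask(rows):
    print(row_mask)
```

Under the same assumption, an "opaque" style would skip the points marked `False`, while a "translucent" style would draw them faded instead.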

Problems

  • When selecting or scrubbing in such a view, the time deduced from mouse position should be corrected for the slant, depending on the y coordinate of the mouse. The same should also apply to clicks to zoom in, and to Ctrl+mouse wheel, which determines the center of zooming. I think this correction should not apply to time-shift, however, because a clip can be dragged to another track, and it would only get confusing if corrections applied in some tracks but not in others.
  • The play indicator, quick play line, and point cursor should no longer be drawn as a simple vertical line when crossing such a view, but should instead follow the curve of equal time. Perhaps the same should be done for the guides of the zoom tool and for the yellow snap lines that appear when selecting, but not for the yellow snap lines of the time-shift tool.
  • If you point at a "peak" of the graph for spectral selection, you are not selecting that frequency: you must instead point at the bottom of the hill. If you have an exaggerated vertical scale (as in the illustrations), the confusion gets worse. Rather than complicate the interpretation of mouse position further (correcting frequency as well as time), it might be easier to have a "crosshair" that follows mouse position to indicate what you would really select. This would draw one curve of equal time as for the play indicator and perhaps another for a curve of constant frequency. The curves would in this case pass under the cursor, not exactly through.
  • The gray background rectangles and sync-lock pocket watches should appear as parallelograms in such a view.
  • Zoom-to-selection and drag-zoom should fit the parallelogram, not the rectangle.
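Several of the points above reduce to the same slanted geometry, which can be sketched as follows. This is an illustration only, under the assumption that a row whose baseline sits `rise` pixels above the bottom of the track is shifted sideways by `slope * rise` pixels; every function and parameter name here is hypothetical. The first function corrects mouse time for the slant, the second traces a curve of equal time over the surface (for an indicator line or crosshair), and the third picks a zoom level so that the selection's parallelogram, rather than its rectangle, fits the view.

```python
def time_at_mouse(x, y, *, left, bottom, t0, pixels_per_second, slope):
    """Time under the mouse, corrected for the slant at the mouse's y."""
    rise = bottom - y  # pixels above the lowest baseline (screen y grows down)
    return t0 + (x - slope * rise - left) / pixels_per_second


def equal_time_curve(t, power_at, *, left, bottom, track_height, t0,
                     pixels_per_second, slope, height, steps=32):
    """Screen points along constant time t, from lowest to highest frequency.

    power_at(t, f_frac) is a caller-supplied function returning power in 0..1.
    """
    points = []
    for i in range(steps + 1):
        f_frac = i / steps
        rise = f_frac * track_height
        x = left + (t - t0) * pixels_per_second + slope * rise
        y = bottom - rise - height * power_at(t, f_frac)
        points.append((x, y))
    return points


def pixels_per_second_to_fit(t_start, t_end, *, view_width, track_height, slope):
    """Zoom level at which the selection's parallelogram spans the view width."""
    slant_width = abs(slope) * track_height  # extra width taken by the slant
    usable = view_width - slant_width
    if t_end <= t_start or usable <= 0:
        raise ValueError("selection is empty or view is too narrow for the slant")
    return usable / (t_end - t_start)


geometry = dict(left=0, bottom=200, t0=0, pixels_per_second=50, slope=0.5)
# Two mouse positions on the same slanted line report the same time.
print(time_at_mouse(50, 200, **geometry))
print(time_at_mouse(100, 100, **geometry))
```

Note that `time_at_mouse` ignores the elevation due to power, which matches the observation above that pointing at a peak does not select the frequency of that peak; a crosshair drawn with `equal_time_curve` would show the user what is really being selected.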

Further Ideas

Related

the best way to enable and disable spectral selection.