Suggested Frequency Analysis Capabilities
From Audacity Wiki
The addition of the "Plot Spectrum..." option in the "Analyze" menu in Audacity 1.3beta is great. Here are some suggestions for converting it into a robust and useful tool.
Frequency Analysis Window
- Wikipedia suggests that the correct name for the "Hanning" window is the Hann window, named after the inventor.
- Currently the "Frequency Analysis" window is generated by highlighting a region in the audio track and then selecting the "Plot Spectrum..." option in the Analyze menu. When the "Frequency Analysis" window is opened, there is an option at the bottom of the window which is set to "2048". 2048 is presumably the size of the transform. If so, then why must a region of audio be selected in order to open the "Frequency Analysis" window? It would be very much preferred if the "Frequency Analysis" window could be opened based on the current cursor position, rather than a region of audio. The current cursor time position would determine the start of the transform region, and the window size parameter in the "Frequency Analysis" window would determine the endpoint of the analysis window.
  - Extension of this comment by another editor: Analysis by highlighted region is a useful feature. I use this to select the duration of a specific note in a piece of music, and then run the Frequency Analysis on that specific tone. However, the mysterious option that is "set to 2048" (or 512 on my version, Audacity 1.2.6) could have an explanatory label added. My experiments suggest this is not the duration of the sample that is analysed. It seems to be linked to duration in some way though. Higher values, up to 16384, give sharper resolution of frequencies in the displayed spectrum, but demand that a longer segment of time is selected before performing the analysis. The result is that if the musical tone, say, is too short in duration, then you cannot analyse it in such fine detail. But that is not a feature request (we can't ask the developers to break the laws of physics and mathematics).
- The displayed "peak frequency" shown in the "Frequency Analysis" window is useful for tuning musical instruments. (I use this for tuning harmonica reeds.) However, it displays only to the nearest cycle per second. When checking a high "A" at 1760Hz the 1Hz resolution is ample. But at a low "A" of 110Hz an error of 1Hz represents a significant fraction of a semitone (about 16 cents). For this purpose long duration tones can easily be recorded, giving ample data for the analysis to chew on, if only an option to analyse at higher resolution were available (in that drop down list that currently goes from 128 to 16384?).
- Which audio track is being displayed in the spectrum when viewing a stereo file? Currently this is confusing -- is just the first channel of the file being displayed? Are both channels being mixed before the frequency transform is being done? It would be nice for the user to be aware of which channel is being used for the spectrum analysis, or to allow the user to choose how to select the audio tracks for generating the spectrum.
- While the "Frequency Analysis" window is open, clicking at another location in the audio track should update the "Frequency Analysis" plot with the spectrum at that newly selected point in the audio.
- It would be nice to have a keyboard short-cut to open/close the "Frequency Analysis" window.
- While the "Frequency Analysis" window is open, playback of the audio should cause a real-time update of the spectrum in the "Frequency Analysis" window.
- Convert the options at the bottom of the window into a toolbar such as the ones at the top of the main Audacity window (such as the "Audacity Control Toolbar", which contains the play button).
- I like the display of the frequency at the cursor point as a musical pitch-class/octave. Could you add a cents qualifier to the musical pitch description? For example, if the cursor is located at 443 Hz, that would be the pitch A4, but slightly sharper, so it would be nice to have the cursor display for that frequency read: A4+12, since it would be 12 cents sharp of A4 in equal temperament at A4 = 440 Hz. If the frequency were 437 Hz, then the cursor display could read A4-12, which would stand for A4 minus 12 cents.
- The peak value in the cursor status area of the "Frequency Analysis" window is somewhat confusing. This is mostly due to the low resolution (zoomed out) of the spectrum. Currently it is not a useful value. There is a recommendation about spectral peak measurement below which might tend towards having the "Peak" entry in the cursor status region removed.
- The axes of the spectrum display are overly verbose. For example, the frequency axis is labeled like: "22Hz, 3KHz, 6kHz, 10KHz, 13KHz, 16KHz, 19KHz". It would be a lot cleaner if instead the frequency axis were labeled something like: ".022, 3, 6, 10, 13, 16, 19 [kHz]". Likewise for the magnitude axis and the redundant dB unit specifiers.
- It would be nice if the frequency axis could be zoomed/panned like the time axis in the main Audacity window. Showing the entire spectrum range is usually not useful for musical purposes, since the more musically relevant region is 20 to 8000 Hz or so. At a minimum, it would be nice to have a frequency cutoff point. For example, the user could type in 1000 to see the frequency range of the spectrum from 0 to 1000 Hz.
- When the frequency axis is zoomed in very far, it would be nice for the individual frequency bins to be displayed as dots on the spectrum function, just like zooming into the time-domain audio in the main Audacity window.
- It is probably not necessary to have a zoom/pan facility for the magnitude axis of the spectrum, since the dB scale is logarithmic. However, it would be nice to have a cutoff point for the lowest dB value to display. Currently the display covers about 100 dB of dynamic range. For sample resolutions greater than 16 bits it would be nice to increase the range, and for practical purposes it might be nice to limit the dynamic range so that a noise floor can be hidden for older analog recordings.
- It would be nice to add a few more window types which are common in audio analysis, such as the Blackman 2, 3, and 4 term windows. Also, there are a few windows with a tunable parameter, such as the Kaiser window, so it might be nice to think about how to add an extra option for this parameter to the toolbar when that type of window is selected.
- Once zooming on the frequency axis has been implemented, it will be useful to add an option to the analysis toolbar which controls a "zero padding factor". This option would control the amount of frequency interpolation between bins in the frequency transform. The factor would control the amount of zero padding done in the time domain before the transform is taken. For example, 0 would mean no zero padding, 1 would mean to add one extra frame of zeros to the analysis window, and 3 would mean to add 3 extra frames of zeros to the analysis window. Same for 7, 15, and as high as is practical. Zero padding in the time domain is equivalent to interpolation in the frequency domain. Therefore zero padding will give a finer grid of frequency values in the spectrum plot, although it does not by itself improve the ability to separate closely spaced tones.
- Along with zero padding, it would be nice to have a button in the analysis toolbar which could be clicked in order to have the computer search for the exact frequency value of the nearest peak to the cursor. The cursor would then move to that position, and the peak information would be displayed in the cursor status area in the "Frequency Analysis" window. This is useful for very exact measurement of the frequency represented by the peak in the spectrum. It can be measured by a combination of a high zero padding factor (which could be left up to the user) and some interpolation method, such as "parabolic peak interpolation", where you would take the top three points in the peak, fit a parabola through them, and identify the highest point in the fitted parabola.
- It would be nice if more than one frequency were displayed on the graph. Preferably, a range of frequencies would be represented on the Y axis, a range of time would be represented on the X axis, and a per-pixel color value would represent the Z axis, the magnitude at each frequency.
- It would be nice to be able to analyze a selection longer than 23.8s with Plot Spectrum (looking at a wav file, for example). If that is a hard limit, then give the user some control over how to distribute that time over an entire song. Something like 1s every 10s, 1s every 15s, type of thing.
- To facilitate comparison of different tracks, the ability to have a separate (independent) analysis display per track (hence multiple windows) would be great.
- Currently there is a hard upper limit of 100KHz for the maximum visible frequency in the spectrogram view. Most users will have audio files with a sample rate of 192KHz or less, so a 100KHz limit is a reasonable default computationally and memory-wise; however, a warning might be more appropriate than a hard limit. (Request from bsmiller in the official forum, useful for bioacoustics.)
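
Illustration of three of the suggestions above: the cents qualifier on the pitch readout, the effect of the zero padding factor on bin spacing, and parabolic peak interpolation. This is only a rough Python sketch, not Audacity code. All function names are hypothetical, and it assumes 12-tone equal temperament, magnitudes in dB, and that a zero padding factor of N means the transform length becomes (N + 1) times the window size.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_label(freq_hz, a4_hz=440.0):
    """Return a label such as 'A4+12' (note name, octave, cents offset)."""
    # Fractional distance from A4 in equal-tempered semitones.
    semitones = 12.0 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones)
    cents = round(100.0 * (semitones - nearest))
    midi = 69 + nearest                      # A4 is MIDI note 69
    name = NOTE_NAMES[midi % 12]
    octave = midi // 12 - 1
    # Omit the qualifier when the pitch is in tune to the nearest cent.
    return f"{name}{octave}" if cents == 0 else f"{name}{octave}{cents:+d}"


def bin_spacing_hz(sample_rate, window_size, zero_pad_factor=0):
    """Spacing between spectrum bins in Hz.

    Assumption: a factor of N appends N extra frames of zeros, so the
    transform length is window_size * (N + 1).
    """
    return sample_rate / (window_size * (zero_pad_factor + 1))


def parabolic_peak(mags_db, k):
    """Refine a peak at bin k using its two neighbours.

    Fits a parabola through (k-1, k, k+1) and returns the fractional
    bin position of the vertex and the interpolated magnitude there.
    """
    a, b, c = mags_db[k - 1], mags_db[k], mags_db[k + 1]
    denom = a - 2.0 * b + c
    if denom == 0:                           # flat: nothing to refine
        return float(k), b
    offset = 0.5 * (a - c) / denom           # lies in [-0.5, 0.5] at a true peak
    return k + offset, b - 0.25 * (a - c) * offset


def peak_frequency_hz(mags_db, k, sample_rate, window_size, zero_pad_factor=0):
    """Interpolated peak frequency in Hz for a peak found at bin k."""
    position, _ = parabolic_peak(mags_db, k)
    return position * bin_spacing_hz(sample_rate, window_size, zero_pad_factor)
```

With these definitions, pitch_label(443.0) gives "A4+12" and pitch_label(437.0) gives "A4-12", matching the example in the list. At a 44100 Hz sample rate, a 2048-point window gives bins about 21.5 Hz apart, which is why some form of interpolation is needed before a sub-1Hz peak readout is meaningful at low pitches.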