The intent of regression testing is to ensure that changes such as those mentioned above have not introduced new faults. One of the main reasons for regression testing is to determine whether a change in one part of the software affects other parts of the software.
Common methods of regression testing include rerunning previously completed tests and checking whether application behavior has changed and whether previously fixed faults have re-emerged. Regression testing can be made efficient by systematically selecting the minimum set of tests needed to adequately cover a particular change. Contrast this with non-regression testing (usually a validation test for a new issue), which aims to verify whether, after introducing or updating a given software application, the change has had the intended effect.
Regression test scripts will be linked from here once they are written. They should include:
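Selecting a small set of tests for a particular change could be sketched as below. This is only an illustration, assuming we had a mapping from each test to the source files it exercises; no such mapping exists yet, and `TestCoverage`, `selectTests` and any test or file names used with them are hypothetical. The greedy strategy gives a small set, not a guaranteed minimum.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical mapping: test name -> source files that the test exercises.
typedef std::map<std::string, std::set<std::string> > TestCoverage;

// Greedily pick tests until every changed file that has a test is covered.
std::vector<std::string> selectTests(const TestCoverage& coverage,
                                     std::set<std::string> changed) {
    std::vector<std::string> chosen;
    while (!changed.empty()) {
        std::string best;
        std::size_t bestGain = 0;
        for (TestCoverage::const_iterator it = coverage.begin();
             it != coverage.end(); ++it) {
            // Gain = number of still-uncovered changed files this test touches.
            std::size_t gain = 0;
            for (std::set<std::string>::const_iterator f = it->second.begin();
                 f != it->second.end(); ++f)
                gain += changed.count(*f);
            if (gain > bestGain) {
                bestGain = gain;
                best = it->first;
            }
        }
        if (bestGain == 0) break;  // remaining changed files have no test at all
        chosen.push_back(best);
        const std::set<std::string>& covered = coverage.find(best)->second;
        for (std::set<std::string>::const_iterator f = covered.begin();
             f != covered.end(); ++f)
            changed.erase(*f);
    }
    return chosen;
}
```

Changed files that no test covers are silently skipped here; in practice they are worth reporting, because they mark gaps in the test set.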
- LP/tape transcription to CD/iTunes
- Recording web-stream and exporting the result as MP3/WAV
- Multi-track editing
- Timing tests for common exports and common effects such as Amplify and Noise Reduction. The tests should take around two hours to complete.
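A timing test of the kind listed above might look roughly like this. It is a sketch under assumptions: `timeOperationMs` and `isSlowdown` are hypothetical helpers, the operation being timed stands in for a real export or effect, and the tolerance is whatever the team decides counts as a regression.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Run `operation` once and return the elapsed wall-clock time in milliseconds.
double timeOperationMs(const std::function<void()>& operation) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    operation();
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// True if the measured time exceeds the baseline by more than `tolerance`,
// where a tolerance of 0.2 means "more than 20% slower than the baseline".
bool isSlowdown(double measuredMs, double baselineMs, double tolerance) {
    return measuredMs > baselineMs * (1.0 + tolerance);
}
```

Recording the measured time alongside the pass/fail result keeps the raw figures available for comparison between builds.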
Regression testing should be undertaken on an ongoing basis as features are added or modified, driven by the entries in the changes log. You can work either with the posted alphas or with a test version of Audacity that you build yourself, incorporating the patch changes.
James 27Mar15: We should use and improve the scripting feature to make scripted tests. My experience is that automated pass/fail tests end up repeatedly testing things you already know work. A much better strategy is to build tests that
- (a) return performance information and
- (b) produce screenshots for the manual.
Performance information gives you 'more bits of information' per test: disk, RAM and CPU utilisation. Producing screenshots forces you to walk through nearly all of the code, and also potentially saves the documentation team time when changes are made to the user interface.
I am using some simple macros so that we can safely leave logging and performance monitoring permanently in the code, in both debug and release builds. These work with a count, so that they log the first ten times they occur, and then just count occurrences after that. That makes them almost zero cost even in release builds. The macros look like this:
    DIAG( message )
    TRACK_MEM( message, amount )
    PERFMON_START( message, timername )
    PERFMON_STOP( message, timername )
While the code is running, we can reset selected counters if we want more monitoring. The DIAG macro just logs a message. TRACK_MEM keeps track of memory allocated and deleted. PERFMON_START and PERFMON_STOP measure time intervals. If we additionally load the module mod-test, we get a running display of the counters.
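The 'log the first ten times, then just count' behaviour could be sketched roughly as below. This is not the actual Audacity implementation: `DiagCounter`, `DIAG_SKETCH` and `kLogLimit` are hypothetical names, and the reset semantics (re-arm logging, keep the running total) are an assumption.

```cpp
#include <cassert>
#include <iostream>

const long kLogLimit = 10;  // assumed: log only the first ten occurrences

// Hypothetical per-call-site counter.
struct DiagCounter {
    long count = 0;   // every occurrence since start-up
    long logged = 0;  // occurrences logged since the last reset

    // Count the occurrence; return true if it was also logged.
    bool hit(const char* message) {
        ++count;
        if (logged >= kLogLimit) return false;  // cheap path: count only
        ++logged;
        std::clog << "DIAG " << count << ": " << message << '\n';
        return true;
    }

    // Re-arm logging so the next ten occurrences are logged again.
    void reset() { logged = 0; }
};

// Sketch of a DIAG-style macro: one static counter per call site.
#define DIAG_SKETCH(message)             \
    do {                                 \
        static DiagCounter diagCounter_; \
        diagCounter_.hit(message);       \
    } while (0)
```

Once the limit is reached, each call costs an increment and a comparison, which is what makes it plausible to leave such macros in release builds. Resetting counters from outside, or showing them in a running display, would additionally need a registry of counters, which this sketch leaves out.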
Developers need to add detection of 'dropped packets' into the code. I plan to use DIAG for that. I have sometimes played a piece of audio and heard a glitch, and then played it again with no glitch. That is a fairly clear sign that the disk was not keeping up with the output stream: the second time, the audio would already be in memory.
We need some long running tests, such as recording for 24 hours, possibly with a test signal so we can confirm the quality of the recording. I suggest that the bulk of our long running tests should be for recording and be at the 4hr mark, with just a few outliers for longer periods. There are some known regressions with editing and handling of longer projects (https://bugzilla.audacityteam.org/show_bug.cgi?id=218 and https://bugzilla.audacityteam.org/show_bug.cgi?id=765 ) that need to be cleared before we can sensibly do long running tests.
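One way a test signal could confirm recording quality is to record a known ramp and look for breaks in it. The sketch below assumes a lossless digital loopback, so that recorded samples are bit-exact; it would not work through an analogue path. `makeRamp`, `findGlitches` and `kRampPeriod` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

const int kRampPeriod = 1 << 15;  // the ramp wraps after 32768 samples

// Generate the test signal: a counter that increases by one per sample.
std::vector<std::int16_t> makeRamp(std::size_t numSamples) {
    std::vector<std::int16_t> out(numSamples);
    for (std::size_t i = 0; i < numSamples; ++i)
        out[i] = static_cast<std::int16_t>(i % kRampPeriod);
    return out;
}

// Return the indices at which the recording fails to continue the ramp,
// which is where samples were dropped or repeated.
std::vector<std::size_t> findGlitches(const std::vector<std::int16_t>& rec) {
    std::vector<std::size_t> glitches;
    for (std::size_t i = 1; i < rec.size(); ++i) {
        const int expected = (rec[i - 1] + 1) % kRampPeriod;
        if (rec[i] != expected) glitches.push_back(i);
    }
    return glitches;
}
```

A dropout whose length is an exact multiple of the ramp period would go unnoticed, so the total sample count should be checked against the recording time as well.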
If we have a number of different automated long-running tests, Audacity may change before we have finished running all of them. We need to do this testing on an ongoing basis, rather than as a formal part of a release, because there is no time for long running tests at release time. We should 'drop in' an updated Audacity between tests, and continue the tests from there. Otherwise, if we reset to the start of the tests, the first few tests will get exercised a lot and the later ones will hardly ever run.
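Continuing from where the last run stopped, rather than restarting at the first test, could be done by saving the position in the test list between runs. The following is a minimal sketch, assuming the position is kept in a small text file; `loadNextIndex`, `saveNextIndex`, `nextTest` and the state file are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Read the saved position; start at 0 if there is no readable state file.
std::size_t loadNextIndex(const std::string& stateFile) {
    std::ifstream in(stateFile.c_str());
    std::size_t index = 0;
    if (!(in >> index)) index = 0;
    return index;
}

// Overwrite the state file with the position of the next test to run.
void saveNextIndex(const std::string& stateFile, std::size_t index) {
    std::ofstream out(stateFile.c_str(), std::ios::trunc);
    out << index << '\n';
}

// Return the next test in the rotation and advance the saved position,
// wrapping round to the first test after the last one.
std::string nextTest(const std::vector<std::string>& tests,
                     const std::string& stateFile) {
    assert(!tests.empty());
    const std::size_t index = loadNextIndex(stateFile) % tests.size();
    saveNextIndex(stateFile, (index + 1) % tests.size());
    return tests[index];
}
```

Because the position lives outside the program, an updated Audacity can be dropped in between two calls to `nextTest` and the rotation carries on, so every test gets a similar share of runs.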