Automation Project Progress

From Audacity Wiki
This page provides an overview of all progress on the automation project. It links to the ToDo checklists for the six component sub-projects.

Project proposal pages and preliminary work investigating the available tools were done in May 2017. The original plan was to complete all six parts of the automation project by the end of November 2017. The automation project is therefore taking about twice as long as planned.


Driving Audacity by an external script:

Conversion of the script from Perl to Python was easy.

  • I've made the C++ code more data driven, which reduces the 'proliferation of classes' in the old design.
  • As a general rule, getting lists of things rather than individual items is a win.
    • The lists are small enough. Less C++ code is needed if we always produce whole lists. We often need the whole lists anyway.

Currently iterating on the commands. The commands that gather data, 'getters', are generally data-dump style commands. I've made the commands use JSON format for their output. This is convenient everywhere, since Python has a JSON-reading library, and it is also directly convenient for What is That? (WIT), which is written in JavaScript. The setter commands are needed for generating images of waveforms for the manual. It turns out we already have a command for setting preferences; it needs an 'apply' command added so that the updated settings (e.g. for the theme) are applied and used.

I found that, rather than creating many new commands for get and set, having the GetInfo command fetch lists obviated the need for finer-grained getters. The Set commands now have new optional fields (which in dialogs have a check box beside them, making them usable within Chains), obviating the need for finer-grained setters. The result is scripting that is easier to use, with less programming.
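As a minimal sketch of the driving side, the snippet below builds a command string and sends it over mod-script-pipe, parsing the JSON reply. The pipe names are the conventional ones and the `Key="Value"` syntax and reply framing are assumptions here; check both against your Audacity build. Only `make_command` is pure; `get_info` needs a running Audacity with mod-script-pipe enabled.

```python
import json
import os
import sys

# Conventional mod-script-pipe names (an assumption; verify for your build).
if sys.platform == 'win32':
    TO_PIPE = r'\\.\pipe\ToSrvPipe'
    FROM_PIPE = r'\\.\pipe\FromSrvPipe'
else:
    TO_PIPE = f'/tmp/audacity_script_pipe.to.{os.getuid()}'
    FROM_PIPE = f'/tmp/audacity_script_pipe.from.{os.getuid()}'

def make_command(name, **params):
    """Build a scripting command string, quoting every value for safety."""
    args = ' '.join(f'{k}="{v}"' for k, v in params.items())
    return f'{name}: {args}' if args else f'{name}:'

def get_info(kind):
    """Ask a running Audacity for a list, and parse the JSON body of the reply."""
    with open(TO_PIPE, 'w') as to_pipe:
        to_pipe.write(make_command('GetInfo', Type=kind, Format='JSON') + '\n')
    with open(FROM_PIPE) as from_pipe:
        reply = from_pipe.read()
    # The reply ends with a status line; strip it before parsing.
    # (Exact reply framing is an assumption; adjust as needed.)
    body = reply.rsplit('\n', 2)[0]
    return json.loads(body)
```

A whole-list fetch such as `get_info('Tracks')` then replaces many finer-grained getters, as described above.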

Thumbnails of work in progress

Python Scripting

Stage 1: Python scripting, with extra commands too

  • Python Scripting now supported.
  • TrackInfo and other Getter/Setter commands enhanced.
  • Spaces in parameters fixed.
  • Proof-of-concept Demo Command.

Commands in Chains

Stage 2: Chains working, with menu commands too

  • This also eliminates the need for 'MenuCommand: CommandName='
  • Menu commands have the format: Human Readable Name (CrypticName)
  • This format aids look-up of cryptic names and the transition to Python scripting.
  • Commands like SetTrackInfo are now available as commands with parameters in Chains.
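The 'Human Readable Name (CrypticName)' format makes the cryptic name easy to extract mechanically. A small sketch (the helper name and sample labels are hypothetical):

```python
import re

def cryptic_name(menu_label):
    """Extract the script id from 'Human Readable Name (CrypticName)'.

    Returns None if the label carries no trailing cryptic name.
    """
    m = re.search(r'\(([^()]+)\)\s*$', menu_label)
    return m.group(1) if m else None
```

This is the kind of look-up that eases the transition from human-readable menu text to Python scripting calls.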


  • Dec 23 2017 - Python working now. Basics of list-getters implemented.
  • Jan 06 2018 - Checked into Audacity master. Mod-script-pipe is used to get the lists (menus, boxes) that drive WIT.
  • Feb 10 2018 - Setters available for preferences, tracks, clips, labels. Automation Reference now covers menu commands, and these are available direct from chains and batch.

Relevant Scripts:

  • - script to test the pipe works at all. Kept short.
  • - script to fetch the structural information from Audacity. This is used in making image maps.
  • - the script that invokes mod-script-pipe, and includes utility functions.
  • - the script that times and generates 240+ images by invoking scripts for labels, tracks, spectrograms and so on.
  • - script to play and re-record some audio.

Image Script

Collecting / Modifying Images for the Manual:

Stage 1 in C++ - This was harder than it looked!

  • Scripting the imagemaps gave much better (more consistent) hit-boxes than doing it by hand.
  • wxWidgets menus cannot be auto-captured with normal tools; there is no way to make them drop down programmatically. Instead the menus were completely redrawn using code.
  • I used ImageMagick for scripted cropping and image manipulation, but later found Pillow (PIL) for Python, which is more flexible and has better syntax.
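The consistency win comes from generating the imagemap hot-spots from data rather than tracing them by hand. A hypothetical miniature of that idea, emitting HTML `<area>` tags from a list of rectangles (names and layout are illustrative, not the tool's actual output):

```python
def area_tag(name, x, y, w, h):
    """Emit one HTML image-map hot-spot with exact, scripted coordinates."""
    coords = f'{x},{y},{x + w},{y + h}'
    return f'<area shape="rect" coords="{coords}" href="#{name}" alt="{name}">'

def image_map(map_name, boxes):
    """Build a <map> element; boxes is an iterable of (name, x, y, w, h)."""
    areas = '\n'.join(area_tag(*b) for b in boxes)
    return f'<map name="{map_name}">\n{areas}\n</map>'
```

Because every box comes from the same data, adjacent hot-spots line up exactly, which hand-drawn maps rarely achieve.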

Stage 2 in Python - Using the C++ screenshot class.

  • Developing the Python script to get images gave the C++ scripting commands an excellent workout.
    • This led to many small extensions, and additional optional parameters for those commands.
    • Optional parameters proved a win where we can specify channel numbers OR track numbers, and so resize the channels of a stereo track differently.
  • The current code can't capture the mouse-hover annotations, since these vanish as soon as focus is lost.
    • That affects snap-to lines, the scrub ruler and label movement.

The C++ version of imagemap generation is hard to maintain and not suitable for general manual-editor use. It also only does imagemaps, whereas the manual also contains many track images, as well as numerous effect dialogs, preferences and toolbars. Toolbars additionally need annotations, which the C++ tool did not do.

I took care of the effect dialogs and toolbars by extending the Audacity Screenshot tool. It can now take images of a whole family, such as all effect dialogs. The Screenshot tool can be used without scripting, but it is also available to scripting, so that scripting can do the lot.

I used the experience of developing the C++ imagemap generator in the JavaScript version. The JavaScript version (now part of WIT) maintains tables, such as the table of tooltips, in the wiki, so that editors can update the inputs easily. Being a web app, it has no installation or set-up issues for editors. It does image maps for menus, annotates toolbars and produces imagemaps for those too. It also does a few formatted-table generation tasks, for pages in the manual that need them.

The image scripting needs scripting support, which in turn needs marshalling, i.e. text-to-binary conversions. In Audacity we previously had three subsystems for marshalling data: preferences, effects and batch commands were each handled by a different method. Two consequences of this were:

  • Automation commands were not available from within Chains, so Chains were restricted to effects plus a very few commands added as specials by hand.
  • Effects provided no mechanism for self-documenting their parameters, whereas there was a basic mechanism to do this for commands.

Tidying up and unifying these three systems was a lot of development, but it led to shorter and more powerful code. I removed redundancies, and all systems got access to all functionality. This work was required to add the new commands that the image script uses, to fit them in with Chains, and to make them available from ext-menus too.
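At its core, a unified marshalling system means one round-trip between a parameter set and its textual form, shared by preferences, effects and batch commands alike. A hypothetical miniature of that round-trip (the real system is in C++; names here are illustrative):

```python
import shlex

def params_to_text(params):
    """Serialise a parameter dict to the Key="Value" textual form."""
    return ' '.join(f'{k}="{v}"' for k, v in params.items())

def text_to_params(text):
    """Parse Key="Value" pairs back into a dict (values stay strings)."""
    pairs = shlex.split(text)  # shlex honours the double quotes
    return dict(p.split('=', 1) for p in pairs)
```

With one such round-trip shared everywhere, a command defined once becomes usable from scripting, from Chains and from menus without separate glue code for each.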

The next stage of developing the image script was generating track images via scripting Audacity. Cursors and drop shadows will be added by Python/Pillow.

Annotated/Toolbar images done: 14 of 14
Preferences images done: 19 of 19
Raw icons done: 50 of 50
Menu images done: 54 of 54
Effect dialog images done: 62 of 62
Track images done: 249 of 190
The real issue is not the number of images, but rather the number of styles and special details.
190 is the number of track images that needed cropping in the manual, and so was the target number of track images.

Thumbnails of work in progress

Some Images Required

Stage 1: Survey of the images required

  • Some of the images can be created quickly/easily, e.g. by exporting from the Theme Cache.
  • Families of images are particularly suited for automation.

Stage 2: Many imagemaps produced by the C++ tool

  • This approach broke easily and was hard to maintain.
  • Annotation (using Inkscape and a template) was still essentially done by hand.

Track Images Created Automatically

Stage 3: Tasks spun off to WIT

  • Image map generation spun-off into WIT and tackled there.
  • Annotation generation spun-off into WIT and tackled there.

Stage 4: Track Images Autogenerated

  • General scripting made possible the capture of large numbers of different wavetracks.
  • And then label tracks too.


  • Jun 03 2017 - C++: Auto-produced images (proof-of-concept) batch uploaded to wiki.
  • Jul 26 2017 - C++: Full set of auto-generated menus, preferences, image maps and command table in production use for the manual.
  • Aug 19 2017 - One-button prefs/effects capture, via Audacity Screenshot tools.
  • Dec 23 2017 - Javascript: Proof-of-concept for spinning the image-map generation off to WIT.
  • Jan 17 2018 - Python: Scripting to set track properties in Audacity working. Start of Python image script.
  • Feb 12 2018 - Python: Creating images of tracks for tracks page and for labels examples page.


Downloading / Modifying / Uploading from Manual by Script:

This works.

  • The main tool, PyWikibot, works well enough.
    • It is a bit slow, probably due to server speed and throttling.
    • We can upload about four pages a minute using it, but that is fine as it can just be left running until done.
  • I've used a customised PyWikibot in a simple way for bulk uploads.
  • I didn't use PyWikibot for bulk image downloads.
    • Instead I used a custom wiki page to collect/view the required images and then get-web-page-complete from within a browser to download lots of images.
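At roughly four pages a minute, a bulk run is just a loop that paces itself and is left running until done. A hedged sketch of that loop (the `upload_page` stub and page data are hypothetical; a real run would wrap PyWikibot's page-save machinery):

```python
import time

def bulk_upload(pages, upload_page, pages_per_minute=4):
    """Upload (title, text) pairs at a polite rate.

    upload_page is a callable; in practice it would be a thin wrapper
    around PyWikibot's save operations.
    """
    delay = 60.0 / pages_per_minute
    for title, text in pages:
        upload_page(title, text)
        time.sleep(delay)

def estimated_minutes(n_pages, pages_per_minute=4):
    """Rough wall-clock estimate for a bulk run."""
    return n_pages / pages_per_minute
```

At the observed rate, a 240-page upload takes about an hour, which is fine unattended.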

Templates in the wiki now do a lot of the work that could otherwise be done through bots. The bot works, but is slow (and needs babysitting), which is down to the wiki rather than the bot. The main win is the upload of multiple files.

The PyWikibot project turned out to be a find-out-how project rather than a pure coding project. I found that the existing PyWikibot is just fine for bulk upload/download of wiki content. We avoided needing smarter wikibot code than PyWikibot by using ParserFunctions in the wiki itself. I added quite a few new templates. This is better than the original plan: edits made directly in the wiki templates can now affect/improve multiple pages, without having to fetch multiple pages, modify them and send them back as a bot solution would. WIT reduces the need for Python wikibot scripting too: it collects information from multiple pages in the wiki directly in the browser, and it can generate tables that we would otherwise have scripted with wikibot in Python.

  • The most relevant parts of the wikibot how-to are documented in the GitHub repo for the team tools.
  • The new wiki templates that reduce the need for external scripting are documented in the wiki and manual.


  • Aug 23 2017 - Wikimedia's wikibot in use. Produced and applied a wiki 'fixes' file for custom patch ups.


Make New Commands Available to Nyquist Too:

This now works.

  • A significant win was code in mod-script-pipe to generate output in JSON format, a LISP format, or a Brief format.
    • The work done there means less reformatting work on the Nyquist side of things.

To execute a script command, use:

 (aud-do <command-string>).

Initial investigation involved reading up on SWIG as a possible interface tool.

  • SWIG is well maintained and currently looks suitable.
  • I followed SWIG commits on GitHub.

However, the 'by hand' implementation of aud-do as a single command is working well. SWIG can be added later, if/when we want our API to be more native to each scripting engine.

Thumbnails of work in progress


Nyquist result, after calling out to the GetInfo command to get clip information


  • Feb 12 2018 - First version working with Nyquist.

What is That? Page

Dynamic Javascript 'Image Map' for Audacity:

One of the early takeaways from the experiments is to do relatively little image updating on hover movements, to avoid distracting the viewer with things happening in their peripheral vision.

  • I do most substantial image updates on a click.
  • A key difference from the wiki image map is that this map has multiple levels.
    • That in turn means a lot more content is connected, and also a lot more hit-targets to define.
  • Caching (varnish and CloudFlare) together with CORS was a major obstacle to development.
  • I developed the extension features for in-house use as Chrome-browser-only.

I am now using a single canvas for the two areas. This makes drawing the connecting lines easier. The scroller (an iframe) is over part of the canvas. I tried fading-to-black first, but fading-to-white works better.

I have been through several versions of the data structures for driving it. A heavily nested structure was not such a good idea, as it was hard to keep the bracket matching right.
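The flatter alternative keeps each hit-target as one record with a parent link, instead of nesting children inside children. A hypothetical illustration (field names and ids are invented, not the actual WIT structure):

```python
# One flat record per hit-target, linked by a 'parent' field, instead of
# deeply nested children-of-children where bracket matching goes wrong.
HOTSPOTS = [
    {"id": "menu.file",      "parent": None,        "box": (0, 0, 60, 20)},
    {"id": "menu.file.open", "parent": "menu.file", "box": (0, 20, 120, 18)},
]

def children(spot_id, spots=HOTSPOTS):
    """All direct children of a hit-target, found by a flat scan."""
    return [s for s in spots if s["parent"] == spot_id]
```

A flat table like this is also easier to hand-check for errors than matching closing brackets several levels deep.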

Some of the optional enhancements required CORS to integrate correctly with the wiki. The CORS correctness could only be tested on the actual target. Caching (in the browser, varnish and the CDN) and a crashed cron job that was doing the updates made developing/debugging this much more difficult than expected. Caching was eventually worked around by, as occasion demanded, renaming all the files (by renaming a directory), and by using a timestamp in the query that required CORS validation, so that the query result was never cached.
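The timestamp-in-the-query trick is simple: a millisecond timestamp appended to the URL makes every request unique, so no cache layer can serve a stale copy. Sketched here in Python for brevity (WIT itself does this in JavaScript in the browser; the URL is hypothetical):

```python
import time

def cache_busted(url):
    """Append a millisecond timestamp so the query string differs on
    every request, defeating browser, varnish and CDN caches."""
    sep = '&' if '?' in url else '?'
    return f'{url}{sep}t={int(time.time() * 1000)}'
```

Since the server ignores the extra parameter, the response is unchanged but never served from cache.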

I initially tried out a cross-browser library for doing downloads (images from the canvas). However, it didn't name the files correctly and was downloading transparent images: it did not work out of the box. So I instead got a solution working using Chrome-specific code. There is no urgent need to support multiple browsers for this role, since the download feature is an extension for our own use, not needed for widespread use. The generic download library could be investigated further at a later date, if we do need cross-browser downloads.

Two levels of selection, with annotation and arrows.

Thumbnails of work in progress

Early experiment

Stage 1: Simultaneous (green) highlighting of label and region of interest

  • This was the first time that highlighting could follow label and region together.
  • Both the main part and the labels below are drawable images that we can draw on using JavaScript.

Feature Complete

Stage 2: Automated multi-level annotation

  • Clicking about fetches appropriate content for menus, toolbars, buttons.
  • Menus configurable via wiki.
  • Buttons above the scroller for [Reset] [Manual] and [Special]

Annotation Feature

Stage 3: Styling; Special Options Panel

  • (Optional) Options Panel with annotations & other documentation features.
  • (Optional) Proof-of-concept of Architecture ImageMap.
  • Styled with Shinta's graphics and with drop shadows etc.


  • Dec 23 2017 - Hover and click using semi-auto-generated (rather than fully-hand-crafted) boxes.
  • Jan 04 2018 - First version announced on team list.
  • Jan 24 2018 - Nicely styled version with optional extras.

Pdf of Manual

Wiki Manual as a PDF:

This involved work with a large number of tools. Key takeaways:

  • Many of the pdf tools are low level and poorly maintained.
    • I ended up rejecting / abandoning quite a few that seemed to be more work than they should have been.
  • I use LaTeX as an intermediary.
  • I do document restructuring using HTML tools. BeautifulSoup is mature and well designed.
  • I do most of the heavy lifting (page layout, cross referencing and detailed format tweaking) in LaTeX. LaTeX is designed for producing documents.
  • I generate and output both .html and .tex files when processing the source HTML into LaTeX.
    • The .html isn't used in generating the pdf, but it makes debugging much easier.
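The restructuring step boils down to walking the source HTML and emitting LaTeX for each tag. A minimal sketch using the stdlib html.parser (the real pipeline is BeautifulSoup-based and handles far more structure; the tag mapping below is illustrative only):

```python
from html.parser import HTMLParser

# Illustrative tag mapping only; the real converter covers much more.
OPEN = {'b': r'\textbf{', 'i': r'\textit{', 'h2': '\n\\section{'}
CLOSE = {'b': '}', 'i': '}', 'h2': '}\n'}

class HtmlToLatex(HTMLParser):
    """Translate a small subset of HTML tags into LaTeX commands."""
    def __init__(self):
        super().__init__()
        self.out = []
    def handle_starttag(self, tag, attrs):
        self.out.append(OPEN.get(tag, ''))
    def handle_endtag(self, tag):
        self.out.append(CLOSE.get(tag, ''))
    def handle_data(self, data):
        # Escape characters that are special in LaTeX.
        self.out.append(data.replace('&', r'\&').replace('%', r'\%'))

def html_to_latex(html):
    p = HtmlToLatex()
    p.feed(html)
    return ''.join(p.out)
```

Emitting both the cleaned .html and the .tex at this stage is what makes debugging easy: the intermediate form can be inspected in a browser before LaTeX ever runs.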

The LaTeX editor/viewer really made this part of the automation project work well. It means we get a good (and fast) preview of the end result(s). Without a LaTeX intermediary the output format would be far less flexible, and working with the scripts would be harder.

2 Column layout



Imagemaps working

Automatic Table of Contents

Table of Contents as seen in pdf reader


Searching for 'Selection' in the pdf manual


  • Nov 16 2017 - DraftManual2.2.0_v02.pdf available. This is the whole manual, except 4 pages. It looks basically good, though there are minor formatting issues. The live table of contents is a win.
  • Nov 17 2017 - DraftManual2.2.0_v03.pdf available. Now formatted using Helvetica (sans) font rather than Times New Roman. Some Unicode issues addressed.
  • Nov 19 2017 - DraftManual2.2.0_v04.pdf available. Custom front page. Imagemaps working.