Proposal Audacity 4 Blind

From Audacity Wiki
Revision as of 10:14, 31 July 2017 by James (talk | contribs) (Minor updates.)
Proposal pages help us get from feature requests into actual plans. This proposal page is about accessibility for Audacity for blind users.
Proposal pages are used on an ongoing basis by the Audacity development team and are open to edits from visitors to the wiki. They are a good way to get community feedback on a proposal.

  • Note: Proposals for Google Summer of Code projects are significantly different in structure, are submitted via Google's web app and may or may not have a corresponding proposal page.

Proposed Feature

Instead of using wxAccessible to make controls within Audacity accessible for blind users, this proposal splits Audacity into an audio engine and a GUI. The GUI handles the visual-only aspects: bitmaps, mouse clicks and drags. The underlying audio engine handles the audio and selections. The interface between the audio engine and the GUI is defined in files that SWIG can use, allowing us to script Audacity, so Audacity 4 Blind is closely related to Scripting. There is also the concept of a 'clutch': when it is engaged, the visual GUI tracks requests and changes made to the underlying audio engine. The GUI is a plug-in, so it does not have to be present. In principle we could plug in different GUIs.
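The engine/GUI split and the clutch could be sketched roughly as follows. This is only an illustration in Python, not Audacity code: the names AudioEngine, PluggableGui, subscribe and engage_clutch are hypothetical, and re-syncing the display when the clutch is re-engaged is an assumption rather than something the proposal specifies.

```python
from typing import Callable, List, Tuple


class AudioEngine:
    """Hypothetical engine: owns the selection, knows nothing about drawing."""

    def __init__(self) -> None:
        self.selection: Tuple[float, float] = (0.0, 0.0)
        self._listeners: List[Callable[[str, object], None]] = []

    def subscribe(self, listener: Callable[[str, object], None]) -> None:
        # A GUI plug-in (or a script, or a test) registers here. Optional.
        self._listeners.append(listener)

    def set_selection(self, start: float, end: float) -> None:
        self.selection = (start, end)
        for listener in self._listeners:
            listener("selection", self.selection)


class PluggableGui:
    """Hypothetical GUI plug-in that tracks the engine only while the clutch is engaged."""

    def __init__(self, engine: AudioEngine) -> None:
        self.engine = engine
        self.clutch_engaged = True
        self.shown_selection = engine.selection
        engine.subscribe(self._on_engine_change)

    def _on_engine_change(self, what: str, value: object) -> None:
        if self.clutch_engaged and what == "selection":
            self.shown_selection = value

    def engage_clutch(self) -> None:
        # Assumption: on re-engaging, the display catches up with missed changes.
        self.clutch_engaged = True
        self.shown_selection = self.engine.selection


engine = AudioEngine()
gui = PluggableGui(engine)
engine.set_selection(1.0, 2.5)   # clutch engaged: the GUI follows
gui.clutch_engaged = False
engine.set_selection(4.0, 6.0)   # clutch disengaged: the GUI keeps its old view
```

Because the engine never refers to the GUI class, it can be driven with no GUI attached at all, which is the property that scripting and automated testing rely on.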

This refactoring and splitting of the program into audio and user-interface concerns is beneficial for automation and for testing. As there is considerable work involved in splitting audio from GUI code, it would be beneficial to define some other 'concerns' that divide the code at the same time, and to separate the code at least partially along those concerns too.

  • Main split - Audio from GUI, so that we can operate the audio using a command line interface, like Csound.
  • Editor split - Audio from history/undo, keybinding (cut/copy/paste), preferences, file I/O. There is generic editor code in Audacity which is relevant to other kinds of editing.
  • Update and caching - We have many special case mechanisms for ensuring data is ready 'at the right time'. Potentially we can eliminate a lot of 'glue code' by providing a library with general mechanisms for updating dependent data.
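As a rough idea of what a general mechanism for updating dependent data might look like, here is a minimal sketch of lazily recomputed cached values. The Source and Derived classes are hypothetical and do not correspond to anything in the Audacity code base; the peak and label values are only examples of dependent data.

```python
from typing import Callable, List


class Source:
    """A value that other data depends on (for example, raw samples)."""

    def __init__(self, value) -> None:
        self._value = value
        self._dependents: List["Derived"] = []

    def get(self):
        return self._value

    def set(self, value) -> None:
        self._value = value
        for dependent in self._dependents:
            dependent.invalidate()


class Derived:
    """A cached value that is recomputed lazily after any input changes."""

    def __init__(self, compute: Callable[..., object], *inputs) -> None:
        self._compute = compute
        self._inputs = inputs
        self._dependents: List["Derived"] = []
        self._valid = False
        self._cache = None
        self.recomputations = 0   # kept only to make the caching visible
        for source in inputs:
            source._dependents.append(self)

    def invalidate(self) -> None:
        # Mark stale and pass the news downstream; no work is done yet.
        if self._valid:
            self._valid = False
            for dependent in self._dependents:
                dependent.invalidate()

    def get(self):
        if not self._valid:
            self._cache = self._compute(*(i.get() for i in self._inputs))
            self._valid = True
            self.recomputations += 1
        return self._cache


samples = Source([0.1, -0.5, 0.3])
peak = Derived(lambda s: max(abs(x) for x in s), samples)
label = Derived(lambda p: "peak %.1f" % p, peak)
label.get()            # computes peak, then label
samples.set([0.9])     # marks both stale; nothing is recomputed until asked for
```

The point of the sketch is that the 'at the right time' logic lives in one place instead of being repeated as glue code around each piece of dependent data.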

The Audio/GUI split gets us about halfway to where we want to be with Audacity 4 Blind. It gives us a script-based interface that exposes all the features. This could be used with a screen reader like JAWS. To go beyond that we would add further refinements. Proposed refinements include:

  • Value tweakers - the ability to temporarily bind a variable to the keyboard so that (for example) the left and right arrows increase and decrease the value, and the step size can be varied while listening to the audio (this depends on real-time improvements such as real-time looping). We could also allow binding to MIDI controllers and to the mouse wheel. One idea is to provide a long slider in the Audacity GUI so that this becomes a valuable feature for sighted users too: they temporarily bind a value to the long slider while they are working with it.

  • Efficient audio help information. Much of this is about reflection of commands and options that are available. For example we can list the functions that can be applied, and the user should be able to navigate through this list quickly without having to listen to it all.
  • More sophisticated keyboard shortcut methods.
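A value tweaker could be sketched along these lines. The ValueTweaker class and the gain_db parameter are hypothetical, and the choice of the up and down arrows to change the step size is an assumption; the proposal only says that the step size can be varied.

```python
from typing import Callable


class ValueTweaker:
    """Temporarily routes arrow keys to one numeric parameter."""

    def __init__(self, read: Callable[[], float], write: Callable[[float], None],
                 step: float, low: float, high: float) -> None:
        self.read, self.write = read, write
        self.step, self.low, self.high = step, low, high

    def on_key(self, key: str) -> bool:
        """Return True if the tweaker consumed the key, False otherwise."""
        if key == "left":
            self._nudge(-self.step)
        elif key == "right":
            self._nudge(self.step)
        elif key == "up":      # assumed binding: coarser steps
            self.step *= 2
        elif key == "down":    # assumed binding: finer steps
            self.step /= 2
        else:
            return False       # not ours; normal shortcuts still apply
        return True

    def _nudge(self, delta: float) -> None:
        # Clamp so that holding a key down cannot run past the limits.
        self.write(min(self.high, max(self.low, self.read() + delta)))


params = {"gain_db": 0.0}
tweaker = ValueTweaker(lambda: params["gain_db"],
                       lambda v: params.__setitem__("gain_db", v),
                       step=1.0, low=-12.0, high=12.0)
tweaker.on_key("right")   # gain_db goes from 0.0 to 1.0
tweaker.on_key("down")    # step is halved to 0.5
```

The same read/write pair could equally be driven by a MIDI controller, the mouse wheel or a long slider, since the tweaker only needs a way to get and set the bound value.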

Developer Backing

  • James
  • Leland

Use Cases


So how exactly will the 'efficient audio help information' operate? Current thinking is in terms of trees, like existing menus, but rather more deeply nested and with each level shorter than current menus are. One problem blind users face is not knowing how long a list is before listening to it! Something complex like effects might be offered as:

84 Effects
7 Categories
9 Technologies
12 Alphabetic groups

You then choose whether to explore by categories (time-preserving, reverbs, repair...), by 'technologies' (Built-in, VST, LADSPA, Nyquist...) or alphabetically (A-B, C-D, E-F...). For alphabetic you would usually type a letter, e.g. R, and start hearing all effects that begin with R. We allow items to appear more than once in the menu, since the different organisations give better options for navigating.
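To make the idea concrete, the sketch below builds the spoken summary and the jump-by-letter list from a flat catalogue. The catalogue is illustrative only: the category and technology assigned to each effect here are placeholders, not Audacity's real classification, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Illustrative catalogue only: (effect name, category, technology).
EFFECTS: List[Tuple[str, str, str]] = [
    ("Amplify", "volume", "Built-in"),
    ("Echo", "delay and reverb", "Built-in"),
    ("Repair", "repair", "Built-in"),
    ("Reverb", "delay and reverb", "Built-in"),
    ("Reverse", "time-preserving", "Built-in"),
    ("Tremolo", "modulation", "Nyquist"),
]


def group_by(effects: List[Tuple[str, str, str]], key_index: int) -> Dict[str, List[str]]:
    """Group effect names by category (index 1) or technology (index 2)."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for effect in effects:
        groups[effect[key_index]].append(effect[0])
    return dict(groups)


def summary(effects: List[Tuple[str, str, str]]) -> List[str]:
    """Lines spoken first, so the listener knows each list's length up front."""
    return [
        f"{len(effects)} Effects",
        f"{len(group_by(effects, 1))} Categories",
        f"{len(group_by(effects, 2))} Technologies",
    ]


def starting_with(effects: List[Tuple[str, str, str]], letter: str) -> List[str]:
    """Effects heard after typing a letter, in alphabetical order."""
    return sorted(name for name, _, _ in effects
                  if name.upper().startswith(letter.upper()))


print(summary(EFFECTS))
print(starting_with(EFFECTS, "r"))
```

Each effect appears under both its category and its technology as well as in the alphabetic view, which is the 'items appear more than once' behaviour described above.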

We'll want some standard exploration methods besides up, down, previous and next. Possibly 'help' to go and read the manual. Possibly 'curt' to ask the system to say things as briefly as possible, for when revisiting lists that have been navigated before.
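One possible shape for these exploration commands is sketched below. It assumes that 'up' and 'down' move between levels of the tree while 'previous' and 'next' move between siblings, and that 'curt' simply drops the item count from what is spoken; the proposal does not pin any of this down, and the Node and Navigator names are hypothetical.

```python
class Node:
    """One menu entry; children are the next level down."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self


class Navigator:
    """Tracks the current position and returns the text to be spoken."""

    def __init__(self, root):
        self.current = root
        self.curt = False

    def announce(self) -> str:
        count = len(self.current.children)
        if self.curt or count == 0:
            return self.current.name
        return f"{self.current.name}, {count} items"

    def move(self, command: str) -> str:
        node = self.current
        siblings = node.parent.children if node.parent else [node]
        i = siblings.index(node)
        if command == "down" and node.children:
            self.current = node.children[0]
        elif command == "up" and node.parent:
            self.current = node.parent
        elif command == "next" and i + 1 < len(siblings):
            self.current = siblings[i + 1]
        elif command == "previous" and i > 0:
            self.current = siblings[i - 1]
        elif command == "curt":
            self.curt = not self.curt
        # Commands that cannot move just repeat the current position.
        return self.announce()


root = Node("Effects", [
    Node("Categories", [Node("reverbs"), Node("repair")]),
    Node("Technologies"),
])
nav = Navigator(root)
print(nav.announce())
print(nav.move("down"))
```

The returned strings would be handed to the screen reader, so the navigator itself stays free of any speech or GUI code.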

Typical blind users will set hotkeys to the locations in these menus that are useful to them. At worst this menu system will be no worse than a conventional menu system customised for blind users. With the extra smarts it could be a lot better. We'll make sure every feature of Audacity is exposed by some 'menu item'.
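Hotkeys into menu locations, and the check that every feature is reachable from some menu item, could be sketched like this. The menu contents, the command identifiers such as effect.reverb and the hotkey string are all hypothetical placeholders.

```python
from typing import Iterator, Tuple

# Hypothetical menu tree: submenus are dicts, leaves are command identifiers.
MENU = {
    "Effects": {
        "Categories": {"reverbs": {"Reverb": "effect.reverb"}},
        "Alphabetic": {"R": {"Reverb": "effect.reverb",
                             "Reverse": "effect.reverse"}},
    },
}


def resolve(menu: dict, path: Tuple[str, ...]):
    """Follow a path of item names; returns a submenu or a command identifier."""
    node = menu
    for name in path:
        node = node[name]
    return node


def leaf_commands(menu: dict) -> Iterator[str]:
    """Yield every command reachable from the menu (duplicates included)."""
    for value in menu.values():
        if isinstance(value, dict):
            yield from leaf_commands(value)
        else:
            yield value


# A user-defined hotkey is just a stored path into the tree.
hotkeys = {"ctrl+alt+r": ("Effects", "Alphabetic", "R")}
destination = resolve(MENU, hotkeys["ctrl+alt+r"])

# Coverage check: commands the program offers that no menu item exposes.
all_commands = {"effect.reverb", "effect.reverse", "effect.echo"}
unexposed = all_commands - set(leaf_commands(MENU))
print(sorted(unexposed))
```

A check like the last two lines could run as an automated test, so that a newly added feature without a menu entry is caught early.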

In all this I am assuming that we are still using a third-party screen reader to give accessibility to the text.