From Audacity Wiki
Revision as of 13:40, 4 January 2012 by Arent (talk | contribs)
Mezzo ("A cross-platform audio editing engine") is an audio library based on a substantial rewrite of Audacity's current core audio manipulation code. It handles file input and output, threading, buffering, editing and signal processing, but has no GUI. One of its substantial advantages is that it would allow real-time effects processing in Audacity. The intention is therefore that Mezzo should serve as a "back-end" for a future major revision of Audacity that we are tentatively calling "Audacity 2.0".


A snapshot of Mezzo is available here. At the moment, Mezzo uses SCons for its build system. All it requires is that you have Python installed. To build, type:

$ python scons/

Design Philosophy

Here are some of the primary goals in the design of Mezzo:

  • not tied to any GUI system. Mezzo will not contain any GUI code, and should thus be usable with any GUI system or even from the command-line.
  • as few dependencies as possible. We want Mezzo to be easy to use and accessible. You shouldn't have to hunt down a bunch of other libraries to get Mezzo working. We don't want to drag in very large libraries if we only need a small part of their functionality. Any required third-party libraries will be included.
  • independently useful components. Mezzo will include a lot of functionality, but you shouldn't have to buy into the whole package if all you need is a small part. You should be able to use only as much as you need.
  • clear memory semantics. Mezzo's objects will be usable on the stack wherever possible, to eliminate one source of resource leaks. Where longer-lived objects are necessary, we will do our best to ensure that it is clear who is responsible for freeing the objects.


What follows is a roadmap of the different parts of Mezzo. The system is described from the inside out. We begin with Buffer, a class that almost everything else depends on, then we move out to the things that depend on Buffer, and so on.

I recommend generating documentation with Doxygen (doxygen Mezzo.dox) to refer to in conjunction with the explanations given here.


Buffer is a central class in Mezzo: it represents a buffer of sound data. A Buffer has a specific length, and its data is stored as 16-bit, 24-bit, or 32-bit samples. The data inside a Buffer is reference counted (much like string classes often are) so that the operation:

Buffer b = otherBuffer

is cheap. This also means that passing a Buffer by value to a function is cheap, and using references for function parameters is unnecessary.

Because Buffer can have 16-bit, 24-bit, or 32-bit data, it does not have any methods that pertain to a specific sample type; it only has generic methods like GetLength() that do not depend on the sample type. If you want to access or modify the sample data, you must use the subclasses Int16Buffer, Int24Buffer, and FloatBuffer. You can convert a Buffer to any of these by using the methods Buffer::AsInt16(), Buffer::AsInt24(), and Buffer::AsFloat(). If the buffer is already in the format you request, then no conversion is done, and the new object just references the same data. Otherwise, the data is converted.
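The sharing and conversion behaviour described above can be sketched in a few lines of C++. This is not Mezzo's code: the class SketchBuffer, its factory functions, and the int16-to-float scaling factor are all assumptions made for illustration, and std::shared_ptr stands in for whatever reference-counting scheme Mezzo actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for Buffer: copies share one reference-counted
// block of samples instead of duplicating it.
class SketchBuffer {
public:
    static SketchBuffer FromInt16(std::vector<std::int16_t> s) {
        SketchBuffer b;
        b.int16_ = std::make_shared<const std::vector<std::int16_t>>(std::move(s));
        return b;
    }
    static SketchBuffer FromFloat(std::vector<float> s) {
        SketchBuffer b;
        b.float_ = std::make_shared<const std::vector<float>>(std::move(s));
        return b;
    }

    bool IsFloat() const { return float_ != nullptr; }

    // Generic method: does not depend on the sample type.
    std::size_t GetLength() const {
        return IsFloat() ? float_->size() : int16_->size();
    }

    // If the data is already float, return a copy that shares storage;
    // otherwise convert (scaling by 1/32768 is an assumption of this sketch).
    SketchBuffer AsFloat() const {
        if (IsFloat()) return *this;
        std::vector<float> out;
        out.reserve(int16_->size());
        for (std::int16_t v : *int16_) out.push_back(v / 32768.0f);
        return FromFloat(std::move(out));
    }

    // Precondition: the buffer holds float data.
    float FloatAt(std::size_t i) const {
        assert(IsFloat());
        return (*float_)[i];
    }

    // True when both buffers reference the same underlying storage.
    bool SharesDataWith(const SketchBuffer& o) const {
        return int16_ == o.int16_ && float_ == o.float_;
    }

private:
    SketchBuffer() = default;
    std::shared_ptr<const std::vector<std::int16_t>> int16_;
    std::shared_ptr<const std::vector<float>> float_;
};
```

Copying a SketchBuffer only copies two smart pointers, which is why pass-by-value stays cheap regardless of how many samples the buffer holds.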

Sequence, SeqBlock and SeqBlockContext

A Sequence is an abstract data structure representing an array of audio samples. The main operations you can perform on a Sequence are Cut, Copy, Paste, Append, and InsertSilence.

The simplest implementation of Sequence is MemorySequence. It simply stores the entire array in memory. It is not at all efficient (Cut, Copy, and Paste all copy huge amounts of memory if your Sequence is very big), and is only meant for testing.

The implementation of Sequence that is intended to be used is BlockedSequence. BlockedSequence works by dividing up the data into smaller chunks called SeqBlocks, according to the algorithm described in Dominic Mazzoni and Roger Dannenberg's paper "A Fast Data Structure for Disk-Based Audio Editing." It is efficient for all operations (Cut, Copy, Paste, InsertSilence, etc).

SeqBlock (the building block of BlockedSequence) is also an abstract class, with two implementations: SeqMemBlock stores the data in memory, and SeqDataFileBlock stores it in an AU file on disk. For large amounts of data, SeqDataFileBlock is going to be preferable.

To recap, this gives a few different options:

  • MemorySequence: very inefficient for cut, copy, paste
  • BlockedSequence with SeqMemBlocks: efficient, but uses lots of memory
  • BlockedSequence with SeqDataFileBlocks: efficient, and uses disk, which you presumably have a lot more of than memory
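To see why a blocked layout makes editing cheap, here is a deliberately simplified sketch. It is not the algorithm from the Mazzoni and Dannenberg paper (it does no block-size balancing) and it keeps blocks in memory only. The class SketchBlockedSequence and all of its members are hypothetical names. The point it illustrates is that Copy, Paste, and Cut can reuse whole blocks by reference and only duplicate samples at the edges of the edited range.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

using Samples = std::vector<float>;
using BlockPtr = std::shared_ptr<const Samples>;  // immutable, so safe to share

class SketchBlockedSequence {
public:
    // Split incoming samples into blocks of at most maxBlock samples.
    void Append(const Samples& s, std::size_t maxBlock = 4) {
        assert(maxBlock > 0);
        for (std::size_t i = 0; i < s.size(); i += maxBlock) {
            std::size_t end = std::min(i + maxBlock, s.size());
            blocks_.push_back(
                std::make_shared<const Samples>(s.begin() + i, s.begin() + end));
        }
        length_ += s.size();
    }

    std::size_t GetLength() const { return length_; }
    std::size_t BlockCount() const { return blocks_.size(); }

    // Copy [start, start + len): fully covered blocks are shared by pointer,
    // partially covered blocks are copied into new, smaller blocks.
    SketchBlockedSequence Copy(std::size_t start, std::size_t len) const {
        assert(start + len <= length_);
        SketchBlockedSequence out;
        std::size_t pos = 0;
        const std::size_t stop = start + len;
        for (const BlockPtr& b : blocks_) {
            const std::size_t bStart = pos;
            const std::size_t bEnd = pos + b->size();
            pos = bEnd;
            if (bEnd <= start || bStart >= stop) continue;  // outside the range
            const std::size_t from = std::max(bStart, start) - bStart;
            const std::size_t to = std::min(bEnd, stop) - bStart;
            if (from == 0 && to == b->size()) {
                out.blocks_.push_back(b);  // whole block: share it
            } else {
                out.blocks_.push_back(std::make_shared<const Samples>(
                    b->begin() + from, b->begin() + to));
            }
            out.length_ += to - from;
        }
        return out;
    }

    // Insert all of `other` at sample position `at`.
    void Paste(std::size_t at, const SketchBlockedSequence& other) {
        const std::vector<BlockPtr> mid = other.blocks_;  // pointer copies only
        const std::size_t added = other.length_;
        SketchBlockedSequence head = Copy(0, at);
        SketchBlockedSequence tail = Copy(at, length_ - at);
        blocks_ = head.blocks_;
        blocks_.insert(blocks_.end(), mid.begin(), mid.end());
        blocks_.insert(blocks_.end(), tail.blocks_.begin(), tail.blocks_.end());
        length_ += added;
    }

    // Remove [start, start + len).
    void Cut(std::size_t start, std::size_t len) {
        SketchBlockedSequence head = Copy(0, start);
        SketchBlockedSequence tail = Copy(start + len, length_ - start - len);
        blocks_ = head.blocks_;
        blocks_.insert(blocks_.end(), tail.blocks_.begin(), tail.blocks_.end());
        length_ -= len;
    }

    // Concatenate all blocks into one array (for inspection only).
    Samples Flatten() const {
        Samples out;
        for (const BlockPtr& b : blocks_) out.insert(out.end(), b->begin(), b->end());
        return out;
    }

    // Number of this sequence's blocks that also appear in `o`.
    std::size_t SharedBlocksWith(const SketchBlockedSequence& o) const {
        std::size_t n = 0;
        for (const BlockPtr& b : blocks_)
            if (std::find(o.blocks_.begin(), o.blocks_.end(), b) != o.blocks_.end()) ++n;
        return n;
    }

private:
    std::vector<BlockPtr> blocks_;
    std::size_t length_ = 0;
};
```

In this sketch an edit touches at most two partial blocks no matter how long the sequence is, whereas a MemorySequence-style flat array would have to move every sample after the edit point.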

When you create a BlockedSequence you pass it a SeqBlockContext, which is what it will use to create its SeqBlocks. A SeqBlockContext is a lot like a factory for SeqBlocks. However, it also reference-counts the blocks, and copies them if you try to move a block between contexts (each SeqBlock belongs to exactly one context).
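The factory-plus-ownership idea can be illustrated with a small sketch. The names SketchContext, SketchBlock, NewBlock, and Adopt are invented for this example and are not Mezzo's API. Here std::shared_ptr provides the reference count, and the context tracks its blocks through weak pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

class SketchContext;

// Hypothetical block: remembers the single context it belongs to.
struct SketchBlock {
    std::vector<float> samples;
    const SketchContext* owner;
};

class SketchContext {
public:
    // Factory: every block is created by, and registered with, one context.
    std::shared_ptr<SketchBlock> NewBlock(std::vector<float> s) {
        auto b = std::make_shared<SketchBlock>(SketchBlock{std::move(s), this});
        blocks_.push_back(b);
        return b;
    }

    // A block from this context is returned as-is; a block from another
    // context is copied so that it never belongs to two contexts at once.
    std::shared_ptr<SketchBlock> Adopt(const std::shared_ptr<SketchBlock>& b) {
        assert(b != nullptr);
        if (b->owner == this) return b;
        return NewBlock(b->samples);
    }

    // Number of blocks created here that are still referenced somewhere.
    std::size_t LiveBlocks() const {
        std::size_t n = 0;
        for (const auto& w : blocks_)
            if (!w.expired()) ++n;
        return n;
    }

private:
    std::vector<std::weak_ptr<SketchBlock>> blocks_;
};
```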

ManagedFile and ManagedFileContext

When you use BlockedSequence with SeqDataFileBlocks, each Sequence is going to have a lot of small files associated with it. It is important to keep track of these files; if you save a project, you have to save all the right files along with it. If you "Save As," you have to make sure that the old project has all the files it needs, and that the new one has all the files it needs too. However, you don't want these files to persist after they are no longer needed. ManagedFileContext is a class that keeps track of a bunch of ManagedFiles that are united in one purpose (in Audacity's case, each ManagedFileContext will correspond to a project). Its primary responsibilities are:

  • keep a list of files in the context
  • reference-count them, to know when they are no longer needed
  • delete them when they are no longer needed
  • make sure that certain files in the set are preserved when a client requests it (we say that such files are locked)
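The four responsibilities listed above can be sketched as follows. This is an in-memory simulation, not Mezzo's ManagedFileContext: the class SketchFileContext and its methods are hypothetical, and instead of deleting anything on disk it records the names it would have deleted.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

class SketchFileContext {
private:
    struct Entry {
        int refs = 0;
        bool locked = false;
    };
    std::map<std::string, Entry> entries_;  // the list of files in the context
    std::vector<std::string> deleted_;      // stands in for real file deletion

    void Remove(std::map<std::string, Entry>::iterator it) {
        // A real implementation would delete the file on disk here.
        deleted_.push_back(it->first);
        entries_.erase(it);
    }

public:
    // Register a file, or add one more reference to a known file.
    void AddRef(const std::string& name) {
        assert(!name.empty());
        ++entries_[name].refs;
    }

    // Drop one reference; delete the file once nobody needs it,
    // unless a client has locked it.
    void Release(const std::string& name) {
        auto it = entries_.find(name);
        if (it == entries_.end() || it->second.refs == 0) return;
        if (--it->second.refs == 0 && !it->second.locked) Remove(it);
    }

    // Preserve a file even if its reference count drops to zero.
    void Lock(const std::string& name) {
        auto it = entries_.find(name);
        if (it != entries_.end()) it->second.locked = true;
    }

    // Lift the lock; an unreferenced file is deleted at this point.
    void Unlock(const std::string& name) {
        auto it = entries_.find(name);
        if (it == entries_.end()) return;
        it->second.locked = false;
        if (it->second.refs == 0) Remove(it);
    }

    bool Contains(const std::string& name) const {
        return entries_.count(name) != 0;
    }
    const std::vector<std::string>& Deleted() const { return deleted_; }
};
```

In this sketch, locking is what would let a saved project keep its files while an edited copy of the same project releases its references to them.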

Each file managed in this way must have a corresponding class that derives from ManagedFile. Such a class should create the file to be managed in its constructor. Other than that, there are no requirements for what kind of file is being managed. Right now only sound files (AU) are stored this way (SeqDataFileBlock derives from both SeqBlock and ManagedFile), but other kinds of files should be able to use this mechanism as well.

Quick Recap

Buffer, BlockedSequence, SeqBlock, and ManagedFileContext form the core of Mezzo's on-disk audio storage. Higher levels of abstraction that represent things like Tracks, Regions, and Projects can use these core classes for data storage without caring how they work.


Audacity can store and load projects very quickly, because the Audacity project format is really just a serialization of Audacity's object model. Therefore "storing a project" is synonymous with "serializing the objects to disk." Serialization is handled by the Loader and Storer classes. Any object that can be serialized derives from Storable. See Storable.h for more information about the mechanics of how serialization works.
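As a rough picture of what "deriving from a storable base class" can look like, here is a generic sketch of the pattern. It does not reproduce the interface in Storable.h: SketchStorable, SketchTrack, the Store/Load signatures, and the line-based text format are all assumptions made for this example, with standard streams standing in for the Storer and Loader classes.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical base class: anything serializable implements both directions.
class SketchStorable {
public:
    virtual ~SketchStorable() = default;
    virtual void Store(std::ostream& out) const = 0;
    virtual void Load(std::istream& in) = 0;
};

// Hypothetical object from the project model.
struct SketchTrack : SketchStorable {
    std::string name;   // must not contain a newline in this toy format
    double gain = 1.0;

    void Store(std::ostream& out) const override {
        out << name << '\n' << gain << '\n';
    }
    void Load(std::istream& in) override {
        std::getline(in, name);
        in >> gain;
        in.ignore();  // skip the newline that follows the gain value
    }
};

// Serialize a track to an in-memory stream and read it back.
SketchTrack RoundTrip(const SketchTrack& t) {
    std::stringstream ss;
    t.Store(ss);
    SketchTrack out;
    out.Load(ss);
    assert(!ss.fail());
    return out;
}
```

Because each object writes and reads its own fields, saving a project in this style reduces to walking the object model and calling Store on each object.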