Proposal Unitary Project
Proposal pages help us get from feature requests into actual plans. This proposal page is about managing Audacity Projects.
Proposal pages are used on an ongoing basis by the Audacity development team and are open to edits from visitors to the wiki. They are a good way to get community feedback on a proposal.
- Note: Proposals for Google Summer of Code projects are significantly different in structure, are submitted via Google's web app and may or may not have a corresponding proposal page.
The aim is to provide a single "container" for each Audacity Project, rather than keeping the AUP file and its data separate.
There are a number of approaches that may be taken to achieve the purpose of this proposal:
- Save the AUP file inside the Audacity Project folder.
- When a new project is created the user would be prompted to name the project and set the location where it will be saved. At that point a "Project Folder" will be created containing an AUP file and a _data folder.
- There will also be a File menu option to "Close and Delete" the project in case the user decides that they do not want a saved Project. "Close and Delete" would automatically produce a warning.
- For users that do not want to save the Audacity Project, there will still be the facility of creating a "Temporary Project" which will be exactly the same as how projects currently work.
- Benefits of this approach:
- It helps new users to learn the important distinction between Projects and Audio files.
- Losing the data folder for a project becomes difficult. This is particularly relevant if the user saves all of their projects in the same directory. Rather than having dozens of AUP files and dozens of _data folders, they will have each project neatly packaged in its own sub-folder.
- Power users will often use this method of Project management already, but need to manually go through the steps of creating a folder and saving the new project into that folder.
- In the event of a system crash, all of the data files are already safely written into the Project Folder and not at risk of being deleted if the Temp folder is emptied. This is particularly important on Linux as the Temp folder is automatically emptied when the system is rebooted.
- Accidentally closing Audacity will not delete the data files.
- Dependent files could also be kept in the "Project Folder", allowing easy re-connection to the project if/when there is a feature to find missing dependencies.
- Save Audacity Projects in a "file archive" format (similar to how Nero handles disk images).
- The idea here is to save Audacity projects in some form of Interchange File Format that allows Audacity to read directly from the file without needing to extract the file before use.
- (I have no idea about the technical implications of this approach).
- To keep Audacity Projects as they currently are, but provide an additional feature to "unify" the project into a single archive file.
- The idea here is that on Save, Audacity copies all files that are used in the project into a single archive file (like Zipping the project).
- This is probably the simplest approach, but it carries a severe time overhead for saving and opening large projects.
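The "unify on Save" approach can be sketched in a few lines. This is only an illustration, not a proposed implementation: the function name `unify_project` is hypothetical, and it assumes the data folder sits beside the AUP file and is named `<project>_data`. The files are stored uncompressed, since the cost described above comes mainly from copying every block file on each save.

```python
import zipfile
from pathlib import Path


def unify_project(aup_path, archive_path):
    """Copy an AUP file and its _data folder into a single ZIP archive.

    Assumption: the data folder is a sibling of the AUP file and is
    named "<project>_data". Both names here are illustrative only.
    """
    aup_path = Path(aup_path)
    data_dir = aup_path.with_name(aup_path.stem + "_data")

    # ZIP_STORED = no compression; every file is still copied in full,
    # which is where the time overhead for large projects comes from.
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_STORED) as zf:
        zf.write(aup_path, arcname=aup_path.name)
        if data_dir.is_dir():
            for f in sorted(data_dir.rglob("*")):
                if f.is_file():
                    # Keep the folder layout so the project can be unpacked as-is.
                    rel = Path(data_dir.name) / f.relative_to(data_dir)
                    zf.write(f, arcname=rel.as_posix())
    return archive_path
```

Opening such a project would need the reverse step (extracting everything to a working folder), which doubles the overhead compared with the current format.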
James' Proposal (a new format, not AUP based)
Properties we want:
- One file rather than multiple.
- Inserts and deletes are localised and do not require rewriting the whole file. Ditto recording (even if adding recording into the middle).
- Stutterless playback - we are not jumping about in a large file like a mad jackrabbit, which would lead to the disk not being able to keep up.
- Guarantees about low wasted space. E.g. less than half the space is wasted.
The tree and block design by Roger and Dominic achieves all but the first of these. Its scheme has multiple files because it 'delegates' management of whole blocks to the OS running the disk. An important point is that when splitting / merging blocks we copy so that a block rarely ends up less than half full. Specifically, we never allow two half-full blocks in succession. If that would happen, we copy to merge them. This gives the stutterless guarantee. Empty files aren't needed at all, and are deleted, the OS doing 'garbage collection' of any disk space freed. This makes the 'memory management' simpler.
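The merge rule can be illustrated with a toy model. This is a sketch under stated assumptions, not Audacity's actual code: blocks are plain Python lists of samples, `MAX_BLOCK` is a made-up capacity, and "half-full" is read as "at most half of the capacity".

```python
MAX_BLOCK = 8  # hypothetical block capacity, in samples


def merge_small_neighbours(blocks, max_block=MAX_BLOCK):
    """Return a new block list with no two successive blocks both half-full or less.

    Empty blocks are dropped, mirroring the deletion of empty files.
    """
    half = max_block // 2
    merged = []
    for block in blocks:
        if not block:
            continue  # empty blocks are not needed at all
        if merged and len(merged[-1]) <= half and len(block) <= half:
            # Copy to merge: two small neighbours always fit in one block.
            merged[-1] = merged[-1] + list(block)
        else:
            merged.append(list(block))
    return merged
```

Because two blocks of at most half capacity always fit into one, the merged block never exceeds `max_block`, and the sample order is unchanged.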
James proposes instead:
- Implement our own efficient malloc/free within a single file. We set a minimum size for mallocs in the file - which corresponds to the block size in our current scheme. We also offer a free that can free part of a block. If both parts are above the minimum size this is done in situ without copying. This combination leads to a small efficiency gain relative to the existing system, in that:
- (a) our 'blocks' can end up bigger than the minimum block size
- (b) our block boundaries will more often end up aligned with split/join boundaries, reducing the amount of copying on a split/join.
- (c) we'll more often use exactly the size needed.
We still keep the guarantee about stutterless play, which is that we have at most two jumps per 'block-size' of data.
- The 'AUP' tree becomes an edit decision list and is added to the end of the file.
- Because we don't have the OS doing garbage collection of empty blocks, we may end up with holes in our file. In any case, when we save we also want to purge the undo data. The easy way to do both is to write a new file. Because the working file already has the stutterless guarantee, doing this will be very nearly as fast as copying a file, as we'll be working with large blocks of data.
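A minimal sketch of the in-file malloc/free bookkeeping follows. It tracks offsets only and does no real file I/O. The class and method names (`FileAllocator`, `free_tail`) are hypothetical, first-fit is just one possible placement policy, and "both parts" is read here as "the part kept and the part freed".

```python
class FileAllocator:
    """Toy allocator for regions inside one project file (offsets only)."""

    def __init__(self, min_size):
        self.min_size = min_size  # corresponds to the current block size
        self.end = 0              # current end of the file
        self.free_list = []       # sorted (offset, size) holes in the file

    def malloc(self, size):
        """Return (offset, size) of a region of at least min_size."""
        size = max(size, self.min_size)
        for i, (off, hole) in enumerate(self.free_list):
            rest = hole - size
            # Only reuse a hole if the leftover is empty or still usable.
            if rest == 0 or rest >= self.min_size:
                if rest:
                    self.free_list[i] = (off + size, rest)
                else:
                    del self.free_list[i]
                return off, size
        off = self.end            # no suitable hole: grow the file
        self.end += size
        return off, size

    def free(self, off, size):
        """Release a region and coalesce it with adjacent holes."""
        merged = []
        for o, s in sorted(self.free_list + [(off, size)]):
            if merged and merged[-1][0] + merged[-1][1] == o:
                merged[-1] = (merged[-1][0], merged[-1][1] + s)
            else:
                merged.append((o, s))
        self.free_list = merged

    def free_tail(self, off, size, keep):
        """Free everything after the first `keep` bytes of a region.

        Returns the kept region if done in situ, or None if either part
        would fall below min_size and the caller must copy instead.
        """
        if keep >= self.min_size and size - keep >= self.min_size:
            self.free(off + keep, size - keep)
            return off, keep
        return None
```

Holes that never get reused are what the "write a new file on save" step cleans up.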
Some more details:
- We can repurpose WAV and, more easily, Ogg Vorbis files for this, so that the Audacity-specific information is held in metadata. After garbage collection these will be playable as native files. Without garbage collection they will still have 'the right audio' in them, just jumbled and partially repeated. Typically the early part of the file, and any uninterrupted recording session, will be contiguous in the file.
- As a 'unitary project' I would also like to put preferences data and project data on the same footing. I would like users to be able to choose whether a setting, such as the quality settings, is per project or global. A project can therefore override settings that are global. This requires changes in the user interface so that the user can choose what level a setting lives at, and is not an essential part of the proposal.
- Most code in Audacity should be written in a way that does not care or know about the block boundaries. It should just be processing streams of data. The system needs an abstract interface that hides the blocks.
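One way to picture the abstract interface that hides the blocks is the sketch below. It is an assumption-laden toy, not an existing Audacity class: `SampleStream` and its `read` method are hypothetical names, and blocks are held in memory as Python lists rather than read from disk.

```python
from bisect import bisect_right
from itertools import accumulate


class SampleStream:
    """Presents a sequence of blocks as one continuous run of samples."""

    def __init__(self, blocks):
        self._blocks = [list(b) for b in blocks if b]
        # _starts[i] is the stream position of the first sample in block i;
        # the final entry is the total length.
        self._starts = [0] + list(accumulate(len(b) for b in self._blocks))

    def __len__(self):
        return self._starts[-1]

    def read(self, start, count):
        """Return up to `count` samples from position `start`."""
        start = max(start, 0)
        end = min(start + count, len(self))
        out = []
        if start >= end:
            return out
        i = bisect_right(self._starts, start) - 1  # block containing `start`
        while start < end:
            block = self._blocks[i]
            lo = start - self._starts[i]
            hi = min(len(block), end - self._starts[i])
            out.extend(block[lo:hi])
            start = self._starts[i] + hi
            i += 1
        return out
```

A caller such as an effect would only ever see `len(stream)` and `stream.read(pos, n)`. Where the block boundaries fall, or whether the blocks live in one file or many, stays behind the interface.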
This is too big a change for main Audacity itself. My thinking is to bring the Unitary Project format in with the new trackpanel plug-in.
- I've not spelled out the details of the edit decision list. It will reference data via pointers, and where the pointers or the data get too fragmented, consolidate that into one piece, and free the original pieces, adding on to the end of the file if there is no other free space to use in the file.
- Each separate stream of data has the stutterless property. However, this can still (and this is a problem with the original block scheme) lead to too many seeks at the same time, if many streams are being used simultaneously, EVEN if the streams are like automation streams or summary data streams and low data rate. A central system to decide which packet to retrieve/create next will be needed. It can queue up the next low data rate block for load when the IO demands are not too high.
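The central system that decides which packet to fetch next could look something like this sketch. Earliest-deadline-first ordering is an assumption chosen for illustration and is not specified by the proposal. The class name `FetchScheduler` and the idea of expressing a deadline as "seconds until this stream's buffer runs dry" are likewise hypothetical.

```python
import heapq


class FetchScheduler:
    """Orders block fetches across all streams by urgency."""

    def __init__(self):
        self._heap = []
        self._count = 0  # tie-breaker: equal deadlines are served in request order

    def request(self, stream_name, block_id, deadline):
        """Queue a block; `deadline` is seconds until the stream needs it."""
        heapq.heappush(self._heap, (deadline, self._count, stream_name, block_id))
        self._count += 1

    def next_fetch(self):
        """Return the most urgent (stream_name, block_id), or None if idle."""
        if not self._heap:
            return None
        _deadline, _order, stream_name, block_id = heapq.heappop(self._heap)
        return stream_name, block_id
```

With this ordering, low data rate streams such as automation or summary data naturally have distant deadlines, so their blocks are only loaded when no audio stream is about to run short.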
- New users will frequently move or delete the _data folder without realizing that it is part of the Project. A unified Audacity Project will avoid this pitfall.
- As long as all dependencies are resolved, projects can be moved and transported safely.
Moving an Audacity Project to another machine:
Copy the unified Project to the new machine. That's all - no more searching around to find the _data folder that goes with the .AUP file.
Clean-up old unused files:
No more mistakes of accidentally deleting the _data folder (there may still be a danger of deleting dependencies).
Saving Projects for later use:
Long-term storage of Audacity Projects is currently unsafe, as the multiple-file format makes it far too easy for project files to become detached from their data.
If a version control system is employed to store projects, the addition and deletion of AU files in the project's _data folder needs special handling.