GSoC 2008 - FFmpeg integration
This page describes the FFmpeg integration GSoC project and will reflect its progress over time.
FFmpeg is a set of libraries for audio/video encoding/decoding/muxing/demuxing/processing/capturing. In Audacity FFmpeg will be used to import and export audio data. FFmpeg also has other uses, but those are outside the scope of this project.
As a design decision, the FFmpeg libraries are linked dynamically and loaded at run time. This makes FFmpeg optional (Audacity will run without it) and allows it to be distributed separately, which removes licensing issues.
It's not yet clear which build of FFmpeg will be used: static or shared. A static build results in one big library that contains all functions from the FFmpeg package. A shared build results in several libraries, each exporting different functions (and these libraries may depend on other, non-FFmpeg libraries). A proper system-wide FFmpeg package is usually shared and resides somewhere in PATH, so any application is able to load functions from any of its libraries. Being in PATH is also necessary because the libraries depend on each other. A static build is usually stored somewhere within the application's directory as a kind of plug-in. My guess is that Linux users would prefer shared FFmpeg, since they have a superior package management system, while on Windows it's easier to drop a static FFmpeg build into Audacity and forget about packages.
From Audacity's point of view there's only one difference. With shared FFmpeg, Audacity has to load each function from its respective library, and each library has to be mapped into the Audacity process memory by a separate wxDynamicLibrary object. With static FFmpeg, Audacity loads only one library and imports everything from it. Because all shared libraries should be in PATH, there is little point in asking the user for the libraries' location when a shared build is used, while with a static build it may be necessary.
At the moment the FFmpegLibs class supports loading both shared and static libraries. However, the library locating dialog has been removed, and Audacity only looks for libraries in PATH.
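The difference between the two kinds of build can be sketched in a few lines of C++. This is only an illustration, not the FFmpegLibs code: `BuildKind`, `LibraryForSymbol` and the `ffmpeg_static` library name are made up for the example, and the library names leave out platform-specific prefixes, suffixes and version numbers. The three functions listed do live in the libraries shown.

```cpp
#include <cassert>
#include <map>
#include <string>

// Which flavour of FFmpeg build is being loaded.
enum class BuildKind { Shared, Static };

// Returns the name of the library expected to export `symbol`,
// or an empty string if the symbol is not in the table.
std::string LibraryForSymbol(BuildKind kind, const std::string &symbol)
{
    assert(!symbol.empty());

    // A static build exports everything from one library
    // (the name is hypothetical).
    if (kind == BuildKind::Static)
        return "ffmpeg_static";

    // A shared build spreads its functions across several libraries,
    // so each symbol has to be looked up in the right one.
    static const std::map<std::string, std::string> owner = {
        {"av_open_input_file", "avformat"},
        {"avcodec_find_decoder", "avcodec"},
        {"av_malloc", "avutil"},
    };
    const auto it = owner.find(symbol);
    return it != owner.end() ? it->second : std::string();
}
```

In the shared case each distinct library name returned here would correspond to its own wxDynamicLibrary object; in the static case a single object is enough.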
FFmpeg is able to import and export FLAC, Ogg Vorbis, MP3, MP2 and various uncompressed wave files. However, Audacity already has everything it needs to work with these formats (except MP3, where exporting requires a separate plug-in). The built-in Audacity import/export modules are preferable because they support various sample formats, while FFmpeg supports only 16-bit integer samples. FFmpeg also cannot (at the time of writing) import metadata (tags) from raw FLAC files.
In some cases, however, the user may wish to use FFmpeg to import files. This could be achieved in two ways:
- Add a new preference: plug-in loading order. With this preference the user could adjust the order in which plug-ins are registered in Audacity. If the FFmpeg import plug-in were registered before all others, it would handle the import of any file without letting other plug-ins even try. At the moment FFmpeg is always registered last.
- Make the Importer aware of the file filter the user chooses in the FileDialog, and assume that choosing the "FFmpeg-compatible file" filter means the user wants to import via FFmpeg, while choosing "MP3 files" means Audacity should use its built-in MP3 importer. Implementing this requires a few changes, some of which may involve cumbersome passing of an additional argument through several functions, from ShowOpenDialog down to the Importer.
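Both approaches boil down to how the Importer picks a plug-in for a given file. The sketch below shows one way the two could coexist. It is not Audacity's Importer: `ImportPluginInfo`, `ChooseImporter` and the plug-in names are hypothetical, and extensions are assumed to be lower case without the leading dot.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an import plug-in: a name plus the
// file extensions it claims to handle.
struct ImportPluginInfo {
    std::string name;
    std::vector<std::string> extensions;

    bool Handles(const std::string &ext) const {
        for (const auto &e : extensions)
            if (e == ext)
                return true;
        return false;
    }
};

// Picks the plug-in for a file with extension `ext`.
// `plugins` is in registration order (first approach: the user controls it).
// `preferred` names the plug-in implied by the chosen file filter
// (second approach); pass an empty string when there is no such hint.
std::string ChooseImporter(const std::vector<ImportPluginInfo> &plugins,
                           const std::string &ext,
                           const std::string &preferred)
{
    assert(ext.empty() || ext[0] != '.');

    // The filter hint wins if that plug-in can actually handle the file.
    if (!preferred.empty())
        for (const auto &p : plugins)
            if (p.name == preferred && p.Handles(ext))
                return p.name;

    // Otherwise the first registered plug-in that matches wins.
    for (const auto &p : plugins)
        if (p.Handles(ext))
            return p.name;

    return std::string();  // nobody claimed the file
}
```

With FFmpeg registered last, an MP3 file goes to the built-in importer unless the filter hint names FFmpeg. Moving FFmpeg to the front of `plugins` would have the same effect for every file it claims.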
Overlap in export functionality is discussed later.
Currently Audacity offers only a limited choice of export file types:
- WAV, AIFF and other uncompressed files
- OGG Vorbis
- Command-line exporter
The first one hides a variety of export types (header formats) and codecs (sample encoding formats) in its "Options..." dialog. This is possible because there is little or no difference between these formats. For FFmpeg such a solution is not acceptable: FFmpeg can export audio in completely different formats, and grouping them all under one choice is unintuitive. Grouping the options for all these formats in one dialog is not the easiest thing in the Universe (to implement or to use) either.
To overcome this, the ExportPlugin class was slightly redesigned so that one plug-in can present itself as more than one export type while using the same code to perform the actual export. This new feature will be used to present FFmpeg as a few common export types, while still providing access to the complete functionality in the options dialog. It is not yet decided whether there should be one options dialog for all FFmpeg-exported types or a separate dialog for each type or group of types.
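The idea of one plug-in showing up as several export types can be sketched as follows. This is not the real ExportPlugin interface: the class, its method names and the formats used in the usage note are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One entry in the export dialog's list of file types.
struct ExportFormat {
    std::string description;  // shown to the user
    std::string extension;    // default file extension
};

// Hypothetical plug-in that registers several export types
// but funnels all of them into a single Export() routine.
class MultiFormatExporter {
public:
    void AddFormat(const std::string &description, const std::string &extension) {
        mFormats.push_back({description, extension});
    }

    std::size_t GetFormatCount() const { return mFormats.size(); }

    const ExportFormat &GetFormat(std::size_t index) const {
        assert(index < mFormats.size());
        return mFormats[index];
    }

    // Shared export path: only the selected format differs between calls.
    // Here it just builds the output file name; a real exporter would
    // encode the audio at this point.
    std::string Export(const std::string &baseName, std::size_t index) const {
        return baseName + "." + GetFormat(index).extension;
    }

private:
    std::vector<ExportFormat> mFormats;
};
```

The export dialog would list `GetFormatCount()` entries for the plug-in and pass the index of the chosen one back to `Export()`, for example index 1 after adding an "AC3" entry and an "M4A" entry.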
The user may explicitly choose either the built-in FLAC exporter or the FFmpeg lossless exporter (which features FLAC amongst other codecs/formats).
Questions and Answers
- James: The Windows build of FFmpeg is (and can only be) built using MinGW/GCC. Audacity is built using MSVC. Is there an issue with loading a MinGW/GCC DLL from an MSVC program? Do we know of other programs that already do that?
- LRN: As far as I know there are no issues, though I can't name a program that does such loading. To me it has always been self-evident that ld produces normal C DLLs capable of exporting functions in the conventional way. wxDynamicLibrary (the wx equivalent of LoadLibrary()/GetProcAddress()) should load anything. Issues may arise with C++ function decorations, but they are not used in FFmpeg at all.
- James: My understanding is that, as with Mpeg, because of possible patent issues we won't be distributing the FFmpeg dlls. Which sites would be appropriate to direct people to for downloading (a) Windows (b) Linux dynamic link libraries? Does the legal status depend only on the physical location of the server, and where are these servers located?
- LRN: FFmpeg with AMR support is illegal anywhere, so we can't officially distribute such builds. For builds without AMR it depends on the country (such FFmpeg builds are legal in countries without software patents). Linux binaries can be obtained from repositories (Medibuntu comes to mind). Windows builds that include DLLs are rare (usually they contain only ffmpeg.exe); so far the only site I know of that hosts a DLL version of FFmpeg is Ramiro's site. Yes, I think the legal status depends on location only (it may be illegal for people in the US to download FFmpeg from anywhere, but hosting FFmpeg builds somewhere in Europe should be legal). If you want guarantees, contact your lawyer. No, I don't know anything about the location of any of the servers I mentioned. My own server is located in Moscow, Russia.
GetOpenFileName filter length bug
On Windows, a single file filter in OpenDialog can't exceed 260 bytes in length.
Eventually Leland found a workaround: hacking directly into the ListView control of the dialog and filtering its contents with a custom filtering routine.
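What a custom filtering routine buys can be shown with a small sketch. This is not Leland's code, which works on the dialog's ListView control; it only illustrates the matching step, and the names `ToLower` and `PassesFilter` are made up for the example. Extensions are assumed to be given without the leading dot.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Lower-cases ASCII letters so that "SONG.MP3" matches "mp3".
static std::string ToLower(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// Returns true if `fileName` ends in one of `extensions`.
// The list is an ordinary container, so its total size is not
// bound by the length limit of a single filter string.
bool PassesFilter(const std::string &fileName,
                  const std::vector<std::string> &extensions)
{
    const std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos)
        return false;  // no extension at all

    const std::string ext = ToLower(fileName.substr(dot + 1));
    for (const auto &e : extensions) {
        assert(e.empty() || e[0] != '.');
        if (ToLower(e) == ext)
            return true;
    }
    return false;
}
```

A routine like this would be called once per entry shown in the dialog, hiding the entries for which it returns false.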
FFmpeg presents channels in the order in which they are stored on the media. Audacity has to check the type being imported and reorder the channels accordingly. The same goes for export. This is made worse by the fact that Audacity lacks proper multi-channel support.
I decided to put this issue on hold. Without full support for multi-channel audio there's no sense in reordering channels on import: the order doesn't matter to Audacity, since all channels are mixed down to stereo or mono at playback.
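Should reordering become necessary, the operation itself is simple. The sketch below assumes interleaved 16-bit samples and a caller-supplied mapping; `ReorderChannels` is a hypothetical helper, and which mapping applies to which file type is exactly the part that still has to be worked out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Reorders interleaved samples. `order[i]` is the index of the source
// channel that should end up in output channel i.
std::vector<std::int16_t> ReorderChannels(const std::vector<std::int16_t> &samples,
                                          const std::vector<std::size_t> &order)
{
    const std::size_t channels = order.size();
    assert(channels > 0 && samples.size() % channels == 0);

    std::vector<std::int16_t> out(samples.size());
    // Walk the buffer one frame (one sample per channel) at a time.
    for (std::size_t frame = 0; frame < samples.size(); frame += channels) {
        for (std::size_t ch = 0; ch < channels; ++ch) {
            assert(order[ch] < channels);
            out[frame + ch] = samples[frame + order[ch]];
        }
    }
    return out;
}
```

An identity mapping such as {0, 1} leaves the data unchanged, so the same routine could be used for formats that need no reordering.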
For some files or other sources, audio parameters may change over time. For example, the number of channels or the sample rate may vary. At the moment the FFmpeg importer can't handle that.
While I believe this issue does exist, the only file I had with such odd audio turned out to be broken (a newer version of FFmpeg refuses to transcode it, claiming it is malformed). Until I find a sample with dynamically changing parameters, this issue is on hold.
FFmpeg offers a lot of information. Most of it is needed only to handle special cases (such as an additional time offset) and is completely ignored at the moment. I added a few log messages for these special cases. The FFmpeg importer still won't handle them, but at least one can see in the log that something is wrong.
Importers and exporters in Audacity are not known for their meaningful error messages, because no error messages come from them at all. Error reporting could be improved in all plug-ins, and error handling should be improved in the FFmpeg plug-in in particular.
The new logger solved this problem partially. The new Progress Dialog helps too: it recognizes three outcomes of the import procedure (Success, Canceled, Error) instead of two (Success, Not Success).
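Why the third outcome matters can be seen in a small sketch. The enum and function names below are hypothetical and are not taken from the Progress Dialog code; the message text is made up as well.

```cpp
#include <cassert>
#include <string>

// The three outcomes of an import, as described above.
enum class ImportResult { Success, Canceled, Error };

// Folds what happened during the import into one outcome.
// A cancel request takes priority: the user already knows why
// the import stopped.
ImportResult Classify(bool userCanceled, bool decodeFailed)
{
    if (userCanceled)
        return ImportResult::Canceled;
    return decodeFailed ? ImportResult::Error : ImportResult::Success;
}

// Returns the message to show to the user, or an empty string
// if nothing needs to be reported.
std::string MessageFor(ImportResult result)
{
    switch (result) {
    case ImportResult::Success:
    case ImportResult::Canceled:
        return "";
    case ImportResult::Error:
        return "Import failed, see the log for details.";
    }
    assert(false && "unhandled ImportResult");
    return "";
}
```

With only Success and Not Success, a cancelled import and a failed one are indistinguishable, so the caller either reports an error the user caused deliberately or stays silent about a real failure.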
- FFmpeg exporter.
- Compatibility detection for FFmpeg exporter.
This section will serve as a kind of log; anyone is free to post FFmpeg-related entries here. This could be done in Talks, of course, but Talks are supposed to be used for discussions about the contents of the page and further edits, not for the content itself.
LRN 21:34, 21 April 2008 (PDT): First log entry.
LRN 16:09, 13 May 2008 (PDT): Still trying to pull myself together and start coding. Meanwhile I managed to set up an FFmpeg build environment. Again. I am publishing binaries right away, see the link at the FFmpeg page. These builds are audio-only, meaning that I didn't include any optional video libraries. Maybe it's possible to actually disable the video routines; I'll have to do some research. The audio-only builds are still able to decode and encode video, but to a much lesser extent than audio. All audio codecs and container formats are included (except liba52; AFAIK FFmpeg features its own AC3 decoder/encoder by default, and it's better than liba52). There are two archives for each version (a GPL and an LGPL version). The only difference between the two is the AAC decoder, libfaad (GPL-only). Dependencies (pthreads, lame, ogg and vorbis libs) are the same for both builds. AMR support is not enabled, because libamr is undistributable (non-free). All builds are shared ones (with cross-dependencies). And since these are Windows-only builds, I compiled in AviSynth support. It could be extremely useful for leveraging full DirectShow decoding support on Windows!
Maybe i missed something? Let me know.
LRN 16:11, 20 May 2008 (PDT): Made a few small commits:
- Do not put duplicate entries in filter for Import file dialog.
- Use wxGenericFileDialog to import files (on Windows).
- Skip MP3 import plug-in when blindly trying each plug-in, since it may produce garbage data on non-mp3 files.
- Make Export file dialog resizeable (on Windows).
Commits 1 and 2 were taken from an earlier patch I made for my GSoC application. Commit 3 comes from there too, but I changed the implementation.
I noticed that wxGenericFileDialog may show a debug assertion failure (in isctype.c, triggered by a call to wxIsalpha() from filefn.cpp, line 388) when opening directories with non-Latin characters in their names. It looks like this is normal behaviour (the code handles negative values correctly).
LRN 09:24, 5 June 2008 (PDT): After 1.3.6a1 got tagged, I rushed ahead with the FFmpeg importer (at last). As a result, trunk was broken for the next two days :) As of today, here are the changes:
- Leland fixed the OpenDialog. As a result, wxGenericFileDialog is no longer required.
- New logger. The logger works for all of Audacity, but is especially useful for FFmpeg integration. Still buggy (don't try to open more than one Audacity project at once!).
- FFmpeg importer - builds on Windows and (mostly) on Linux. ifdef'ed with USE_FFMPEG (see config*.h).
- New way of loading libraries (by tracking the links from avformat).
- FFmpeg importer handles errors properly (moved error-prone stuff away from the constructors).
Thanks to Lars for his efforts in debugging the FFmpeg importer on Linux (I can't do it myself).
At this point I'll need more testing. Also, my exams will begin soon. I hope I'll manage to handle both the exams and Audacity.