FFmpeg integration

From Audacity Wiki
The FFmpeg integration project incorporated the open source FFmpeg library into Audacity as an optional library. This greatly expanded the range of proprietary formats Audacity can import and export, as listed below.

This page reviews and discusses progress of the FFmpeg integration project during GSoC 2008. Further discussion can also be found in the audacity-devel mailing list.


Essence

FFmpeg is a set of libraries for audio/video encoding/decoding/muxing/demuxing/processing/capturing. In Audacity FFmpeg will be used to import and export audio data. While FFmpeg also has other uses, this is outside the scope of this project.

User Documentation

User documentation was produced by the student and has been incorporated into appropriate pages in the Audacity Manual. Note that details of the documentation or of the project implementation itself may change over time.

Design

Linking

As a design decision, FFmpeg libraries are linked dynamically and loaded at run time. This keeps FFmpeg optional (Audacity will run without it) and allows it to be distributed separately, which removes licensing issues.

It's not yet clear which build of FFmpeg will be used: static or shared. A static build results in one big library that contains all functions from the FFmpeg package. A shared build results in several libraries (which may depend on other, non-FFmpeg libraries), each exporting different functions. A valid system-wide FFmpeg package is usually shared and resides somewhere in PATH, so any application is able to load functions from any of these libraries. This is also necessary because the libraries depend on each other. A static build is usually stored somewhere within the application's directory as a kind of plug-in. My guess is that Linux users would prefer shared FFmpeg, since they have a superior package management system, while on Windows it's easier to drop a static FFmpeg build into Audacity and forget about packages.

From Audacity's point of view there's only one difference: when using shared FFmpeg, Audacity has to load each function from its respective library, and each library has to be mapped into Audacity's process memory by a separate wxDynamicLibrary object. When using static FFmpeg, Audacity loads only one library and imports everything from it. Because all shared libraries should be in PATH, there is little point in asking the user for the libraries' location when a shared build is used, while with a static build it may be necessary.

Somewhere along the way the FFmpegLibs class was modified to support loading both shared and static libraries, and the library locating dialog was redesigned and moved into Preferences. Audacity now also modifies the application-wide PATH environment variable to include the directory where the located library resides. Because of that it is possible to load both static and shared libraries regardless of their location, as long as all relevant shared libraries are either in one directory or in PATH (previously, shared libraries had to be either in PATH or in the Audacity directory). This means that a static build is no longer very advantageous and should not be used unless there is a specific reason to do so.
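The PATH handling described above can be sketched as follows. Audacity itself is written in C++ and loads libraries through wxDynamicLibrary; this Python sketch only illustrates the save, modify and restore pattern (the progress log below notes that PATH is restored after loading). The names `prepended_path` and `load_with_dependencies` are hypothetical, and `loader` is an assumed stand-in for whatever actually loads the library.

```python
import os
from contextlib import contextmanager


@contextmanager
def prepended_path(directory):
    """Temporarily put `directory` at the front of PATH, then restore PATH."""
    original = os.environ.get("PATH")
    parts = [directory] + ([original] if original else [])
    os.environ["PATH"] = os.pathsep.join(parts)
    try:
        yield
    finally:
        # Restore the exact previous state, including "PATH was not set".
        if original is None:
            os.environ.pop("PATH", None)
        else:
            os.environ["PATH"] = original


def load_with_dependencies(library_file, loader):
    """Call `loader(library_file)` while the library's directory is in PATH.

    `loader` is a hypothetical callable standing in for the real loading
    mechanism; this sketch does not load anything by itself.
    """
    directory = os.path.dirname(os.path.abspath(library_file))
    with prepended_path(directory):
        return loader(library_file)
```

Restoring PATH in a `finally` clause means a failed load does not leave the modified value behind.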

Functionality overlap

FFmpeg is able to import and export FLAC, OGG Vorbis, MP3, MP2 and various uncompressed wave files. However, Audacity already has all the capabilities needed to work with these formats (except MP3, where exporting requires a separate plug-in). Using the built-in Audacity import/export modules is better because they support various sample formats, while FFmpeg supports only 16-bit integer samples. FFmpeg also cannot (at the time of writing) import metadata (tags) from raw FLAC files.

In some cases, however, the user may wish to use FFmpeg to import files. This could be achieved in two ways:

  • Add a new preference: plug-in loading order. With this preference the user could adjust the order in which plug-ins are registered in Audacity. If the FFmpeg import plug-in were registered before all others, it would handle importing of any file without letting other plug-ins even try. At the moment FFmpeg is always registered last.
  • Make Importer aware of the file filter the user chose in FileDialog, and assume that by choosing the "FFmpeg-compatible file" filter the user wants to import files via FFmpeg, while by choosing "MP3 files" the user asks Audacity to use libmp3lame. Implementing this requires a few changes, some of which may involve cumbersome passing of an additional argument through several functions from ShowOpenDialog to Importer.

I implemented the second way: FileDialog stores the selected filter index in preferences, and Importer retrieves this index and uses it to guess which import plug-in to try first.
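A minimal sketch of this "try the plug-in matching the chosen filter first" idea, written in Python for brevity. It is not the Audacity implementation: the dictionary layout, the filter names and the functions `order_importers` and `import_file` are assumptions made for illustration, and a filter name is used here where Audacity stores a filter index.

```python
def order_importers(importers, selected_filter):
    """Move importers matching the chosen filter to the front.

    The registration order of all other importers is left unchanged.
    """
    preferred = [imp for imp in importers if imp["filter"] == selected_filter]
    others = [imp for imp in importers if imp["filter"] != selected_filter]
    return preferred + others


def import_file(path, importers, selected_filter):
    """Try each importer in turn; return (name, data) from the first success."""
    for importer in order_importers(importers, selected_filter):
        data = importer["open"](path)  # hypothetical: returns None on failure
        if data is not None:
            return importer["name"], data
    return None
```

If the chosen filter matches no importer (for example an "All files" filter), the list keeps its registration order, so FFmpeg stays last.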

Overlap in export functionality is discussed later.

Export procedure

Currently Audacity offers only a limited choice of export file types:

  • WAV, AIFF and other uncompressed files
  • OGG Vorbis
  • MP3
  • MP2
  • FLAC
  • Command-line exporter

The first one hides a variety of export types (header formats) and codecs (sample encoding formats) in its "Options..." dialog. This is possible because there's little or no difference between all these formats. For FFmpeg such a solution is not acceptable, since FFmpeg can export audio in completely different formats, and grouping them all in one choice is unintuitive. Grouping all the options for all these formats in one dialog is not the easiest thing in the Universe (to implement or to use) either.

To overcome this, the ExportPlugin class was slightly redesigned so that one plug-in can present itself as more than one export type while using the same code to perform the actual export. This new feature is used to present FFmpeg as a few common export types, each with its own simplified options dialog, while still providing access to the complete functionality via the Custom FFmpeg export dialog.
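The "one plug-in, several export types" design can be illustrated with a small Python sketch. The class and method names (`ExportType`, `MultiTypeExporter`, `add_type`, `export`) and the type descriptions are hypothetical and do not mirror the real ExportPlugin interface; the `export` method only returns a description of what would be written instead of encoding anything.

```python
from dataclasses import dataclass, field


@dataclass
class ExportType:
    description: str   # text shown in the export dialog's type list
    extensions: list   # valid file extensions, default first
    codec: str         # parameter handed to the shared export routine


@dataclass
class MultiTypeExporter:
    types: list = field(default_factory=list)

    def add_type(self, description, extensions, codec):
        """Register one more export type; return its index."""
        self.types.append(ExportType(description, list(extensions), codec))
        return len(self.types) - 1

    def descriptions(self):
        return [t.description for t in self.types]

    def default_extension(self, type_index):
        return self.types[type_index].extensions[0]

    def export(self, type_index, samples, filename):
        """Shared export routine: only the per-type parameters differ."""
        chosen = self.types[type_index]
        return {"file": filename, "codec": chosen.codec,
                "sample_count": len(samples)}
```

Each registered type appears as a separate entry to the user, while all of them go through the same `export` code path.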

Questions and Answers

  • James: The Windows build of FFmpeg is (and can only be) built using MinGW/Gcc. Audacity is built using MSVC. Is there an issue with loading a MinGW/Gcc DLL from a MSVC program? Do we know of other programs that already do that?
    • LRN: As far as I know there are no issues, though I can't name a program that does such loading. To me it has always been self-evident that ld produces normal C DLLs capable of exporting functions in the conventional way. wxDynamicLibrary (the wx equivalent of LoadLibrary()/GetProcAddress()) should load anything. Issues may arise with C++ function decorations, but they are not used in FFmpeg at all.
  • James: My understanding is that, as with Mpeg, because of possible patent issues we won't be distributing the FFmpeg dlls. Which sites would be appropriate to direct people to for downloading (a) Windows (b) Linux dynamic link libraries? Does the legal status depend only on the physical location of the server, and where are these servers located?
    • LRN: FFmpeg with AMR support is illegal anywhere, so we can't officially distribute such builds. For builds without AMR it depends on the country (such FFmpeg builds are legal in countries without software patents). Linux binaries can be obtained from repositories (Medibuntu comes to mind). Windows binaries built with DLLs are rare (usually they contain only ffmpeg.exe); so far the only site I know of that hosts a DLL version of FFmpeg is Ramiro's site. Yes, I think legal status depends on location only (it may be illegal for people in the US to download FFmpeg from anywhere, but hosting FFmpeg builds somewhere in Europe should be legal). If you want guarantees, contact your lawyer. No, I don't know anything about the location of any of the servers I mentioned. My own server is located in Moscow, Russia.
  • Gale: How does the distinction between the GPL and LGPL licenses of the individual codecs in FFmpeg fit in with the Audacity licence? For example is your user-friendly single static dll distributed with the alpha Windows builds LGPL, not only the GPL'ed formats?
    • LRN: Yes. Nothing prohibits you from distributing LGPL'ed and GPL'ed binaries (and code) together - they all become GPL'ed. So, when I say "LGPL libavformat", I mean "libavformat with only LGPL'ed codecs/formats", but "GPL libavformat" means "libavformat with both GPL'ed and LGPL'ed codecs/formats".
  • How do I enable / use FFmpeg support on Windows?
    • Gale: Step 1: Uncomment the "USE_FFMPEG" line in win/configwin.h and delete the "#undef USE_FFMPEG" line in the same file, or change it to "#define USE_FFMPEG". Step 2: Rebuild Audacity. Step 3: Extract avformat.dll from avformat-06.10.2008.zip and place it with the other widgets dll's for your build of Audacity. Step 4: Import a Windows Media Video (WMV) file and listen to its audio.
  • What is the situation on Mac?
    • Leland: Works quite well on the Mac. I haven't hit any multi-stream files, but plain old WMV and the like import just fine. It did require that I built my own versions of the libraries. The reason is that the libavformat and brethren MUST be built as a "bundle" due to the way wxWidgets works on the Mac. The wxDynamicLibrary class will not load normal dynamic libraries (.dylib). I do not know if there are prebuilt OS X binaries that satisfy this requirement, so we may need to provide them ourselves...if the legal department say that's okay.
  • How do I enable / use FFmpeg support on Linux (and other platforms) using the configure script?
    • Richard: Step 1: Install a recent (post-January 2008) copy of FFmpeg on your system (including the development packages if your distribution uses them). Step 2: Configure a recent CVS check-out of Audacity using the --enable-ffmpeg option (or omit it altogether for auto-detection). Step 3: Import a file of a type that only FFmpeg supports, and ignore the broken progress indicator.

Functionality

The following table lists FFmpeg format functionality at the time it was first integrated into Audacity.

Audacity also now supports import and export of 16-bit Apple Lossless (ALAC).

All functionality depends on the build of FFmpeg in use and the codecs enabled in that build of FFmpeg. For file formats and codecs supported by latest FFmpeg builds, see http://ffmpeg.org/general.html#Audio-Codecs.

The latest builds of FFmpeg may be used for exporting by pointing Audacity's command-line encoder to your preferred FFmpeg binary. Audacity has no ability yet to import files by pointing to arbitrary decoding libraries.
Long name | Short name | Default extension | Read | Write | Read metadata | Write metadata
4X Technologies format | 4xm | ??? | yes | no | no | no
ADTS AAC | adts | aac | yes | yes | no | no
Audio IFF | aiff | aiff | yes | yes | yes | yes
3GPP AMR file format | amr | amr | yes | yes | no | no
CRYO APC format | apc | ??? | yes | no | no | no
Monkey's Audio | ape | ape | yes | no | yes 5 | no
ASF format | asf | wma | yes 2 | yes 3 | yes | yes 4
SUN AU format | au | au | yes | yes | no | no
AVI format | avi | avi | yes | yes | no | no
AVISynth | avs | avs | yes | yes | no | no
AVS format | avs | avs | yes | no | no | no
Bethesda Softworks VID format | bethsoftvid | ??? | yes | no | no | no
Brute Force & Ignorance | bfi | ??? | yes | no | no | no
Interplay C93 | c93 | ??? | yes | no | no | no
CRC testing format | crc | crc | no | yes | no | no
D-Cinema audio format | daud | 302 | yes | yes | no | no
Delphine Software International CIN format | dsicin | ??? | yes | no | no | no
DV video format | dv | dv | yes | yes | no | no
Electronic Arts cdata | ea_cdata | ??? | yes | no | no | no
Electronic Arts multimedia format | ea | ??? | yes | no | no | no
FFM (FFserver live feed) format | ffm | ffm | yes | yes | no | no
FLV format | flv | flv | yes | yes | no | no
framecrc testing format | framecrc | | no | yes | no | no
GXF format | gxf | gxf | yes | yes | no | no
id CIN format | idcin | ??? | yes | no | no | no
id RoQ format | RoQ | roq | yes | yes | no | no
IFF format | iff | ??? | yes | no | no | no
Interplay MVE format | ipmovie | ??? | yes | no | no | no
lmlm4 raw format | lmlm4 | ??? | yes | no | no | no
Matroska file format | matroska | mka | yes | yes | yes | yes
American Laser Games MM format | mm | ??? | yes | no | no | no
mmf format | mmf | mmf | yes | yes | no | no
QCP format containing QCELP | qcp | qcp | no | no | no | no
QuickTime MOV format | mov | mov | yes | yes | yes 1 | yes
MP4 format | mp4 | mp4 | yes | yes | yes 1 | yes
3GP format | 3gp | 3gp | yes | yes | yes | yes
PSP MP4 format | psp | mp4 | yes | yes | yes 1 | yes
3GP2 format | 3g2 | 3g2 | yes | yes | yes | yes
iPod H.264 MP4 format | ipod | m4a | yes | yes | yes 1 | yes
MPEG audio layer 3 | mp3 | mp3 | yes | yes | yes | yes
MPEG audio layer 2 | mp2 | mp2 | yes | yes | yes | yes
Musepack | mpc | ??? | yes | no | no | no
Musepack SV8 | mpc8 | ??? | yes | no | no | no
MPEG-PS format | mpeg | mpeg | yes | yes | no | no
MPEG-2 PS format | vob | vob | yes | yes | no | no
MPEG-2 transport stream format | mpegts | ts | yes | yes | no | no
MTV format | MTV | ??? | yes | no | no | no
Motion Pixels MVI format | mvi | mvi | yes | no | no | no
Material eXchange Format | mxf | mxf | yes | no | no | no
NullSoft Video Format | nsv | nsv | yes | no | yes | no
NUT format | nut | nut | yes | yes | yes | yes
NuppelVideo format | nuv | ??? | yes | no | no | no
Ogg | ogg | ogg | yes | yes | yes | yes
Sony OpenMG audio | oma | oma | yes | no | no | no
Sony Playstation STR format | psxstr | ??? | yes | no | no | no
Speex | Speex | spx | yes | no | yes | no
TechnoTrend PVA file and stream format | pva | ??? | yes | no | no | no
Raw AC-3 | ac3 | ac3 | yes | yes | no | no
Raw DTS | dts | dts | yes | yes | no | no
Raw FLAC | flac | flac | yes | yes | no | no
Raw GSM | gsm | gsm | yes | no | no | no
Raw MLP | mlp | mlp | yes | yes | no | no
Raw Shorten | shn | ??? | yes | no | no | no
Raw PCM | pcm_* | | yes | yes | no | no
r2l format | r2l | ??? | yes | no | no | no
RM format | rm | rm | yes | yes | yes | yes
RPL/ARMovie format | rpl | ??? | yes | no | yes | no
Sega FILM/CPK format | film_cpk | ??? | yes | no | no | no
Sierra VMD format | vmd | ??? | yes | no | no | no
Beam Software SIFF | siff | son | yes | no | no | no
Smacker video | smk | smk | yes | no | no | no
Sierra SOL format | sol | ??? | yes | no | no | no
Flash format | swf | swf | yes | yes | no | no
Flash 9 (AVM2) format | avm2 | | yes | yes | no | no
THP | thp | ??? | yes | no | no | no
Tiertex Limited SEQ format | tiertexseq | ??? | yes | no | no | no
True Audio | tta | tta | yes | no | no | no
Creative Voice file format | voc | voc | yes | yes | no | no
WAV format | wav | wav | yes | yes | no | no
Wing Commander III movie format | wc3movie | ??? | yes | no | yes | no
Westwood Studios audio format | wsaud | ??? | yes | no | no | no
Westwood Studios VQA format | wsvqa | ??? | yes | no | no | no
WavPack | wv | ??? | yes | no | no | no
Maxis XA File Format | xa | ??? | yes | no | no | no


1 as of August 2008 read support prevented by an FFmpeg bug
2 WMA V2 only
3 WMA V2, Professional and Voice only
4 partial write support only: Artist, Track Title and Comment, plus Copyright (not supported by Audacity)
5 Artist Name not supported

Issues

GetOpenFileName filter length bug

On Windows, a single file filter in OpenDialog can't exceed 260 bytes in length.

Eventually Leland found a workaround: hacking directly into the ListView control of the dialog and filtering its contents via a custom filtering routine.
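To illustrate why the limit matters and what a custom filtering routine does in principle, here is a small Python sketch. It has nothing to do with the actual ListView hack; `filter_length`, `matches_filter` and `filter_listing` are hypothetical names, and the semicolon-separated pattern string is an assumption about how the patterns would be joined into one filter.

```python
from fnmatch import fnmatch


def filter_length(patterns):
    """Length in bytes of the patterns joined into one filter string."""
    return len(";".join(patterns).encode("utf-8"))


def matches_filter(filename, patterns):
    """Case-insensitive check of one file name against wildcard patterns."""
    lowered = filename.lower()
    return any(fnmatch(lowered, pattern.lower()) for pattern in patterns)


def filter_listing(filenames, patterns):
    """Keep only the names that match at least one pattern."""
    return [name for name in filenames if matches_filter(name, patterns)]
```

Because the pattern list lives in the application instead of a single filter string, its total length is no longer constrained by the dialog.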

Channel order

FFmpeg presents channels in the order they are stored on the media. Audacity has to watch the importing type and reorder channels accordingly. The same goes for export. This is worsened by the fact that Audacity lacks proper multi-channel support.

I decided to put this issue on hold. Without full support for multi-channel audio there's no sense in reordering channels on import: the order doesn't matter to Audacity, since all channels are mixed to stereo or mono at playback.
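Since the issue is on hold, the following is only a sketch of what reordering would involve, not something the FFmpeg importer does. The function name `reorder_channels` and the channel labels in the usage note are hypothetical; no claim is made about the real channel order of any format.

```python
def reorder_channels(channels, source_order, target_order):
    """Return `channels` rearranged from `source_order` to `target_order`.

    `channels` is a list of per-channel sample lists; both order arguments
    are lists of channel labels and must contain the same labels.
    """
    if len(channels) != len(source_order):
        raise ValueError("one label is needed per channel")
    if sorted(source_order) != sorted(target_order):
        raise ValueError("source and target orders must use the same labels")
    by_label = dict(zip(source_order, channels))
    return [by_label[label] for label in target_order]
```

For example, with a stored order of `["FL", "FR", "FC"]` and a wanted order of `["FL", "FC", "FR"]`, the second and third channels swap places.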

Dynamic changes

For some files or other sources audio parameters may change over time. For example, number of channels or sample rate may vary. At the moment FFmpeg importer can't handle that.

While I believe such an issue does exist, the only file with such odd audio that I had turned out to be broken (a newer version of FFmpeg refuses to transcode it, claiming it to be malformed). Until I find a sample with dynamically changing parameters, this issue is on hold.

Special cases

FFmpeg offers a lot of information. Most of it is needed to handle special cases (such as an additional time offset) and is completely ignored at the moment. I added a few log messages for special cases. While the FFmpeg importer still won't handle these cases, at least one can see in the log that something is wrong.

Error handling

Importers and exporters in Audacity are not known for their meaningful error messages, because there are no error messages coming from them. Error reporting could be improved in all plug-ins, and error handling should be improved in the FFmpeg plug-in in particular.

The new logger solved this problem partially. The new Progress Dialog helps too: it recognizes three outcomes of the import procedure (Success, Canceled, Error) instead of two (Success, Not Success).
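The difference between a two-way and a three-way result can be sketched like this in Python. `ImportOutcome` and `run_import` are hypothetical names, and treating `OSError` and `ValueError` as "the import failed" is an assumption of the sketch, not a description of Audacity's error handling.

```python
from enum import Enum


class ImportOutcome(Enum):
    SUCCESS = "success"
    CANCELLED = "cancelled"
    ERROR = "error"


def run_import(read_blocks, cancel_requested):
    """Consume blocks from `read_blocks()` and report how the import ended.

    `read_blocks` returns an iterator of audio blocks and `cancel_requested`
    returns True once the user has pressed Cancel; both are stand-ins.
    """
    try:
        for _block in read_blocks():
            if cancel_requested():
                return ImportOutcome.CANCELLED
    except (OSError, ValueError):
        return ImportOutcome.ERROR
    return ImportOutcome.SUCCESS
```

With a separate cancelled outcome, the caller can stay silent when the user cancels and show an error message only for real failures.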

TODO

All planned major things are done.

Progress

This will serve as some kind of log, anyone is free to post FFmpeg-related entries here. This could be done in Talks of course, but Talks are supposed to be used for discussions about contents of the page and further edits, not for content itself.


LRN 21:34, 21 April 2008 (PDT): First log entry.


LRN 16:09, 13 May 2008 (PDT): Still trying to pull myself together and start coding. Meanwhile I managed to set up an FFmpeg build environment. Again. I am publishing binaries right away, see the link at the FFmpeg page. These builds are audio-only, meaning that I didn't include any optional video libraries. Maybe it's possible to actually disable video routines; I'll have to do some research. An audio-only build is still able to decode and encode video, but to a much lesser extent than audio. All audio codecs and container formats are available (except liba52; AFAIK FFmpeg features an AC3 decoder/encoder by default and it's better than liba52). There are two archives for each version (GPL and LGPL). The only difference between the two is the AAC decoder, libfaad (GPL-only). Dependencies (pthreads, lame, ogg and vorbis libs) are the same for both builds. AMR support is not enabled, because libamr is undistributable (non-free). All builds are shared ones (with cross-dependencies). And since these are Windows-only builds, I compiled in AviSynth support. It could be extremely useful for leveraging full DirectShow decoding support on Windows!

Maybe i missed something? Let me know.


LRN 16:11, 20 May 2008 (PDT): Made a few small commits:

  1. Do not put duplicate entries in filter for Import file dialog.
  2. Use wxGenericFileDialog to import files (on Windows).
  3. Skip MP3 import plug-in when blindly trying each plug-in, since it may produce garbage data on non-mp3 files.
  4. Make Export file dialog resizeable (on Windows).

Commits 1 and 2 were taken from an earlier patch I made for my GSoC application. Commit 3 comes from there too, but I changed the implementation.

I noticed that wxGenericFileDialog may show debug assertion failure (in isctype.c, making call to wxIsalpha() from filefn.cpp, line 388) when opening directories with non-latin characters in the name. Looks like it's normal behaviour (code handles negative values correctly).


LRN 09:24, 5 June 2008 (PDT): After 1.3.6a1 got tagged, I rushed ahead with the FFmpeg importer (at last). As a result, trunk was broken for the next two days :) As of today, here are the changes:

  1. Leland fixed the OpenDialog. As a result, wxGenericFileDialog is no longer required.
  2. New logger. The logger works for all of Audacity, but is especially useful for FFmpeg integration. Still buggy (don't try to open more than one Audacity project at once!).
  3. FFmpeg importer - builds on Windows and (mostly) on Linux. ifdef'ed with USE_FFMPEG (see config*.h).
  4. New way of loading libraries (by tracking the links from avformat).
  5. FFmpeg importer handles errors properly (moved error-prone stuff away from the constructors).

Thanks to Lars for his efforts in debugging FFmpeg importer on Linux (can't do it myself).

At this point I'll need more testing. Also, my exams will begin soon. I hope I'll manage to work on both exams and Audacity.


LRN 09:07, 21 June 2008 (PDT): 1.3.6a2 was tagged. The FFmpeg importer wasn't really working on some platforms, but on Windows it should work for most users. The important changes:

  1. Preferences->Export/Import got a new section for FFmpeg libraries (just as LAME).
  2. FFmpegLibs now supports loading statically-linked dynamic libraries. I made and packed one such library, see the external links on FFmpeg.
  3. More log messages for FFmpegLibs.
  4. Audacity warns the user when FFmpeg libraries are required but not available. This warning can be disabled (as yet, there's no way to re-enable it other than manually editing the config file).
  5. "Import functionality overlap" resolved by a quick fix - Audacity now tries importers in a different order, depending on which filter was selected when the user opened the file in Open Dialog. Richard suggested a complete reshaping of the importer. Maybe I'll do it later, after both importer and exporter are fully working.
  6. Various bugfixes.
  7. I fixed the OGG Vorbis importer to work with multi-stream OGG files.
  8. New multi-format exporter (source of both bugs and joy). As an example, I modified the "WAV, AIFF and other uncompressed" exporter into a multi-format one, and the list of export types has grown significantly. I think it will return to normal by the next alpha - no one really needs all these export types. Gale asked to keep "AIFF" as a separate type - but that's the only one.
  9. The FFmpeg exporter is ready to be committed, but I'm wavering - I'm not sure I'll be able to shape it up before the next deadline.

LRN 12:02, 2 July 2008 (PDT): 1.3.6a3 was tagged. I took my chances and committed the exporter on 22nd June. Luck was on my side and we managed to fix all the nasty bugs before the deadline - so, yes, 1.3.6a3 comes with the FFmpeg exporter. Actually 1.3.6a3 is the first version with FFmpeg enabled by default. Important changes are:

  1. Before loading the avformat library, Audacity changes the environment variable "PATH" (if available) by adding the directory where the library is located, and restores "PATH" back to normal afterwards. It should now be possible to load shared FFmpeg libraries from an arbitrary place (not only from the Audacity directory or a directory in PATH). So far I haven't received any feedback on this feature for platforms other than Windows (on Windows I tested it myself).
  2. Added an error dialog box that is shown when FFmpeg fails to find a codec required for exporting in one of the pre-defined types (which is the only way it works at the moment). It says "codec is not found" and displays a number (codec ID). Most of the time it should appear when exporting to AMR, because AMR is never included in any legal FFmpeg distribution. A more informative dialog specifically for AMR was proposed; I have yet to implement it.
  3. Fixed nasty bug with flags in encoder context. This fixes MP4-AAC exporting (and maybe other formats too).
  4. Changed GSM to be exported to AIFF rather than WAV (GSM can't be stored in WAV container, only GSM-MS can).
  5. Each pre-defined type can now have more than one valid extension (it is possible to export MP4-AAC as "*.mp4" or "*.m4a", etc). People are saying that on Mac the "m4a" extension should be used by default.
  6. Changed channel checking code. I'm not sure it is entirely correct, but at least now AMR and GSM exporting works on non-mono tracks (mono-mixing is done automatically).
  7. Now Audacity has two uncompressed exporters - "AIFF signed 16-bit PCM" and "WAV signed 16-bit PCM", and "other uncompressed types" for everything else libsndfile can export. People reported weird behaviour with WAV being exported as AIFF.
  8. "Download" and "Find library" buttons for FFmpeg in Preferences are disabled if Audacity is built without FFmpeg support.

The "Other FFmpeg types" dialog was committed only after 1.3.6a3, but it isn't functioning anyway. I committed it only because I got tired of cutting it out of ExportFFmpeg.cpp before committing other changes to that file.


LRN 07:51, 11 July 2008 (PDT): After 2nd July I took a vacation that lasted for one week. Because of that there isn't much to report.

  1. "Other FFmpeg-Compatible Files" was committed after 1.3.6a3, but only today did I finally patch it enough for it to start working. In theory this dialog allows exporting audio in every possible combination of format, codec, format options and codec options. Presets are not implemented yet, so the user either has to use only one combination of parameters, or change them manually each time he/she wants something different.
  2. AAC exporter now exports audio with different profiles, such as LC-AAC, Main-AAC etc.

Format and codec compatibility is not yet completely correct. For example, it would claim that ogg container supports only FLAC codec, which is obviously wrong. Also, codec list may show duplicate entries.


LRN 09:25, 4 August 2008 (PDT): Haven't updated for a while. Here is the summary of changes since:

  1. Now Audacity won't show libsndfile export dialog options for "WAV 16-bit PCM " and "AIFF 16-bit PCM" export types;
  2. AAC SSR profile is commented out (not supported by FFmpeg yet)
  3. Audacity will feed FFmpeg with silence to ensure that last block is full. This behaviour is enabled by "/FileFormats/OverrideSmallLastFrame" config switch (1/0) and is enabled by default. There is no GUI for this setting at the moment.
  4. MP4 (AAC) files are now exported with ".m4a" extension by default (required for Mac)
  5. RA3 support was removed - RA3 is not supported by FFmpeg and I can't find a way to fix that.
  6. FFmpeg tag import and export both should work now. These should be tested against various players to make sure that metadata is exported and imported in proper character encoding.
  7. Removed temporary FFmpeg export types (MP3, Ogg Vorbis, etc)
  8. Removed AVCodecTag-based compatibility and added hardcoded compatibility list instead
  9. Simplified M4A (AAC) and AC3 export dialogs.
  10. Changed WMA allowed bitrate list.
  11. Custom FFmpeg export dialog is almost finished. It can successfully export audio and save/load presets.
  12. start_time is now handled in FFmpeg importer.
  13. AAC and AC3 exporters will check the project sample rate against a list of valid sample rates for these formats and will offer resampling.
  14. A few bugfixes.
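Items 3 and 13 in the list above lend themselves to a short illustration. This Python sketch is not the code from ExportFFmpeg.cpp: `pad_last_frame` and `needs_resampling` are hypothetical names, samples are plain integers, and the list of valid rates is supplied by the caller instead of being the real AAC or AC3 list.

```python
def pad_last_frame(samples, frame_size):
    """Append silence (zeros) so the sample count is a multiple of frame_size."""
    if frame_size <= 0:
        raise ValueError("frame_size must be positive")
    remainder = len(samples) % frame_size
    if remainder == 0:
        return list(samples)
    return list(samples) + [0] * (frame_size - remainder)


def needs_resampling(project_rate, valid_rates):
    """True if the project rate is not among the rates the format accepts."""
    return project_rate not in valid_rates
```

For example, five samples with a frame size of four are padded with three zeros, so the encoder always receives a full last block.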

LRN 06:31, 18 August 2008 (PDT): Looks like it's my last entry.

  1. Moved a lot of code from ExportFFmpeg.cpp to newly created ExportFFmpegDialog.h/cpp
  2. Fixed quality computation in AAC and quality flag (without it quality-based export won't happen)
  3. Added some code comments
  4. Removed all testing export types
  5. Simplified preset load/save code
  6. Presets now are saved/loaded to/from xml file. It is also possible to import/export presets from/to external xml file (export your own presets into file and send 'em to your friend!)
  7. Fixed save/delete issues for preset (presets without name, non-existing presets, confirmation when deleting a preset)
  8. Improved calls to guess_format() function for export types (added shortname to each export type)
  9. Some fixes in the behaviour of format/codec listboxes in Custom FFmpeg export dialog
  10. Fixed metadata export
  11. Fixed default file extension for "Other uncompressed files" - it is now WAV on Windows and AIFF on other platforms.
  12. Removed FFmpeg-based GSM export type, added libsndfile-based one.
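Saving and loading presets as XML (item 6 above) could look roughly like the following Python sketch. The element and attribute names (`presets`, `preset`, `setting`, `name`, `key`, `value`) are invented for illustration and are not the schema Audacity writes; all setting values are stored as strings.

```python
import xml.etree.ElementTree as ET


def presets_to_xml(presets):
    """Serialize {preset name: {setting: value}} to an XML string."""
    root = ET.Element("presets")
    for name, settings in presets.items():
        node = ET.SubElement(root, "preset", name=name)
        for key, value in settings.items():
            ET.SubElement(node, "setting", key=key, value=str(value))
    return ET.tostring(root, encoding="unicode")


def presets_from_xml(text):
    """Parse the XML produced by presets_to_xml back into a dictionary."""
    root = ET.fromstring(text)
    return {
        preset.get("name"): {
            setting.get("key"): setting.get("value")
            for setting in preset.findall("setting")
        }
        for preset in root.findall("preset")
    }
```

Because the result is an ordinary text file, a preset exported on one machine can be sent to someone else and imported there.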
