FFMPEG API. The Agony and the Adequacy.

The FFMPEG API is one of those things that I love to hate.  On one hand, it always seems to be the best tool for the job whenever I write video code.  It’s portable, and integrates relatively easily into any application without needing a special framework.  And it has existed long enough that I can reasonably expect it to still be there when I am making version 2 of my project.  It also supports basically every format I need to deal with.  That’s more than I can say about QuickTime[-X].  (Not portable to Linux.  QuickTime-X is Mac/iOS only.  Requires dealing with Carbon/Cocoa APIs.  Hard to use in a command line app.  Etc.)  Or Windows Media.  (Obviously Windows only.  Is it DirectShow or Media Foundation now?  Oh yeah, also surprisingly difficult to integrate into a command line app.)  Or GStreamer.  (More portable than the WinMedia stuff, sure.  But still a lot of baggage to add to a port, and keeping consistent format support between platforms is essentially impossible.  And my app isn’t meant to be a “GStreamer client.”  It’s an app that I want to get video into.  So, just give me the damn pixels already.  I’ll worry about displaying them, thank you very much.)  Or QtMultimedia.  (In practice, I am using Qt, and it is fairly portable, but I still just want the damn pixels.  And I’ll need to encode at some point.)

Most of these APIs suffer from the fascinating delusion that people just want to write simple video players.  Who the hell actually wants to do that?  Why is it such a well supported use case, given that most users already have a video player installed?  It makes a neat party trick to do your own, but I’m not sure what I get out of writing another vlc/mplayer/totem/mediaPlayer/QuickTimePlayer given that I already have one.  I wonder who these legions of developers are who look at VLC and think, “I’m gonna basically do that, except probably worse and less mature.”  Every developer I ever talked to about using a video API was also doing something “interesting.”  They needed the pixels more than they needed a 20-line demo of making a video player in Python.  They needed easier encoding much more than they needed trivialised presentation.  And they needed good documentation.

So, FFMPEG stays the king of a motley crew of video APIs that aren’t that great.  But I keep running into the fact that they gradually evolve the API in place, keeping most of the cruft while changing just enough that the random example code I found on the net won’t actually compile anymore.  The API cleanups are never quite sweeping enough to elevate the API to “nice,” but just enough to make a lot of extra work out of figuring out which tutorials and samples are actually valid when starting a project.  Which wouldn’t be so bad if the main project documentation was first-rate.

For example, take http://ffmpeg.org/doxygen/trunk/group__lavc__encoding.html#gaa2dc9e9ea2567ebb2801a08153c7306b from the documentation for “avcodec_encode_video2.”  Now, first off there is the fact that, because they mutate the API in place rather than just having a “version 2” of the API itself, they preserve some sense of backwards compatibility by putting version numbers on individual functions.  This function replaced the older avcodec_encode_video after it was deprecated.  I suppose it is practical, and it does serve a purpose, but nothing else I depend on works this way.  As a matter of personal opinion, I really don’t like it.  But my actual complaint with the docs here is the explanation of the return value.

“0 on success, negative error code on failure”  Okay, so I can know if it worked, but that negative error code doesn’t actually seem to be documented anywhere.  And given the state of the docs and examples, I am pretty much guaranteed to do something that causes an error.  Unfortunately, I won’t get any kind of a hint about what exactly I did wrong.  The documentation, such as it is, is pretty much all just autogenerated Doxygen HTML pages presenting a slightly prettier view of exactly what I would see if I read the headers directly.  And that’s where I go bonkers.
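
For what it’s worth, the negative codes are not entirely opaque.  As far as I can tell from libavutil/error.h, they are either negated errno values (AVERROR(EINVAL) is -EINVAL on typical platforms) or negated four-character tags (AVERROR_EOF is built from 'E', 'O', 'F', ' '), and libavutil ships av_strerror() to turn one into a message.  The sketch below is not FFMPEG code.  describe_av_error is a hypothetical helper of my own, and the 0x10000 cutoff between errno values and tags is an assumption, but it shows the convention without linking against anything.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT part of FFmpeg: describe a negative
 * FFmpeg-style return code without linking against libavutil.
 * Assumption: magnitudes below 0x10000 are errno values and anything
 * larger is a four-character tag packed low byte first. */
static const char *describe_av_error(int code, char *buf, size_t size)
{
    if (code >= 0) {
        snprintf(buf, size, "success");
        return buf;
    }

    /* Magnitude of the code, computed in unsigned arithmetic so that
     * INT_MIN cannot overflow. */
    unsigned int value = 0u - (unsigned int)code;

    if (value >= 0x10000u) {
        /* Unpack the four tag bytes; some real tags start with a
         * non-printable byte, so those are shown as '?'. */
        char tag[5];
        for (int i = 0; i < 4; i++) {
            unsigned char c = (unsigned char)((value >> (8 * i)) & 0xFFu);
            tag[i] = isprint(c) ? (char)c : '?';
        }
        tag[4] = '\0';
        snprintf(buf, size, "FFmpeg tag '%s'", tag);
    } else {
        /* Plain negated errno: let the C library describe it. */
        snprintf(buf, size, "errno %u: %s", value, strerror((int)value));
    }
    return buf;
}
```

Calling describe_av_error(ret, buf, sizeof buf) on a failed return at least tells me whether I am looking at an ordinary EINVAL or one of the library’s own tags.  In a program that already links libavutil, av_strerror() is the thing to call instead, since it knows the actual messages.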

I will hopefully be able to talk more about the app that I am writing that uses FFMPEG quite soon.  It’s an interesting project.  I’ve learned a lot about a lot of things, and it has some interesting features.  It’s part of the post production pipeline for a project that will be shooting in January and will hopefully turn out to be quite fun.