Video Decoding using libav and ffmpeg

I spent the last month completely taking the libavg video decoding module apart and putting it together again. I’m finally convinced that the code is well-designed and readable – and it’s fast. It turns out that getting video decoding right is not as easy as it sounds, so I’ve written up a complete description of the internals for anyone who’s interested: VideoDecoding.

The weird thing is that from the outside, it looks like a solved problem, so every time I start telling someone about this, I get the same reaction. There are libraries for that, right? You just plug in libav or ffmpeg or gstreamer or proprietary things like QuickTime or DirectShow. All these libraries have existed for years, so they’re stable and easy to use, right?

Well, yes and no. If you don’t need advanced features, high-level libraries like gstreamer might do what you want. But we want frame-accurate seeking and a low-latency mode, as well as color space conversion using shaders. Opening and closing video files shouldn’t cause any interface stutters, and so on. Also, libavg can’t work with proprietary libs – we need something that works cross-platform. That leaves libav/ffmpeg, and this library exposes a pretty low-level interface. It does support every codec but the kitchen sink (pardon the wording) and gives you control over every knob these codecs have for tuning things. That’s really great, because you wanted control, right? Anyway, you can get everything done with libav/ffmpeg, but things get complicated quickly. For starters, you’re suddenly juggling five threads: demuxer, video decoder, audio decoder, display and audio mixer. libav/ffmpeg leaves all the threading to the user, so you’re dealing with a pretty complicated real-time system where lots of things happen at the same time. Dranger’s tutorial helps, but it’s dated.
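To give a feel for what that looks like in practice, here is a rough C++ sketch of two of those five threads. The PacketQueue type and its push/pop helpers are hypothetical placeholders, and the decoding calls use FFmpeg’s send/receive API, which may or may not match the library version you’re targeting; the point is just that libav/ffmpeg hands you packets and frames, while all the threading, queueing and A/V synchronization is yours to build.

    // Rough sketch only: PacketQueue and its helpers are hypothetical, and
    // error handling, seeking and A/V sync are left out entirely.
    extern "C" {
    #include <libavcodec/avcodec.h>
    #include <libavformat/avformat.h>
    }

    struct PacketQueue;                       // hypothetical thread-safe queue
    void push(PacketQueue& q, AVPacket* pPacket);
    AVPacket* pop(PacketQueue& q);            // blocks; returns nullptr at EOF

    // Demuxer thread: split the container into per-stream packet queues.
    void demuxThread(AVFormatContext* pFormatCtx, int videoStream,
            int audioStream, PacketQueue& videoQ, PacketQueue& audioQ)
    {
        AVPacket* pPacket = av_packet_alloc();
        while (av_read_frame(pFormatCtx, pPacket) >= 0) {
            if (pPacket->stream_index == videoStream) {
                push(videoQ, av_packet_clone(pPacket));
            } else if (pPacket->stream_index == audioStream) {
                push(audioQ, av_packet_clone(pPacket));
            }
            av_packet_unref(pPacket);
        }
        av_packet_free(&pPacket);
    }

    // Video decoder thread: turn packets into frames for the display thread.
    void videoDecodeThread(AVCodecContext* pCodecCtx, PacketQueue& videoQ)
    {
        AVFrame* pFrame = av_frame_alloc();
        for (;;) {
            AVPacket* pPacket = pop(videoQ);  // nullptr flushes the decoder
            if (avcodec_send_packet(pCodecCtx, pPacket) < 0) {
                break;
            }
            while (avcodec_receive_frame(pCodecCtx, pFrame) == 0) {
                // Hand the frame to the display thread, which presents it
                // according to its presentation timestamp.
            }
            if (!pPacket) {
                break;                        // end of stream
            }
            av_packet_free(&pPacket);
        }
        av_frame_free(&pFrame);
    }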

To make things worse, the interface of libav/ffmpeg changes with minor revision numbers, so to support a few years of operating systems, you find yourself adding a generous amount of #ifdefs to the code. I couldn’t find documentation that describes which changes happened in which minor revision, so you need to guess appropriate version numbers for the #ifdefs based on tests with multiple systems. Oh, and there are actually several constituent libraries, each with its own version number. Of course, you need to query the correct one. All of that takes time; the resulting code is hard to read and test. In addition, since ffmpeg forked and the developers aren’t on speaking terms (see this and this if you really want to know more), you need to test with libav (the fork) and ffmpeg (the original) if you want maximum compatibility.
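Here’s the kind of #ifdef this leads to – a minimal sketch. Frame allocation really did move from avcodec_alloc_frame() to av_frame_alloc() at some point, but the cut-off version below is illustrative rather than taken from a changelog, picked the way described above: by guessing and testing. And remember that libav and ffmpeg number their releases differently.

    extern "C" {
    #include <libavcodec/avcodec.h>
    }

    // Allocate a frame in a way that works across several library versions.
    // The version cut-off below is illustrative, not taken from a changelog.
    static AVFrame* allocFrame()
    {
    #if LIBAVCODEC_VERSION_INT >= AV_VERSION_INT(55, 28, 1)
        return av_frame_alloc();        // newer libavcodec releases
    #else
        return avcodec_alloc_frame();   // older releases; later deprecated
    #endif
    }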

All of this is really a pity, because I think the libav/ffmpeg developers are insanely smart guys and the library does a really admirable job of decoding and encoding everything you can throw at it. Also, if I’m honest, most of the time was spent figuring out how to organize the different threads well – and that’s something I really can’t blame libav/ffmpeg for.

Anyway, we’re now ready to add Raspberry Pi (read: OpenMAX IL) and VA-API hardware decoding, seamless audio loops and other cool things to libavg.

Raspberry Pi Support

I’m sure most of you have heard of the Raspberry Pi, a $25 ARM computer that runs Linux. We’ve spent quite a bit of time over the last few weeks getting libavg to run on this machine, and I’m happy to say that we have a working beta. We render to a hardware-accelerated OpenGL ES surface and almost all tests succeed. Besides full image, text and software video support, that includes all compositing, offscreen rendering and even general support for shader-based FX. We have brief setup instructions at https://www.libavg.de/site/projects/libavg/wiki/RPI. Update: The setup instructions have been updated for cross-compiling (much faster!) and moved to https://www.libavg.de/site/projects/libavg/wiki/RaspberryPISourceInstall.

Most of the work went into getting libavg to run on OpenGL ES. We now decide whether to use desktop or mobile OpenGL based on a configure switch, an avgrc entry and the hardware capabilities. Along the way, we implemented mobile context support under Linux for NVidia and Intel graphics systems (sketched after this paragraph), so we can now test most things without actually running (and compiling!) them on the Raspberry. Speaking of which – compiling for the Raspberry takes a long time, and compiling on it is impossible because there just isn’t enough memory. We currently chroot into a Raspberry file system and compile there (see the notes linked above).
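For the curious: “mobile context support” boils down to bringing up an OpenGL ES 2 context through EGL instead of a desktop context through GLX. The sketch below shows roughly which EGL calls are involved; it omits error handling and config selection details and is not the actual libavg code.

    #include <EGL/egl.h>

    // Rough sketch: create a GLES 2 context and window surface via EGL.
    // Error handling and config selection details are omitted.
    EGLContext createGLES2Context(EGLNativeDisplayType nativeDisplay,
            EGLNativeWindowType nativeWindow, EGLSurface* pSurface)
    {
        EGLDisplay display = eglGetDisplay(nativeDisplay);
        eglInitialize(display, nullptr, nullptr);

        // Ask for a config that can back an OpenGL ES 2 window surface.
        const EGLint configAttrs[] = {
            EGL_RENDERABLE_TYPE, EGL_OPENGL_ES2_BIT,
            EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
            EGL_RED_SIZE, 8, EGL_GREEN_SIZE, 8, EGL_BLUE_SIZE, 8,
            EGL_NONE
        };
        EGLConfig config;
        EGLint numConfigs = 0;
        eglChooseConfig(display, configAttrs, &config, 1, &numConfigs);

        eglBindAPI(EGL_OPENGL_ES_API);
        const EGLint contextAttrs[] = { EGL_CONTEXT_CLIENT_VERSION, 2, EGL_NONE };
        EGLContext context = eglCreateContext(display, config, EGL_NO_CONTEXT,
                contextAttrs);

        *pSurface = eglCreateWindowSurface(display, config, nativeWindow, nullptr);
        eglMakeCurrent(display, *pSurface, *pSurface, context);
        return context;
    }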

A lot of things are already implemented the way they should be for a mobile system. That means that, for example, bitmaps are loaded (and generated, and read back from texture memory…) in either RGB or BGR pixel formats depending on the flavor of OpenGL used (there’s a sketch of this below), and the vertex arrays are smaller now, which saves bandwidth. Still, there’s a lot of optimization left to do. Our next step is getting things stable and fast. We want hardware video decoding, compressed textures – and in general, we’ll be profiling to find spots that take more time than they should.
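As an illustration of that pixel format decision, here is a minimal sketch with hypothetical names, not libavg’s actual code: desktop OpenGL accepts BGRA uploads directly, while core OpenGL ES 2 only takes RGBA unless the GL_EXT_texture_format_BGRA8888 extension is present, so the channel order is chosen when the bitmap is created instead of swizzling at upload time.

    // Sketch with hypothetical names: pick the channel order for bitmaps at
    // load time, depending on the flavor of OpenGL in use.
    enum class PixelFormat { R8G8B8A8, B8G8R8A8 };

    PixelFormat preferredTexturePixelFormat(bool usingGLES)
    {
        // Desktop OpenGL uploads BGRA directly; core OpenGL ES 2 only accepts
        // RGBA unless GL_EXT_texture_format_BGRA8888 is available, so default
        // to RGBA there and avoid swizzling when the texture is uploaded.
        return usingGLES ? PixelFormat::R8G8B8A8 : PixelFormat::B8G8R8A8;
    }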