Video Nodes

As you might guess, a video node's main purpose is to display a running video. It does this by decoding a video file and displaying the results on the screen. libavg's video support is heavily optimized, so several HD (1080p) videos can play back at once. Also, the per-video overhead is small, so several hundred tiny videos can be played back as well.

While video playback in libavg is conceptually easy to use, video files in general are notoriously error-prone. This is not limited to libavg and is a result of several factors. The first is that video encoding and decoding is a complicated subject. Decoding requires balancing CPU/GPU power and network bandwidth (or storage) used as well as human perception issues (read: "how many corners can you cut before people start noticing that something is wrong?"). Different applications have different needs in this area. The second is that there is a lot of competition. Because of this, lots of video formats have evolved, many with unclear specifications. Often the formats (as well as encoders and decoders) are rushed to market, so things are not always designed well or bug-free. We'll address some of the issues involved in this article.

If you'd like to know how video decoding is implemented in libavg, have a look at VideoDecoding.

Basic Playback

In general, video nodes are declared like this:

videoNode = avg.VideoNode(href="mpeg1-48x48-sound.avi",
        pos=(10,10), parent=player.getRootNode())

This video node plays back an avi file. Videos have the usual node attributes like pos, size, angle, blendmode, etc. A video can be in one of three states: stopped, playing and paused. A stopped video is not displayed, a playing video displays the running video, and a paused video displays the current frame of the video. Switching between the three states is done using the three functions play(), pause() and stop(). So, a minimal libavg program that plays back a video contains code like this:

video.py

from libavg import avg, player

canvas = player.createMainCanvas(size=(160,120))
rootNode = canvas.getRootNode()
videoNode = avg.VideoNode(href="mpeg1-48x48-sound.avi", pos=(10,10),
        parent=rootNode)
videoNode.play()
player.play()

If the video has the attribute loop=True, it will loop endlessly. Otherwise, playback stops at the last frame.
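
As an example, a looping background video could be set up like this (a minimal sketch following the code above; the file name is a hypothetical placeholder):

# loop=True makes playback restart at the beginning when the end of the
# file is reached. "background-loop.avi" is a placeholder file name.
loopingNode = avg.VideoNode(href="background-loop.avi", loop=True,
        pos=(10,10), parent=rootNode)
loopingNode.play()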

Seeking

libavg supports seeking in a video using one of two functions:

seekToTime(millisecs)

or

seekToFrame(framenum)

Seeks jump to the precise frame requested.
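
As a quick sketch (reusing the playing videoNode from the example above):

videoNode.seekToTime(1000)   # Jump to the one-second mark.
videoNode.seekToFrame(25)    # Jump directly to frame 25.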

Multithreading Issues

In the standard configuration, seeking involves a short delay. This is because decoding is performed in separate threads. The same delay appears when looping or when transitioning from the stopped state directly to playing. Most other video players work this way as well, and the delay is pretty short (on the order of 1-3 frames at 60 Hz), so there is usually nothing to worry about. Also, multithreading distributes processing to more of the computer's cores, so performance in general is increased by using multiple threads.

However, if you want to seek continuously in a video, the delay gets in the way. For this reason, multithreaded decoding can be turned off by setting threaded=False. This attribute needs to be set on creation of the video node. It causes all decoding to happen in the main thread. With the right setup, seeking in full HD videos can happen at 30 Hz when threaded=False is set. The video needs to be in an appropriate format for this to work - see Video Formats below for more information.

Use cases for this are applications that

  • allow 'scrubbing' in videos (seeking by dragging a scroll bar or similar; see the sketch after this list) or
  • use videos as a repository of images to be exchanged quickly. As an example, a video can be composed of images that rotate an object around an axis. The frame selected determines the angle of rotation.
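
Here is a minimal scrubbing sketch. It assumes libavg's publish/subscribe event API (subscribe() with avg.Node.CURSOR_MOTION), a hypothetical I-frame-only file rotation.avi, and a node attached directly to the root node at the origin; clamping and drag handling are kept deliberately simple:

# Non-threaded decoding so seeks take effect without a delay.
scrubNode = avg.VideoNode(href="rotation.avi", threaded=False,
        parent=rootNode)
scrubNode.pause()   # Show a still frame; each seek updates it.

def onMotion(event):
    # Map the cursor's horizontal position inside the node to a frame
    # number (assumes the node's parent is the root node at the origin).
    relX = (event.pos.x - scrubNode.pos.x) / scrubNode.size.x
    frame = int(relX * (scrubNode.getNumFrames() - 1))
    scrubNode.seekToFrame(max(0, min(frame, scrubNode.getNumFrames() - 1)))

scrubNode.subscribe(avg.Node.CURSOR_MOTION, onMotion)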

Info Functions

After a video file has been opened by setting the state to playing or paused, the video node can be queried for information about the file. The functions getDuration() and getNumFrames() and the attribute fps give you the duration, the number of frames and the playback speed in frames per second. getCurFrame() and getCurTime() return the current playback position. In addition, several functions report information about the format of the video: getVideoCodec(), getAudioCodec(), getStreamPixelFormat() and hasAudio().
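
For example (a sketch; as noted, the node must be playing or paused before these calls return valid data):

videoNode.pause()   # Opens the file so the metadata becomes available.
print("Duration: %d ms, %d frames at %.2f fps" % (videoNode.getDuration(),
        videoNode.getNumFrames(), videoNode.fps))
print("Video codec: %s" % videoNode.getVideoCodec())
if videoNode.hasAudio():
    print("Audio codec: %s" % videoNode.getAudioCodec())
print("Pixel format: %s" % videoNode.getStreamPixelFormat())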

Audio

If a video contains audio, libavg will play back the audio. The loudness can be controlled using the volume attribute. Audio is not supported for non-threaded playback.
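
For instance (a one-line sketch, assuming the usual convention that 0 is silence and 1.0 passes the file's loudness through unchanged):

# Play the audio track at half loudness.
videoNode.volume = 0.5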

Video Formats

The format of the video file used for playback is important. It determines processor and memory usage as well as a few other aspects of playback.

To decode videos, libavg uses libavcodec. libavcodec is the decoder library behind mplayer, ffmpeg and nearly all other open-source video players and utilities. libavcodec supports a lot of video formats. Still, because of the number of formats and sub-formats available, things can go wrong. To test if libavg can play back a video, use the avg_videoplayer.py utility. The avg_videoinfo.py utility takes the names of video files as parameters and displays information (frame size, duration, codecs used for video and audio, pixel format etc.) for one or more videos.

Codecs

To choose a video format to use, a bit of background is needed. Video files are stored in a container format and encoded using one or more codecs. The container format is avi, mov, wmv or something similar. Video containers contain one or more data streams - usually one video stream, probably at least one audio stream, and possibly other streams such as subtitle streams. These data streams are each encoded and decoded using a codec. Examples of codecs are mpeg, mpeg2, mpeg4, h264, Motion jpeg and Quicktime RLE.

The choice of video codec determines performance characteristics. To save space, most codecs are lossy and throw away 'unimportant' parts of the images. Also to save space, they store most frames as difference images relative to other frames in the stream. Only once in a while, a full frame - an I-frame in mpeg lingo - is stored. Seeking to anything other than a full frame is a lot of work in these codecs: the delta frames don't contain the complete data!

As a general guideline, these are codecs that make sense in different settings:

  • h264: A lossy codec with very good compression (using delta frames), modern and popular. Files of the same quality are a lot smaller than mpeg4 or similar files.
  • Motion jpeg (mjpeg): This is essentially a sequence of jpeg images in one file, so no delta frames are used and thus fast seeking is possible. The codec is often used for video editing purposes. The files are bigger than h264 files and decoding takes more CPU time.
  • vp6: A codec used in flash video (.flv) files. Conceptually, it also belongs to the mpeg family, is lossy and uses delta frames. It doesn't compress as well as h264, but is one of the few codecs that can store videos with an alpha (transparency) channel. Unfortunately, not many authoring tools support vp6 encoding.
  • Quicktime RLE (or Quicktime animation): A second codec that supports videos with alpha. Quicktime RLE uses lossless compression, so the files are very large, but the codec is an alternative for videos with alpha if vp6 can't be used.

Wikipedia has a good article on video codecs: http://en.wikipedia.org/wiki/Video_codec

Color Formats

The format that the individual pixels of a video are stored in makes a difference as well. libavg will display videos stored in any color format (within reason), but there may be performance differences. Pixels in video files are usually stored either with RGB (red, green and blue) or YCbCr (greyscale, blue delta and red delta) components, with a possible additional alpha (transparency) channel. RGB corresponds more or less to the pixels on the monitor, while YCbCr (often loosely called YUV) formats compress better. Converting YUV pixels to RGB can take lots of processor power - but this is where libavg's playback optimization comes in. Specifically, libavg accelerates conversion of yuv420p, yuvj420p and yuva420p-encoded videos. If available, GPU shaders are used to convert the pixels. If the graphics card doesn't support shaders or isn't fast enough, there is SSE2 assembler code that will do the job instead. Of course, you need to make sure that the video is encoded in one of these formats. You can check by using avg_videoinfo.py.
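
If you want to verify this at runtime instead, a sketch like the following works (assuming getStreamPixelFormat() returns format names such as "yuv420p"):

videoNode.pause()   # The file must be open before the format can be queried.
acceleratedFormats = ("yuv420p", "yuvj420p", "yuva420p")
if videoNode.getStreamPixelFormat() not in acceleratedFormats:
    print("Warning: %s will use the slow color conversion path."
            % videoNode.href)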

For more on color formats, see: