OpenGL is well suited to calculating transformations such as rotation, scale and translation. Since the video ends up on a single rectangular plane, the vertex shader only needs to transform 4 vertices (or 5 with GL_TRIANGLE_STRIP) and map the texture onto it. This is a piece of cake for the GPU, which was designed to do this with far more vertices, so the performance bottleneck will be uploading the video frame into GPU memory and downloading it again.
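To make the "only 4 vertices" point concrete, here is a minimal CPU-side sketch in Python of the work a vertex shader does for a single quad. The function names, the radian angle unit and the scale, rotate, translate order are my assumptions for illustration, not necessarily what gltransformation does internally.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz=0.0):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def scale(sx, sy, sz=1.0):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]

def rotation_z(angle):
    """Counter-clockwise rotation around the Z axis, angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def transform(m, vertex):
    """Apply a 4x4 matrix to an (x, y, z) vertex, treated as a column vector."""
    col = (vertex[0], vertex[1], vertex[2], 1.0)
    out = [sum(m[i][k] * col[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])

# The four corners of the video plane, in triangle-strip order.
QUAD = [(-1.0, -1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, 1.0, 0.0), (1.0, 1.0, 0.0)]

def transform_quad(angle, sx, sy, tx, ty):
    """Scale first, then rotate, then translate (assumed order for this sketch)."""
    model = mat_mul(translation(tx, ty), mat_mul(rotation_z(angle), scale(sx, sy)))
    return [transform(model, v) for v in QUAD]
```

Whatever the frame size, only these four matrix-vector products are needed per frame; the per-pixel work is left to the GPU's texture sampling.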
The transformations
GStreamer already provides separate plugins, each of which handles roughly one of these transformations.
Translation
videomixer: The videomixer does translation of the video with the xpos and ypos properties.
frei0r-filter-scale0tilt: The frei0r plugin is very slow, but it has the advantage of doing scale and tilt (translate) in one plugin. This is why I used it in my 2011 GSoC. It also provides a “clip” property for cropping the video.
Rotation
rotate: The rotate element is able to rotate the video, but it has to be applied after the other transformations, unless you want borders.
Scale
videoscale: The videoscale element is able to resize the video, but has to be applied after the translation. Additionally it resizes the whole canvas, so it’s also not perfect.
frei0r-filter-scale0tilt: This plugin is able to scale the video and leave the canvas size as it is. Its disadvantage is being very slow.
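The ordering constraints noted for these elements matter because the transformations do not commute. The following Python sketch, with hypothetical helper names, applies a rotation and a translation to one 2D point in both orders.

```python
import math

def rotate(point, angle):
    """Rotate a 2D point counter-clockwise around the origin (radians)."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (x * c - y * s, x * s + y * c)

def translate(point, dx, dy):
    """Move a 2D point by (dx, dy)."""
    return (point[0] + dx, point[1] + dy)

def rotate_then_translate(point, angle, dx, dy):
    return translate(rotate(point, angle), dx, dy)

def translate_then_rotate(point, angle, dx, dy):
    return rotate(translate(point, dx, dy), angle)
```

For the point (1, 0), a quarter turn and a shift of one unit in x, the first order lands on (1, 1) and the second on (0, 2), up to floating-point rounding. A chain of separate elements fixes one such order, while a single combined matrix lets you pick it.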
So we have some plugins that do transformations in GStreamer, but you can see that using them together is practically impossible and also slow. But how slow?
Let’s see how the performance of gltransformation compares to the GStreamer CPU transformation plugins.
Benchmark time
All the commands are measured with “time”. The tests were done on the nouveau driver, using Mesa as the OpenGL implementation. All GPUs should have similar results, since not much is really calculated on them. The bottleneck should be the upload.
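If you would rather script the measurement than call the shell's time by hand, a rough Python equivalent could look like this. It is a sketch under my own assumptions: it measures wall-clock time only, and the helper names are made up.

```python
import subprocess
import sys
import time

def time_command(argv):
    """Run a command to completion; return (wall-clock seconds, exit code)."""
    start = time.perf_counter()
    proc = subprocess.run(argv, stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL)
    return time.perf_counter() - start, proc.returncode

def best_of(argv, runs=3):
    """Repeat the measurement and keep the fastest run to reduce noise."""
    return min(time_command(argv)[0] for _ in range(runs))

if __name__ == "__main__":
    # Stand-in command so the sketch runs without GStreamer installed.
    seconds, code = time_command([sys.executable, "-c", "pass"])
    print(f"exit code {code}, {seconds:.3f}s")
```

To time a real pipeline, pass the gst-launch-1.0 command as a list, for example ["gst-launch-1.0", "videotestsrc", "num-buffers=10000", "!", "fakesink"].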
Pure video generation
gst-launch-1.0 videotestsrc num-buffers=10000 ! fakesink
CPU 3.259s
gst-launch-1.0 gltestsrc num-buffers=10000 ! fakesink
OpenGL 1.168s
Cool, gltestsrc seems to run faster than the classical videotestsrc. But we are not uploading real video to the GPU! This is cheating! Don’t worry, we will do real world tests with files soon.
Rotating the test source
gst-launch-1.0 videotestsrc num-buffers=10000 ! rotate angle=1.1 ! fakesink
CPU 10.158s
gst-launch-1.0 gltestsrc num-buffers=10000 ! gltransformation zrotation=1.1 ! fakesink
OpenGL 4.856s
Oh cool, we’re twice as fast in OpenGL. This is without uploading the video to the GPU though.
Rotating a video file
In this test we will rotate an HD video file with a duration of 45 seconds. I’m replacing only the sink with fakesink. Note that the CPU rotation needs videoconverts.
gst-launch-1.0 filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! videoconvert ! rotate angle=1.1 ! videoconvert ! fakesink
CPU 17.121s
gst-launch-1.0 filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! gltransformation zrotation=1.1 ! fakesink
OpenGL 11.074s
Even with uploading the video to the GPU, we’re still faster!
Doing all 3 operations
Ok, now let’s see how we perform when doing translation, scale and rotation together. Note that the CPU pipeline still has the problems described earlier.
gst-launch-1.0 videomixer sink_0::ypos=540 name=mix ! videoconvert ! fakesink filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! videoconvert ! rotate angle=1.1 ! videoscale ! video/x-raw, width=150 ! mix.
CPU 17.117s
gst-launch-1.0 filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! gltransformation zrotation=1.1 xtranslation=2.0 yscale=2.0 ! fakesink
OpenGL 9.465s
No surprise, it’s still faster and even correct.
frei0r-filter-scale0tilt
Let’s be unfair and benchmark the frei0r plugin. It has one advantage: it can do translation and scale correctly. Rotation, however, can only be applied at the end, so no rotation at different pivot points is possible.
gst-launch-1.0 filesrc location=/home/bmonkey/workspace/ges/data/hd/fluidsimulation.mp4 ! decodebin ! videoconvert ! rotate angle=1.1 ! frei0r-filter-scale0tilt scale-x=0.9 tilt-x=0.5 ! fakesink
CPU 35.227s
Damn, that is horribly slow.
The gltransformation plugin is up to 3 times faster than that!
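For a quick overview, the ratios can be computed directly from the timings above. This is only a small Python sketch; the labels are mine, and pairing the frei0r run with the combined gltransformation run is my own choice of comparison.

```python
# Timings in seconds, copied from the runs above as (CPU, OpenGL).
TIMINGS = {
    "test source only": (3.259, 1.168),
    "rotate test source": (10.158, 4.856),
    "rotate video file": (17.121, 11.074),
    "translate + scale + rotate": (17.117, 9.465),
}

# The frei0r pipeline has no direct OpenGL twin; it is compared against
# the combined gltransformation run (an assumption of this sketch).
FREI0R_SECONDS = 35.227
GL_COMBINED_SECONDS = 9.465

def speedup(cpu_seconds, gl_seconds):
    """How many times faster the OpenGL pipeline finished."""
    return cpu_seconds / gl_seconds

def speedup_table(timings):
    """Map each benchmark label to its CPU/OpenGL ratio, rounded to 2 digits."""
    return {name: round(speedup(cpu, gl), 2)
            for name, (cpu, gl) in timings.items()}
```

With these numbers the four paired runs come out at roughly 2.8x, 2.1x, 1.5x and 1.8x, and the frei0r pipeline takes about 3.7 times as long as the combined gltransformation run.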
Results
The gltransformation plugin does all 3 transformations together correctly, and it is fast as well. Furthermore, three-dimensional transformations are possible, like rotating around the X axis or translating in Z. If you want, you can even use orthographic projection.
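To illustrate what the choice of projection changes, here is a small Python sketch. It assumes a simple pinhole camera sitting on the Z axis and looking at the origin; that setup and the function names are my assumptions, not necessarily how gltransformation configures its camera.

```python
import math

def rotate_x(point, angle):
    """Rotate a 3D point around the X axis (radians)."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (x, y * c - z * s, y * s + z * c)

def project_orthographic(point):
    """Drop the depth: distance from the camera has no effect on size."""
    x, y, _ = point
    return (x, y)

def project_perspective(point, camera_distance=5.0):
    """Scale x and y by how close the point is to a camera at z = camera_distance."""
    x, y, z = point
    depth = camera_distance - z
    if depth <= 0:
        raise ValueError("point is at or behind the camera")
    factor = camera_distance / depth
    return (x * factor, y * factor)
```

Under the orthographic projection, translating the plane in Z leaves its on-screen size unchanged, while the perspective projection shrinks it as it moves away from the camera.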
I also want to thank ystreet00 for helping me to get into the world of the GStreamer OpenGL plugins.
To run the test yourself, check out my patch for gst-plugins-bad:
https://bugzilla.gnome.org/show_bug.cgi?id=731722
Also don’t forget to use my python testing script:
https://github.com/lubosz/gst-gl-tests/blob/master/transformation.py
Graphene
gltransformation uses ebassi’s new Graphene library, which implements the linear algebra calculations needed for modern OpenGL without the fixed function pipeline.
Alternatives worth mentioning for C++ are Qt’s QMatrix4x4 and of course g-truc’s glm. These are not usable with GStreamer, and I was very happy that there was a GLib alternative.
After I wrote some tests, and with ebassi’s wonderful and quick help, Graphene was ready for use with GStreamer!
Implementation in Pitivi
To make this transformation usable in Pitivi, we need some kind of transformation interface. The last one I did was rendered in Cairo. Mathieu managed to get this rendered with the ClutterSink, but using the GStreamer OpenGL plugins with the clutter sink is currently impossible. The solution will be either to extend the glvideosink to draw an interface over the video or to make the clutter sink work with the OpenGL plugins. But I am not much of a fan of the clutter sink, since it introduced problems in Pitivi.
