Enhancing FFmpeg for superior encoding performance
Transloadit is a small team of engineers, it is code, it is a highly scalable platform, it is a bunch of quirky robots that you can tell to import and convert media. At the heart of these robots often lie open source tools that do the heavy lifting. One tool stands out in particular: FFmpeg.
History
We have used FFmpeg ever since we started back in 2009. Back then, we only had one version, which we now call v0.0.1. In 2011, we sponsored FFmpeg core developer Stefano Sabatini to tailor-make a new version for us because we wanted our customers to enjoy watermarking of WebM files, something which wasn't possible at the time.
Technology always moves fast and media encoding is no exception. To stay current, we have to keep upgrading FFmpeg. At the same time, however, we don't want to break existing setups for customers due to backwards compatibility issues. Transloadit offers convenient presets, for instance to optimize a video for iPad, that are relatively easy to port, but we also offer fine-grained control over FFmpeg's behavior for power users. This last category of customers in particular would be understandably upset if their fine-grained instructions were suddenly no longer supported because of an upgrade.
Multiple versions
The desire to move forward in a way that would not break existing setups led us to engineer a system where new versions of our encoding tools can co-exist, and be completely opt-in.
So far, we have introduced the following stack versions:
v0.0.1 on July 13, 2009
v1.0.0 on April 27, 2011
v2.0.0 on July 7, 2013
v2.1.0 on November 12, 2013
v2.2.3 on June 3, 2014
We have always had custom builds, which is why we render a dedicated page that features all the supported formats and codecs per stack.
Soft launches
Over the years, we have accumulated 9905 automated tests, many of which make use of visual diffs to ensure our encoding is working as expected. Even so, there is always that odd scenario where an odd end-user produces an odd mp4 with an odd second video stream that causes problems for which we didn't have tests yet. Of course, at the end of that day, we will have those tests in place.
For this reason, we have thus far always chosen to do soft launches for our stack versions. We typically work together with a handful of customers, and, as confidence in the new build grows, we begin recommending it to more and more customers.
Ultimately, our documentation will recommend the version as the stack of choice. Organic upgrading.
Since we never deprecated a stack, this approach has worked well for us and our customers. However, that might change and we'll explain why.
Deprecation
Thus far, we have never deprecated a stack. However, the truth is that with every stack we add, our deploys become a little bit slower and our customer support load increases slightly. Also, with every operating system upgrade that we roll out to our clusters, ancient stacks become harder to compile and less capable of running smoothly. Even though we are moving to containers, which will help with these compatibility issues, they will not decrease our deploy, support, and maintenance load.
This is why we are now considering deprecating our 2009 v1.0.0 FFmpeg stack in the coming year.
We felt it would be best to inform you about this as soon as possible. We certainly won't do anything reckless. It is only because we thought about this today that we are sharing this with you at this time.
FFmpeg stack v2.2.3
note: Our versioning may look like SemVer, but it is not. It is internal Transloadit versioning that has no public semantic significance. Any version can break backwards compatibility and should be treated as such when upgrading.
Our latest stack version is the one we are most proud of. Sure, v1.0.0 may have been in production longer, and she certainly was ahead of her time, but we have invested more effort into v2.2.3 than into any other version. It has already been in production for more than a year. It has seen more traffic than any of its predecessors. We rarely ever (never is a dangerous word from which we like to steer clear) see issues with it that can be blamed on it being a bad build.
Its use by some of our biggest enterprise customers, with the edgiest of cases, has given us the confidence to wholeheartedly recommend this build to all customers, for all of their audio and video encoding needs.
Upgrading
Please consider these example instructions in which incoming uploads are converted to iPad format, preview thumbnails are extracted, and all results are then exported to your private server via SFTP:
steps:
  ipad:
    use   : ":original"
    robot : "/video/encode"
    preset: "ipad-high"
  thumbnails:
    use  : "ipad"
    robot: "/video/thumbs"
  store:
    use         : [ ":original", "ipad", "thumbnails" ]
    robot       : "/sftp/store"
    user        : "transloadit-uploader"
    host        : "my.website.com"
    path        : "./transloadit-uploads"
    url_template: "https://my.website.com/transloadit-uploads/${file.url_name}"
note: These instructions have been enhanced for readability, but should be encoded as valid JSON when you feed them to our service.
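As a minimal sketch of that conversion, the example above can be written as a plain Python dict and serialized with the standard `json` module. The variable names `instructions` and `params_json` are only illustrative; how you then send the JSON to the service is outside the scope of this snippet.

```python
import json

# The example Assembly Instructions from above, written as a plain Python dict.
instructions = {
    "steps": {
        "ipad": {
            "use": ":original",
            "robot": "/video/encode",
            "preset": "ipad-high",
        },
        "thumbnails": {
            "use": "ipad",
            "robot": "/video/thumbs",
        },
        "store": {
            "use": [":original", "ipad", "thumbnails"],
            "robot": "/sftp/store",
            "user": "transloadit-uploader",
            "host": "my.website.com",
            "path": "./transloadit-uploads",
            "url_template": "https://my.website.com/transloadit-uploads/${file.url_name}",
        },
    }
}

# json.dumps produces the strict JSON form: quoted keys, commas, and braces.
params_json = json.dumps(instructions, indent=2)
print(params_json)
```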
To switch these instructions so that they make use of our latest stack, add an ffmpeg_stack parameter to the encoding Steps:
steps:
  ipad:
    use         : ":original"
    robot       : "/video/encode"
    preset      : "ipad-high"
    ffmpeg_stack: "v2.2.3"
  thumbnails:
    use         : "ipad"
    robot       : "/video/thumbs"
    ffmpeg_stack: "v2.2.3"
  store:
    use         : [ ":original", "ipad", "thumbnails" ]
    robot       : "/sftp/store"
    user        : "transloadit-uploader"
    host        : "my.website.com"
    path        : "./transloadit-uploads"
    url_template: "https://my.website.com/transloadit-uploads/${file.url_name}"
That's it!
You can apply the ffmpeg_stack parameter to our encoding Robots, including /video/encode and /video/thumbs as shown above.
Not only will your encoding go faster, but you will also enjoy higher quality, as well as be able to support more input formats than with any previous version.
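If you keep your instructions in code, the opt-in can be applied programmatically. The following is a rough sketch, not an official helper: `with_ffmpeg_stack` and `FFMPEG_ROBOTS` are hypothetical names, and the set of Robots is limited to the two used in the example above rather than the complete list.

```python
import copy

# Robots that receive ffmpeg_stack in the example above. This set is an
# assumption for illustration, not the complete list of supported Robots.
FFMPEG_ROBOTS = {"/video/encode", "/video/thumbs"}


def with_ffmpeg_stack(instructions, version="v2.2.3"):
    """Return a copy of the instructions with ffmpeg_stack set on encoding Steps."""
    result = copy.deepcopy(instructions)
    for step in result.get("steps", {}).values():
        if step.get("robot") in FFMPEG_ROBOTS:
            # setdefault leaves a Step that already pins a stack untouched.
            step.setdefault("ffmpeg_stack", version)
    return result


original = {
    "steps": {
        "ipad": {"use": ":original", "robot": "/video/encode", "preset": "ipad-high"},
        "thumbnails": {"use": "ipad", "robot": "/video/thumbs"},
        "store": {"use": [":original", "ipad", "thumbnails"], "robot": "/sftp/store"},
    }
}

upgraded = with_ffmpeg_stack(original)
print(upgraded["steps"]["ipad"])
```

Because the function works on a deep copy and only fills in missing values, it can be rolled out for a small share of Assemblies while the original instructions stay unchanged.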
Caveats
Too good to be true? Well, no, but there are two things to watch out for: mapping and changed parameter names.
Mapping
All v2.2.3 presets (like android or iphone) have been adjusted to discard any data and subtitle streams. Very few customers used those streams, but because we previously tried to give them a place in the output files, while sometimes failing at that, we inadvertently caused a lot of hair-pulling for the majority of our customers who "just want results".
If you rely on data or subtitle streams, you can still use v2.2.3 presets and benefit from the other improvements in this release, but you will have to overrule the map parameter with 0. This way, all input streams will find their place in the output file, at the cost of some files breaking FFmpeg, since it will try to make sense of unsupported data streams.
One customer reported an issue with an input file that had two video streams. Our v2.2.3 presets will try to give all video and audio streams a place in the new file. If you are only interested in a single video and audio stream, set the mapping to:
ipad:
  use         : ":original"
  robot       : "/video/encode"
  preset      : "ipad-high"
  ffmpeg_stack: "v2.2.3"
  ffmpeg:
    map: [
      "0:v:0?"
      "0:a:0?"
      "-0:d"
      "-0:s"
    ]
While this gives you the most resilient mapping, take note that a second stream of the same media type would be dropped.
To be honest, that does not seem to be a problem for end-users in 99% of the cases that we see, so we are considering making this the default for our next stack's presets that target handhelds.
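To make the two mapping variants more concrete, here is a small sketch of how a map value could expand into FFmpeg's repeated `-map` option. How our service builds the actual command line is not described in this post, so treat `map_args` as a hypothetical helper that assumes one `-map` argument per entry. In FFmpeg's own syntax, a trailing `?` makes a mapping optional, and a leading `-` makes it a negative mapping that excludes the matching streams.

```python
def map_args(mapping):
    """Expand a `map` value into repeated FFmpeg `-map` arguments."""
    # Accept a single value (such as 0) as well as a list of stream specifiers.
    values = mapping if isinstance(mapping, (list, tuple)) else [mapping]
    args = []
    for value in values:
        args += ["-map", str(value)]
    return args


# First video and audio stream only, without data and subtitle streams.
first_streams_only = ["0:v:0?", "0:a:0?", "-0:d", "-0:s"]
print(map_args(first_streams_only))
# ['-map', '0:v:0?', '-map', '0:a:0?', '-map', '-0:d', '-map', '-0:s']

# Keep every stream of the first input, including data and subtitles.
print(map_args(0))
# ['-map', '0']
```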
We always recommend exporting :original files along with the encoding results, so that if you are unhappy with your settings, you can easily replay the Assembly with new encoding parameters.
Changed parameter names
Where FFmpeg used to work with a parameter -b for video bitrate and -ab for audio bitrate, it is moving towards a -b:v and -b:a syntax. This is also true for many other parameters, such as -codec:v vs -codec:a.
Apart from being more consistent, it also allows you to be more expressive. For instance, if you just want to specify the codec for the second audio stream, you could write -codec:a:1 (the first stream is indexed 0, so the second is 1).
So, if you want to overrule a v2.2.3 ipad-high preset's audio bitrate, you have to be aware that ab has now become b:a, or you risk that your preset modifiers have no effect:
ipad:
  use         : ":original"
  robot       : "/video/encode"
  preset      : "ipad-high"
  ffmpeg_stack: "v2.2.3"
  ffmpeg:
    "b:a": 144000
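If you have many existing preset modifiers, a small rename pass can help catch old names before you switch stacks. This is a sketch under stated assumptions: `modernize` and `RENAMED` are hypothetical names, the `b` and `ab` pairs come from the text above, the `vcodec` and `acodec` pairs are FFmpeg's own aliases for `codec:v` and `codec:a` added for illustration, and the video bitrate value is made up.

```python
# Old option name -> stream-specific replacement. The first two pairs come
# from the text above; vcodec/acodec are FFmpeg aliases added for illustration.
RENAMED = {
    "b": "b:v",
    "ab": "b:a",
    "vcodec": "codec:v",
    "acodec": "codec:a",
}


def modernize(ffmpeg_params):
    """Return a copy of an ffmpeg parameter dict that uses the newer option names."""
    updated = {}
    for key, value in ffmpeg_params.items():
        new_key = RENAMED.get(key, key)
        # A new-style key that is already present wins over a renamed old one.
        if new_key != key and new_key in ffmpeg_params:
            continue
        updated[new_key] = value
    return updated


# 1200000 is a made-up video bitrate, used only to show the rename.
print(modernize({"ab": 144000, "b": 1200000}))
# {'b:a': 144000, 'b:v': 1200000}
```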
Concluding
We are seeing big improvements with our latest FFmpeg stack. It's faster, delivers higher quality, supports adaptive live streaming (HLS), and much more.
We recommend that all of our customers start using it, even if initially just for a small percentage of their Assemblies, and report to us any issues they may find beyond the caveats that we noted.
Later this year, mostly depending on your feedback, we plan to deprecate our then 7 year old v1.0.0 stack, an occasion on which we will certainly raise a glass together!