tl;dr: dav1d is in very good shape

If you want a quick summary of this post:

  • dav1d now covers the whole AV1 spec and all its features, for both 8-bit and 10-bit depths,
  • dav1d is very fast: up to 400% faster (more fps) than the libaom decoder, and very often 100% faster.

Now is the right time to integrate it in your products!

Read the following for more details...

A few reminders about dav1d

AV1 is a new video codec by the Alliance for Open Media, composed of most of the important Web companies (Google, Facebook, Netflix, Amazon, Microsoft, Mozilla...).

AV1 has the potential to be up to 20% better than the HEVC codec, and its patent license is totally free, while HEVC patent licenses are insanely expensive and very confusing.

The reference decoder for AV1 is great, but it's a research codebase, so it has a lot of room for improvement.

Therefore, the VideoLAN, VLC and FFmpeg communities have started to work on a new decoder, sponsored by the Alliance for Open Media, in order to create the reference optimized decoder for AV1.

Features

We launched dav1d exactly 2 months ago, during VDD.

We have done a lot of work since then. And by "we", I mean mostly the others. :)
There are now more than 500 commits from 29 contributors from different open source communities. This is a good result for a new open source project.

First, we've completed all the features, including Film Grain, Super-Res, Scaled References, and other more obscure features of the bitstream. This covers both 8-bit and 10-bit, of course.
We also improved the public API.

Then, we've fuzzed the decoder a lot: on OSS-Fuzz, we are now above 99% of functions covered and 97% of lines covered, and we usually fix all the issues within a couple of days. This should give you confidence that AV1 decoding with dav1d is secure.

Finally, we've written a lot of assembly, mostly for modern desktop CPUs, but work has started for mobile and older desktop CPUs.
We even reduced the size of the C code!

Performance

Today, dav1d is very fast on AVX2 processors, which should cover a bit more than 50% of the CPUs used on the desktop. We have written 95% of the code needed for AVX2, so a bit more speed is still achievable.

We're readying the SSE and ARM optimizations to do the same. They should be very fast too, in the coming weeks.

The following graphs compare dav1d and aomdec, both built from the top of their master branches (and yes, aomdec has CONFIG_LOWBITDEPTH=1).
This was done on 64-bit Windows 10, using precompiled binaries.

The clips are taken from Netflix, Elecard, and YouTube, because they don't use the same encoder parameters and don't have the same bitstream features.
Film Grain is not run on the CPU, so it is not visible here.

Haswell

Here, on Haswell (i7-4710, a 4-year-old CPU with 4 cores), are the results:

And expressed as a percentage relative to libaom:

On average we get 2.49x, and we even get 3.48x on the YouTube Summer clip!

Zen

With a more modern Zen machine (Ryzen 5 1600, 6 cores HT), here are the results:

And expressed as a percentage relative to libaom:

The average is even higher at 3.49x, and we even get 5.27x on the YouTube Summer clip!
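
The speedup factors quoted for Haswell and Zen and the "% faster" figures in the summary are two views of the same measurement. As a quick sanity check, here is a minimal Python sketch of the conversion, assuming that "N% faster" means (dav1d fps / libaom fps - 1) x 100. The function name is purely illustrative and is not part of any dav1d tooling.

```python
def percent_faster(speedup: float) -> int:
    """Convert a speedup factor (e.g. 2.49 for "2.49x") into "percent faster".

    Assumption: "N% faster" means the fps ratio minus one, times 100.
    """
    return round((speedup - 1.0) * 100)


# Speedup factors quoted in this post (dav1d fps / libaom fps).
quoted = {
    "Haswell, average": 2.49,
    "Haswell, YouTube Summer": 3.48,
    "Zen, average": 3.49,
    "Zen, YouTube Summer": 5.27,
}

for label, factor in quoted.items():
    print(f"{label}: {factor}x = {percent_faster(factor)}% faster")
```

Under that assumption, the 5.27x Zen result is what sits behind "up to 400% faster", and even the 2.49x Haswell average is well past 100% faster.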

Global comparison

If we put both on the same graphs, here is what we have:

Threading

As we explained in our talks at VDD and at Demuxed, dav1d's threading is quite innovative, and should scale way better than libaom's.

On an even less powerful machine, an i5-4590 with 4 cores/4 threads, here are our results for the YouTube Summer clip:

You can see that dav1d scales better across threads than libaom.

Conclusion

dav1d is very fast, dav1d is almost complete, dav1d is cool.

We're smoothing out the rough edges for a release soon, and we hope that Firefox 65 will ship with dav1d for AV1 decoding.

On the other platforms, SSE and ARM assembly will follow very quickly, and we're already as fast on ARMv8. Stay tuned for more!

I would like to thank Ewout ter Hoeven (EwoutH) from the community, who did all the testing, numbers and computations.