[FFmpeg-devel] [WIP] libcodec2 wrapper + de/muxer in FFmpeg

Tomas Härdin tjoppen at acc.umu.se
Thu Aug 3 18:45:22 EEST 2017


On Thu, 2017-08-03 at 13:00 +0200, Nicolas George wrote:
> Le sextidi 16 thermidor, an CCXXV, Tomas Härdin a écrit :
> > With both the raw demuxer and the encoder you need to specify the
> > desired mode, like this:
> The encoder could default to one the two.

Default on-the-air is 1300 (aka FreeDV 1600), so that seems like a
reasonable default

> > Some remarks:
> > 
> > * I had to make the ffmpeg CLI not complain about the 700 modes,
> > since
> > it thinks encoding at below 1 kbps is a user error
> It is just a warning. I am not really a fan of a special case like
> that,
> and it would be better split into a separate patch I think.

There's other codec-specific special cases in ffmpeg.c, like the one
for AV_CODEC_ID_VP9 which doesn't even have an explanation. Separate
patch is a good idea however.

> > * I have duplicated some minor functions in libcodec2 in
> > libavcodec/codec2utils.*. This avoid having to link libcodec2 just
> > for
> > remuxing, and in case someone writes a native decoder for
> > libavcodec
> The license allows it, but you neglected to give copyright
> attribution.

Alright, I can fix that. Although the only thing lifted straight are
the mode #defines

> > * Not sure if codec2utils should go in libavutil, libavcodec felt
> > good
> > enough
> Definitely libavcodec.

Cool.

> > 
> > * The demuxer tries to read up to 0.1 seconds worth of frames to
> > speed
> > things up a little without making seeking too broken. 3 frames = 12
> > bytes for the 700 bit/s modes, which decode to 960 samples
> I do not like the sound of that, but I will see the code.
> 
> > 
> > * The decoder is able to deal with multiple frames at a time, the
> > encoder does not need to
> ???

The encoder is given frame_size number of samples each time. Are there
times where this is not the case? I haven't set
AV_CODEC_CAP_SMALL_LAST_FRAME.

> > Feel free to bikeshed around whether extradata should be the entire
> > .c2
> > header or just the mode byte. It really only matters if we go with
> > RIFF
> > or ISOM mappings (.wav .avi .mov and friends), which I decided to
> > leave
> > out for now.
> It matters for inclusion in any format: Matroska, Nut, etc. Is
> anybody
> considering normalization?

I raised the possibility on [Freetel-codec2], either by coding mode as
part of the TwoCC (0xC208 for mode 8 for examples) or via extradata. I
imagine version + mode + flags in extradata would not be unreasonable.

> > +libcodec2_decoder_deps="libcodec2"
> > +libcodec2_decoder_select="codec2utils"
> > +libcodec2_encoder_deps="libcodec2"
> > +libcodec2_encoder_select="codec2utils"
> This and the similar hunks are not necessary, see below the comments
> about the Makefile.

I presume you mean codec2utils and not the libcodec2 deps.

> > +OBJS-$(CONFIG_CODEC2UTILS)             += codec2utils.o
> You do not need a separate configuration option, you can just depend
> on
> the actual visible option:
> 
> +OBJS-$(CONFIG_CODEC2_DEMUXER)           += codec2utils.o
> +OBJS-$(CONFIG_CODEC2_MUXER)             += codec2utils.o
> 
> Having the same file several times will not cause it to be included
> several times.
> 
> > +OBJS-$(CONFIG_LIBCODEC2_DECODER)          += libcodec2.o
> > +OBJS-$(CONFIG_LIBCODEC2_ENCODER)          += libcodec2.o
> +OBJS-$(CONFIG_LIBCODEC2_DECODER)          += libcodec2.o
> codec2utils.o
> +OBJS-$(CONFIG_LIBCODEC2_ENCODER)          += libcodec2.o
> codec2utils.o

Fixed.

> > +++ b/libavcodec/codec2utils.c
> > @@ -0,0 +1,118 @@
> > +/*
> > + * codec2 utility functions
> > 
> > + * Copyright (c) 2017 Tomas Härdin
> Missing copyright attribution if the imported code is non-trivial
> (better err on the side of more attribution).

Guess I'll nick the comment from the relevant files in libcodec2.


> > 
> > +int avpriv_codec2_mode_from_str(void *logctx, const char *modestr)
> > {
> Please normalize the coding style. Same below.

I presume you mean putting the function opening braces on their own
lines like the rest of FFmpeg. Fixed.

> > 
> > +    //statically assert the size of avpriv_codec2_header
> > +    //putting it here because all codec2 things depend on
> > codec2utils
> > +    switch(0) {
> > +    case 0:
> > +    case sizeof(avpriv_codec2_header) == 7: //if false then the
> > compiler will complain
> > +        break;
> > +    }
> I see how it works. This is a nice trick. Want to make it an official
> macro FF_ASSERT_STATIC()?

Sure. Could go in avutil.h or something. Maybe even inside a static
function with a fake name like:

#define FF_ASSERT_STATIC(x, name) static void name(){switch(0){...}}

Didn't do that in this patchset however.

> > +int avpriv_codec2_mode_bit_rate(void *logctx, int mode) {
> > 
> > +    int ret = 8 * 8000 * avpriv_codec2_mode_block_align(logctx,
> > mode) / avpriv_codec2_mode_frame_size(logctx, mode);
> > +    if (ret <= 0) {
> > +        av_log(logctx, AV_LOG_WARNING, "unknown codec2 mode %i,
> > can't estimate bitrate\n", mode);
> I bet you did not test the invalid case. Otherwise, I think you would
> have found a division by 0.

Yep, I found it as soon as I crafted a file with an unknown mode.
Running afl-fuzz right now to see if there's some other fun problems.

> > +int avpriv_codec2_mode_frame_size(void *logctx, int mode) {
> > +    switch (mode) {
> > 
> > +    case CODEC2_MODE_3200: return 160;
> > +    case CODEC2_MODE_2400: return 160;
> > +    case CODEC2_MODE_1600: return 320;
> > +    case CODEC2_MODE_1400: return 320;
> > +    case CODEC2_MODE_1300: return 320;
> > +    case CODEC2_MODE_1200: return 320;
> > +    case CODEC2_MODE_700:  return 320;
> > +    case CODEC2_MODE_700B: return 320;
> > +    case CODEC2_MODE_700C: return 320;
> For all these mappings, I think a table would be more elegant.

Sure. it could take its size from the largest known mode ID + 1.

> > +avpriv_codec2_header avpriv_codec2_make_header(int mode) {
> > +    avpriv_codec2_header header = {
> > +        .magic = {0xC0, 0xDE, 0xC2},
> > +        //version 0.8 as of 2017-08-02 (r3345)
> > +        .version_major = 0,
> > +        .version_minor = 8,
> > +        .mode = mode,
> > +        .flags = 0,
> > +    };
> > +    return header;
> > +}
> This would be more efficient as static inline.

Fixed.

> > +#ifndef AVCODEC_CODEC2UTILS_H
> > +#define AVCODEC_CODEC2UTILS_H
> > +
> > 
> > +#include "internal.h"
> Why?

For uint8_t. But perhaps FFmpeg relies on <stdint.h> existing? MSVC is
a pain etc.
Changed to #include <stdint.h>

> > 
> > +typedef struct {
> Anonymous structures are discouraged.
> 
> > 
> > +    uint8_t magic[3];
> > +    uint8_t version_major;
> > +    uint8_t version_minor;
> > +    uint8_t mode;
> > +    uint8_t flags;
> > 
> > +} avpriv_codec2_header;
> Types and structures names do not need to be namespaced when they are
> private. On the other hand, the case is not consistent with the rest
> of
> the code.

So what should I go with? Something like

typedef struct ff_codec2_header {
} ff_codec2_header;

? Using codec2_ as a prefix is likely a Bad Idea since libcodec2 does
that :)
I saw avpriv_ used in various places so I went with that.
Fixed the anonymousness at least.

> > +++ b/libavcodec/codec_desc.c
> > @@ -2657,6 +2657,13 @@ static const AVCodecDescriptor
> > codec_descriptors[] = {
> >          .props     = AV_CODEC_PROP_LOSSY,
> >      },
> >      {
> > +        .id        = AV_CODEC_ID_CODEC2,
> > +        .type      = AVMEDIA_TYPE_AUDIO,
> > +        .name      = "codec2",
> > 
> > +        .long_name = NULL_IF_CONFIG_SMALL("codec2"),
> Not helpful.

Suggestions? I could link freedv.org, or maybe a short blurb like
"codec2 (very low bitrate speech codec, see freedv.org)"

> > +typedef struct {
> > +    const AVClass *class;
> > +    struct CODEC2 *codec;
> > +    char *mode;
> > 
> > +} libcodec2_context;
> Case not consistent with the rest of the code.

My code or FFmpeg code? I see avidec.c uses AVIContext, so I assume the
latter. Changed to Codec2Context and LibCodec2Context in relevant
files.

> > 
> > +
> > +static const AVOption options[] = {
> > 
> > +    //not AV_OPT_FLAG_DECODING_PARAM since mode should come from
> > the demuxer
> ??? If it comes from the raw demuxer, it is still a decoding param,
> is
> it not?

Yes but it is carried in extradata. You shouldn't have to set it on the
decoder. Plus it would create a contradiction - "does -mode apply to
the decoder or the demuxer?"

> > 
> > +    { "mode", "codec2 mode", offsetof(libcodec2_context, mode),
> > AV_OPT_TYPE_STRING, {.str = NULL}, 0, 0, AV_OPT_FLAG_AUDIO_PARAM |
> > AV_OPT_FLAG_ENCODING_PARAM },
> I think it would be better to make the mode option an integer and use
> the CONST system to expose the valid values. Same below.

Cool, I didn't know that. That would remove the need
for avpriv_codec2_mode_from_str().

> > +static const AVClass codec2_class = {
> > +    .class_name = "libcodec2",
> > +    .item_name  = av_default_item_name,
> > +    .option     = options,
> > +    .version    = LIBAVUTIL_VERSION_INT,
> > +};
> IIRC, you cannot use the same class for two components. Does "ffmpeg
> -h
> full" still work?

Well, I get -mode listed in the two expected places:

  libcodec2 AVOptions:
    -mode              <string>     E...A... codec2 mode

  codec2 AVOptions:
    -mode              <string>     .D...... codec2 mode

To be more helpful I suppose they could be given different names like
"libcodec2 decoder", "libcodec2 encoder", "codec2raw demuxer", "codec2
(.c2) demuxer" and so on

> > +    } else {
> > +        if (avctx->extradata_size != sizeof(avpriv_codec2_header))
> > {
> > +            av_log(avctx, AV_LOG_ERROR, "must have exactly %zu
> > bytes of extradata (got %i)\n",
> > +                   sizeof(avpriv_codec2_header), avctx-
> > >extradata_size);
> return AVERROR_INVALIDDATA?

Good catch, fixed. Should AVERROR_INVALIDDATA be preferred over
AVERROR(EINVAL) perhaps? I've seen both used almost interchangably.

> > +    output = (int16_t *)frame->data[0];
> > +
> > +    for (i = 0; i < nframes; i++) {
> > 
> > +        codec2_decode(c2->codec, &output[i*avctx->frame_size],
> > &pkt->data[i*avctx->block_align]);
> I suggest:
> 
>     input  += avctx->frame_size;
>     output += avctx->frame_size;
> 
> and drop the multiplication.

Reasonable enough

> > +
> > +AVCodec ff_libcodec2_decoder = {
> > +    .name                   = "libcodec2",
> > 
> > +    .long_name              = NULL_IF_CONFIG_SMALL("codec2
> > encoder/decoder using libcodec2"),
> This is not an encoder.
> > +    .long_name              = NULL_IF_CONFIG_SMALL("codec2
> > encoder/decoder using libcodec2"),
> And this is not a decoder.

Fixed.

> > +static int codec2_probe(AVProbeData *p)
> > +{
> > +    if (p->buf_size < sizeof(avpriv_codec2_header)) {
> > +        return 0;
> > +    }
> > +
> > +    //file starts wih 0xC0DEC2
> > 
> > +    if (p->buf[0] == 0xC0 && p->buf[1] == 0xDE && p->buf[2] ==
> > 0xC2) {
> > +        return AVPROBE_SCORE_MAX;
> As pointed by Carl Eugen, this is not reliable enough. The problem is
> not misdetecting the version of files, the problem is detecting files
> that are not codec2 as codec2. A lot of files can start with 192 222
> 194.

Noted. I made the probe function a bit fancier, files where only these
three bytes match get a score of 10. Files with modes and minor
versions we don't know about get slightly penalized. There should also
be no .c2 files with version < 0.8 (as verified by Nivex on #freedv)

> > 
> > +    //replicating estimate_timings_from_bit_rate() in utils.c to
> > avoid warnings
> Please elaborate.

Without this utils.c nags about duration not being known. It would be
nice if there was a flag to shut it up for CBR formats

> > +    if (st->codecpar->extradata[0] != 0xC0 ||
> > +        st->codecpar->extradata[1] != 0xDE ||
> > +        st->codecpar->extradata[2] != 0xC2) {
> Since you do it twice, you could make it a function.

Done.

> > 
> > +        av_log(s, AV_LOG_ERROR, "not a .c2 file\n");
> > +    }
> I think setting the time base information is missing.

That is done in codec2_read_header_common()

> > 
> > +    //Read roughly 0.1 seconds worth of data.
> > +    n = st->codecpar->bit_rate / ((int)(8/0.1) * block_align) + 1;
> I do not like that block. Maybe make it an option?

I'd rather just read block_size bytes in that case, an option seems
excessive. Or maybe someone wants to transcode huge amounts of speech
as fast as possible. Hm...

-blocks_per_packet maybe? With a help string like "read multiple blocks
at a time, useful for processing large amounts of data"

> > +
> > +//based off of ff_raw_read_partial_packet()
> > +static int codec2_read_packet(AVFormatContext *s, AVPacket *pkt)
> > +{
> > +    AVStream *st = s->streams[0];
> > +    int ret, size, n, block_align, frame_size;
> > +
> > +    block_align = st->codecpar->block_align;
> > +    frame_size  = st->codecpar->frame_size;
> > > > ++        return ret;
> > +    }
> > +    av_shrink_packet(pkt, ret);
> av_get_packet()?

The SANE_CHUNK_SIZE logic in it looks a bit scary. There's a bug in
this patch however, it should use avio_read() not ffio_read_partial().
I want to be damn sure only multiples of block_align are read. Maybe
someone out there has files with 139 hours of speech in them, who
knows?

> > 
> > +        av_log(s, AV_LOG_ERROR, ".c2 files require exactly %zu
> > bytes of extradata (got %i)\n",
> IIRC, the z modifier is broken on some proprietary system. I
> personally
> do not care.

Portable size_t format strings are a pain, added a cast to int instead.

Thanks for the review. New patchset attached. Some things still TODO,
like tableizing codec2utils and the numerical -mode option thing.
Didn't compile yet because afl-fuzz is still running, and I'm going to
let it run while I'm out and about.

/Tomas
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Don-t-complain-about-codec2-s-700-bit-s-modes-in-ffm.patch
Type: text/x-patch
Size: 1127 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20170803/c374d732/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-codec2-muxers-demuxers-and-en-decoder-via-libcod.patch
Type: text/x-patch
Size: 34426 bytes
Desc: not available
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20170803/c374d732/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20170803/c374d732/attachment.sig>


More information about the ffmpeg-devel mailing list