[FFmpeg-devel] [PATCH] libspeex Speex encoding

Tue Oct 27 23:09:32 CET 2009

On Tue, Oct 27, 2009 at 05:54:51PM -0400, Justin Ruggles wrote:
> Michael Niedermayer wrote:
> 
> > On Sun, Oct 25, 2009 at 09:04:45AM -0400, Justin Ruggles wrote:
> >> Hi,
> >>
> >> This patch combines parts of my previous libspeex encoding patch with
> >> parts of the one sent by Art Clarke.
> >>
> >> The rate control is not as intuitive to use as I would like, but it
> >> works.  libspeex has the option to have the library choose a CBR bitrate
> >> based on a quality setting.  Providing that option doesn't really fit
> >> well into our current system since there is no way to tell if the user
> >> is specifying CBR quality or VBR quality.  So instead it just uses
> >> bitrate for CBR and quality for VBR like our other audio encoders.  The
> >> default bitrate of 64kbps is higher than the maximum Speex bitrate, so
> >> at least it will be good quality by default.
> > 
> > [...]
> >> +static av_cold int encode_init(AVCodecContext *avctx)
> >> +{
> >> +    LibSpeexEncContext *s = avctx->priv_data;
> >> +    const SpeexMode *mode;
> >> +    uint8_t *header_data;
> >> +    int header_size;
> >> +    int32_t complexity;
> >> +
> >> +    /* channels */
> >> +    if (avctx->channels < 1 || avctx->channels > 2) {
> >> +        av_log(avctx, AV_LOG_ERROR, "Invalid channels (%d). Only stereo and "
> >> +               "mono are supported\n", avctx->channels);
> >> +        return -1;
> >> +    }
> >> +
> >> +    /* sample rate and encoding mode */
> >> +    switch (avctx->sample_rate) {
> >> +    case  8000: mode = &speex_nb_mode;  break;
> >> +    case 16000: mode = &speex_wb_mode;  break;
> >> +    case 32000: mode = &speex_uwb_mode; break;
> >> +    default:
> >> +        av_log(avctx, AV_LOG_ERROR, "Sample rate of %d Hz is not supported. "
> >> +               "Resample to 8, 16, or 32 kHz.\n", avctx->sample_rate);
> >> +        return -1;
> >> +    }
> >> +
> >> +    /* initialize libspeex */
> >> +    s->enc_state = speex_encoder_init(mode);
> >> +    if (!s->enc_state) {
> >> +        av_log(avctx, AV_LOG_ERROR, "Error initializing libspeex\n");
> >> +        return -1;
> >> +    }
> >> +    speex_init_header(&s->header, avctx->sample_rate, avctx->channels, mode);
> >> +
> >> +    /* rate control method and parameters */
> >> +    if (avctx->flags & CODEC_FLAG_QSCALE) {
> >> +        /* VBR */
> >> +        s->header.vbr = 1;
> >> +        speex_encoder_ctl(s->enc_state, SPEEX_SET_VBR, &s->header.vbr);
> >> +        s->vbr_quality = av_clipf(avctx->global_quality / (float)FF_QP2LAMBDA,
> >> +                                  0.0f, 10.0f);
> >> +        speex_encoder_ctl(s->enc_state, SPEEX_SET_VBR_QUALITY, &s->vbr_quality);
> >> +        avctx->bit_rate = 0;
> >> +    } else {
> >> +        /* CBR */
> >> +        s->header.bitrate = avctx->bit_rate;
> >> +        speex_encoder_ctl(s->enc_state, SPEEX_SET_BITRATE, &s->header.bitrate);
> >> +        speex_encoder_ctl(s->enc_state, SPEEX_GET_BITRATE, &s->header.bitrate);
> >> +        /* stereo side information adds about 800 bps to the base bitrate */
> > 
> >> +        avctx->bit_rate = s->header.bitrate + (avctx->channels == 2 ? 800 : 0);
> > 
> > avctx->bit_rate is set by the user and not the encoder
> 
> The reason for this is for feedback to the user.  libspeex uses the
> closest supported bitrate for the selected mode that is less than or
> equal to the requested bitrate.  I guess this could be taken out and
> just let libspeex do whatever without telling the user except maybe in a
> debug printout.  The bitrate is a nominal value anyway.  The exact
> bitrate also depends on the number of frames per packet because frames
> are not byte aligned, packets are.

Hmm, I've no good comment/idea ATM :(
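
For reference, the byte-alignment effect described above can be sketched
in a few lines. This is only an illustration: the helper names are made up
and are not part of the patch or of libspeex, and the numbers in the usage
note are example inputs, not values read back from the library.

```c
#include <stdint.h>

/* Bits used by one frame at a nominal bitrate (hypothetical helper). */
static int frame_bits(int bitrate, int frame_size, int sample_rate)
{
    return (int)(((int64_t)bitrate * frame_size) / sample_rate);
}

/* Frames are packed back to back; only the packet is rounded up to bytes. */
static int packet_bytes(int bits_per_frame, int frames_per_packet)
{
    return (bits_per_frame * frames_per_packet + 7) / 8;
}

/* Bitrate actually seen on the wire once packet padding is included. */
static int effective_bitrate(int bits_per_frame, int frames_per_packet,
                             int frame_size, int sample_rate)
{
    int64_t bits    = 8LL * packet_bytes(bits_per_frame, frames_per_packet);
    int64_t samples = (int64_t)frames_per_packet * frame_size;
    return (int)(bits * sample_rate / samples);
}
```

With a 160-sample frame at 8 kHz and a nominal 5950 bps, one frame is 119
bits. A single frame per packet rounds up to 15 bytes (6000 bps effective),
while 8 frames per packet fit exactly into 119 bytes (5950 bps).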


[...]
> > 
> > [...]
> >> +static int encode_frame(AVCodecContext *avctx, uint8_t *frame, int buf_size,
> >> +                        void *data)
> >> +{
> >> +    LibSpeexEncContext *s = avctx->priv_data;
> >> +    void *samples = data;
> >> +    int nframes, i;
> >> +
> >> +    if (!avctx->frame_size)
> >> +        return 0;
> >> +
> >> +    /* handle last packet, which may have fewer frames-per-packet and/or
> >> +       fewer samples in the last frame */
> >> +    nframes = s->header.frames_per_packet;
> >> +    if (avctx->frame_size < nframes * s->header.frame_size) {
> >> +        nframes = (avctx->frame_size + s->header.frame_size - 1) /
> >> +                  s->header.frame_size;
> >> +        if (avctx->frame_size != s->header.frame_size * nframes) {
> >> +            /* allocate new buffer to pad last frame */
> >> +            int new_samples_size;
> > 
> >> +            avctx->frame_size = nframes * s->header.frame_size;
> > 
> > I am not sure if this violates the API, but at least I would say it is
> > unexpected by the application
> 
> Hmmm. Yeah, if it doesn't violate API, it is at least not documented.
> Is there another way to report the correct duration of the output frame
> if the user gives, for example, 500 samples and the output frame
> represents 640 due to padding?

decode_audio takes its input from the fields of an AVPacket.
encode_audio should produce an AVPacket (various advantages, like reallocating
a too small buffer, make this a good idea).
If we returned an AVPacket, then there would be a duration field that
could naturally carry this information.
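
A rough sketch of what I mean, with a made-up packet struct (this is not
the real AVPacket, and the function names are hypothetical). The encoder
rounds the sample count up to whole codec frames and reports the padded
length in the packet instead of writing to avctx->frame_size:

```c
#include <stdint.h>

/* Hypothetical packet type, for illustration only; not the real AVPacket. */
typedef struct ExamplePacket {
    uint8_t *data;
    int      size;      /* encoded bytes */
    int      duration;  /* samples represented, including padding */
} ExamplePacket;

/* Number of codec frames needed to hold nb_samples (rounds up). */
static int frames_needed(int nb_samples, int frame_size)
{
    return (nb_samples + frame_size - 1) / frame_size;
}

/* Duration in samples once the last frame has been padded. */
static int padded_duration(int nb_samples, int frame_size)
{
    return frames_needed(nb_samples, frame_size) * frame_size;
}

/* Report the padded duration through the packet, not the codec context. */
static void set_packet_duration(ExamplePacket *pkt, int nb_samples,
                                int frame_size)
{
    pkt->duration = padded_duration(nb_samples, frame_size);
}
```

Assuming a codec frame size of 320 samples, 500 input samples need 2
frames, and the packet duration would be 640, which matches the example
in the quoted question.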


> 
> > 
> >> +            new_samples_size  = avctx->frame_size * avctx->channels *
> >> +                                (avctx->sample_fmt == SAMPLE_FMT_FLT ?
> >> +                                sizeof(float) : sizeof(int16_t));
> >> +            samples = av_mallocz(new_samples_size);
> >> +            if (!samples)
> >> +                return AVERROR(ENOMEM);
> >> +            memcpy(samples, data, new_samples_size);
> > 
> > I think the application is, or at least should be, required to allocate
> > full frames even for the possibly smaller last one
> 
> Where should this be documented?  avcodec_encode_audio()?

Yes, and close to CODEC_CAP_SMALL_LAST_FRAME.
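
Until that requirement is documented, the memcpy in the quoted hunk is only
safe if the caller really allocated a full frame, since it reads
new_samples_size bytes from data. A minimal sketch of padding that copies
only the samples the caller provided could look like this. It uses plain
calloc() instead of av_mallocz() so it stands alone, and pad_last_frame()
is a made-up name, not something in the patch:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Return a zero-filled buffer of padded_samples per channel, with the
 * first in_samples per channel copied from data. The caller frees it.
 * Returns NULL on invalid arguments or allocation failure. */
static void *pad_last_frame(const void *data, int in_samples,
                            int padded_samples, int channels,
                            size_t sample_size)
{
    size_t in_bytes, out_bytes;
    void *buf;

    if (in_samples < 0 || padded_samples < in_samples || channels < 1)
        return NULL;

    in_bytes  = (size_t)in_samples     * channels * sample_size;
    out_bytes = (size_t)padded_samples * channels * sample_size;

    buf = calloc(1, out_bytes);          /* padding stays zero (silence) */
    if (!buf)
        return NULL;
    if (in_bytes)
        memcpy(buf, data, in_bytes);     /* copy only the valid input */
    return buf;
}
```

For interleaved input this keeps the valid samples in place and leaves the
tail of the last frame as silence, without reading past the end of the
caller's buffer.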


[...]


-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Republics decline into democracies and democracies degenerate into
despotisms. -- Aristotle