[FFmpeg-devel] Integrating the mod engine into FFmpeg - what is the best design approach?

Mon Aug 9 14:46:00 CEST 2010

On Thu, Aug 05, 2010 at 02:54:07AM +0200, Stefano Sabatini wrote:
> On date Wednesday 2010-08-04 15:36:25 +0200, Sebastian Vater encoded:
> [...]
> > Yes, we can do this later. Anyway, in the meanwhile I made some thoughts
> > on a more precise generic integration plan for lavseq.
> > As you can currently see, I have a directory lavseq containing:
> > avsequencer.h (the connection to the rest of FFmpeg).
> > 
> > To give an overview of the AVSequencer:
> > At the root, we have avsequencer.h, i.e. the AVSequencerContext, which is
> > the only structure linking to the rest of FFmpeg.
> > 
> > root:
> > AVSequencerContext contains a list of modules (module.h and module
> > handling is implemented in module.c), the playback handler and a list of
> > available mixing engines.
> > 
> > depth 1:
> > AVSequencerModule contains songs, instruments, keyboard definitions,
> > arpeggio definitions and envelope structures.
> > 
> > depth 2:
> > AVSequencerSong contains tracks, an order list which references the
> > track numbers being played for each channel (since the sequencer is
> > internally track-based instead of pattern-based, to allow different speeds).
> > 
> > AVSequencerInstrument contains samples, which can be assigned using the
> > keyboard definition. For example, I tell the instrument to use sample
> > number 1 for C-5 but number 2 instead for C-6.
> > Instruments also determine how envelopes are used (you can assign them
> > to vibrato, tremolo, volume handling, etc.).
> > 
> > AVSequencerEnvelope contains the actual envelope data and also its
> > properties like loop points.
> > 
> > AVSequencerKeyboard contains the octave/note -> sample mapping for all
> > notes from C-0 to B-9 which are 120 entries (10 octaves * 12 notes per
> > octave).
> > 
> > AVSequencerArpeggio is mostly like AVSequencerEnvelope with the
> > difference you can specify a custom arpeggio layout and the structure is
> > designed for that.
> > 
> > depth 3:
> > AVSequencerSample contains the sample loop points, auto vibrato
> > envelopes and also the PCM data (the PCM data should later be obtained
> > by the lavc, so you can also directly use ogg/mp3/flac/wav/etc.). It
> > also contains a reference to AVSequencerSynth if it is a programmable
> > synth sound.
> > 
> > depth 4:
> > AVSequencerSynth contains a list of "machine code" instructions for
> > programming the synth sound "DSP", a symbol table for human-readability
> > and properties like initial variables (16 general purpose registers).
> > 
> > Hierarchy overview:
> > SequencerContext (avsequencer.[hc])
> >     Module (module.[hc])
> >         Song (song.[hc])
> >             Track (track.[hc])
> >                 TrackData
> >                     TrackDataEffect
> >             OrderList (order.[hc])
> >                 OrderData
> >         Instrument (instr.[hc])
> >             Sample (sample.[hc])
> >                 SynthSound (synth.[hc])
> >         Envelope (instr.[hc])
> >         Keyboard (instr.[hc])
> >         Arpeggio (instr.[hc])
> >     Mixer (allmixers.c, mixer.[hc])
> >         Null mixer (null_mix.[hc])
> >         Low quality PCM mixer (lq_mix.[hc])
> >         High quality PCM mixer (hq_mix.[hc])
> >         FUTURE: OPL2/3 (AdLib/etc.) FM synthesizer
> >                         SID chip FM (as found in C64) synthesizer
> >                         Floating point mixers
> >     Player (player.[hc])
> 
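To make the keyboard part of the description above concrete, here is a
minimal sketch of the octave/note -> sample mapping (10 octaves * 12 notes =
120 entries). All names (SketchKeyboard, sketch_*) are hypothetical and only
for illustration; this is not the actual AVSequencerKeyboard layout from the
branch, and the note numbering (0 = C, 11 = B) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_OCTAVES          10  /* C-0 .. B-9 */
#define SKETCH_NOTES_PER_OCTAVE 12
#define SKETCH_KEYBOARD_ENTRIES (SKETCH_OCTAVES * SKETCH_NOTES_PER_OCTAVE)

/* Hypothetical stand-in for AVSequencerKeyboard (assumed layout). */
typedef struct SketchKeyboard {
    uint16_t sample[SKETCH_KEYBOARD_ENTRIES]; /* sample number per note */
} SketchKeyboard;

/* Flatten (octave, note) into an index 0..119; note 0 is C, 11 is B. */
static int sketch_note_index(unsigned octave, unsigned note)
{
    assert(octave < SKETCH_OCTAVES && note < SKETCH_NOTES_PER_OCTAVE);
    return (int)(octave * SKETCH_NOTES_PER_OCTAVE + note);
}

/* Map every note of the keyboard to the same sample number. */
static void sketch_keyboard_fill(SketchKeyboard *kb, uint16_t sample)
{
    for (int i = 0; i < SKETCH_KEYBOARD_ENTRIES; i++)
        kb->sample[i] = sample;
}

/* Assign a sample number to one specific note. */
static void sketch_keyboard_assign(SketchKeyboard *kb, unsigned octave,
                                   unsigned note, uint16_t sample)
{
    kb->sample[sketch_note_index(octave, note)] = sample;
}

/* Look up which sample an instrument would play for a given note. */
static uint16_t sketch_keyboard_lookup(const SketchKeyboard *kb,
                                       unsigned octave, unsigned note)
{
    return kb->sample[sketch_note_index(octave, note)];
}
```

With this, the example from the description (sample number 1 for C-5 but
number 2 for C-6) is a fill with 1 followed by a single assign of 2 at
octave 6, note 0.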
> OK that's a nice description of the whole BSS design.
> 
> Sebastian is currently working on this git branch:
> http://github.com/BastyCDGS/ffmpeg-soc.git
> 
> > Open discussion points are:
> > 1. Best way of integration into rest of FFmpeg
> 
> I'm summarizing some of the designs which have already been proposed:
> please correct me if some information is missing / incorrect.
> 
> 1)
>  The MOD decoder does just one thing: decode an AVPacket to a BSS. It
>  does not know anything about the player (it doesn't even know _if_ it
>  will be played or converted to other format or fed to a visualization code).
>  Libavsequencer does just one thing: transforming a BSS into PCM audio.
>  It knows nothing about file formats (it doesn't care or know if the BSS
>  was made from a MOD file or recorded from a MIDI keyboard).
> 
>  That's why we insist on starting with the implementation of MOD ->  XM
>  conversion: it is much simpler than MOD ->  PCM conversion, and it doesn't
>  need an implementation of libavsequencer.
> 
>                            mod file - metadata                      BSS +
>                                                               sequencer SAMPLES
>  MOD file -->  MOD demuxer -------------------->  MOD decoder  ------------------>  application
> 
>  The advantages of this approach are as follows:
>  - Allows conversion from a format with more features to one with
>  fewer, without doing any mixing or sampling
>  - Makes each file format very modular (just reading the bitstream and
>  filling up the BSS)
>  - Better integration with the way FFmpeg works ATM

An audio decoder should return PCM. Doing something else requires all
applications that use libav to be changed. I don't see the point in going this
way.

There is the whole avsequencer API, which allows full and direct access to
everything. The existing audio decoder API seems sufficient for what it is
intended for, namely decoding audio. We aren't returning SAMPLE_FMT_MDCT
for AAC either.

If you want to implement xm->mod conversion before decoding to PCM,
that's OK, but the BSS goes into a field of AVCodecContext or AVFrame, and the
PCM samples could be set to 0.
That's in line with how video transcoding that reuses motion vectors is done.
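A rough sketch of that idea, using made-up stand-in types rather than the
real AVCodecContext/AVFrame: the decoder always produces a PCM buffer, the
BSS is handed along in an extra field, and when no mixing is wanted the PCM
is simply left at 0. Every name here (SketchBSS, SketchFrame, sketch_decode,
sketch_render_stub) is hypothetical, and the "bss" field is an assumed
placeholder, not an existing libavcodec field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_FRAME_SAMPLES 1024

/* Hypothetical placeholder for the parsed module data (the BSS). */
typedef struct SketchBSS {
    int num_tracks;
} SketchBSS;

/* Hypothetical stand-in for a decoded frame: PCM plus a BSS reference. */
typedef struct SketchFrame {
    int16_t pcm[SKETCH_FRAME_SAMPLES];
    size_t nb_samples;
    const SketchBSS *bss; /* assumed extra field carrying the BSS */
} SketchFrame;

/* Optional renderer: fills dst and returns the number of samples written. */
typedef size_t (*SketchRenderFn)(const SketchBSS *bss, int16_t *dst,
                                 size_t max_samples);

/* Placeholder renderer writing a ramp; NOT a real mixer. */
static size_t sketch_render_stub(const SketchBSS *bss, int16_t *dst,
                                 size_t max_samples)
{
    (void)bss;
    for (size_t i = 0; i < max_samples; i++)
        dst[i] = (int16_t)(i & 0x7f);
    return max_samples;
}

/*
 * Always return PCM to the caller. The BSS rides along in the frame;
 * if render is NULL (e.g. pure xm->mod conversion), the PCM stays 0.
 */
static int sketch_decode(const SketchBSS *bss, SketchRenderFn render,
                         SketchFrame *out)
{
    assert(bss && out);
    out->bss = bss;
    out->nb_samples = SKETCH_FRAME_SAMPLES;
    memset(out->pcm, 0, sizeof(out->pcm));
    if (render)
        out->nb_samples = render(bss, out->pcm, SKETCH_FRAME_SAMPLES);
    return 0;
}
```

Applications that only understand PCM keep working unchanged, while a
converter can ignore the samples and read the BSS from the extra field.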

> 
> 2)
>  The demuxer decodes the file to a BSS and outputs it in an
>  AVPacket. It would then define a CODEC_ID_SEQUENCER, and the decoder
>  would be just a wrapper around libavsequencer to do the BSS -> PCM
>  conversion.
> 
>  The advantage of this approach is that the concept of demuxing/decoding
>  does not make much sense for these formats, so this avoids the
>  artificial distinction. Moreover, it makes a nice distinction between
>  transcoding from one MOD format to another (with -acodec copy) and
>  decoding it to PCM. The disadvantage is that API-wise it's less clear
>  for external applications how to get the BSS data (reading the AVPacket
>  payload). Besides, all the bit-reading API is part of lavc.

This is unacceptable

[...]

-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The greatest way to live with honor in this world is to be what we pretend
to be. -- Socrates