[FFmpeg-devel] Inserting an image in the middle of a video stream

Thu Apr 26 00:11:40 EEST 2018

Hi all,

I am working on a project where I need to insert a repeating image (10
seconds) in the middle of a video stream while also transcoding from one
video codec to another.  The images are going to be either png or jpg
images.  I have a frame number within the video stream which tells me where
to insert the sequence of image frames.

1) I have been looking over ffmpeg.c trying to figure out the best place to
perform this bit of "magic".  It looks to me like I would need to modify

static int transcode_step(void)

so that, when I am at the insertion frame number, it calls a new function
to insert the image instead of calling process_input(int file_index).
Something like:

if (ost->st->nb_frames == image_insert_frame_number)
    ret = process_image_input();
else
    ret = process_input(ist->file_index);

Does that seem reasonable?  If so, then I have to figure out how to make my
"image frames" play nice with the rest of the transcode process.

2) I assume I would need to flush the input video stream at the point I
wish to insert the images.  Then start reading from the video stream again
when the images have been processed.
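
To make points 1 and 2 concrete, here is a rough, standalone sketch of the
decision I have in mind.  Every name in it (ImageInsertState, pick_source,
the STEP_* values) is hypothetical and does not exist in ffmpeg.c.  It
compares with >= and counts the image frames already emitted, because a
plain == test would only fire once and the image has to repeat for many
frames.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical bookkeeping for the insertion; not part of ffmpeg.c. */
typedef struct ImageInsertState {
    int64_t insert_at;    /* output frame count at which the image starts */
    int64_t total_frames; /* how many times the image frame is repeated   */
    int64_t emitted;      /* image frames emitted so far                  */
} ImageInsertState;

enum StepSource { STEP_FROM_VIDEO, STEP_FROM_IMAGE };

/*
 * Decide where the next frame should come from.
 * frames_written is the number of frames output so far (video + image).
 */
static enum StepSource pick_source(const ImageInsertState *s,
                                   int64_t frames_written)
{
    assert(s && s->total_frames >= 0);

    /* Inside the insertion window until every image frame is emitted. */
    if (frames_written >= s->insert_at && s->emitted < s->total_frames)
        return STEP_FROM_IMAGE;

    /* Before the window, or after it: keep reading the video input. */
    return STEP_FROM_VIDEO;
}
```

The caller would increment "emitted" each time it outputs an image frame,
so the function falls back to STEP_FROM_VIDEO once the sequence is done.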

3) process_input() and process_input_packet() seem to do a lot of work to
get the PTS and DTS correct.  Since I am just processing images, can
someone tell me the easiest way to handle that?  Is there anything special
I will have to do when I resume transcoding the video source to allow for
the time my image frames have taken?
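
As a sketch of the arithmetic I think is involved, and nothing more: the
code below assumes a constant frame rate and timestamps counted in whole
frames (a time base of 1/fps).  The helper names are made up for
illustration; real code would presumably have to rescale between time
bases (for example with av_rescale_q) rather than add frame counts.

```c
#include <assert.h>
#include <stdint.h>

/* Image frames needed to cover `seconds` at fps_num/fps_den, rounded. */
static int64_t image_frame_count(int seconds, int fps_num, int fps_den)
{
    assert(seconds >= 0 && fps_num > 0 && fps_den > 0);
    return ((int64_t)seconds * fps_num + fps_den / 2) / fps_den;
}

/* PTS of the k-th inserted image frame (k starts at 0), in frame units. */
static int64_t image_frame_pts(int64_t insert_pts, int64_t k)
{
    return insert_pts + k;
}

/* PTS of a video frame after the insertion point, shifted to make room. */
static int64_t resumed_video_pts(int64_t original_pts,
                                 int64_t inserted_frames)
{
    return original_pts + inserted_frames;
}
```

For example, 10 seconds at 25 fps is 250 image frames, so a video frame
that originally had PTS 100 at the insertion point would come out at
PTS 350 once the images have been emitted.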

4) Is

static int decode_video(InputStream *ist, AVPacket *pkt, int *got_output,
int64_t *duration_pts, int eof, int *decode_failed)

generic enough that I can directly use it to process the packets I am
reading from the image file?  The examples I have found for reading and
parsing image files do not bother with creating an InputStream.  If I
wanted to use decode_video() I would have to deal with that.  It would be
nice if I could just call

static void add_input_streams(OptionsContext *o, AVFormatContext *ic)

with my context for my image file, but I am guessing that I don't really
want it added to the "global list of input streams".

5) If decode_video() is not appropriate for me to call with my "image"
packets, then should I call ifilter_send_frame() directly?  I have to
admit, I currently have no idea how to set up a "filter" for the images.

I am hoping someone can either point me to an example of doing this type of
operation, or can offer me some guidance.  I have been doing a lot of
"googling" but it is hard to identify which "API" the examples I have been
finding were written against.  I am using ffmpeg master for this project.

TIA,

John