[FFmpeg-devel] [PATCH V3] avfilter/vf_dnn_processing: add a generic filter for image processing with dnn networks

Guo, Yejun yejun.guo at intel.com
Thu Nov 7 17:17:44 EET 2019


> > From: Pedro Arthur [mailto:bygrandao at gmail.com]
> > Sent: Thursday, November 07, 2019 1:18 AM
> > To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> > Cc: Guo, Yejun <yejun.guo at intel.com>
> > Subject: Re: [FFmpeg-devel] [PATCH V3] avfilter/vf_dnn_processing: add a
> > generic filter for image processing with dnn networks
> >
> > Hi,
> >
> > On Thu, Oct 31, 2019 at 05:39, Guo, Yejun <yejun.guo at intel.com>
> > wrote:
> > This filter accepts all the dnn networks which do image processing.
> > Currently, frames in the rgb24 and bgr24 formats are supported; other
> > formats such as gray and YUV will be supported next. The dnn network
> > can accept data in float32 or uint8 format, and it is allowed to
> > change the frame size.
> >
> > The following is a python script which halves the value of the first
> > channel of each pixel. It demonstrates how to set up and execute a dnn
> > model with python+tensorflow. It also generates the .pb file which
> > will be used by ffmpeg.
> >
> > import tensorflow as tf
> > import numpy as np
> > import scipy.misc
> >
> > # read the input image and normalize it to [0, 1] floats, NHWC layout
> > in_img = scipy.misc.imread('in.bmp')
> > in_img = in_img.astype(np.float32)/255.0
> > in_data = in_img[np.newaxis, :]
> >
> > # 1x1 convolution kernel: scale the first channel by 0.5, keep the other two
> > filter_data = np.array([0.5, 0, 0, 0, 1., 0, 0, 0, 1.]).reshape(1,1,3,3).astype(np.float32)
> > filter = tf.Variable(filter_data)
> >
> > # the network: a single conv2d, with named input/output tensors for ffmpeg
> > x = tf.placeholder(tf.float32, shape=[1, None, None, 3], name='dnn_in')
> > y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')
> >
> > sess = tf.Session()
> > sess.run(tf.global_variables_initializer())
> > output = sess.run(y, feed_dict={x: in_data})
> >
> > # freeze the variables and dump the graph as halve_first_channel.pb
> > graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
> > tf.train.write_graph(graph_def, '.', 'halve_first_channel.pb', as_text=False)
> >
> > # also save the result of running the model directly, for reference
> > output = output * 255.0
> > output = output.astype(np.uint8)
> > scipy.misc.imsave("out.bmp", np.squeeze(output))
> >
> > To do the same thing with ffmpeg:
> > - generate halve_first_channel.pb with the above script
> > - generate halve_first_channel.model with tools/python/convert.py
> > - try with the following commands:
> >   ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.model:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=native -y out.native.png
> >   ./ffmpeg -i input.jpg -vf dnn_processing=model=halve_first_channel.pb:input=dnn_in:output=dnn_out:fmt=rgb24:dnn_backend=tensorflow -y out.tf.png
> It would be great if you could turn the above steps into a fate test, that
> way one can automatically ensure the filter is always working properly.

Sure, I'll add a fate test for this filter with halve_first_channel.model. There will
be no test for the tensorflow backend, since fate tests must not require external dependencies.

Furthermore, more well-known models can be added to this fate test once we support them by
adding more layers to the native mode, and once we optimize the conv2d layer, which is
currently extremely slow.
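
As a quick local sanity check (separate from the fate test itself), the output can be compared
against the input with numpy. This is just a sketch, assuming the same scipy environment as the
script above and the in.bmp/out.bmp file names used there:

import numpy as np
import scipy.misc

# in.bmp is the original input, out.bmp is what the tensorflow script wrote;
# the same kind of check applies to the ffmpeg outputs fed with the same image
in_img  = scipy.misc.imread('in.bmp').astype(np.float32)
out_img = scipy.misc.imread('out.bmp').astype(np.float32)

# the first channel should be halved and the other two left untouched
# (tolerance of 1 for the float -> uint8 rounding)
assert np.max(np.abs(out_img[:, :, 0] - in_img[:, :, 0] * 0.5)) <= 1.0
assert np.max(np.abs(out_img[:, :, 1:] - in_img[:, :, 1:])) <= 1.0
print('halve_first_channel looks correct')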

> > +};
> > +
> > +AVFilter ff_vf_dnn_processing = {
> > +    .name          = "dnn_processing",
> > +    .description   = NULL_IF_CONFIG_SMALL("Apply DNN processing filter to the input."),
> > +    .priv_size     = sizeof(DnnProcessingContext),
> > +    .init          = init,
> > +    .uninit        = uninit,
> > +    .query_formats = query_formats,
> > +    .inputs        = dnn_processing_inputs,
> > +    .outputs       = dnn_processing_outputs,
> > +    .priv_class    = &dnn_processing_class,
> > +};
> > --
> > 2.7.4
> rest LGTM.

Thanks, could we push this patch first?

I plan to add two more changes to this filter next:
- add gray8 and gray32 support
- add y_from_yuv support; in other words, the network only handles the Y channel,
and the U/V planes are left unchanged (or just scaled), just like what vf_sr does
(a rough sketch follows below).
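
To make the y_from_yuv idea concrete, such a network takes a single-channel input while the
filter handles the U/V planes itself. A minimal sketch in the same style as halve_first_channel
above (the halve_luma name and the 0.5 scale are only for illustration):

import tensorflow as tf
import numpy as np

# a 1x1 conv that just scales the single Y channel by 0.5
filter_data = np.array([0.5]).reshape(1, 1, 1, 1).astype(np.float32)
filter = tf.Variable(filter_data)

# note the single channel in the placeholder shape: the network only sees Y
x = tf.placeholder(tf.float32, shape=[1, None, None, 1], name='dnn_in')
y = tf.nn.conv2d(x, filter, strides=[1, 1, 1, 1], padding='VALID', name='dnn_out')

sess = tf.Session()
sess.run(tf.global_variables_initializer())
graph_def = tf.graph_util.convert_variables_to_constants(sess, sess.graph_def, ['dnn_out'])
tf.train.write_graph(graph_def, '.', 'halve_luma.pb', as_text=False)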

I currently have no plan to add specific YUV formats, since I have not seen a well-known
network which handles all of the Y, U and V channels.


> BTW, do you already have concrete use cases (or plans) for this filter?

Not yet. The idea of this filter is that it is general for image processing and so should be very useful,
and my basic target is to at least cover the features provided by vf_sr and vf_derain.

Actually, I do have a plan for a general video analytics filter; the side data type might be
a big challenge, and I'm still thinking about it. I chose this image processing filter first because
it is simpler, and the community can become familiar with dnn-based filters step by step.
