[FFmpeg-devel] [GSOC] [PATCH] DNN module introduction and SRCNN filter update

Mon May 28 21:04:36 EEST 2018

2018-05-28 3:32 GMT-03:00 Guo, Yejun <yejun.guo at intel.com>:
> looks that no tensorflow dependency is introduced, a new model format is created together with some CPU implementation for inference.   With this idea, Android Neural Network would be a very good reference, see https://developer.android.google.cn/ndk/guides/neuralnetworks/. It defines how the model is organized, and also provided a CPU optimized inference implementation (within the NNAPI runtime, it is open source). It is still under development but mature enough to run some popular dnn models with proper performance. We can absorb some basic design. Anyway, just a reference fyi.  (btw, I'm not sure about any IP issue)
>
> For this patch, I have two comments.
>
> 1. change from "DNNModel* (*load_default_model)(DNNDefaultModel model_type);" to " DNNModel* (*load_builtin_model)(DNNBuiltinModel model_type);"
> The DNNModule can be invoked by many filters,  default model is a good name at the filter level, while built-in model is better within the DNN scope.
I have no problem with either name, both seems reasonable.

>
> typedef struct DNNModule{
>     // Loads model and parameters from given file. Returns NULL if it is not possible.
>     DNNModel* (*load_model)(const char* model_filename);
>     // Loads one of the default models
>     DNNModel* (*load_default_model)(DNNDefaultModel model_type);
>     // Executes model with specified input and output. Returns DNN_ERROR otherwise.
>     DNNReturnType (*execute_model)(const DNNModel* model);
>     // Frees memory allocated for model.
>     void (*free_model)(DNNModel** model);
> } DNNModule;
>
>
> 2. add a new variable 'number' for DNNData/InputParams
> As a typical DNN concept, the data shape usually is: <number, height, width, channel> or <number, channel, height, width>, the last component denotes its index changes the fastest in the memory. We can add this concept into the API, and decide to support <NHWC> or <NCHW> or both.
You mean how often the layer filter data changes?
If yes, I don't see much use for it, atm we are performing only inference.

If there are no more concerns about the patch I may push it by tomorrow.

Thanks.

>
>
> Thanks
> Yejun (郭叶军)
>
> -----Original Message-----
> From: ffmpeg-devel [mailto:ffmpeg-devel-bounces at ffmpeg.org] On Behalf Of Sergey Lavrushkin
> Sent: Saturday, May 26, 2018 2:02 AM
> To: FFmpeg development discussions and patches <ffmpeg-devel at ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [GSOC] [PATCH] DNN module introduction and SRCNN filter update
>
>>
>> You should use the ff_ prefix for internal, non-static functions only.
>> You don't need to use any prefix for internal structs.
>> The AV* prefix is only for public structs, and the av_ prefix is for
>> public functions.
>>
>> >
>> >
>> >     And you need to indent and prettify the tables a bit.
>> >
>> >
>> > Do you mean kernels and biases in dnn_srcnn.h?
>> > Their formatting represents their 4D structure a little bit and it
>> > is similar to one that I used for the first srcnn filter version
>> > that was successfully pushed before.
>>
>> Yes, and i suppose they were pushed like that without the reviewer
>> noticing they had no indentation or vertical alignment.
>>
>
> Here is the patch with DNN module moved to libavfilter and proper formatting of tables in dnn_srcnn.h.
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel