[FFmpeg-devel] GSOC 2018 qualification task.

Rostislav Pehlivanov atomnuker at gmail.com
Tue Apr 10 02:25:48 EEST 2018


On 9 April 2018 at 19:10, Paul B Mahol <onemda at gmail.com> wrote:

> On 4/9/18, Rostislav Pehlivanov <atomnuker at gmail.com> wrote:
> > On 9 April 2018 at 03:59, ANURAG SINGH IIT BHU <
> > anurag.singh.phy15 at iitbhu.ac.in> wrote:
> >
> >> This mail is regarding the qualification task assigned to me for the
> >> GSOC project in FFmpeg for automatic real-time subtitle generation
> >> using speech to text translation ML model.
> >>
> >
> > I really don't think lavfi is the correct place for such code, nor that
> > the project's repo should contain such code at all.
> > This would need to be in another repo and a separate library.
>
> Why? Are you against ocr filter too?
>

The OCR filter uses libtesseract, so I'm fine with it. Like I said, as long
as the actual code to do it is in an external library, I don't mind.
Mozilla recently released Deep Speech (https://github.com/mozilla/DeepSpeech),
which does pretty much exactly that (speech to text) and is considered to be
the most accurate one out there. Someone just needs to convert the TensorFlow
code to something more usable.

