[FFmpeg-user] How to use ocr filter

Paul B Mahol onemda at gmail.com
Thu Sep 17 17:12:17 CEST 2015


On 9/17/15, nicolab <robelt2525 at gmail.com> wrote:
> When using the ocr filter, how do I output the OCR text to a file?
> https://ffmpeg.org/ffmpeg-filters.html#ocr
>
> img.png
> <http://ffmpeg-users.933282.n4.nabble.com/file/n4672454/img.png>
>
> ffmpeg -f lavfi -i
> "movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text"
> out.png -y -loglevel 99

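drawgraph accepts only float values; lavfi.ocr.text is a string, so it
cannot be plotted. Use drawtext to show the recognized text instead: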

ffplay ~/img.png -vf
"ocr,split[ocr][o1],[ocr]lutyuv=y=0:u=128:v=128,drawtext=fontcolor=white:x=10:y=10:text=%{metadata\\:lavfi.ocr.text}[o2],[o1][o2]vstack"

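To get the text into a file rather than on screen, one option is the
metadata filter in print mode. This is only a sketch and rests on a few
assumptions: the build is newer than the one quoted below and includes the
metadata filter, libtesseract is enabled, and the names img.png, ocr.txt
and tessdata are placeholders. It uses ffmpeg rather than ffplay because
the build in the quoted log was configured with --disable-ffplay.

```shell
#!/bin/sh
# Assumed names: img.png (input), ocr.txt (output), tessdata (Tesseract data dir).
key="lavfi.ocr.text"
out="ocr.txt"
# ocr attaches the recognized text to each frame as metadata;
# metadata=mode=print writes the selected key to the given file.
filter="ocr=datapath=tessdata:language=eng,metadata=mode=print:key=${key}:file=${out}"

# Only run when ffmpeg and the input image are actually present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f img.png ]; then
    # -f null - discards the video; only the metadata file is wanted.
    ffmpeg -v error -i img.png -vf "$filter" -f null - ||
        echo "ffmpeg failed; the build may lack the ocr or metadata filter" >&2
fi
```

The file should then hold one entry per frame with the lavfi.ocr.text
value; the exact layout of that entry may differ between versions.
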
> ffmpeg version 2.8.git Copyright (c) 2000-2015 the FFmpeg developers
>   built with gcc 5.2.0 (GCC)
>   configuration: --prefix=/mingw/i686-w64-mingw32 --enable-version3 --enable-gpl --enable-memalign-hack --enable-w32threads --enable-libtesseract --disable-outdev=sdl --disable-ffplay --disable-ffprobe --disable-ffserver --disable-doc --disable-htmlpages --disable-manpages --disable-podpages --disable-txtpages --disable-debug --pkg-config-flags=--static
>   libavutil      55.  2.100 / 55.  2.100
>   libavcodec     57.  2.100 / 57.  2.100
>   libavformat    57.  2.100 / 57.  2.100
>   libavdevice    57.  0.100 / 57.  0.100
>   libavfilter     6.  4.100 /  6.  4.100
>   libswscale      4.  0.100 /  4.  0.100
>   libswresample   2.  0.100 /  2.  0.100
>   libpostproc    54.  0.100 / 54.  0.100
> Splitting the commandline.
> Reading option '-f' ... matched as option 'f' (force format) with argument 'lavfi'.
> Reading option '-i' ... matched as input file with argument 'movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text'.
> Reading option 'out.png' ... matched as output file.
> Reading option '-y' ... matched as option 'y' (overwrite output files) with argument '1'.
> Reading option '-loglevel' ... matched as option 'loglevel' (set logging level) with argument '99'.
> Finished splitting the commandline.
> Parsing a group of options: global .
> Applying option y (overwrite output files) with argument 1.
> Applying option loglevel (set logging level) with argument 99.
> Successfully parsed a group of options.
> Parsing a group of options: input file movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text.
> Applying option f (force format) with argument lavfi.
> Successfully parsed a group of options.
> Opening an input file: movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text.
> detected 4 logical cores
> [Parsed_movie_0 @ 02438040] Setting 'filename' to value 'img.png'
> Probing image2 score:50 size:929
> Probing mp3 score:1 size:929
> Probing png_pipe score:99 size:929
> [png_pipe @ 02438480] Format png_pipe probed with size=2048 and score=99
> [png_pipe @ 02438480] Before avformat_find_stream_info() pos: 0 bytes read:929 seeks:0
> [png_pipe @ 02438480] 0: start_time: -9223372036854.775 duration: -9223372036854.775
> [png_pipe @ 02438480] stream: start_time: -9223372036854.775 duration: -9223372036854.775 bitrate=0 kb/s
> [png_pipe @ 02438480] After avformat_find_stream_info() pos: 929 bytes read:929 seeks:0 frames:1
> [Parsed_movie_0 @ 02438040] seek_point:0 format_name:(null) file_name:img.png stream_index:-1
> [Parsed_ocr_1 @ 04813f80] Setting 'datapath' to value 'tessdata'
> [Parsed_ocr_1 @ 04813f80] Setting 'language' to value 'eng'
> [Parsed_ocr_1 @ 04813f80] Tesseract version: 3.02
> [Parsed_drawgraph_2 @ 024375e0] Setting 'm1' to value 'lavfi.ocr.text'
> [auto-inserted scaler 0 @ 048187c0] w:iw h:ih flags:'bilinear' interl:0
> [Parsed_ocr_1 @ 04813f80] auto-inserting filter 'auto-inserted scaler 0' between the filter 'Parsed_movie_0' and the filter 'Parsed_ocr_1'
> [AVFilterGraph @ 02437580] query_formats: 4 queried, 2 merged, 1 already done, 0 delayed
> [auto-inserted scaler 0 @ 048187c0] picking yuv444p out of 15 ref:rgb24 alpha:0
> [auto-inserted scaler 0 @ 048187c0] w:160 h:48 fmt:rgb24 sar:1/1 -> w:160 h:48 fmt:yuv444p sar:1/1 flags:0x2
> [lavfi @ 024331e0] All info found
> [lavfi @ 024331e0] 0: start_time: 0.000 duration: -9223372036854.775
> [lavfi @ 024331e0] stream: start_time: 0.000 duration: -9223372036854.775 bitrate=0 kb/s
> Input #0, lavfi, from 'movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text':
>   Duration: N/A, start: 0.000000, bitrate: N/A
>     Stream #0:0, 1, 1/25: Video: rawvideo, 1 reference frame (RGBA / 0x41424752), rgba, 900x256 [SAR 1:1 DAR 225:64], 1/25, 25 tbr, 25 tbn, 25 tbc
> Successfully opened the file.
> Parsing a group of options: output file out.png.
> Successfully parsed a group of options.
> Opening an output file: out.png.
> Successfully opened the file.
> [graph 0 input from stream 0:0 @ 04838fa0] Setting 'video_size' to value '900x256'
> [graph 0 input from stream 0:0 @ 04838fa0] Setting 'pix_fmt' to value '28'
> [graph 0 input from stream 0:0 @ 04838fa0] Setting 'time_base' to value '1/25'
> [graph 0 input from stream 0:0 @ 04838fa0] Setting 'pixel_aspect' to value '1/1'
> [graph 0 input from stream 0:0 @ 04838fa0] Setting 'sws_param' to value 'flags=2'
> [graph 0 input from stream 0:0 @ 04838fa0] Setting 'frame_rate' to value '25/1'
> [graph 0 input from stream 0:0 @ 04838fa0] w:900 h:256 pixfmt:rgba tb:1/25 fr:25/1 sar:1/1 sws_param:flags=2
> [format @ 04838a60] compat: called with args=[rgb24|rgba|rgb48be|rgba64be|pal8|gray|ya8|gray16be|ya16be|monob]
> [format @ 04838a60] Setting 'pix_fmts' to value 'rgb24|rgba|rgb48be|rgba64be|pal8|gray|ya8|gray16be|ya16be|monob'
> [AVFilterGraph @ 04817400] query_formats: 4 queried, 3 merged, 0 already done, 0 delayed
> Output #0, image2, to 'out.png':
>   Metadata:
>     encoder         : Lavf57.2.100
>     Stream #0:0, 0, 1/25: Video: png, 1 reference frame, rgba, 900x256 [SAR 1:1 DAR 225:64], 1/25, q=2-31, 200 kb/s, 25 fps, 25 tbn, 25 tbc
>     Metadata:
>       encoder         : Lavc57.2.100 png
> Stream mapping:
>   Stream #0:0 -> #0:0 (rawvideo (native) -> png (native))
> Press [q] to stop, [?] for help
> Cliping frame in rate conversion by 0.000008
> [output stream 0:0 @ 048391e0] EOF on sink link output stream 0:0:default.
> No more output streams to write to, finishing.
> [AVIOContext @ 048416e0] Statistics: 0 seeks, 1 writeouts
> frame=    1 fps=0.0 q=-0.0 Lsize=N/A time=00:00:00.04 bitrate=N/A
> video:2kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
> Input file #0 (movie=img.png,ocr=datapath=tessdata:language=eng,drawgraph=lavfi.ocr.text):
>   Input stream #0:0 (video): 1 packets read (921638 bytes); 1 frames decoded;
>   Total: 1 packets (921638 bytes) demuxed
> Output file #0 (out.png):
>   Output stream #0:0 (video): 1 frames encoded; 1 packets muxed (1543 bytes);
>   Total: 1 packets (1543 bytes) muxed
> 1 frames successfully decoded, 0 decoding errors
> [AVIOContext @ 02438a80] Statistics: 929 bytes read, 0 seeks
>
>
>
> -----
> https://twitter.com/nico_lab
> http://nico-lab.net/