[FFmpeg-user] smarter scaling filter

Jim Worrall coniophora at gmail.com
Fri Dec 9 15:19:13 CET 2011


On Dec 8, 2011, at 10:16 AM, Jim Worrall wrote:

> 
> On Wed, Dec 7, 2011 at 3:24 PM, Jim Worrall <coniophora at gmail.com> wrote:
> I've been trying to find or write a script/filtergraph that will take a target frame size and automatically scale down the input video (maintaining aspect ratio) if needed to fit the frame, and leave it unchanged if it is already the size of the frame or smaller.  I'm in a bit over my head and hoping for some pointers.
> 
> I'm starting with a very clever set of calculations that Francesco Turco posted here 12 June 2011. One side of each plus sign will always evaluate to 0 (escapes removed for clarity; his target frame size was 720x576):
> -vf scale = '
> gte(iw/ih,720/576)*720 + lt(iw/ih,720/576)*((576*iw)/ih) :
> lte(iw/ih,720/576)*576 + gt(iw/ih,720/576)*((720*ih)/iw) '
> 
> He thought it was too long, but it still seems a great approach.  The problem for me is that it will upscale too, which seems undesirable if the device doesn't require an exact size.  So I'm trying to add some logic to keep the input size if both iw and ih are no larger than the frame.  While I'm at it, I'm hoping to use some variables from the script for the frame size.
> 
> The bash script asks for the target device (just iPhone 3 or iPhone 4 for now) and sets device-specific values for the maximum frame width ($FW) and height ($FH) and the corresponding aspect ratio ($FA).  The filter also seems to need some stored variables within ffmpeg, but I've found only one example of their real use on the web and couldn't make much sense of it.  I can't figure out how st(var,expr) is supposed to be incorporated into the filter, since it doesn't seem to be able to go before it.  And there's a while(cond,expr) that I don't know where to put either.
> 
> Here's an idea of what I'm trying to do, though I think it is a long way from working.  The st(0,expr) that I put at the beginning (not knowing where it goes) stores 0 if both dimensions fit in the target frame.  I'm not sure whether I can use a script variable inside a filter; I hope so.  Anyway, the while statement is supposed to convert var 0 to 1 if it is not 0.  The rest is just an add-on to Francesco's filter that should specify the input dimensions if var 0 is 0.
> -vf = "st(0,gt(iw,$FW)+gt(ih,$FH)) ; 
>   while(ld(0),st(0,1) ; 
>   scale= ld(0) * ( gte(a,$FA)*$FW + lt(a,$FA)*(($FH*iw)/ih) ) + eq(0,ld(0))*iw :
>              ld(0) * ( lte(a,$FA)*$FH + gt(a,$FA)*(($FW*ih)/iw) ) + eq(0,ld(0))*ih "
> 
> In case it matters, I'm a user/hobbyist.  Any tips will be appreciated.  Thanks,
> Jim
> 
> By trial and error I learned that the variable storage and manipulation functions have 
> to go where the variables are first used in the actual filter expression.
> Doing that I eventually got it to run without errors, so major progress.
> 
> It worked as expected for a video smaller than the target frame size, but not for a video
> that actually needed to be scaled down.  One of the functions is apparently not doing what
> I think it does.  I would appreciate some help.
> 
> Here are the input values from the input file and the target values from the script (which 
> are getting read correctly).  The filter should give a video of 640x360, but it actually gives 
> 640x720. Below is the scale filter expression and above each line, how I think it should 
> evaluate for the current case.  I guess there is no way to see the value of variables inside 
> ffmpeg (created with st(var,expr) )?
> 
> Jim
> 
> INPUT values:
> iw    1280
> ih    720
> a    ~1.78
> TARGET FRAME SIZE values:
> $FW    640
> $FH    480
> $FA    ~1.33
> 
>         stored in var 0:    1                         *
>                                    1  +     1
> -vf="scale = st(0, min( 1 , gt(iw,$FW)+gt(ih,$FH) ) ) * \
> 
>                     640                                      :
> (      1    *640 +    0     *    853.33    ) +         0     :
> ( gte(a,$FA)*$FW + lt(a,$FA)*(($FH*iw)/ih) ) + not(ld(0))*iw : \
> 
>                     360
>   1   * (      0    *480 +    1     *   360        ) +    0
> ld(0) * ( lte(a,$FA)*$FH + gt(a,$FA)*(($FW*ih)/iw) ) + not(ld(0))*ih "
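
As a cross-check on the hand evaluation above, the intended downscale-only logic can be sketched in plain shell arithmetic, outside ffmpeg. This is only an illustration: fit_size is a hypothetical helper, not part of the script or of ffmpeg, and it uses integer division, so results are truncated.

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg or the original script):
# print the size an input of IW x IH should have after fitting it
# into a frame of FW x FH, keeping aspect ratio and never upscaling.
fit_size() {
    iw=$1; ih=$2; fw=$3; fh=$4
    if [ "$iw" -le "$fw" ] && [ "$ih" -le "$fh" ]; then
        # Already fits: keep the input size.
        echo "${iw}x${ih}"
    elif [ $((iw * fh)) -ge $((fw * ih)) ]; then
        # Input aspect >= frame aspect (compared by cross-multiplying):
        # width is the limiting side.
        echo "${fw}x$((fw * ih / iw))"
    else
        # Input is narrower than the frame: height is the limiting side.
        echo "$((fh * iw / ih))x${fh}"
    fi
}

# The case discussed above: 1280x720 input, 640x480 target frame.
fit_size 1280 720 640 480
```

For the values listed above this prints 640x360, which is the size the filter was expected to produce.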


I have finally determined that the problem seems to be a bug in the evaluation of expressions.
I reduced the filter to the simplest form needed to show the bug:

-vf "scale = st(0\,1) * 640 + not(ld(0)) * 1080 : ld(0) * 480 + not(ld(0)) * 720"

Since var 0 is set to 1, this filter should give a 640x480 video, right?
Instead, I get 640x720.  Somehow the value of var 0 changes
from 1 to 0 between the first use and the last.  The not(ld(0)) should not do that, should it?
Am I misunderstanding the functions or is this a bug?
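
Whether or not this is a bug, one way to avoid relying on a value stored with st() in the width expression still being readable with ld() in the height expression is to make each expression self-contained. The following is a minimal sketch under that assumption. build_scale_filter is a hypothetical shell helper (not part of ffmpeg) that takes the frame width and height and prints a scale filter built only from min(), with commas backslash-escaped as in the filter above. It has not been tried against the ffmpeg build shown below.

```shell
#!/bin/sh
# Hypothetical helper (not part of ffmpeg): print a downscale-only
# scale filter for a target frame of FW x FH.
#   width  = min(iw, min(FW, FH*iw/ih))
#   height = min(ih, min(FH, FW*ih/iw))
# Each side is the smallest of: the input size, the frame size, and the
# size implied by the other frame dimension at the input aspect ratio.
# No st()/ld() variables are used, so neither expression depends on the other.
build_scale_filter() {
    fw=$1
    fh=$2
    # '\\,' in the printf format emits '\,' (an escaped comma for the filter).
    printf 'scale=min(iw\\,min(%s\\,%s*iw/ih)):min(ih\\,min(%s\\,%s*ih/iw))\n' \
        "$fw" "$fh" "$fh" "$fw"
}

# Example: build the filter for a 640x480 frame.
# Possible use: ffmpeg -i "$INPUT" -vf "$(build_scale_filter 640 480)" ...
build_scale_filter 640 480
```

For the 1280x720 input, the width works out to min(1280, min(640, 853.33)) = 640 and the height to min(720, min(480, 360)) = 360. For other inputs the results may be fractional or odd, so some rounding may still be needed depending on the encoder.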

Here is the complete command and output:

ffmpeg -i $INPUT -t 2 -c:v libx264 \
-vf "scale = st(0\,1) * 640 + not(ld(0)) * 1080 : ld(0) * 480 + not(ld(0)) * 720" \
-vprofile main -preset veryslow -x264opts level=3.1:ref=8 -c:a libvo_aacenc -strict experimental -y output.m4v

ffmpeg version 0.8.7.git-4547d88, Copyright (c) 2000-2011 the FFmpeg developers
  built on Dec  5 2011 14:31:39 with clang 3.0 (tags/Apple/clang-211.12)
  configuration: --prefix=/Volumes/Ramdisk/sw --cc=clang --enable-gpl --enable-version3 --enable-filters --arch=x86_64 --enable-hardcoded-tables --disable-indevs --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libvo-aacenc --enable-libvpx --disable-decoder=libvpx --enable-libmp3lame --enable-libx264 --enable-libvorbis --enable-libtheora --enable-libspeex
  libavutil    51. 30. 0 / 51. 30. 0
  libavcodec   53. 40. 0 / 53. 40. 0
  libavformat  53. 24. 0 / 53. 24. 0
  libavdevice  53.  4. 0 / 53.  4. 0
  libavfilter   2. 51. 0 /  2. 51. 0
  libswscale    2.  1. 0 /  2.  1. 0
  libpostproc  51.  2. 0 / 51.  2. 0

Seems stream 0 codec frame rate differs from container frame rate: 1200.00 (1200/1) -> 29.97 (30000/1001)
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'kit.mov':
  Metadata:
    major_brand     : qt  
    minor_version   : 537199360
    compatible_brands: qt  
    creation_time   : 2011-05-08 00:51:03
  Duration: 00:01:21.61, start: 0.000000, bitrate: 10723 kb/s
    Stream #0:0(und): Video: h264 (Baseline) (avc1 / 0x31637661), yuv420p, 1280x720, 10655 kb/s, 29.81 fps, 29.97 tbr, 600 tbn, 1200 tbc
    Metadata:
      creation_time   : 2011-05-08 00:51:03
      handler_name    : 
    Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, s16, 63 kb/s
    Metadata:
      creation_time   : 2011-05-08 00:51:03
      handler_name    : ?Apple Alias Data Handler
[buffer @ 0x7ff861c16640] w:1280 h:720 pixfmt:yuv420p tb:1/1000000 sar:0/1 sws_param:
[scale @ 0x7ff861c1ff00] w:1280 h:720 fmt:yuv420p -> w:640 h:720 fmt:yuv420p flags:0x4
[libx264 @ 0x7ff862031800] using cpu capabilities: MMX2 SSE2Fast SSSE3 FastShuffle SSE4.2
[libx264 @ 0x7ff862031800] profile Main, level 3.1
[libx264 @ 0x7ff862031800] 264 - core 119 - H.264/MPEG-4 AVC codec - Copyleft 2003-2011 - http://www.videolan.org/x264.html - options: cabac=1 ref=8 deblock=1:0:0 analyse=0x1:0x131 me=umh subme=10 psy=1 psy_rd=1.00:0.00 mixed_ref=1 me_range=24 chroma_me=1 trellis=2 8x8dct=0 cqm=0 deadzone=21,11 fast_pskip=1 chroma_qp_offset=-2 threads=6 sliced_threads=0 nr=0 decimate=1 interlaced=0 bluray_compat=0 constrained_intra=0 bframes=8 b_pyramid=2 b_adapt=2 b_bias=0 direct=3 weightb=1 open_gop=0 weightp=2 keyint=250 keyint_min=25 scenecut=40 intra_refresh=0 rc_lookahead=60 rc=crf mbtree=1 crf=23.0 qcomp=0.60 qpmin=0 qpmax=69 qpstep=4 ip_ratio=1.40 aq=1:1.00
Output #0, ipod, to 'output.m4v':
  Metadata:
    major_brand     : qt  
    minor_version   : 537199360
    compatible_brands: qt  
    creation_time   : 2011-05-08 00:51:03
    encoder         : Lavf53.24.0
    Stream #0:0(und): Video: h264 (avc1 / 0x31637661), yuv420p, 640x720, q=-1--1, 30k tbn, 29.97 tbc
    Metadata:
      creation_time   : 2011-05-08 00:51:03
      handler_name    : 
    Stream #0:1(und): Audio: aac (mp4a / 0x6134706D), 44100 Hz, mono, s16, 128 kb/s
    Metadata:
      creation_time   : 2011-05-08 00:51:03
      handler_name    : ?Apple Alias Data Handler
Stream mapping:
  Stream #0:0 -> #0:0 (h264 -> libx264)
  Stream #0:1 -> #0:1 (aac -> libvo_aacenc)
Press [q] to stop, [?] for help
frame=   48 fps=  0 q=0.0 size=       0kB time=00:00:00.00 bitrate=   0.0kbits/s
frame=   60 fps= 10 q=-1.0 Lsize=     141kB time=00:00:01.93 bitrate= 596.3kbits/s
video:106kB audio:32kB global headers:0kB muxing overhead 2.350829%
[libx264 @ 0x7ff862031800] frame I:1     Avg QP:24.61  size: 14340
[libx264 @ 0x7ff862031800] frame P:12    Avg QP:26.00  size:  3866
[libx264 @ 0x7ff862031800] frame B:47    Avg QP:30.08  size:  1004
[libx264 @ 0x7ff862031800] consecutive B-frames:  1.7%  0.0%  0.0% 26.7% 41.7% 30.0%  0.0%  0.0%  0.0%
[libx264 @ 0x7ff862031800] mb I  I16..4: 51.6%  0.0% 48.4%
[libx264 @ 0x7ff862031800] mb P  I16..4:  1.8%  0.0%  1.3%  P16..4: 38.5%  6.3% 10.2%  0.1%  0.0%    skip:41.7%
[libx264 @ 0x7ff862031800] mb B  I16..4:  0.1%  0.0%  0.0%  B16..8: 42.8%  1.9%  0.2%  direct: 0.3%  skip:54.8%  L0:40.9% L1:58.0% BI: 1.1%
[libx264 @ 0x7ff862031800] direct mvs  spatial:91.5% temporal:8.5%
[libx264 @ 0x7ff862031800] coded y,uvDC,uvAC intra: 43.4% 36.0% 11.9% inter: 1.7% 2.1% 0.0%
[libx264 @ 0x7ff862031800] i16 v,h,dc,p: 49%  7%  6% 38%
[libx264 @ 0x7ff862031800] i4 v,h,dc,ddl,ddr,vr,hd,vl,hu: 29%  2% 13% 10% 13% 11%  9%  8%  6%
[libx264 @ 0x7ff862031800] i8c dc,h,v,p: 49% 19% 25%  7%
[libx264 @ 0x7ff862031800] Weighted P-Frames: Y:0.0% UV:0.0%
[libx264 @ 0x7ff862031800] ref P L0: 55.0% 13.9% 16.3%  4.7%  3.6%  2.6%  2.5%  1.3%
[libx264 @ 0x7ff862031800] ref B L0: 85.0%  8.8%  4.1%  1.1%  0.6%  0.2%  0.1%
[libx264 @ 0x7ff862031800] ref B L1: 89.8% 10.2%
[libx264 @ 0x7ff862031800] kb/s:431.22


