[FFmpeg-devel] [PATCH 10/11] avcodec/x86: add an 8-bit simple IDCT function based on the x86-64 high depth functions

Tue Jun 20 19:46:30 EEST 2017

On 2017-06-19 17:11, James Darnley wrote:
> diff --git a/libavcodec/x86/simple_idct10_template.asm b/libavcodec/x86/simple_idct10_template.asm
> index 51baf84c82..02fd445ec0 100644
> --- a/libavcodec/x86/simple_idct10_template.asm
> +++ b/libavcodec/x86/simple_idct10_template.asm
> @@ -258,6 +258,10 @@
>  
>      IDCT_1D     %1, %2, %8
>  %elif %2 == 11
> +    ; This copies the DC-only shortcut.  When there is only a DC coefficient the
> +    ; C shifts the value and splats it to all coeffs rather than multiplying and
> +    ; doing the full IDCT.  This causes a difference on 8-bit because the
> +    ; coefficient is 16383 rather than 16384 (which you can get with shifting).
>      por     m1, m8, m13
>      por     m1, m12
>      por     m1, [blockq+ 16]       ; { row[1] }[0-7]
> @@ -293,8 +297,6 @@
>      por  m9, m6
>      pand m10, m5
>      por  m10, m6
> -    pand m3, m5
> -    por  m3, m6
>  %else
>      IDCT_1D     %1, %2
>  %endif
> 

Now I see where these went.  I've moved them to the previous commit,
which added the DC-only hack, and as I said earlier I will push that one
soon.
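
For anyone following the comment in the hunk above, here is a minimal
sketch in C of why the multiply path and the shift path can disagree.
Only the 16383 coefficient comes from the patch comment; the shift
amounts (11 for the row pass, 3 for the DC shortcut), the rounding term
and the function names are my assumptions for illustration, not code
taken from this series.

```c
#include <stdint.h>

/* Only W4 = 16383 is stated in the patch comment; the shifts are assumed. */
#define W4        16383
#define ROW_SHIFT 11   /* assumption: row-pass shift for 8-bit */
#define DC_SHIFT  3    /* assumption: DC-only shortcut shift for 8-bit */

/* Full row pass applied to a DC-only row: multiply, round, shift down.
 * Hypothetical helper; non-negative inputs only in this sketch. */
static int dc_via_multiply(int dc)
{
    return (W4 * dc + (1 << (ROW_SHIFT - 1))) >> ROW_SHIFT;
}

/* DC-only shortcut: scale by 2^DC_SHIFT, equivalent to using 16384
 * as the coefficient with no rounding loss. Hypothetical helper. */
static int dc_via_shift(int dc)
{
    return dc * (1 << DC_SHIFT);
}
```

Under these assumed values the two agree for small inputs (dc = 64
gives 512 either way) but differ by one once dc exceeds 1024 (dc = 2000
gives 15999 from the multiply and 16000 from the shift), which is the
kind of mismatch the shortcut in the asm is meant to reproduce.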