[FFmpeg-devel] libavcodec/lossless_videodsp : add add_bytes AVX2

Martin Vignali martin.vignali at gmail.com
Wed Oct 25 23:29:16 EEST 2017


2017-10-25 22:08 GMT+02:00 Paul B Mahol <onemda at gmail.com>:

> On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> > 2017-10-25 21:53 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
> >
> >> On 10/25/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> >> > 2017-10-25 9:43 GMT+02:00 Paul B Mahol <onemda at gmail.com>:
> >> >
> >> >> On 10/21/17, Martin Vignali <martin.vignali at gmail.com> wrote:
> >> >> > Hello,
> >> >> >
> >> >> > Attached are patches to add an AVX2 version of add_bytes
> >> >> >
> >> >> > 0001-libavcodec-lossless_videodsp-add-add_bytes-avx2-vers :
> >> >> > add AVX2 version
> >> >> >
> >> >> > pass fate-test for me (os 10.12, x86_64)
> >> >> >
> >> >> > checkasm result (Kaby Lake; run 10 times, fastest run kept):
> >> >> > checkasm: all 2 tests passed
> >> >> > add_bytes_c: 108.7
> >> >> > add_bytes_sse2: 26.5
> >> >> > add_bytes_avx2: 15.5
> >> >> >
> >> >> >
> >> >> > 0002-libavcodec-lossless_video_dsp-cosmetic-add-better-se:
> >> >> > cosmetic only.
> >> >> > The reference C function declarations are not consistent between
> >> >> > the asm files, and I think a better separator between functions
> >> >> > makes the file easier to read.
> >> >> >
> >> >> > It also adds the C declaration of add_bytes as a comment.
> >> >> >
> >> >> >
> >> >> > Martin
> >> >> >
> >> >>
> >> >> Are you sure 32-byte alignment is actually enforced?
> >> >>
> >> >>
> >> > Hello,
> >> >
> >> > I think the data used by add_bytes is always aligned,
> >> > because dst and src are the start of a line of an AVFrame.
> >>
> >> Yes, but try a width that's not a multiple of 32.
> >>
> > Sorry, I'm not sure I understand.
> > According to the doc, AVFrame->linesize is a multiple of the max alignment,
> >
> > and in the asm, the loop is repeated while val < width.
> >
> > Can you point me to the part where you think it's not OK?
>
> I dunno. You should test it with widths not divisible by 32.
>

Tested with the fate sample vsynth3-huffyuvbgra.avi (34x34):
./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc -

generates the same CRC as
./ffmpeg -i ./tests/data/fate/vsynth3-huffyuvbgra.avi -f framecrc - -cpuflags 0
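
The same kind of width check can also be sketched in isolation. The snippet
below is only an illustration in plain C, not the AVX2 asm from the patch:
add_bytes_blocked is a hypothetical stand-in that handles 32 bytes per
iteration and then a scalar tail, and the (dst, src, w) signature is assumed
to match the C reference. It compares both versions for every width from 1 to
max_w, so widths that are not a multiple of 32 are covered.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: dst[i] += src[i] for w bytes (wraps modulo 256). */
static void add_bytes_ref(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    for (ptrdiff_t i = 0; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Hypothetical stand-in for a SIMD version: full 32-byte blocks first,
 * then a scalar tail for the remaining w % 32 bytes. */
static void add_bytes_blocked(uint8_t *dst, const uint8_t *src, ptrdiff_t w)
{
    ptrdiff_t i = 0;

    for (; i + 32 <= w; i += 32)
        for (int j = 0; j < 32; j++)
            dst[i + j] = (uint8_t)(dst[i + j] + src[i + j]);

    for (; i < w; i++)
        dst[i] = (uint8_t)(dst[i] + src[i]);
}

/* Returns how many widths in [1, max_w] (capped at 256) give different
 * results between the two versions. The whole buffer is compared, so a
 * write past w is counted as a mismatch too. */
static int count_mismatches(int max_w)
{
    uint8_t src[256], a[256], b[256];
    int bad = 0;

    for (int w = 1; w <= max_w && w <= 256; w++) {
        for (int i = 0; i < 256; i++) {
            src[i] = (uint8_t)(i * 7 + 3);
            a[i] = b[i] = (uint8_t)(i * 13 + 5);
        }
        add_bytes_ref(a, src, w);
        add_bytes_blocked(b, src, w);
        if (memcmp(a, b, sizeof(a)) != 0)
            bad++;
    }
    return bad;
}
```

Calling count_mismatches(100) from a small test program should return 0 if
the tail handling is right. checkasm does a comparable comparison against the
C version for the real asm.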


>
> also try encoding cropped video.
>

Are you sure that encoding cropped video has any link to the decoding DSP
function?

These patches only touch the decoding function.


The encoding functions of huffyuvenc (in the huffyuv add add/diff_bytes16 AVX2
discussion) and losslessencdsp (not done for now) have a test for the
alignment of dst and src.
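
For illustration, such an alignment test can be sketched in C as below. This
is a minimal sketch, not the code from those patches: is_aligned and
can_use_aligned_path are hypothetical names, and align is assumed to be a
power of two (32 for AVX2 loads and stores).

```c
#include <stddef.h>
#include <stdint.h>

/* Non-zero if p is aligned to 'align' bytes (align must be a power of two). */
static int is_aligned(const void *p, size_t align)
{
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Hypothetical dispatch helper: the aligned SIMD path is only safe when
 * both pointers satisfy the alignment. Otherwise the caller would fall
 * back to an unaligned or scalar version. */
static int can_use_aligned_path(const uint8_t *dst, const uint8_t *src,
                                size_t align)
{
    return is_aligned(dst, align) && is_aligned(src, align);
}
```

With a 32-byte aligned buffer buf, can_use_aligned_path(buf, buf + 32, 32) is
true, while any pointer offset that is not a multiple of 32 makes it false.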


Martin

