[FFmpeg-devel] [PATCH] improve yuv422p to rgb in libswscale

Michael Niedermayer michaelni
Wed Dec 1 15:39:32 CET 2010


On Tue, Nov 30, 2010 at 11:19:42PM -0800, Jason Garrett-Glaser wrote:
> On Tue, Nov 30, 2010 at 7:34 PM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Tue, Nov 30, 2010 at 07:26:42PM -0800, Baptiste Coudurier wrote:
> >> On 11/30/2010 07:13 PM, Michael Niedermayer wrote:
> >>> On Tue, Nov 30, 2010 at 07:05:05PM -0800, Baptiste Coudurier wrote:
> >>>> On 11/30/2010 06:53 PM, Michael Niedermayer wrote:
> >>>>> On Tue, Nov 30, 2010 at 06:29:32PM -0800, Baptiste Coudurier wrote:
> >>>>>> On 11/30/2010 06:15 PM, Michael Niedermayer wrote:
> >>>>>>> On Tue, Nov 30, 2010 at 05:20:42PM -0800, Baptiste Coudurier wrote:
> >>>>>>>> On 11/30/2010 05:13 PM, Michael Niedermayer wrote:
> >>>>>>>>> On Tue, Nov 30, 2010 at 04:07:20AM -0800, Baptiste Coudurier wrote:
> >>>>>>>>>> Hi
> >>>>>>>>>>
> >>>>>>>>>> $subject, use full vertical data when convert 422p, improve quality a lot.
> >>>>>>>>>>
> >>>>>>>>>> --
> >>>>>>>>>> Baptiste COUDURIER
> >>>>>>>>>> Key fingerprint ? ? ? ? ? ? ? ? 8D77134D20CC9220201FC5DB0AC9325C5C1ABAAA
> >>>>>>>>>> FFmpeg maintainer ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?http://www.ffmpeg.org
> >>>>>>>>>
> >>>>>>>>>> ? ? ?x86/yuv2rgb_template.c | ? 25 ++++-------
> >>>>>>>>>> ? ? ?yuv2rgb.c ? ? ? ? ? ? ?| ?109 ++++++-------------------------------------------
> >>>>>>>>>> ? ? ?2 files changed, 26 insertions(+), 108 deletions(-)
> >>>>>>>>>> 16f384c9b114c76572a539511c42267ce2942c67 ?yuv422p_to_rgb.patch
> >>>>>>>>>
> >>>>>>>>> this looks like it would make 420p->rgb quite a bit slower.
> >>>>>>>>
> >>>>>>>> Do you think changing>>1 to>>vshift would make is quite a bit slower ?
> >>>>>>>
> >>>>>>> no but that stuff:
> >>>>>>> @@ -152,134 +144,102 @@
> >>>>>>> ? ? YUV2RGBFUNC(yuv2rgb_c_48, uint8_t, 0)
> >>>>>>> ? ? ? ? LOADCHROMA(0);
> >>>>>>> ? ? ? ? PUTRGB48(dst_1,py_1,0);
> >>>>>>> - ? ?PUTRGB48(dst_2,py_2,0);
> >>>>>>>
> >>>>>>> ? ? ? ? LOADCHROMA(1);
> >>>>>>> - ? ?PUTRGB48(dst_2,py_2,1);
> >>>>>>> ? ? ? ? PUTRGB48(dst_1,py_1,1);
> >>>>>>>
> >>>>>>> ? ? ? ? LOADCHROMA(2);
> >>>>>>> ? ? ? ? PUTRGB48(dst_1,py_1,2);
> >>>>>>> - ? ?PUTRGB48(dst_2,py_2,2);
> >>>>>>>
> >>>>>>> ? ? ? ? LOADCHROMA(3);
> >>>>>>> - ? ?PUTRGB48(dst_2,py_2,3);
> >>>>>>> ? ? ? ? PUTRGB48(dst_1,py_1,3);
> >>>>>>> ? ? ENDYUV2RGBLINE(48)
> >>>>>>> ? ? ? ? LOADCHROMA(0);
> >>>>>>> ? ? ? ? PUTRGB48(dst_1,py_1,0);
> >>>>>>> - ? ?PUTRGB48(dst_2,py_2,0);
> >>>>>>>
> >>>>>>> ? ? ? ? LOADCHROMA(1);
> >>>>>>> - ? ?PUTRGB48(dst_2,py_2,1);
> >>>>>>> ? ? ? ? PUTRGB48(dst_1,py_1,1);
> >>>>>>> ? ? ENDYUV2RGBFUNC()
> >>>>>>>
> >>>>>>> and then running the code twice
> >>>>>>
> >>>>>> This is the C code, the mmx routine for yuv420 to rgb24 is already present.
> >>>>>>
> >>>>>> I don't understand running the code twice, can you please clarify ?
> >>>>>
> >>>>> yes, iam seeing this:
> >>>>> - ? ?for (y=0; y<srcSliceH; y+=2) {\
> >>>>> + ? ?for (y=0; y<srcSliceH; y++) {\
> >>>>>
> >>>>> and thus iam thinking it runs the code twice after the patch,
> >>>>> do i miss something?
> >>>>
> >>>> The old code was processing 2 lines at once, see py_1 and py_2, that's
> >>>> why a lot of code is removed in the macros.
> >>>
> >>> yes but i think its slower after the patch
> >>
> >> I guess it is a bit slower because of LOADCHROMA.
> >
> > yes
> >
> >
> >>
> >>>>
> >>>> Do you agree that the mmx change looks trivial ?
> >>>
> >>> yes mmx looks good
> >>
> >> Well, problem now is that mmx routine will have different behaviour, and
> >> will be better.
> >
> > they are already different
> >
> >
> >> I mean, the C routine is not supposed to be that fast is it ?
> >
> > our code is supposed to be fast
> >
> >
> >>
> >> Also adding a check for some user parameters would require adding tests
> >> in a lot of places I fear.
> >
> > just a seperate video filter that plays a bit with linesize and turns
> > 422 in 420 as a sideeffect
> >
> > [...]
> > --
> > Michael ? ? GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
> >
> > Its not that you shouldnt use gotos but rather that you should write
> > readable code and code with gotos often but not always is less readable
> >
> > -----BEGIN PGP SIGNATURE-----
> > Version: GnuPG v1.4.10 (GNU/Linux)
> >
> > iEYEARECAAYFAkz1wi4ACgkQYR7HhwQLD6t1??????????
> > /XQAn267Pi6IPJOcubcXVE42JZnkMa8t
> > =EjH7
> > -----END PGP SIGNATURE-----
> >
> > _______________________________________________
> > ffmpeg-devel mailing list
> > ffmpeg-devel at mplayerhq.hu
> > https://lists.mplayerhq.hu/mailman/listinfo/ffmpeg-devel
> >
> 
> Isn't all of this what FULL_CHROMA_INP is for or whatnot?

yes such flag could be used

[...]
--
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Let us carefully observe those good qualities wherein our enemies excel us
and endeavor to excel them, by avoiding what is faulty, and imitating what
is excellent in them. -- Plutarch
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/attachments/20101201/5ea29a62/attachment.pgp>



More information about the ffmpeg-devel mailing list