[FFmpeg-devel] [PATCH 2/2] configure: Enable GCC vectorization on ≥4.9
Ronald S. Bultje
rsbultje at gmail.com
Sun May 8 23:31:10 CEST 2016
On Sun, May 8, 2016 at 1:55 PM, James Almer <jamrial at gmail.com> wrote:
> On 5/8/2016 10:44 AM, Ronald S. Bultje wrote:
> > Hi,
> > On Fri, May 6, 2016 at 8:02 PM, James Almer <jamrial at gmail.com> wrote:
> >> On 5/6/2016 8:48 PM, Timothy Gu wrote:
> >>> On Fri, May 06, 2016 at 12:08:14PM +0200, Hendrik Leppkes wrote:
> >>>> Just to document it, this has caused build breakage in various
> >>>> scenarios, even in GCC 5.3 (6.1 not tested).
> >>>> The latest reported on IRC just today here:
> >>>> libavcodec/sbrdsp.c: In function 'sbr_neg_odd_64_c':
> >>>> libavcodec/sbrdsp.c:47:13: internal compiler error: in
> >>>> vect_analyze_data_ref_accesses, at tree-vect-data-refs.c:2596
> >>>> static void sbr_neg_odd_64_c(float *x)
> >>>> There are various other cases which usually involve inline asm when
> >>>> building with SIMD (ie. --cpu=host) and the optimizer running out of
> >>>> registers, for example:
> >>>> libavcodec/x86/cabac.h:192:5: error: 'asm' operand has impossible
> >> constraints
> >>>> IMHO this feature is not quite ready to be enabled unconditionally in
> >>>> our code base, and we should re-evaluate this change.
> >>> I don't have a problem with reverting this commit, but as James
> >> mentioned I do
> >>> prefer the bug to be reported to GCC if possible.
> >>> Have you also considered the possibility to enable this feature only if
> >> inline
> >>> assembly is not enabled?
> >> Nobody disables inline asm when using GCC, so it'd be the same as
> >> tree
> >> vectorization altogether to begin with.
> >> This feature gives some nice speed boost on parts of the code that don't
> >> have
> >> hand written asm, so I'd very much rather keep it and try to get GCC to
> >> fix bugs
> >> on their compilers.
> > Which codepaths are sped up _specifically_? If there's anything where
> > significant and important, we can hand-write assembly for it.
> > Ronald
> I don't recall exactly which, but i remember it was audio dsp functions
> back when i tested when this feature was introduced.
> None that we can't write for that matter, just that nobody will ever write
> because they are either not really important, or interesting, or easy to
> I also remember Ubitux pointed a boost in some code he was working with
> time ago when he suggested to enable this feature, but that ultimately
> Don't hold this as a valid argument against removing the feature since i
> really care enough to go and bench a bunch of functions all over again to
> proof. But if people want to remove it then it would be better to point
> parts of the code are being miscompiled and ideally a way to reproduce it,
> seeing that FATE hasn't complained ever since we introduced this. It would
> least give us something to report these issues to gcc's bug tracker.
I'm not trying to use it as a reason to remove the feature, rather as a "if
there's interesting speedups in code that matters to people, let's do it
the correct way". I have a half-finished blog post on this subject for
years now, but can't get myself to finish it. The point is: well-written,
hand-written assembly is always faster, usually by a huge margin.
So, if it matters, yes, we should hand-write it. If it doesn't matter...
Well, then ... it doesn't matter anyway, right? (I fully realize this is
not a black/white thing, but again, it's not a reason to remove, rather
trying to pick up on code that could be optimized better.)
More information about the ffmpeg-devel