[FFmpeg-devel] [PATCH] Unbreak Altivec PPC optimizations
Wed Dec 24 14:34:15 CET 2008
On Wed, Dec 24, 2008 at 12:40:30PM +0100, Guillaume POIRIER wrote:
> On Wed, Dec 24, 2008 at 11:46 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> > On Wed, Dec 24, 2008 at 10:48:34AM +0100, Guillaume POIRIER wrote:
> >> I'd like to point out that in the future, care should be taken to
> >> _strictly_ avoid adding such dependencies between accelerated
> >> routines, because it's a pain to keep up with such changes for non-x86
> >> maintainers.
> > I'd like to point out that this was unavoidable, and telling me to avoid
> > that next time is just not helping if you cannot even suggest how exactly
> > the last (aka the current) one should have been avoided.
> > (I hope the above doesn't sound too nasty ....)
> No, it doesn't sound nasty. I wrote this just to underline that
> writing optimized routines takes time, so when this effort is rendered
> useless because of some _avoidable_ changes in the C code, then
> I'm not happy.
> If it wasn't avoidable, then all I care about is that people explain what
> has to be done to fix the situation without having to go through the
> time-consuming problem-hunting process.
> You did end up documenting this, so I can't complain.... but you did
> so _after_ you changed the code, and _that_ is what worries me.
It also worries me :)
But the truth is, I simply missed the relation between the idct permutations
when I wrote the code,
and only realized it after mans complained about breakage on arm ...
> > "Avoiding" the issue by simply not introducing the dependency would
> > require adding complexity and overhead to the code that
> > could negate the whole speed gain; besides, it's work that is known to
> > be thrown away if a new 4x4 idct function is introduced that simply
> > must use the same permutations as the already used optimized 4x4 idct
> > function.
> > Thus in the end, unless you suggest that some kinds of optimizations
> > should not be done at all, this really seems unavoidable. But I am
> > surely open to suggestions if you have some ...
> I don't have any suggestion to make regarding this. I trust your
> judgement when you say that there was no better way around this.
> Next time, please directly document what consequences your changes
> have for non-x86 maintainers.
I would have done this, this time as well, had I realized the consequences.
I simply "forgot", or better said had not thought about the fact that the
optimized idcts use transposed input while the C one does not, and that this
couldn't work when some were unavailable while others were not.
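
To make the mismatch concrete, here is a minimal sketch in C. It is not FFmpeg's actual code: the routine names, the 4x4 block size and the two permutation tables are all hypothetical, and the "idct" stand-ins only reorder data so that the layout problem is visible. The decoder stores coefficients through a single shared permutation table, so a routine that expects natural order and one that expects transposed order cannot both be fed correctly from the same table.

```c
#include <assert.h>
#include <string.h>

#define N 4  /* block is N x N, stored row-major (assumption for this sketch) */

/* Hypothetical permutation tables: perm[i] is where coefficient i is stored. */
static void make_identity_perm(unsigned char perm[N * N])
{
    for (int i = 0; i < N * N; i++)
        perm[i] = (unsigned char)i;
}

static void make_transpose_perm(unsigned char perm[N * N])
{
    for (int i = 0; i < N * N; i++)
        perm[i] = (unsigned char)((i % N) * N + i / N);
}

/* Decoder side: scatter coefficients into the block using one shared table. */
static void store_coeffs(short block[N * N], const short coeffs[N * N],
                         const unsigned char perm[N * N])
{
    for (int i = 0; i < N * N; i++) {
        assert(perm[i] < N * N);
        block[perm[i]] = coeffs[i];
    }
}

/* Stand-in for the C routine: expects natural (row-major) order. */
static void consume_plain(short out[N * N], const short block[N * N])
{
    memcpy(out, block, sizeof(short) * N * N);
}

/* Stand-in for an optimized routine: expects transposed input. */
static void consume_transposed(short out[N * N], const short block[N * N])
{
    for (int i = 0; i < N * N; i++)
        out[i] = block[(i % N) * N + i / N];
}

/* Returns 1 if the chosen routine recovers the coefficients in natural order. */
static int roundtrip_ok(int use_transposed_perm, int use_transposed_routine)
{
    unsigned char perm[N * N];
    short coeffs[N * N], block[N * N], out[N * N];

    for (int i = 0; i < N * N; i++)
        coeffs[i] = (short)(i + 1);   /* distinct values expose any reordering */

    if (use_transposed_perm)
        make_transpose_perm(perm);
    else
        make_identity_perm(perm);

    store_coeffs(block, coeffs, perm);

    if (use_transposed_routine)
        consume_transposed(out, block);
    else
        consume_plain(out, block);

    return memcmp(out, coeffs, sizeof(coeffs)) == 0;
}
```

In this sketch, roundtrip_ok() succeeds only when the table and the routine agree on the layout. Mixing a transposed table with the plain routine, or the reverse, hands the routine a transposed block, which is the kind of breakage that appears when only some of the optimized routines are available on a platform.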
> > The truth and real problem IMHO is that large parts of the non-x86 code
> > are not maintained at all, and the solution for this can't be to stop
> > improving code because it might break unmaintained code.
> It's true that there are very few people working on Altivec code, but
> as they say "if it's not broken, don't fix it". The fact that it
> doesn't get constantly changed doesn't mean that nobody uses it, and
> doesn't mean that it doesn't work either.
I did not mean altivec, but all the non-x86 code; altivec is still amongst
the better maintained, just think of alpha, sparc or bfin ...
> Only a very small fraction of our DNA does anything; the rest is all
> comments and ifdefs.
I'd say we only understand the meaning of a small part of our DNA, and
even that only partially.
Speaking of that, does qemu support executing DNA yet?
Or another program?
Now if no program exists to run the code, how are some people so sure
that the parts they don't understand don't do anything ...
Michael
GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus