[FFmpeg-devel] [flamefest-start] A little something on MMX/SSE intrinsics
Sat Mar 1 13:09:14 CET 2008
On Thu, Feb 28, 2008 at 12:23 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Wed, Feb 27, 2008 at 09:33:09PM +0000, Måns Rullgård wrote:
> > Michael Niedermayer <michaelni at gmx.at> writes:
> > > On Wed, Feb 27, 2008 at 03:29:56PM -0500, Alexander Strange wrote:
> > >> I don't think anyone can get Altivec asm to work better than
> > >> intrinsics on more than one CPU - PPC is really, really
> > >> scheduling-sensitive, especially the G5 and Cell.
> > >
> > > Until I see benchmarks I'd guess gcc+intrinsics will be slower than
> > > unscheduled, naively written asm()
> > That depends on the CPU. Some CPUs are quite particular about
> > instruction scheduling.
> That is true but can gcc schedule instructions properly on these cpus?
> Also the real question is can gcc beat a human in instruction scheduling ;)
Actually I'd like to get a little bit more background info on this topic.
Why are PPCs so scheduling-sensitive?
Usually you write instructions with as much parallelism as possible
and the CPU is expected to execute as many instructions as it can.
Of course there are architectures like VLIW where you have a fixed,
predetermined number of instructions that can be grouped together.
Are PPCs one of them?
I just want the summary, without having to read 5-6 optimization manuals.
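
To make "as much parallelism as possible" concrete, here is a rough
sketch in plain C of what I mean. The function names are made up for
illustration and are not from FFmpeg. The first loop is one long
dependency chain; the second splits the work into four independent
chains that a CPU (or a compiler's scheduler) is free to interleave.
How much this helps on a given core is exactly the kind of thing I am
asking about, so treat it as an illustration, not a benchmark.

```c
#include <stddef.h>

/* Single accumulator: each add needs the result of the previous one,
   so the adds form one serial dependency chain. */
float sum_serial(const float *x, size_t n)
{
    float s = 0.0f;
    for (size_t i = 0; i < n; i++)
        s += x[i];
    return s;
}

/* Four accumulators: the four adds in the loop body do not depend on
   each other, so they can be issued back to back or in parallel. */
float sum_unrolled(const float *x, size_t n)
{
    float s0 = 0.0f, s1 = 0.0f, s2 = 0.0f, s3 = 0.0f;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += x[i];
        s1 += x[i + 1];
        s2 += x[i + 2];
        s3 += x[i + 3];
    }
    /* Leftover elements when n is not a multiple of 4. */
    for (; i < n; i++)
        s0 += x[i];

    return (s0 + s1) + (s2 + s3);
}
```

Note that the second version adds the floats in a different order, so
in general the result can differ in the last bits from the first one.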