[FFmpeg-devel] [PATCH] use AV_RB16 in cabac refill

Måns Rullgård mans
Fri Mar 26 04:31:44 CET 2010


Alexander Strange <astrange at ithinksw.com> writes:

> On Mar 25, 2010, at 4:08 AM, David Conrad wrote:
>
>> On Mar 25, 2010, at 3:30 AM, Alexander Strange wrote:
>> 
>>> Measured 1 cycle faster decode_cabac_residual on x86-64. Didn't try anywhere else, but I'd be a little interested in what arm does.
>> 
>> It ought to be 2 instructions fewer, and faster. However, both llvm and gcc decide to zero extend from 16 bits twice, and (llvm-)gcc-4.2 decides to load bytestream twice.
>
> Hmm, zero-extending in bswap_16 isn't really surprising, since asm
> operands are always extended to int.

That depends on how the asm is written.

> The only solution there is to write AV_RB16 in asm too.
>
> --disable-asm is remarkably bad, I think it should be using
> (p[0] << 8 | p[1]) instead of __attribute__((packed)) and bswap_16
> when FAST_UNALIGNED isn't defined.

I don't quite understand that.
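
If the comparison is between the two forms below, a rough sketch may help
pin it down. This is only an illustration under assumptions: rb16_shift,
rb16_packed_bswap, load16_native and swap16 are made-up names, not the
actual AV_RB16 or bswap_16 definitions, and the attributes and the
__BYTE_ORDER__ check assume GCC or Clang.

```c
#include <stdint.h>

/* Form 1: portable big-endian 16-bit read built from two byte loads. */
static inline uint16_t rb16_shift(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Form 2: unaligned native 16-bit load through a packed struct,
 * followed by a byte swap on little-endian hosts.
 * may_alias keeps the access through this type legal for the compiler. */
struct unaligned_16 {
    uint16_t v;
} __attribute__((packed, may_alias));

static inline uint16_t load16_native(const void *p)
{
    return ((const struct unaligned_16 *)p)->v;
}

/* Plain C byte swap; a real build might use an asm or builtin version. */
static inline uint16_t swap16(uint16_t x)
{
    return (uint16_t)((x >> 8) | (x << 8));
}

static inline uint16_t rb16_packed_bswap(const uint8_t *p)
{
    uint16_t v = load16_native(p);
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    return v;          /* already in big-endian order */
#else
    return swap16(v);  /* little-endian host: swap the two bytes */
#endif
}
```

Both functions should return the same value for any two readable bytes at
p. Which one compiles to better code without asm depends on the compiler
and target.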

-- 
Måns Rullgård
mans at mansr.com


