[FFmpeg-devel] [PATCH] Faster CABAC H.264 residual decoding
Sun Apr 27 15:27:01 CEST 2008
Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
> On Sunday 27 April 2008, Måns Rullgård wrote:
>> Siarhei Siamashka <siarhei.siamashka at gmail.com> writes:
>> > On Sunday 27 April 2008, Måns Rullgård wrote:
>> >> matthieu castet <castet.matthieu at free.fr> writes:
>> >> > Jason Garrett-Glaser wrote:
>> >> >> On the advice of #ffmpeg-devel I have made a version with uint8_t
>> >> >> arrays instead of int.
>> >> >
>> >> > Don't forget that some CPUs (ARM, for example) don't have native 8-bit
>> >> > operations. Everything is done in 32 bits, and 8-bit behavior is
>> >> > emulated with extra operations.
>> >> ARM has byte load and store instructions. All ALU operations are
>> >> 32-bit, except for certain multiplies. I doubt this is a problem
>> >> here.
>> >> The only recent CPU I know of that lacks byte load/store is the first
>> >> generation of the Alpha.
>> > Probably he just wanted to say that reading bytes has higher latency
>> > (+1 cycle extra) than reading ints on at least some ARM cores (ARM9).
>> Where do you find this information? The ARM926 data sheet only
>> mentions the 1-cycle penalty for shifted offsets.
> In DDI0222B_9EJS_r1p2.pdf, section "8.12.1 Interlocks":
> "Unaligned word loads, load byte (LDRB), and load halfword (LDRH)
> instructions use the byte rotate unit in the Write stage of the
> pipeline. This introduces a two-cycle load-use interlock, that can
> affect the two instructions immediately following the load
Thanks. This is still significantly different from the original
claim, since with proper instruction scheduling the extra load latency
should have no effect. It all depends on what other instructions can
be scheduled during the delay, of course.
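
To illustrate the scheduling point, here is a minimal sketch in C. The
function name sum_lookup and its arguments are hypothetical and are not
taken from the patch. It starts the byte load for the next element before
the previously loaded value is consumed, so independent work sits between
each load and its first use. Whether this actually hides the interlock
depends on the compiler and the core.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: sums table[idx[i]] for i < n.
 * The byte load for element i is issued before the value loaded for
 * element i-1 is consumed, leaving room between a load and its use. */
static unsigned sum_lookup(const uint8_t *table, const uint8_t *idx, size_t n)
{
    unsigned sum = 0;
    size_t i;
    uint8_t cur;

    if (n == 0)
        return 0;

    cur = table[idx[0]];              /* first load, issued ahead of the loop */
    for (i = 1; i < n; i++) {
        uint8_t next = table[idx[i]]; /* start the next load early */
        sum += cur;                   /* consume the value loaded one step ago */
        cur = next;
    }
    return sum + cur;                 /* add the last loaded value */
}
```

A compiler with a good scheduler may do this reordering by itself; writing
it out by hand only makes the intent explicit.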
>> > On the other hand, indexing bytes in an array does not require a shifted
>> > offset (which may also introduce some kind of penalty).
>> A left shift by 2 has no penalty on ARMv6.
> Yes, I'm well aware of it. And I'm sorry for nitpicking, but you probably
> wanted to say ARM11?
Yes, that's probably what I meant.
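
For reference, a small sketch of the two addressing cases discussed above.
The tables tab8 and tab32 and the accessors get8 and get32 are hypothetical
names, not from the patch, and the instructions in the comments are what a
compiler would typically emit for ARM, not a guarantee.

```c
#include <stdint.h>

/* Hypothetical tables with identical contents, not from the patch. */
static const uint8_t tab8[4]  = { 1, 2, 3, 4 };
static const int     tab32[4] = { 1, 2, 3, 4 };

/* Byte table: address = tab8 + i, no shift needed.
 * Typically: LDRB r0, [r1, r2] */
static int get8(unsigned i)
{
    return tab8[i];
}

/* Int table: address = tab32 + (i << 2), a shifted register offset.
 * Typically: LDR r0, [r1, r2, LSL #2] */
static int get32(unsigned i)
{
    return tab32[i];
}
```

Both accessors return the same values; only the load width and the offset
scaling differ, which is where the latency trade-off comes from.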
mans at mansr.com