[FFmpeg-devel] [PATCHv2] avutil/lls: speed up performance of solve_lls

Ganesh Ajjanagadde gajjanag at mit.edu
Thu Nov 26 15:22:25 CET 2015


On Wed, Nov 25, 2015 at 6:29 AM, Michael Niedermayer <michaelni at gmx.at> wrote:
> On Tue, Nov 24, 2015 at 10:13:22PM -0500, Ganesh Ajjanagadde wrote:
>> This is a trivial rewrite of the loops that results in better
>> prefetching and associated cache efficiency. Essentially, the problem is
>> that modern prefetching logic is based on finite-state Markov memory, a
>> reasonable assumption that is also used elsewhere in CPUs, for instance
>> in branch predictors.
>>
>> Surrounding loops all iterate forward through the array, so the
>> prefetcher learns to fetch in the forward direction, but the
>> intermediate loop unnecessarily runs in the backward direction.
>>
>> Speedup is nontrivial, roughly 3-7% fewer decicycles in the runs below.
>> Benchmarks obtained by 10^6 iterations within solve_lls, with
>> START/STOP_TIMER. File is tests/data/fate/flac-16-lpc-cholesky.err.
>> Hardware: x86-64, Haswell, GNU/Linux.
>>
>> new:
>>   17291 decicycles in solve_lls, 2096706 runs,    446 skips
>>   17255 decicycles in solve_lls, 4193657 runs,    647 skips
>>   17231 decicycles in solve_lls, 8384997 runs,   3611 skips
>>   17189 decicycles in solve_lls,16771010 runs,   6206 skips
>>   17132 decicycles in solve_lls,33544757 runs,   9675 skips
>>   17092 decicycles in solve_lls,67092404 runs,  16460 skips
>>   17058 decicycles in solve_lls,134188213 runs,  29515 skips
>>
>> old:
>>   18009 decicycles in solve_lls, 2096665 runs,    487 skips
>>   17805 decicycles in solve_lls, 4193320 runs,    984 skips
>>   17779 decicycles in solve_lls, 8386855 runs,   1753 skips
>>   18289 decicycles in solve_lls,16774280 runs,   2936 skips
>>   18158 decicycles in solve_lls,33548104 runs,   6328 skips
>>   18420 decicycles in solve_lls,67091793 runs,  17071 skips
>>   18310 decicycles in solve_lls,134187219 runs,  30509 skips
>>
>> Reviewed-by: Michael Niedermayer <michael at niedermayer.cc>
>> Signed-off-by: Ganesh Ajjanagadde <gajjanagadde at gmail.com>
>> ---
>>  libavutil/lls.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> LGTM
>
> thx

pushed, thanks
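
For anyone reading this without the diff at hand: the message above does
not quote the changed lines, so the following is only a minimal sketch of
the kind of loop reversal the commit message describes, not the code in
libavutil/lls.c. The function names, parameters, and the row layout are
hypothetical and chosen purely for illustration. Both helpers subtract the
same products from a running sum; they differ only in the direction in
which the rows are walked.

```c
/* Hypothetical sketch, not the actual libavutil/lls.c code.
 * Subtract the dot product of the first n entries of two rows from
 * `start`, walking the rows from the last index down to the first. */
static double residual_backward(double start, const double *row_i,
                                const double *row_j, int n)
{
    double sum = start;
    for (int k = n - 1; k >= 0; k--)
        sum -= row_i[k] * row_j[k];
    return sum;
}

/* Same computation, but walking the rows from the first index up,
 * matching the direction of forward-iterating surrounding loops. */
static double residual_forward(double start, const double *row_i,
                               const double *row_j, int n)
{
    double sum = start;
    for (int k = 0; k < n; k++)
        sum -= row_i[k] * row_j[k];
    return sum;
}
```

One caveat with this kind of change: floating-point subtraction is not
associative, so the two directions may differ in the last bits for general
inputs, even though they agree exactly on small integer-valued data.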
