[FFmpeg-devel] [PATCH] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 bilinear functions
Shivraj Patil
Shivraj.Patil at imgtec.com
Mon Jul 27 16:11:58 CEST 2015
Hi,
On Mon, Jul 27, 2015 at 7:59 AM, <shivraj.patil at imgtec.com> wrote:
From: Shivraj Patil <shivraj.patil at imgtec.com>
Signed-off-by: Shivraj Patil <shivraj.patil at imgtec.com>
---
libavcodec/mips/vp9_mc_msa.c | 2123 ++++++++++++++++++++++++++++++++++++
libavcodec/mips/vp9dsp_init_mips.c | 2 +
libavcodec/mips/vp9dsp_mips.h | 32 +
3 files changed, 2157 insertions(+)
[..]
+void ff_avg_bilin_4h_msa(uint8_t *dst, ptrdiff_t dst_stride,
+                         const uint8_t *src, ptrdiff_t src_stride,
+                         int height, int mx, int my)
+{
+    const int8_t *filter = vp9_bilinear_filters_msa[mx - 1];
+
+    if (4 == height) {
+        common_hz_2t_and_aver_dst_4x4_msa(src, src_stride, dst, dst_stride,
+                                          filter);
+    } else if (8 == height) {
+        common_hz_2t_and_aver_dst_4x8_msa(src, src_stride, dst, dst_stride,
+                                          filter);
+    }
+}
You're using this construct in various places, how much does it help?
(Otherwise no comments, basically lgtm % the above.)
Shivraj:- For the height-8 case, the dedicated 4x8 function helps reduce stalls (perf gain ~20%) compared to calling the height-4 function twice.
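
To make the two alternatives concrete, here is a minimal scalar sketch in plain C, not the MSA code from the patch. All names (hz_2t_avg_4xh, avg_4x8_two_calls, avg_4x8_single) are hypothetical, and the assumption that the two filter taps sum to 128 with a rounding shift of 7 is mine, not something stated in this thread. The sketch only shows that both approaches produce the same pixels; the ~20% gain quoted above would come from how a hand-written 4x8 kernel schedules its vector loads and multiplies across all eight rows, which scalar code like this does not reproduce.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar stand-in for the MSA kernels: apply a 2-tap
 * horizontal filter to a 4-pixel-wide block, then take the rounding
 * average with what is already in dst.
 * Assumption: filter[0] + filter[1] == 128, hence the (+64) >> 7. */
static void hz_2t_avg_4xh(const uint8_t *src, ptrdiff_t src_stride,
                          uint8_t *dst, ptrdiff_t dst_stride,
                          const int8_t *filter, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < 4; x++) {
            int v = (src[x] * filter[0] + src[x + 1] * filter[1] + 64) >> 7;
            dst[x] = (uint8_t)((dst[x] + v + 1) >> 1);
        }
        src += src_stride;
        dst += dst_stride;
    }
}

/* Alternative 1: handle a 4x8 block by running the 4-row kernel twice. */
static void avg_4x8_two_calls(const uint8_t *src, ptrdiff_t src_stride,
                              uint8_t *dst, ptrdiff_t dst_stride,
                              const int8_t *filter)
{
    hz_2t_avg_4xh(src, src_stride, dst, dst_stride, filter, 4);
    hz_2t_avg_4xh(src + 4 * src_stride, src_stride,
                  dst + 4 * dst_stride, dst_stride, filter, 4);
}

/* Alternative 2: one dedicated pass over all 8 rows. In a SIMD
 * implementation this is where independent row operations can be
 * interleaved to hide instruction latency. */
static void avg_4x8_single(const uint8_t *src, ptrdiff_t src_stride,
                           uint8_t *dst, ptrdiff_t dst_stride,
                           const int8_t *filter)
{
    hz_2t_avg_4xh(src, src_stride, dst, dst_stride, filter, 8);
}
```

Note that each source row must provide at least 5 readable bytes, since the 2-tap filter reads src[x + 1] for x up to 3.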