[FFmpeg-devel] h264 threading fate tests

Clément Bœsch ubitux at gmail.com
Thu Oct 4 11:09:19 CEST 2012


On Mon, Oct 01, 2012 at 11:50:19AM +0200, Clément Bœsch wrote:
> On Fri, Sep 28, 2012 at 11:07:54AM +0200, Clément Bœsch wrote:
> > On Mon, Apr 16, 2012 at 04:43:42PM +0200, Clément Bœsch wrote:
> > > Hi,
> > > 
> > > I recently set up a few fate instances to test the threading (2, 8, 16 and
> > > auto), and regularly one of the h264 conformance test fails; look at the
> > > yellow entries here for instance:
> > > http://fate.ffmpeg.org/history.cgi?slot=x86_64-archlinux-gcc-threads-8
> > > 
> > > The other day, I tried to run an automated git bisect run, but
> > > unfortunately testing for the potential regression would require running
> > > something like "while true; do make fate-h264 -j20 THREADS=8; done" for
> > > at least 15 minutes each time, and it might not even be reliable.
> > > 
> > > I'm not familiar at all with AVC decoding or threading in FFmpeg, but
> > > maybe someone has an idea of what could cause this?
> > > 
> > > Maybe I should just open an issue in the trac?
> > 
> > OK so here is some more information about that race. Since the failure
> > seems to be always the same (failing frame CRCs are identical), I dumped
> > both outputs and made a diff. Here it is:
> > 
> >     http://lucy.pkh.me/race.html
> > 
> > Basically, some bytes differ by ±1 at times. The generated outputs can be
> > downloaded from that page.
> > 
> > Also, I managed the other day (purely by luck) to get a valgrind
> > backtrace while triggering the problem. The output was flooded, but here
> > is a sample: http://pastie.org/4602183. BTW, it seems the test is using
> > frame-based threading.
> > 
> > BTW, we observe only two failing tests: mostly
> > h264-conformance-cama2_vtc_b, but at times h264-conformance-mr1_bt_a
> > fails as well. The information I gave above is related to
> > h264-conformance-cama2_vtc_b only.
> > 
> > Is anyone still willing to look into this? :(
> > 
> 
> More digging info: it seems the problem (or at least part of it) is around
> H264Context->ref_count[2] (frame-based threading), especially when copying it
> from the priv_data in lavc/h264.c:decode_update_thread_context():
> 
[...]

OK, so thanks to Ronald and various tools, we will soon have the ref_count
issue fixed, along with the CABAC table init fix I pushed recently.

Some problems are still detected, though, and they seem related to
unprotected write accesses to p->state, p being the PerThreadContext.
AFAICT, it is protected by the progress mutex most of the time, but not
all the time (something looks fishy in pthread.c:submit_packet, for
example).

So my question is: what protection does p->state actually need? Both p->mutex
and p->progress_mutex? I thought progress_mutex would be the only one (that's
how I fixed one of the races in 55ed91c8565a3c562d2982e1cd5e66df06c6c190), but
Helgrind complains here as well:

        pthread_mutex_lock(&p->progress_mutex);
        for (i = 0; i < MAX_BUFFERS; i++)
            if (p->progress_used[i] && (p->got_frame || p->result<0 || avctx->codec_id != AV_CODEC_ID_H264)) {
                p->progress[i][0] = INT_MAX;
                p->progress[i][1] = INT_MAX;
            }
        p->state = STATE_INPUT_READY;

This is the one that is raised all the time (not only with the cama2_vtc_b
test), and which I had ignored until now.
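
To make the question concrete, here is a minimal sketch of the discipline a
race detector such as Helgrind checks for: every read and every write of a
shared state field goes through one and the same mutex. This is not FFmpeg
code. The Worker struct and the worker_set_state(), worker_wait_state() and
run_demo() helpers are hypothetical names made up for illustration, and only
the two state names are borrowed from the thread. If any other code path
touches the field under a different lock (or none), the write in the snippet
above would presumably still be reported even though it holds a mutex itself.

```c
#include <pthread.h>

/* Hypothetical illustration only; this is not FFmpeg's PerThreadContext. */
enum WorkerState { STATE_INPUT_READY, STATE_SETTING_UP };

typedef struct Worker {
    pthread_mutex_t lock;    /* the single mutex that guards `state` */
    pthread_cond_t  cond;    /* signalled whenever `state` changes */
    enum WorkerState state;
} Worker;

/* Every write to w->state takes w->lock, with no exceptions. */
static void worker_set_state(Worker *w, enum WorkerState s)
{
    pthread_mutex_lock(&w->lock);
    w->state = s;
    pthread_cond_broadcast(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Every read takes the same lock; the loop copes with spurious wakeups. */
static void worker_wait_state(Worker *w, enum WorkerState s)
{
    pthread_mutex_lock(&w->lock);
    while (w->state != s)
        pthread_cond_wait(&w->cond, &w->lock);
    pthread_mutex_unlock(&w->lock);
}

/* Worker thread: would do its work, then publish the new state. */
static void *worker_main(void *arg)
{
    Worker *w = arg;
    worker_set_state(w, STATE_INPUT_READY);
    return NULL;
}

/* Starts one worker, waits until it is ready, and returns the final state
 * (or -1 if the thread could not be created). */
int run_demo(void)
{
    Worker w;
    pthread_t tid;
    int final_state = -1;

    pthread_mutex_init(&w.lock, NULL);
    pthread_cond_init(&w.cond, NULL);
    w.state = STATE_SETTING_UP;

    if (pthread_create(&tid, NULL, worker_main, &w) == 0) {
        worker_wait_state(&w, STATE_INPUT_READY);
        pthread_join(tid, NULL);
        final_state = w.state;  /* safe: the worker has been joined */
    }

    pthread_cond_destroy(&w.cond);
    pthread_mutex_destroy(&w.lock);
    return final_state;
}
```

Calling run_demo() from a main() and building with -pthread should give a
program that Helgrind has nothing to say about, since no access to the state
field bypasses the lock. Whether the real p->state should sit under p->mutex,
p->progress_mutex, or both is exactly the open question; the sketch only
shows that whichever is chosen has to be used at every access.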

Anyway, this is not the only problem detected; as I said, sometimes another
problem is triggered around this p->state (I need to check again if it was
indeed in submit_packet).

-- 
Clément B.