<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html style="direction: ltr;">
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1"><style>body
p { margin-bottom: 0cm; margin-top: 0pt; } </style>
</head>
<body style="direction: ltr;"
bidimailui-detected-decoding-type="latin-charset" bgcolor="#ffffff"
text="#000000">
<tt>I have a camera supplying an H.264 stream whose SPS/PPS claims
that it needs 4 reference frames, which makes the decoder's output lag
the input by 4 frames. At 5 fps this is almost 1 second (0.8 s) and is
very noticeable. It is undesired for this app: in many cases the
security person watching the stream can also see the events
happening out the window, so a delay of nearly a second is confusing
and seems "broken".</tt><br>
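<br>
<tt>To make the arithmetic concrete, here is a tiny sketch. The helper
name is mine (it is not an ffmpeg function), and it assumes a constant
frame rate:</tt><br>
<pre>
```c
/* Hypothetical helper, not part of ffmpeg: delay in milliseconds caused by
 * holding back `frames` pictures at a constant rate of `fps` frames/second.
 * Returns -1 for invalid input. */
int reorder_delay_ms(int frames, int fps)
{
    if (fps <= 0 || frames < 0)
        return -1;
    return frames * 1000 / fps;  /* 4 frames at 5 fps gives 800 ms */
}
```
</pre>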
<br>
<tt>However, when I look at the camera specs, and at the streams
themselves, they only ever contain I pictures and P pictures
(never B pictures). To my (limited) understanding of the H.264
standard, that suggests that it is possible to fully decode and
display every image as it arrives, as they will never be out of
order.</tt><br>
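<br>
<tt>What I mean by "never out of order" can be sketched as a check on
the picture order counts (POC) in arrival order. This is only an
illustration: the function is hypothetical, not something in ffmpeg,
and it assumes all pictures belong to one coded video sequence (POC
restarts at an IDR picture):</tt><br>
<pre>
```c
/* Hypothetical check, not an ffmpeg function. poc[] holds the picture order
 * count of each picture in decode (arrival) order, within one coded video
 * sequence. Returns 1 if some picture has to be displayed before a picture
 * that was decoded earlier (so the decoder must buffer), 0 otherwise. */
int needs_reordering(const int *poc, int n)
{
    int i;
    for (i = 1; i < n; i++) {
        if (poc[i] < poc[i - 1])
            return 1;
    }
    return 0;
}
```
</pre>
<tt>For an I P P P stream the counts only go up (e.g. 0 2 4 6), so
nothing has to wait. For I P B B in decode order (e.g. 0 6 2 4) the
check fires, which is the case the decoder delay exists for.</tt><br>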
<br>
<tt>Alex Cohn suggested in
<a class="moz-txt-link-rfc2396E" href="http://www.mail-archive.com/libav-user@ffmpeg.org/msg00590.html">&lt;http://www.mail-archive.com/libav-user@ffmpeg.org/msg00590.html&gt;</a>
to modify ff_h264_get_profile(). I've done this as a
test, and it seems to work for the streams that I have. I've also
tried patching the SPS/PPS manually, which solves the problem too. But
I would rather not rely on either of these patches (the first is
ugly and requires me to rebuild ffmpeg myself all the time; the
second is just plain ugly and error-prone).</tt><br>
<br>
<tt>Questions:</tt><br>
<br>
<tt>1. Does the (known from specs) fact that the input stream will
only ever contain I and P pictures really guarantee that it is safe to
decode and display every picture as it arrives? Or have I just been
lucky in my test streams, and reordering might still be needed at some
point?</tt><br>
<br>
<tt>2. If (1) is correct, would it work to patch
ff_h264_decode_init() and decode_postinit() to also check the
AV_DISCARD mode and, if we are discarding B and/or nonref frames,
set avctx-&gt;has_b_frames=0 and low_delay=1?</tt><br>
<tt>Something like that would make it possible to convert _every_
stream to a low-delay stream by dropping the "non-low-delay"
frames. For me, it would solve the problem (there are no B frames,
so I wouldn't even lose anything by discarding B frames), but it
would also be useful for e.g. seek functionality in a media player
- if fast-forwarding by showing only I-frames, you would not need
to read and discard 3 more frames to show an I frame you have just
read.</tt><br>
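<br>
<tt>Roughly, the logic I have in mind for (2) looks like the sketch
below. It is not a patch against the real code: the enum, the struct and
the function are stand-ins I made up to model the discard setting,
has_b_frames and low_delay. It also only takes the shortcut when B
frames are discarded, since I am less sure about the nonref-only
case:</tt><br>
<pre>
```c
/* Hypothetical stand-ins for the libavcodec types; all names are mine. */
enum discard_level { DISCARD_NONE, DISCARD_NONREF, DISCARD_BIDIR };

struct decoder_state {
    enum discard_level skip_frame; /* models the caller's discard setting */
    int has_b_frames;              /* models avctx->has_b_frames */
    int low_delay;                 /* models the decoder's low_delay flag */
};

/* stream_delay_frames: the output delay the stream headers ask for. */
void init_delay(struct decoder_state *st, int stream_delay_frames)
{
    if (st->skip_frame >= DISCARD_BIDIR) {
        /* B frames are never output, so (assuming the answer to
         * question 1 is yes) nothing needs to be held back. */
        st->has_b_frames = 0;
        st->low_delay = 1;
    } else {
        /* Otherwise keep whatever the stream headers request. */
        st->has_b_frames = stream_delay_frames;
        st->low_delay = (stream_delay_frames == 0);
    }
}
```
</pre>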
<br>
<tt>Thanks in advance for your time and thoughts,</tt><br>
<tt>Camera Man</tt><br>
</body>
</html>