[FFmpeg-devel] [PATCH] libavformat: Don't return errors if select is interrupted

Martin Storsjö martin
Mon Mar 22 19:56:02 CET 2010


On Fri, 5 Mar 2010, Martin Storsj? wrote:

> On Thu, 4 Mar 2010, Ronald S. Bultje wrote:
> 
> > On Thu, Mar 4, 2010 at 5:23 PM, Martin Storsj? <martin at martin.st> wrote:
> > > Currently, most network protocols wait for data with a select loop, with a
> > > 100 ms timeout and recheck url_interrupt_cb() for each round in the loop.
> > > If the select is interrupted (e.g. by a signal), most of the protocols
> > > return an error, even if a similar interruption in recv is handled by
> > > retrying. The attached patch makes the protocols simply retry select if it
> > > returned with EINTR or EAGAIN, instead of returning an error in these
> > > cases. This makes the interruption behaviour consistent between select and
> > > recv (or send or similar).
> > >
> > > Ronald, does this sound sensible to you?
> > 
> > +        } else if (n < 0) {
> > [..]
> > +            return AVERROR(EIO);
> > 
> > That should be applied separately and is OK.
> 
> Applied this part.
> 
> > For the rest, does it fix an actual bug? Let's start with EAGAIN:
> 
> I can't say that I've experienced this as a bug, but I'd imagine that this 
> would be a nightmare to track down if it actually was the cause of 
> spurious errors.
> 
> > EAGAIN, according to man 2 select:
> >      [EAGAIN]           The kernel was (perhaps temporarily) unable to allo-
> >                         cate the requested number of file descriptors.
> > This is not the same as man 2 recv:
> >      [EAGAIN]           The socket is marked non-blocking, and the receive
> >                         operation would block, or a receive timeout had been
> >                         set, and the timeout expired before data were
> >                         received.
> > 
> > Since we're blocking, what happens here for recv() is that no data was
> > received and the timer expired. For select(), then the return value is
> > 0 (see man 2 recv:)
> > 
> > RETURN VALUES
> > [..] If the time limit expires, select() returns 0. [..]
> > 
> > Because of that, IMO the EAGAIN part is not OK, the difference is intended.
> 
> Ah, yes, EAGAIN isn't relevant for select, my bad. Updated patch attached.
> 
> > As for EINTR, I in fact consider it bad behaviour to continue a loop
> > when interrupted, so I've always frowned upon its use in recv(). My
> > hypothesis is that people use other functions to manually interrupt,
> > in combination with url_interrupt_cb(), and then want the function to
> > return EINTR instead of EIO (default return value), but I'm not sure.
> > Maybe somebody else knows? Tabled, for now... :-).
> 
> Well, the interruption could be anything (any signal delivered to the 
> process, totally unrelated to interrupting/stopping reading from the 
> network). If the caller really wants to stop the reading, that should be 
> done through url_interrupt_cb and not through just any signal being 
> delivered to the process, otherwise this may simply lead to spurious 
> errors.
> 
> On Fri, 5 Mar 2010, Luca Abeni wrote:
> 
> > I think the idea was to let the application cope with EINTR and EAGAIN,
> > but I might be misremembering (and I have no strong opinions on it :)
> 
> Yes, assuming that all the code (between the lowest level protocol 
> receiving the data and the original av_read_frame call from the user 
> application) is ready to handle EINTR/EAGAIN properly, that's probably the 
> best solution.
> 
> One problem is that a very large portion of the code don't pass the 
> original error codes through properly but simply return AVERROR(EIO) if 
> something occurred.
> 
> Another problem is that code reading a few bytes at a time seldom is ready 
> to actually be interrupted at any time, return an error code and then 
> continue properly when called the next time again.
> 
> One (quite randomly chosen) example of this is ff_rtmp_packet_read, in 
> libavformat/rtmppkt.c, lines 72 and onwards. This reads a packet, a few 
> bytes at a time, using url_read and url_read_complete. On errors, it 
> simply returns AVERROR(EIO), so the original EINTR is lost. These problems 
> generally are quite simple to solve.
> 
> But what if the function has passed halfway through, has read part of the 
> packet, but one of the url_read_complete calls fail with EINTR (since the 
> process received some totally unrelated signal at this time)? Modifying 
> this code to save the state (half of the packet already received, reuse 
> that data on next call) so that it can be called again, since the caller 
> didn't want to interrupt it. I guess that would require some kind of 
> rewind-functionality for the URLContext.
> 
> I guess the same goes for all demuxers, too. Are they ready to handle 
> reading functions spuriously returning EINTR/EAGAIN randomly, saving the 
> state halfway through whatever packet was being read to return the EINTR 
> error, so that the caller can try to read again in order to get the full 
> packet that was being received. Or do they return an incomplete packet 
> with as much data as had been received up to that point? As a caller of 
> av_read_frame(), I don't want to receive an incomplete packet just since 
> my process was interrupted by something unrelated while receiving it.

Ping

// Martin



More information about the ffmpeg-devel mailing list