[FFmpeg-devel] [PATCH] Allow mpjpeg demuxer to process MIME parts which do not include Content-Length header.
Alex Agranovsky
alex at sighthound.com
Tue Nov 24 21:01:28 CET 2015
--
Alex Agranovsky
Sighthound, Inc
www.sighthound.com
On November 24, 2015 at 12:36:30 PM, wm4 (nfxjfg at googlemail.com) wrote:
On Tue, 24 Nov 2015 11:39:07 -0500
Alex Agranovsky <alex at sighthound.com> wrote:
> On November 24, 2015 at 10:32:47 AM, wm4 (nfxjfg at googlemail.com) wrote:
>
> On Tue, 24 Nov 2015 00:16:06 -0500
> Alex Agranovsky <alex at sighthound.com> wrote:
>
> > From a2a0b9e0da14b6e82aa783535ec1878c8ffbede0 Mon Sep 17 00:00:00 2001
> > From: Alex Agranovsky <alex at sighthound.com>
> > Date: Tue, 24 Nov 2015 00:06:14 -0500
> > Subject: [PATCH 1/2] Allow mpjpeg demuxer to process MIME parts which do not
> > include Content-Length header.
> >
> > Fixes ticket 5023
> >
> > Signed-off-by: Alex Agranovsky <alex at sighthound.com>
> > ---
> > libavformat/mpjpegdec.c | 176 ++++++++++++++++++++++++++++++++++++------------
> > 1 file changed, 133 insertions(+), 43 deletions(-)
> >
> > diff --git a/libavformat/mpjpegdec.c b/libavformat/mpjpegdec.c
> > index 2749a48..e494f1a 100644
> > --- a/libavformat/mpjpegdec.c
> > +++ b/libavformat/mpjpegdec.c
> > @@ -23,22 +23,16 @@
> >
> > #include "avformat.h"
> > #include "internal.h"
> > +#include "avio_internal.h"
> >
> > -static int get_line(AVIOContext *pb, char *line, int line_size)
> > -{
> > - int i = ff_get_line(pb, line, line_size);
> >
> > - if (i > 1 && line[i - 2] == '\r')
> > - line[i - 2] = '\0';
> >
> > - if (pb->error)
> > - return pb->error;
> > -
> > - if (pb->eof_reached)
> > - return AVERROR_EOF;
> > -
> > - return 0;
> > -}
> > +/** Context for demuxing an MPJPEG stream. */
>
> This comment is really not needed.
> Will be removed in follow up submission.
>
>
>
> > +typedef struct MPJPEGDemuxContext {
> > + char* boundary;
> > + char* searchstr;
> > + int searchstr_len;
> > +} MPJPEGDemuxContext;
> >
> >
> > static void trim_right(char* p)
> > @@ -46,13 +40,32 @@ static void trim_right(char* p)
> > char *end;
> > if (!p || !*p)
> > return;
> > - end = p + strlen(p) - 1;
> > - while (end != p && av_isspace(*end)) {
> > + int len = strlen(p);
> > + if ( !len )
> > + return;
> > + end = p + len - 1;
> > + while (end >= p && av_isspace(*end)) {
>
> Why this change? Note that strlen(p)>0 is already guaranteed by the
> "!*p" check before.
>
>
> Consider input of a single ‘\r’. It will never get trimmed with the old code.
I don't really see how most of the changes fix this case. The only
change that does anything is replacing the != operator with >= . Which
is invalid as I see just now: it would set end to p-1, which AFAIK is
undefined behavior.
Setting end to p-1 would force us to exit the loop, no? We’re just comparing two pointer values, without dereferencing them.
I do see that length check is redundant with !*p, so removing that.
> > *end = '\0';
> > end--;
> > }
> > }
> >
> > +static int get_line(AVIOContext *pb, char *line, int line_size)
> > +{
> > + ff_get_line(pb, line, line_size);
> > +
> > + if (pb->error)
> > + return pb->error;
> > +
> > + if (pb->eof_reached)
> > + return AVERROR_EOF;
> > +
> > + trim_right(line);
> > + return 0;
> > +}
> > +
> > +
> > +
> > static int split_tag_value(char **tag, char **value, char *line)
> > {
> > char *p = line;
> > @@ -86,12 +99,24 @@ static int split_tag_value(char **tag, char **value, char *line)
> > return 0;
> > }
> >
> > -static int parse_multipart_header(AVIOContext *pb, void *log_ctx);
> > +static int parse_multipart_header(AVIOContext *pb,
> > + int* size,
> > + const char* expected_boundary,
> > + void *log_ctx);
> > +
> > +static int mpjpeg_read_close(AVFormatContext *s)
> > +{
> > + MPJPEGDemuxContext *mpjpeg = s->priv_data;
> > + av_freep(&mpjpeg->boundary);
> > + av_freep(&mpjpeg->searchstr);
> > + return 0;
> > +}
> >
> > static int mpjpeg_read_probe(AVProbeData *p)
> > {
> > AVIOContext *pb;
> > int ret = 0;
> > + int size = 0;
> >
> > if (p->buf_size < 2 || p->buf[0] != '-' || p->buf[1] != '-')
> > return 0;
> > @@ -100,7 +125,7 @@ static int mpjpeg_read_probe(AVProbeData *p)
> > if (!pb)
> > return 0;
> >
> > - ret = (parse_multipart_header(pb, NULL)>0)?AVPROBE_SCORE_MAX:0;
> > + ret = ( parse_multipart_header(pb, &size, "--", NULL) > 0 ) ? AVPROBE_SCORE_MAX : 0;
>
> A stray space got in.
> Parameters to parse_multipart_header had changed.
Yes, but the 2 additional spaces after the ( and before the ) are not
consistent with the rest of the coding style. (Though I admit the
FFmpeg code has some consistency problems, e.g. the line you changed
was missing some spaces.)
While these cosmetic issues are not that important, I think at least
some effort should be put into not making it look worse.
Agreed. It takes effort, though, to switch to a specific project’s style, so I apologize for having to go through that. I’ll try to be more conscious of specific style nuances in the future.
>
>
>
>
>
> >
> > av_free(pb);
> >
> > @@ -110,17 +135,19 @@ static int mpjpeg_read_probe(AVProbeData *p)
> > static int mpjpeg_read_header(AVFormatContext *s)
> > {
> > AVStream *st;
> > - char boundary[70 + 2 + 1];
> > + char boundary[70 + 2 + 1] = {0};
> > int64_t pos = avio_tell(s->pb);
> > int ret;
> >
> > + do {
> > + ret = get_line(s->pb, boundary, sizeof(boundary));
> > + if (ret < 0)
> > + return ret;
> > + } while (!boundary[0]);
> >
> > - ret = get_line(s->pb, boundary, sizeof(boundary));
> > - if (ret < 0)
> > - return ret;
> > -
>
> Can you explain why it suddenly has to skip multiple lines?
> MIME standard, as old as it is, is poorly implemented by many camera manufacturers. In the last few weeks, I have seen things ranging from not sending a proper boundary, to not including a CRLF after a body part, to including multiples. When we process the headers, it is assumed body part had been consumed, and we need to get to the next non-empty lines. It is solving the same problem as 18f9308e6a96bbeb034ee5213a6d41e0b6c2ae74, just the other manifestation of it.
>
OK.
>
>
>
> > - if (strncmp(boundary, "--", 2))
> > + if (strncmp(boundary, "--", 2)) {
> > return AVERROR_INVALIDDATA;
> > + }
>
> Another change that looks like a stray change.
> Will remove in follow-up submission.
>
>
>
> >
> > st = avformat_new_stream(s, NULL);
> > if (!st)
> > @@ -147,28 +174,44 @@ static int parse_content_length(const char *value)
> > return val;
> > }
> >
> > -static int parse_multipart_header(AVIOContext *pb, void *log_ctx)
> > +static int parse_multipart_header(AVIOContext *pb,
> > + int* size,
> > + const char* expected_boundary,
> > + void *log_ctx)
> > {
> > char line[128];
> > int found_content_type = 0;
> > - int ret, size = -1;
> > + int ret;
> > +
> > + *size = -1;
> >
> > // get the CRLF as empty string
> > ret = get_line(pb, line, sizeof(line));
>
> > - if (ret < 0)
> > + if (ret < 0) {
> > return ret;
> > + }
>
> Another stray change?
> Will remove in follow-up submission.
>
>
>
> >
> > /* some implementation do not provide the required
> > * initial CRLF (see rfc1341 7.2.1)
> > */
> > - if (!line[0]) {
> > + while (!line[0]) {
> > ret = get_line(pb, line, sizeof(line));
> > - if (ret < 0)
> > + if (ret < 0) {
> > return ret;
> > + }
> > }
>
> Again, why does it have to skip multiple lines now? The comment
> indicates that skipping 1 line (instead of nothing) is a workaround for
> some buggy servers.
>
>
>
> Please see above.
>
> >
> > - if (strncmp(line, "--", 2))
> > + if ( strncmp(line, expected_boundary, strlen(expected_boundary) ) ) {
>
> (Maybe could use av_strstart. Also, weird spacing.)
> Going to replace with av_strstart
>
>
>
>
>
> > + if (log_ctx) {
> > + av_log(log_ctx,
> > + AV_LOG_ERROR,
> > + "Expected boundary '%s' not found, instead found a line of %lu bytes\n",
> > + expected_boundary,
> > + strlen(line) );
> > + }
> > +
> > return AVERROR_INVALIDDATA;
> > + }
> >
> > while (!pb->eof_reached) {
> > char *tag, *value;
> > @@ -201,32 +244,77 @@ static int parse_multipart_header(AVIOContext *pb, void *log_ctx)
> > } else
> > found_content_type = 1;
> > } else if (!av_strcasecmp(tag, "Content-Length")) {
> > - size = parse_content_length(value);
> > - if (size < 0)
> > - return size;
> > + *size = parse_content_length(value);
> > }
> > }
> >
> > - if (!found_content_type || size < 0) {
> > - return AVERROR_INVALIDDATA;
> > - }
> > -
> > - return size;
> > + return (found_content_type) ? 0 : AVERROR_INVALIDDATA;
> > }
> >
> > +
> > static int mpjpeg_read_packet(AVFormatContext *s, AVPacket *pkt)
> > {
> > - int ret;
> > - int size = parse_multipart_header(s->pb, s);
> > + int size;
> > + MPJPEGDemuxContext *mpjpeg = s->priv_data;
> > + if ( mpjpeg->boundary == NULL ) {
> > + mpjpeg->boundary = av_strdup("--");
> > + mpjpeg->searchstr = av_strdup("\r\n--");
> > + mpjpeg->searchstr_len = strlen( mpjpeg->searchstr );
>
> Missing checks for memory allocation failure.
>
> Also, this is the only place where these fields are set. And they're
> set to constant strings. I see no reason why these fields exist, and
> they certainly don't need to be strdup'ed?
>
> My understanding of this MIME part encoding is very limited, but I
> thought you need to read the boundary as it appears in the header, and
> use that to find the end of the data? Was this forgotten?
>
>
> Let me elaborate on this.
> 1) You are correct — boundary needs to be compared to that provided in HTTP response’s Content-Type.
> However, current code doesn’t compare it (only using the ‘—‘ as the boundary check), so I have kept things as is for this patch.
> 2) That functionality ready to be submitted as a separate patch, once this one is cleared.
Oh I see. Please send the second patch as well then. (It's common to
send patch series which depend on another at once.)
Attaching 2nd patch.
> 3) As I’ve mentioned, some implementations fail to send correct boundaries, so even that future patch will have a compile-time
> flag to enable it, as it will cause regression for some cases.
It should try to deal with all cases at once, and if it's not possible,
it should be a runtime option (look for AVOption use in other
demuxers), not a compile time one.
Thanks for the tip, implemented using AVOption.
> 4) strdup is done to be consistent with the future patch (where we have to alloc the memory)
> 5) Sorry about the checks, those will be added in the follow up patch
>
> > + }
> > +
> > + int ret = parse_multipart_header(s->pb,
> > + &size,
> > + mpjpeg->boundary,
> > + s);
>
> Declaration after statements; apparently we don't like this, and it's
> preferred to put all declarations on top of each block.
>
>
> Correcting.
>
> >
> > - if (size < 0)
> > - return size;
> >
> > - ret = av_get_packet(s->pb, pkt, size);
> > if (ret < 0)
> > return ret;
> >
> > - return 0;
> > + if ( size > 0 ) {
> > + /* size has been provided to us in MIME header */
> > + ret = av_get_packet(s->pb, pkt, size);
> > + } else {
> > + /* no size was given -- we read until the next boundary or end-of-file */
> > +
> > + const int read_chunk = 2048;
> > + av_init_packet(pkt);
> > + pkt->data = NULL;
> > + pkt->size = 0;
> > + pkt->pos = avio_tell(s->pb);
> > +
> > + /* we may need to return as much as all we've read back to the buffer */
> > + ffio_ensure_seekback( s->pb, read_chunk );
> > +
> > +
> > + int remaining = 0, len;
> > +
> > + while ( ( ret = av_append_packet( s->pb, pkt, read_chunk - remaining ) ) >= 0 ) {
> > + /* scan the new data */
> > + len = ret + remaining;
> > + char* start = pkt->data + pkt->size - len;
> > + do {
> > + if ( !memcmp( start, mpjpeg->searchstr, mpjpeg->searchstr_len ) ) {
>
> How is it guaranteed that start is valid for at least searchstr_len
> bytes?
>
>
> with } while ( len >= mpjpeg->searchstr_len );
>
> len decrement occurs for each start increment
>
>
> > + // got the boundary! rewind the stream
> > + avio_seek( s->pb, -(len-2), SEEK_CUR );
> > + pkt->size -= (len-2);
> > + return pkt->size;
> > + }
> > + len--;
> > + start++;
> > + } while ( len >= mpjpeg->searchstr_len );
>
> Couldn't this use strstr? (Just a thought. Would assume that the
> boundary itself ca not contain 0 bytes.)
>
>
> We’re operation on binary data, so can’t use string functions.
>
> > + remaining = len;
> > + }
> > +
> > + /* error or EOF occurred */
> > + if ( ret == AVERROR_EOF ) {
> > + ret = ( pkt->size > 0 ) ? pkt->size : AVERROR_EOF;
>
> Unneeded ( ).
> Correcting.
>
>
>
> > + } else {
> > + av_packet_unref(pkt);
> > + }
> > + }
> > +
> > + return ret;
> > }
>
> The spacing in this whole function is a bit weird and inconsistent with
> the rest of the code, I think.
>
>
> Correcting some (mostly spaces after ‘(' and before ‘)' … let me know if anything else has to be done)
>
I think tools/patcheck checks for some of these, but not sure.
> >
> > AVInputFormat ff_mpjpeg_demuxer = {
> > @@ -234,7 +322,9 @@ AVInputFormat ff_mpjpeg_demuxer = {
> > .long_name = NULL_IF_CONFIG_SMALL("MIME multipart JPEG"),
> > .mime_type = "multipart/x-mixed-replace",
> > .extensions = "mjpg",
> > + .priv_data_size = sizeof(MPJPEGDemuxContext),
> > .read_probe = mpjpeg_read_probe,
> > .read_header = mpjpeg_read_header,
> > .read_packet = mpjpeg_read_packet,
> > + .read_close = mpjpeg_read_close
> > };
>
> _______________________________________________
> ffmpeg-devel mailing list
> ffmpeg-devel at ffmpeg.org
> http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 0001-Allow-mpjpeg-demuxer-to-process-MIME-parts-which-do-.patch.v3.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20151124/0d428d9c/attachment.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 0002-If-available-use-the-actual-boundary-in-HTTP-respons.patch.v3.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20151124/0d428d9c/attachment-0001.txt>
More information about the ffmpeg-devel
mailing list