[FFmpeg-devel] [PATCH] Allow mpjpeg demuxer to process MIME parts which do not include Content-Length header.

Alex Agranovsky alex at sighthound.com
Tue Nov 24 17:39:07 CET 2015





On November 24, 2015 at 10:32:47 AM, wm4 (nfxjfg at googlemail.com) wrote:

On Tue, 24 Nov 2015 00:16:06 -0500
Alex Agranovsky <alex at sighthound.com> wrote:

> From a2a0b9e0da14b6e82aa783535ec1878c8ffbede0 Mon Sep 17 00:00:00 2001
> From: Alex Agranovsky <alex at sighthound.com>
> Date: Tue, 24 Nov 2015 00:06:14 -0500
> Subject: [PATCH 1/2] Allow mpjpeg demuxer to process MIME parts which do not
> include Content-Length header.
> 
> Fixes ticket 5023
> 
> Signed-off-by: Alex Agranovsky <alex at sighthound.com>
> ---
> libavformat/mpjpegdec.c | 176 ++++++++++++++++++++++++++++++++++++------------
> 1 file changed, 133 insertions(+), 43 deletions(-)
> 
> diff --git a/libavformat/mpjpegdec.c b/libavformat/mpjpegdec.c
> index 2749a48..e494f1a 100644
> --- a/libavformat/mpjpegdec.c
> +++ b/libavformat/mpjpegdec.c
> @@ -23,22 +23,16 @@
>  
> #include "avformat.h"
> #include "internal.h"
> +#include "avio_internal.h"
>  
> -static int get_line(AVIOContext *pb, char *line, int line_size)
> -{
> - int i = ff_get_line(pb, line, line_size);
>  
> - if (i > 1 && line[i - 2] == '\r')
> - line[i - 2] = '\0';
>  
> - if (pb->error)
> - return pb->error;
> -
> - if (pb->eof_reached)
> - return AVERROR_EOF;
> -
> - return 0;
> -}
> +/** Context for demuxing an MPJPEG stream. */

This comment is really not needed.
Will be removed in follow up submission.



> +typedef struct MPJPEGDemuxContext {
> + char* boundary;
> + char* searchstr;
> + int searchstr_len;
> +} MPJPEGDemuxContext;
>  
>  
> static void trim_right(char* p)
> @@ -46,13 +40,32 @@ static void trim_right(char* p)
> char *end;
> if (!p || !*p)
> return;
> - end = p + strlen(p) - 1;
> - while (end != p && av_isspace(*end)) {
> + int len = strlen(p);
> + if ( !len )
> + return;
> + end = p + len - 1;
> + while (end >= p && av_isspace(*end)) {

Why this change? Note that strlen(p)>0 is already guaranteed by the
"!*p" check before.


Consider input of a single ‘\r’. It will never get trimmed with the old code.

> *end = '\0';
> end--;
> }
> }
>  
> +static int get_line(AVIOContext *pb, char *line, int line_size)
> +{
> + ff_get_line(pb, line, line_size);
> +
> + if (pb->error)
> + return pb->error;
> +
> + if (pb->eof_reached)
> + return AVERROR_EOF;
> +
> + trim_right(line);
> + return 0;
> +}
> +
> +
> +
> static int split_tag_value(char **tag, char **value, char *line)
> {
> char *p = line;
> @@ -86,12 +99,24 @@ static int split_tag_value(char **tag, char **value, char *line)
> return 0;
> }
>  
> -static int parse_multipart_header(AVIOContext *pb, void *log_ctx);
> +static int parse_multipart_header(AVIOContext *pb,
> + int* size,
> + const char* expected_boundary,
> + void *log_ctx);
> +
> +static int mpjpeg_read_close(AVFormatContext *s)
> +{
> + MPJPEGDemuxContext *mpjpeg = s->priv_data;
> + av_freep(&mpjpeg->boundary);
> + av_freep(&mpjpeg->searchstr);
> + return 0;
> +}
>  
> static int mpjpeg_read_probe(AVProbeData *p)
> {
> AVIOContext *pb;
> int ret = 0;
> + int size = 0;
>  
> if (p->buf_size < 2 || p->buf[0] != '-' || p->buf[1] != '-')
> return 0;
> @@ -100,7 +125,7 @@ static int mpjpeg_read_probe(AVProbeData *p)
> if (!pb)
> return 0;
>  
> - ret = (parse_multipart_header(pb, NULL)>0)?AVPROBE_SCORE_MAX:0;
> + ret = ( parse_multipart_header(pb, &size, "--", NULL) > 0 ) ? AVPROBE_SCORE_MAX : 0;

A stray space got in.
Parameters to parse_multipart_header had changed. 





>  
> av_free(pb);
>  
> @@ -110,17 +135,19 @@ static int mpjpeg_read_probe(AVProbeData *p)
> static int mpjpeg_read_header(AVFormatContext *s)
> {
> AVStream *st;
> - char boundary[70 + 2 + 1];
> + char boundary[70 + 2 + 1] = {0};
> int64_t pos = avio_tell(s->pb);
> int ret;
>  
> + do {
> + ret = get_line(s->pb, boundary, sizeof(boundary));
> + if (ret < 0)
> + return ret;
> + } while (!boundary[0]);
>  
> - ret = get_line(s->pb, boundary, sizeof(boundary));
> - if (ret < 0)
> - return ret;
> -

Can you explain why it suddenly has to skip multiple lines?
MIME standard, as old as it is, is poorly implemented by many camera manufacturers. In the last few weeks, I have seen things ranging from not sending a proper boundary, to not including a CRLF after a body part, to including multiples. When we process the headers, it is assumed body part had been consumed, and we need to get to the next non-empty lines. It is solving the same problem as 18f9308e6a96bbeb034ee5213a6d41e0b6c2ae74, just the other manifestation of it.





> - if (strncmp(boundary, "--", 2))
> + if (strncmp(boundary, "--", 2)) {
> return AVERROR_INVALIDDATA;
> + }

Another change that looks like a stray change.
Will remove in follow-up submission.



>  
> st = avformat_new_stream(s, NULL);
> if (!st)
> @@ -147,28 +174,44 @@ static int parse_content_length(const char *value)
> return val;
> }
>  
> -static int parse_multipart_header(AVIOContext *pb, void *log_ctx)
> +static int parse_multipart_header(AVIOContext *pb,
> + int* size,
> + const char* expected_boundary,
> + void *log_ctx)
> {
> char line[128];
> int found_content_type = 0;
> - int ret, size = -1;
> + int ret;
> +
> + *size = -1;
>  
> // get the CRLF as empty string
> ret = get_line(pb, line, sizeof(line));

> - if (ret < 0)
> + if (ret < 0) {
> return ret;
> + }

Another stray change?
Will remove in follow-up submission.



>  
> /* some implementation do not provide the required
> * initial CRLF (see rfc1341 7.2.1)
> */
> - if (!line[0]) {
> + while (!line[0]) {
> ret = get_line(pb, line, sizeof(line));
> - if (ret < 0)
> + if (ret < 0) {
> return ret;
> + }
> }

Again, why does it have to skip multiple lines now? The comment
indicates that skipping 1 line (instead of nothing) is a workaround for
some buggy servers.



Please see above.

>  
> - if (strncmp(line, "--", 2))
> + if ( strncmp(line, expected_boundary, strlen(expected_boundary) ) ) {

(Maybe could use av_strstart. Also, weird spacing.)
Going to replace with av_strstart





> + if (log_ctx) {
> + av_log(log_ctx,
> + AV_LOG_ERROR,
> + "Expected boundary '%s' not found, instead found a line of %lu bytes\n",
> + expected_boundary,
> + strlen(line) );
> + }
> +
> return AVERROR_INVALIDDATA;
> + }
>  
> while (!pb->eof_reached) {
> char *tag, *value;
> @@ -201,32 +244,77 @@ static int parse_multipart_header(AVIOContext *pb, void *log_ctx)
> } else
> found_content_type = 1;
> } else if (!av_strcasecmp(tag, "Content-Length")) {
> - size = parse_content_length(value);
> - if (size < 0)
> - return size;
> + *size = parse_content_length(value);
> }
> }
>  
> - if (!found_content_type || size < 0) {
> - return AVERROR_INVALIDDATA;
> - }
> -
> - return size;
> + return (found_content_type) ? 0 : AVERROR_INVALIDDATA;
> }
>  
> +
> static int mpjpeg_read_packet(AVFormatContext *s, AVPacket *pkt)
> {
> - int ret;
> - int size = parse_multipart_header(s->pb, s);
> + int size;
> + MPJPEGDemuxContext *mpjpeg = s->priv_data;
> + if ( mpjpeg->boundary == NULL ) {
> + mpjpeg->boundary = av_strdup("--");
> + mpjpeg->searchstr = av_strdup("\r\n--");
> + mpjpeg->searchstr_len = strlen( mpjpeg->searchstr );

Missing checks for memory allocation failure.

Also, this is the only place where these fields are set. And they're
set to constant strings. I see no reason why these fields exist, and
they certainly don't need to be strdup'ed?

My understanding of this MIME part encoding is very limited, but I
thought you need to read the boundary as it appears in the header, and
use that to find the end of the data? Was this forgotten?


Let me elaborate on this. 
1) You are correct — boundary needs to be compared to that provided in HTTP response’s Content-Type. 
However, current code doesn’t compare it (only using the ‘—‘ as the boundary check), so I have kept things as is for this patch.
2) That functionality ready to be submitted as a separate patch, once this one is cleared.
3) As I’ve mentioned, some implementations fail to send correct boundaries, so even that future patch will have a compile-time
flag to enable it, as it will cause regression for some cases.
4) strdup is done to be consistent with the future patch (where we have to alloc the memory)
5) Sorry about the checks, those will be added in the follow up patch

> + }
> +
> + int ret = parse_multipart_header(s->pb,
> + &size,
> + mpjpeg->boundary,
> + s);

Declaration after statements; apparently we don't like this, and it's
preferred to put all declarations on top of each block.


Correcting.

>  
> - if (size < 0)
> - return size;
>  
> - ret = av_get_packet(s->pb, pkt, size);
> if (ret < 0)
> return ret;
>  
> - return 0;
> + if ( size > 0 ) {
> + /* size has been provided to us in MIME header */
> + ret = av_get_packet(s->pb, pkt, size);
> + } else {
> + /* no size was given -- we read until the next boundary or end-of-file */
> +
> + const int read_chunk = 2048;
> + av_init_packet(pkt);
> + pkt->data = NULL;
> + pkt->size = 0;
> + pkt->pos = avio_tell(s->pb);
> +
> + /* we may need to return as much as all we've read back to the buffer */
> + ffio_ensure_seekback( s->pb, read_chunk );
> +
> +
> + int remaining = 0, len;
> +
> + while ( ( ret = av_append_packet( s->pb, pkt, read_chunk - remaining ) ) >= 0 ) {
> + /* scan the new data */
> + len = ret + remaining;
> + char* start = pkt->data + pkt->size - len;
> + do {
> + if ( !memcmp( start, mpjpeg->searchstr, mpjpeg->searchstr_len ) ) {

How is it guaranteed that start is valid for at least searchstr_len
bytes?


with } while ( len >= mpjpeg->searchstr_len );

len decrement occurs for each start increment


> + // got the boundary! rewind the stream
> + avio_seek( s->pb, -(len-2), SEEK_CUR );
> + pkt->size -= (len-2);
> + return pkt->size;
> + }
> + len--;
> + start++;
> + } while ( len >= mpjpeg->searchstr_len );

Couldn't this use strstr? (Just a thought. Would assume that the
boundary itself ca not contain 0 bytes.)


We’re operation on binary data, so can’t use string functions.

> + remaining = len;
> + }
> +
> + /* error or EOF occurred */
> + if ( ret == AVERROR_EOF ) {
> + ret = ( pkt->size > 0 ) ? pkt->size : AVERROR_EOF;

Unneeded ( ).
Correcting.



> + } else {
> + av_packet_unref(pkt);
> + }
> + }
> +
> + return ret;
> }

The spacing in this whole function is a bit weird and inconsistent with
the rest of the code, I think.


Correcting some (mostly spaces after ‘(' and before ‘)' … let me know if anything else has to be done)

>  
> AVInputFormat ff_mpjpeg_demuxer = {
> @@ -234,7 +322,9 @@ AVInputFormat ff_mpjpeg_demuxer = {
> .long_name = NULL_IF_CONFIG_SMALL("MIME multipart JPEG"),
> .mime_type = "multipart/x-mixed-replace",
> .extensions = "mjpg",
> + .priv_data_size = sizeof(MPJPEGDemuxContext),
> .read_probe = mpjpeg_read_probe,
> .read_header = mpjpeg_read_header,
> .read_packet = mpjpeg_read_packet,
> + .read_close = mpjpeg_read_close
> };

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel at ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 0001-Allow-mpjpeg-demuxer-to-process-MIME-parts-which-do-.patch.v2.txt
URL: <http://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20151124/3f03eb59/attachment.txt>


More information about the ffmpeg-devel mailing list