[FFmpeg-devel] [PATCH] Add FITS Demuxer

Paras Chadha paraschadha18 at gmail.com
Thu Jul 20 21:37:43 EEST 2017


On Thu, Jul 20, 2017 at 11:17 PM, Nicolas George <george at nsup.org> wrote:

> On primidi, 1 Thermidor, year CCXXV, Reimar Döffinger wrote:
> > I am a bit unclear on the details, but from my memory:
> > Well, they work better (only?) if the images are completely independent.
>
> > I am not sure this is entirely the case here, for example the first
> > image would have a different header it looks like from the muxer
> > patch.
>
> According to this mail from Paras Chadha:
> https://ffmpeg.org/pipermail/ffmpeg-devel/2017-July/213180.html
> there is no global header. A first frame that is not identical to the
> next one would count as a global header.
>
> I hope Paras Chadha can tell us more about this issue.
>

The only difference between the first image and the rest is that the
first image will always contain SIMPLE = T as its first keyword, while
all other images will have XTENSION = 'IMAGE ' as their first keyword.
Also, PCOUNT and GCOUNT are mandatory in all image extensions but not in
the first image. All the remaining keywords are the same.
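
For illustration, here is a minimal sketch (the enum and function names
are mine, not from the patch) of how looking at the first 80-byte header
card is enough to tell the two kinds of header apart:

    #include <stdint.h>
    #include <string.h>

    /* Every FITS header card is exactly 80 bytes: an 8-character keyword
     * padded with spaces, "= " for value cards, then the value and an
     * optional comment. */
    enum FITSHDUType {
        FITS_HDU_PRIMARY,   /* first HDU, starts with "SIMPLE  =" */
        FITS_HDU_IMAGE,     /* image extension, "XTENSION= 'IMAGE   '" */
        FITS_HDU_OTHER      /* table or random-groups extension */
    };

    static enum FITSHDUType fits_hdu_type(const uint8_t card[80])
    {
        if (!memcmp(card, "SIMPLE ", 7))
            return FITS_HDU_PRIMARY;
        if (!memcmp(card, "XTENSION= 'IMAGE", 16))
            return FITS_HDU_IMAGE;
        return FITS_HDU_OTHER;
    }

A shared check along these lines could be used unchanged by the demuxer
and the decoder.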


>
> > There is also the question: how would the encoding work?
> > Because if encoding will use a muxer, having things symmetric by
> > having a demuxer is an advantage.
>
> If it is an image format (without global headers), it should work just
> the same as the other image formats, nothing more: the image2pipe muxer
> concatenates the packets.
>
> > I don't think we have the right frameworks for that. Other cases like
> > the MPEG video format parsers and decoders also duplicate that kind of
> > code.
>
> If we do not have the right framework to avoid code duplication, then we
> need to invent it. What you say here only means that sub-standard code
> has been accepted in the past, and that should not count as a
> precedent.
>
> Also, MPEG is a bit more important than FITS, so rushing things for it
> may have been acceptable.
>
> Still, in this particular case, I have reviewed the patches and I know
> the code enough to say that merging both functions would be quite easy
> even without a framework.
>
> > There might be some ways though, I am mostly thinking of an iterative
> > function that is called with a pointer to each 80-byte header buffer
> > and updates a state structure with the info it extracts from there.
> [1]
>
> Yes, exactly, that is one of the possibilities.
>
> > Would still get messy if the decoder needs quite different data than
> > the demuxer.
> > I have not yet checked how similar these functions actually are.
>
> I have looked at the code (enough to make a full review of a previous
> version), and I can say it is not the case. The information required by
> the demuxer is more or less a subset of the information required by the
> decoder.
>
> I say more or less because I just saw something fishy in the code:
>
> +    data_size += pcount;
> +    data_size *= (abs(bitpix) >> 3) * gcount;
>
> Paras Chadha, can you tell us why this is counted in the data size in
> the demuxer and not in the decoder?
>

It is because, for image extensions, PCOUNT = 0 and GCOUNT = 1 are
mandatory in the header according to the standard. These keywords are
not even mandatory in the header of the first image.
Anyway, as I have said before, a FITS file does not necessarily contain
images; it is a general data storage and transport format too. It
supports various extensions - the ASCII table extension and the binary
table extension - that are used to store and transport data. There is
also something called the Random Groups structure, which serves the same
purpose. PCOUNT and GCOUNT can affect the size of the data in those
extensions. In the demuxer I am skipping over these extensions and only
sending the images to the decoder. So these keywords are taken into
account in the demuxer but not in the decoder.
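
For context, the two quoted data_size lines are the size formula from
the FITS standard built up incrementally; spelled out in one place it
would look roughly like the sketch below (the function and variable
names are mine, and the random-groups special case is ignored):

    #include <stdint.h>
    #include <stdlib.h>

    /* Bytes in one FITS data unit, as defined by the standard:
     * |BITPIX|/8 * GCOUNT * (PCOUNT + NAXIS1 * ... * NAXISn),
     * padded in the file to a multiple of 2880 bytes.  For image HDUs
     * PCOUNT = 0 and GCOUNT = 1, so this reduces to bytes-per-element
     * times the number of pixels; for table and random-groups HDUs the
     * two keywords change the size, which is why the demuxer has to
     * honour them when it skips those HDUs. */
    static uint64_t fits_data_size(int bitpix, int naxis,
                                   const uint64_t *naxisn,
                                   uint64_t pcount, uint64_t gcount)
    {
        uint64_t elems = 1;
        int i;

        if (naxis == 0)
            return 0;                            /* header with no data */
        /* Random-groups HDUs set NAXIS1 = 0 and leave it out of the
         * product; that special case is omitted here. */
        for (i = 0; i < naxis; i++)
            elems *= naxisn[i];
        elems = (elems + pcount) * gcount * (uint64_t)(abs(bitpix) >> 3);
        return (elems + 2879) / 2880 * 2880;     /* 2880-byte blocks */
    }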


>
> > Parsers work best when parsing requires very limited context.
> > My impression is that there is no fixed maximum header size, which can
> > easily make things really painful in a parser.
>
> Urgh, I thought the framework was better than that. I concede that
> turning the existing code into a parser would be more work than I
> expected.
>
> Nonetheless, with the coding organization you suggested above (see [1]),
> adding buf[80] to the parsing structure would do the trick.
>
> But we still need Paras Chadha's knowledge about the possible
> differences in the first frame.
>

I have described them above.


>
> > Also considering the complexity: I suspect that the header parsing can
> > be simplified a good bit, and maybe some common helper functions
> > extracted.
>
> I think so too.
>
> > I wouldn't insist on avoiding code duplication when trying to share it
> > might actually result in more complicated code. But it probably needs
> > some thinking to figure out what makes sense.
>
> Having looked at the code, I really think it would not lead to more
> complicated code.
>
> > Ok, I think it is possible to use the same code for those, for example
> > by doing this:
>
> > - in the demuxer (or parser, but that is harder because you cannot
> > just read what you need but have to deal with the data as it is
> > provided), get data until the end of the header.
>
> > The end of the header detection would be somewhat duplicated, but
> > should be fairly trivial.
>
> Or it could go the other way round: code the function to use an AVIO,
> and in the decoder create an AVIO that reads from the packet data.
>
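
A rough sketch of that second option, using only the public
avio_alloc_context() API (the mem_reader struct, the buffer size and the
helper names are mine, purely for illustration; error handling is
omitted):

    #include <stdint.h>
    #include <string.h>
    #include <libavformat/avio.h>
    #include <libavutil/common.h>
    #include <libavutil/mem.h>

    /* Expose a packet's data through an AVIOContext so that a header
     * parser written against avio_* could serve both the demuxer and
     * the decoder. */
    struct mem_reader {
        const uint8_t *data;
        int size, pos;
    };

    static int mem_read(void *opaque, uint8_t *buf, int buf_size)
    {
        struct mem_reader *r = opaque;
        int n = FFMIN(buf_size, r->size - r->pos);
        if (n <= 0)
            return AVERROR_EOF;
        memcpy(buf, r->data + r->pos, n);
        r->pos += n;
        return n;
    }

    static AVIOContext *avio_from_memory(struct mem_reader *r)
    {
        unsigned char *buf = av_malloc(4096);  /* avio's internal buffer */
        return avio_alloc_context(buf, 4096, 0, r, mem_read, NULL, NULL);
    }
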
> > - Then use a function that takes such a buffer and returns an
> > AVDictionary with name/value pairs. The decoder and demuxer can then
> > each decide what they want to do with them.
>
> Wait, what?!?
>
> The decoder makes an AVDictionary for the string metadata that is
> returned as is in the frame, but apart from that, both the decoder and
> demuxer only use a few numeric fields: just return these fields in a
> structure.
>
> > There is one issue at least though: for files with very large
> > meta-data/header this would almost double memory usage in the demuxer
> > due to having copies in the dictionary.
>
> I would say many of our components have this kind of issue. I have not
> checked, but I think cases where a component reads a full block and then
> splits it into string metadata are very common and we never cared.
>
> > There is also, to me, the issue that while both parse the same things,
> > what these functions do is not at all the same semantically, at least
> > currently.
>
> > The demuxer only parses a select few keywords with a well-known format
> > and as a result can, e.g., use sscanf directly (not that I like using
> > sscanf personally).
>
> > The decoder one builds a metadata dictionary and thus needs a lot of
> > special-case handling like proper parsing, handling possibly new
> > keywords, deciding what to do with garbage (instead of just skipping),
> > handling escaping for the values, ...
>
> It does not seem very complicated to me: just enclose the parts that are
> needed by the decoder in "if (decoding) { ... }" blocks. Or even, do the
> parsing and discard the value in the demuxer, a few extraneous string
> comparisons or conversions to numeric are not a problem.
>
> I just looked at the code again, and it seems to me only the
> av_dict_set() calls would need to be disabled in the demuxer, the rest
> is just populating a structure with numeric values.
>
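
To make the idea concrete, a hypothetical shared per-card helper could
look like the sketch below (the FITSHeader struct and the function name
are mine, not taken from either patch): the demuxer passes NULL for the
dictionary and thereby skips the av_dict_set() part, while the decoder
passes a real one.

    #include <stdint.h>
    #include <stdio.h>
    #include <libavutil/dict.h>

    typedef struct FITSHeader {
        int bitpix;
        int naxis;
        int pcount, gcount;
    } FITSHeader;

    /* Parse one header card (assumed NUL-terminated here).  The numeric
     * fields needed by both components are always filled in; string
     * metadata is only stored when the caller passes a dictionary.
     * NAXISn handling, value unquoting and trailing-blank trimming are
     * left out of this sketch. */
    static int fits_read_card(FITSHeader *h, const char *card,
                              AVDictionary **metadata)
    {
        char keyword[9] = { 0 }, value[72] = { 0 };

        if (sscanf(card, "BITPIX = %d", &h->bitpix) == 1 ||
            sscanf(card, "NAXIS = %d",  &h->naxis)  == 1 ||
            sscanf(card, "PCOUNT = %d", &h->pcount) == 1 ||
            sscanf(card, "GCOUNT = %d", &h->gcount) == 1)
            return 0;

        if (metadata && sscanf(card, "%8s = %71[^/]", keyword, value) == 2)
            av_dict_set(metadata, keyword, value, 0);
        return 0;
    }
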
> > Personally, if it were my job to get this in, I would start by
> > simplifying at least one of those functions while making sure to have
> > proper bounds checking and error handling to see where the complexity
> > really lies and how complex the problem actually is compared to how
> > the first try solution looks.
>
> Indeed.
>
> Regards,
>
> --
>   Nicolas George
>