[FFmpeg-devel] [PATCH] libavutil: add clean aperture (CLAP) side data.
Neil Birkbeck
neil.birkbeck at gmail.com
Sat May 30 00:41:35 EEST 2020
On Mon, May 11, 2020 at 9:37 PM Neil Birkbeck <neil.birkbeck at gmail.com>
wrote:
>
>
> On Wed, May 6, 2020 at 8:45 AM James Almer <jamrial at gmail.com> wrote:
>
>> On 5/6/2020 12:22 PM, Neil Birkbeck wrote:
>> > On Tue, May 5, 2020 at 5:11 AM Kieran O Leary <kieran.o.leary at gmail.com
>> >
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> I broke the threading with my last reply, i apologise. Here goes
>> another
>> >> attempt:
>> >>
>> >> On Tue, Apr 28, 2020 at 6:23 PM Neil Birkbeck <neil.birkbeck at gmail.com
>> >
>> >> wrote:
>> >>
>> >>> On Tue, Apr 28, 2020 at 3:18 AM Nicolas George <george at nsup.org>
>> wrote:
>> >>>
>> >>>> Andreas Rheinhardt (12020-04-28):
>> >>>>> That's expected. The patch provided only provides the structure in
>> >>> which
>> >>>>> the values are intended to be exported; it does not add any demuxing
>> >> or
>> >>>>> muxing capabilities for mov or mkv (as you can see from the fact
>> that
>> >>>>> none of these (de)muxers have been changed in the patch).
>> >>>>
>> >>>> Which is something I intended to comment on: adding code without
>> users
>> >>>> is rarely a good idea. I suggest we do not commit until at least one
>> >>>> demuxer use it, preferably at least two. Otherwise, we may realize
>> that
>> >>>> “oh crap, it doesn't work” because of a tiny unforeseen detail.
>> >>>
>> >>>
>> >>> Thanks for the feedback. I also have patches for the demux (MOV/MKV)
>> and
>> >>> mux (MOV/MKV).
>> >>>
>> >>> As there is still the alternative of using the fields in the
>> >>> AVCodecParameters/AVCodecContext, my intention was to keep the first
>> >> patch
>> >>> small to resolve discussion on that point.
>> >>>
>> >>> I've included the patches, if you'd like to try test it, Kieren. I
>> see on
>> >>> your test file that there may be some slight rounding error making
>> output
>> >>> crop 704 not 703 (MKV file ends up with pixel_crop_{left,right} = 8).
>> >>>
>> >>> /ffprobe ../testdata/clap.mov 2>&1 | grep -A1 "Side"
>> >>> Side data:
>> >>> Clean aperture:[width 41472/59 height:576/1 h_offset:0/1
>> >>> v_offset:0/1]
>> >>> ./ffmpeg -i ../testdata/clap.mov -vcodec copy -acodec copy
>> /tmp/clap.mkv
>> >>> ./ffprobe /tmp/clap.mkv 2>&1 | grep -A1 "Side"
>> >>> Side data:
>> >>> Clean aperture:[width 704/1 height:576/1 h_offset:0/1
>> v_offset:0/1]
>> >>>
>> >>
>> >> I have to look deeper into the MKV side of things and most likely
>> raise it
>> >> with the cellar mailing list so that better minds than mine can weigh
>> in. I
>> >> do see that the rounding up to 704 could be an issue alright.
>> >> As for MOV, your patch appears to generate the same output clap values
>> as
>> >> the input, so that's really great! command line and mediainfo trace
>> below:
>> >>
>> >
>> > Thanks for testing, Kieran and for linking the discussion on the cellar
>> > list.
>> >
>> > Any additional thoughts from ffmpeg devs on container-level SideData vs
>> > adding the extra fields into AVCodecParameters/AVCodecContext and
>> plumbing
>> > into the frame instead? I anticipate some situations where there can be
>> > interaction between cropping in bitstream and container-level cropping.
>> > Maybe the best way forward is for me to share some sample patches for
>> the
>> > alternative to validate whether it supports the various use cases
>> > (transmux, decode+crop, any other application level handling), and to
>> > confirm the interface changes for the structs.
>>
>> One option could be to also introduce a frame side data for this new
>> struct and have it replace the AVFrame fields, which would then be
>> deprecated (But of course keep working until removed).
>> This would allow to either inject stream side data from clap atoms and
>> Matroska crop fields into packets (Which would then be propagated to
>> frames to be applied during decoding), or use and export the bitstream
>> cropping information as it's the case right now, all while preventing
>> the addition of new fields to AVCodecParameters.
>>
>>
> I agree that sharing the SideData with the frame could be nice for the
> above reasons. I guess any interaction or conflict between bitstream &
> container cropping could then get resolved at the decoder (e.g., assuming
> that say SPS from h264 and container-level cropping should be composed).
>
>
>> I would like a developer that makes use of this feature to also comment,
>> especially seeing how the AVFrame fields and this clap side data are
>> pretty different.
>>
>
> The current CleanAperture representation was in part motivated to 1) keep
> the representation capable of representing the CLAP atom (same
> representation and rationals), and 2) to make it unambiguous that this was
> container-level stream metadata. The representation is a tiny bit more
> tedious when trying to actually perform a crop (e.g., to extract the
> top-left corner offset). If sharing as AVFrame side data, a representation
> closer to AVFrame's crop_{top,bottom,left,right} may be more natural.
>
> Within ffmpeg, I see that the existing codecs using AVFrame cropping
> include:
> -libavcodec/h264_slice.c: propagating the crop_fields from the SPS in h264
> in
> -libavcodec/hevc_refs.c: (sly to h264)
> -libavcodec/agm.c: crop_{left, top} inferred from
> avctx->coded_{width,height} - avctx->{width,height}
> -libavcodec/videotoolbox.c: crop fields explicitly set to zero.
> -libvacodec/vp3.c: crop_{left,top} set from coded bitstream’s offset_{x,y}
> (right,bottom from coded_{width,height})
>
> Upon review, I see there is one other format that appears to export
> similar cropping information into key/value pair metadata that could
> potentially also use the stream-level side data: (libavformat/cinedec.c:
> set_metadata_int(&st->metadata, "crop_left", avio_rl32(pb), 1))
>
I've changed the side data to be PixelCrop (instead of CleanAperture) given
the intent to reuse as cropping elsewhere.
-For now, I've kept the rational representation--although CLAP seems to be
the only required case of it. Maybe Kieran could comment on the requirement
of having maintaining the rationals for CLAP (only works on mov to mov
transmuxing).
Any other feedback on the representation?
To test crop on decode behavior, I've passed the metadata through to frame
data. Two minor observations:
1) With the current implementation of inject_global_side_data, the metadata
only makes it onto the first frame. Will need a special case to make sure
the metadata makes it onto every frame in libavformat/utils.c.
2) Cropping in libavutil/frame.c is adjusting frame pointers and by default
needs to respect alignment (unlike say the autorotate approach of adding a
vf_crop filter in ./fftools/ffmpeg_filter.c)
I also have updates to the demuxer/muxer patches, including incorporating
Andreas' MKV feedback, but will wait sending patches until we've resolved
the representation.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-libavutil-add-pixel-crop-clean-aperture-CLAP-side-da.patch
Type: text/x-patch
Size: 8956 bytes
Desc: not available
URL: <https://ffmpeg.org/pipermail/ffmpeg-devel/attachments/20200529/367528f8/attachment.bin>
More information about the ffmpeg-devel
mailing list