[FFmpeg-trac] #8590(undetermined:closed): 'telecine=pattern' error for p24, soft telecined sources

FFmpeg trac at avcodec.org
Mon Apr 6 07:44:45 EEST 2020


#8590: 'telecine=pattern' error for p24, soft telecined sources
-------------------------------------+-------------------------------------
             Reporter:  markfilipak  |                    Owner:
                 Type:  defect       |                   Status:  closed
             Priority:  normal       |                Component:
                                     |  undetermined
              Version:  unspecified  |               Resolution:  invalid
             Keywords:               |               Blocked By:
             Blocking:               |  Reproduced by developer:  0
Analyzed by developer:  0            |
-------------------------------------+-------------------------------------

Comment (by markfilipak):

 Replying to [comment:40 pdr0]:
 > Replying to [comment:39 markfilipak]:
 >
 > >
 > > Undersampling is really a vague term, isn't it? Is sampling film at a
 lower resolution than half the size of a silver halide grain
 undersampling? Or is undersampling resampling a digital image (or audio
 stream) at less than twice the frequency (or half the area) of the
 original samples. Dr. Nyquist would say that they are both "undersampled".
 > >
 >
 > Undersampling here is being used in the most simple sense. If you have
 100 samples of something, and you throw away half , now you have 50 .
 Which has more samples ? It's not a trick question..

 Sorry, but I don't know what you had in mind when you wrote "Undersampling
 here". To what does "here" refer? Are you referring to sampling film? Are
 you referring to resampling an image? Or are you referring to something
 else?

 I thought we were discussing interlace and aliasing. Instead of 100
 samples with 50 thrown away, how about a real case? Each picture on a
 720x480 DVD has 345600 samples. Are you saying that hard telecine with
 interlaced fields, for example, which divides each picture into two
 half-pictures of 172800 samples each, is throwing half the samples
 away? What video product throws half the pixels away?
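
 To spell out the arithmetic (counting each pixel position once,
 ignoring chroma subsampling):

     720 x 480 = 345600 samples per full picture
     720 x 240 = 172800 samples per field (half-picture)
     2 x 172800 = 345600, so splitting a picture into two fields
                  discards nothing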

 > > Answer for yourself: Is dividing a picture into 2 half-pictures really
 undersampling? Is it really sampling at all?

 > That's not undersampling , because you can reassemble 2 half picture to
 get the full original picture . This is not the case with interlace
 content

 Okay, I see this is a case of miscommunication. According to the MPEG
 spec, "interlaced" means i30-telecast. What I diagram as
 [1/-][-/2][3/-][-/4] etc.   ...TFF
 [-/1][2/-][-/3][4/-]        ...BFF

 But most people erroneously also call progressive video that has been
 unweaved into fields, "interlaced". What I diagram as
 [A/a][B/b][C/c][D/d] etc.                       ...source frames
 [A/-][-/a][B/-][-/b][C/-][-/c][D/-][-/d] etc.   ...interlaced fields
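
 For what it's worth, that unweave/reweave round trip can be sketched
 with ffmpeg's separatefields and weave filters (file names here are
 just placeholders, and the lossy re-encode in between is ignored):

     ffmpeg -i progressive_in.mkv -vf separatefields fields.mkv
         # "unweave": each frame becomes two half-height fields,
         # doubling the frame rate
     ffmpeg -i fields.mkv -vf weave=first_field=top rewoven.mkv
         # "reweave": consecutive field pairs are joined back into
         # full-height frames, halving the frame rate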

 > > Full temporal sampling? Life doesn't pass in 1/24th second increments.
 Film and video both undersample life. But of course that's not what you
 mean. :)
 > >
 >
 > > What you mean is that transcoding a 24fps stream to anything less than
 48fps is temporal subsampling, and in that, Dr. Nyquist and I would both
 agree with you.
 > >
 >
 > Sampling in this context is in terms of a reference point. It's
 relative.  If you have 24 fps, that's your original reference point .
 There are 24 pictures taken per second. If you discard half the motion
 samples, you now have 12 pictures/s. That is temporal undersampling with
 respect to the original 24 fps.

 Agreed.
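
 In ffmpeg terms, that kind of temporal subsampling is just frame-rate
 decimation; a sketch, with a placeholder file name:

     ffmpeg -i p24_source.mkv -vf fps=12 p12_half_rate.mkv
         # keeps 12 of every 24 pictures: temporal undersampling
         # relative to the original 24 fps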

 > > When you say that that transcoding 24fps to 24fps is not subsampling
 but merely dividing a picture into half-pictures is subsampling, are you
 being consistent?
 > >
 >
 > Yes it's consistent. Nothing is being discarded when you go 24fps to
 24fps (ignoring lossy compression for this discussion) . Dividing into
 half pictures - nothing is being discarded

 But that's not subsampling, is it?

 > BUT interlace content - now something is being discarded . Do you see
 the difference ?

 No. I don't. Each DVD picture has 345600 samples regardless of whether
 those 345600 samples are in one frame or two interlace fields.

 > > >... Each field has half the spatial information of a full progressive
 frame. ...
 > >
 > > Does that make it subsampling?
 > >
 >
 > For interlace content - yes it does. Half the spatial information is
 missing

 No, it's not. It's just formatted into fields instead of frames. There's
 nothing missing.

 > For progressive content, arranged in fields (2:2 pulldown, or PsF) , no,
 because you can re-arrange it back to the original (ignoring lossy
 compression again)

 Now you are making sense to me, so I must be misinterpreting everything
 preceding that paragraph.

 > > >...  A straight line becomes jagged dotted line when deinterlaced,
 because that field is resized to a full sized frame and only 50% the line
 samples are present. ...
 > >
 > > Now, I'm sure you know that fields -- I prefer to call them "half-
 pictures" -- aren't converted to frames without first reinterlacing -- I
 prefer to call it "reweaving" -- the half-picture-pairs to reconstruct the
 original pictures. And I'm sure you know that the only way a reconstructed
 picture has a jagged line is if the original film frame had a jagged line.
 >
 > You need to differentiate between interlaced content and progressive
 content.

 Now you seem to be going back to the definition of "interlace" used in the
 MPEG spec. If, by "interlaced", you mean the fields in an i30-telecast (or
 i25-telecast) -- "NTSC" & "PAL" do not actually exist in digital media --
 then each temporal field is discrete and has 172800 samples. Again,
 nothing is thrown away.

 The only process that I know of that takes one field and makes a frame of
 it is bobbing. Nothing is being thrown away, even then, but the results
 indeed are half resolution. Bobbing certainly is the only sane way to turn
 telecast fields into frames. If that's the source of the line aliasing to
 which you refer, then I agree of course (and the discussion is so basic I
 can't see why we're even engaging in it), but I'm surprised, because that
 is such an obscure application that it has nothing to do with progressive
 sources like movies, and, again, nothing is thrown away. So when you refer
 to something being thrown away... to throwing away half the pixels... I
 have no idea to what you refer. Sorry.
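
 For reference, bobbing is what ffmpeg's deinterlacers do in their
 field-rate mode, e.g. yadif; a sketch, with placeholder file names:

     ffmpeg -i i30_telecast.mkv -vf yadif=mode=send_field bobbed_60p.mkv
         # one output frame per field: 30 interlaced frames/s in,
         # 60 progressive frames/s out, each half-picture field
         # interpolated up to a full-height frame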

 > With progressive content, 2 field pairs come from the same frame and can
 be reassembled and weaved back to a full progressive frame. No problem.
 This happens everyday . eg. Progressive, PAL content is 2:2 ( broadcast,
 DVD's) , PSF . It can occur out of order too, the fields can be found
 later or earlier, then matched with it's proper field pair in the process
 of "field matching"

 I know that instead of "2 field pairs come" you meant to write "a field
 pair comes". I didn't know about field matching, but then, I know little
 of the internals of streams aside from layers and chroma and macroblocks.
 I do know MPEG PES quite well though, including the offsets to all the
 headers and their metadata and the header codes that are used to locate
 them in the stream.
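
 Looking it up, ffmpeg appears to implement field matching as the
 fieldmatch filter, usually followed by decimate to drop the leftover
 duplicates; a sketch, with placeholder file names:

     ffmpeg -i hard_telecined_30.mkv -vf fieldmatch,decimate ivtc_p24.mkv
         # fieldmatch re-pairs each field with its proper partner field;
         # decimate then drops the duplicated frame (1 in 5) left over
         # from the 2-3 pulldown, restoring ~24 frames/s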

 > But with interlaced ''content'', the 2 field pairs are different. You're
 missing 1/2 the spatial information. There is no matching field pair.

 Thank you for trying to put this discussion into the terms I favor. Let me
 help because I can tell you're confused.

 When I refer to field pair, I mean the 2 fields that contain the 2 half-
 pictures that have been unweaved from a single origin picture. By "origin"
 I mean the film frame (or digital cinema frame) that serves as the input
 to the mastering process. By "source" I mean the samples that are output
 by the mastering process. By "target" I mean the transformed images that
 result from transcoding.

 By "picture" I mean what is defined in the MPEG spec as "picture". I
 reserve "field" solely for telecast video only, whereas I use "field-pair"
 for progressive content that's been unweaved and interlaced.

 You see, having precise terms is important, and I'm pretty confident that
 you agree. I've seen so many discussions that degraded into abusive
 rhetoric due to misunderstandings caused by vague and/or ambiguous terms.
 But whenever I've advocated for better terms and have proposed more
 precise terms, I've been pretty viciously attacked. So now, I just use my
 terms and wait for people to ask me what I mean. I suppose that, if the
 baked-in video terminology (which is pretty awful) changes, then, like
 politics, the change will take a full generation.

 > That is where you get the twitter artifacts (you have spatial
 undersampling). This is also the case with your combed frame with your
 5555 pulldown . You have 2 fields that come from different times. Each is
 undersampled in terms of their own time. They are missing their "partner"
 field pair . Of course you can find it earlier or later in the stream,
 from field matching, but then you have to decide on which, top or bottom
 again, and you get your 3:2 or 2:3 frame repeats. If you instead weave, or
 try to treat it as progressive, you get back the combing.

 Pictures are better than words.

 |<--------------------------1/6s-------------------------->|
 [A/a__________][B/b__________][C/c__________][D/d__________]   ...p24 origin
 [A/a_______][B/b_______][B/c_______][C/d_______][D/d_______]   ...2-3-2-3 pull-down
                         <--------combed-------->
 [A/a_______][A/b_______][B/c_______][C/c_______][D/d_______]   ...3-2-3-2 pull-down
             <--------combed-------->
 [A/a_][A/a_][A/b_][B/b_][B/b_][C/c_][C/c_][C/d_][D/d_][D/d_]   ...5-5-5-5 pull-down
             <comb>                        <comb>
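
 In ffmpeg-filter terms, those patterns correspond to the telecine
 filter's pattern option (a sketch of hard pulldown, with placeholder
 file names):

     ffmpeg -i p24_origin.mkv -vf telecine=pattern=23 pulldown_2_3.mkv
         # 2-3 pulldown: 24 frames/s in, 30 frames/s out
     ffmpeg -i p24_origin.mkv -vf telecine=pattern=5555 pulldown_5555.mkv
         # 5 fields per source picture: 24 frames/s in, 60 frames/s out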

 > > So I assume that what you are describing is bobbing. But bobbing isn't
 undersampling either. If anything, bobbing is underdisplaying, don't you
 agree?
 > >
 > Historically, "bobbing" is just separating fields. When you view
 separate fields, each even, odd field has an offset. It looks like
 up/down/up/down or "bobbing" up and down. But the term has come to include
 both separating fields and resizing the fields to frames .
 >
 > So if you bob deinterlace progressive content - then yes you are causing
 undersampling (and resulting artifacts, like twitter) . Each field pair of
 the progressive frame can no longer be field matched properly. 24fps
 becomes 48fps and each frame will have aliasing artifacts

 Nobody bobs progressive content. Bringing up such a silly case is pointless.

 > But if you bob deinterlace interlaced content, the undersampling was
 there to begin with . The process of making the interlace, dropping the
 spatial samples is what caused the undersampling

 You're talking about dropping samples again. For telecasts, the 'missing'
 field never existed in the first place. The 'current' field can never be
 part of a field-pair. Nothing is dropped; it never existed.

 > > >... That line information is undersampled.  In motion, those lines
 appear to "twitter". Higher quality deinterlacing algorithms attempt to
 adaptively fill in the gaps and smooth everything over, so it appears as
 if it was a true progressive frame.
 > >
 > > My opinion is that deinterlacing algorithms should reweave the half-
 picture lines and nothing more. A user can insert more filters if more
 processing is desired, but trying to work around 'mystery' behavior of
 filters that do more than you think they do is crazy making.
 > >
 >
 > For progressive content, you do not deinterlace, because it damages the
 picture. You saw that in the very first example. You should field match.
 Deinterlacing is not the same thing as field matching
 >
 > For interlace content, you can't "reweave the half-picture lines" , or
 you get back the combing

 Those are not half-picture lines. They cannot be reweaved because they
 were never weaved. Those are 2 fields from 2 differing times. They can be
 weaved, but combing results. I know you know this. Do you see how better
 terms work better?

 > > > As mentioned earlier, there are other causes, but low quality
 deinterlacing is the most common.  Other common ones are pixel binning ...
 > >
 > > You know, I've run across that term maybe once or twice... I don't
 know what it means.
 > >
 > > >... or sampling every nth pixel. Eg. large sensor DSLR's when
 shooting video mode.
 > >
 > > Is that undersampling or simply insufficient resolution?
 > >
 >
 > It's undersampling. The resolution of DSLR sensors is very high. 30-50
 megapixels on average. 120-150 Megapixel cameras available. 1920x1080 HD
 video is only ~2K . 2Megapixels. It should be massively oversampled for
 video. But because of overheating and processing requirements (it takes
 lots of CPU or hardware to do proper downscaling in realtime), they
 (especially 1st 5-6 generations of DSLRs) , drop every nth pixel when
 taking video. Some newer ones are better now, but this is a very common
 source of aliasing and line twitter

 Okay, I don't consider that undersampling, but you're right, regarding the
 resolution of the CCD, it is undersampling. You see, I don't consider the
 taking camera as part of the mastering process. To me, the origin is what
 comes out of the camera, not what could come out of the camera if the
 sensor was 100% utilized.

 > > You see, a word like "undersampling" can become so broad that it's
 utility as a word is lost.
 >
 > It can be if you want it to be, but it's just the most simple sense of
 the word here

 There is no simple, mutually-agreed sense/understanding/meaning of the
 word "undersample".

 > Stay safe

 Oh, yeah. You too. I'm 73 years old, I understand that almost all people
 my age and older who contract COVID-19 die. I have 3 months of supplies.

--
Ticket URL: <https://trac.ffmpeg.org/ticket/8590#comment:41>
FFmpeg <https://ffmpeg.org>
FFmpeg issue tracker


More information about the FFmpeg-trac mailing list