[FFmpeg-user] bwdif filter question

Mark Filipak (ffmpeg) markfilipak at bog.us
Tue Sep 22 21:11:12 EEST 2020

On 09/22/2020 04:20 AM, Edward Park wrote:

>> Not so, Ted. The following two definitions are from the glossary I'm preparing (and which cites H.262).
> Ah okay I thought that was a bit weird, I assume it was a typo but I saw h.242 and thought two different types of "frames" were being mixed. Before saying anything if the side project you mentioned was a layman’s glossary type reference material, I think you should base it off of the definitions section instead of the bitstream definitions, just my $.02.

H.242 was indeed a typo. ...Oh, wait! Doesn't (H.222+H.262)/2 = H.242? :-)

I'm not sure what you mean by "definitions section", but I don't believe in "layman's" glossaries. I 
believe novices can comprehend structures at a codesmith's level if the structures are precisely 
represented. The novices who can't comprehend the structures need to learn. If they don't want to 
learn, then they're not really serious. This video processing stuff is for serious people. That 
written, what is not reasonable, IMHO, is to expect novices to learn codesmith-jargon and 
codesmith-shorthand. English has been around for a long time and it includes everything that is needed.

I would show you some of my mpegps parser documentation and some of my glossary stuff, but 90% of it 
is texipix diagrams and/or spreadsheet-style, plaintext tables that are formatted much wider than 70 
characters/line, so they won't paste into email.

>> Since you capitalize "AVFrames", I assume that you cite a standard of some sort. I'd very much like to see it. Do you have a link?
> This was the main info I was trying to add, it's not a standard of any kind, quite the opposite, actually, since technically its declaration could be changed in a single commit, but I don't think that is a common occurrence. AVFrame is a struct that is used to abstract/implement all frames in the many different formats ffmpeg handles. it is noted that its size could change as fields are added to the struct.
> There's documentation generated for it here: https://www.ffmpeg.org/doxygen/trunk/structAVFrame.html

Oh, Thank You! That's going to help me to communicate/discuss with the developers.

>> H.262 refers to "frame pictures" and "field pictures" without clearly delineating them. I am calling them "pictures" and "halfpictures".
> I thought ISO 13818-2 was basically the identical standard, and it gives pretty clear definitions imo, here are some excerpts. (Wall of text coming up… standards are very wordy by necessity)

--snip 13818-2 excerpts--

To me, that's all descriptions, not definitions. To me, it's vague and ambiguous. To me, it's sort 
of nebulous.

Standards don't need to be wordy. The *more* one writes, the greater the chance of mistakes and 
ambiguity. Write less, not more.

Novices aren't dumb, they're just ignorant. By your use of "struct" in your reply, I take it that 
you're a 'C' codesmith -- I write assembly, other HLLs, and hardware description languages like VHDL 
and Verilog, but I've never written 'C'. I've employed 'C' codesmiths, so I'm a bit conversant 
with 'C', but just a bit.

What I've noted is that codesmiths generally don't know how to write effective English. Writing 
well-constructed English is difficult and time-consuming at first, as difficult as learning to use 
effectively any syntax that requires knowledge and experience. There are clear rules, but most 
codesmiths don't know them, especially if English is their native language. They write like they 
speak: conversationally. And when others don't understand what's written, rather than revise 
smaller, the grammar-challenged revise longer, thinking that yet another perspective is what's 
needed. That produces ambiguity, because different perspectives promote ambiguity. IMHO, there 
should be just one perspective: structure. Usage is the place for description, but that's not 
(or shouldn't be) in the domain of a glossary.

> So field pictures are decoded fields, and frame pictures are decoded frames? Not sure if I understand 100% but I think it’s pretty clear, “two field pictures comprise a coded frame.” IIRC field pictures aren’t decoded into separate fields because two frames in one packet makes something explode within FFmpeg

Well, packets are just how transports chop up a stream in order to send it, piecewise, via a 
packetized medium. They don't matter. I think that, for mpegps, one should start at 
'sequence_header_code' (i.e. x00 00 01 B3) and proceed from there through the transport packets, 
throwing out the packet headers, until encountering the next 'sequence_header_code' or the 
'sequence_end_code' (i.e. x00 00 01 B7).

I don't know how frames are passed from the decoder to a filter inside ffmpeg. I don't know whether 
the frames are in the form of decoded samples in a macroblock structure or are just a glob of bytes. 
Considering the differences between 420 & 422 & 444, I think that the frames passed from the decoder 
must have some structure. I just don't know what it is.

What I do know (for sure) from my study of macroblocks is that fields do not exist as discrete 
structures. Fields exist piecewise as "fields" (in the programming/spreadsheet sense) within frames. 
"Frame" is the only real structure. "Field" is an abstraction. What I do know (for sure) is that 
fields exist solely as raw halfpictures or raw scans prior to encoding or subsequent to decoding.

A concurrent frame (aka "progressive") holds a raw picture (e.g. 720x576) or 2 halfpictures (e.g. 
720x288) that have been unweaved from the same picture. What to name a frame containing 2 
halfpictures unweaved from differing pictures (i.e. telecine-added frames) is to be decided.

A scan frame (aka "interlaced"?) holds 2 scans (e.g. 720x288) that were created as successive fields 
and that exist solely as fields. Oh, they can be weaved to make a picture, but it's not a picture 
that previously existed, and it will not look good due to combing (caused by the nonconcurrent 
nature of scans).

It all seems so simple, and it is simple, but it's been made confusing by the ambiguous use of 
general names for things plus a certain lack of specificity regarding structures plus a resulting 
dependence on context. I would really, really like to change that.

First off: Every unique structure should have a unique name and a clear depiction. For depiction I 
prefer diagrams and/or illustrations to written descriptions.

I could use some help in my efforts. Due to my ignorance, it's taking me weeks to figure out things 
that should be resolved in minutes. In 5 days, I'll be 74 years old. With the coronavirus and age, I 
don't know how much longer I'll be around, but I'm sure I can help the ffmpeg project if the 
principals in the project will just stop sniping at me and share knowledge.

Thank you for your oh-so-valuable contributions. I will study AVFrame to see how I can use it to 
communicate better.
