[FFmpeg-devel] [PATCH] Move cbrt tables to a separate cbrt_data(_fixed).c files.

Sun Mar 13 20:53:17 CET 2016

On 13.03.2016, at 19:11, Ganesh Ajjanagadde <gajjanag at gmail.com> wrote:

> On Sun, Mar 13, 2016 at 1:46 PM, Reimar Döffinger
> <Reimar.Doeffinger at gmx.de> wrote:
>> 
>>> 
>>> --enable-hardcoded-tables partially does that; it increases memory
>>> usage as tables are burnt into the library at some gains for
>>> initialization time.
>> 
>> No, exactly not. It increases disk usage, but it decreases memory
>> usage.
>> Tables are not loaded just because they are in the binary, at
>> least not when they are in .rodata.
>> If you are really lucky, even when the cbrt tables are used but only
>> a small part of them, with --enable-hardcoded-tables you'd not
>> even necessarily end up with all of them in RAM (admittedly
>> you can also see that as a disadvantage: the chances that
>> you end up with a page fault in the middle of decoding when
>> you don't really want it is there, too).
> 
> Correct me if I am wrong, but I don't think an OS/runtime system
> actually loads at the granularity of individual tables. It would load
> at the granularity of pages, there can be a greater chance of page
> faults with larger images, such as due to larger .rodata.

The cbrt tables are a lot larger than a page, so I considered that an unnecessary detail.
Also, all tables, whether in .bss or .rodata, share the page granularity issue.
"greater chance of page faults" is not strictly true, as initialising the tables in .bss will cause page faults, too. But yes, the .rodata page faults need to go to disk and are more costly.
Unless you run a system supporting in-place execution...
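
To make the two variants being compared concrete, here is a minimal sketch. It is not the actual FFmpeg code: the toy_* names, the 8-entry size and the i*i contents are hypothetical stand-ins for the real cbrt tables and their generator. It only shows where each kind of table typically ends up on an ELF toolchain.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_SIZE 8 /* hypothetical; the real tables have thousands of entries */

/* Variant A (what --enable-hardcoded-tables amounts to): a const object with
 * an initialiser. Toolchains normally place it in .rodata, i.e. in read-only,
 * file-backed pages that are only read from disk when first touched. */
static const uint32_t toy_tab_hardcoded[TOY_SIZE] = {
    0, 1, 4, 9, 16, 25, 36, 49
};

/* Variant B: zero-initialised static storage, normally placed in .bss.
 * It takes no space in the file, but every page written by the init
 * function becomes private, dirty memory of this process. */
static uint32_t toy_tab_runtime[TOY_SIZE];

/* Fill the runtime table; i*i is a placeholder for the real computation. */
static void toy_tab_init(void)
{
    for (size_t i = 0; i < TOY_SIZE; i++)
        toy_tab_runtime[i] = (uint32_t)(i * i);
}

/* Lookups are identical in both variants; only the backing pages differ. */
static uint32_t toy_lookup_hardcoded(size_t i) { return toy_tab_hardcoded[i]; }
static uint32_t toy_lookup_runtime(size_t i)   { return toy_tab_runtime[i]; }
```

Both lookups return the same values once toy_tab_init() has run; the difference discussed here is purely in which pages back the data and when they fault.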

>> Gaining initialization time is absolutely not the point of
>> that option, it doesn't make sense and I have no idea
>> how that myth came to be (it may admittedly be true
>> in some cases, but it's not the point or very relevant).
>> Just for AAC CBRT tables (which are small in comparison),
>> hardcoded tables save 128 kB of RAM/swap once you no longer
>> use the AAC codecs compared to your patch (64 kB for the
>> double table + 2*32kB for the final tables + possibly
>> a little bit from not needing the initialization code).
> 
> Ok, so the tables are pulled into RAM on an as-needed basis. But then
> when are they offloaded? Savings (apart from the 64 kB for the double
> table) will only kick in at the time of offloading when the page is
> freed.

It will be swapped out normally, whenever the OS decides it has a better use for the RAM.
.rodata is easier to evict as its pages are non-dirty (they can simply be dropped and re-read from the file later, no write required), but the exact priority compared to e.g. cache pages is up to the OS. It could keep .rodata pages just as part of its generic file system cache if it wants...
Initialised .bss tables can of course be swapped out, too, but that requires a write, so the OS is less likely to do it, and it requires you to have swap in the first place (systems designed to leave no traces cannot use swap, for example, at least not on flash storage, where securely erasing data is more or less impossible). With sufficient craziness one could probably build a system that uses munmap/madvise and SIGSEGV or user-space page-fault handling to combine the advantages of both. I certainly wouldn't recommend it, though.
And as said, sharing .bss tables across instances additionally tends not to be possible, whereas it is trivial for .rodata tables.
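
For reference, the 128 kB figure quoted earlier and the "a lot larger than a page" remark can be checked with a few lines of arithmetic. This is only a sketch under assumptions: the 8192-entry count is inferred from the quoted sizes (64 kB of doubles, 32 kB of 32-bit values), the 4 kB page size is merely a common default, and the helper names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define CBRT_ENTRIES 8192u /* assumption: inferred from 64 kB / sizeof(double) */
#define PAGE_BYTES   4096u /* assumption: common page size, not universal */

/* Size of a table with the given number of entries. */
static size_t table_bytes(size_t entries, size_t entry_size)
{
    return entries * entry_size;
}

/* Pages covered by a page-aligned object of the given size; an object that
 * is not page-aligned may touch one page more. */
static size_t pages_spanned(size_t bytes)
{
    return (bytes + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* Memory dirtied by runtime initialisation as described in the mail:
 * one temporary double table plus two final 32-bit tables. */
static size_t runtime_tables_bytes(void)
{
    return table_bytes(CBRT_ENTRIES, sizeof(double))
         + 2 * table_bytes(CBRT_ENTRIES, sizeof(uint32_t));
}
```

With these assumptions (and an 8-byte double), runtime_tables_bytes() comes to 131072 bytes, i.e. 128 kB, and a single 32 kB final table spans at least 8 pages.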