Practical CFLAGS considerations: -Os, -O2, and -O3

The CFLAGS settings that are fastest on your desktop might not be the best settings to use on your Linode.

Some people assume that -O3 will yield faster code than -O2 or -Os. That assumption can end up wasting your time (and your CPU cycles). About the only thing -O3 reliably gives you is larger binaries, which increase the chances of both page faults and swapping, and that can outweigh any performance gains.

NOTE: Unlike swapping, page faults don't get reduced by installing more physical RAM. Only smaller binaries (or rearranging function locations within binaries) can reduce page faults. If I'm not mistaken, a page is only 4 KB (that's the page size on x86, e.g. in Windows XP).
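If you'd rather check the page size than take my word for it, here is a minimal Python sketch. It assumes a Unix-like system where the standard `resource` module is available; the helper names (`page_size_bytes`, `pages_spanned`, `binary_pages`) are made up for illustration. The page count it reports is only an upper bound on what a fully mapped file could occupy, not what actually gets faulted in.

```python
import os
import resource


def page_size_bytes() -> int:
    """Return the VM page size reported by the kernel, in bytes."""
    return resource.getpagesize()


def pages_spanned(size_bytes: int, page_size: int) -> int:
    """Number of pages needed to hold size_bytes, rounded up."""
    return -(-size_bytes // page_size)


def binary_pages(path: str) -> int:
    """Upper bound on the pages a file could occupy if fully mapped."""
    return pages_spanned(os.path.getsize(path), page_size_bytes())
```

Running `binary_pages()` on the same program built with -Os and with -O3 gives a rough feel for how many extra pages the larger build could touch.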

Try benchmarking your most frequently used programs at the concurrency levels you encounter during normal use. You might be surprised to find that -Os gives you better performance than -O3, and sometimes even better than -O2, when you're on a Linode.
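A rough sketch of such a benchmark in Python, in case it helps. The function name `time_concurrent_runs` and its parameters are my own invention, and it only measures wall-clock time for launching the same command several times in parallel, so treat it as a starting point rather than a proper benchmark suite.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor


def time_concurrent_runs(cmd, concurrency, rounds=3):
    """Run cmd `concurrency` times in parallel, repeated `rounds` times.

    Returns the best (lowest) wall-clock time in seconds for one round.
    """
    def run_once(_):
        # Discard output so terminal I/O does not skew the timing.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL)

    best = float("inf")
    for _ in range(rounds):
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=concurrency) as pool:
            # list() forces completion and re-raises any failure.
            list(pool.map(run_once, range(concurrency)))
        best = min(best, time.perf_counter() - start)
    return best
```

Build the same program once per CFLAGS setting, then call the function on each build with the same `concurrency` and compare the numbers. Taking the best of several rounds reduces noise from other activity on the host.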

-Os = the -O2 optimizations, minus those that tend to increase code size, plus size optimizations

The code may be slightly slower, but the smaller size benefits speed too.

3 Replies

Excellent advice, especially for a Linode, where I/O is usually the bottleneck.

actually, installing more ram does reduce the overall number of page faults. the reason is that with more ram available, the kernel's vm is able to keep more pages resident, rather than having to free them and then fault later if and when they are needed. i would bet that because you are using gentoo, you are accustomed to high page fault levels that don't noticeably change by adding ram. the reason for that is that during compilation, you're constantly forcing the kernel vm to free pages to make way for new source code and object code to be loaded from disk, and for the large amounts of heap data used by the lalr parser, asm generator, and linker, which is used once and then discarded. emerging is a very vm-intense operation because the intermediate steps in the process produce so much intermediate data.

the rest of your analysis about binary size and code relocation affecting page faults is basically correct.

@inkblot:

actually, installing more ram does reduce the overall number of page faults. the reason is that with more ram available, the kernel's vm is able to keep more pages resident, rather than having to free them and then fault later if and when they are needed.

Keep in mind that page faults will happen when new processes start up, regardless of how much physical RAM is available. So adding physical RAM to a system that already has enough won't reduce these page faults.
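One way to see both effects for yourself is to count the faults a program incurs. On Linux, `getrusage` reports minor faults (serviced without disk I/O, such as mapping pages at startup) separately from major faults (which require reading from disk). The sketch below is an assumption-laden illustration: `child_page_faults` is a hypothetical helper name, and the counts will include any other child processes that finish while the command runs.

```python
import resource
import subprocess


def child_page_faults(cmd):
    """Run cmd and return (minor_faults, major_faults) it incurred.

    Uses the difference in RUSAGE_CHILDREN counters before and after,
    which covers child processes that have been waited for.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, check=True,
                   stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    minor = after.ru_minflt - before.ru_minflt
    major = after.ru_majflt - before.ru_majflt
    return minor, major
```

If the startup faults described above dominate, the minor count should stay roughly the same no matter how much RAM is free, while the major count is the one that should drop when more pages can stay resident.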

@inkblot:

i would bet that because you are using gentoo, you are accustomed to high page fault levels that don't noticeably change by adding ram.

Debian 3.1 is the only Linux distro I'm currently using. It looks like cdbs will make controlling CFLAGS across multiple Debian packages easier.

http://packages.debian.org/unstable/devel/cdbs
