Good C header style

Question

My C headers usually resemble the following style to avoid multiple inclusion:

#ifndef <FILENAME>_H
#define <FILENAME>_H

// define public data structures / prototypes, macros etc.

#endif  /* !<FILENAME>_H */

However, in his Notes on Programming in C, Rob Pike makes the following argument about header files:

There's a little dance involving #ifdef's that can prevent a file being read twice, but it's usually done wrong in practice - the #ifdef's are in the file itself, not the file that includes it. The result is often thousands of needless lines of code passing through the lexical analyzer, which is (in good compilers) the most expensive phase.

On the one hand, Pike is the only programmer I actually admire. On the other hand, putting several #ifdefs in multiple source files instead of putting one #ifdef in a single header file feels needlessly awkward.
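For concreteness, here is a minimal sketch of the two placements. The file names point.h and user.c, the guard macro POINT_H, and the struct and functions are all hypothetical. The two files are shown in one listing, separated by comments. Note that Pike's placement still relies on the header defining POINT_H somewhere.

```c
/* ---- point.h (hypothetical): the guard sits inside the header ---- */
#ifndef POINT_H
#define POINT_H

struct point {
    int x;
    int y;
};

/* Sum of the two coordinates; static inline so a header may define it. */
static inline int point_sum(struct point p)
{
    return p.x + p.y;
}

#endif /* !POINT_H */

/* ---- user.c (hypothetical): Pike's placement, guard in the includer ---- */
#ifndef POINT_H       /* tested here, so point.h is not even opened ... */
#include "point.h"    /* ... when POINT_H is already defined */
#endif

/* Uses the declarations that point.h provides. */
int point_sum_twice(struct point p)
{
    return 2 * point_sum(p);
}
```

Compiled as a single file, the listing still builds: POINT_H is already defined when the external guard is reached, so the #include line is skipped. With the external form, every including file has to repeat those three lines for every header it uses.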

What is the best way to handle the problem of multiple inclusion?

score 11 · Accepted Answer · answered Mar 24 '11 at 14:13


In my opinion, use the method that requires less of your time (which likely means putting the #ifdefs in the header files). I don't really mind if the compiler has to work harder if my resulting code is cleaner. If, perhaps, you are working on a multi-million line code base that you constantly have to fully rebuild, maybe the extra savings is worth it. But in most cases, I suspect that the extra cost is not usually noticeable.

answered Mar 24 '11 at 14:13

Mark Wilkins



I find this answer very useful. Funny thing is that I agree in a more ironic way: if the computer can do my job, why should I do it? ;) – DrBeco Mar 24 '11 at 14:20

score 6 · Answer 2 · answered Mar 24 '11 at 14:13


Keep doing what you do - It's clear, less bug-prone, and well known by compiler writers, so not as inefficient as it may have been a decade or two ago.

You could use the non-standard #pragma once - If you search, there's probably at least a bookshelf's worth of include guards vs pragma once discussion, so I'm not going to recommend one over the other.

answered Mar 24 '11 at 14:13

Erik


If gcc didn't suck, it would detect the standard idiom the first time the header was included, remember the filename, and ignore all future requests to include that filename. This is purely a case of compiler laziness. – R.. GitHub STOP HELPING ICE Mar 24 '11 at 14:20

@R. It still doesn't? I'd have thought every major compiler would have this a decade ago. Then again, reading a file twice isn't really the resource hog it was in 89 when Pike wrote those notes. – Erik Mar 24 '11 at 14:22

@R..: it claims to do so - http://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html. However, there are some restrictions that it uses to ensure the optimization is valid, and there could be false negatives where the restrictions are broken, but nevertheless the optimization would be valid in that case. – Steve Jessop Mar 24 '11 at 15:06

Reading it twice is a resource hog. Just a single syscall takes more time, on modern machines, than parsing a small header file. – R.. GitHub STOP HELPING ICE Mar 24 '11 at 16:06

user7610 · Answer 3 · 2015-11-26T17:37:00.870

Pike wrote some more about it in https://talks.golang.org/2012/splash.article:

In 1984, a compilation of ps.c, the source to the Unix ps command, was observed to #include <sys/stat.h> 37 times by the time all the preprocessing had been done. Even though the contents are discarded 36 times while doing so, most C implementations would open the file, read it, and scan it all 37 times. Without great cleverness, in fact, that behavior is required by the potentially complex macro semantics of the C preprocessor.

Compilers have become quite clever since: https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html, so this is less of an issue now.
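As a rough sketch of the header shape that the linked GCC page describes: the whole file sits inside a single #ifndef/#endif pair, with nothing but comments and whitespace outside it. The file name counter.h, the guard COUNTER_H, and the struct and function are hypothetical, and the exact restrictions are best checked against the linked page.

```c
/* counter.h (hypothetical): only comments appear outside the guard pair. */
#ifndef COUNTER_H
#define COUNTER_H

/* Everything the header provides lives between the two guard lines. */
struct counter {
    unsigned long hits;
};

/* Record one more hit on the given counter. */
static inline void counter_hit(struct counter *c)
{
    c->hits++;
}

#endif /* COUNTER_H */
/* A declaration or directive placed out here, after the #endif, is the
   kind of thing that can stop the preprocessor from recognizing the
   guard and skipping the file on later inclusions. */
```

When the shape is recognized, a later #include of the same file can be skipped without reopening it, as long as COUNTER_H is still defined.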

The construction of a single C++ binary at Google can open and read hundreds of individual header files tens of thousands of times. In 2007, build engineers at Google instrumented the compilation of a major Google binary. The file contained about two thousand files that, if simply concatenated together, totaled 4.2 megabytes. By the time the #includes had been expanded, over 8 gigabytes were being delivered to the input of the compiler, a blow-up of 2000 bytes for every C++ source byte.

As another data point, in 2003 Google's build system was moved from a single Makefile to a per-directory design with better-managed, more explicit dependencies. A typical binary shrank about 40% in file size, just from having more accurate dependencies recorded. Even so, the properties of C++ (or C for that matter) make it impractical to verify those dependencies automatically, and today we still do not have an accurate understanding of the dependency requirements of large Google C++ binaries.

The point about binary sizes is still relevant. Compilers (linkers) are quite conservative regarding stripping unused symbols. See "How to remove unused C/C++ symbols with GCC and ld?"

In Plan 9, header files were forbidden from containing further #include clauses; all #includes were required to be in the top-level C file. This required some discipline, of course—the programmer was required to list the necessary dependencies exactly once, in the correct order—but documentation helped and in practice it worked very well.

This is a possible solution. Another possibility is to have a tool that manages the includes for you, for example MakeDeps.

There are also unity builds, sometimes called SCU (single compilation unit) builds. There are tools to help manage that, like https://github.com/sakra/cotire

Using a build system that optimizes for the speed of incremental compilation can be advantageous too. I am talking about Google's Bazel and similar. It does not protect you from a change in a header file that is included in a large number of other files, though.

Finally, there is a proposal for C++ modules in the works, which is great stuff: https://groups.google.com/a/isocpp.org/forum/#!forum/modules. See also "What exactly are C++ modules?"

As seen in the first paragraph of your second quote, as late as 2007 the problem was still prominent. If only more recent benchmarks were available. – Yufan Lou, Nov 26 '15 at 13:29
According to https://web.archive.org/web/20021218032155/http://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html, GCC could already do the Multiple-Include Optimization in late 2002. I wonder which compiler Google was using in 2007. – user7610, Nov 26 '15 at 17:41

score 2 · Answer 4 · answered Mar 24 '11 at 14:15

The way you're currently doing it is the common way. Pike's method cuts compilation time a bit, but with modern compilers probably not by much (when Pike wrote his notes, compilers weren't optimizer-bound). It also clutters modules and it's bug-prone.

You could still cut down on multiple inclusion by not including headers from headers, but instead documenting them with "include <foodefs.h> before including this header."
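A minimal sketch of that convention, with hypothetical file names, types, and functions. The three files are shown in one listing; the #include lines that foo.c would carry appear in a comment so the listing compiles as a single file. Guards are left out for brevity.

```c
/* ---- foodefs.h (hypothetical): basic types, includes nothing ---- */
typedef unsigned int foo_id;

/* ---- foo.h (hypothetical): contains no #include of its own ---- */
/* Include <foodefs.h> before including this header. */
struct foo {
    foo_id id;      /* foo_id comes from foodefs.h */
};

foo_id foo_get_id(const struct foo *f);

/* ---- foo.c (hypothetical): lists each dependency once, in order ----
 *   #include <foodefs.h>
 *   #include "foo.h"
 */
foo_id foo_get_id(const struct foo *f)
{
    return f->id;
}
```

The cost is that every .c file must get the order right; if foo.h is included first, foo_id is an unknown type and compilation fails at the struct definition.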

score 1 · Answer 5 · answered Mar 24 '11 at 14:16

I recommend you put them in the source file itself. There is no need to complain about a few thousand needlessly parsed lines of code with today's PCs.

Additionally, it is far more work, and more code, if you check every single header in every source file that includes it.

And you would have to handle your own header files differently from standard and third-party headers.

score 1 · Answer 6 · answered Mar 24 '11 at 14:18


He may have had a point at the time he wrote this. Nowadays decent compilers are clever enough to handle this well.

answered Mar 24 '11 at 14:18

Jens Gustedt



@R., Hm, a reference, no, I don't have one. I read not too long ago here on SO, though, that gcc treats include guards as equivalent to `#pragma once`. And BTW the lexical parsing of `#ifdef` to first see which parts of a file really have to be parsed should not be too difficult, no? In any case, I have never found *this* to be a bottleneck when compiling large code. I often preprocess with `-E` to see what the preprocessing phase produces, and the time it takes is not noticeable. – Jens Gustedt Mar 24 '11 at 14:40

score 0 · Answer 7 · answered Mar 24 '11 at 14:29

I agree with your approach - as others have commented, it's clearer, self-documenting, and lower maintenance.

My theory on why Rob Pike might have suggested his approach: He's talking about C, not C++.

In C++, if you have a lot of classes and you are declaring each one in its own header file, then you'll have a lot of header files. C doesn't really provide this kind of fine-grained structure (I don't recall seeing many single-struct C header files), and .h/.c file-pairs tend to be larger and contain something like a module or a subsystem. So, fewer header files. In that scenario Rob Pike's approach might work. But I don't see it as suitable for non-trivial C++ programs.
