29

How do you maintain the #include statements in your C or C++ project? It seems almost inevitable that eventually the set of include statements in a file is either insufficient (but happens to work because of the current state of the project) or includes stuff that is no longer needed.

Have you created any tools to spot or rectify problems? Any suggestions?

I've been thinking about writing something that compiles each non-header file individually many times, each time removing an #include statement. Continue doing this until a minimal set of includes is achieved.
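A minimal sketch of that loop, under stated assumptions: `CompileCheck` and `minimize_includes` are hypothetical names, and the callback stands in for "write the lines to a temporary file and run the real compiler". Only directives spelled exactly `#include` are recognized (not `# include`).

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callback: returns true if the given lines still compile.
using CompileCheck = std::function<bool(const std::vector<std::string>&)>;

// Returns true if the line starts (after indentation) with "#include".
bool is_include_line(const std::string& line)
{
    const std::size_t first = line.find_first_not_of(" \t");
    return first != std::string::npos && line.compare(first, 8, "#include") == 0;
}

// Greedy pass: drop each #include in turn and keep the removal only if the
// file still compiles. Repeat until a full pass removes nothing.
std::vector<std::string> minimize_includes(std::vector<std::string> lines,
                                           const CompileCheck& compiles)
{
    bool removed_something = true;
    while (removed_something) {
        removed_something = false;
        std::size_t i = 0;
        while (i < lines.size()) {
            if (is_include_line(lines[i])) {
                std::vector<std::string> candidate = lines;
                candidate.erase(candidate.begin() + static_cast<std::ptrdiff_t>(i));
                if (compiles(candidate)) {
                    lines.swap(candidate);   // accept the removal
                    removed_something = true;
                    continue;                // a new line now sits at index i
                }
            }
            ++i;
        }
    }
    return lines;
}
```

The result is minimal only in the sense that no single remaining include can be removed without breaking the check; it says nothing about whether the surviving set is the right one.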

To verify that header files are including everything they need, I would create a source file that does nothing but include a header file, and try to compile it. If the compile fails, then the header file itself is missing an include.
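A rough sketch of generating such a one-line source file. Both function names and the `check_` file naming scheme are made up for illustration; the generated file would then be handed to whatever compiler the project uses.

```cpp
#include <string>

// Builds the text of a throwaway translation unit whose only job is to
// include one header. If compiling it fails, the header is missing an include.
std::string make_header_check_source(const std::string& header_path)
{
    return "#include \"" + header_path + "\"\n";
}

// Hypothetical naming scheme for the generated file: foo.h -> check_foo.h.cpp
std::string make_header_check_filename(const std::string& header_name)
{
    return "check_" + header_name + ".cpp";
}
```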

Before I create something though, I thought I should ask here. This seems like a somewhat universal problem.

criddell
  • Good plan, unless you have conditional compilation in your source or headers - your tests might not result in full code coverage. – Mark Ransom Jun 18 '09 at 19:38
  • Why not just compile the header file itself? If your compiler supports it, do it without outputting anything (e.g. gcc's -fsyntax-only). – CB Bailey Jun 18 '09 at 20:49
  • See also: http://stackoverflow.com/questions/74326/how-should-i-detect-unnecessary-include-files-in-a-large-c-project and http://stackoverflow.com/questions/614794/c-c-detecting-superfluous-includes – Eclipse Jun 19 '09 at 17:08

12 Answers

24

To verify that header files are including everything they need, I would create a source file that does nothing but include a header file, and try to compile it. If the compile fails, then the header file itself is missing an include.

You get the same effect by making the following rule: that the first header file which foo.c or foo.cpp must include should be the correspondingly-named foo.h. Doing this ensures that foo.h includes whatever it needs to compile.
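If you want to enforce that rule mechanically, a check could look roughly like this. It is only a sketch: `first_include` and `includes_own_header_first` are hypothetical helpers, the header is assumed to end in `.h`, and includes hidden inside block comments or behind `#if` are not handled.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns the name inside the first #include directive of `source`,
// or an empty string if there is none. Handles both "..." and <...>.
std::string first_include(const std::string& source)
{
    std::istringstream in(source);
    std::string line;
    while (std::getline(in, line)) {
        const std::size_t start = line.find_first_not_of(" \t");
        if (start == std::string::npos || line.compare(start, 8, "#include") != 0)
            continue;
        const std::size_t open = line.find_first_of("\"<", start + 8);
        if (open == std::string::npos) return "";
        const char closer = (line[open] == '<') ? '>' : '"';
        const std::size_t close = line.find(closer, open + 1);
        if (close == std::string::npos) return "";
        return line.substr(open + 1, close - open - 1);
    }
    return "";
}

// True if, for example, "foo.cpp" includes "foo.h" before anything else.
bool includes_own_header_first(const std::string& source_name,
                               const std::string& source)
{
    const std::string stem = source_name.substr(0, source_name.rfind('.'));
    return first_include(source) == stem + ".h";
}
```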

Furthermore, Lakos' book Large-Scale C++ Software Design (for example) lists many, many techniques for moving implementation details out of a header and into the corresponding CPP file. If you take that to its extreme, using techniques like Cheshire Cat (which hides all implementation details) and Factory (which hides the existence of subclasses), then many headers would be able to stand alone without including other headers, and make do with just forward declarations of opaque types ... except perhaps for template classes.

In the end, each header file might need to include:

  • No header files for types which are data members (instead, data members are defined/hidden in the CPP file using the "cheshire cat" a.k.a. "pimpl" technique)

  • No header files for types which are parameters to or return types from methods (instead, these are predefined types like int; or, if they're user-defined types, they're passed by reference or pointer, in which case a forward declaration of the opaque type, like merely class Foo;, is sufficient in the header file instead of #include "foo.h").
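As a small illustration of the forward-declaration point, here is a single-file sketch. `Car` and `Engine` are made-up types, and the comments mark which part would live in which file in a real project.

```cpp
#include <string>

// Forward declaration: enough for references and pointers, and for
// declaring functions that take the type as a parameter.
class Engine;  // hypothetical type; normally defined in "engine.h"

// What would live in car.h: no #include "engine.h" needed.
class Car
{
public:
    explicit Car(const Engine& engine);
    std::string describe() const;
private:
    const Engine* m_engine;  // pointer to an incomplete type is fine
};

// What would live in car.cpp, after #include "engine.h":
class Engine
{
public:
    explicit Engine(int horsepower) : m_horsepower(horsepower) {}
    int horsepower() const { return m_horsepower; }
private:
    int m_horsepower;
};

Car::Car(const Engine& engine) : m_engine(&engine) {}

std::string Car::describe() const
{
    // Only here, where members are used, must Engine be a complete type.
    return "car with " + std::to_string(m_engine->horsepower()) + " hp";
}
```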

What you need then is the header file for:

  • The superclass, if this is a subclass

  • Possibly any templated types which are used as method parameters and/or return types: apparently you're supposed to be able to forward-declare template classes too, but some compiler implementations may have a problem with that (though you could also encapsulate any templates e.g. List<X> as implementation details of a user-defined type e.g. ListX).

In practice, I might make a "standard.h" which includes all the system files (e.g. STL headers, O/S-specific types and/or any #defines, etc) that are used by any/all header files in the project, and include that as the first header in every application header file (and tell the compiler to treat this "standard.h" as the 'precompiled header file').


//contents of foo.h
#ifndef INC_FOO_H //or #pragma once
#define INC_FOO_H

#include "standard.h"
class Foo
{
public: //methods
  ... Foo-specific methods here ...
private: //data
  struct Impl;
  Impl* m_impl;
};
#endif//INC_FOO_H

//contents of foo.cpp
#include "foo.h"
#include "bar.h"
//Impl must be a complete type before "new Impl()" is used below
struct Foo::Impl
{
  Bar m_bar;
  ... etc ...
};
Foo::Foo()
{
  m_impl = new Impl();
}
... etc ...
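For reference, a version of the same idea that compiles as a single file. This is a sketch with made-up names (`Widget`, `set_value`); it also adds a destructor and disables copying, which the outline above leaves out.

```cpp
// --- what would be widget.h: no other headers needed ---
class Widget
{
public:
    Widget();
    ~Widget();                                  // defined where Impl is complete
    Widget(const Widget&) = delete;             // raw owning pointer: forbid copies
    Widget& operator=(const Widget&) = delete;
    void set_value(int value);
    int value() const;
private:
    struct Impl;                                // opaque to users of the header
    Impl* m_impl;
};

// --- what would be widget.cpp ---
struct Widget::Impl
{
    int m_value = 0;   // data members live here, hidden from the header
};

Widget::Widget() : m_impl(new Impl()) {}
Widget::~Widget() { delete m_impl; }
void Widget::set_value(int value) { m_impl->m_value = value; }
int Widget::value() const { return m_impl->m_value; }
```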
ChrisW
  • Now that I think about it, you're absolutely right. The cool thing is that I've been doing this for years and years without really thinking about why. I love it when I ignorantly do the right thing. – criddell Jun 18 '09 at 19:24
  • This is correct. But it would be incorrect to just include header files if you can get away with forward declarations. So for any return values, parameters, and pointer types, the headers for those types should not be included, even though compilation will fail without forward declaring the types. – coombez Jun 18 '09 at 19:49
  • That's sound advice, and codified by NASA's Goddard Space Flight Center in their C and C++ coding standards. – Jonathan Leffler Jun 18 '09 at 19:53
  • Note that if you move between implementations, then you might find that different standard headers include (or don't include) headers that in fact your project needs. That is nasty, but sanctioned by the C++ standard since any of the standard headers may include any of the others. – Jonathan Leffler Jun 18 '09 at 19:54
  • You can forward-declare template classes, but you cannot forward-declare template classes of the standard library because that is undefined behavior. – Rexxar Jun 19 '09 at 17:47
  • @Rexxar I didn't know that but I see you're right, e.g. http://www.cpptalk.net/forward-declaration-and-template-vt12686.html – ChrisW Jun 19 '09 at 18:07
  • This sounded nice at first, but I have my doubts whether it is good practice to use a "standard.h" as described. If you make any change to a header, your compiler goes through all the headers. Also, I checked the above-mentioned NASA coding standard. Let me cite: "3.3.7 Each header file shall #include the files it needs to compile, rather than forcing users to #include the needed files. #includes shall be limited to what the header needs; other #includes should be placed in the source file." I'd suggest doing as cited. But I like the check method described above. – AudioDroid Jan 06 '11 at 11:25
11

I have the habit of ordering my includes from high abstraction level to low abstraction level. This requires that headers have to be self-sufficient and hidden dependencies are quickly revealed as compiler errors.

For example a class 'Tetris' has a Tetris.h and Tetris.cpp file. The include order for Tetris.cpp would be

#include "Tetris.h"     // corresponding header first
#include "Block.h"      // ..then application level includes
#include "Utils/Grid.h" // ..then library dependencies
#include <vector>       // ..then stl
#include <windows.h>    // ..then system includes

And now I realize this doesn't really answer your question since this system does not really help to clean up unneeded includes. Ah well..

StackedCrooked
7

Depending on the size of your project, looking at the include graphs created by doxygen (with the INCLUDE_GRAPH option on) can be helpful.

albert
Seth Johnson
6

Detecting superfluous includes has already been discussed in this question.

I'm not aware of any tools to help detect insufficient-but-happens-to-work includes, but good coding conventions can help here. For example, the Google C++ Style Guide mandates the following, with the goal of reducing hidden dependencies:

In dir/foo.cc, whose main purpose is to implement or test the stuff in dir2/foo2.h, order your includes as follows:

  1. dir2/foo2.h (preferred location — see details below).
  2. C system files.
  3. C++ system files.
  4. Other libraries' .h files.
  5. Your project's .h files.
Community
Josh Kelley
5

One big problem with the remove-a-header-and-recompile technique is that it can lead to code that still compiles but is wrong or inefficient.

  1. Template specialization: If you have a template specialization for a specific type that is in one header and the more general template in another, removing the specialization may leave the code in a compilable state, but with undesired results.

  2. Overload resolution: A similar issue - if you have two overloads of one function in different headers, but that take somewhat compatible types, you can end up removing the version that is the better fit in one case, but still have the code compile. This is probably less likely than the template specialization version, but it is possible.
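Both effects can be reproduced in a single file. In this sketch the two namespaces stand in for the same translation unit before and after an #include was removed; all names are made up for illustration.

```cpp
#include <string>

// Simulates the translation unit while it still includes both headers.
namespace with_both_headers {
    template <typename T> std::string type_name() { return "generic"; }        // "general.h"
    template <> std::string type_name<int>() { return "specialized for int"; } // "special.h"
    std::string describe(double) { return "double overload"; }                 // "general.h"
    std::string describe(int) { return "int overload"; }                       // "special.h"

    std::string call_template() { return type_name<int>(); }
    std::string call_overload() { return describe(42); }
}

// Simulates the same code after the #include of "special.h" was removed.
namespace after_removal {
    template <typename T> std::string type_name() { return "generic"; }        // "general.h"
    std::string describe(double) { return "double overload"; }                 // "general.h"

    std::string call_template() { return type_name<int>(); }  // still compiles
    std::string call_overload() { return describe(42); }      // int converts to double
}
```

Neither call in `after_removal` produces a compiler error, so a remove-and-recompile tool would accept the removal even though the behavior changed.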

Eclipse
  • 3. Missing declarations: the code still compiles after you remove the header that declares functions defined in that file. This is OK initially, but it means the function declarations and the actual functions can drift out of sync without anyone noticing. I ran into this with the remove & recompile technique, but found I could use sparse (gcc wrapper) to detect these cases. – ideasman42 Feb 29 '12 at 12:28
2

I've been thinking about writing something that compiles each non-header file individually many times, each time removing an #include statement. Continue doing this until a minimal set of includes is achieved.

I think this is misguided, and will lead to "insufficient but just happens to work" include sets.

Suppose your source file uses numeric_limits, but also includes some header file, that for reasons of its own includes <limits>. That doesn't mean that your source file shouldn't include <limits>. That other header file probably isn't documented to define everything defined in <limits>, it just so happens to do so. Some day it might stop: maybe it only uses one value as a default parameter of some function, and maybe that default value changes from std::numeric_limits<T>::min() to 0. And now your source file doesn't compile any more, and the maintainer of that header file didn't even know that your file existed until it broke his build.
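In code, the point is simply that a file naming std::numeric_limits includes <limits> itself, whatever its other headers happen to pull in. A tiny sketch, with `smallest_value` as a made-up helper:

```cpp
#include <limits>  // included directly because this file names std::numeric_limits

// Hypothetical helper that depends on numeric_limits. It keeps compiling even
// if every other header this file includes stops including <limits>.
template <typename T>
T smallest_value()
{
    return std::numeric_limits<T>::min();
}
```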

Unless you have crippling build problems right this minute, I think the best way to remove redundant includes is just to get into the habit of looking over the list whenever you touch a file for maintenance. If you find that you have dozens of includes, and having reviewed the file you still can't figure out what each one is for, consider breaking it down into smaller files.

Steve Jessop
  • This is applicable especially to system headers. FWIW I might make a "standard.h" which includes all the system files (e.g. STL headers) used by any file in the project, and include that as the first header in every application header file (and tell the compiler to treat this "standard.h" as the 'precompiled header file'). – ChrisW Jun 19 '09 at 16:27
  • Yes, I guess with system headers you don't care so much about including unnecessary stuff. It can be precompiled, and in any case you never change them, so you don't provoke unnecessary rebuilds. On large projects, of course, it's a *really* bad idea to have any such "header of doom" that includes all the user headers. – Steve Jessop Jun 19 '09 at 17:09
2

If you use the Visual Studio compiler, you can try the /showIncludes compiler option and then parse what it emits to stderr. MSDN: "Causes the compiler to output a list of the include files. Nested include files are also displayed (files that are included from the files that you include)."
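A sketch of parsing one line of that output. It assumes the English-language format, where each line starts with `Note: including file:` and the number of spaces after the colon reflects the nesting depth; verify this against your compiler version. `IncludeNote` and `parse_show_includes_line` are hypothetical names.

```cpp
#include <cstddef>
#include <string>

struct IncludeNote
{
    bool valid;         // false if the line is not an include note
    std::size_t depth;  // 1 = included directly by the file being compiled
    std::string path;
};

// Parses a single line of /showIncludes output (assumed format, see above).
IncludeNote parse_show_includes_line(const std::string& line)
{
    static const std::string prefix = "Note: including file:";
    IncludeNote note = {false, 0, ""};
    if (line.compare(0, prefix.size(), prefix) != 0) return note;
    const std::size_t path_start = line.find_first_not_of(' ', prefix.size());
    if (path_start == std::string::npos) return note;
    note.valid = true;
    note.depth = path_start - prefix.size();  // number of spaces = nesting depth
    note.path = line.substr(path_start);
    return note;
}
```

Headers that only ever appear at depth 2 or deeper are the ones a source file gets transitively rather than by including them itself.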

Andrey.Dankevich
1

Take a look at the cppclean project. They haven't implemented that feature yet, but it is planned.

From the project site:

CppClean attempts to find problems in C++ source that slow development particularly in large code bases. It is similar to lint; however, CppClean focuses on finding global inter-module problems rather than local problems similar to other static analysis tools.

The goal is to find problems that slow development in large code bases that are modified over time leaving unused code. This code can come in many forms from unused functions, methods, data members, types, etc to unnecessary #include directives. Unnecessary #includes can cause considerable extra compiles increasing the edit-compile-run cycle.

And particularly on the #include feature:

  • (planned) Find unnecessary header files #included
    • No direct reference to anything in the header
    • Header is unnecessary if classes were forward declared instead
  • (planned) Source files that reference headers not directly #included, ie, files that rely on a transitive #include from another header

Here you can find a mirror at BitBucket.

Roman Kruglov
1

Yup. We have a preprocessor of our own which gives us access to our own macro language. It also checks that header files are only included one time. Creating a simple preprocessor checking for multiple includes should be fairly easy.

ralphtheninja
  • The preprocessor doesn't limit you to including a header file only once. What prevents double inclusion is that every header file made by a sane person is surrounded by an include guard:
    #ifndef MY_HEADER_H
    #define MY_HEADER_H
    //header code
    #endif
    
    – Earlz Jun 18 '09 at 19:26
  • Well, I suppose OP didn't really ask about multiple includes, but the question asks "How do you maintain the #include statements in your C or C++ project? [. . .] Have you created any tools to spot or rectify problems? Any suggestions?" I don't know why this got downvoted – Carson Myers Jun 18 '09 at 19:28
  • I use #pragma once, but I think it's pretty much the same thing on most modern compilers. I prefer it because it's less typing, and less error prone than trying. – criddell Jun 18 '09 at 19:29
1

As far as tools go, I've used Imagix (this was about 6 years ago) on Windows to identify includes that are unneeded, as well as includes which are needed but are only included indirectly through another include.

no-op
0

If you are coding in Eclipse with CDT, you can use the Organize Includes command. Just hit Ctrl+Shift+O and it will add the necessary includes and remove the unneeded ones.

Sergey Prigogin
-1

I usually create one source file (main.c, for example) and one header file for that source file (main.h). In the source file, I put all the main sort of "interface" functions that I use in that file (in main, it'd be main()), and then whatever functions I get after refactoring those functions (implementation details), go below. In the header file, I declare some extern functions which are defined in other source files, but used in the source file which uses that header. Then I declare whatever structs or other data types that I use in that source file.

Then I just compile and link them all together. It stays nice and clean. A typical include ... section in my current project looks like this:

#include <windows.h>
#include <windowsx.h>
#include <stdio.h>

#include "interface.h"
#include "thissourcefile.h"

//function prototypes

//source

There's an interface header which keeps track of the data structures I use in all of the forms in the project, and then thissourcefile.h, which does exactly what I just explained (declares externs, etc).

Also, I never define anything in my headers, I only put declarations there. That way they can be included by different source files and still link successfully. Function prototypes (extern, static, or otherwise) and declarations go in the header, that way they can be used lots of times--definitions go in the source, because they only need to be in one place.
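A minimal single-file sketch of that split, with made-up names; the comments mark what would go in the header and what would go in the source file.

```cpp
// --- what would be counter.h (hypothetical): declarations only ---
extern int g_count;   // declares the variable, allocates nothing
int next_count();     // function prototype

// --- what would be counter.cpp: the single definition of each ---
int g_count = 0;
int next_count() { return ++g_count; }
```

Because the header part contains no definitions, any number of source files can include it and the program still links with exactly one `g_count` and one `next_count`.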

This would obviously be different if you were creating a library, or something like that. But just for internal project linking, I find this keeps everything nice and clean. Plus, if you write a makefile, (or you just use an IDE), then compiling is really simple and efficient.

Carson Myers