3

Is there any runtime performance difference between including an entire library (with probably hundreds of functions) and then using only a single function like:

#include <foo>

int main(int argc, char *argv[]) {
    bar(); // from library foo
    return 0;
}

And pasting the relevant code fragment from the library directly into the code, like:

void bar() {
    /* ... */
}

int main(int argc, char *argv[]) {
    bar(); // defined just above
    return 0;
}

What would prevent me from mindlessly including all of my favourite (and most frequently used) libraries at the beginning of my C files? This popular thread, C/C++: Detecting superfluous #includes?, suggests that the compilation time would increase. But would the compiled binary be any different? Would the second program actually outperform the first one?


Related: what does #include <stdio.h> really do in a C program

Edit: the question here is different from the related Will there be a performance hit on including unused header files in C/C++? question, as here only a single file is included. I am asking whether including a single file is any different from copy-pasting the actually used code fragments into the source. I have slightly adjusted the title to reflect this difference.

Matsmath
  • My rule of thumb is to include only what you need. If you include things you do not need, you could confuse people, as they may think you included them for a reason when you actually didn't. – NathanOliver May 02 '16 at 13:45
  • What you are showing is including headers. Headers are mostly comprised of *declarations*, which do not affect performance at all. The most you can spoil here is the compilation time. But speaking of libraries: a decent linker will just eliminate all of the unused stuff anyway. – Eugene Sh. May 02 '16 at 13:45
  • @Eugene you should make this an answer! ;) – mame98 May 02 '16 at 13:49
  • Possible duplicate of [Will there be a performance hit on including unused header files in C/C++?](http://stackoverflow.com/questions/25008801/will-there-be-a-performance-hit-on-including-unused-header-files-in-c-c) and [are-unused-includes-harmful-in-c](http://stackoverflow.com/questions/7919258/are-unused-includes-harmful-in-c-c) – gdlmx May 02 '16 at 13:50
  • 1. As others have mentioned, the header files you include are only partially related to the libraries the linker might end up loading. 2. If you use static linking, the linker will link in only the modules you actually use/need. (Whether anything extra gets pulled in then depends on whether the library implementor built one or multiple functions per module.) 3. If you're using dynamic linking (which is more common these days), you "get" the whole library, but it's typically (a) demand-paged and (b) shared with other processes, so you don't bear the full cost. – Steve Summit May 02 '16 at 13:51
  • I know of a project that had one include file that included everything. They also had a configure script that took over two weeks to run on certain older machines. There could be a correlation. – Art May 02 '16 at 13:51
  • @Art Hearing this makes me think that the project had some other problems as well :) – Eugene Sh. May 02 '16 at 13:53
  • About the update: No, it is the same. You can invoke `gcc -E` and see that the included file is just pasted into the source (see the sketch just after these comments). Except for the increased compilation time, again. – Eugene Sh. May 02 '16 at 14:02
  • @EugeneSh. It's pretty widely used. I'd estimate there's a two digit percent chance that something you've done today went through their code. I was exaggerating a bit since the "certain older machines" was a Sun 3/80, but their configure script was epic. Name withheld to protect the guilty. – Art May 02 '16 at 14:03
  • @Art As someone working on code that will probably run under something *you* are going to do in the near future, I must say that it is not a good measure of proper coding practices and quality :) – Eugene Sh. May 02 '16 at 14:06
  • @EugeneSh: so it is not true that one would rather avoid breaking a program into several files because optimization across different files is more difficult? – Matsmath May 02 '16 at 14:07
  • @Matsmath You are confusing header files with translation units (source files). Working with separate translation units makes it harder for the compiler to optimize, as it sees one unit at a time rather than the whole picture. But your question is not about that. – Eugene Sh. May 02 '16 at 14:08
  • You may want to note that your comments apply to statically linked libraries (not the default in most modern systems). Dynamically linked libraries are slightly different. – Martin York May 02 '16 at 15:18
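To make Eugene Sh.'s `gcc -E` remark concrete, a minimal sketch (file names are hypothetical):

/* bar.h -- a hypothetical header containing only a declaration */
void bar(void);

/* main.c */
#include "bar.h"

int main(void) {
    bar(); /* declared by the header, defined in the library */
    return 0;
}

/* "gcc -E main.c" prints the preprocessed translation unit: the
 * contents of bar.h appear inline, exactly as if they had been
 * typed into main.c by hand. Only compilation time differs. */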

4 Answers

12

There is no performance difference as far as the final program is concerned. The linker will only link functions that are actually used into your program. Unused functions present in the library will not get linked.

If you include a lot of libraries, the program might take longer to compile.

The main reason why you shouldn't include all your "favourite libraries" is program design. Your file shouldn't include anything except the resources it is using, to reduce dependencies between files. The less your file knows about the rest of the program, the better. It should be as autonomous as possible.
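A minimal sketch of that linker behaviour, assuming a GNU toolchain, a static library, and one function per source file (all file and library names are hypothetical):

/* used.c -- the only function main() references */
int used(void) { return 42; }

/* unused.c -- never referenced by main() */
int unused(void) { return -1; }

/* main.c */
int used(void);
int main(void) { return used(); }

/* Build each function into its own object file, archive them,
 * then link:
 *   gcc -c used.c unused.c
 *   ar rcs libfoo.a used.o unused.o
 *   gcc main.c libfoo.a -o main
 * "nm main" shows a symbol for used() but none for unused():
 * the linker extracted only the archive member it needed. */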

Lundin
  • Suppose you are, for example, writing a linked list ADT. Common sense dictates that everyone using your linked list shouldn't be forced to include math libraries, SQL databases, 3D graphics, etc. Applying common sense when doing program design will get you quite far. – Lundin May 02 '16 at 14:07
  • If the library interface is not using any of the types defined in the mentioned libraries, there is no reason the user should be "forced" to include them, right? But if it is, and the accompanying header is not including them, then it is simply not a properly written header in any sense. – Eugene Sh. May 02 '16 at 14:14
  • @EugeneSh. If that were the case, the library would be broken. – Lundin May 02 '16 at 14:16
  • "The linker will only link functions that are actually used into your program." This is certainly linker dependent, and also depends on how the library was constructed. Should a library's source be only one .c file with many functions, using one function may pull in the entire package. – chux - Reinstate Monica May 02 '16 at 14:58
6

This is not such a simple question, and so it does not deserve a simple answer. There are a number of things you may need to consider when determining what is more performant.

  1. Your Compiler And Linker: Different compilers optimize in different ways. This is easily overlooked and can cause some issues when making generalisations. For the most part, modern compilers and linkers will optimize the binary to include only what is strictly necessary for execution. However, not all compilers will optimize your binary.
  2. Dynamic Linking: There are two kinds of linking against other libraries. They behave in similar ways but are fundamentally different. When you link against a dynamic library, the library remains separate from the program and is only loaded at runtime. Dynamic libraries are usually known as shared libraries, and should therefore be treated as if they are used by multiple binaries. Because these libraries are often shared, the linker will not remove any functionality from the library, as it does not know which parts of the library will be needed by all binaries within that system or OS. Because of this, a binary linked against a dynamic library takes a small performance hit, especially immediately after starting the program, and this hit grows with the number of dynamic linkages.
  3. Static Linking: When you link a binary against a static library (with an optimizing linker), the linker 'knows' what functionality you need from that particular library and removes the functionality that will not be used from your resulting binary. Because of this the binary becomes leaner and therefore more performant. This does, however, come at a cost (a minimal build sketch follows the example below).

    e.g.

    Say you have an operating system that uses a library extensively across a large number of binaries throughout the entire system. If you were to build that library as a shared library, all binaries share that single copy, whilst perhaps using different functionality. Now say you instead statically link every binary against that library. You end up with extensive duplication of binary functionality, as each binary carries its own copy of the parts it needs from that library.
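As a rough illustration of the two linkage models above (library and file names are hypothetical; commands assume gcc on a Unix-like system):

/* foo.c -- a hypothetical one-function library */
int foo(void) { return 1; }

/* main.c */
int foo(void);
int main(void) { return foo(); }

/* Static linking: the needed code is copied into the executable.
 *   gcc -c foo.c
 *   ar rcs libfoo.a foo.o
 *   gcc main.c libfoo.a -o main_static
 *
 * Dynamic linking: the executable keeps a reference that the
 * runtime loader resolves at startup; one copy of libfoo.so can
 * be shared by every process that uses it.
 *   gcc -shared -fPIC foo.c -o libfoo.so
 *   gcc main.c -L. -lfoo -Wl,-rpath,. -o main_dynamic
 *
 * "ldd main_dynamic" lists libfoo.so as a runtime dependency;
 * "ldd main_static" does not mention it. */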

Conclusion: Before asking what will make my program more performant, you should probably ask what is more performant in your case. If your program is intended to take up the majority of the CPU's time, probably go for a statically linked library. If your program is only run occasionally, probably go for a dynamically linked library to reduce disk usage. It is also worth noting that a header-based library will give you only a very marginal performance gain (if any) over a statically linked binary, and will greatly increase your compilation time.

silvergasp
  • You overlook the per-call cost of calling a function in a shared library on non-Windows platforms. Unix systems do lazy dynamic linking by [forcing all calls to functions in shared libraries to bounce through PLT](http://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/). Once dynamic linking is done, it's a single extra `jmp rel32` on x86. – Peter Cordes May 07 '16 at 06:26
2

It depends greatly on the libraries and how they are structured, and possibly on the compiler implementation.

The linker (ld) will only assemble code from the library that is referenced by your code, so if a library contains two functions a and b, but you only reference a, then function b may not be in the final code at all.

If the header files (includes) contain only declarations, and the declarations do not result in references to the library, then you should not see any difference between typing out just the parts you need (as in your example) and including the entire header file.

Historically, the linker ld would pull in code one file at a time, so as long as functions a and b were placed in different files when the library was created, there would be no implications at all.

However, if the library is not carefully constructed, or if the compiler implementation does pull in every single bit of code from the library whether needed or not, then you could see performance implications: your code will be bigger, may be harder to fit into the CPU cache, and the CPU execution pipeline would occasionally have to wait to fetch instructions from main memory rather than from cache.
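A sketch of that caveat, assuming a GNU toolchain (file names are hypothetical): when both functions live in one translation unit, referencing either one drags in the whole object file.

/* ab.c -- both functions end up in a single object file */
int a(void) { return 1; }
int b(void) { return 2; }

/* main.c -- references only a() */
int a(void);
int main(void) { return a(); }

/* Build:
 *   gcc -c ab.c
 *   ar rcs libab.a ab.o
 *   gcc main.c libab.a -o main
 * Because a classic ld extracts whole object files from an
 * archive, b() is linked in alongside a() even though it is
 * never called. Splitting a() and b() into separate source
 * files, or compiling with -ffunction-sections and linking
 * with -Wl,--gc-sections, avoids the dead weight. */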

Soren
0

It depends heavily on the libraries in question.

They might initialize global state, which would slow down the startup and/or shutdown of the program. Or they might start threads that run in parallel with your code; if you have multiple threads, this might impact performance too.

Some libraries might even modify existing library functions, perhaps to collect statistics about memory or thread usage, or for security auditing purposes.
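A minimal sketch of such hidden initialization, using the GCC/Clang constructor attribute (the function name is hypothetical; C++ global objects with nontrivial constructors behave similarly):

#include <stdio.h>

/* Hypothetical library source: this function runs during program
 * startup, before main() is entered, merely because the object
 * file containing it got linked in (GCC/Clang extension). */
__attribute__((constructor))
static void lib_init(void)
{
    fprintf(stderr, "library initialising its global state\n");
}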

  • Can a library do anything the main program is not telling it to do? – Eugene Sh. May 02 '16 at 13:54
  • @EugeneSh. A function call from your main program may result in internal function calls in the library. If your program doesn't call any function in the library, then no code in the library will get linked or executed. – Lundin May 02 '16 at 13:57
  • @Lundin Sure. But in this case you can't tell they are unused. You have to link them anyway. – Eugene Sh. May 02 '16 at 13:58
  • A C++ library can have global objects with nontrivial constructors. Just linking the library might cause this code to run. This depends on the toolchain and linker options. It usually does require the code to use at least one symbol from the library explicitly, though. – Marvin Sielenkemper May 02 '16 at 15:25