
For a large piece of software developed in C, we first declare all the self-defined functions in a separate header file (e.g. myfun.h). After that, whenever we write code (e.g. main.c) that uses the functions listed in myfun.h, we have to #include "myfun.h". I'm wondering how this works, because even though I include the function declarations from the header file before the main body, main.c still cannot see the function definitions. I guess it will search the library to get the function details... Am I right?

Thomas
  • You may want to read up on function/variable declaration and definition. – R Sahu Apr 03 '18 at 16:43
  • https://stackoverflow.com/questions/1410563/what-is-the-difference-between-a-definition-and-a-declaration. – R Sahu Apr 03 '18 at 16:44
  • 2
    Do you know what **linking** means? –  Apr 03 '18 at 16:45
  • If you are writing a large software, you should consider studying -at least for inspiration- the source code of some [free software](https://en.wikipedia.org/wiki/Free_software) -similar in domain or in ambition- to yours. You'll find many of them, e.g. on [github](http://github.com/) – Basile Starynkevitch Apr 03 '18 at 16:46
  • https://stackoverflow.com/questions/49542064/does-every-h-file-have-a-corresponding-object-file/49542618#49542618 – Ben Apr 03 '18 at 16:48
  • 1
    The header declares what's available elsewhere. For example, `` tells the compiler about the functions etc available from the standard C library for doing I/O operations. The header is used where the functions are defined (to check that the definition matches the declaration), and where the functions are used. Your code will be similar; your `main.c` code uses the declarations from `myfun.h`, but you have to supply the implementations when the code is linked. The `` header is special; the standard C library is linked automatically. You must link your functions explicitly. – Jonathan Leffler Apr 03 '18 at 16:50

1 Answer


When you say "it will search the library for the function details" you're not far off, but that isn't quite right. A function declaration, i.e. a function prototype, only contains enough information for the compiler to do two things:

  • First, the compiler registers the function as a known identifier, so it knows what you're talking about when you call it, as opposed to a random string of letters followed by parentheses (which, without a prototype, is exactly what a call is to the compiler: an error).

  • Second, the compiler uses the function prototype to check code correctness. Correctness here means that every call to the function matches the prototype in both arity and type. In other words, a call to int square(int a, int b); must pass exactly two arguments, both integers (see the sketch just below this list).
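
A minimal sketch of both points, using the square prototype from above (the definition at the bottom is illustrative):

int square(int a, int b);                /* declaration: name, arity, types only */

int main(void)
{
    int ok = square(3, 4);               /* matches the prototype, so it compiles */
    /* square(3);        rejected: too few arguments                              */
    /* square("3", 4);   invalid: char * passed where an int is expected          */
    return ok == 12 ? 0 : 1;
}

/* The definition; the compiler checks it against the prototype too. */
int square(int a, int b) { return a * b; }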

The program doesn't "search the library," though. A function name without parentheses is not a function call but the function's address. When you call a function, the processor jumps to that memory location and executes the code it finds there. (This assumes the function has not been inlined.)
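
For instance (a small sketch; greet is a made-up function used purely to show that a bare function name is an address):

#include <stdio.h>

void greet(void) { puts("Hello from greet"); }

int main(void)
{
    void (*fp)(void) = greet;            /* 'greet' without parentheses: just an address */
    printf("greet lives at %p\n", (void *)fp);
    fp();                                /* the call makes the processor jump there */
    return 0;
}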

Where is this function located, though? It depends. If you wrote the function in the same module, i.e. a .c file that got compiled into an object file and linked with main.c into a single executable, then the function will live somewhere in the .text section of that executable. In other words, it's just a slight offset from the main function's entry point. In a huge project that offset won't be so slight, but it will still be smaller than a jump into a separately loaded object.
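
As a sketch of that single-executable case (file names and the square helper are illustrative; the two files would be built together with something like cc main.c square.c -o app):

/* --- square.c --- */
int square(int a) { return a * a; }

/* --- main.c --- */
int square(int a);                       /* normally pulled in via a header such as myfun.h */

int main(void)
{
    /* The linker fixes up this call with square's offset
       in the executable's .text section. */
    return square(4) == 16 ? 0 : 1;
}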

Having said that, if you compiled this hypothetical function into a DLL which you call from your main program, then the function's address will be determined in one of two ways:

  1. Either you will have generated a .lib/.a file (depending on whether you're on Windows or Linux) containing the function declarations and addresses, or:
  2. You use run-time linking, where the main program resolves the function addresses when it loads the .dll/.so into its address space. First it determines where to load the library. You can give DLLs preferred base addresses to optimize load time; otherwise, the library is loaded at the first available address and its function addresses have to be recalculated against that new base, hampering initial load time. Once the libraries are loaded into the program's memory, though, there shouldn't be any performance hit thereafter. (A sketch of this approach follows the list.)
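
A minimal sketch of option 2 on Linux, assuming a hypothetical libsquare.so that exports int square(int); on Windows the equivalents are LoadLibrary and GetProcAddress:

#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
    void *lib = dlopen("./libsquare.so", RTLD_NOW);   /* map the library into our address space */
    if (!lib) {
        fprintf(stderr, "%s\n", dlerror());
        return 1;
    }

    /* dlsym hands back the function's address inside the loaded library */
    int (*square)(int) = (int (*)(int))dlsym(lib, "square");
    if (square)
        printf("square(5) = %d\n", square(5));

    dlclose(lib);
    return 0;
}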

Going back to the preprocessor, it's important to note two things. First, it runs before any compilation takes place. This matters: since the program is not really being "compiled" while the preprocessor is doing its thing, macros are not type-safe. (Insert Haskell joke about C "type safety".) This is why you don't, or shouldn't, see many macros in C++: anything that can be accomplished with macros in C can be accomplished with const and inline functions in C++, with the added benefit of type safety. The sketch below shows the difference.
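
A sketch of the difference (SQUARE_MACRO and square_inline are made-up names):

#include <stdio.h>

/* Pure text substitution: no type checking, and the argument is pasted twice */
#define SQUARE_MACRO(x) ((x) * (x))

/* A real function: type-checked, argument evaluated exactly once */
static inline int square_inline(int x) { return x * x; }

int main(void)
{
    int i = 3;

    /* SQUARE_MACRO(i++) would expand to ((i++) * (i++)): the side effect
       runs twice and the behaviour is undefined, yet the preprocessor and
       compiler happily accept it. */
    printf("%d\n", SQUARE_MACRO(i));

    printf("%d\n", square_inline(i++));  /* i is incremented exactly once */
    return 0;
}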

Second, the preprocessor is essentially a search-and-replace engine. For example, in the following code nothing is printed, because the preprocessor #if evaluates to false: TRUE was never defined anywhere, so it expands to 0. The preprocessor strips out the code in that section, and since the compiler proper has not run yet, the stripped code is never compiled. This is commonly used to compile debugging or logging code into debug builds only; in release builds the relevant macro is left undefined so the debug code is not included (a sketch of that pattern follows the example).

#include <stdio.h>
#include <stdlib.h>

int main()
{
    #if TRUE
    printf("Hello, World!");
    #endif

    return EXIT_SUCCESS;
}

In fact, the EXIT_SUCCESS macro I used is defined in stdlib.h and is replaced by 0 (EXIT_FAILURE by 1).
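
The debug/logging pattern mentioned earlier looks roughly like this (a sketch; the LOG macro and the -DDEBUG build flag are illustrative):

#include <stdio.h>

#ifdef DEBUG                             /* typically supplied by the build: cc -DDEBUG ... */
#define LOG(msg) fprintf(stderr, "debug: %s\n", (msg))
#else
#define LOG(msg) ((void)0)               /* release builds: the call disappears before compilation */
#endif

int main(void)
{
    LOG("entering main");                /* emits code only in debug builds */
    return 0;
}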

Back in the day, the preprocessor was basically used as duct tape to compensate for shortcomings in C.

For example, since a const variable is not a constant expression in C (and so can't be used as an ordinary array size), macros were used instead, like this:

// Not valid in C89; in C99 it only compiles inside a function,
// and then as a variable-length array
const int DEFAULT_BUFFER_SIZE = 128;
char user_input[DEFAULT_BUFFER_SIZE];

// Legal since the dawn of time
#define DEFAULT_BUFFER_SIZE 128
char user_input[DEFAULT_BUFFER_SIZE];

Another significant use of the preprocessor was for code portability, for example:

#ifdef WIN32
// Do Windows things
#else
// Handle other OSes
#endif

One trick was to declare a generic function pointer and point it at the appropriate OS-dependent function (remember that a function name without parentheses is the function's address, not an actual function call), like this:

/* WindowsVersion and OtherOSFunction are assumed to be defined elsewhere */
void (*RequestSomeKernelAction)(void) =
#ifdef WIN32
    WindowsVersion;
#else
    OtherOSFunction;
#endif

This is all to say that the code you see in header files follows these same rules. If I have the following header file:

#ifndef SRC_INCLUDES_TEST_H
#define SRC_INCLUDES_TEST_H

int square(int a);

#endif /** SRC_INCLUDES_TEST_H */

And I have this main.c file:

#define SRC_INCLUDES_TEST_H   /* deliberately defeats the include guard in test.h */
#include "test.h"

int main()
{
    int n = square(4);
}

This program will not build. The square function is not known to main.c: although I included the header where square is declared, my #define SRC_INCLUDES_TEST_H line means the #ifndef SRC_INCLUDES_TEST_H guard in test.h fails, so the preprocessor copies nothing from the header into main.c. Depending on the compiler, the call to square is then rejected as an implicit declaration, or it compiles with a warning and fails at link time because no definition of square is ever supplied. The sketch below shows roughly what the compiler actually sees.
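
Roughly what the preprocessor hands to the compiler in that case (the sort of thing gcc -E main.c would show):

/* Everything from test.h was skipped, because SRC_INCLUDES_TEST_H was
   already defined before the #ifndef guard was tested. */

int main()
{
    int n = square(4);   /* square was never declared: implicit declaration / link error */
}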

These preprocessor directives can be nested, and there are several more of them, which I highly recommend you look up, if only for historical or pedagogical reasons.

The last point I will make is that while the C preprocessor has its faults, it is a powerful tool in the right hands; in fact, the first C++ compiler Bjarne Stroustrup wrote was essentially a preprocessor that translated C++ into C.