123

I need to optimize the size of my executable severely (ARM development) and I noticed that in my current build scheme (gcc + ld) unused symbols are not getting stripped.

Running arm-strip --strip-unneeded on the resulting executables / libraries doesn't change the output size of the executable (I have no idea why; maybe it simply can't).

What would be the way (if it exists) to modify my building pipeline, so that the unused symbols are stripped from the resulting file?


I wouldn't even think of this, but my current embedded environment isn't very "powerful" and saving even 500K out of 2M results in a very nice loading performance boost.

Update:

Unfortunately the current gcc version I use doesn't have the -dead-strip option and the -ffunction-sections... + --gc-sections for ld doesn't give any significant difference for the resulting output.

I'm shocked that this even became a problem, because I was sure that gcc + ld should automatically strip unused symbols (why do they even have to keep them?).

Yippie-Ki-Yay
  • How do you know that symbols are not used? – zvrba Jul 16 '11 at 14:32
  • 1
    Not referenced anywhere => not being used in the final application. I assume that building a call graph while compiling / linking shouldn't be very hard. – Yippie-Ki-Yay Jul 16 '11 at 14:52
  • 1
    Are you trying to reduce the size of the .o file by removing dead *symbols*, or you are trying reduce the size of the actual code footprint once loaded into executable memory? The fact that you say "embedded" hints at the latter; the question you ask seems focused on the former. – Ira Baxter Jul 16 '11 at 16:54
  • @Ira I'm trying to reduce the output executable size, because *(as an example)* if I attempt to port some existing applications, which use `boost` libraries, the resulting `.exe` file contains many unused object files and due to the specifications of my current embedded runtime, starting a `10mb` application takes much longer than, for example, starting a `500k` application. – Yippie-Ki-Yay Jul 16 '11 at 19:51
  • 9
    @Yippie: You want to get rid of code to minimize load time; the code you want to get rid of are unused methods/etc. from libraries. Yes, you need to build a call graph to do this. It isn't that easy; it has to be a global call graph, it has to be conservative (can't remove something that might get used) and has to be accurate (so you have as close to an ideal call graph, so you really know what isn't used). The big problem is doing a global, accurate call graph. Don't know of many compilers that do this, let alone linkers. – Ira Baxter Jul 16 '11 at 20:12
  • Yes, but how do you know that they are not referenced anywhere? – zvrba Jul 17 '11 at 06:05
  • What version of gcc/ld are you using? – Foo Bah Jul 19 '11 at 05:01
  • If you upgrade your toolchain (should be pretty straightforward, fear not), possibly Nemo's advice will start working? – Prof. Falken Jul 19 '11 at 16:13

11 Answers

139

For GCC, this is accomplished in two stages:

First, compile the code, telling the compiler to place each function and data item into its own section within the translation unit. This applies to functions (including class member functions) and external variables, and is enabled with the following two compiler flags:

-fdata-sections -ffunction-sections

Then link the translation units together using the linker optimization flag (this causes the linker to discard unreferenced sections):

-Wl,--gc-sections

So if you had one file called test.cpp that defined two functions, one of which was unused, you could omit the unused one with the following command to gcc (or g++):

gcc -Os -fdata-sections -ffunction-sections test.cpp -o test -Wl,--gc-sections

(Note that -Os is an additional compiler flag that tells GCC to optimize for size)
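As a rough illustration of what the two compiler flags change, here is a small C file. It is a sketch, not something from the original answer: all names (report_used, report_unused, used_counter, unused_table, app_entry) are made up, and app_entry merely stands in for a real main. The section names in the comments are what GCC typically emits on ELF targets and may differ elsewhere.

```c
/* Sketch only: hypothetical names, section names as typically seen on ELF. */

/* With -ffunction-sections, each function lands in its own section
   instead of sharing one .text section. */
int report_used(int value)      /* typically .text.report_used */
{
    return value * 2;
}

int report_unused(int value)    /* typically .text.report_unused, never called */
{
    return value * 3;
}

/* With -fdata-sections, each variable gets its own section as well. */
int used_counter = 1;           /* typically .data.used_counter */
int unused_table[256] = { 1 };  /* typically .data.unused_table, never read */

/* Hypothetical stand-in for main(): the only code that references
   report_used and used_counter. */
int app_entry(void)
{
    used_counter += report_used(20);
    return used_counter;
}
```

If a real main calls app_entry, linking with -Wl,--gc-sections gives the linker the chance to discard the sections holding report_unused and unused_table, since nothing reachable refers to them. Running objdump -h on the object file is one way to check that the per-function and per-variable sections are actually there.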

maxschlepzig
J T
  • 4
    Please note this will slow down the executable as per GCC's option descriptions (I tested). – metamorphosis Oct 24 '16 at 21:41
  • 1
    With `mingw` this does not work when statically linking libstdc++ and libgcc with the flag `-static`. The linker option `-strip-all` helps quite a bit, but the generated executable (or dll) is still about 4 times bigger than what Visual Studio would generate. The point is, I have no control over how `libstdc++` was compiled. There should be an `ld`-only option. – Fabio Dec 27 '17 at 09:18
36

If this thread is to be believed, you need to supply -ffunction-sections and -fdata-sections to gcc, which will put each function and data object in its own section. Then you pass --gc-sections to GNU ld to remove the unused sections.

sigjuice
Nemo
  • Unfortunately this didn't work, I have no idea why, probably the `arm-gcc` compiler issue or something `ARM`-related. If there is something else I could try... – Yippie-Ki-Yay Jul 14 '11 at 02:02
  • @Michael: True, but the documentation links I provided are current and seem to support the idea that it should work... Oh well. – Nemo Jul 14 '11 at 02:46
  • It's still valid. I've never understood why it's not the default for GCC; I'd even question whether the current default (keep unused symbols) makes any sense. – MSalters Jul 14 '11 at 10:48
  • 6
    @MSalters: It's not the default, because it violates the C and C++ standards. Suddenly global initialization doesn't happen, which results in some very surprised programmers. – Ben Voigt Jul 16 '11 at 14:38
  • @Ben Voigt: Obviously such a symbol is in use if its presence matters. So essentially you're saying that GCC can't properly detect which symbols are actually used? – MSalters Jul 18 '11 at 08:07
  • @MSalters: You're assuming that initialization has no side-effects. In C and C++, that is a bad assumption. – Ben Voigt Jul 18 '11 at 12:36
  • @Ben Voigt: Nope, didn't assume that at all. I just observed that _GCC_ assumes that. Did I understand your point correctly, that GCC will also eliminate symbols that are unused _except_ for the side effects of their initialization? – MSalters Jul 18 '11 at 12:39
  • 1
    @MSalters: Only if you pass the non-standard behavior-breaking options, which you proposed to make the default behavior. – Ben Voigt Jul 18 '11 at 12:42
  • @Ben Voigt: Obviously. So I assume that if I'd provide a patch that would properly detect unused symbols, then `-gc-sections` would become the default? – MSalters Jul 18 '11 at 13:05
  • 1
    @MSalters: If you can make a patch that runs static initializers if and only if the side effects are necessary to the correct operation of the program, that would be awesome. Unfortunately I think doing it perfectly often requires solving the halting problem, so you'll probably need to err on the side of including some extra symbols at times. Which basically is what Ira says in his comments to the question. (BTW: "not necessary to the correct operation of the program" is a different definition of "unused" than how that term is used in the standards) – Ben Voigt Jul 18 '11 at 13:18
  • 3
    @BenVoigt in C, global initialization cannot have side-effects (initializers must be constant expressions) – M.M Jul 11 '14 at 01:38
  • 2
    @Matt: But that's not true in C++... and they share the same linker. – Ben Voigt Jul 11 '14 at 02:11
  • @BenVoigt I was rebutting your claim "You're assuming that initialization has no side-effects. In C and C++, that is a bad assumption" and the C part of "it violates the C and C++ standards". How does it allegedly violate the C standard? – M.M Jul 11 '14 at 02:39
  • 1
    @Matt: After some research I think you're right. It's a bad assumption only in C++. I suppose C can get code into the initialization sequence using pragmas and attributes that control sections, but not via initialization. – Ben Voigt Jul 11 '14 at 03:25
  • OK. A good point about C++ nonetheless. I use these switches in production in C, but had not considered how they might interact with static initialization if I were also to use them in C++. – M.M Jul 11 '14 at 03:40
26

You'll want to check your docs for your version of gcc & ld:

However for me (OS X gcc 4.0.1) I find these for ld

-dead_strip

Remove functions and data that are unreachable by the entry point or exported symbols.

-dead_strip_dylibs

Remove dylibs that are unreachable by the entry point or exported symbols. That is, suppresses the generation of load commands for dylibs which supplied no symbols during the link. This option should not be used when linking against a dylib which is required at runtime for some indirect reason such as the dylib has an important initializer.

And this helpful option

-why_live symbol_name

Logs a chain of references to symbol_name. Only applicable with -dead_strip. It can help debug why something that you think should be dead strip removed is not removed.

There's also a note in the gcc/g++ man that certain kinds of dead code elimination are only performed if optimization is enabled when compiling.

While these options/conditions may not hold for your compiler, I suggest you look for something similar in your docs.

Community
Michael Anderson
23

Programming habits can help too: e.g. add static to functions that are not accessed outside a specific file; use shorter names for symbols (this can help a bit, though likely not much); use const char x[] where possible; ... This paper, though it talks about dynamic shared objects, contains suggestions that, if followed, can help make your final binary smaller (if your target is ELF).
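A minimal sketch of the first and third habits, in C. The names (square, sum_of_squares, banner, banner_ptr, banner_length) are invented for illustration, and how much space is actually saved depends on the compiler, flags and target.

```c
#include <string.h>

/* static gives the helper internal linkage: no other file can reference it,
   so the compiler is free to inline it and drop the standalone copy. */
static int square(int x)
{
    return x * x;
}

/* Externally visible function that uses the static helper. */
int sum_of_squares(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += square(i);
    }
    return total;
}

/* const char[]: only the characters are stored, in read-only data. */
const char banner[] = "ready";

/* For comparison: a pointer variable stored in addition to the string
   literal; in position-independent code it also needs a relocation. */
const char *banner_ptr = "ready";

/* Small accessor so the array is visibly used. */
size_t banner_length(void)
{
    return strlen(banner);
}
```

Comparing the output of size or nm for the two declaration styles on your own toolchain is the easiest way to see whether the difference matters for your binary.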

ShinTakezou
  • 4
    How does it help to choose shorter names for symbols? – fuz Feb 22 '16 at 14:03
  • 1
    if symbols are not stripped away, ça va sans dire—but it seems it needed to be said now. – ShinTakezou Feb 23 '16 at 18:52
  • @fuz The paper is talking about dynamic shared objects (eg. `.so` on Linux), so the symbol names have to be retained so that APIs like Python's `ctypes` FFI module can use them to look up symbols by name at runtime. – ssokolow Aug 11 '19 at 09:24
19

The answer is -flto. You have to pass it to both your compilation and link steps, otherwise it doesn't do anything.

It actually works very well - reduced the size of a microcontroller program I wrote to less than 50% of its previous size!

Unfortunately it did seem a bit buggy - I had instances of things not being built correctly. It may have been due to the build system I'm using (QBS; it's very new), but in any case I'd recommend you only enable it for your final build if possible, and test that build thoroughly.

Timmmm
  • 1
    "-Wl,--gc-sections" doesn't work on MinGW-W64, "-flto" works for me. Thanks – rhbc73 Mar 18 '16 at 01:33
  • The output assembly is very weird with `-flto` I do not understand what it does behind the scene. – ar2015 Sep 23 '18 at 04:44
  • I believe with `-flto` it doesn't compile each file to assembly, it compiles them to an intermediate representation (GIMPLE in GCC, LLVM IR in Clang), and then the final link compiles them as if they were all in one compilation unit. That means it can eliminate unused functions and inline non-`static` ones, and probably other things too. For the LLVM implementation, see https://llvm.org/docs/LinkTimeOptimization.html – Timmmm Sep 23 '18 at 11:09
13

While not strictly about symbols, if going for size - always compile with -Os and -s flags. -Os optimizes the resulting code for minimum executable size and -s removes the symbol table and relocation information from the executable.

Sometimes - if small size is desired - playing around with different optimization flags may - or may not - have significance. For example toggling -ffast-math and/or -fomit-frame-pointer may at times save you even dozens of bytes.

zxcdw
  • Most optimization tweaks will still yield correct code as long as you comply with the language standard, but I've had `-ffast-math` wreak havoc in completely standards-compliant C++ code, so I would never recommend it. – Raptor007 Apr 12 '17 at 23:20
11

It seems to me that the answer provided by Nemo is the correct one. If those instructions do not work, the issue may be related to the version of gcc/ld you're using. As an exercise, I compiled an example program using the instructions detailed here:

#include <stdio.h>
void deadcode(void) { printf("This is d dead codez\n"); }
int main(void) { printf("This is main\n"); return 0; }

Then I compiled the code using progressively more aggressive dead-code removal switches:

gcc -Os test.c -o test.elf
gcc -Os -fdata-sections -ffunction-sections test.c -o test.elf -Wl,--gc-sections
gcc -Os -fdata-sections -ffunction-sections test.c -o test.elf -Wl,--gc-sections -Wl,--strip-all

These compilation and linking parameters produced executables of size 8457, 8164 and 6160 bytes, respectively, with the most substantial contribution coming from the --strip-all option. If you cannot produce similar reductions on your platform, then maybe your version of gcc does not support this functionality. I'm using gcc (4.5.2-8ubuntu4) and ld (2.21.0.20110327) on Linux Mint 2.6.38-8-generic x86_64.

Gearoid Murphy
8

strip --strip-unneeded only operates on the symbol table of your executable. It doesn't actually remove any executable code.

The standard libraries achieve the result you're after by splitting all of their functions into separate object files, which are combined using ar. If you then link the resultant archive as a library (i.e. give the option -l your_library to ld), then ld will only include the object files, and therefore the symbols, that are actually used.

You may also find some of the responses to this similar question of use.

Community
Andrew Edgecombe
  • 2
    The separate object files in the library is only relevant when doing a static link. With shared libraries, the whole library is loaded, but not included in the executable, of course. – Jonathan Leffler Jul 16 '11 at 03:13
5

From the GCC 4.2.1 manual, section -fwhole-program:

Assume that the current compilation unit represents whole program being compiled. All public functions and variables with the exception of main and those merged by attribute externally_visible become static functions and in a affect gets more aggressively optimized by interprocedural optimizers. While this option is equivalent to proper use of static keyword for programs consisting of single file, in combination with option --combine this flag can be used to compile most of smaller scale C programs since the functions and variables become local for the whole combined compilation unit, not for the single source file itself.
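The excerpt mentions the externally_visible attribute. A minimal sketch of how it might be used follows. The function names cube and plugin_hook are hypothetical, and the KEEP_VISIBLE macro with its compiler guard is an assumption added so the file still builds with compilers that do not recognise the attribute.

```c
/* Sketch only: hypothetical names; the guard is an added assumption. */
#if defined(__GNUC__) && !defined(__clang__)
#define KEEP_VISIBLE __attribute__((externally_visible))
#else
#define KEEP_VISIBLE /* attribute not available: expands to nothing */
#endif

/* Under -fwhole-program this public function is treated as if it were
   static, so it can be inlined or removed when nothing else uses it. */
int cube(int x)
{
    return x * x * x;
}

/* Marked externally_visible: stays a public symbol even under
   -fwhole-program, e.g. because code outside this unit calls it. */
KEEP_VISIBLE int plugin_hook(int x)
{
    return cube(x) + 1;
}
```

Without -fwhole-program the attribute has no visible effect, so the same source can be built either way.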

J P
awiebe
  • Yeah but that presumably doesn't work with any kind of incremental compilation and is probably going to be a bit slow. – Timmmm May 16 '14 at 13:46
  • @Timmmm: I suspect you're thinking of `-flto`. – Ben Voigt Jul 11 '14 at 03:23
  • Yes! I subsequently found that (why is it not any of the answers?). Unfortunately it seemed a bit buggy, so I'd only recommend it for the final build and then test that build a lot! – Timmmm Jul 18 '14 at 10:27
5

I don't know if this will help with your current predicament as this is a recent feature, but you can specify the visibility of symbols in a global manner. Passing -fvisibility=hidden -fvisibility-inlines-hidden at compilation can help the linker to later get rid of unneeded symbols. If you're producing an executable (as opposed to a shared library) there's nothing more to do.
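For the fine-grained approach, a minimal sketch in C might look like the following. The names api_scale and internal_double are made up, and the explicit hidden attribute is only there to make the intent visible; with -fvisibility=hidden on the command line it would be the default. Note that -fvisibility-inlines-hidden only applies to C++.

```c
/* Sketch only: hypothetical function names. */

/* Explicitly exported: remains visible outside a shared object even when
   the file is compiled with -fvisibility=hidden. */
__attribute__((visibility("default")))
int api_scale(int value);

/* Hidden: callable from other files of the same binary, but not exported,
   which gives the toolchain more freedom to discard or inline it. */
__attribute__((visibility("hidden")))
int internal_double(int value)
{
    return value * 2;
}

/* Definition of the exported function, built on the hidden helper. */
int api_scale(int value)
{
    return internal_double(value) + 1;
}
```

On targets that do not support symbol visibility, GCC is expected to ignore the attribute with a warning, so treat this as an ELF-oriented example.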

More information (and a fine-grained approach for e.g. libraries) is available on the GCC wiki.

Luc Danton
-1

You can use the strip binary on an object file (e.g. an executable) to strip all symbols from it.

Note: it modifies the file in place and doesn't create a copy.

ton4eg