16

I want to make a dummy Win32 EXE file that is much larger than it should be. By default, a boilerplate Win32 EXE file is 80 KB. I want a 5 MB one for testing some other utilities.

My first idea was to add a resource, but as it turns out, embedded resources are not the same as 5 MB of code when it comes to memory allocation. Could I reference a large library and end up with a huge EXE file? If not, perhaps I could script a few thousand similar methods like AddNum1, AddNum2, etc.?

Any simple ideas are very appreciated.

Peter Mortensen
Phil
  • 6
    Could you give us an idea on what problem you are looking to solve? – Francesco Oct 01 '10 at 15:35
  • 1
    The question does not make clear what the purpose is, so the answers will not help others much. – frast Oct 01 '10 at 15:41
  • 1
    To add more context to the question: – Phil Oct 01 '10 at 15:43
  • 1
    I am calling CreateProcess. When doing so, I need it to allocate more memory than it would for a simple (empty) Win32 project. In this case, I want CreateProcess to load the target Win32 EXE and allocate 5 MB of memory to it. – Phil Oct 01 '10 at 15:43
  • 2
    Please use the `edit` button to add more detail to your post. – Hello71 Oct 01 '10 at 22:47

19 Answers

17

What about simply defining a large static char array?

char const bigarray[5*1024*1024] = { 1 };

See also my other answer in this thread, where I suggest statically linking to big libraries. That will pull in real code, provided you reference enough of the libraries' code.

EDIT: Added a non-zero initialization, as data containing zeros only is treated in an optimized fashion by the compiler/linker.

EDIT: Added reference to my other answer.

EDIT: Added const qualifier, so bigarray will be placed amongst code by many compilers.

Peter G.
  • Not quite what I want to do. I want the exe on disk to be larger, not the memory usage. Thank you though. – Phil Oct 01 '10 at 15:07
  • 1
    If you never use it, it should never get loaded into physical memory. So unless you're concerned about the impact it has on the available virtual address space, don't worry about it. – Tyler McHenry Oct 01 '10 at 15:22
  • 1
    @Phil: You say you want the size larger only on the disk and not the memory usage but then in the actual question, you say memory allocation should be 5MiB. Am I missing something? – legends2k Oct 01 '10 at 15:32
  • Tyler, I retract my initial response :) This looks like it may work well. Testing... – Phil Oct 01 '10 at 15:36
  • @legends2k: Sorry, the size on disk is important too. I am using CreateProcess, which allocates memory. I need it to allocate as much memory as the size of the file on disk. – Phil Oct 01 '10 at 15:42
  • Does it matter whether it's 5MB of code or 5MB of data? Generating 5MB of code is a lot harder. – Ferruccio Oct 01 '10 at 17:23
  • AFAIUnderstand, this code will not make the executable bigger, only the memory allocated at runtime? – Klaim Oct 01 '10 at 17:35
  • 3
    @Klaim, static POD objects are allocated at link time which means they are in the executable. – Peter G. Oct 01 '10 at 17:51
  • Is it true for all compilers? – Klaim Oct 05 '10 at 13:56
  • 1
    @Klaim I know of no exception. It's also true that many compilers will place const static POD objects together with code in a read-only section. I added the const in my code example now. – Peter G. Oct 05 '10 at 14:17
  • Thanks, I thought it was not guaranteed but now that I think about my experience in embedded software, I remember that the size of the executable was dependent on the number of elements in a const static table... Thanks for the confirmation. – Klaim Oct 05 '10 at 14:22
9
char big[5*1024*1024] = {1};

You need to initialize it to something other than 0 or the compiler/linker may optimize it.

Ferruccio
  • This will only initialize the first element, the rest will be zero. http://stackoverflow.com/questions/201101/how-to-initialize-an-array-in-c – SuperJames Oct 01 '10 at 21:31
  • 2
    That's true, but for the purposes of this question it doesn't matter exactly what it's initialized to. Setting the first element to a non-zero value seems to be enough to prevent the compiler from optimizing that variable. In other words when you set it to all zeros the compiler simply says "there should be 5 million zeroes here". Whereas this forces it to say "there's a one, followed by a zero, followed by a zero..." – Ferruccio Oct 02 '10 at 10:30
9

If it's the file size you want to increase, then append a text file of the required size to the end of the exe.

I used to do this when customers would complain of small exes. They didn't realize that small exes are just as professional as larger exes. In fact in some languages there is a bloat() command to increase the size of exes, usually in BASIC compilers.

EDIT: Found an old link to a piece of code that people use: http://www.purebasic.fr/english/viewtopic.php?f=12&t=38994

An example: https://softwareengineering.stackexchange.com/questions/2051/what-is-the-craziest-stupidest-silliest-thing-a-client-boss-asked-you-to-do/2698#2698

Community
Gary Willoughby
  • 3
    What??!! Customers complaining of small EXEs? I don't think I've ever dealt with a customer that dumb. – ptomato Oct 01 '10 at 16:20
  • 2
    Yep, believe it or not! It's similar to a heavy camera. The heavier it is, the 'better' it must be! Bloat initial program releases and with each successive update claim smaller memory footprints due to further optimizations! ;) – Gary Willoughby Oct 01 '10 at 16:53
  • Isn't there some checksum validation for EXEs that will fail if you append a file? – Amnon Oct 01 '10 at 17:24
  • Not unless you've programmed the exe to check itself. – Gary Willoughby Oct 01 '10 at 18:31
  • There is one advantage to heavy cameras: they are less prone to camera shake (Newton's F=ma and all that!). Can't really say the same about large EXEs though :-) – psmears Oct 15 '10 at 08:05
8

Fill the EXE file with NOPs in assembler.

Peter Mortensen
Benny
6

How about just adding binary zeroes to the end of the .exe?

Johan Kotlinski
5

You can create big static arrays of dummy data. That would bump your exe size, though it would not be real code.

jv42
  • That does seem like the simplest and most easily controlled way to do something like this. – Andrew Barber Oct 01 '10 at 15:03
  • 2
    I thought of this too when I saw the question, but won't it be optimized out? – legends2k Oct 01 '10 at 15:04
  • I was thinking of including a boatload of windows libraries to make it bigger. Any merit to that? – Phil Oct 01 '10 at 15:08
  • @Phil maybe if you can link them statically. Otherwise it's just a bunch of exports/references. – enriquein Oct 01 '10 at 15:11
  • @sth: I know one can turn optimizations off, but I think there should be some other way to do it, without losing optimizations; like adding a resource binary say a 5 MB res via a .rc to the binary. – legends2k Oct 01 '10 at 15:15
  • Reason is, when testing, optimizations might be needed i.e. to match the actual non-bloated code. Also rc should work since OP said it's Win32. – legends2k Oct 01 '10 at 15:18
  • Resource will not work. Somehow the Win32 CreateProcess method KNOWS resources are not allocated in the same memory space. – Phil Oct 01 '10 at 15:21
  • Powerbasic has a compiler macro named #BLOAT that does this. Maybe other compilers can do this as well? I know people use this technique on trojans to attempt to match the real app's size. – enriquein Oct 01 '10 at 15:24
4

Use a big array of constant data, like explicit strings:

char *dummy_data[] = {
    "blajkhsdlmf..(long script-generated random string)..",
    "kjsdfgkhsdfgsdgklj..(etc...)...jldsjglkhsdghlsdhgjkh",
};

Unlike variable data, constant data often falls in the same memory section as the actual code, although this may be compiler- or linker-dependent.

Edit: I tested the following and it works on Linux:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int i, j;

    puts("char *dummy_data[] = {");
    for (i = 0; i < 5000; i++) {
        fputs("    \"", stdout);
        for (j = 0; j < 1000; j++) putchar('a' + rand() % 26);
        puts("\",");
    }
    puts("};");
    return 0;
}

Both this code and its output compile cleanly.

Josh Leitzel
Edgar Bonet
  • I tried something like this and ended up with a C2026 error. Looks like there is a 16K limit on arrays? – Phil Oct 01 '10 at 15:20
  • 1
    If your strings are 1K long, then you only need 5K elements in the array, which makes the array size 20K (it's an array of pointers to constant strings). – Edgar Bonet Oct 01 '10 at 15:31
  • I would've just used inline assembly to create a NOP slide somewhere. It would be cleaner, more semantically correct, and somewhat more self documenting. Not to mention that it's less likely to get messed up by the compiler. – Michael J. Gray Dec 17 '14 at 16:32
3

I've found that even with optimizations, raw strings are kept as is in the compiled executable file.

So the way to go is:

  • go to http://lipsum.org/
  • generate a lot of text
  • add a .cpp file to your program
  • add a static const string that will have the generated text as value
  • compile
  • check the size.

If your compiler has a limit on the size of a string literal, then just use one paragraph per static string.

The added size should be easy to guess.

Klaim
2

You could try creating some sort of recursive template that would generate a lot of different instantiations. This could possibly cause a big increase in code size.

Mark B
2

Use Boost and compile the executable with debug information.

tstenner
1

Write a program that generates a lot of code.

printf("000000000");
printf("000000001");
// ...
printf("010000000");
Amnon
  • Yes that's the most obvious way to produce a lot of extra code (as opposed to just static data). You can also do it using copy-and-paste, leaning on the paste key. – ChrisW Oct 01 '10 at 16:52
  • 1
    @ChrisW: if you're using copy-and-paste, exponential copy-and-paste is better than leaning on a key: Ctrl-A,C,V,V, repeat log(n) times – Amnon Oct 01 '10 at 17:28
1

I admit, I'm a Linux/UNIX guy. Is it possible to statically link an executable in Windows? You could then reference some heavy libs and blow up your code size as much as you want without writing too much code yourself.

Another idea I pondered while reading your comment to my first answer is appending zeros to your file. As said, I'm no Windows expert, so this might not work.

Peter G.
  • 1
    "Is it possible to statically link an executable in Windows?" -- Yes it is, but the linker will only link/include the objects from the library which are needed (referenced) by the application. – ChrisW Oct 01 '10 at 16:50
  • @ChrisW : There may be an option like "--whole-archive" for ld in linux to force the linker to include everything ?? – ThR37 Oct 01 '10 at 17:02
1

Add a 5MB (bmp) image.

jacknad
1

After you do all the methods listed here, compile with the debug flag and with the highest optimization flag (gcc -g -O3).

Peter Mortensen
Aboelnour
0

If all else fails, you could still create an assembly language source file where you have an appropriate number of db statements emitting bytes into the code segment, and link the resulting code object to your program as extern "C" { ... }.

You might need to play with the compiler/linker to prevent the linker from optimizing away that dummy "code" object.

ndim
0

Use #define to define lots of macros that hold very long strings, and use those macros in many places inside your program.

sadananda salam
0

You could do this:

REM generate gibberish of the desired size
dd if=/dev/random of=entropy count=5000k bs=1
REM copy the entropy to the end of the file
copy /b someapp.exe + entropy somefatapp.exe

If it were a batch file, you could even add it as a post compilation step so it happened automatically.

You can generally copy as much information as you want to the end of an exe. All the code and resources are stored as offsets from the beginning of the file, so increasing its size shouldn't affect it.

(I'm assuming you have dd in Windows. If not, get it).

Seth
0

Write a code generator that generates arbitrary random functions. The only trick then is making sure that it doesn't get optimized out and with separate compilation that shouldn't be hard.

BCS
0

Statically link wxWidgets to your application. It will instantly become 5 MB large.

Stefan Steiger