3

Assume that you have not only executable file, but also source code files.

My problem is to calculate correct stack size of running process only for local variables, return address, passing argument. I was trying to use VMMap developed by MS. Because it can catch allocated memory in system with categories such as stack. However, it also contains guard page, paging file(s) and so on. Thus, stack size from VMMap was overestimated.

I would like to change the way to solve the problem. I will trace stack to draw actual call tree by using StackWalker64 of WinAPI and get symbol table from either executable or source code. But there is a problem that symbol table from executable such as ELF is not readable.

Now, I am planning to apply doxygen which is open source project with lexer of compiler. Because doxygen only provides function list with their return type and function argument, I don't know about local variables. So, I also need lexer to make complete symbol table as pre-processing. But it kind of complicated one. I am not sure it is best solution. Is there better way to solve?

Ira Baxter
  • 88,629
  • 18
  • 158
  • 311
Hexoul
  • 161
  • 2
  • 13
  • I have in the past done this by allocating a stack, filling it with a pattern and giving that allocated stack to the thread on creation. These were embedded systems. After running the application, examine the allocated stack to locate where the pattern no longer exists and that is the stack depth used, for that particular run of the application. – dgsomerton Apr 24 '16 at 08:20
  • I think that is about stack allocation problem with fixed size. Your approach have to modify the structure of system. And it also remains overestimated problem. Depending on stack configuration size, there will be free memory in stack. – Hexoul Apr 24 '16 at 08:35
  • 1
    Unless you have really, really good test cases you're probably going to get this estimate wrong. You need to exercise the code inside and out every conceivable way to test the high-water mark on stack allocation. – tadman Apr 24 '16 at 10:20

1 Answers1

1

What OP wants to do is statically compute worst-case stack depth.

This is very hard to do. To accomplish it, he needs:

  • The source code for every function (as raw material used to derive following facts)
  • The "symbol table" (all declarations) for every function
  • A call graph of the application code across compilation units
  • A deep knowledge of what his particular compiler does to generate and optimize code

The "symbol table" needs to be compiler-accurate so that the estimation process knows which declared data (conceptually) goes into the stack. OP will need what amounts to a full compiler front-end for his particular dialect of C++. [OP mistakenly thinks a "lexer" will give him a symbol table. It will not].

Now consider constructing the global call graph. The basics seems simple; if function "bar" contains "foo(x)" then "bar" calls "foo". But there are many complications: overloading, virtual functions, indirect calls, and casts. Overloading is presumably resolved by name and type resolution; this gets messy in the face of templates (consider SFINAE). Virtual functions and indirect calls forces one to build a conservative points-to analyzer; an ugly enough cast may force the assumption of a call to any argument-compatible function. Points-to analyzers come in varying degrees of precision; low precision may produce a call graph with many bogus (conservative) edges which will throw off the size estimation. The compiler won't provide this global call graph since it only operates on single compilation units.

Finally, we can consider constructing a stack size estimate. Here one needs to understand the sizes and alignments use to represent each declared data type, and how the particular compiler of interest allocates local variables to the stack. Generally sequential coding blocks { .... } { ... } overlap stack locations. One also needs to understand how expressions are evaluated and how arguments are passed as these impact the stack usage. Finally, one needs to understand how the compiler allocates registers and what optimizations the compiler can/did(?) apply, as such optimizations will affect expression stack usage as well as how many local variables actually allocated to the stack. This is an awful lot to know, and the only trustworthy source of knowledge is the compiler itself. Rather than trying to replicate all this machinery, it is likely better to either get the compiler to provide its actual stack size allocation per function (I believe GCC will do this), or give up getting a precise result and conservatively estimate the stack demand by assuming every declared local consumes stack space (in this case it is unclear what one should do to estimate expression stack usage; one might assume that every variable and intermediate expression result takes stack space according to its type).

With stack space estimates per function, and a call graph, a simple analysis of the call graph can produce stack demands per call chain from the root. The max of these is the stack estimate needed. (Note: this assumes that each function uses its full stack demand for every call; that's clearly a conservative estimate). This part is pretty easy.

Overall this is a complex analysis. Ideally you can get the compiler to provide stack size estimates and basic call-graph facts. Building the points-to analysis is difficult and gets very hard as the size of the application gets big.

Arguably you can bend GCC to help provide the compilation-unit level data. Clang is presumably designed to be bent to provide that same data. Neither offers specific support for global points-to analysis to my knowledge. It is unclear the GCC and Clang handle the Windows dialects of C++; they may.

Our DMS Software Reengineering Toolkit and its C++ front end is designed to provide symbol tables, the result of name resolution (e.g., resolving overloads) and can easily extract local call facts. It handles both GCC and MS dialects of C++. DMS also provides support for building global points-to analyses and a global call graph; while we have not used this specifically for C++, we have used it to process C applications of some 16 million lines.

All this difficulty explains why people often punt and try to see how big the stack is using dynamic analysis. If OP wants a static analyzer, he needs to be prepared to invest significant effort to get it.

Ira Baxter
  • 88,629
  • 18
  • 158
  • 311