3

SysProf doesn't generate proper call stacks without `-fno-omit-frame-pointer`, and GProf isn't accurate at all. Also, are profilers that work without `-fno-omit-frame-pointer` as accurate as those that rely on it?

lamefun
  • Remember that the manpage itself warns about `-fomit-frame-pointer`: “[...] It also makes debugging impossible on some machines.” – jørgensen Mar 01 '12 at 17:41
  • My distro (Fedora) compiles with it by default. – lamefun Mar 01 '12 at 18:00
  • On x86_64, `-fomit-frame-pointer` is the default, even if it is not specified on the command line. That's because there is libunwind, which makes `-fno-omit-frame-pointer` obsolete. – Gunther Piez Mar 01 '12 at 18:16
  • Have you tried CodeAnalyst? http://developer.amd.com/tools/CodeAnalyst/codeanalystlinux/Pages/default.aspx – Necrolis Mar 01 '12 at 18:51
  • If what you want to do is find ways to speed up the program, first understand that *[gprof will disappoint you](http://stackoverflow.com/questions/1777556/alternatives-to-gprof/1779343#1779343)*, and second, accuracy of measurement is not what you need. – Mike Dunlavey Mar 02 '12 at 01:39

3 Answers

4

Recent versions of Linux `perf` can be used (with `--call-graph dwarf`):

perf record -F99 --call-graph dwarf myapp

It uses `.eh_frame` (or `.debug_frame`) with libunwind to unwind the stack.

In my experience, it sometimes gets lost.

With a recent version of perf and the kernel on Haswell, you might be able to use the Last Branch Record with `--call-graph lbr`.

ysdx
  • When I do this, the stack traces are missing from `perf script` output. Any ideas? – Tavian Barnes Aug 05 '15 at 15:57
  • @TavianBarnes, I'm currently having the same [issue on Debian](https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=793409) with perf_4.0 and perf_4.1. When switching to perf_3.16 (for `perf script` => `perf_3.16 script`), it works as expected. I have not (yet) checked whether this is an upstream issue or a problem with the Debian package. – ysdx Aug 17 '15 at 09:00
  • For reference, AFAIK it copies a part of the stack to disk for each sample and unwinds it afterwards using CFI. – ysdx Aug 17 '15 at 09:05
2

There are none that I'm aware of. With frame pointers, walking a stack is a fairly simple exercise: you dereference the frame pointer to find the old frame pointer, stack pointer, and instruction pointer, and repeat until you're done. Without frame pointers you cannot reliably walk a stack without additional information, which on ELF platforms generally means DWARF CFI. DWARF is fairly complex to parse and requires reading in a fair amount of additional information, which is tricky to do within the time constraints that profilers need to work in.
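To make the frame-pointer walk concrete, here is a minimal sketch in Python over a toy model of memory. It assumes an x86-64-style layout in which the word at `[fp]` holds the caller's saved frame pointer and the word at `[fp + 8]` holds the return address, and that the outermost frame has a saved frame pointer of 0. The function name, addresses, and memory dictionary are all made up for illustration; a real profiler reads the registers and memory of the sampled thread.

```python
WORD = 8  # assumed word size (64-bit)


def walk_frame_pointers(memory, fp, pc, max_depth=64):
    """Return the call stack as a list of code addresses, innermost first.

    memory: dict mapping address -> word (toy stand-in for process memory)
    fp:     frame pointer at the time of the sample
    pc:     instruction pointer at the time of the sample
    """
    trace = [pc]
    while fp != 0 and len(trace) < max_depth:
        saved_fp = memory[fp]            # caller's frame pointer
        return_addr = memory[fp + WORD]  # where the caller resumes
        trace.append(return_addr)
        fp = saved_fp                    # step out to the caller's frame
    return trace


# Hypothetical stack: _start -> main -> foo -> bar, sampled inside bar.
memory = {
    0x6FA0: 0x6FD0, 0x6FA8: 0x401150,  # bar's frame: saved fp, return into foo
    0x6FD0: 0x7000, 0x6FD8: 0x401200,  # foo's frame: saved fp, return into main
    0x7000: 0x0,    0x7008: 0x401000,  # main's frame: end of chain, return into _start
}

trace = walk_frame_pointers(memory, fp=0x6FA0, pc=0x401120)
print([hex(addr) for addr in trace])
```

Each step costs two memory reads, which is why this is cheap enough to do at sampling time. If the compiler reuses the frame-pointer register as a general-purpose register, `memory[fp]` no longer holds a saved frame pointer and the chain is meaningless.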

One plausible method for implementing this would be to simply save the stack memory at every sample and then walk it offline using the CFI to unwind properly. Depending on the depth of the stack this could require quite a bit of storage, and the copying could be prohibitive. I've never heard of a profiler using this technique, but Julian Seward floated it as a potential implementation strategy for Firefox's built-in profiler.
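The save-then-unwind idea can be sketched as a toy model in Python. Real DWARF CFI describes, per instruction, how to compute the canonical frame address (CFA) and where saved registers live. The sketch below simplifies this to one hypothetical rule per function: CFA = sp + `cfa_offset`, with the return address stored at CFA - 8. An offset of `None` marks the outermost function. All names, addresses, and the table are assumptions made for illustration, not any profiler's actual format.

```python
from dataclasses import dataclass

WORD = 8  # assumed word size (64-bit)


@dataclass
class Sample:
    pc: int       # instruction pointer when the sample was taken
    sp: int       # stack pointer when the sample was taken
    stack: bytes  # raw copy of stack memory starting at sp


# Toy unwind table: (start_pc, end_pc, cfa_offset). Hypothetical values.
TOY_CFI = [
    (0x401000, 0x401100, None),  # _start: outermost, nothing to unwind to
    (0x401100, 0x401200, 24),    # bar
    (0x401200, 0x401300, 40),    # foo
    (0x401300, 0x401400, 16),    # main
]


def find_entry(cfi, pc):
    """Return the table entry covering pc, or None if there is none."""
    for entry in cfi:
        if entry[0] <= pc < entry[1]:
            return entry
    return None


def unwind_offline(sample, cfi):
    """Walk the saved stack copy using the toy table, innermost first."""
    trace = [sample.pc]
    pc, sp = sample.pc, sample.sp
    while True:
        entry = find_entry(cfi, pc)
        if entry is None or entry[2] is None:
            break  # no unwind info, or outermost frame reached
        cfa = sp + entry[2]
        offset = cfa - WORD - sample.sp  # return address position in the copy
        if offset + WORD > len(sample.stack):
            break  # the saved copy was too short: the trace is truncated
        pc = int.from_bytes(sample.stack[offset:offset + WORD], "little")
        trace.append(pc)
        sp = cfa  # the caller's stack pointer is the callee's CFA
    return trace


def word(value):
    return value.to_bytes(WORD, "little")


# Hypothetical stack contents for _start -> main -> foo -> bar.
stack = (bytes(16) + word(0x401250)    # bar's locals, return into foo
         + bytes(32) + word(0x401350)  # foo's locals, return into main
         + bytes(8) + word(0x401050))  # main's locals, return into _start

full = unwind_offline(Sample(pc=0x401120, sp=0x6000, stack=stack), TOY_CFI)
short = unwind_offline(Sample(pc=0x401120, sp=0x6000, stack=stack[:60]), TOY_CFI)
print([hex(a) for a in full], [hex(a) for a in short])
```

The second call illustrates the storage trade-off mentioned above: when only part of the stack is copied per sample, the unwinder runs out of data and the trace stops early.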

Ted Mielczarek
  • I have been using perf with the dwarf (stack unwind) method. It works without having to compile with `-fno-omit-frame-pointer`. – Nasir Jan 23 '18 at 05:29
1

Most profilers will have a hard time working when code is compiled with `-fomit-frame-pointer`. You probably need to avoid that flag and link against debugging versions of the libraries (which are almost certainly compiled without `-fomit-frame-pointer`) if you want to do reasonable profiling.

Perry
  • Even if a library is compiled without `-fomit-frame-pointer`, the option is enabled as long as an `-O` option is used: it's a default option on x86_64. With libunwind you can do debugging without frame pointers. – Gunther Piez Mar 01 '12 at 18:20