Questions tagged [vtune]

Use this tag to ask questions about Intel® VTune™ Profiler, which is an advanced performance profiler to find and optimize performance bottlenecks across CPU, GPU, and FPGA systems.

140 questions
2
votes
3 answers

How do I generate symbol information to use with Linux version of Intel's VTune Amplifier?

I am using Intel VTune Amplifier XE 2011 to analyze the performance of my program. I want to be able to view the source code in the analysis results, and the documentation says I need to provide the symbol information. Unfortunately, it does not…
Dylan Klomparens
  • 2,767
  • 6
  • 32
  • 47
2
votes
1 answer

OpenCV build in debug mode with optimizations?

I'm trying to profile OpenCV using Intel VTune Amplifier. On this page, there is a list of compiler options suggested for obtaining the best analysis. As you can see, it's a mix of debug flags (e.g. -g) and optimization flags (e.g. -O2 or higher), so we…
justHelloWorld
  • 5,380
  • 5
  • 41
  • 97
2
votes
2 answers

Interpreting Intel VTune's Memory Bound Metric

I see the following when I run Intel VTune on my workload: Memory Bound 50.8%. I read the Intel doc, which says: Memory Bound measures a fraction of slots where pipeline could be stalled due to demand load…
Frank
  • 4,055
  • 7
  • 37
  • 54
2
votes
1 answer

How to restrict Vtune Analysis to a specific function

I have a program whose basic structure is as below : main() { some malloc() allocations and file reads into these buffers call to an assembly language routine that needs to be optimized to the maximum write back the…
quasar66
  • 462
  • 3
  • 13
2
votes
1 answer

Difficulty understanding memory address in Intel's VTune tool

In the image above, I used the VTune tool to view the process's flow. I also dumped memory for WinDbg. I wanted to check the disassembled section at Engine.dll+840c1 in WinDbg, but the result seems different. Can you tell me what I'm doing wrong?
백경훈
  • 101
  • 4
2
votes
3 answers

vtune - no symbols available

I have used vtune several times in the past, usually without too much trouble. Unfortunately the gaps between each use are often so long that I forget some aspects of how to use it each time. I know that the line number and symbols information needs…
Mick
  • 7,929
  • 20
  • 73
  • 162
2
votes
1 answer

Vtune total time in MKL function

I am working on a university project that asks me to give a breakdown on some tridiagonal eigensolvers implemented in MKL (11.1.). So I implemented some testbed for that and now, I am trying to profile this in vtune (Intel VTune Amplifier XE 2013…
yomar
  • 21
  • 2
2
votes
1 answer

FLOP measurement

I'm trying to estimate FLOPS for my application using Intel VTune Amplifier, and I'm using this post as a guideline: https://software.intel.com/en-us/articles/estimating-flops-using-event-based-sampling-ebs/ The problem is that I can't find the…
M_rr113
  • 29
  • 2
2
votes
1 answer

start_thread clone taking most of the time in parallel program - bad parallelization or wrong report?

I'm currently working on parallelizing a C++ program in order to improve its performance on multi-core systems. Using OpenMP and considering the challenges (thread synchronization, data accesses, etc) we finally found a way to make the entire…
leosh
  • 397
  • 2
  • 13
2
votes
0 answers

Intel vtune takes lots of time to collect information

When I use VTune to collect information on a process, I only need to focus on the results for one particular DLL (let's say X.dll). But when the process has finished running and VTune is in the collecting-information stage, one DLL (let's say Y.dll) will…
amilamad
  • 452
  • 6
  • 9
2
votes
0 answers

Big difference between Elapsed Time and CPU Time

VTune hotspots analysis reports that my program's execution time (elapsed time) was 60 seconds, of which only 10 seconds are reported as "CPU Time". I'm trying to find where the remaining 50 seconds were spent. Using Windows Process Monitor's File System…
DigitalEye
  • 1,222
  • 1
  • 13
  • 24
2
votes
2 answers

How to disassemble a compiler generated code?

I would like to see the disassembled code in the same order that the compiler generates after instruction rescheduling. By the way, I am using GDB, and when I run the command disas /m FunctionName, it gives me disassembled code in the order of source…
2
votes
1 answer

is it possible to do multiple runs in Intel VTune Amplifier XE

Is there a way to run the same test (for example, Lightweight Hotspots) multiple times in Intel VTune Amplifier XE? It is annoying to click through several steps to perform a single test. I have looked through the documentation but found nothing. Thanks!
newprint
  • 6,119
  • 9
  • 55
  • 97
2
votes
1 answer

Understanding VTune report

This is a follow-up to an existing thread (http://stackoverflow.com/questions/12724887/caching-in-a-high-performance-financial-application). I found that it's not the cache that hinders my application. To cut a long story short, I have an…
Daniel Bencik
  • 909
  • 1
  • 8
  • 27
2
votes
2 answers

Is it possible to use vtune on certain code snippets in a binary and not an entire binary?

I am adding usage of a small library to a large existing piece of software and would like to analyze (in finer detail than just in-and-out rdtsc() or gettimeofday calls) the overhead of the small library and its attribution. Using things like rdtsc()…
Palace Chan
  • 7,625
  • 5
  • 36
  • 83