Questions tagged [intel-pmu]

Questions related to the use of the Intel Performance Monitoring Unit (PMU), which provides hardware counters for measuring the performance of currently executing code.

The Intel performance monitoring unit provides performance counters that track performance-related metrics for the currently executing code.

They are useful while profiling code, and are supported by Intel's VTune, Linux's perf command and the Windows Performance Toolkit.

The available counters and the details of how to program them vary by CPU architecture; both are documented in Chapters 18 and 19 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3.
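
As a rough illustration of what "programming" a counter involves, the sketch below packs an event select code and unit mask into the value written to an IA32_PERFEVTSELx register, using the architectural bit layout (event select in bits 0-7, unit mask in bits 8-15, USR at bit 16, OS at bit 17, EN at bit 22). The helper name `perfevtsel` and the `ARCH_EVENTS` table are made up for this example, and only three architectural events are listed; model-specific events should be looked up in the manual.

```python
# Bit positions in IA32_PERFEVTSELx (architectural performance monitoring).
USR = 1 << 16   # count while running at privilege level > 0 (user)
OS = 1 << 17    # count while running at privilege level 0 (kernel)
EN = 1 << 22    # enable the counter


def perfevtsel(event: int, umask: int, flags: int = USR | OS | EN) -> int:
    """Pack an event select and unit mask into an IA32_PERFEVTSELx value.

    Hypothetical helper for illustration; it only builds the number and
    does not write any MSR.
    """
    if not 0 <= event <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("event and umask must each fit in 8 bits")
    return event | (umask << 8) | flags


# A few architectural events as (event select, unit mask) pairs.
ARCH_EVENTS = {
    "unhalted_core_cycles": (0x3C, 0x00),
    "instructions_retired": (0xC0, 0x00),
    "llc_misses": (0x2E, 0x41),
}

if __name__ == "__main__":
    for name, (event, umask) in ARCH_EVENTS.items():
        print(f"{name}: {perfevtsel(event, umask):#x}")
```

Tools such as perf and libpfm4 perform this encoding for you; writing the value to the MSR itself requires kernel privileges.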

65 questions
2
votes
0 answers

Determine fixed counter to event mapping with libpfm4

I'm using libpfm4 to determine Intel performance monitor counter encodings (e.g., to map between a human-readable name and the encoding). Intel PMUs have a number of "fixed counters" which can be enabled or disabled, but when enabled always count…
BeeOnRope
2
votes
2 answers

Intel Performance Monitor -- any way to monitor per-process?

How would I go about monitoring a particular process's execution (namely, its branches, from the Branch Trace Store) using the Intel Performance Counter Monitor, while filtering out other processes' information?
user541686
1
vote
0 answers

Performance Counters and IMC Counter Not Matching

I have an Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz (Haswell) processor. In a relatively idle situation, I ran the following perf commands; their outputs are shown below. The counters are offcore_response.all_data_rd.l3_miss.any_response and…
1
vote
1 answer

Let perf use certain performance counters properly with newer processors

I'm trying to use perf to measure certain events, including L1-dcache-stores, on my machine, which has a relatively new processor (i9-10900K) compared to the relatively old CentOS 7 with kernel 3.10.0-1127. The problem is that perf reports that…
Joshua Chia
1
vote
1 answer

Why does "setne %al" use "a lot of cycles" in perf annotation?

I was very confused when I saw this perf report. I have tried it several times, and this setne instruction always takes the most time in the function. The function is big, and below is just a small piece of it. The report is…
1
vote
0 answers

Performance difference of two similar assembly instructions in Visual Studio CPU Usage

I have some inline assembly which I am trying to profile. Interestingly, two very similar operations, maxss and minss, right after each other have a very different performance impact. Does anybody have experience with this? Perhaps it is some caching? Or the…
1
vote
1 answer

Why do dependency-killing instructions consume reservation station slots?

I always thought that instructions for killing dependencies, e.g. xor reg, reg, do not have to be executed and are ready for retirement as soon as the Renamer moves them to the Re-order Buffer. I just measured the number of micro-operations getting into…
Some Name
1
vote
0 answers

Trouble with PMI handler on Windows 7

I am trying to set up a performance monitoring interrupt on counter overflow to collect some information. For this I created a driver. I skip the parts of the code that are irrelevant. driver.c extern VOID EnableReadPmc(); extern VOID PmiHandle(); extern…
1
vote
0 answers

Variable event count based sampling using perf

I am trying to read the PMU event counters whenever a particular event counter overflows using perf. I know that perf works with a fixed sample period. What I am looking for is the possibility to read PMU counters each time with a different sample…
1
vote
0 answers

Why do mov reg,reg instructions reading the result of a load account for so many cycles with perf record?

I'm profiling my program on Linux using the perf tool. When checking the report, I found a place that really confuses me. I attach a few lines of the report below: 0.94 : 451ab5: mov (%r15),%r8 0.44 : 451ab8: mov …
2power10
1
vote
1 answer

Reading performance registers from the kernel

I want to read certain performance counters. I know that there are tools like perf that can do it for me from user space, but I want the code to be inside the Linux kernel. I want to write a mechanism to monitor performance counters on…
1
vote
0 answers

Using PEBS and Linux Perf to Count the number of CPU cycles passed to execute X number of instructions

I want to do something like this: After 100 million instructions have passed, query the Linux perf HW CPU cycles and record it in a file. I want to use this code to characterize the performance of applications/benchmark programs during different…
1
vote
1 answer

Intel PEBS sample context

I am using the Linux perf tool to monitor system-wide (exclude_kernel == 0) PEBS samples. I was wondering whether a PEBS sample can occur in interrupt context (i.e., while an interrupt is being served by the interrupt handler). If it is possible, is…
Proy
1
vote
0 answers

How to reset an Intel general-purpose performance counter

I know we can use the wrmsr and rdmsr instructions to set up and read the general-purpose performance counter registers. However, my question is: Do we need to reset the general-purpose performance counter register before we issue…
Mike
1
vote
0 answers

Power counters on Intel processors or GPUs

Does anyone have any experience with power counters on Intel processors (Intel performance counter management library) or GPUs? Which types of CPUs and GPUs support such counters, and how accurate are they? Do such counters need a special motherboard?
Lu Li