Questions tagged [intel-pmu]

Questions related to the use of the Intel Performance Monitoring Unit (PMU), which provides hardware counters for measuring the performance of currently executing code.

The Intel performance monitoring unit (PMU) provides performance counters that track performance-related metrics for the currently executing code.

They are useful while profiling code, and are supported by Intel's VTune, Linux's perf command and the Windows Performance Toolkit.

The available counters and the details of how to program them vary by CPU microarchitecture. Both are documented in Chapters 18 and 19 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3.
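Because the counter layout differs between CPUs, Intel processors describe their architectural PMU through CPUID leaf 0AH. The sketch below only decodes the EAX and EDX values that this leaf returns; it does not execute CPUID itself, and the names `PmuInfo` and `decode_cpuid_0a`, as well as the sample register values, are hypothetical and chosen for illustration.

```python
from typing import NamedTuple


class PmuInfo(NamedTuple):
    """Fields of CPUID leaf 0AH (architectural performance monitoring)."""
    version: int         # EAX[7:0]   architectural PMU version ID
    gp_counters: int     # EAX[15:8]  general-purpose counters per logical CPU
    gp_width: int        # EAX[23:16] bit width of general-purpose counters
    fixed_counters: int  # EDX[4:0]   number of fixed-function counters
    fixed_width: int     # EDX[12:5]  bit width of fixed-function counters


def decode_cpuid_0a(eax: int, edx: int) -> PmuInfo:
    """Decode raw EAX/EDX values obtained from CPUID leaf 0AH."""
    return PmuInfo(
        version=eax & 0xFF,
        gp_counters=(eax >> 8) & 0xFF,
        gp_width=(eax >> 16) & 0xFF,
        fixed_counters=edx & 0x1F,
        fixed_width=(edx >> 5) & 0xFF,
    )


if __name__ == "__main__":
    # Hypothetical sample values, not read from real hardware.
    print(decode_cpuid_0a(0x07300404, 0x00000603))
```

On a real system the raw register values have to come from somewhere else, for example a small C helper that executes CPUID or a tool that dumps CPUID leaves; that part is left out here.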

65 questions
3
votes
3 answers

How can we know the exact number of hardware performance counters built into a CPU?

After doing some reading on hardware performance counters, I can claim that all Intel processors support them. So, in order to access these additional hardware registers, i.e. hardware performance…
M.Mrd
  • 31
  • 1
2
votes
1 answer

Using the perf events from perf list programmatically

When I run perf list on my Linux system I get a long list of available perf events. Is it possible to list and use these events programmatically from another process, using perf_event_open(2)? That is, how can I get this list from another process and…
BeeOnRope
  • 51,419
  • 13
  • 149
  • 309
2
votes
1 answer

only 2 PERF_TYPE_HW_CACHE events in perf event group

Working on a custom implementation on top of perf_event_open, I need to monitor multiple PERF_TYPE_HW_CACHE events concurrently. The Intel manual states that there are 4 programmable counters per thread (or 8 if HyperThreading is disabled) for my CPU's…
Orion Papadakis
  • 338
  • 1
  • 14
2
votes
1 answer

Perf Imprecise Call-Graph Report

Recent Intel processors provide a hardware feature (a.k.a. Precise Event-Based Sampling (PEBS)) to access precise information about the CPU state on some sampled CPU events (e.g., e). Here is an extract from Intel 64 and IA-32 Architecture's…
TheAhmad
  • 700
  • 1
  • 5
  • 17
2
votes
1 answer

How to read the PMC (Performance Monitoring Counter) of an x86 Intel processor

My desktop has an Intel x86_64 processor and runs Ubuntu. I know there is the perf tool to get a list of statistics for a program. But what I am trying to do is read the performance counters directly without using the perf tool. First…
rhdhyekw93
  • 23
  • 3
2
votes
1 answer

Best event counter to use for measuring wall clock time using perf tools

Simple yet complicated question: what counter should I use to get perf tools to measure wall clock time? As a baseline, I think the first thing I need to measure when profiling code is just wall clock time, to get a first idea where the code takes…
Peter
  • 715
  • 2
  • 7
  • 18
2
votes
1 answer

Inconsistent values of ARM PMU cycles counter

I'm trying to measure the performance of my code in the Linux kernel with the PMU. First of all I want to test the PMU, so I created a simple loop of a couple of operations in the kernel. I placed it under a spin lock with interrupts disabled so my test code can't be…
scopichmu
  • 109
  • 9
2
votes
1 answer

What is the meaning of IB read, IB write, OB read and OB write. They came as output of Intel® PCM while monitoring PCIe bandwidth

I am trying to measure the PCIe bandwidth of NIC devices using Intel® Performance Counter Monitor (PCM) tools. But, I am not able to understand the output of it. To measure the PCIe bandwidth, I executed the binary pcm-iio. This binary helps to…
2
votes
0 answers

Determine L1 fill buffer occupancy related to stores on Intel

To determine the L1D fill buffer occupancy related to loads, one can use the L1D_PEND_MISS events, in particular L1D_PEND_MISS.PENDING, which is documented as follows: Counts duration of L1D miss outstanding, that is each cycle number of Fill…
BeeOnRope
  • 51,419
  • 13
  • 149
  • 309
2
votes
0 answers

How to narrow down intel PCM data to a single process?

I'm trying to use Intel Performance Counter Monitor (PCM) to understand L3 cache miss and some other performance criteria in my code. I'm not sure how to make sense out of the numbers I'm getting and would appreciate some insight. I expect ideally…
Amir
  • 151
  • 1
  • 11
2
votes
0 answers

value of PMC (Performance Monitoring Counter) for L3 cache-misses is too high

I'm looking for a way to estimate the number of L3 cache misses by using the 'IA32_PERFEVTSELx' and 'IA32_PMCx' MSR pair on my Linux PC with an Intel CPU (Intel i7 6700, Skylake). To do that, I installed a timer in the kernel and it reported the value of a…
nickeys
  • 117
  • 2
  • 9
2
votes
0 answers

Intel PMU event for L1 cache hit event

I'm trying to count the number of cache hits at different levels (L1, L2 and L3) of cache for a program on an Intel Haswell processor. I wrote a program to count the number of L2 and L3 cache hits by monitoring the respective events. To achieve that, I…
Mike
  • 1,521
  • 1
  • 15
  • 33
2
votes
2 answers

Perf tool stat output: multiplex and scaling of "cycles"

I am trying to understand the multiplexing and scaling of the "cycles" event in the "perf" output. The following is the output of the perf tool: 144094.487583 task-clock (msec) # 1.017 CPUs utilized 539912613776 instructions …
Kailash Akilesh
  • 133
  • 2
  • 6
2
votes
1 answer

How to measure late prefetches and killed prefetches on Haswell micro-architecture?

I am using Intel Xeon 2660 v3 and issuing lots of software prefetches to exploit the MLP as well as to reduce the stall time. Now I want to profile the application to get the overall gain due to software prefetches. In the paper "Improving the…
A-B
  • 487
  • 2
  • 18
2
votes
1 answer

Reading performance counters for Intel Xeon in userspace

I want to read performance counters for an Intel Xeon using a shell script in userspace. Oprofile will not work as it is too rigid to fulfill my requirements. I am using FC13. Thanks
ahmed
  • 21
  • 2