5

I have read that some AMD processors let you measure the number of cache hits and misses. Is such a feature also available on Intel Core Duo machines, or do they not support it yet?

Peter Cordes
Alex12

4 Answers

4

Yes, there have been many hardware performance counters ever since the ancient Pentium Pro.

OProfile and perf on Linux, VTune on Linux/Windows, and Shark on Mac OS X can use them.

All counters are listed in the Intel architecture documentation (Volume 3B, chapter 30; list in Appendix A): http://www.intel.com/products/processor/manuals/

Even Atom has some performance registers.

One good list for different CPUs is here: http://oprofile.sourceforge.net/docs/
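If you want to read a counter from your own program rather than through a profiler, Linux exposes the same hardware counters through the `perf_event_open` system call. Below is a minimal sketch, assuming a Linux kernel with perf events enabled and a `perf_event_paranoid` setting that allows user-space counting. The helper names (`open_cache_miss_counter`, `count_cache_misses`) and the `touch_buffer` workload are made up for illustration, and which physical event the generic "cache misses" counter maps to depends on the CPU.

```c
#define _GNU_SOURCE
#include <linux/perf_event.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Open a counter for the generic "cache misses" hardware event on the
 * calling thread. Returns a file descriptor, or -1 if the kernel refuses
 * (restrictive perf_event_paranoid, no PMU inside a VM, and so on). */
static int open_cache_miss_counter(void)
{
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_CACHE_MISSES;
    attr.disabled = 1;        /* start stopped, enable explicitly below */
    attr.exclude_kernel = 1;  /* count user-space activity only */
    attr.exclude_hv = 1;
    /* glibc has no wrapper, so go through syscall():
     * pid = 0 (this thread), cpu = -1 (any), group_fd = -1, flags = 0 */
    return (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0UL);
}

/* Hypothetical workload: touch one byte per 64-byte stride in a buffer
 * that is larger than typical caches. */
static void touch_buffer(void *unused)
{
    enum { N = 16 * 1024 * 1024 };
    static volatile unsigned char buf[N];
    (void)unused;
    for (size_t i = 0; i < N; i += 64)
        buf[i]++;
}

/* Run work(arg) with the counter enabled and store the count in *misses.
 * Returns 0 on success, -1 if the counter could not be opened or read. */
static int count_cache_misses(void (*work)(void *), void *arg,
                              uint64_t *misses)
{
    int fd = open_cache_miss_counter();
    if (fd < 0)
        return -1;
    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    work(arg);
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    ssize_t n = read(fd, misses, sizeof *misses);
    close(fd);
    return n == (ssize_t)sizeof *misses ? 0 : -1;
}
```

Calling `count_cache_misses(touch_buffer, NULL, &misses)` from `main` and printing `misses` on a zero return gives a rough number for that one region of code. A return of -1 usually means the kernel or hypervisor does not expose the counter to your process.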

osgx
  • Yes, Shark can use hardware counters, take a look at http://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/SharkUserGuide/AdvancedHardwareCounterConfiguration/AdvancedHardwareCounterConfiguration.html#//apple_ref/doc/uid/TP40005233-CH10-SW1 – osgx Nov 10 '10 at 04:16
1

If you're working on Linux, there's an interesting library called LiMiT being developed at Columbia University that can read the performance counters quickly and also virtualizes them to avoid problems with processes being started and stopped, moved between processors, etc. I'm taking a class with the developer at the moment, though I don't have anything to do with the project myself.

osgx
dsolimano
0

This document certainly suggests that Intel Core Duo processors can provide the information you seek. Searching the Intel website would probably be useful too.

High Performance Mark
-1

I personally use the Time Stamp Counter via an assembly wrapper that executes the rdtsc instruction. It returns an unsigned 64-bit integer containing the number of internal clock cycles that have passed since the processor was powered up. The difference between two read-outs is the number of clock cycles required to execute the piece of code in between. Access to the instructions for cache hit read-outs may be implemented in the same manner.

I find it difficult to understand what conclusions to draw from reading the cache counters without having a time frame to relate to. This time frame should not be too long or a task switch or interrupt might affect the value.

According to Microsoft, the rdtsc instruction may not be accurate if frequency throttling (used to lower energy consumption) is enabled on the processor, which should be kept in mind (or switched off!).
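As a rough sketch of the two-read-out approach, here is a version that uses the compiler intrinsic `__rdtsc()` instead of a hand-written assembly wrapper. It assumes an x86 target with GCC or Clang (`<x86intrin.h>`); the names `tsc_delta` and `spin` are made up for illustration. Keep in mind that on current processors the counter ticks at a fixed reference rate, so the difference is closer to elapsed time than to core clock cycles.

```c
#include <stdint.h>
#include <x86intrin.h>  /* __rdtsc(); GCC/Clang on x86 only */

/* Hypothetical workload: a short busy loop the compiler cannot remove. */
static void spin(void)
{
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < 1000000UL; i++)
        sink += i;
}

/* Read the time stamp counter before and after fn() and return the
 * difference in TSC ticks. A task switch or interrupt during fn()
 * inflates the result, so keep the measured region short. */
static uint64_t tsc_delta(void (*fn)(void))
{
    uint64_t start = __rdtsc();
    fn();
    uint64_t end = __rdtsc();
    return end - start;
}
```

Calling `tsc_delta(spin)` several times and taking the minimum is a common way to reduce the noise from interrupts and task switches mentioned above.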

Olof Forshell
  • This question is about reading the performance counters, not the time stamp counter (which anyways these days reads out real time, not cycles). – BeeOnRope Oct 15 '17 at 18:07