Questions tagged [intel]

For issues related to Intel semiconductor chips and assemblies, Intel architectural features and ISA extensions, and Intel chip microarchitecture.

Intel Corporation is an American multinational semiconductor chip maker headquartered in Santa Clara, California, United States. Intel is the inventor of the x86 processor architecture and makes central processing units, motherboard chipsets, graphics processing units, network interface controllers, and many other devices related to communications and computing.

In addition to its hardware offerings, Intel also produces a variety of software, including compilers, libraries for mathematical computation (Intel MKL), threading (OpenMP, Intel Performance Primitives, Threading Building Blocks), parallel communication (MPI, OFED/True Scale InfiniBand Stack), and several other products included in the Intel Parallel Studio toolkit. In addition to these offerings, which are widely used in HPC, Intel also produces software for datacenter management and is one of the most prolific contributors to the Linux kernel.

This tag should be used for questions about Intel hardware and software.

The x86 and/or x86-64 tags are better choices for questions about assembly programming for the architecture, rather than things like performance tuning specifically for Intel's implementation of x86.


3087 questions
41
votes
3 answers

C code loop performance

I have a multiply-add kernel inside my application and I want to increase its performance. I use an Intel Core i7-960 (3.2 GHz clock) and have already manually implemented the kernel using SSE intrinsics as follows: for(int i=0; i
Ricky
  • 1,633
  • 2
  • 19
  • 22
41
votes
2 answers

How are x86 uops scheduled, exactly?

Modern x86 CPUs break down the incoming instruction stream into micro-operations (uops) and then schedule these uops out-of-order as their inputs become ready. While the basic idea is clear, I'd like to know the specific details of how ready…
BeeOnRope
  • 51,419
  • 13
  • 149
  • 309
40
votes
2 answers

How exactly do partial registers on Haswell/Skylake perform? Writing AL seems to have a false dependency on RAX, and AH is inconsistent

This loop runs at one iteration per 3 cycles on Intel Conroe/Merom, bottlenecked on imul throughput as expected. But on Haswell/Skylake, it runs at one iteration per 11 cycles, apparently because setnz al has a dependency on the last imul. ;…
Peter Cordes
  • 245,674
  • 35
  • 423
  • 606
38
votes
3 answers

Why use _mm_malloc? (as opposed to _aligned_malloc, aligned_alloc, or posix_memalign)

There are a few options for acquiring an aligned block of memory but they're very similar and the issue mostly boils down to what language standard and platforms you're targeting. C11 void * aligned_alloc (size_t alignment, size_t size) POSIX int…
Praxeolitic
  • 17,768
  • 12
  • 57
  • 109
38
votes
4 answers

Intel SSE and AVX Examples and Tutorials

Are there any good C/C++ tutorials or examples for learning Intel SSE and AVX instructions? I found few on Microsoft MSDN and Intel sites, but it would be great to understand it from the basics.
veda
  • 5,488
  • 14
  • 53
  • 76
37
votes
2 answers

SIMD instructions lowering CPU frequency

I read this article. It talked about why AVX-512 instruction: Intel’s latest processors have advanced instructions (AVX-512) that may cause the core, or maybe the rest of the CPU to run slower because of how much power they use. I think on…
HCSF
  • 1,634
  • 9
  • 23
37
votes
2 answers

Why is Intel Haswell XEON CPU sporadically miscomputing FFTs and ART?

Over the last few days I observed a behaviour of my new workstation that I couldn't explain. While researching this problem, I found there might be a bug in the Intel Haswell architecture as well as in the current Skylake generation. Before writing…
semm0
  • 917
  • 7
  • 17
36
votes
7 answers

Intel x86 Opcode Reference?

What is a relatively quick and easy method of looking up what an arbitrary opcode means (say, 0xC8) in x86? The Intel Software Developer's manual isn't very fun to search through...
user541686
  • 189,354
  • 112
  • 476
  • 821
36
votes
11 answers

Intel HAXM on macOS High Sierra (10.13)

Is there any way of using Android emulator on High Sierra (10.13)? When I run ./HAXM\ installation -u It says: HAXM silent installation only supports macOS from 10.8 to 10.12 !
Andrii Kovalchuk
  • 3,155
  • 2
  • 30
  • 29
35
votes
8 answers

Why is floor() so slow?

I wrote some code recently (ISO/ANSI C), and was surprised at the poor performance it achieved. Long story short, it turned out that the culprit was the floor() function. Not only was it slow, but it also did not vectorize (with Intel compiler, aka…
Roger
35
votes
2 answers

Significant FMA performance anomaly experienced in the Intel Broadwell processor

Code1: vzeroall mov rcx, 1000000 startLabel1: vfmadd231ps ymm0, ymm0, ymm0 vfmadd231ps ymm1, ymm1, ymm1 vfmadd231ps ymm2, ymm2, ymm2 vfmadd231ps ymm3, ymm3, ymm3 vfmadd231ps ymm4, ymm4, ymm4 vfmadd231ps ymm5,…
User9973
  • 509
  • 4
  • 8
33
votes
1 answer

What does Intel mean by "retired"?

In the Intel Manual, there is mention of a lot of performance events which have descriptions like "Mispredicted taken branch instructions retired.". What exactly does retired mean in this context? Note that I have already looked at Intel's…
merlin2011
  • 63,368
  • 37
  • 161
  • 279
33
votes
6 answers

Do Intel and AMD processors have the same assembler?

The C language was used to write UNIX to achieve portability -- the same C language program compiled using different compilers produces different machine instructions. How come Windows OS is able to run on both Intel and AMD processors?
user136281
  • 339
  • 1
  • 3
  • 4
32
votes
4 answers

Running Intel® HAXM installer takes forever with Android Studio Setup Wizard on Windows 10

I have a newly installed Android Studio; while downloading its components, I got stuck on the setup wizard step Running Intel® HAXM installer: What should I do? Will all my downloaded components be lost if I end the task of my Android Studio with my Task…
5ervant
  • 3,928
  • 6
  • 34
  • 63
32
votes
5 answers

Can one construct a "good" hash function using CRC32C as a base?

Given that SSE 4.2 (Intel Core i7 & i5 parts) includes a CRC32 instruction, it seems reasonable to investigate whether one could build a faster general-purpose hash function. According to this only 16 bits of a CRC32 are evenly distributed. So what…
DavidD
  • 351
  • 4
  • 5