27

Can anyone recommend a portable SIMD library that provides a C/C++ API, works with both Intel and AMD instruction-set extensions, and compiles with Visual Studio and GCC? I'm looking to speed up things like scaling a 512x512 array of doubles, vector dot products, matrix multiplication, etc.

So far the only one I've found is http://simdx86.sourceforge.net/, but as its very first page says, it doesn't compile with Visual Studio.

There's also Intel IPP, which from what I gather doesn't work on AMD. And there's Framewave from AMD, but I had problems compiling and linking their library, and their forums are completely dead. Has anyone managed to use Framewave anywhere?

Thanks.

Budric
  • I saw someone's master's thesis on this topic once. Can't for the life of me recall what terms would bring it up in a search. – dmckee --- ex-moderator kitten Jun 11 '09 at 16:01
  • Check out the [libsimdpp](https://github.com/p12tic/libsimdpp) library. It provides a common interface for SSE2-SSE4.1, AVX, AVX2, NEON, FMA3/4 and XOP intrinsics. As a bonus, it provides a convenient dynamic dispatch mechanism: the same source code can be compiled several times with different compiler options (namespaces take care of ODR) and linked into the same executable, and the library will automatically select the best implementation for the target processor. (Disclaimer: I'm the author.) – user12 Oct 31 '13 at 03:40
  • It would be better to migrate this question to http://softwarerecs.stackexchange.com – eonil May 24 '14 at 08:45

5 Answers

11

Eigen is an MPL2-licensed, header-only C++ library with vector and matrix math that is optimized for SSE, NEON, and AltiVec. More sophisticated math operations are available in its add-on modules.

Jim Hunziker
9

Since you mention high-level operations on matrices and vectors, ATLAS, Intel's MKL, PLASMA, and FLAME may be of interest.

Some C++ matrix math libraries include uBLAS from Boost, Armadillo, Eigen, IT++, and Newmat. The POOMA library probably also includes some of these things. This question also refers to MTL.

If you're looking for lower-level portability primitives, a colleague of mine has developed a wrapper around SSE2, Altivec, VSX, Larrabee, and Cell SPE vector operations. It can be found in our source repository, but its licensing (academic) may not be appropriate if you want to distribute it as part of your work. It is also still under significant development to cover the range of application needs that it's targeted at.

Community
Phil Miller
3

Try liboil or the related ORC. ORC is especially interesting: it implements a high-level assembly language that is compiled into architecture-specific code. It is pretty sophisticated, much more so than a simple wrapper library.

mabraham
dietr
3

Check out macstl: http://www.pixelglow.com/macstl/

2

If you don't mind getting down and dirty close to the assembler level, you can always use the intrinsic functions for the SIMD instructions. They are processor-specific, i.e. SSE4 intrinsics will only run on SSE4-enabled CPUs, and it's up to you to make sure the extensions are there.
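To give a feel for what the intrinsics route looks like, here is a minimal sketch for the "scale an array of doubles" case from the question. It assumes SSE2, which every x86-64 CPU has, and falls back to a plain loop when SSE2 is not enabled at compile time. The names `scale_array` and `SKETCH_HAVE_SSE2` are made up for this example; only the `_mm_*` calls are real SSE2 intrinsics from `<emmintrin.h>`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Compile-time check only: true on x86-64, or on 32-bit x86 built with SSE2 enabled.
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>  // SSE2 intrinsics
#define SKETCH_HAVE_SSE2 1
#else
#define SKETCH_HAVE_SSE2 0
#endif

// Multiply every element of data[0..n) by factor, in place.
void scale_array(double* data, std::size_t n, double factor) {
    assert(data != nullptr || n == 0);
    std::size_t i = 0;
#if SKETCH_HAVE_SSE2
    const __m128d f = _mm_set1_pd(factor);       // broadcast factor into both lanes
    for (; i + 2 <= n; i += 2) {
        __m128d v = _mm_loadu_pd(data + i);      // unaligned load of two doubles
        _mm_storeu_pd(data + i, _mm_mul_pd(v, f));
    }
#endif
    for (; i < n; ++i) {                         // tail, or the whole array without SSE2
        data[i] *= factor;
    }
}

// Convenience overload for a flat row-major buffer, e.g. a 512x512 grid.
void scale_array(std::vector<double>& grid, double factor) {
    scale_array(grid.data(), grid.size(), factor);
}
```

Usage would be something like `std::vector<double> grid(512 * 512, 1.0); scale_array(grid, 2.0);`. Note that this only decides at compile time; it does nothing to detect the CPU at run time, which is exactly the part you have to handle yourself with raw intrinsics.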

There is a good article here about applying SIMD.

You could, however, use a compiler that generates SIMD code for you without any external libraries. VectorC is supposed to be good, although I've never used it personally. As far as I know it doesn't require any special libraries; it just spots the bits of source code that can benefit from SIMD and compiles them to whatever level of SSE you specify.
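I can't show VectorC itself, but the same idea applies to the auto-vectorizers in mainstream compilers, so here is a rough sketch of loops written so that a vectorizer has a fair chance. The function names `scale_copy` and `dot_product` are just illustrative. With GCC you would typically build with `-O3`, and with Visual Studio (2012 or later) with `/O2`; whether a given loop actually gets vectorized depends on the compiler version and flags.

```cpp
#include <cassert>
#include <cstddef>

// Element-wise scale: a simple counted loop with no cross-iteration dependency.
// The compiler may still add a runtime overlap check because out and in could alias.
void scale_copy(double* out, const double* in, std::size_t n, double factor) {
    assert((out != nullptr && in != nullptr) || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = in[i] * factor;
    }
}

// Dot product: a floating-point reduction. Vectorizing it changes the order of
// the additions, so compilers usually only do it when relaxed floating-point
// semantics are allowed (e.g. -ffast-math on GCC, /fp:fast on MSVC).
double dot_product(const double* a, const double* b, std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += a[i] * b[i];
    }
    return sum;
}
```

GCC's `-fopt-info-vec` and MSVC's `/Qvec-report:1` report which loops were vectorized, which is the quickest way to see whether the compiler did what you hoped.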

Tim Cooper
Skizz
  • Thanks, getting away from processor-specific assembly is my main goal. I don't want to have to worry about whether the CPU supports SSE or SSE2 and write 2 different versions of the code in some cases. I was hoping someone already did that in a library =). Same for compiler-specific extensions, etc. – Budric Jun 11 '09 at 18:57
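For reference, the "two versions of the code" problem mentioned in the comment above is roughly what a dispatching library hides from you. Below is a minimal sketch of run-time dispatch, assuming GCC or Clang on x86, where `__builtin_cpu_supports` and the `target` function attribute are available. On any other compiler or architecture this sketch simply uses the scalar path; MSVC would need its own CPU check (for instance via `__cpuid` in `<intrin.h>`), which is omitted here. All function and macro names are made up for the example.

```cpp
#include <cassert>
#include <cstddef>

typedef void (*scale_fn)(double*, std::size_t, double);

// Portable fallback: works everywhere.
static void scale_scalar(double* d, std::size_t n, double f) {
    for (std::size_t i = 0; i < n; ++i) d[i] *= f;
}

#if (defined(__GNUC__) || defined(__clang__)) && (defined(__x86_64__) || defined(__i386__))
#include <immintrin.h>
#define SKETCH_HAVE_AVX_PATH 1

// Compiled for AVX regardless of the command-line flags (GCC/Clang attribute).
__attribute__((target("avx")))
static void scale_avx(double* d, std::size_t n, double f) {
    const __m256d vf = _mm256_set1_pd(f);        // broadcast f into four lanes
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        _mm256_storeu_pd(d + i, _mm256_mul_pd(_mm256_loadu_pd(d + i), vf));
    }
    for (; i < n; ++i) d[i] *= f;                // scalar tail
}
#else
#define SKETCH_HAVE_AVX_PATH 0
#endif

// Ask the CPU once which implementation it can run.
static scale_fn pick_scale() {
#if SKETCH_HAVE_AVX_PATH
    if (__builtin_cpu_supports("avx")) return scale_avx;
#endif
    return scale_scalar;
}

// Public entry point: callers never see which version was chosen.
void scale_dispatch(double* d, std::size_t n, double f) {
    assert(d != nullptr || n == 0);
    static const scale_fn impl = pick_scale();   // chosen on first call
    impl(d, n, f);
}
```

Callers just write `scale_dispatch(data, count, 2.0);`. Doing this by hand for every kernel and every instruction set gets tedious quickly, which is why a library that already does it is the more attractive option.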