22

I have an application that involves large n-dimensional arrays which are very sparse. scipy.sparse has a useful 'vectorized getting and setting' feature, so that Cython can be used to populate a sparse matrix quickly.

Of course the scipy package can't handle n-dimensions. There are two packages I have found that do n-dimensional sparse arrays in python sparray and ndsparse. However it seems neither has the vectorized getting and setting feature.

So I need either:

  • a python package for n-dimensional arrays with vectorized get and set or
  • a c library for sparse arrays which I can easily access with Cython or
  • some 'roll your own' option which I guess would require a c equivalent to a python dict

For my purpose I think mapping the n-dimension coordinates back to 1 or two dimensions could work. What would be better though is to have a dict equivalent that i can access fast inside a Cython loop. I assume this rules out the python dict.

Wondering if someone could give me an example of how to use the c++ map object from within Cython?

Neal Hughes
  • 502
  • 3
  • 13
  • 14
    It's questions like this when I disagree with the `"asking us to recommend or find a tool, library or favorite off-site resource are off-topic"` close reason. The man has obviously done his research, what else can one do? – Jonathon Reinhart Nov 21 '13 at 06:46
  • 1
    Sorry I didn't realise this was a problem. I suppose I can argue, I'm not just after a library but any tips on how to do this myself. – Neal Hughes Nov 21 '13 at 06:51
  • 8
    The close reason exists because of worthless "developers" who continually post questions that ask nothing but "how do i make zipfile using jquery show me teh codez plz". But in this case, I believe this is a perfectly valid, well-researched question. +1. – Jonathon Reinhart Nov 21 '13 at 07:02
  • @JonathonReinhart: +1 for your comment. I second it. – Yavar Nov 21 '13 at 09:44
  • Add another option to your list: transform `n`-dimensional coordinates to `2`-dimensional coordinates, and *vice-versa*. I suspect this will wreck (or at least diminish) the worth of *vectorized getting and setting* but you might want to validate that. – High Performance Mark Nov 21 '13 at 10:01
  • 1
    This kind of question is indeed a problem, because if the rules were followed to the letter, it should be closed. – Ivan Nov 21 '13 at 13:59

1 Answers1

5

If you decide to go with the C dict option, you can use the C++ STL's std::map. It's unlikely that you'll find faster or more robust native code that implements a dictionary/map.

cppmap.pyx:

# distutils: language = c++

cdef extern from "<map>" namespace "std":
    cdef cppclass mymap "std::map<int, float>":
        mymap()
        float& operator[] (const int& k)

cdef mymap m = mymap()
cdef int i
cdef float value

for i in range(100):
    value = 3.0 * i**2
    m[i] = value

print m[10]

setup.py:

from distutils.core import setup
from Cython.Build import cythonize
setup(name = "cppmapapp"
  ext_modules = cythonize('*.pyx'))

Command line:

$ python setup.py build
$ cd build/lib.macosx-10.5-x86_64-2.7
$ python -c 'import cppmap'
300.0
Robert T. McGibbon
  • 4,757
  • 2
  • 32
  • 44