1

I have more than 5 Million pairs of two 3D Vectors and I need to calculate the angle between each pair. I tried with:

# calculate angle 
def unit_vector(vector):
    return vector / np.linalg.norm(vector)

def angle_between(v1, v2):
    v1_u = unit_vector(v1)
    v2_u = unit_vector(v2)
    return np.arccos(np.clip(np.dot(v1_u, v2_u), -1.0, 1.0))

as in this post

E.g.:

a=[[1,1,1],[1,2,1],[6,4,5]]
b=[[1,0,0],[6,2,2],[1,9,2]]

anglevec=np.zeros(len(a))

for i in range(len(a)):
    anglevec[i]=angle_between(a[i], b[i])

print(anglevec)

but the implementation with the loop is far too slow.

Can anyone help?

smaica
  • 397
  • 6
  • 19
  • The problem is: do you really need the angle? With many 3d (and 2d) calculations, you can often skip the angle (arccos) and compare the dot product (or product*s*) itself. Arccos is the worst one here for performance, dot product is made of simple calculations. – h4z3 Jun 12 '19 at 10:01
  • 1
    If you're desperate enough you could create an arccos lookup table, say with thousands of entries. Then the argument can be scaled and truncated to an integer, the index in your table. Lookups are fast. Your precision would be limited but you could estimate the error and see if you could live with it. – Mike O'Connor Jun 12 '19 at 10:30
  • when I precalculate the unit vectors, can I do the dot product without looping over a and b? – smaica Jun 12 '19 at 10:41

2 Answers2

1

A Numba approach

The task is quite simple to implement eg. https://stackoverflow.com/a/16544330/4045774 The only problem is that Python loops are extremely slow. This can be avoided by using Numba or Cython.

Example

import numba as nb
import numpy as np

#You can disable parallelization with parallel=False
@nb.njit(fastmath=True,error_model="numpy",parallel=True)
def angle(v1,v2):
    #Check the dimensions, this may also have an effect on SIMD-vectorization
    assert v1.shape[1]==3
    assert v2.shape[1]==3
    res=np.empty(v1.shape[0])

    for i in nb.prange(v1.shape[0]):
        dot=0.
        a=0.
        b=0.
        for j in range(3):
            dot+=v1[i,j]*v2[i,j]
            a+=v1[i,j]**2
            b+=v2[i,j]**2
        res[i]=np.arccos(dot/(np.sqrt(a*b)))
    return res

Timings

#Use numpy-arrays when working with arrays
a=np.random.rand(500_000,3)
b=np.random.rand(500_000,3)

%timeit res=your_func(a,b)
5.65 s ± 45.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit res=angle(a,b) #without using Numba 
3.15 s ± 13 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit res=angle(a,b) #with Numba
2.13 ms ± 441 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
max9111
  • 5,288
  • 1
  • 11
  • 22
-1

Here a solution using Pool and Pool.map.

from multiprocessing import Pool, cpu_count
import numpy as np

def unit_vector(vector):
    return vector / np.linalg.norm(vector)

def angle_between(v1, v2):
    #if you want to maximize speed, avoid making variables
    #v1_u = unit_vector(v1)
    #v2_u = unit_vector(v2)
    return np.arccos(np.clip(np.dot(unit_vector(v1_u),unit_vector(v2_u)),-1.0,1.0))

def calc_angle(_list):
    #use list unpacking instead of instantiate variables
    return angle_between(*_list) 


a=[[1,1,1],[1,2,1],[6,4,5]]
b=[[1,0,0],[6,2,2],[1,9,2]]

with Pool(cpu_count()) as p: #use the context manager
    angles = p.map(calc_angle, zip(a,b))

The output:

>>> angles
[0.9553166181245092, 0.7398807743787404, 0.8775836986593762]
alec_djinn
  • 7,288
  • 8
  • 29
  • 58