2

I have two numpy arrays a and b:

a and b have the same number of dimensions, but a may contain a different number of points than b.

For instance:

a = [[1,2], ..., [5,7]]
b = [ [3,8], [4,7], ... [9,15] ] 

Is there an easy way to compute the Euclidean distance between a and b, such that the resulting array could be used in a k-nearest-neighbors learning algorithm?

Note: This is in python

Mike El Jackson
    So you want the Euclidean distance from each point in `a` to each point in `b`? Can you give a small example input *and* output? – aganders3 Nov 13 '15 at 17:29

4 Answers

2

scipy.spatial.distance.cdist does this.

Adam Acosta
1

If what you want is k nearest neighbors, then there are more efficient ways than computing the full distance matrix (especially with many points). Check out scipy's KDTree if you want fast k-neighbors searches.
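
As a rough illustration of the tree-based approach, here is a minimal sketch using scipy.spatial.KDTree. The sample points and the value of k are made up for the example and are not from the question.

```python
import numpy as np
from scipy.spatial import KDTree

# Illustrative data (an assumption, not the asker's real arrays).
a = np.array([[1.0, 2.0], [5.0, 7.0]])
b = np.array([[3.0, 8.0], [4.0, 7.0], [9.0, 15.0]])

# Build the tree once over the reference points in b.
tree = KDTree(b)

# For every point in a, find its k nearest points in b.
k = 2
dists, idx = tree.query(a, k=k)

# dists[i, j] is the distance from a[i] to its j-th nearest neighbor,
# idx[i, j] is the row of that neighbor in b.
print(dists)
print(idx)
```

Only the k nearest distances per query point are returned, so the full len(a) x len(b) matrix is never stored.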

jakevdp
0

You can use scipy.spatial.distance.cdist like this:

from scipy.spatial import distance

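# Placeholder data from the question; replace each ... with real points before running.
a = [[1,2], ..., [5,7]]
b = [ [3,8], [4,7], ... [9,15] ] 
# dist[i, j] is the Euclidean distance from a[i] to b[j]
dist = distance.cdist(a, b, 'euclidean')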

This method is only practical when a and b contain a small number of points. With millions of points it becomes slow and uses a lot of memory. For example, if a has 1 million points and b has 1000, the result is an m x n matrix, so both time and memory are O(m*n), where m=1000000 and n=1000. That is 10^9 entries, or about 8 GB as 64-bit floats.
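
One possible way to keep memory bounded is to call cdist on slices of a and keep only the k smallest distances per row. The sketch below assumes that only the k nearest neighbors are needed; the function name k_nearest_chunked, the chunk_size parameter, and the sample data are hypothetical and chosen for illustration.

```python
import numpy as np
from scipy.spatial import distance


def k_nearest_chunked(a, b, k, chunk_size=10000):
    """Return (distances, indices) of the k nearest points in b for each point in a."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    all_dist, all_idx = [], []
    for start in range(0, len(a), chunk_size):
        # Only a (chunk_size x len(b)) block is held in memory at a time.
        d = distance.cdist(a[start:start + chunk_size], b, 'euclidean')
        # Sort each row and keep the k closest columns.
        idx = np.argsort(d, axis=1)[:, :k]
        all_idx.append(idx)
        all_dist.append(np.take_along_axis(d, idx, axis=1))
    return np.vstack(all_dist), np.vstack(all_idx)


# Illustrative data (an assumption, not the asker's real arrays).
a = np.array([[1.0, 2.0], [5.0, 7.0]])
b = np.array([[3.0, 8.0], [4.0, 7.0], [9.0, 15.0]])
dist, idx = k_nearest_chunked(a, b, k=2, chunk_size=1)
print(dist)
print(idx)
```

The total work is still O(m*n); chunking only limits how much of the distance matrix exists at once.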

Several methods are compared in terms of efficiency here: Efficient and precise calculation of the euclidean distance

KARANJ
0

You can use numpy. Here is an example for two single points:

import numpy as np


a = np.array([3, 0])
b = np.array([0, 4])

c = np.sqrt(np.sum((a - b) ** 2))  # same result as np.linalg.norm(a - b)
# c == 5.0
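
The same formula can be extended to all pairs of points with broadcasting. This is a minimal sketch with made-up sample data, not part of the original answer; note that it builds the full len(a) x len(b) matrix, so it is only suitable for small inputs.

```python
import numpy as np

# Illustrative data (an assumption, not the asker's real arrays).
a = np.array([[1.0, 2.0], [5.0, 7.0]])
b = np.array([[3.0, 8.0], [4.0, 7.0], [9.0, 15.0]])

# a[:, None, :] has shape (2, 1, 2) and b[None, :, :] has shape (1, 3, 2),
# so the difference broadcasts to shape (2, 3, 2): one row per (a[i], b[j]) pair.
diff = a[:, None, :] - b[None, :, :]

# Sum the squared differences over the coordinate axis, then take the square root.
dist = np.sqrt((diff ** 2).sum(axis=-1))

# dist[i, j] is the Euclidean distance from a[i] to b[j].
print(dist)
```
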
Vlad Bezden