3

I have some large arrays each with i elements, call them X, Y, Z, for which I need to find some values a, b--where a and b are real numbers between 0 and 1--such that, for the following functions,

r = X - a*Y - b*Z
r_av = Sum(r)/i
rms = Sum((r - r_av)^2), summing over the i pixels

I want to minimize the rms. Basically I'm looking to minimize the scatter in r, and thus need to find the right a and b to do that. So far I have thought to do this in nested loops in one of two ways: either 1)just looping through a range of possible a,b and then selecting out the smallest rms, or 2)inserting a while statement so that the loop will terminate once rms stops decreasing with decreasing a,b for instance. Here's some pseudocode for these:

1) List

for a = 1
for b = 1
calculate m
b = b - .001
a = a - .001
loop 1000 times
sort m values, from smallest
print (a,b) corresponding to smallest m

2) Terminate

for a = 1
for b = 1
calculate m
while m > previous step, 
    b = b - .001
a = a - .001

Is one of these preferable? Or is there yet another, better way to go about this? Any tips would be greatly appreciated.

user3059201
  • 675
  • 2
  • 7
  • 11

1 Answers1

2

There is already a handy formula for least squares fitting.

I came up with two different ways to solve your problem.


For the first one, consider the matrix K:

L = len(X)
K = np.identity(L) - np.ones((L, L)) / L

In your case, A and B are defined as:

A = K.dot(np.array([Y, Z]).transpose())
B = K.dot(np.array([X]).transpose())

Apply the formula to find C that minimizes the error A * C - B:

C = np.linalg.inv(np.transpose(A).dot(A))
C = C.dot(np.transpose(A)).dot(B)

Then the result is:

a, b = C.reshape(2)

Also, note that numpy already provides linalg.lstsq that does the exact same thing:

a, b = np.linalg.lstsq(A, B)[0].reshape(2)

A simpler way is to define A as:

A = np.array([Y, Z, [1]*len(X)]).transpose()

Then solve it against X to get the coefficients and the mean:

a, b, mean = np.linalg.lstsq(A, X)[0]

If you need a proof of this result, have a look at this post.


Example:

>>> import numpy as np
>>> X = [5, 7, 9, 5]
>>> Y = [2, 0, 4, 1]
>>> Z = [7, 2, 4, 6]
>>> A = np.array([Y, Z, [1] * len(X)]).transpose()
>>> a, b, mean = np.linalg.lstsq(A, X)[0]
>>> print(a, b, mean)
0.860082304527 -0.736625514403 8.49382716049
Community
  • 1
  • 1
Vincent
  • 11,046
  • 29
  • 56