2

I wrote a Python (2.7) program to evaluate some scientific data. Its main task is to fit this data to a certain function (1). Since this is quite a lot of data the program distributes the jobs (= "fit one set of data") to several cores using multiprocessing. In a first attempt I implemented the fitting process using curve_fit from scipy.optimize which works pretty well.

So far, so good. Then we saw that the data is more precisely described by a convolution of function (1) and a gaussian distribution. The idea was to fit the data first to function (1), get the guess values as a result of this and then fit the data again to the convolution. Since the data is quite noisy and I am trying to fit it to a convolution with seven parameters, the results were this time rather bad. Especially the gaussian parameters were in some extend physically impossible.

So I tried implementing the fitting process in PyMinuit because it allows to limit the parameters within certain borders (like a positive amplitude). As I never worked with Minuit before and I try to start small, I rewrote the first ("easy") part of the fitting process. The code snippet doing the job looks like this (simplified):

import minuit
import numpy as np
# temps are the x-values
# pol are the y-values
def gettc(temps, pol, guess_values):
    try:
        efunc_chi2 = lambda a,b,c,d: np.sum( (efunc(temps, a,b,c,d) - pol)**2 )
        fit = minuit.Minuit(efunc_chi2)
        fit.values['a'] = guess_values[0]
        fit.values['b'] = guess_values[1]
        fit.values['c'] = guess_values[2]
        fit.values['d'] = guess_values[3]
        fit.fixed['d'] = True
        fit.maxcalls = 1000
        #fit.tol = 1000.0
        fit.migrad()
        param = fit.args
        popc = fit.covariance
    except minuit.MinuitError:
        return np.zeros(len(guess_values))
    return param,popc

Where efunc() is function (1). The parameter d is fixed since I don't use it at the moment.
PyMinuit function reference
Finally, here comes the actual problem: When running the script, Minuit prints for almost every fit

VariableMetricBuilder: Tolerance is not sufficient - edm is 0.000370555 requested 1e-05 continue the minimization

to stdout with different values for edm. The fit still works fine but the printing slows the program considerably down. I tried to increase fit.tol but there are a lot of datasets which return even higher edm's. Then I tried just hiding the output of fit.migrad() using this solution which actually works. And now something strange happens: Somewhere in the middle of the program, the processes on all cores fail simultaneously. Not at the first fit but in the middle of my whole dataset. The only thing I changed, was

with suppress_stdout_stderr():
    fit.migrad()

I know this is quite a long introduction, but I think it helps you more when you know the whole framework. I would be very grateful if someone has any idea on how to approach this problem.


Note: Function (1) is defined as

def efunc(x,a,b,c,d):
    if x < c:
        return a*np.exp(b*x**d)
    else:
        return a*np.exp(b*c**d)   # therefore constant
efunc = np.vectorize(efunc)
Community
  • 1
  • 1
Marc
  • 21
  • 1

0 Answers0