Gaussian fitting¶

This module provides functions to fit gaussian distributions and gaussian distribution mixtures (2 components). These functions can be used directly, or more often, in a typical FRETBursts workflow they are passed to higher level methods like fretbursts.burstlib.Data.fit_E_generic().

Single Gaussian distribution fit:

gaussian_fit_hist()

gaussian_fit_cdf()

gaussian_fit_pdf()

For 2-Gaussians fit we have the following models:

two_gauss_mix_pdf(): PDF of 2-components Gaussians mixture

two_gauss_mix_ab(): linear combination of 2 Gaussians

Main functions for mixture of 2 Gaussian distribution fit:

two_gaussian_fit_hist() histogram fit using `leastsq`

two_gaussian_fit_hist_min() histogram fit using `minimize`

two_gaussian_fit_hist_min_ab() the same but using _ab model

two_gaussian_fit_cdf() curve fit of the CDF

two_gaussian_fit_EM() Expectation-Maximization fit

two_gaussian_fit_EM_b() the same with boundaries

Also, some functions to fit 2-D gaussian distributions and mixtures are implemented but not thoroughly tested.

The reference documentation for all the functions follows.

fretbursts.fit.gaussian_fitting.bound_check(val, bounds)¶: Returns val clipped inside the interval bounds.

fretbursts.fit.gaussian_fitting.gaussian2d_fit(sx, sy, guess=[0.5, 1])¶: 2D-Gaussian fit of samples S using a fit to the empirical CDF.

fretbursts.fit.gaussian_fitting.gaussian_fit_cdf(s, mu0=0, sigma0=1, return_all=False, **leastsq_kwargs)¶: Gaussian fit of samples s fitting the empirical CDF. Additional kwargs are passed to the leastsq() function. If return_all=False then return only the fitted (mu,sigma) values If return_all=True (or full_output=True is passed to leastsq) then the full output of leastsq and the histogram is returned.

fretbursts.fit.gaussian_fitting.gaussian_fit_curve(x, y, mu0=0, sigma0=1, a0=None, return_all=False, **kwargs)¶

Gaussian fit of curve (x,y). If a0 is None then only (mu,sigma) are fitted (to a gaussian density). kwargs are passed to the leastsq() function.

If return_all=False then return only the fitted (mu,sigma) values If return_all=True (or full_output=True is passed to leastsq) then the full output of leastsq is returned.

fretbursts.fit.gaussian_fitting.gaussian_fit_hist(s, mu0=0, sigma0=1, a0=None, bins=array([-0.5, -0.499, -0.498, ..., 1.497, 1.498, 1.499]), return_all=False, leastsq_kwargs={}, weights=None, **kwargs)¶: Gaussian fit of samples s fitting the hist to a Gaussian function. If a0 is None then only (mu,sigma) are fitted (to a gaussian density). kwargs are passed to the histogram function. If return_all=False then return only the fitted (mu,sigma) values If return_all=True (or full_output=True is passed to leastsq) then the full output of leastsq and the histogram is returned. weights optional weights for the histogram.

fretbursts.fit.gaussian_fitting.gaussian_fit_ml(s, mu_sigma_guess=[0.5, 1])¶: Gaussian fit of samples s using the Maximum Likelihood (ML method). Didactical, since scipy.stats.norm.fit() implements the same method.

fretbursts.fit.gaussian_fitting.gaussian_fit_pdf(s, mu0=0, sigma0=1, a0=1, return_all=False, leastsq_kwargs={}, **kwargs)¶: Gaussian fit of samples s using a fit to the empirical PDF. If a0 is None then only (mu,sigma) are fitted (to a gaussian density). kwargs are passed to get_epdf(). If return_all=False then return only the fitted (mu,sigma) values If return_all=True (or full_output=True is passed to leastsq) then the full output of leastsq and the PDF curve is returned.

fretbursts.fit.gaussian_fitting.get_epdf(s, smooth=0, N=1000, smooth_pdf=False, smooth_cdf=True)¶

Compute the empirical PDF of the sample s.

If smooth > 0 then apply a gaussian filter with sigma=smooth. N is the number of points for interpolation of the CDF on a uniform range.

fretbursts.fit.gaussian_fitting.normpdf(x, mu=0, sigma=1.0)¶: Return the normal pdf evaluated at x.

fretbursts.fit.gaussian_fitting.reorder_parameters(p)¶: Reorder 2-gauss mix params to have the 1st component with smaller mean.

fretbursts.fit.gaussian_fitting.reorder_parameters_ab(p)¶: Reorder 2-gauss mix params to have the 1st component with smaller mean.

fretbursts.fit.gaussian_fitting.two_gauss_mix_ab(x, p)¶: Mixture of two Gaussians with no area constrain.

fretbursts.fit.gaussian_fitting.two_gauss_mix_pdf(x, p)¶: PDF for the distribution of a mixture of two Gaussians.

fretbursts.fit.gaussian_fitting.two_gaussian2d_fit(sx, sy, guess=[0.5, 1])¶: 2D-Gaussian fit of samples S using a fit to the empirical CDF.

fretbursts.fit.gaussian_fitting.two_gaussian_fit_EM(s, p0=[0, 0.1, 0.6, 0.1, 0.5], max_iter=300, ptol=0.0001, fix_mu=[0, 0], fix_sig=[0, 0], debug=False, weights=None)¶

Fit the sample s with two gaussians using Expectation Maximization.

This vesion allows to optionally fix mean or std. dev. of each component.

Parameters:

s (array) – population of samples to be fitted
p0 (sequence-like) – initial parameters [mu0, sig0, mu1, sig1, a]
bound (tuple of pairs) – sequence of (min, max) values that constrain the parameters. If min or max are None, no boundary is set.
ptol (float) – convergence condition. Relative max variation of any parameter.
max_iter (int) – max number of iteration in case of non convergence.
weights (array) – optional weigths, same size as s (for ex. 1/sigma^2 ~ nt).

Returns:

Array of parameters for the 2-gaussians (5 elements)

fretbursts.fit.gaussian_fitting.two_gaussian_fit_EM_b(s, p0=[0, 0.1, 0.6, 0.1, 0.5], weights=None, bounds=[(None, None), (None, None), (None, None), (None, None), (None, None)], max_iter=300, ptol=0.0001, debug=False)¶

Fit the sample s with two gaussians using Expectation Maximization.

This version allows setting boundaries for each parameter.

Parameters:

s (array) – population of samples to be fitted
p0 (sequence-like) – initial parameters [mu0, sig0, mu1, sig1, a]
bound (tuple of pairs) – sequence of (min, max) values that constrain the parameters. If min or max are None, no boundary is set.
ptol (float) – convergence condition. Relative max variation of any parameter.
max_iter (int) – max number of iteration in case of non convergence.
weights (array) – optional weigths, same size as s (for ex. 1/sigma^2 ~ nt).

Returns:

Array of parameters for the 2-gaussians (5 elements)

fretbursts.fit.gaussian_fitting.two_gaussian_fit_KDE_curve(s, p0=[0, 0.1, 0.6, 0.1, 0.5], weights=None, bandwidth=0.05, x_pdf=None, debug=False, method='SLSQP', bounds=None, verbose=False, **kde_kwargs)¶

Fit sample s with two gaussians using a KDE pdf approximation.

The 2-gaussian pdf is then curve-fitted to the KDE pdf.

Parameters:

s (array) – population of samples to be fitted
p0 (sequence-like) – initial parameters [mu0, sig0, mu1, sig1, a]
bandwidth (float) – bandwidth for the KDE algorithm
method (string) – fit method, can be ‘leastsq’ or one of the methods accepted by scipy minimize()
bounds (None or 5-element list) – if not None, each element is a (min,max) pair of bounds for the corresponding parameter. This argument can be used only with L-BFGS-B, TNC or SLSQP methods. If bounds are used, parameters cannot be fixed
x_pdf (array) – array on which the KDE PDF is evaluated and curve-fitted
weights (array) – optional weigths, same size as s (for ex. 1/sigma^2 ~ nt).
debug (bool) – if True performs more tests and print more info.

Additional kwargs are passed to scipy.stats.gaussian_kde().

Returns:: Array of parameters for the 2-gaussians (5 elements)

fretbursts.fit.gaussian_fitting.two_gaussian_fit_cdf(s, p0=[0.0, 0.05, 0.6, 0.1, 0.5], fix_mu=[0, 0], fix_sig=[0, 0])¶

Fit the sample s with two gaussians using a CDF fit.

Curve fit 2-gauss mixture Cumulative Distribution Function (CDF) to the empirical CDF for sample s.

Note that with a CDF fit no weighting is possible.

Parameters:

s (array) – population of samples to be fitted
p0 (5-element list or array) – initial guess or parameters
fix_mu (tuple of bools) – Whether to fix the mean of the gaussians
fix_sig (tuple of bools) – Whether to fix the sigma of the gaussians

Returns:

Array of parameters for the 2-gaussians (5 elements)

fretbursts.fit.gaussian_fitting.two_gaussian_fit_curve(x, y, p0, return_all=False, verbose=False, **kwargs)¶

Fit a 2-gaussian mixture to the (x,y) curve. kwargs are passed to the leastsq() function.

If return_all=False then return only the fitted parameters If return_all=True then the full output of leastsq is returned.

fretbursts.fit.gaussian_fitting.two_gaussian_fit_hist(s, bins=array([-0.5, -0.499, -0.498, ..., 1.497, 1.498, 1.499]), weights=None, p0=[0.2, 1, 0.8, 1, 0.3], fix_mu=[0, 0], fix_sig=[0, 0], fix_a=False)¶

Fit the sample s with 2-gaussian mixture (histogram fit).

Uses scipy.optimize.leastsq function. Parameters can be fixed but cannot be constrained in an interval.

Parameters:

s (array) – population of samples to be fitted
p0 (5-element list or array) – initial guess or parameters
bins (int or array) – bins passed to np.histogram()
weights (array) – optional weights passed to np.histogram()
fix_a (tuple of bools) – Whether to fix the amplitude of the gaussians
fix_mu (tuple of bools) – Whether to fix the mean of the gaussians
fix_sig (tuple of bools) – Whether to fix the sigma of the gaussians

Returns:

Array of parameters for the 2-gaussians (5 elements)

fretbursts.fit.gaussian_fitting.two_gaussian_fit_hist_min(s, bounds=None, method='L-BFGS-B', bins=array([-0.5, -0.499, -0.498, ..., 1.497, 1.498, 1.499]), weights=None, p0=[0.2, 1, 0.8, 1, 0.3], fix_mu=[0, 0], fix_sig=[0, 0], fix_a=False, verbose=False)¶

Fit the sample s with 2-gaussian mixture (histogram fit). [Bounded]

Uses scipy.optimize.minimize allowing constrained minimization.

Parameters:

s (array) – population of samples to be fitted
method (string) – one of the methods accepted by scipy minimize()
bounds (None or 5-element list) – if not None, each element is a (min,max) pair of bounds for the corresponding parameter. This argument can be used only with L-BFGS-B, TNC or SLSQP methods. If bounds are used, parameters cannot be fixed
p0 (5-element list or array) – initial guess or parameters
bins (int or array) – bins passed to np.histogram()
weights (array) – optional weights passed to np.histogram()
fix_a (tuple of bools) – Whether to fix the amplitude of the gaussians
fix_mu (tuple of bools) – Whether to fix the mean of the gaussians
fix_sig (tuple of bools) – Whether to fix the sigma of the gaussians
verbose (boolean) – allows printing fit information

Returns:

Array of parameters for the 2-gaussians (5 elements)

fretbursts.fit.gaussian_fitting.two_gaussian_fit_hist_min_ab(s, bounds=None, method='L-BFGS-B', bins=array([-0.5, -0.499, -0.498, ..., 1.497, 1.498, 1.499]), weights=None, p0=[0.2, 1, 0.8, 1, 0.3], fix_mu=[0, 0], fix_sig=[0, 0], fix_a=[0, 0], verbose=False)¶

Histogram fit of sample s with 2-gaussian functions.

Uses scipy.optimize.minimize allowing constrained minimization. Also each parameter can be fixed.

The order of the parameters is: mu1, sigma1, a1, mu2, sigma2, a2.

Parameters:

s (array) – population of samples to be fitted
method (string) – one of the methods accepted by scipy minimize()
bounds (None or 6-element list) – if not None, each element is a (min,max) pair of bounds for the corresponding parameter. This argument can be used only with L-BFGS-B, TNC or SLSQP methods. If bounds are used, parameters cannot be fixed
p0 (6-element list or array) – initial guess or parameters
bins (int or array) – bins passed to np.histogram()
weights (array) – optional weights passed to np.histogram()
fix_a (tuple of bools) – Whether to fix the amplitude of the gaussians
fix_mu (tuple of bools) – Whether to fix the mean of the gaussians
fix_sig (tuple of bools) – Whether to fix the sigma of the gaussians
verbose (boolean) – allows printing fit information

Returns:

Array of parameters for the 2-gaussians (6 elements)