Gaussian fitting¶
This module provides functions to fit gaussian distributions and gaussian
distribution mixtures (2 components). These functions can be used directly,
or more often, in a typical FRETBursts workflow they are passed to higher
level methods like fretbursts.burstlib.Data.fit_E_generic()
.
Single Gaussian distribution fit:
For 2-Gaussians fit we have the following models:
two_gauss_mix_pdf()
: PDF of 2-components Gaussians mixture
two_gauss_mix_ab()
: linear combination of 2 Gaussians
Main functions for mixture of 2 Gaussian distribution fit:
two_gaussian_fit_hist()
histogram fit using `leastsq`
two_gaussian_fit_hist_min()
histogram fit using `minimize`
two_gaussian_fit_hist_min_ab()
the same but using _ab model
two_gaussian_fit_cdf()
curve fit of the CDF
two_gaussian_fit_EM()
Expectation-Maximization fit
two_gaussian_fit_EM_b()
the same with boundaries
Also, some functions to fit 2-D gaussian distributions and mixtures are implemented but not thoroughly tested.
The reference documentation for all the functions follows.
- fretbursts.fit.gaussian_fitting.bound_check(val, bounds)¶
Returns
val
clipped inside the intervalbounds
.
- fretbursts.fit.gaussian_fitting.gaussian2d_fit(sx, sy, guess=[0.5, 1])¶
2D-Gaussian fit of samples S using a fit to the empirical CDF.
- fretbursts.fit.gaussian_fitting.gaussian_fit_cdf(s, mu0=0, sigma0=1, return_all=False, **leastsq_kwargs)¶
Gaussian fit of samples s fitting the empirical CDF. Additional kwargs are passed to the leastsq() function. If return_all=False then return only the fitted (mu,sigma) values If return_all=True (or full_output=True is passed to leastsq) then the full output of leastsq and the histogram is returned.
- fretbursts.fit.gaussian_fitting.gaussian_fit_curve(x, y, mu0=0, sigma0=1, a0=None, return_all=False, **kwargs)¶
Gaussian fit of curve (x,y). If a0 is None then only (mu,sigma) are fitted (to a gaussian density).
kwargs
are passed to the leastsq() function.If return_all=False then return only the fitted (mu,sigma) values If return_all=True (or full_output=True is passed to leastsq) then the full output of leastsq is returned.
- fretbursts.fit.gaussian_fitting.gaussian_fit_hist(s, mu0=0, sigma0=1, a0=None, bins=array([-0.5, -0.499, -0.498, ..., 1.497, 1.498, 1.499]), return_all=False, leastsq_kwargs={}, weights=None, **kwargs)¶
Gaussian fit of samples s fitting the hist to a Gaussian function. If a0 is None then only (mu,sigma) are fitted (to a gaussian density). kwargs are passed to the histogram function. If return_all=False then return only the fitted (mu,sigma) values If return_all=True (or full_output=True is passed to leastsq) then the full output of leastsq and the histogram is returned.
weights
optional weights for the histogram.
- fretbursts.fit.gaussian_fitting.gaussian_fit_ml(s, mu_sigma_guess=[0.5, 1])¶
Gaussian fit of samples s using the Maximum Likelihood (ML method). Didactical, since scipy.stats.norm.fit() implements the same method.
- fretbursts.fit.gaussian_fitting.gaussian_fit_pdf(s, mu0=0, sigma0=1, a0=1, return_all=False, leastsq_kwargs={}, **kwargs)¶
Gaussian fit of samples s using a fit to the empirical PDF. If a0 is None then only (mu,sigma) are fitted (to a gaussian density).
kwargs
are passed to get_epdf(). If return_all=False then return only the fitted (mu,sigma) values If return_all=True (or full_output=True is passed to leastsq) then the full output of leastsq and the PDF curve is returned.
- fretbursts.fit.gaussian_fitting.get_epdf(s, smooth=0, N=1000, smooth_pdf=False, smooth_cdf=True)¶
Compute the empirical PDF of the sample
s
.If smooth > 0 then apply a gaussian filter with sigma=smooth. N is the number of points for interpolation of the CDF on a uniform range.
- fretbursts.fit.gaussian_fitting.normpdf(x, mu=0, sigma=1.0)¶
Return the normal pdf evaluated at
x
.
- fretbursts.fit.gaussian_fitting.reorder_parameters(p)¶
Reorder 2-gauss mix params to have the 1st component with smaller mean.
- fretbursts.fit.gaussian_fitting.reorder_parameters_ab(p)¶
Reorder 2-gauss mix params to have the 1st component with smaller mean.
- fretbursts.fit.gaussian_fitting.two_gauss_mix_ab(x, p)¶
Mixture of two Gaussians with no area constrain.
- fretbursts.fit.gaussian_fitting.two_gauss_mix_pdf(x, p)¶
PDF for the distribution of a mixture of two Gaussians.
- fretbursts.fit.gaussian_fitting.two_gaussian2d_fit(sx, sy, guess=[0.5, 1])¶
2D-Gaussian fit of samples S using a fit to the empirical CDF.
- fretbursts.fit.gaussian_fitting.two_gaussian_fit_EM(s, p0=[0, 0.1, 0.6, 0.1, 0.5], max_iter=300, ptol=0.0001, fix_mu=[0, 0], fix_sig=[0, 0], debug=False, weights=None)¶
Fit the sample s with two gaussians using Expectation Maximization.
This vesion allows to optionally fix mean or std. dev. of each component.
- Parameters
s (array) – population of samples to be fitted
p0 (sequence-like) – initial parameters [mu0, sig0, mu1, sig1, a]
bound (tuple of pairs) – sequence of (min, max) values that constrain the parameters. If min or max are None, no boundary is set.
ptol (float) – convergence condition. Relative max variation of any parameter.
max_iter (int) – max number of iteration in case of non convergence.
weights (array) – optional weigths, same size as
s
(for ex. 1/sigma^2 ~ nt).
- Returns
Array of parameters for the 2-gaussians (5 elements)
- fretbursts.fit.gaussian_fitting.two_gaussian_fit_EM_b(s, p0=[0, 0.1, 0.6, 0.1, 0.5], weights=None, bounds=[(None, None), (None, None), (None, None), (None, None), (None, None)], max_iter=300, ptol=0.0001, debug=False)¶
Fit the sample s with two gaussians using Expectation Maximization.
This version allows setting boundaries for each parameter.
- Parameters
s (array) – population of samples to be fitted
p0 (sequence-like) – initial parameters [mu0, sig0, mu1, sig1, a]
bound (tuple of pairs) – sequence of (min, max) values that constrain the parameters. If min or max are None, no boundary is set.
ptol (float) – convergence condition. Relative max variation of any parameter.
max_iter (int) – max number of iteration in case of non convergence.
weights (array) – optional weigths, same size as
s
(for ex. 1/sigma^2 ~ nt).
- Returns
Array of parameters for the 2-gaussians (5 elements)
- fretbursts.fit.gaussian_fitting.two_gaussian_fit_KDE_curve(s, p0=[0, 0.1, 0.6, 0.1, 0.5], weights=None, bandwidth=0.05, x_pdf=None, debug=False, method='SLSQP', bounds=None, verbose=False, **kde_kwargs)¶
Fit sample
s
with two gaussians using a KDE pdf approximation.The 2-gaussian pdf is then curve-fitted to the KDE pdf.
- Parameters
s (array) – population of samples to be fitted
p0 (sequence-like) – initial parameters [mu0, sig0, mu1, sig1, a]
bandwidth (float) – bandwidth for the KDE algorithm
method (string) – fit method, can be ‘leastsq’ or one of the methods accepted by scipy
minimize()
bounds (None or 5-element list) – if not None, each element is a (min,max) pair of bounds for the corresponding parameter. This argument can be used only with L-BFGS-B, TNC or SLSQP methods. If bounds are used, parameters cannot be fixed
x_pdf (array) – array on which the KDE PDF is evaluated and curve-fitted
weights (array) – optional weigths, same size as
s
(for ex. 1/sigma^2 ~ nt).debug (bool) – if True performs more tests and print more info.
Additional kwargs are passed to scipy.stats.gaussian_kde().
- Returns
Array of parameters for the 2-gaussians (5 elements)
- fretbursts.fit.gaussian_fitting.two_gaussian_fit_cdf(s, p0=[0.0, 0.05, 0.6, 0.1, 0.5], fix_mu=[0, 0], fix_sig=[0, 0])¶
Fit the sample s with two gaussians using a CDF fit.
Curve fit 2-gauss mixture Cumulative Distribution Function (CDF) to the empirical CDF for sample
s
.Note that with a CDF fit no weighting is possible.
- Parameters
s (array) – population of samples to be fitted
p0 (5-element list or array) – initial guess or parameters
fix_mu (tuple of bools) – Whether to fix the mean of the gaussians
fix_sig (tuple of bools) – Whether to fix the sigma of the gaussians
- Returns
Array of parameters for the 2-gaussians (5 elements)
- fretbursts.fit.gaussian_fitting.two_gaussian_fit_curve(x, y, p0, return_all=False, verbose=False, **kwargs)¶
Fit a 2-gaussian mixture to the (x,y) curve.
kwargs
are passed to the leastsq() function.If return_all=False then return only the fitted parameters If return_all=True then the full output of leastsq is returned.
- fretbursts.fit.gaussian_fitting.two_gaussian_fit_hist(s, bins=array([-0.5, -0.499, -0.498, ..., 1.497, 1.498, 1.499]), weights=None, p0=[0.2, 1, 0.8, 1, 0.3], fix_mu=[0, 0], fix_sig=[0, 0], fix_a=False)¶
Fit the sample s with 2-gaussian mixture (histogram fit).
Uses scipy.optimize.leastsq function. Parameters can be fixed but cannot be constrained in an interval.
- Parameters
s (array) – population of samples to be fitted
p0 (5-element list or array) – initial guess or parameters
bins (int or array) – bins passed to
np.histogram()
weights (array) – optional weights passed to
np.histogram()
fix_a (tuple of bools) – Whether to fix the amplitude of the gaussians
fix_mu (tuple of bools) – Whether to fix the mean of the gaussians
fix_sig (tuple of bools) – Whether to fix the sigma of the gaussians
- Returns
Array of parameters for the 2-gaussians (5 elements)
- fretbursts.fit.gaussian_fitting.two_gaussian_fit_hist_min(s, bounds=None, method='L-BFGS-B', bins=array([-0.5, -0.499, -0.498, ..., 1.497, 1.498, 1.499]), weights=None, p0=[0.2, 1, 0.8, 1, 0.3], fix_mu=[0, 0], fix_sig=[0, 0], fix_a=False, verbose=False)¶
Fit the sample
s
with 2-gaussian mixture (histogram fit). [Bounded]Uses scipy.optimize.minimize allowing constrained minimization.
- Parameters
s (array) – population of samples to be fitted
method (string) – one of the methods accepted by scipy
minimize()
bounds (None or 5-element list) – if not None, each element is a (min,max) pair of bounds for the corresponding parameter. This argument can be used only with L-BFGS-B, TNC or SLSQP methods. If bounds are used, parameters cannot be fixed
p0 (5-element list or array) – initial guess or parameters
bins (int or array) – bins passed to
np.histogram()
weights (array) – optional weights passed to
np.histogram()
fix_a (tuple of bools) – Whether to fix the amplitude of the gaussians
fix_mu (tuple of bools) – Whether to fix the mean of the gaussians
fix_sig (tuple of bools) – Whether to fix the sigma of the gaussians
verbose (boolean) – allows printing fit information
- Returns
Array of parameters for the 2-gaussians (5 elements)
- fretbursts.fit.gaussian_fitting.two_gaussian_fit_hist_min_ab(s, bounds=None, method='L-BFGS-B', bins=array([-0.5, -0.499, -0.498, ..., 1.497, 1.498, 1.499]), weights=None, p0=[0.2, 1, 0.8, 1, 0.3], fix_mu=[0, 0], fix_sig=[0, 0], fix_a=[0, 0], verbose=False)¶
Histogram fit of sample
s
with 2-gaussian functions.Uses scipy.optimize.minimize allowing constrained minimization. Also each parameter can be fixed.
The order of the parameters is: mu1, sigma1, a1, mu2, sigma2, a2.
- Parameters
s (array) – population of samples to be fitted
method (string) – one of the methods accepted by scipy
minimize()
bounds (None or 6-element list) – if not None, each element is a (min,max) pair of bounds for the corresponding parameter. This argument can be used only with L-BFGS-B, TNC or SLSQP methods. If bounds are used, parameters cannot be fixed
p0 (6-element list or array) – initial guess or parameters
bins (int or array) – bins passed to
np.histogram()
weights (array) – optional weights passed to
np.histogram()
fix_a (tuple of bools) – Whether to fix the amplitude of the gaussians
fix_mu (tuple of bools) – Whether to fix the mean of the gaussians
fix_sig (tuple of bools) – Whether to fix the sigma of the gaussians
verbose (boolean) – allows printing fit information
- Returns
Array of parameters for the 2-gaussians (6 elements)