# Background estimation

## background.py

Routines to compute the background from an array of timestamps. This module is normally imported as bg when fretbursts is imported.

The most important functions are exp_fit() and exp_cdf_fit(), which provide two fast algorithms to estimate the background without binning. These functions are usually not called directly but passed to Data.calc_bg() to compute the background of a measurement.

See also exp_hist_fit() for background estimation using a histogram fit.

fretbursts.background.exp_fit(ph, tail_min_us=None, clk_p=1.25e-08, error_metrics=None)

Return a background rate using the MLE of mean waiting-times.

Compute the background rate, selecting waiting-times (delays) larger than a minimum threshold.

This function performs a Maximum Likelihood (ML) fit. For exponentially distributed waiting-times, the ML estimate of the mean waiting-time is the empirical mean, and the rate is its inverse.

Parameters:

- ph (array) – timestamps array from which to extract the background
- tail_min_us (float) – minimum waiting-time in microseconds
- clk_p (float) – clock period for timestamps in ph, in seconds
- error_metrics (string or None) – Valid values are ‘KS’ or ‘CM’. ‘KS’ (Kolmogorov-Smirnov statistic) computes the error as the maximum deviation of the empirical CDF from the fitted CDF. ‘CM’ (Cramér-von Mises) uses the L^2 distance. If None, no error metric is computed (returns None).

Returns: 2-Tuple – Estimated background rate in cps, and a “quality of fit” index (the lower the better) according to the chosen metric. If error_metrics==None, the returned “quality of fit” is None.

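
The tail-selection and ML steps described for exp_fit() can be sketched in a few lines. This is a minimal illustration using only the standard library, not the FRETBursts implementation: the helper name `mle_tail_rate` is hypothetical, and subtracting the threshold from the selected waiting-times (justified by the memoryless property of the exponential distribution) is an assumption about how the tail is handled.

```python
import random

def mle_tail_rate(timestamps, tail_min_us, clk_p=12.5e-9):
    """Hypothetical helper: ML estimate of a background rate in counts/s."""
    tail_min = tail_min_us * 1e-6  # threshold converted to seconds
    # Waiting-times between consecutive timestamps, from clock ticks to seconds.
    delays = [(t1 - t0) * clk_p for t0, t1 in zip(timestamps, timestamps[1:])]
    tail = [d for d in delays if d >= tail_min]
    if not tail:
        raise ValueError("no waiting-times above the threshold")
    # Assumption: the excess over the threshold is again exponential
    # (memoryless property), so the mean excess estimates the mean waiting-time.
    tau = sum(tail) / len(tail) - tail_min
    return 1.0 / tau

# Simulated background with a known rate, stored as integer clock ticks.
rng = random.Random(0)
true_rate = 1000.0
clk_p = 12.5e-9
t, timestamps = 0.0, []
for _ in range(20000):
    t += rng.expovariate(true_rate)
    timestamps.append(round(t / clk_p))

rate = mle_tail_rate(timestamps, tail_min_us=500, clk_p=clk_p)
print(f"estimated rate: {rate:.1f} cps")
```

On pure background the estimate should land close to the simulated rate. On real data the threshold matters because short waiting-times are dominated by bursts rather than background.
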
fretbursts.background.exp_cdf_fit(ph, tail_min_us=None, clk_p=1.25e-08, error_metrics=None)

Return a background rate fitting the empirical CDF of waiting-times.

Compute the background rate, selecting waiting-times (delays) larger than a minimum threshold.

This function performs a least-squares fit of an exponential Cumulative Distribution Function (CDF) to the empirical CDF of waiting-times.

Parameters:

- ph (array) – timestamps array from which to extract the background
- tail_min_us (float) – minimum waiting-time in microseconds
- clk_p (float) – clock period for timestamps in ph, in seconds
- error_metrics (string or None) – Valid values are ‘KS’ or ‘CM’. ‘KS’ (Kolmogorov-Smirnov statistic) computes the error as the maximum deviation of the empirical CDF from the fitted CDF. ‘CM’ (Cramér-von Mises) uses the L^2 distance. If None, no error metric is computed (returns None).

Returns: 2-Tuple – Estimated background rate in cps, and a “quality of fit” index (the lower the better) according to the chosen metric. If error_metrics==None, the returned “quality of fit” is None.

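
To make the two error_metrics options more concrete, the sketch below computes a KS-like and a CM-like distance between an empirical CDF and a fitted exponential CDF. The helper names `ecdf` and `fit_errors` are hypothetical, and both the (i + offset) / n definition of the empirical CDF and the root-mean-square normalization of the L^2 distance are assumptions made for illustration; the library may scale these quantities differently.

```python
import math
import random

def ecdf(sample, offset=0.5):
    """Sorted sample and empirical CDF values (i + offset) / n (assumed convention)."""
    x = sorted(sample)
    n = len(x)
    return x, [(i + offset) / n for i in range(n)]

def fit_errors(sample, rate):
    """Hypothetical helper: (KS-like, CM-like) distances for an exponential fit."""
    x, y = ecdf(sample)
    # Deviation of the empirical CDF from the exponential CDF 1 - exp(-rate * x).
    resid = [yi - (1.0 - math.exp(-rate * xi)) for xi, yi in zip(x, y)]
    ks = max(abs(r) for r in resid)                         # largest absolute deviation
    cm = math.sqrt(sum(r * r for r in resid) / len(resid))  # RMS (L^2-type) deviation
    return ks, cm

rng = random.Random(1)
sample = [rng.expovariate(2000.0) for _ in range(5000)]

ks_good, cm_good = fit_errors(sample, rate=2000.0)  # correct rate
ks_bad, cm_bad = fit_errors(sample, rate=3000.0)    # deliberately wrong rate
print(ks_good, cm_good)
print(ks_bad, cm_bad)
```

Both values are smaller for the correct rate, which is the sense in which the returned “quality of fit” index is “the lower the better”.
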
fretbursts.background.exp_hist_fit(ph, tail_min_us, binw=5e-05, clk_p=1.25e-08, weights='hist_counts', error_metrics=None)

Compute background rate with WLS histogram fit of waiting-times.

Compute the background rate, selecting waiting-times (delays) larger than a minimum threshold.

This function performs a Weighted Least Squares (WLS) fit of the histogram of waiting times to an exponential decay.

Parameters:

- ph (array) – timestamps array from which to extract the background
- tail_min_us (float) – minimum waiting-time in microseconds
- binw (float) – bin width for waiting times, in seconds
- clk_p (float) – clock period for timestamps in ph, in seconds
- weights (None or string) – if None, no weights are applied. If ‘hist_counts’, each bin has a weight equal to its counts. If ‘inv_hist_counts’, the weight is the inverse of the counts.
- error_metrics (string or None) – Valid values are ‘KS’ or ‘CM’. ‘KS’ (Kolmogorov-Smirnov statistic) computes the error as the maximum deviation of the empirical CDF from the fitted CDF. ‘CM’ (Cramér-von Mises) uses the L^2 distance. If None, no error metric is computed (returns None).

Returns: 2-Tuple – Estimated background rate in cps, and a “quality of fit” index (the lower the better) according to the chosen metric. If error_metrics==None, the returned “quality of fit” is None.
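
A weighted histogram fit of this kind can be illustrated as follows. The sketch is a simplification and makes no claim about the library's internals: it linearizes the exponential decay by fitting a line to the log of the bin counts, uses the bin counts as weights (in the spirit of the ‘hist_counts’ option), and the helper name `hist_wls_rate` is hypothetical.

```python
import math
import random

def hist_wls_rate(delays, tail_min, binw):
    """Hypothetical helper: rate from a weighted line fit of log(counts) vs time."""
    # Keep the tail and shift it so that the histogram starts at zero.
    tail = [d - tail_min for d in delays if d >= tail_min]
    nbins = int(max(tail) / binw) + 1
    counts = [0] * nbins
    for d in tail:
        counts[int(d / binw)] += 1
    # Non-empty bins only: (bin center, log of counts, weight = counts).
    pts = [((i + 0.5) * binw, math.log(c), c) for i, c in enumerate(counts) if c > 0]
    # Closed-form weighted least squares for y = a + b * x.
    sw = sum(w for _, _, w in pts)
    mx = sum(w * x for x, _, w in pts) / sw
    my = sum(w * y for _, y, w in pts) / sw
    sxx = sum(w * (x - mx) ** 2 for x, _, w in pts)
    sxy = sum(w * (x - mx) * (y - my) for x, y, w in pts)
    return -sxy / sxx  # the slope of log(counts) is -rate

rng = random.Random(2)
true_rate = 1000.0
delays = [rng.expovariate(true_rate) for _ in range(20000)]

rate = hist_wls_rate(delays, tail_min=500e-6, binw=50e-6)
print(f"estimated rate: {rate:.1f} cps")
```

Weighting by the counts limits the influence of the sparsely populated bins in the far tail, where the log of the counts is noisy. Unlike the ML and CDF fits, the result depends on the chosen bin width.
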

## Low-level background fit functions

Generic functions to fit exponential populations.

These functions can be used directly or, in a typical FRETBursts workflow, passed to higher-level methods.

fretbursts.fit.exp_fitting.expon_fit(s, s_min=0, offset=0.5, calc_residuals=True)

Fit sample s to an exponential distribution using the ML estimator.

This function computes the rate (Lambda) using the maximum likelihood (ML) estimator of the mean waiting-time (Tau), which for an exponentially distributed sample is the sample mean.

Parameters:

- s (array) – array of exponentially distributed samples
- s_min (float) – all samples < s_min are discarded (s_min must be >= 0).
- offset (float) – offset for computing the CDF. See get_ecdf().
- calc_residuals (bool) – if True, compute the residuals of the fitted exponential versus the empirical CDF.

Returns: A 4-tuple of the fitted rate (1/life-time), residuals array, residuals x-axis array, sample size after threshold.

fretbursts.fit.exp_fitting.expon_fit_cdf(s, s_min=0, offset=0.5, calc_residuals=True)

Fit of an exponential model to the empirical CDF of s.

This function computes the rate (Lambda) fitting a line (linear regression) to the log of the empirical CDF.

Parameters:

- s (array) – array of exponentially distributed samples
- s_min (float) – all samples < s_min are discarded (s_min must be >= 0).
- offset (float) – offset for computing the CDF. See get_ecdf().
- calc_residuals (bool) – if True, compute the residuals of the fitted exponential versus the empirical CDF.

Returns: A 4-tuple of the fitted rate (1/life-time), residuals array, residuals x-axis array, sample size after threshold.

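
The line-fitting idea behind expon_fit_cdf() can be sketched as below. Two points are assumptions rather than statements about the library: the regression is done on the log of the complement of the empirical CDF, log(1 - ECDF), which for an exponential sample is a straight line with slope equal to minus the rate, and the empirical CDF is taken as (i + offset) / n. The helper name `cdf_line_fit_rate` is hypothetical.

```python
import math
import random

def cdf_line_fit_rate(sample, s_min=0.0, offset=0.5):
    """Hypothetical helper: rate from a line fit to log(1 - ECDF)."""
    # Discard samples below s_min and shift the rest (memoryless property).
    x = sorted(v - s_min for v in sample if v >= s_min)
    n = len(x)
    # With 0 < offset < 1 the argument of the log stays strictly positive.
    y = [math.log(1.0 - (i + offset) / n) for i in range(n)]
    # Ordinary least-squares slope of y versus x.
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return -sxy / sxx

rng = random.Random(3)
true_rate = 2000.0
sample = [rng.expovariate(true_rate) for _ in range(5000)]

rate = cdf_line_fit_rate(sample)
print(f"estimated rate: {rate:.1f}")
```

The sketch also shows one practical role of a nonzero offset: with an offset of exactly 1 the last point would require log(0).
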
fretbursts.fit.exp_fitting.expon_fit_hist(s, bins, s_min=0, weights=None, offset=0.5, calc_residuals=True)

Fit of an exponential model to the histogram of s using least squares.

Parameters:

- s (array) – array of exponentially distributed samples
- bins (float or array) – if float, the bin width; otherwise the array of bin edges (passed to numpy.histogram)
- s_min (float) – all samples < s_min are discarded (s_min must be >= 0).
- weights (None or string) – if None, no weights are applied. If ‘hist_counts’, each bin has a weight equal to its counts. If ‘inv_hist_counts’, the weight is the inverse of the counts.
- offset (float) – offset for computing the CDF. See get_ecdf().
- calc_residuals (bool) – if True, compute the residuals of the fitted exponential versus the empirical CDF.

Returns: A 4-tuple of the fitted rate (1/life-time), residuals array, residuals x-axis array, sample size after threshold.

fretbursts.fit.exp_fitting.get_ecdf(s, offset=0.5)

Return arrays (x, y) for the empirical CDF curve of sample s.

fretbursts.fit.exp_fitting.get_residuals(s, tau_fit, offset=0.5)

Returns residuals of sample s CDF vs an exponential CDF.

Parameters:

- s (array of floats) – sample
- tau_fit (float) – mean waiting-time of the exponential distribution to use as reference
- offset (float) – Default 0.5. Offset to add to the empirical CDF. See get_ecdf() for details.

Returns: residuals (array) – residuals of empirical CDF compared with analytical CDF with time constant tau_fit.
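
As an illustration of what such residuals look like, the sketch below compares an empirical CDF with the analytical exponential CDF 1 - exp(-x / tau_fit). The helper name `exp_cdf_residuals` is hypothetical, and both the (i + offset) / n form of the empirical CDF and the sign convention (empirical minus analytical) are assumptions.

```python
import math

def exp_cdf_residuals(sample, tau_fit, offset=0.5):
    """Hypothetical helper: empirical CDF minus exponential CDF at each sample."""
    x = sorted(sample)
    n = len(x)
    ecdf = [(i + offset) / n for i in range(n)]
    return [e - (1.0 - math.exp(-xi / tau_fit)) for xi, e in zip(x, ecdf)]

# Deterministic check: use the exact quantiles of an exponential with tau = 2
# as the sample, so the empirical and analytical CDFs coincide.
tau = 2.0
n = 100
sample = [-tau * math.log(1.0 - (i + 0.5) / n) for i in range(n)]

res = exp_cdf_residuals(sample, tau_fit=tau)
res_wrong = exp_cdf_residuals(sample, tau_fit=1.0)  # mismatched time constant
print(max(abs(r) for r in res))
print(max(abs(r) for r in res_wrong))
```

With the matching time constant the residuals vanish up to floating-point rounding, while a mismatched tau_fit produces a systematic deviation.
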