Background estimation

background.py

Routines to compute the background from an array of timestamps. This module is normally imported as bg when fretbursts is imported.

The most important functions are exp_fit() and exp_cdf_fit(), which provide two fast algorithms to estimate the background without binning. These functions are usually not called directly but are passed to Data.calc_bg() to compute the background of a measurement.

See also exp_hist_fit() for background estimation using a histogram fit.

fretbursts.background.exp_fit(ph, tail_min_us=None, clk_p=1.25e-08, error_metrics=None)

Return a background rate using the MLE of mean waiting-times.

Compute the background rate, selecting waiting-times (delays) larger than a minimum threshold.

This function performs a Maximum Likelihood (ML) fit. For exponentially distributed waiting-times, the ML estimate of the mean waiting-time is the empirical mean.

Parameters:
  • ph (array) – timestamps array from which to extract the background
  • tail_min_us (float) – minimum waiting-time in microseconds
  • clk_p (float) – clock period for timestamps in ph
  • error_metrics (string or None) – Valid values are ‘KS’ or ‘CM’. ‘KS’ (Kolmogorov-Smirnov statistic) computes the error as the maximum deviation of the empirical CDF from the fitted CDF. ‘CM’ (Cramér-von Mises) uses the L^2 distance. If None, no error metric is computed (returns None).
Returns:

2-Tuple – Estimated background rate in cps, and a “quality of fit” index (the lower the better) according to the chosen metric. If error_metrics==None, the returned “quality of fit” is None.
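The ML estimate described above can be illustrated with a minimal sketch that uses only the standard library. The helper name mle_tail_rate and the synthetic data are hypothetical, and subtracting the threshold from the mean is an assumption based on the memoryless property of the exponential distribution. It is not taken from the FRETBursts source.

```python
import random

def mle_tail_rate(timestamps, tail_min_us, clk_p=12.5e-9):
    """Hypothetical helper: ML background rate (cps) from the waiting-time tail."""
    tail_min = tail_min_us * 1e-6  # threshold in seconds
    # Waiting times between consecutive timestamps, converted to seconds.
    delays = [(t1 - t0) * clk_p for t0, t1 in zip(timestamps, timestamps[1:])]
    tail = [d for d in delays if d >= tail_min]
    # Assumption: exponential delays are memoryless, so the excess over the
    # threshold is again exponential and its sample mean estimates 1/rate.
    mean_excess = sum(tail) / len(tail) - tail_min
    return 1.0 / mean_excess

# Synthetic Poisson background at 2000 cps, timestamps in clock units.
random.seed(0)
clk_p = 12.5e-9
t, timestamps = 0.0, []
for _ in range(50_000):
    t += random.expovariate(2000.0)
    timestamps.append(round(t / clk_p))

rate = mle_tail_rate(timestamps, tail_min_us=300, clk_p=clk_p)
print(f"estimated background rate: {rate:.0f} cps")
```

With real data the threshold is chosen so that the short delays produced by bursts are excluded and only the background tail remains.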

fretbursts.background.exp_cdf_fit(ph, tail_min_us=None, clk_p=1.25e-08, error_metrics=None)

Return a background rate fitting the empirical CDF of waiting-times.

Compute the background rate, selecting waiting-times (delays) larger than a minimum threshold.

This function performs a least square fit of an exponential Cumulative Distribution Function (CDF) to the empirical CDF of waiting-times.

Parameters:
  • ph (array) – timestamps array from which to extract the background
  • tail_min_us (float) – minimum waiting-time in microseconds
  • clk_p (float) – clock period for timestamps in ph
  • error_metrics (string or None) – Valid values are ‘KS’ or ‘CM’. ‘KS’ (Kolmogorov-Smirnov statistic) computes the error as the maximum deviation of the empirical CDF from the fitted CDF. ‘CM’ (Cramér-von Mises) uses the L^2 distance. If None, no error metric is computed (returns None).
Returns:

2-Tuple – Estimated background rate in cps, and a “quality of fit” index (the lower the better) according to the chosen metric. If error_metrics==None, the returned “quality of fit” is None.
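One way to picture a least-squares fit to the empirical CDF is to linearize the exponential model: for delays above the threshold, log(1 - CDF) is a straight line whose slope is minus the rate. The sketch below follows that idea with a plain ordinary least-squares regression. The helper name cdf_tail_rate, the linearization, and the synthetic data are assumptions made for illustration, so the actual implementation may differ.

```python
import math
import random

def cdf_tail_rate(delays, tail_min, offset=0.5):
    """Hypothetical helper: rate (cps) from a line fit to log(1 - ECDF)."""
    # Keep the tail and shift it so that it starts at zero.
    tail = sorted(d - tail_min for d in delays if d >= tail_min)
    n = len(tail)
    # Empirical CDF; the offset keeps 1 - ECDF strictly positive.
    log_surv = [math.log(1.0 - (i + offset) / n) for i in range(n)]
    # Ordinary least-squares slope of log_surv versus the delay.
    mx = sum(tail) / n
    my = sum(log_surv) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(tail, log_surv))
    sxx = sum((x - mx) ** 2 for x in tail)
    return -sxy / sxx

# Synthetic exponential delays (seconds) for a 2000 cps background.
random.seed(1)
delays = [random.expovariate(2000.0) for _ in range(50_000)]
cdf_rate = cdf_tail_rate(delays, tail_min=300e-6)
print(f"estimated background rate: {cdf_rate:.0f} cps")
```

Like the ML estimate, this approach needs no binning, so the result does not depend on a bin-width choice.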

fretbursts.background.exp_hist_fit(ph, tail_min_us, binw=5e-05, clk_p=1.25e-08, weights='hist_counts', error_metrics=None)

Compute background rate with WLS histogram fit of waiting-times.

Compute the background rate, selecting waiting-times (delays) larger than a minimum threshold.

This function performs a Weighted Least Squares (WLS) fit of the histogram of waiting times to an exponential decay.

Parameters:
  • ph (array) – timestamps array from which to extract the background
  • tail_min_us (float) – minimum waiting-time in microseconds
  • binw (float) – bin width for waiting times, in seconds.
  • clk_p (float) – clock period for timestamps in ph
  • weights (None or string) – if None, no weights are applied. If ‘hist_counts’, each bin has a weight equal to its counts. If ‘inv_hist_counts’, the weight is the inverse of the counts.
  • error_metrics (string or None) – Valid values are ‘KS’ or ‘CM’. ‘KS’ (Kolmogorov-Smirnov statistic) computes the error as the maximum deviation of the empirical CDF from the fitted CDF. ‘CM’ (Cramér-von Mises) uses the L^2 distance. If None, no error metric is computed (returns None).
Returns:

2-Tuple – Estimated background rate in cps, and a “quality of fit” index (the lower the better) according to the chosen metric. If error_metrics==None, the returned “quality of fit” is None.
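The histogram approach can be sketched as a weighted linear regression of the log of the bin counts against the bin centers, where the slope is minus the rate. The helper name hist_tail_rate is hypothetical. The log-linearization, the handling of empty bins, and the use of the weights as multipliers of the squared residuals are assumptions for illustration, not a description of the library internals.

```python
import math
import random

def hist_tail_rate(delays, tail_min, binw=50e-6, weights="hist_counts"):
    """Hypothetical helper: rate (cps) from a weighted fit of log(counts)."""
    tail = [d - tail_min for d in delays if d >= tail_min]
    counts = [0] * (int(max(tail) / binw) + 1)
    for d in tail:
        counts[int(d / binw)] += 1
    # (bin center, log count, count); empty bins are skipped since log(0)
    # is undefined.
    bins = [((i + 0.5) * binw, math.log(c), c)
            for i, c in enumerate(counts) if c > 0]
    if weights == "hist_counts":
        w = [c for _, _, c in bins]
    elif weights == "inv_hist_counts":
        w = [1.0 / c for _, _, c in bins]
    else:  # weights=None: plain least squares
        w = [1.0] * len(bins)
    # Weighted least-squares slope of log(counts) versus bin center.
    sw = sum(w)
    mx = sum(wi * x for wi, (x, _, _) in zip(w, bins)) / sw
    my = sum(wi * y for wi, (_, y, _) in zip(w, bins)) / sw
    sxy = sum(wi * (x - mx) * (y - my) for wi, (x, y, _) in zip(w, bins))
    sxx = sum(wi * (x - mx) ** 2 for wi, (x, _, _) in zip(w, bins))
    return -sxy / sxx

# Synthetic exponential delays (seconds) for a 2000 cps background.
random.seed(2)
delays = [random.expovariate(2000.0) for _ in range(50_000)]
hist_rate = hist_tail_rate(delays, tail_min=300e-6)
print(f"estimated background rate: {hist_rate:.0f} cps")
```

Weighting by the counts reduces the influence of the sparsely populated bins far in the tail, where the log of the counts is noisy.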

Low-level background fit functions

Generic functions to fit exponential populations.

These functions can be used directly or, in a typical FRETBursts workflow, passed to higher-level methods.

See also:

fretbursts.fit.exp_fitting.expon_fit(s, s_min=0, offset=0.5, calc_residuals=True)

Fit sample s to an exponential distribution using the ML estimator.

This function computes the rate (Lambda) using the maximum likelihood (ML) estimator of the mean waiting-time (Tau), which for an exponentially distributed sample is the sample mean.

Parameters:
  • s (array) – array of exponentially distributed samples
  • s_min (float) – all samples < s_min are discarded (s_min must be >= 0).
  • offset (float) – offset for computing the CDF. See get_ecdf().
  • calc_residuals (bool) – if True compute the residuals of the fitted exponential versus the empirical CDF.
Returns:

A 4-tuple of the fitted rate (1/life-time), residuals array, residuals x-axis array, sample size after threshold.

fretbursts.fit.exp_fitting.expon_fit_cdf(s, s_min=0, offset=0.5, calc_residuals=True)

Fit of an exponential model to the empirical CDF of s.

This function computes the rate (Lambda) fitting a line (linear regression) to the log of the empirical CDF.

Parameters:
  • s (array) – array of exponentially distributed samples
  • s_min (float) – all samples < s_min are discarded (s_min must be >= 0).
  • offset (float) – offset for computing the CDF. See get_ecdf().
  • calc_residuals (bool) – if True compute the residuals of the fitted exponential versus the empirical CDF.
Returns:

A 4-tuple of the fitted rate (1/life-time), residuals array, residuals x-axis array, sample size after threshold.

fretbursts.fit.exp_fitting.expon_fit_hist(s, bins, s_min=0, weights=None, offset=0.5, calc_residuals=True)

Fit of an exponential model to the histogram of s using least squares.

Parameters:
  • s (array) – array of exponentially distributed samples
  • bins (float or array) – if a float, it is the bin width; otherwise it is the array of bin edges (passed to numpy.histogram)
  • s_min (float) – all samples < s_min are discarded (s_min must be >= 0).
  • weights (None or string) – if None, no weights are applied. If ‘hist_counts’, each bin has a weight equal to its counts. If ‘inv_hist_counts’, the weight is the inverse of the counts.
  • offset (float) – offset for computing the CDF. See get_ecdf().
  • calc_residuals (bool) – if True compute the residuals of the fitted exponential versus the empirical CDF.
Returns:

A 4-tuple of the fitted rate (1/life-time), residuals array, residuals x-axis array, sample size after threshold.

fretbursts.fit.exp_fitting.get_ecdf(s, offset=0.5)

Return arrays (x, y) for the empirical CDF curve of sample s.

See the code for more info (it is a one-liner!).

Parameters:
  • s (array of floats) – sample
  • offset (float, default 0.5) – Offset to add to the y values of the CDF
Returns:

(x, y) (tuple of arrays) – the x and y values of the empirical CDF
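As a rough illustration of what an empirical CDF with an offset looks like, the sketch below sorts the sample and assigns each point the value (i + offset) / n. The function name ecdf and the exact role of the offset are assumptions made here, so the FRETBursts source should be consulted for the authoritative formula.

```python
def ecdf(sample, offset=0.5):
    """Hypothetical stand-in for an empirical CDF with a step offset."""
    x = sorted(sample)
    n = len(x)
    # Assumption: the offset shifts the step index before normalizing,
    # so offset=0.5 places each y value at the middle of its step.
    y = [(i + offset) / n for i in range(n)]
    return x, y

x, y = ecdf([0.3, 0.1, 0.2, 0.4])
print(x)  # sorted sample values
print(y)  # CDF values, strictly between 0 and 1 for offset=0.5
```

Under this assumption, an offset of 0.5 keeps every y value strictly inside (0, 1), which is convenient when taking the log of 1 - y.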

fretbursts.fit.exp_fitting.get_residuals(s, tau_fit, offset=0.5)

Return the residuals of the empirical CDF of sample s versus an exponential CDF.

Parameters:
  • s (array of floats) – sample
  • tau_fit (float) – mean waiting-time of the exponential distribution to use as reference
  • offset (float) – Default 0.5. Offset to add to the empirical CDF. See get_ecdf() for details.
Returns:

residuals (array) – residuals of empirical CDF compared with analytical CDF with time constant tau_fit.
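The residuals, together with the ‘KS’ and ‘CM’ error metrics described for the background functions, can be sketched as follows. All helper names are hypothetical. The ECDF formula and the use of a trapezoidal integral of the squared residuals for the L^2 distance are assumptions for illustration only.

```python
import math
import random

def exp_cdf_residuals(sample, tau_fit, offset=0.5):
    """Hypothetical helper: empirical CDF minus exponential CDF."""
    x = sorted(sample)
    n = len(x)
    res = [(i + offset) / n - (1.0 - math.exp(-xi / tau_fit))
           for i, xi in enumerate(x)]
    return x, res

def ks_error(res):
    # Kolmogorov-Smirnov style metric: largest absolute deviation.
    return max(abs(r) for r in res)

def cm_error(x, res):
    # Cramér-von Mises style metric (assumed form): trapezoidal integral
    # of the squared residuals over x.
    return sum(0.5 * (r0 * r0 + r1 * r1) * (x1 - x0)
               for x0, x1, r0, r1 in zip(x, x[1:], res, res[1:]))

# Exponential sample with a mean waiting-time of 0.5 ms.
random.seed(3)
tau = 0.5e-3
sample = [random.expovariate(1.0 / tau) for _ in range(5_000)]

x_good, res_good = exp_cdf_residuals(sample, tau_fit=tau)
x_bad, res_bad = exp_cdf_residuals(sample, tau_fit=2 * tau)
print("KS:", ks_error(res_good), "vs", ks_error(res_bad))
print("CM:", cm_error(x_good, res_good), "vs", cm_error(x_bad, res_bad))
```

For both metrics a lower value indicates a better fit, so the residuals computed with the correct time constant score lower than those computed with a mismatched one.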