The Data
class is the main container for smFRET measurements.
It contains timestamps, detectors and all the results of data processing
such as background estimation, burst data, fitted FRET and so on.
The reference documentation of the class follows.
Contents
A description of the Data
class and its main attributes.
fretbursts.burstlib.
Data
(leakage=0.0, gamma=1.0, dir_ex=0.0, **kwargs)¶Container for all the information (timestamps, bursts) of a dataset.
Data() contains all the information of a dataset (name, timestamps, bursts, correction factors) and provides several methods to perform analysis (background estimation, burst search, FRET fitting, etc…).
When loading a measurement file a Data() object is created by one
of the loader functions in loader.py
. Data() objects can also be
created with Data.copy()
, Data.fuse_bursts()
or
Data.select_bursts()
.
To add or delete data-attributes use .add()
or .delete()
methods.
All the standard data-attributes are listed below.
Note
Attributes of type “list” contain one element per channel.
Each element, in turn, can be an array. For example .ph_times_m[i]
is the array of timestamps for channel i
; or .nd[i]
is the array
of donor counts in each burst for channel i
.
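As a rough illustration of this per-channel layout, the sketch below uses plain Python lists as stand-ins for the arrays (in a real Data object each element would be an array). All values are invented for illustration.

```python
# Stand-in for a 2-channel measurement; every number here is made up.
ph_times_m = [
    [10, 25, 40, 90],   # timestamps for channel 0
    [5, 60, 61],        # timestamps for channel 1
]
nd = [
    [12.0, 30.5],       # donor counts per burst, channel 0
    [8.0],              # donor counts per burst, channel 1
]

nch = len(ph_times_m)   # "list" attributes hold one element per channel
for ich in range(nch):
    print(f"channel {ich}: {len(ph_times_m[ich])} photons, {len(nd[ich])} bursts")
```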
Measurement attributes
fname
¶string – measurements file name
nch
¶int – number of channels
clk_p
¶float – clock period in seconds for timestamps in ph_times_m
ph_times_m
¶list – list of timestamp arrays (int64). Each array contains all the timestamps (donor+acceptor) in one channel.
A_em
¶list – list of boolean arrays marking acceptor timestamps. Each array is a boolean mask for the corresponding ph_times_m array.
leakage
¶float or array of floats – leakage (or bleed-through) fraction. May be a scalar or an array with nch elements.
gamma
¶float or array of floats – gamma factor. May be a scalar or an array with nch elements.
D_em
¶list of boolean arrays – [ALEX-only]
boolean mask for .ph_times_m[i]
for donor emission
D_ex, A_ex
list of boolean arrays – [ALEX-only]
boolean mask for .ph_times_m[i]
during donor or acceptor
excitation
D_ON, A_ON
2-element tuples of int – [ALEX-only] start-end values for donor and acceptor excitation selection.
alex_period
¶int – [ALEX-only] duration of the alternation period in clock cycles.
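A minimal sketch of how clk_p and the A_em mask relate to the raw timestamps, using plain lists for a single channel. The clock period and timestamps are hypothetical, and treating donor photons as the complement of A_em is an assumption that only makes sense for a non-ALEX measurement.

```python
clk_p = 12.5e-9                     # hypothetical clock period, in seconds
ph_times = [80, 160, 400, 800]      # timestamps of one channel, in clock cycles
a_em = [False, True, False, True]   # True marks an acceptor photon

# Raw timestamps are integers; multiply by clk_p to get seconds.
times_s = [t * clk_p for t in ph_times]

# Use the mask to split the stream (assumption: donor = not acceptor).
acceptor_times = [t for t, is_a in zip(ph_times, a_em) if is_a]
donor_times = [t for t, is_a in zip(ph_times, a_em) if not is_a]
```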
Background Attributes
The background is computed with Data.calc_bg()
and is estimated in chunks of equal duration called background periods.
The background is estimated separately for each spot and photon stream.
The following attributes contain the estimated background rate.
bg
¶dict – background rates for the different photon streams,
channels and background periods. Keys are Ph_sel
objects
and values are lists (one element per channel) of arrays (one
element per background period) of background rates.
bg_mean
¶dict – mean background rates across the entire measurement
for the different photon streams and channels. Keys are Ph_sel
objects and values are lists (one element per channel) of
background rates.
nperiods
¶int – number of periods in which timestamps are split for background calculation
bg_fun
¶function – function used to compute the background rates
Lim
¶list – each element of this list is a list of index pairs for
.ph_times_m[i]
for first and last photon in each period.
Ph_p
¶list – each element in this list is a list of timestamps pairs for first and last photon of each period.
bg_ph_sel
¶Ph_sel object – photon selection used by Lim and Ph_p.
See fretbursts.ph_sel
for details.
Th_us
¶dict – thresholds in us used to select the tail of the
interphoton delay distribution. Keys are Ph_sel
objects
and values are lists (one element per channel) of arrays (one
element per background period).
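The nesting of these attributes (photon selection, then channel, then background period) can be sketched as follows. PhSel is a hypothetical namedtuple standing in for the real Ph_sel class, the rates are invented, and the plain average at the end is only an illustration, not necessarily how bg_mean is computed.

```python
from collections import namedtuple

# Hypothetical stand-in for fretbursts' Ph_sel, used only as a dict key.
PhSel = namedtuple("PhSel", ["Dex", "Aex"])

all_ph = PhSel(Dex="DAem", Aex="DAem")
donor_ph = PhSel(Dex="Dem", Aex=None)

# bg[ph_sel][ich][period] -> background rate; all values are invented.
bg = {
    all_ph: [[1500.0, 1600.0, 1700.0], [900.0, 950.0, 1000.0]],
    donor_ph: [[800.0, 850.0, 900.0], [400.0, 420.0, 440.0]],
}

nperiods = len(bg[all_ph][0])
rate_ch1_period2 = bg[all_ph][1][2]

# Per-channel average over periods (illustrative only).
avg_rates = {sel: [sum(ch) / len(ch) for ch in rates] for sel, rates in bg.items()}
```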
Additionally, there are a few deprecated attributes (bg_dd
, bg_ad
,
bg_da
, bg_aa
, rate_dd
, rate_ad
, rate_da
, rate_aa
and rate_m
)
which will be removed in a future version.
Please use Data.bg
and Data.bg_mean
instead.
Burst search parameters (user input)
These are the parameters used to perform the burst search
(see burst_search()
).
ph_sel
¶Ph_sel object – photon selection used for burst search.
See fretbursts.ph_sel
for details.
m
¶int – number of consecutive timestamps used to compute the local rate during burst search
L
¶int – min. number of photons for a burst to be identified and saved
P
¶float, probability – valid values [0..1].
Probability that a burst-start is due to a Poisson background.
The employed Poisson rate is the one computed by .calc_bg()
.
F
¶float – (F * background_rate)
is the minimum rate for burst-start
Burst search data (available after burst search)
Unless otherwise specified, attributes marked as (list of arrays) contain arrays
with one element per burst. mburst
arrays contain one “row” per burst.
TT
arrays contain one element per period
(see above: background
attributes).
mburst
¶list of Bursts objects – list of Bursts(), one element per channel.
See fretbursts.phtools.burstsearch.Bursts
.
TT
¶list of arrays – list of arrays of T values (in sec.). A T
value is the maximum delay between m
photons to have a
burst-start. Each channel has an array of T values, one for
each background “period” (see above).
T
¶array – per-channel mean of TT
nd, na
list of arrays – number of donor or acceptor photons during donor excitation in each burst
nt
¶list of arrays – total number of photons (nd+na+naa)
naa
¶list of arrays – number of acceptor photons in each burst during acceptor excitation [ALEX only]
nar
¶list of arrays – number of acceptor photons in each burst during donor excitation, not corrected for D-leakage and A-direct-excitation. [PAX only]
bp
¶list of arrays – time period for each burst. Same shape as nd
.
This is needed to identify the background rate for each burst.
bg_bs
¶list – background rates used for threshold computation in burst
search (is a reference to bg
, bg_dd
or bg_ad
).
fuse
¶None or float – if not None, the burst separation in ms below
which bursts have been fused (see .fuse_bursts()
).
E
¶list – FRET efficiency value for each burst: E = na/(na + gamma*nd).
S
¶list – stoichiometry value for each burst: S = (gamma*nd + na) /(gamma*nd + na + naa)
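The two formulas for E and S can be written out directly. The sketch below applies them to plain per-burst numbers; the function names and the counts are invented for illustration and are not part of the Data API.

```python
def fret_efficiency(nd, na, gamma=1.0):
    """E = na / (na + gamma*nd) for a single burst."""
    return na / (na + gamma * nd)


def stoichiometry(nd, na, naa, gamma=1.0):
    """S = (gamma*nd + na) / (gamma*nd + na + naa) for a single burst."""
    return (gamma * nd + na) / (gamma * nd + na + naa)


# Invented corrected counts for two bursts of one channel.
nd = [30.0, 10.0]
na = [10.0, 30.0]
naa = [40.0, 40.0]

E = [fret_efficiency(d, a) for d, a in zip(nd, na)]
S = [stoichiometry(d, a, aa) for d, a, aa in zip(nd, na, naa)]
```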
List of Data
attributes and
methods providing summary information on the measurement:
fretbursts.burstlib.
Data
time_max
¶The last recorded time in seconds.
time_min
¶The first recorded time in seconds.
ph_data_sizes
¶Array of total number of photons (ph-data) for each channel.
num_bursts
¶Array of number of bursts in each channel.
burst_sizes
(gamma=1.0, add_naa=False, beta=1.0, donor_ref=True)¶Return gamma corrected burst sizes for all the channels.
Compute burst sizes by calling, for each channel,
burst_sizes_ich()
.
See burst_sizes_ich()
for description of the arguments.
burst_sizes_pax_ich
(ich=0, ph_sel=Ph_sel(Dex='DAem', Aex='DAem'), naa_aexonly=False, naa_comp=False, na_comp=False, gamma=1.0, beta=1.0, donor_ref=True)¶Return different definitions of PAX burst sizes for channel ich
.
There are 4 basic “terms” corresponding to the 4 photon streams:
nd
, na
, nda
, naa
. Which term is included is defined by
the ph_sel
argument (by default all are included).
The other arguments specify the various corrections for each term.
Parameters: |
|
---|
ich
.
Examples
Burst sizes with all streams and no correction:
Data.burst_sizes_pax_ich(ph_sel=Ph_sel('all'))
Burst sizes with all streams and all corrections:
Data.burst_sizes_pax_ich(ph_sel=Ph_sel('all'), na_comp=True,
naa_aexonly=True, naa_comp=True)
See also
burst_sizes_ich
(ich=0, gamma=1.0, add_naa=False, beta=1.0, donor_ref=True)¶Return gamma corrected burst sizes for channel ich
.
If donor_ref == True
(default) the gamma corrected burst size is
computed according to:
1) nd + na / gamma
Otherwise, if donor_ref == False
, the gamma corrected burst size is:
2) nd * gamma + na
With the definition (1) the corrected burst size is equal to the raw
burst size for zero-FRET or D-only bursts (hence the name donor_ref
).
With the definition (2) the corrected burst size is equal to the raw
burst size for 100%-FRET bursts.
In an ALEX measurement, use add_naa = True
to add counts from
AexAem stream to the returned burst size. The arguments gamma
and
beta
are used to correctly scale naa
so that it becomes
commensurate with the Dex corrected burst size. In particular,
when using definition (1) (i.e. donor_ref = True
), the total
burst size is:
(nd + na/gamma) + naa / (beta * gamma)
Conversely, when using definition (2) (donor_ref = False
), the
total burst size is:
(nd * gamma + na) + naa / beta
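The four expressions above can be collected into one small function. This is a sketch that follows the formulas as stated here, operating on single-burst numbers; it is not the library implementation, and the function name is invented.

```python
def corrected_burst_size(nd, na, naa=0.0, gamma=1.0, beta=1.0,
                         donor_ref=True, add_naa=False):
    """Gamma-corrected burst size for one burst, per definitions (1) and (2)."""
    if donor_ref:
        # Definition (1): equals the raw size for D-only bursts.
        size = nd + na / gamma
        if add_naa:
            size += naa / (beta * gamma)
    else:
        # Definition (2): equals the raw size for 100%-FRET bursts.
        size = nd * gamma + na
        if add_naa:
            size += naa / beta
    return size


# Invented counts for one burst.
size_dref = corrected_burst_size(20.0, 10.0, gamma=0.5)
size_aref = corrected_burst_size(20.0, 10.0, gamma=0.5, donor_ref=False)
```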
Parameters: |
|
---|
ich
.
burst_widths
¶List of arrays of burst duration in seconds. One array per channel.
ph_in_bursts_ich
(ich=0, ph_sel=Ph_sel(Dex='DAem', Aex='DAem'))¶Return timestamps of photons inside bursts for channel ich
.
Returns the timestamps of the photons in channel ich and photon selection ph_sel that are inside any burst.
ph_in_bursts_mask_ich
(ich=0, ph_sel=Ph_sel(Dex='DAem', Aex='DAem'))¶Return mask of all photons inside bursts for channel ich
.
Returns a mask for the photons in channel ich and photon selection ph_sel that are inside any burst.
status
(add='', noname=False)¶Return a string with burst search, corrections and selection info.
name
¶Measurement name – last subfolder + file name with no extension.
Name
(add='')¶Return short filename + status information.
The following methods perform background estimation, burst search and burst-data calculations:
Data.calc_bg()
Data.burst_search()
Data.calc_fret()
Data.calc_ph_num()
Data.fuse_bursts()
Data.calc_sbr()
Data.calc_max_rate()
The methods documentation follows:
fretbursts.burstlib.
Data
calc_bg
(fun, time_s=60, tail_min_us=500, F_bg=2, error_metrics=None, fit_allph=True)¶Compute time-dependent background rates for all the channels.
Compute background rates for donor, acceptor and both detectors.
The rates are computed every time_s
seconds, making it possible to
track possible variations during the measurement.
Parameters: |
|
---|
The background estimation functions are defined in the module
background
(conventionally imported as bg
).
Example
Compute background with bg.exp_fit
(inter-photon delays MLE
tail fitting), every 20 s, with automatic tail-threshold:
d.calc_bg(bg.exp_fit, time_s=20, tail_min_us='auto')
Returns: None; all the results are saved in the object itself.
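To give an idea of what "inter-photon delays MLE tail fitting" means, the sketch below estimates a rate from the delays above a threshold. For exponentially distributed delays, the maximum-likelihood rate of the tail is the reciprocal of the mean excess over the threshold. This shows only the principle; the function name is invented and it is not the code behind bg.exp_fit.

```python
def exp_tail_rate(timestamps_s, tail_min_s):
    """MLE rate (counts/s) from inter-photon delays longer than tail_min_s."""
    delays = [t1 - t0 for t0, t1 in zip(timestamps_s[:-1], timestamps_s[1:])]
    tail = [d for d in delays if d > tail_min_s]
    if not tail:
        raise ValueError("no delays above the tail threshold")
    # Exponential delays are memoryless, so the excess over the threshold
    # follows the same exponential distribution.
    mean_excess = sum(tail) / len(tail) - tail_min_s
    return 1.0 / mean_excess


# Invented timestamps (seconds): delays are 0.5, 1.5, 2.5 and 3.5 s.
rate = exp_tail_rate([0.0, 0.5, 2.0, 4.5, 8.0], tail_min_s=1.0)
```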
burst_search
(L=None, m=10, F=6.0, P=None, min_rate_cps=None, ph_sel=Ph_sel(Dex='DAem', Aex='DAem'), compact=False, index_allph=True, c=-1, computefret=True, max_rate=False, dither=False, pure_python=False, verbose=False, mute=False, pax=False)¶Performs a burst search with specified parameters.
This method performs a sliding-window burst search without
binning the timestamps. The burst starts when the rate of m
photons is above a minimum rate, and stops when the rate falls below
the threshold. The result of the burst search is stored in the
mburst
attribute (a list of Bursts objects, one per channel)
containing start/stop times and indexes. By default, after burst
search, this method computes donor and acceptor counts, applies
burst corrections (background, leakage, etc…) and computes
E (and S in case of ALEX). You can skip these steps by passing
computefret=False
.
The minimum rate can be explicitly specified with the min_rate_cps
argument, or computed as a function of the background rate with the
F
argument.
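The sliding-window idea described above can be sketched in a few lines of plain Python. This is a simplified illustration, not the FRETBursts implementation: the rate of m consecutive photons is assumed to be (m - 1) divided by the time they span, and the function name and the inclusive stop index are choices made for this example.

```python
def sliding_window_burst_search(times, m, min_rate, L):
    """Return (start, stop) photon-index pairs (stop inclusive) for each burst."""
    # With the assumed estimator, rate >= min_rate is equivalent to
    # "m consecutive photons span at most T".
    T = (m - 1) / min_rate
    bursts = []
    in_burst = False
    start = 0
    for i in range(len(times) - m + 1):
        above = (times[i + m - 1] - times[i]) <= T
        if above and not in_burst:
            in_burst, start = True, i
        elif not above and in_burst:
            in_burst = False
            stop = i + m - 2  # last photon of the last window above threshold
            if stop - start + 1 >= L:
                bursts.append((start, stop))
    if in_burst:  # burst still open at the end of the data
        stop = len(times) - 1
        if stop - start + 1 >= L:
            bursts.append((start, stop))
    return bursts


# Invented timestamps: sparse background with a dense group in the middle.
times = [0.0, 100.0, 200.0, 300.0, 301.0, 302.0, 303.0, 304.0, 305.0,
         400.0, 500.0, 600.0]
bursts = sliding_window_burst_search(times, m=3, min_rate=0.4, L=4)
```

In the real method the threshold comes either from min_rate_cps or from F times the background rate, which is why the background must be computed first in the latter case.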
Parameters: |
|
---|
Note
When using P
or F
the background rates are needed, so
.calc_bg()
must be called before the burst search.
Example
d.burst_search(m=10, F=6)
Returns: None; all the results are saved in the Data object.
calc_fret
(count_ph=False, corrections=True, dither=False, mute=False, pure_python=False, pax=False)¶Compute FRET (and stoichiometry if ALEX) for each burst.
This is a high-level function that can be run after burst search. By default, it will count Donor and Acceptor photons, perform corrections (background, leakage), and compute gamma-corrected FRET efficiencies (and stoichiometry if ALEX).
Parameters: |
|
---|---|
Returns: None; all the results are saved in the object.
calc_ph_num
(alex_all=False, pure_python=False)¶Computes number of D, A (and AA) photons in each burst.
Parameters: |
|
---|---|
Returns: | Saves |
fuse_bursts
(ms=0, process=True, mute=False)¶Return a new Data
object with nearby bursts fused together.
Parameters: |
|
---|
calc_sbr
(ph_sel=Ph_sel(Dex='DAem', Aex='DAem'), gamma=1.0)¶Return Signal-to-Background Ratio (SBR) for each burst.
Parameters: |
|
---|---|
Returns: | A list of arrays (one per channel) with one value per burst.
The list is also saved in |
calc_max_rate
(m, ph_sel=Ph_sel(Dex='DAem', Aex='DAem'), compact=False, c=1)¶Compute the max m-photon rate reached in each burst.
Parameters: |
|
---|
The following are the various burst correction factors. They are Data
properties, so setting their value automatically updates all the burst
quantities (including E
and S
).
List of Data
methods used to apply burst corrections.
fretbursts.burstlib.
Data
background_correction
(relax_nt=False, mute=False)¶Apply background correction to burst sizes (nd, na,…)
leakage_correction
(mute=False)¶Apply leakage correction to burst sizes (nd, na,…)
dither
(lsb=2, mute=False)¶Add dithering (uniform random noise) to burst counts (nd, na,…).
The dithering amplitude is the range -0.5*lsb .. 0.5*lsb.
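A minimal sketch of this kind of dithering on a plain list of counts, using the standard library random module. The function name and the optional rng argument are inventions for this example; the real method operates in place on the burst arrays of the Data object.

```python
import random


def dither_counts(counts, lsb=2, rng=None):
    """Add uniform noise in the range -0.5*lsb .. 0.5*lsb to each count."""
    rng = rng or random.Random()
    half = 0.5 * lsb
    return [c + rng.uniform(-half, half) for c in counts]


nd = [12.0, 30.0, 7.0]                                    # invented burst counts
nd_dithered = dither_counts(nd, lsb=2, rng=random.Random(0))
```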
Data
methods for filtering bursts according to different rules.
See also Burst selection.
fretbursts.burstlib.
Data
select_bursts
(filter_fun, negate=False, computefret=True, args=None, **kwargs)¶Return an object with bursts filtered according to filter_fun
.
This is the main method to select bursts according to different
criteria. The selection rule is defined by the selection function
filter_fun
. FRETBursts provides several predefined selection
functions (see Burst selection). New selection
functions can be defined and passed to this method to implement
arbitrary selection rules.
Parameters: |
|
---|
filter_fun()
.
Returns: A new Data object containing only the selected bursts.
Note
In order to save RAM, the timestamp arrays (ph_times_m
)
of the new Data() point to the same arrays as the original
Data(). Conversely, all the bursts data (mburst
, nd
, na
,
etc…) are new distinct objects.
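The behaviour described in this note (shared timestamps, new burst data) can be mimicked with a toy container. Everything in the sketch below is a hypothetical stand-in: MiniData is not the real Data class, and the signature of size_filter is an assumption made for this example rather than the documented signature of FRETBursts selection functions.

```python
class MiniData:
    """Toy stand-in for Data: per-channel timestamps and burst counts."""

    def __init__(self, ph_times_m, nd, na):
        self.ph_times_m = ph_times_m
        self.nd = nd
        self.na = na


def size_filter(d, ich, th=20):
    """Hypothetical selection function: keep bursts with nd + na >= th."""
    return [nd_i + na_i >= th for nd_i, na_i in zip(d.nd[ich], d.na[ich])]


def apply_mask(values, mask):
    return [v for v, keep in zip(values, mask) if keep]


def select_bursts(d, filter_fun, negate=False, **kwargs):
    masks = [filter_fun(d, ich, **kwargs) for ich in range(len(d.nd))]
    if negate:
        masks = [[not keep for keep in mask] for mask in masks]
    nd = [apply_mask(ch, mask) for ch, mask in zip(d.nd, masks)]
    na = [apply_mask(ch, mask) for ch, mask in zip(d.na, masks)]
    # Timestamps are shared with the original; burst data are new lists.
    return MiniData(d.ph_times_m, nd, na)


d = MiniData([[100, 220, 340]], [[15.0, 5.0, 40.0]], [[10.0, 5.0, 2.0]])
ds = select_bursts(d, size_filter, th=20)
```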
select_bursts_mask
(filter_fun, negate=False, return_str=False, args=None, **kwargs)¶Returns mask arrays to select bursts according to filter_fun
.
The function filter_fun
is called to compute the mask arrays for
each channel.
This method is useful when you want to apply a selection from one
object to a second object. Otherwise use Data.select_bursts()
.
Parameters: |
|
---|
filter_fun()
.
Returns: A list of boolean arrays (one per channel) that define the burst
selection. If return_str is True, returns a list of tuples, where
each tuple is a bool array and a string.
select_bursts_mask_apply
(masks, computefret=True, str_sel='')¶Returns a new Data object with bursts selected according to masks
.
This method selects bursts using a list of boolean arrays as input.
Since the user needs to create the boolean arrays first, this method
is useful when experimenting with new selection criteria that don’t
have a dedicated selection function. Usually, however, it is easier
to select bursts through Data.select_bursts()
(using a
selection function).
Parameters: |
|
---|---|
Returns: | A new |
Note
In order to save RAM, the timestamp arrays (ph_times_m
)
of the new Data() point to the same arrays as the original
Data(). Conversely, all the bursts data (mburst
, nd
, na
,
etc…) are new distinct objects.
See also
Data.select_bursts()
, Data.select_bursts_mask()
Some fitting methods for burst data. Note that E and S histogram fitting with generic models is now handled with the new fitting framework.
fretbursts.burstlib.
Data
fit_E_generic
(E1=-1, E2=2, fit_fun=<function two_gaussian_fit_hist>, weights=None, gamma=1.0, **fit_kwargs)¶Fit E in each channel with fit_fun
using burst in [E1,E2] range.
All the fitting functions are defined in
fretbursts.fit.gaussian_fitting
.
Parameters: |
|
---|
All the additional arguments are passed to fit_fun
. For example p0
or mu_fix
can be passed (see fit.gaussian_fitting
for details).
Note
Use this method for CDF/PDF or hist fitting.
For EM fitting use fit_E_two_gauss_EM()
.
fit_E_m
(E1=-1, E2=2, weights='size', gamma=1.0)¶Fit E in each channel with the mean using bursts in [E1,E2] range.
Note
These two fits are equivalent (but the first is much faster):
fit_E_m(weights='size')
fit_E_minimize(kind='E_size', weights='sqrt')
However fit_E_minimize()
does not provide a model curve.
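As a rough picture of what fitting E "with the mean" and size weights amounts to, the sketch below computes a size-weighted mean of E over the bursts inside the [E1, E2] range. It assumes that weights='size' means weighting each burst by its size and that the range is inclusive; both are assumptions, and the function name is invented.

```python
def weighted_mean_E(E, sizes, E1=-1, E2=2):
    """Size-weighted mean of E over bursts with E1 <= E <= E2."""
    pairs = [(e, s) for e, s in zip(E, sizes) if E1 <= e <= E2]
    total = sum(s for _, s in pairs)
    return sum(e * s for e, s in pairs) / total


# Invented per-burst values; the last burst falls outside the default range.
E = [0.2, 0.4, 3.0]
sizes = [10.0, 30.0, 100.0]
E_fit = weighted_mean_E(E, sizes)
```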
fit_E_ML_poiss
(E1=-1, E2=2, method=1, **kwargs)¶ML fit for E modeling size ~ Poisson, using bursts in [E1,E2] range.
fit_E_minimize
(kind='slope', E1=-1, E2=2, **kwargs)¶Fit E using method kind
(‘slope’ or ‘E_size’) and bursts in [E1,E2]
If kind
is ‘slope’ the fit function is fret_fit.fit_E_slope()
If kind
is ‘E_size’ the fit function is fret_fit.fit_E_E_size()
Additional arguments in kwargs
are passed to the fit function.
fit_E_two_gauss_EM
(fit_func=<function two_gaussian_fit_EM>, weights='size', gamma=1.0, **kwargs)¶Fit the E population to a Gaussian mixture model using EM method.
Additional arguments in kwargs
are passed to the fit_func().
The following methods are used to access (or iterate over) the arrays of timestamps (for different photon streams), timestamps masks and burst data.
Data.get_ph_times()
Data.iter_ph_times()
Data.get_ph_mask()
Data.iter_ph_masks()
Data.iter_bursts_ph()
Data.expand()
Data.copy()
Data.slice_ph()
The methods documentation follows:
fretbursts.burstlib.
Data
get_ph_times
(ich=0, ph_sel=Ph_sel(Dex='DAem', Aex='DAem'), compact=False)¶Returns the timestamps array for channel ich
.
This method always returns in-memory arrays, even when ph_times_m is a disk-backed list of arrays.
Parameters: |
|
---|
iter_ph_times
(ph_sel=Ph_sel(Dex='DAem', Aex='DAem'), compact=False)¶Iterator that returns the arrays of timestamps in .ph_times_m
.
Parameters: Same arguments as get_ph_times() except for ich.
get_ph_mask
(ich=0, ph_sel=Ph_sel(Dex='DAem', Aex='DAem'))¶Returns a mask for ph_sel
photons in channel ich
.
The masks are either boolean arrays or slices (full or empty). In both cases they can be used to index the timestamps of the corresponding channel.
Parameters: ph_sel (Ph_sel object) – object defining the photon selection.
See fretbursts.ph_sel for details.
iter_ph_masks
(ph_sel=Ph_sel(Dex='DAem', Aex='DAem'))¶Iterator returning masks for ph_sel
photons.
Parameters: ph_sel (Ph_sel object) – object defining the photon selection.
See fretbursts.ph_sel for details.
iter_bursts_ph
(ich=0)¶Iterate over (start, stop) indexes to slice photons for each burst.
expand
(ich=0, alex_naa=False, width=False)¶Return per-burst D and A sizes (nd, na) and their background counts.
For each burst, this method returns the corrected signal counts and background counts in the donor and acceptor channels. Optionally, the burst width is also returned.
Parameters: |
|
---|---|
Returns: | List of arrays – nd, na, donor bg, acceptor bg.
If |
copy
(mute=False)¶Copy data into a new object. All arrays are copied except for ph_times_m.
slice_ph
(time_s1=0, time_s2=None, s='slice')¶Return a new Data object with photons in [time_s1, time_s2] (seconds).
If ALEX, this method must be called right after
fretbursts.loader.alex_apply_periods()
(with delete_ph_t=True
)
and before any background estimation or burst search.
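The time-window selection itself can be sketched on a single sorted list of timestamps. The sketch converts the limits from seconds to clock cycles with clk_p and uses binary search to find the slice; the function name, the clock period and the inclusive upper limit are assumptions for this example. The real method also builds a new Data object and slices the associated masks.

```python
from bisect import bisect_left, bisect_right


def slice_timestamps(ph_times, clk_p, time_s1=0, time_s2=None):
    """Return the timestamps (clock cycles) falling in [time_s1, time_s2] seconds."""
    i1 = bisect_left(ph_times, time_s1 / clk_p)
    if time_s2 is None:
        i2 = len(ph_times)
    else:
        i2 = bisect_right(ph_times, time_s2 / clk_p)
    return ph_times[i1:i2]


clk_p = 12.5e-9  # hypothetical clock period, in seconds
# Invented timestamps at 0.5, 1.5, 2.5 and 3.5 s.
ph_times = [40_000_000, 120_000_000, 200_000_000, 280_000_000]
window = slice_timestamps(ph_times, clk_p, time_s1=1.0, time_s2=3.0)
```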