mathtools

Description¶

This module contains all the mathematical functions that are necessary for the library.

`min_max_scaling(data=None, series_or_df=None, col_by_col=False)` ¶

Min-max scaling of pd.series, pd.dataframe or np.ndarray.

Min-max feature scaling is often simply referred to as normalisation, which rescales the dataset feature to a range of 0 - 1. It's calculated by subtracting the feature's minimum value from the value and then dividing it by the difference between the maximum and minimum value.

The formula looks like this: xnorm = x - xmin / xmax - xmin.

PARAMETER	DESCRIPTION
`data`	The data to normalise. TYPE: `(Series, DataFrame or ndarray)` DEFAULT: `None`
`series_or_df`	This parameter is deprecated and will be removed in future releases. Use "data" instead. TYPE: `None` DEFAULT: `None`
`col_by_col`	When True, each column is normalised between 0 and 1. When False, all the data is normalised between 0 and 1. Use col_by_col=False if you want to preserve the original relative amplitude of the different columns. col_by_col=True is acceptable only for 1 and 2D data. If data has more than 2 dimensions, set col_by_col=False to normalise the whole data instead. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`dataect`	The normalised data of the same type as the input. TYPE: `(Series, DataFrame or ndarray)`

`norm_xcorr(sig1, sig2, out='both')` ¶

Normalized cross-correlation of 2 signals.

PARAMETER	DESCRIPTION
`sig1`	The two signals to correlate. These signals must be 1-dimensional and of same length and type. TYPE: `Series or ndarray`
`sig2`	The two signals to correlate. These signals must be 1-dimensional and of same length and type. TYPE: `Series or ndarray`
`out`	A string indicating the output value: `both` The output is the greatest positive or negative cross-correlation value. `max` The output is the maximum cross-correlation value. TYPE: `str {"both", "max"}` DEFAULT: `"both"`

RETURNS	DESCRIPTION
`xcc`	The cross-correlation value depending on "out". TYPE: `float`

See also

norm_twod_xcorr : Normalised 2-dimensional cross-correlation of STAs of two MUS.

`norm_twod_xcorr(sig1=None, sig2=None, mode='full', df1=None, df2=None)` ¶

Normalised 2-dimensional cross-correlation of 2, 2D signals.

The two inputs must have same shape. When this function is used to cross-correlate MUAPs obtained via STA, sig1 and sig2 should contain the unpacked STA of the first and second MU, respectively, without np.nan columns.

PARAMETER	DESCRIPTION
`sig1`	The two signals to correlate. These signals must be 2-dimensional and of same length and type. TYPE: `DataFrame or ndarray` DEFAULT: `None`
`sig2`	The two signals to correlate. These signals must be 2-dimensional and of same length and type. TYPE: `DataFrame or ndarray` DEFAULT: `None`
`mode`	A string indicating the size of the output: `full` The output is the full discrete linear cross-correlation of the inputs. (Default) `valid` The output consists only of those elements that do not rely on the zero-padding. In 'valid' mode, either `sta_mu1` or `sta_mu2` must be at least as large as the other in every dimension. `same` The output is the same size as `in1`, centered with respect to the 'full' output. TYPE: `str {"full", "valid", "same"}` DEFAULT: `"full"`
`df1`	This parameter is deprecated and will be removed in future releases. Use "sig1" instead. TYPE: `None` DEFAULT: `None`
`df2`	This parameter is deprecated and will be removed in future releases. Use "sig2" instead. TYPE: `None` DEFAULT: `None`

RETURNS	DESCRIPTION
`normxcorr`	The results of the normalised 2d cross-correlation. TYPE: `DataFrame`
`normxcorr_max`	The maximum value of the 2d cross-correlation. TYPE: `float`

See also

align_by_xcorr : to align the two STAs before calling norm_twod_xcorr.
unpack_sta : for unpacking the sta dict in a pd.DataFrame before passing it to norm_twod_xcorr.
pack_sta : for packing the sta pd.DataFrame in a dict where each matrix column corresponds to a dict key.

Examples:

Full steps to pass two dataframes to norm_twod_xcorr from the same EMG file.

Load the EMG file and band-pass filter the raw EMG signal
Sort the matrix channels and compute the spike-triggered average
Extract the STA of the MUs of interest from all the STAs
Unpack the STAs of single MUs and remove np.nan to pass them to norm_twod_xcorr
Compute 2dxcorr to identify a common lag/delay

>>> import openhdemg.library as emg
>>> emgfile = emg.askopenfile(filesource="OTB", otb_ext_factor=8)
>>> emgfile = emg.filter_rawemg(emgfile, order=2, lowcut=20, highcut=500)
>>> sorted_rawemg = emg.sort_rawemg(
...     emgfile,
...     code="GR08MM1305",
...     orientation=180,
...     )
>>> sta = emg.sta(emgfile, sorted_rawemg, firings=[0, 50], timewindow=100)
>>> mu0 = 0
>>> mu1 = 1
>>> sta_mu1 = sta[mu0]
>>> sta_mu2 = sta[mu1]
>>> df1, _ = emg.unpack_sta(sta_mu1)
>>> no_nan_sta1 = df1.dropna(axis=1)
>>> df2, _ = emg.unpack_sta(sta_mu2)
>>> no_nan_sta2 = df2.dropna(axis=1)
>>> normxcorr_df, normxcorr_max = emg.norm_twod_xcorr(
...     no_nan_sta1,
...     no_nan_sta2,
... )
>>> normxcorr_max
0.7241553627564273
>>> normxcorr_df
            0             1             2               125       126
0   -0.000002 -1.467778e-05 -3.013564e-05 ... -1.052780e-06  0.000001
1   -0.000004 -2.818055e-05 -6.024427e-05 ... -4.452469e-06  0.000001
2   -0.000007 -4.192479e-05 -9.223725e-05 ... -1.549197e-05 -0.000002
3   -0.000009 -5.071660e-05 -1.174545e-04 ... -3.078518e-05 -0.000007
4   -0.000007 -4.841255e-05 -1.239106e-04 ... -4.232094e-05 -0.000012
..        ...           ...           ... ...           ...       ...
402  0.000005  1.641773e-05  3.994943e-05 ...  8.170792e-07 -0.000006
403 -0.000001  4.535878e-06  1.858700e-05 ...  2.087135e-06 -0.000003
404 -0.000004 -1.241530e-06  5.704194e-06 ...  1.027966e-05  0.000002
405 -0.000004 -1.693078e-06  1.054646e-06 ...  1.811828e-05  0.000007
406 -0.000002 -2.473282e-07  6.006046e-07 ...  1.605406e-05  0.000007

`discrete_spike_xcorr(spikes1, spikes2, max_lag)` ¶

Compute a discrete cross-correlation between two spike trains over a limited range of lags.

This function counts coincidences between spikes in spikes1 and spikes2 at different temporal offsets (lags) ranging from -max_lag to +max_lag. It is intended for comparing the similarity of two spike trains (MUPULSES) within a specified maximum lag window.

PARAMETER	DESCRIPTION
`spikes1`	Indices of spikes in the first spike train. TYPE: `list or array-like of int`
`spikes2`	Indices of spikes in the second spike train. TYPE: `list or array-like of int`
`max_lag`	Maximum lag (in sample points) to consider. The function computes cross-correlation at lags -max_lag, ..., 0, ..., +max_lag. TYPE: `int`

RETURNS	DESCRIPTION
`xcorr`	Cross-correlation values for each lag. The central element (`xcorr[lag]`) corresponds to zero lag, negative indices correspond to negative lags, and positive indices correspond to positive lags. Each value represents the number of coinciding spikes between `spikes1` and the shifted `spikes2`. TYPE: `np.ndarray of shape (2*max_lag + 1,)`

See also

remove_duplicates_within : Remove duplicate MUs within the same file based on discharge times.

Examples:

>>> import openhdemg.library as emg
>>> spikes1 = [1, 5, 10]
>>> spikes2 = [2, 5, 9]
>>> emg.discrete_spike_xcorr(spikes1, spikes2, max_lag=2)
array([0. 1. 1. 1. 0.])

`compute_sil(ipts, mupulses, compute_on_peaks_only=True, ignore_negative_ipts=None)` ¶

Calculate the Silhouette score for a single MU.

The SIL is defined as the difference between the within-cluster sums of point-to-centroid distances and the same measure calculated between clusters. The output measure is normalised in a range between 0 and 1.

Since version 0.2.0

The behaviour of this function has changed to ensure a more accurate SIL estimation. To maintain the old behaviour, you can set compute_on_peaks_only=False.

PARAMETER	DESCRIPTION
`ipts`	The decomposed source (or pulse train, IPTS[mu]) of the MU of interest. TYPE: `Series or ndarray`
`mupulses`	The time of firing (MUPULSES[mu]) of the MU of interest. TYPE: `ndarray`
`compute_on_peaks_only`	If True, the silhouette (SIL) score is computed using only the ipts peaks, rather than all values in the source signal. This can improve accuracy estimation by comparing MU spikes only against other candidate spikes, ignoring baseline or negative ipts values. If False, the noise cluster is defined as all samples not selected as MU spikes. TYPE: `bool` DEFAULT: `True`
`ignore_negative_ipts`	This parameter is deprecated and will be removed in future releases. Please transform the 'ipts' before if needed. To replicate the behaviour of 'ignore_negative_ipts=True' you can use 'ipts * np.abs(ipts)'. TYPE: `None` DEFAULT: `None`

RETURNS	DESCRIPTION
`sil`	The SIL score. TYPE: `float64`

See also

compute_pnr : to calculate the Pulse to Noise ratio of a single MU.

Examples:

Calculate the SIL score for the third MU (MU number 2).

>>> import openhdemg.library as emg
>>> emgfile = emg.emg_from_samplefile()
>>> mu_of_interest = 2
>>> sil_value = emg.compute_sil(
...     ipts=emgfile["IPTS"][mu_of_interest],
...     mupulses=emgfile["MUPULSES"][mu_of_interest],
... )
0.9155651076123175

`compute_pnr(ipts, mupulses, fsamp, constrain_pulses=[True, 3], separate_paired_firings=True)` ¶

Calculate the pulse to noise ratio for a single MU.

PARAMETER	DESCRIPTION
`ipts`	The decomposed source (or pulse train, IPTS[mu]) of the MU of interest. TYPE: `Series`
`mupulses`	The time of firing (MUPULSES[mu]) of the MU of interest. TYPE: `ndarray`
`constrain_pulses`	If constrain_pulses[0] == True, the times of firing are considered those in mupulses +- the number of samples specified in constrain_pulses[1]. If constrain_pulses[0] == False, the times of firing are estimated via a heuristic penalty funtion (see Notes). constrain_pulses[1] must be an integer (see Notes for instructions on how to set the appropriate value). TYPE: `list` DEFAULT: `[True, 3]`
`separate_paired_firings`	Whether to treat differently paired and non-paired firings during the estimation of the signal/noise threshold (heuristic penalty funtion, see Notes). This is relevant only if constrain_pulses[0] == False. Otherwise, this argument is ignored. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`pnr`	The PNR in decibels. TYPE: `float64`

See also

compute_sil : to calculate the Silhouette score for a single MU.

Notes

The behaviour of the compute_pnr() function is determined by the argument constrain_pulses.

If constrain_pulses[0] == True, the times of firing are considered those in mupulses +- a number of samples specified in constrain_pulses[1]. The inclusion of the samples around the mupulses values allows to capture the full ipts corresponding to the time of firing (e.g., including also the raising and falling wedges). The appropriate value of constrain_pulses[1] must be determined by the user and depends on the sampling frequency. It is suggested to use 3 when the sampling frequency is 2000 or 2048 Hz and increase it if the sampling frequency is higher (e.g. use 6 at 4000 or 4096 Hz). With this approach, the PNR estimation is not related to the variability of the firings.

If constrain_pulses[0] == False, the ipts values are classified as firings or noise based on a threshold value (named "Pi" or "r") estimated from the euristic penalty funtion described in Holobar 2012, as proposed in Holobar 2014. If the variability of the firings is relevant, this apoproach should be preferred. Specifically: Pi = D · χ3,50 + CoVIDI + CoVpIDI Where: D is the median of the low-pass filtered instantaneous motor unit discharge rate (first-order Butterworth filter, cut-off frequency 3 Hz). χ3,50 stands for an indicator function that penalizes motor units with filtered discharge rate D below 3 pulses per second (pps) or above 50 pps: χ3,50 = 0 if D is between 3 and 50 or D if D is not between 3 and 50. Two separate coefficients of variation for inter-discharge interval (IDI) calculated as standard deviation (SD) of IDI divided by the mean IDI, are used. CoVIDI is the coefficient of variation for IDI of non-paired MUs discharges only, whereas CoVpIDI is the coefficient of variation for IDI of paired MUs discharges. Holobar 2012 considered MUs discharges paired whenever the second discharge was within 50 ms of the first. Paired discharges are typical in pathological tremor and the use of both CoVIDI and CoVpIDI accounts for this condition. However, this heuristic penalty function penalizes MUs firing during specific types of contractions like explosive contractions (MUs discharge up to 200 pps). Therefore, in this implementation of the PNR, we did not include the penalty based on MUs discharge. Additionally, the user can decide whether to adopt the two coefficients of variations to estimate Pi or not. If both are used, Pi would be calculated as: Pi = CoVIDI + CoVpIDI Otherwise, Pi would be calculated as: Pi = CoV_all_IDI

Examples:

Calculate the PNR value for the third MU (MU number 2) forcing the selction of the times of firing.

>>> import openhdemg.library as emg
>>> emgfile = emg.emg_from_samplefile()
>>> mu_of_interest = 2
>>> pnr_value = emg.compute_pnr(
...     ipts=emgfile["IPTS"][mu_of_interest],
...     mupulses=emgfile["MUPULSES"][mu_of_interest],
...     fsamp=emgfile["FSAMP"],
...     constrain_pulses=[True, 3],
... )

Calculate the PNR value for the third MU (MU number 2) selecting the times of firing based on the euristic penalty funtion described in Holobar 2012 and considering, separately, the paired and the non-paired firings.

>>> import openhdemg.library as emg
>>> emgfile = emg.emg_from_samplefile()
>>> mu_of_interest = 2
>>> pnr_value = emg.compute_pnr(
...     ipts=emgfile["IPTS"][mu_of_interest],
...     mupulses=emgfile["MUPULSES"][mu_of_interest],
...     fsamp=emgfile["FSAMP"],
...     constrain_pulses=[False],
...     separate_paired_firings=True,
... )

`compute_pulses_agreement_rate(estimated_pulses, reference_pulses, tolerance=1, method='dice_coefficient')` ¶

Calculate the Rate of Agreement (ROA) between two MUPULSES sets.

Usually used to estimate the ROA between the MUPULSES of a single MU detected during the decomposition (or after the manual cleaning), with a reference array of values (for example, simulation ground truth or MUPULSES detected using different computational methods).

Currently, only the 'dice_coefficient' method is implemented. This parameter is included for future compatibility with alternative agreement metrics.

PARAMETER	DESCRIPTION
`estimated_pulses`	The estimated time of firing (e.g., MUPULSES[mu]) of the MU of interest. TYPE: `ndarray`
`reference_pulses`	The reference time of firing (e.g., REFERENCE_MUPULSES[mu]) of the same MU of interest. TYPE: `ndarray`
`tolerance`	Maximum accepted absolute sample difference between an estimated and reference discharge for them to be counted as a match. A value of 0 means exact sample matching. tolerance=n allows ±n samples of shift. TYPE: `int` DEFAULT: `1`
`method`	The metric used to calculate agreement. Currently only supports "dice_coefficient" (default), which calculates ROA as: (2 * TP) / (2 * TP + FP + FN) * 100. TYPE: `str` DEFAULT: `'dice_coefficient'`

RETURNS	DESCRIPTION
`roa`	The ROA in %. TYPE: `float64`

mathtools

Description¶

min_max_scaling(data=None, series_or_df=None, col_by_col=False) ¶

norm_xcorr(sig1, sig2, out='both') ¶

norm_twod_xcorr(sig1=None, sig2=None, mode='full', df1=None, df2=None) ¶

discrete_spike_xcorr(spikes1, spikes2, max_lag) ¶

compute_sil(ipts, mupulses, compute_on_peaks_only=True, ignore_negative_ipts=None) ¶

compute_pnr(ipts, mupulses, fsamp, constrain_pulses=[True, 3], separate_paired_firings=True) ¶

compute_pulses_agreement_rate(estimated_pulses, reference_pulses, tolerance=1, method='dice_coefficient') ¶

`min_max_scaling(data=None, series_or_df=None, col_by_col=False)` ¶

`norm_xcorr(sig1, sig2, out='both')` ¶

`norm_twod_xcorr(sig1=None, sig2=None, mode='full', df1=None, df2=None)` ¶

`discrete_spike_xcorr(spikes1, spikes2, max_lag)` ¶

`compute_sil(ipts, mupulses, compute_on_peaks_only=True, ignore_negative_ipts=None)` ¶

`compute_pnr(ipts, mupulses, fsamp, constrain_pulses=[True, 3], separate_paired_firings=True)` ¶

`compute_pulses_agreement_rate(estimated_pulses, reference_pulses, tolerance=1, method='dice_coefficient')` ¶