Skip to content

mathtools

Description

This module contains all the mathematical functions that are necessary for the library.


min_max_scaling(data=None, series_or_df=None, col_by_col=False)

Min-max scaling of pd.series, pd.dataframe or np.ndarray.

Min-max feature scaling is often simply referred to as normalisation, which rescales the dataset feature to a range of 0 - 1. It's calculated by subtracting the feature's minimum value from the value and then dividing it by the difference between the maximum and minimum value.

The formula looks like this: xnorm = x - xmin / xmax - xmin.

PARAMETER DESCRIPTION
data

The data to normalise.

TYPE: (Series, DataFrame or ndarray) DEFAULT: None

series_or_df

This parameter is deprecated and will be removed in future releases. Use "data" instead.

TYPE: None DEFAULT: None

col_by_col

When True, each column is normalised between 0 and 1. When False, all the data is normalised between 0 and 1. Use col_by_col=False if you want to preserve the original relative amplitude of the different columns. col_by_col=True is acceptable only for 1 and 2D data. If data has more than 2 dimensions, set col_by_col=False to normalise the whole data instead.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
dataect

The normalised data of the same type as the input.

TYPE: (Series, DataFrame or ndarray)


norm_xcorr(sig1, sig2, out='both')

Normalized cross-correlation of 2 signals.

PARAMETER DESCRIPTION
sig1

The two signals to correlate. These signals must be 1-dimensional and of same length and type.

TYPE: Series or ndarray

sig2

The two signals to correlate. These signals must be 1-dimensional and of same length and type.

TYPE: Series or ndarray

out

A string indicating the output value:

both The output is the greatest positive or negative cross-correlation value.

max The output is the maximum cross-correlation value.

TYPE: str {"both", "max"} DEFAULT: "both"

RETURNS DESCRIPTION
xcc

The cross-correlation value depending on "out".

TYPE: float

See also
  • norm_twod_xcorr : Normalised 2-dimensional cross-correlation of STAs of two MUS.


norm_twod_xcorr(sig1=None, sig2=None, mode='full', df1=None, df2=None)

Normalised 2-dimensional cross-correlation of 2, 2D signals.

The two inputs must have same shape. When this function is used to cross-correlate MUAPs obtained via STA, sig1 and sig2 should contain the unpacked STA of the first and second MU, respectively, without np.nan columns.

PARAMETER DESCRIPTION
sig1

The two signals to correlate. These signals must be 2-dimensional and of same length and type.

TYPE: DataFrame or ndarray DEFAULT: None

sig2

The two signals to correlate. These signals must be 2-dimensional and of same length and type.

TYPE: DataFrame or ndarray DEFAULT: None

mode

A string indicating the size of the output:

full The output is the full discrete linear cross-correlation of the inputs. (Default)

valid The output consists only of those elements that do not rely on the zero-padding. In 'valid' mode, either sta_mu1 or sta_mu2 must be at least as large as the other in every dimension.

same The output is the same size as in1, centered with respect to the 'full' output.

TYPE: str {"full", "valid", "same"} DEFAULT: "full"

df1

This parameter is deprecated and will be removed in future releases. Use "sig1" instead.

TYPE: None DEFAULT: None

df2

This parameter is deprecated and will be removed in future releases. Use "sig2" instead.

TYPE: None DEFAULT: None

RETURNS DESCRIPTION
normxcorr

The results of the normalised 2d cross-correlation.

TYPE: DataFrame

normxcorr_max

The maximum value of the 2d cross-correlation.

TYPE: float

See also
  • align_by_xcorr : to align the two STAs before calling norm_twod_xcorr.
  • unpack_sta : for unpacking the sta dict in a pd.DataFrame before passing it to norm_twod_xcorr.
  • pack_sta : for packing the sta pd.DataFrame in a dict where each matrix column corresponds to a dict key.

Examples:

Full steps to pass two dataframes to norm_twod_xcorr from the same EMG file.

  1. Load the EMG file and band-pass filter the raw EMG signal
  2. Sort the matrix channels and compute the spike-triggered average
  3. Extract the STA of the MUs of interest from all the STAs
  4. Unpack the STAs of single MUs and remove np.nan to pass them to norm_twod_xcorr
  5. Compute 2dxcorr to identify a common lag/delay
>>> import openhdemg.library as emg
>>> emgfile = emg.askopenfile(filesource="OTB", otb_ext_factor=8)
>>> emgfile = emg.filter_rawemg(emgfile, order=2, lowcut=20, highcut=500)
>>> sorted_rawemg = emg.sort_rawemg(
...     emgfile,
...     code="GR08MM1305",
...     orientation=180,
...     )
>>> sta = emg.sta(emgfile, sorted_rawemg, firings=[0, 50], timewindow=100)
>>> mu0 = 0
>>> mu1 = 1
>>> sta_mu1 = sta[mu0]
>>> sta_mu2 = sta[mu1]
>>> df1, _ = emg.unpack_sta(sta_mu1)
>>> no_nan_sta1 = df1.dropna(axis=1)
>>> df2, _ = emg.unpack_sta(sta_mu2)
>>> no_nan_sta2 = df2.dropna(axis=1)
>>> normxcorr_df, normxcorr_max = emg.norm_twod_xcorr(
...     no_nan_sta1,
...     no_nan_sta2,
... )
>>> normxcorr_max
0.7241553627564273
>>> normxcorr_df
            0             1             2               125       126
0   -0.000002 -1.467778e-05 -3.013564e-05 ... -1.052780e-06  0.000001
1   -0.000004 -2.818055e-05 -6.024427e-05 ... -4.452469e-06  0.000001
2   -0.000007 -4.192479e-05 -9.223725e-05 ... -1.549197e-05 -0.000002
3   -0.000009 -5.071660e-05 -1.174545e-04 ... -3.078518e-05 -0.000007
4   -0.000007 -4.841255e-05 -1.239106e-04 ... -4.232094e-05 -0.000012
..        ...           ...           ... ...           ...       ...
402  0.000005  1.641773e-05  3.994943e-05 ...  8.170792e-07 -0.000006
403 -0.000001  4.535878e-06  1.858700e-05 ...  2.087135e-06 -0.000003
404 -0.000004 -1.241530e-06  5.704194e-06 ...  1.027966e-05  0.000002
405 -0.000004 -1.693078e-06  1.054646e-06 ...  1.811828e-05  0.000007
406 -0.000002 -2.473282e-07  6.006046e-07 ...  1.605406e-05  0.000007


discrete_spike_xcorr(spikes1, spikes2, max_lag)

Compute a discrete cross-correlation between two spike trains over a limited range of lags.

This function counts coincidences between spikes in spikes1 and spikes2 at different temporal offsets (lags) ranging from -max_lag to +max_lag. It is intended for comparing the similarity of two spike trains (MUPULSES) within a specified maximum lag window.

PARAMETER DESCRIPTION
spikes1

Indices of spikes in the first spike train.

TYPE: list or array-like of int

spikes2

Indices of spikes in the second spike train.

TYPE: list or array-like of int

max_lag

Maximum lag (in sample points) to consider. The function computes cross-correlation at lags -max_lag, ..., 0, ..., +max_lag.

TYPE: int

RETURNS DESCRIPTION
xcorr

Cross-correlation values for each lag. The central element (xcorr[lag]) corresponds to zero lag, negative indices correspond to negative lags, and positive indices correspond to positive lags. Each value represents the number of coinciding spikes between spikes1 and the shifted spikes2.

TYPE: np.ndarray of shape (2*max_lag + 1,)

See also
  • remove_duplicates_within : Remove duplicate MUs within the same file based on discharge times.

Examples:

>>> import openhdemg.library as emg
>>> spikes1 = [1, 5, 10]
>>> spikes2 = [2, 5, 9]
>>> emg.discrete_spike_xcorr(spikes1, spikes2, max_lag=2)
array([0. 1. 1. 1. 0.])


compute_sil(ipts, mupulses, compute_on_peaks_only=True, ignore_negative_ipts=None)

Calculate the Silhouette score for a single MU.

The SIL is defined as the difference between the within-cluster sums of point-to-centroid distances and the same measure calculated between clusters. The output measure is normalised in a range between 0 and 1.

Since version 0.2.0

The behaviour of this function has changed to ensure a more accurate SIL estimation. To maintain the old behaviour, you can set compute_on_peaks_only=False.

PARAMETER DESCRIPTION
ipts

The decomposed source (or pulse train, IPTS[mu]) of the MU of interest.

TYPE: Series or ndarray

mupulses

The time of firing (MUPULSES[mu]) of the MU of interest.

TYPE: ndarray

compute_on_peaks_only

If True, the silhouette (SIL) score is computed using only the ipts peaks, rather than all values in the source signal. This can improve accuracy estimation by comparing MU spikes only against other candidate spikes, ignoring baseline or negative ipts values. If False, the noise cluster is defined as all samples not selected as MU spikes.

TYPE: bool DEFAULT: True

ignore_negative_ipts

This parameter is deprecated and will be removed in future releases. Please transform the 'ipts' before if needed. To replicate the behaviour of 'ignore_negative_ipts=True' you can use 'ipts * np.abs(ipts)'.

TYPE: None DEFAULT: None

RETURNS DESCRIPTION
sil

The SIL score.

TYPE: float64

See also
  • compute_pnr : to calculate the Pulse to Noise ratio of a single MU.

Examples:

Calculate the SIL score for the third MU (MU number 2).

>>> import openhdemg.library as emg
>>> emgfile = emg.emg_from_samplefile()
>>> mu_of_interest = 2
>>> sil_value = emg.compute_sil(
...     ipts=emgfile["IPTS"][mu_of_interest],
...     mupulses=emgfile["MUPULSES"][mu_of_interest],
... )
0.9155651076123175


compute_pnr(ipts, mupulses, fsamp, constrain_pulses=[True, 3], separate_paired_firings=True)

Calculate the pulse to noise ratio for a single MU.

PARAMETER DESCRIPTION
ipts

The decomposed source (or pulse train, IPTS[mu]) of the MU of interest.

TYPE: Series

mupulses

The time of firing (MUPULSES[mu]) of the MU of interest.

TYPE: ndarray

constrain_pulses

If constrain_pulses[0] == True, the times of firing are considered those in mupulses +- the number of samples specified in constrain_pulses[1]. If constrain_pulses[0] == False, the times of firing are estimated via a heuristic penalty funtion (see Notes). constrain_pulses[1] must be an integer (see Notes for instructions on how to set the appropriate value).

TYPE: list DEFAULT: [True, 3]

separate_paired_firings

Whether to treat differently paired and non-paired firings during the estimation of the signal/noise threshold (heuristic penalty funtion, see Notes). This is relevant only if constrain_pulses[0] == False. Otherwise, this argument is ignored.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
pnr

The PNR in decibels.

TYPE: float64

See also
  • compute_sil : to calculate the Silhouette score for a single MU.
Notes

The behaviour of the compute_pnr() function is determined by the argument constrain_pulses.

If constrain_pulses[0] == True, the times of firing are considered those in mupulses +- a number of samples specified in constrain_pulses[1]. The inclusion of the samples around the mupulses values allows to capture the full ipts corresponding to the time of firing (e.g., including also the raising and falling wedges). The appropriate value of constrain_pulses[1] must be determined by the user and depends on the sampling frequency. It is suggested to use 3 when the sampling frequency is 2000 or 2048 Hz and increase it if the sampling frequency is higher (e.g. use 6 at 4000 or 4096 Hz). With this approach, the PNR estimation is not related to the variability of the firings.

If constrain_pulses[0] == False, the ipts values are classified as firings or noise based on a threshold value (named "Pi" or "r") estimated from the euristic penalty funtion described in Holobar 2012, as proposed in Holobar 2014. If the variability of the firings is relevant, this apoproach should be preferred. Specifically: Pi = D ยท ฯ‡3,50 + CoVIDI + CoVpIDI Where: D is the median of the low-pass filtered instantaneous motor unit discharge rate (first-order Butterworth filter, cut-off frequency 3 Hz). ฯ‡3,50 stands for an indicator function that penalizes motor units with filtered discharge rate D below 3 pulses per second (pps) or above 50 pps: ฯ‡3,50 = 0 if D is between 3 and 50 or D if D is not between 3 and 50. Two separate coefficients of variation for inter-discharge interval (IDI) calculated as standard deviation (SD) of IDI divided by the mean IDI, are used. CoVIDI is the coefficient of variation for IDI of non-paired MUs discharges only, whereas CoVpIDI is the coefficient of variation for IDI of paired MUs discharges. Holobar 2012 considered MUs discharges paired whenever the second discharge was within 50 ms of the first. Paired discharges are typical in pathological tremor and the use of both CoVIDI and CoVpIDI accounts for this condition. However, this heuristic penalty function penalizes MUs firing during specific types of contractions like explosive contractions (MUs discharge up to 200 pps). Therefore, in this implementation of the PNR, we did not include the penalty based on MUs discharge. Additionally, the user can decide whether to adopt the two coefficients of variations to estimate Pi or not. If both are used, Pi would be calculated as: Pi = CoVIDI + CoVpIDI Otherwise, Pi would be calculated as: Pi = CoV_all_IDI

Examples:

Calculate the PNR value for the third MU (MU number 2) forcing the selction of the times of firing.

>>> import openhdemg.library as emg
>>> emgfile = emg.emg_from_samplefile()
>>> mu_of_interest = 2
>>> pnr_value = emg.compute_pnr(
...     ipts=emgfile["IPTS"][mu_of_interest],
...     mupulses=emgfile["MUPULSES"][mu_of_interest],
...     fsamp=emgfile["FSAMP"],
...     constrain_pulses=[True, 3],
... )

Calculate the PNR value for the third MU (MU number 2) selecting the times of firing based on the euristic penalty funtion described in Holobar 2012 and considering, separately, the paired and the non-paired firings.

>>> import openhdemg.library as emg
>>> emgfile = emg.emg_from_samplefile()
>>> mu_of_interest = 2
>>> pnr_value = emg.compute_pnr(
...     ipts=emgfile["IPTS"][mu_of_interest],
...     mupulses=emgfile["MUPULSES"][mu_of_interest],
...     fsamp=emgfile["FSAMP"],
...     constrain_pulses=[False],
...     separate_paired_firings=True,
... )


compute_pulses_agreement_rate(estimated_pulses, reference_pulses, tolerance=1, method='dice_coefficient')

Calculate the Rate of Agreement (ROA) between two MUPULSES sets.

Usually used to estimate the ROA between the MUPULSES of a single MU detected during the decomposition (or after the manual cleaning), with a reference array of values (for example, simulation ground truth or MUPULSES detected using different computational methods).

Currently, only the 'dice_coefficient' method is implemented. This parameter is included for future compatibility with alternative agreement metrics.

PARAMETER DESCRIPTION
estimated_pulses

The estimated time of firing (e.g., MUPULSES[mu]) of the MU of interest.

TYPE: ndarray

reference_pulses

The reference time of firing (e.g., REFERENCE_MUPULSES[mu]) of the same MU of interest.

TYPE: ndarray

tolerance

Maximum accepted absolute sample difference between an estimated and reference discharge for them to be counted as a match. A value of 0 means exact sample matching. tolerance=n allows ±n samples of shift.

TYPE: int DEFAULT: 1

method

The metric used to calculate agreement. Currently only supports "dice_coefficient" (default), which calculates ROA as: (2 * TP) / (2 * TP + FP + FN) * 100.

TYPE: str DEFAULT: 'dice_coefficient'

RETURNS DESCRIPTION
roa

The ROA in %.

TYPE: float64