`pyheartlib.data_beat`

Module Contents

Classes

BeatData

Processes the provided ECG records and creates a dataset containing waveforms, features, and annotations.

class pyheartlib.data_beat.BeatData(base_path=None, win=[60, 120], num_pre_rr=10, num_post_rr=10, remove_bl=False, lowpass=False, cutoff=45, order=15, progress_bar=True, **kwargs)

Bases: pyheartlib.data.Data

Processes the provided ECG records and creates a dataset containing waveforms, features, and annotations.

Parameters:

base_path (str, optional) – Path of the main directory for storing the original and processed data, by default None
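win (list, optional) – [onset, offset] of the signal excerpt around each R-peak, in samples. Each excerpt starts onset samples before the R-peak and is onset + offset samples long, by default [60, 120]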
num_pre_rr (int, optional) – Number of preceding R-peak locations to be included for each beat, by default 10
num_post_rr (int, optional) – Number of subsequent R-peak locations to be included for each beat, by default 10
remove_bl (bool, optional) – If True, the baseline wander is removed from the original signals prior to extracting excerpts, by default False
lowpass (bool, optional) – Whether to apply a low-pass filter to the original signals, by default False
cutoff (int, optional) – Cutoff frequency of the low-pass filter, by default 45
order (int, optional) – Order of the low-pass filter, by default 15
progress_bar (bool, optional) – Whether to display a progress bar, by default True
processors (list, optional) – Ordered list of functions’ names for preprocessing the raw signals. Each function takes a one-dimensional NumPy array as its input and returns an array of the same length.

Examples

>>> beatdata = BeatData(base_path="./data", win=[200, 200],
...                     remove_bl=False, lowpass=False,
...                     progress_bar=True)
>>> # create a BeatInfo object
>>> beatinfo = BeatInfo()
>>> # save the dataset file
>>> beatdata.save_dataset_inter(DS1[17:18], beatinfo, file="train.beat")
>>> # load the dataset from file
>>> train_ds = beatdata.load_data(file_name="train.beat")
File loaded from: ./data/train.beat
-Shape of "waveforms" is (2985, 400). Number of samples is 2985.
-Shape of "beat_feats" is (2985, 27). Number of samples is 2985.
-Shape of "labels" is (2985,). Number of samples is 2985.
               N  L  R  j  e  V  E    A  S  a  J  F  f  /  Q
train.beat  2601  0  0  0  0  1  0  383  0  0  0  0  0  0  0

make_frags(signal, r_locations=None, r_label=None)

Fragments one signal into beats and returns the signal excerpts and corresponding labels.

Parameters:

signal (list) – A list containing signal values.
r_locations (list) – A list containing the R-peak locations on the signal.
r_label (list) – A list containing the R-peak (beat) labels.

Returns:

signal_frags (numpy.ndarray) – A 2D array containing the extracted beat excerpts.
beat_types (list) – The label of each beat excerpt.
r_locs (list) – A list of lists, each holding the preceding, current, and subsequent R-peak locations for one beat. Can be used for HRV calculations.
s_idxs (list) – The starting index of each extracted beat excerpt on the original signal, computed by subtracting the window onset from the R-peak location.

Return type:

Tuple
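The windowing described above can be illustrated with a minimal, library-independent sketch. The helper name fragment_beats is hypothetical, and skipping beats whose window runs past the ends of the signal is an assumption, not necessarily what make_frags does. The neighbouring R-peak lists (r_locs) are not modelled here.

```python
def fragment_beats(signal, r_locations, r_labels, win=(60, 120)):
    """Cut one excerpt per R-peak: win[0] samples before it, win[1] after it.

    Hypothetical helper for illustration only; not part of pyheartlib.
    """
    onset, offset = win
    frags, labels, starts = [], [], []
    for r, label in zip(r_locations, r_labels):
        start, end = r - onset, r + offset
        # Assumption: beats whose window does not fit in the signal are skipped.
        if start < 0 or end > len(signal):
            continue
        frags.append(signal[start:end])
        labels.append(label)
        starts.append(start)  # start index = R-peak location - window onset
    return frags, labels, starts


signal = list(range(1000))  # stand-in for one ECG channel
frags, labels, starts = fragment_beats(
    signal, [30, 300, 600, 950], ["N", "N", "V", "N"]
)
print(len(frags), labels, starts)
```

In this sketch every excerpt has onset + offset samples, and the R-peak sits at index onset inside its excerpt.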

make_dataset(records, beatinfo_obj=None)

Creates a dataset from the provided records.

Parameters:

records (list) – A list containing record IDs.
beatinfo_obj (instance of BeatInfo) – BeatInfo object used to compute the beat features.

Returns:

Dictionary with keys:

’waveforms’ (numpy.ndarray) – 2D array of beat waveforms.
’beat_feats’ (pd.DataFrame) – DataFrame of beats’ features.
’labels’ (numpy.ndarray) – 1D array of beats’ labels.

Return type:

dict

beat_info_feat(data, beatinfo_obj)

Provides the computed features for all the beats.

Parameters:

data (dict) –
Dictionary with keys:

’waveform’ (numpy.ndarray) – Array of waveforms with shape (#waveforms, len_waveforms, #channels).
’rpeak_locs’ (list) – List of R-peak locations.
’rec_ids’ (list) – List of record IDs.
’start_idxs’ (list) – List of start indexes of the waveforms on the raw signal.
’labels’ (list) – List of beat labels.

beatinfo_obj (instance of BeatInfo) – BeatInfo object used to compute the beat features.

Returns:

features (list) – Contains feature dictionaries for all the beats.
labels (list) – Contains the corresponding beat labels.

Return type:

Tuple

save_dataset_inter(records, beatinfo_obj, file=None)

Creates a dataset from the given record IDs and saves it to a file.

Parameters:

records (list) – List of record IDs.
beatinfo_obj (instance of BeatInfo) – BeatInfo object used to compute the beat features.
file (str) – Name of the file that will be saved.

save_dataset_intra(records, beatinfo_obj, split_ratio=0.3, file_prefix='intra')

Creates the dataset in an intra-patient way, so that the training and test sets contain beats from the same records.

Parameters:

records (list) – List of record IDs.
beatinfo_obj (instance of BeatInfo) – BeatInfo object used to compute the beat features.
split_ratio (float, optional) – Ratio of test set, by default 0.3
file_prefix (str, optional) – Prefix for the file names to be saved, by default ‘intra’

save_dataset_single(record, beatinfo_obj, split_ratio=0.3, file=None)

Saves the signal fragments and their labels into a file for a single record.

Parameters:

record (str) – Record ID.
beatinfo_obj (instance of BeatInfo) – BeatInfo object used to compute the beat features.
split_ratio (float, optional) – Ratio of test set, by default 0.3
file (str, optional) – Name of the file to be saved, by default None

load_data(file_name)

Loads a dataset file.

Parameters:

file_name (str) – File name. The final path is the join of the base path and the file name.

Returns:

Dataset with keys: “waveforms”, “beat_feats”, and “labels”.

Return type:

dict

report_stats(yds_list)

Counts the number of samples for each label type in the data.

Parameters:

yds_list (list) – List containing several label data sets, e.g. train, val, test.

Returns:

A list of dictionaries, one per data set. Keys are label types (symbols) and values are the counts of each symbol.

Return type:

list
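The described return structure (one dictionary of symbol counts per label set) can be reproduced with the standard library. This is a sketch of the idea, not the library's implementation, and count_labels is a hypothetical name.

```python
from collections import Counter


def count_labels(yds_list):
    """Return one {symbol: count} dictionary per label data set."""
    return [dict(Counter(labels)) for labels in yds_list]


train = ["N", "N", "A", "N"]
test = ["N", "V"]
stats = count_labels([train, test])
print(stats)  # [{'N': 3, 'A': 1}, {'N': 1, 'V': 1}]
```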

report_stats_table(yds_list, name_list=[])

Returns the number of samples for each label type in the data.

Parameters:

yds_list (list) – List containing several label data sets, e.g. train, val, test.
name_list (list, optional) – A list of strings naming the label data sets, e.g. train, val, test, by default []

Returns:

A dataframe containing symbols and their counts.

Return type:

pandas.DataFrame

per_record_stats(rec_ids_list=None, cols=None)

Returns a dataframe containing the number of beats of each label type in each record.

Parameters:

rec_ids_list (list, optional) – List of record IDs, by default None
cols (list, optional) – List of label classes, by default None

Returns:

A dataframe containing the count of each label type per record.

Return type:

pandas.DataFrame

slice_data(ds, labels)

Returns the data according to the provided annotation list.

Parameters:

ds (dict) – Dataset with keys: “waveforms”, “beat_feats”, and “labels”.
labels (list) – List of labels to be kept in the output.

Returns:

Dataset with keys: “waveforms”, “beat_feats”, and “labels”.

Return type:

dict
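As a rough illustration of slicing by label, the sketch below keeps only the samples whose label appears in the requested list. It uses plain lists in place of the NumPy arrays and DataFrame held by a real dataset, and slice_by_labels is a hypothetical name.

```python
def slice_by_labels(ds, labels):
    """Keep only the samples whose label is in `labels` (illustrative only)."""
    wanted = set(labels)
    keep = [i for i, lab in enumerate(ds["labels"]) if lab in wanted]
    # Apply the same row selection to every part of the dataset.
    return {
        key: [ds[key][i] for i in keep]
        for key in ("waveforms", "beat_feats", "labels")
    }


ds = {
    "waveforms": [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    "beat_feats": [[1.0], [2.0], [3.0]],
    "labels": ["N", "V", "N"],
}
sliced = slice_by_labels(ds, ["V"])
print(sliced["labels"])  # ['V']
```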

search_label(inp, sym='N')

Searches the provided data and returns the indexes for a particular label.

Parameters:

inp (dict or numpy.ndarray) – Input can be a dictionary having a ‘labels’ key, or a 1D numpy array containing labels.
sym (str, optional) – The label to be searched for in the dataset, by default ‘N’

Returns:

A list of indexes corresponding to the searched label.

Return type:

list

Raises:

TypeError – Input data must be a dictionary or a numpy array.
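The lookup behaviour described above, including the TypeError for unsupported input, can be sketched as follows. A plain list stands in for the 1D NumPy array, and find_label is a hypothetical name rather than the library function.

```python
def find_label(inp, sym="N"):
    """Return the indexes at which `sym` occurs (illustrative only)."""
    if isinstance(inp, dict):
        labels = inp["labels"]
    elif isinstance(inp, (list, tuple)):
        # Assumption: a list or tuple stands in for the 1D NumPy array.
        labels = inp
    else:
        raise TypeError("Input data must be a dictionary or a sequence of labels.")
    return [i for i, lab in enumerate(labels) if lab == sym]


print(find_label({"labels": ["N", "V", "N", "A"]}))  # [0, 2]
print(find_label(["N", "V", "N", "A"], sym="V"))     # [1]
```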

clean_inf_nan(ds)

Cleans the dataset by removing samples (rows) with inf or nan in computed features.

Parameters:

ds (dict) – Dataset with keys: “waveforms”, “beat_feats”, and “labels”.

Returns:

Cleaned dataset with keys: “waveforms”, “beat_feats”, and “labels”.

Return type:

dict
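A minimal sketch of this kind of cleaning: a sample is dropped from every part of the dataset when any of its features is inf or nan. Lists stand in for the real arrays and DataFrame, and drop_non_finite is a hypothetical name.

```python
import math


def drop_non_finite(ds):
    """Remove samples whose feature row contains inf or nan (illustrative only)."""
    keep = [
        i
        for i, row in enumerate(ds["beat_feats"])
        if all(math.isfinite(v) for v in row)
    ]
    # The same rows are removed from waveforms, features, and labels.
    return {key: [ds[key][i] for i in keep] for key in ds}


ds = {
    "waveforms": [[0.1], [0.2], [0.3], [0.4]],
    "beat_feats": [[0.8, 1.0], [float("nan"), 1.1], [0.9, float("inf")], [0.7, 0.9]],
    "labels": ["N", "N", "V", "A"],
}
clean = drop_non_finite(ds)
print(clean["labels"])  # ['N', 'A']
```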

clean_IQR(ds, factor=1.5, return_indexes=False)

Cleans the dataset by removing outliers using the interquartile range (IQR) method.

Parameters:

ds (dict) – Dataset with keys: “waveforms”, “beat_feats”, and “labels”.
factor (float, optional) – Multiplier of the IQR used to set the outlier bounds, by default 1.5
return_indexes (bool, optional) – If True, returns the indexes of the outliers; otherwise returns the cleaned dataset, by default False

Returns:

Cleaned dataset with keys: “waveforms”, “beat_feats”, and “labels”, or indexes of outliers.

Return type:

dict or list
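For reference, the usual IQR rule flags a value as an outlier when it lies below Q1 - factor * IQR or above Q3 + factor * IQR. The sketch below applies that rule to a single feature column. How the library combines several features, and which quantile method it uses, is not stated here, so both are assumptions, and iqr_outlier_indexes is a hypothetical name.

```python
import statistics


def iqr_outlier_indexes(values, factor=1.5):
    """Return indexes of values outside [Q1 - factor*IQR, Q3 + factor*IQR]."""
    # Assumption: quartiles are computed with the "inclusive" method.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - factor * iqr, q3 + factor * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


rr_intervals = [1, 2, 3, 4, 5, 6, 7, 8, 100]  # toy feature column
outliers = iqr_outlier_indexes(rr_intervals)
print(outliers)  # [8]
```

Here Q1 = 3 and Q3 = 7, so the bounds are -3 and 13 and only the value 100 is flagged. Returning the indexes corresponds to return_indexes=True; removing those rows would give the cleaned dataset.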

clean_IQR_class(ds, factor=1.5)

Cleans the dataset using the IQR method, applied to every class separately.

Parameters:

ds (dict) – Dataset with keys: “waveforms”, “beat_feats”, and “labels”.
factor (float, optional) – Multiplier of the IQR used to set the outlier bounds, by default 1.5

Returns:

Cleaned dataset with keys: “waveforms”, “beat_feats”, and “labels”.

Return type:

dict

append_ds(ds1, ds2)

Appends two datasets.

Parameters:

ds1 (dict) – Dataset with keys: “waveforms”, “beat_feats”, and “labels”.
ds2 (dict) – Dataset with keys: “waveforms”, “beat_feats”, and “labels”.

Returns:

Dataset with keys: “waveforms”, “beat_feats”, and “labels”.

Return type:

dict
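Conceptually, appending two datasets means concatenating each of the three parts key by key. The sketch below shows this with plain lists; the real dataset holds NumPy arrays and a DataFrame, and append_datasets is a hypothetical name.

```python
def append_datasets(ds1, ds2):
    """Concatenate two datasets key by key (illustrative only)."""
    return {
        key: list(ds1[key]) + list(ds2[key])
        for key in ("waveforms", "beat_feats", "labels")
    }


a = {"waveforms": [[0.1]], "beat_feats": [[1.0]], "labels": ["N"]}
b = {"waveforms": [[0.2], [0.3]], "beat_feats": [[2.0], [3.0]], "labels": ["V", "N"]}
merged = append_datasets(a, b)
print(merged["labels"])  # ['N', 'V', 'N']
```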
