pyheartlib.data_rpeak

Module Contents

Classes

RpeakData

Processes ECG records to make a dataset holding records along with metadata about signal excerpts.

ECGSequence

Generates samples of data in batches.

Functions

load_dataset([file_path])

Loads the dataset.

class pyheartlib.data_rpeak.RpeakData(base_path=None, remove_bl=False, lowpass=False, cutoff=45, order=15, progress_bar=True, **kwargs)

Bases: pyheartlib.data.Data, pyheartlib.data.DataSeq

Processes ECG records to make a dataset holding records along with metadata about signal excerpts.

It provides a method that generates metadata for signal excerpts using a sliding-window approach. For each excerpt of a signal, the onset, offset, and annotation are recorded. The metadata for an excerpt is a list structured as: [record_id, onset, offset, annotation]. The annotation for an excerpt is a list of zeros, except at any interval that contains an R-peak label. Example metadata for an excerpt: [10, 500, 800, [0, 0, 0, 'N', 0, 'N', 0, 'N', 0, …]]

Parameters:
  • base_path (str, optional) – Path of the main directory for storing the original and processed data, by default None

  • remove_bl (bool, optional) – If True, the baseline wander is removed from the original signals prior to extracting excerpts, by default False

  • lowpass (bool, optional) – Whether to apply a low-pass filter to the original signals, by default False

  • cutoff (int, optional) – Cutoff frequency of the low-pass filter, by default 45

  • order (int, optional) – Order of the low-pass filter, by default 15

  • progress_bar (bool, optional) – Whether to display a progress bar, by default True

  • processors (list, optional) – Ordered list of functions’ names for preprocessing the raw signals. Each function takes a one-dimensional NumPy array as its input and returns an array of the same length.

Example

>>> from pyheartlib.data_rpeak import RpeakData
>>> # Make an instance of RpeakData
>>> rpeak_data = RpeakData(
...     base_path="data", remove_bl=False, lowpass=False,
...     progress_bar=False)
>>> # Define records
>>> train_set = [201, 203]
>>> # Create the dataset
>>> # win_size specifies the length of the excerpts
>>> rpeak_data.save_dataset(
...     rec_list=train_set,
...     file_name="train.rpeak",
...     win_size=5 * 360,
...     stride=360,
...     interval=72,
... )
full_annotate(record)

Returns a signal along with an annotation of the same length.

Parameters:

record (dict) – Record as a dictionary with keys: signal, r_locations, r_labels, rhythms, rhythms_locations.

Returns:

Two items: (signal, full_ann).

First element is the original signal (1D ndarray).

Second element is a list of the same length as the original signal. It contains zeros everywhere except at R-peak indices, which hold the R-peak label instead. E.g.: [0, 0, 0, 'N', 0, 'N', 0, 'N', …]

Return type:

tuple
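The structure of the returned annotation can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the function name full_annotate_sketch and the toy record are hypothetical, and only the signal, r_locations, and r_labels keys listed above are used.

```python
def full_annotate_sketch(record):
    """Return (signal, full_ann); full_ann holds R-peak labels at R-peak indices."""
    signal = record["signal"]
    # Start with zeros, one entry per signal sample.
    full_ann = [0] * len(signal)
    # Place each R-peak label at its sample index.
    for loc, label in zip(record["r_locations"], record["r_labels"]):
        full_ann[loc] = label
    return signal, full_ann


# Toy record (hypothetical values) with R-peaks at indices 2 and 6.
record = {
    "signal": [0.0, 0.1, 0.9, 0.2, 0.0, 0.1, 1.0, 0.1],
    "r_locations": [2, 6],
    "r_labels": ["N", "V"],
}
signal, full_ann = full_annotate_sketch(record)
print(full_ann)  # [0, 0, 'N', 0, 0, 0, 'V', 0]
```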

gen_samples_info(annotated_records, win_size=30 * 360, stride=256, **kwargs)

Generates metadata for signal excerpts.

The metadata are generated using a sliding-window approach. For each excerpt of a signal, the onset, offset, and annotation are recorded. The metadata list for an excerpt is structured as: [record_id, onset, offset, annotation]

Parameters:
  • annotated_records (list) – List of records ([rec1_dict, …]). Each record is a dictionary with keys: signal, r_locations, r_labels, rhythms, rhythms_locations, full_ann.

  • win_size (int, optional) – Sliding window length, by default 30*360

  • stride (int, optional) – Stride of the sliding window, by default 256

  • interval (int, optional) – Controls the degree of granularity of labels in the annotation list, by default 36

  • binary (bool, optional) – If True, any non-zero label will be converted to 1 in the annotation list, by default False

Returns:

A nested list. Each inner list is structured as: [record_id, onset, offset, annotation].

The annotation is a list of zeros, except at any interval that contains an R-peak label.

E.g.: [[10, 500, 800, [0, 0, 0, 'N', 0, 'N', 0, 'N', 0, …]], …]

Return type:

list
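The sliding-window procedure described above can be sketched in pure Python as follows. This is an illustration under stated assumptions rather than the library's code: the function name is hypothetical, record_id is assumed to be the record's position in the input list, windows that would run past the end of the signal are assumed to be skipped, and each interval is assumed to take the first non-zero label it contains.

```python
def gen_samples_info_sketch(annotated_records, win_size, stride, interval, binary=False):
    """Build [record_id, onset, offset, annotation] entries with a sliding window."""
    samples_info = []
    for rec_id, rec in enumerate(annotated_records):
        full_ann = rec["full_ann"]
        # Assumption: only windows that fit entirely inside the signal are kept.
        for onset in range(0, len(full_ann) - win_size + 1, stride):
            offset = onset + win_size
            annotation = []
            # One label per interval-long sub-segment of the window.
            for i in range(onset, offset, interval):
                labels = [a for a in full_ann[i:min(i + interval, offset)] if a != 0]
                label = labels[0] if labels else 0
                if binary and label != 0:
                    label = 1
                annotation.append(label)
            samples_info.append([rec_id, onset, offset, annotation])
    return samples_info


# Toy record (hypothetical): 12 samples with R-peaks at indices 2 and 9.
rec = {"full_ann": [0, 0, "N", 0, 0, 0, 0, 0, 0, "V", 0, 0]}
info = gen_samples_info_sketch([rec], win_size=8, stride=4, interval=4)
print(info)  # [[0, 0, 8, ['N', 0]], [0, 4, 12, [0, 'V']]]
```

A larger interval gives a shorter, coarser annotation list; with interval=1 the annotation would have one entry per signal sample.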

class pyheartlib.data_rpeak.ECGSequence(data, samples_info, class_labels=None, binary=True, batch_size=128, raw=True, interval=36, shuffle=True)

Bases: tensorflow.keras.utils.Sequence

Generates samples of data in batches.

The excerpt for each sample is extracted based on the provided metadata. The use of metadata instead of excerpts has the advantage of reducing the RAM requirement, especially when numerous excerpts are required from the raw signals. By using metadata about the excerpts, they are extracted in batches whenever they are needed.
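The idea of storing metadata and slicing excerpts only when a batch is requested can be sketched as below. This is a simplified illustration, not the class's actual code: extract_batch is a hypothetical helper, the signal is treated as a plain one-dimensional list, and record_id is assumed to index into the data list.

```python
def extract_batch(data, samples_info, batch_no, batch_size):
    """Slice the excerpts for one batch from the raw signals on demand."""
    batch_meta = samples_info[batch_no * batch_size:(batch_no + 1) * batch_size]
    batch_x, batch_y = [], []
    for rec_id, onset, offset, annotation in batch_meta:
        # Only this batch's excerpts are materialized in memory.
        batch_x.append(data[rec_id]["signal"][onset:offset])
        batch_y.append(annotation)
    return batch_x, batch_y


# Toy inputs (hypothetical): one 12-sample record and two overlapping excerpts.
data = [{"signal": list(range(12))}]
samples_info = [[0, 0, 8, ["N", 0]], [0, 4, 12, [0, "V"]]]
batch_x, batch_y = extract_batch(data, samples_info, batch_no=0, batch_size=2)
print(batch_x[1])  # [4, 5, 6, 7, 8, 9, 10, 11]
```

The two excerpts overlap by four samples, yet the signal itself is stored once; only the small metadata entries are duplicated per excerpt.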

Parameters:
  • data (list) – A list containing a dictionary for each record: [rec1, rec2, …]. Each record is a dictionary with keys: signal, r_locations, r_labels, rhythms, rhythms_locations, full_ann.

  • samples_info (list) – A nested list of metadata for excerpts. For each excerpt, the metadata is structured as a list: [record_id, onset, offset, annotation]. E.g.: [[10, 500, 800, [0, 0, 0, 'N', 0, 'N', 0, 'N', 0, …]], …].

  • class_labels (list, optional) – Classes as a list for converting the output annotations to integers such as: [0, "N", "V"] => [0, 1, 2], by default None

  • batch_size (int, optional) – Number of samples in each batch, by default 128

  • binary (bool, optional) – If True, any non-zero label will be converted to 1 in the output annotation list, by default True

  • raw (bool, optional) – Whether to return the waveform or the computed features, by default True

  • interval (int, optional) – Sub-segmenting interval for feature computations and label assignments. Controls the degree of granularity for labels in the output annotation list and the number of sub-segments, by default 36

  • shuffle (bool, optional) – If True, after each epoch the samples are shuffled, by default True

Examples

>>> from pyheartlib.data_rpeak import ECGSequence
>>> trainseq = ECGSequence(
...     annotated_records, samples_info, binary=False, batch_size=2,
...     raw=True, interval=72)

Notes

Indexing an ECGSequence object, as in ECGSequence_object[BatchNo], returns a tuple containing two elements.

The first element (Batch_x) contains the data samples and the second one (Batch_y) contains their associated annotations.

Batch_x contains the signal excerpts or their features (Batch_wave).

If raw is False, Batch_wave has the shape (Batch size, Number of channels, Number of sub-segments, Number of features); otherwise, it has the shape (Batch size, Number of channels, Length of excerpt).

Batch_y has the shape (Batch size, Length of annotation list).
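As a rough sanity check of these shapes for raw=True, the sketch below computes the expected dimensions from the window and interval settings. It assumes, without confirmation from the source, that the annotation list has one label per interval-long sub-segment of the excerpt; expected_batch_shapes is a hypothetical helper.

```python
import math


def expected_batch_shapes(batch_size, n_channels, win_size, interval):
    """Expected (batch_x, batch_y) shapes when raw=True, under the stated assumption."""
    # Assumption: one label per interval-long sub-segment of the excerpt.
    len_annotation = math.ceil(win_size / interval)
    batch_x_shape = (batch_size, n_channels, win_size)
    batch_y_shape = (batch_size, len_annotation)
    return batch_x_shape, batch_y_shape


# Values taken from the examples in this module: win_size=5 * 360, interval=72.
x_shape, y_shape = expected_batch_shapes(
    batch_size=2, n_channels=1, win_size=5 * 360, interval=72)
print(x_shape, y_shape)  # (2, 1, 1800) (2, 25)
```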

__len__()
__getitem__(idx)
Returns:

A tuple (batch_x, batch_y) of NumPy arrays.

batch_x has the shape (batch_size, #channels, #sub-segments, #features) if raw is False; otherwise, it has the shape (batch_size, #channels, len_seq).

batch_y has the shape (batch_size, len_annotation_list).

Return type:

tuple

on_epoch_end()

Shuffles the samples after each epoch.

get_integer(ann)

Converts labels in an annotation list to integers.
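One plausible reading of this conversion, based on the class_labels example [0, "N", "V"] => [0, 1, 2], is that each label maps to its position in class_labels. The sketch below illustrates that reading only; to_integers is a hypothetical stand-in, not the method itself.

```python
def to_integers(ann, class_labels):
    """Map each label to its index in class_labels (assumed behavior)."""
    return [class_labels.index(label) for label in ann]


class_labels = [0, "N", "V"]
int_ann = to_integers([0, "N", 0, "V", "N"], class_labels)
print(int_ann)  # [0, 1, 0, 2, 1]
```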

get_binary(ann)

Converts labels in an annotation list to 0 or 1.
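The binary conversion collapses every R-peak label into a single positive class. A minimal sketch of that behavior, with to_binary as a hypothetical stand-in for the method:

```python
def to_binary(ann):
    """Replace every non-zero label with 1 and keep zeros as 0."""
    return [0 if label == 0 else 1 for label in ann]


bin_ann = to_binary([0, "N", 0, "V", 0])
print(bin_ann)  # [0, 1, 0, 1, 0]
```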

gen_annotation(seg)

Generates the annotation list.

The returned list contains zeros except for any interval that has an R-peak label.

compute_wf_feats(seq)
get_wf_feats_names()

Gets the waveform feature names.

pyheartlib.data_rpeak.load_dataset(file_path=None)

Loads the dataset.

Parameters:

file_path (str, optional) – Path of the dataset, by default None