pyheartlib.data_rhythm
Module Contents
Classes
RhythmData – Processes ECG records to make a dataset holding records along with metadata about signal excerpts.
ECGSequence – Generates samples of data in batches.
Functions
load_dataset – Loads the dataset.
- class pyheartlib.data_rhythm.RhythmData(base_path=None, remove_bl=False, lowpass=False, cutoff=45, order=15, progress_bar=True, **kwargs)
Bases:
pyheartlib.data.Data, pyheartlib.data.DataSeq
Processes ECG records to make a dataset holding records along with metadata about signal excerpts.
It provides a method that generates metadata for signal excerpts using a sliding-window approach. For each excerpt of a signal, the onset, offset, and annotation are recorded. The metadata for an excerpt is a list structured as [record_id, onset, offset, annotation], where the annotation is a single label. Example metadata for an excerpt: [10, 500, 800, 'AFIB'].
- Parameters:
base_path (str, optional) – Path of the main directory for storing the original and processed data, by default None
remove_bl (bool, optional) – If True, the baseline wander is removed from the original signals prior to extracting excerpts, by default False
lowpass (bool, optional) – Whether to apply a low-pass filter to the original signals, by default False
cutoff (int, optional) – Cutoff frequency of the low-pass filter, by default 45
order (int, optional) – Order of the low-pass filter, by default 15
progress_bar (bool, optional) – Whether to display a progress bar, by default True
processors (list, optional) – Ordered list of functions’ names for preprocessing the raw signals. Each function takes a one-dimensional NumPy array as its input and returns an array of the same length.
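The contract for a processor described above (one-dimensional NumPy array in, array of the same length out) can be illustrated with a minimal sketch. The function name `zscore` and the choice of z-score normalization are assumptions made for illustration; how such a function is registered with RhythmData is not shown here.

```python
import numpy as np


def zscore(signal: np.ndarray) -> np.ndarray:
    """Hypothetical processor: standardize a 1D signal to zero mean, unit variance."""
    signal = np.asarray(signal, dtype=float)
    std = signal.std()
    if std == 0:
        # A constant signal has no spread; return zeros of the same length.
        return np.zeros_like(signal)
    return (signal - signal.mean()) / std


raw = np.array([1.0, 2.0, 3.0, 4.0])
processed = zscore(raw)  # same length as the input
```

Any function that keeps the output length equal to the input length fits the stated contract, so the excerpt onsets and offsets remain valid after preprocessing.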
Example
>>> from pyheartlib.data_rhythm import RhythmData
>>> # Make an instance of RhythmData
>>> rhythm_data = RhythmData(
...     base_path="data", remove_bl=False, lowpass=False,
...     progress_bar=False)
>>> # Define records
>>> train_set = [201, 203]
>>> # Create the dataset
>>> rhythm_data.save_dataset(
...     rec_list=train_set, file_name="train.arr", win_size=3600, stride=64
... )
- full_annotate(record)
Returns a signal along with an annotation of the same length.
- Parameters:
record (dict) – Record as a dictionary with keys: signal, r_locations, r_labels, rhythms, rhythms_locations.
- Returns:
Two items: (signal, full_ann).
First element is the original signal (1D ndarray).
Second element is a list that has the same length as the original signal with rhythm types as its elements: ['(N', 'AFIB', 'AFIB', …].
- Return type:
tuple
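One way to picture a per-sample annotation is sketched below. It is not the library's implementation. It assumes that each entry of rhythms_locations is the sample index where the corresponding rhythm begins and that a rhythm lasts until the next one starts. The helper name `expand_rhythms` is hypothetical.

```python
def expand_rhythms(signal_len, rhythms, rhythms_locations):
    """Illustrative only: build one rhythm label per sample.

    Assumes rhythms_locations[i] is the sample index where rhythms[i] starts
    and that each rhythm lasts until the next one begins.
    """
    full_ann = [None] * signal_len  # samples before the first onset stay unlabeled
    for i, (label, onset) in enumerate(zip(rhythms, rhythms_locations)):
        # The segment ends at the next onset, or at the end of the signal.
        if i + 1 < len(rhythms_locations):
            end = rhythms_locations[i + 1]
        else:
            end = signal_len
        for k in range(onset, min(end, signal_len)):
            full_ann[k] = label
    return full_ann


full_ann = expand_rhythms(8, ["(N", "AFIB"], [0, 3])
```

Here the first three samples are labeled '(N' and the remaining five 'AFIB', so the annotation list has the same length as the signal.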
- gen_samples_info(annotated_records, win_size=30 * 360, stride=36, **kwargs)
Generates metadata for signal excerpts.
The metadata are generated using a sliding-window approach. For each excerpt of a signal, the onset, offset, and annotation are recorded. The metadata list for an excerpt is structured as: [record_id, onset, offset, annotation].
- Parameters:
annotated_records (list) – List of records ([rec1_dict, …]). Each record is a dictionary with keys: signal, r_locations, r_labels, rhythms, rhythms_locations, full_ann.
win_size (int, optional) – Sliding window length, by default 30*360
stride (int, optional) – Stride of the sliding window, by default 36
- Returns:
A nested list. Each inner list is structured as: [record_id, onset, offset, annotation]. E.g.: [[10, 500, 800, 'AFIB'], [10, 700, 900, '(N'], …]
- Return type:
list
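The sliding-window idea can be sketched as follows. This is a simplified illustration, not pyheartlib's code. It assumes that offset equals onset plus win_size (exclusive) and that the single label of a window is the most frequent per-sample label. The rule the library actually uses is not stated here. The helper name `sliding_window_info` is hypothetical.

```python
from collections import Counter


def sliding_window_info(record_id, full_ann, win_size, stride):
    """Illustrative only: list [record_id, onset, offset, annotation] per window.

    Assumption: the window label is the most frequent per-sample label.
    """
    samples_info = []
    # Only windows that fit entirely inside the signal are kept.
    for onset in range(0, len(full_ann) - win_size + 1, stride):
        offset = onset + win_size
        label = Counter(full_ann[onset:offset]).most_common(1)[0][0]
        samples_info.append([record_id, onset, offset, label])
    return samples_info


full_ann = ["(N"] * 5 + ["AFIB"] * 7
info = sliding_window_info(10, full_ann, win_size=4, stride=4)
```

A smaller stride produces more, heavily overlapping windows from the same record, which is why only the metadata is stored rather than the excerpts themselves.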
- class pyheartlib.data_rhythm.ECGSequence(data, samples_info, class_labels=None, batch_size=128, raw=True, interval=36, shuffle=True, rri_output=True, rri_length=150)
Bases:
tensorflow.keras.utils.Sequence
Generates samples of data in batches.
The excerpt for each sample is extracted based on the provided metadata. Storing metadata instead of the excerpts themselves reduces the RAM requirement, especially when numerous excerpts are drawn from the raw signals. The excerpts are extracted in batches, only when they are needed.
- Parameters:
data (list) – A list containing a dictionary for each record: [rec1, rec2, …]. Each record is a dictionary with keys: signal, r_locations, r_labels, rhythms, rhythms_locations, full_ann.
samples_info (list) – A nested list of metadata for excerpts. For each excerpt, the metadata is structured as a list: [record_id, onset, offset, annotation]. E.g.: [[10, 500, 800, 'AFIB'], …].
class_labels (list, optional) – Classes as a list for converting the output annotations to integers such as: ["(N", "(VT"] => [0, 1], by default None
batch_size (int, optional) – Number of samples in each batch, by default 128
raw (bool, optional) – Whether to return the waveform or the computed features, by default True
interval (int, optional) – Interval for sub-segmenting the signal for waveform feature computation, by default 36
shuffle (bool, optional) – If True, after each epoch the samples are shuffled, by default True
rri_output (bool, optional) – Whether to return RR-intervals and their features in addition to waveforms; if False, only waveforms are returned, by default True
rri_length (int, optional) – Length of the output RR-interval list, which is zero-padded on the right side, by default 150
Examples
>>> from pyheartlib.data_rhythm import ECGSequence
>>> trainseq = ECGSequence(
...     annotated_records,
...     samples_info,
...     class_labels=None,
...     batch_size=3,
...     raw=True,
...     interval=36,
...     shuffle=False,
...     rri_output=True,
...     rri_length=25
... )
Notes
Indexing an ECGSequence object, as in ECGSequence_object[BatchNo], returns a tuple containing two elements.
The first element (Batch_x) contains data samples and the second one (Batch_y) their associated annotations.
If rri_output is True, Batch_x is a list of NumPy arrays: Batch_wave, Batch_rri, and Batch_rri_feat.
Batch_wave contains signal excerpts or their features, Batch_rri contains RR-intervals, and Batch_rri_feat contains RR-interval features.
If rri_output is False, Batch_x contains Batch_wave only.
If raw is False, Batch_wave has the shape of (Batch size, Number of channels, Number of sub-segments, Number of features). Otherwise, it has the shape of (Batch size, Number of channels, Length of excerpt).
Batch_y has the shape of (Batch size,).
- __len__()
- __getitem__(idx)
- Returns:
A tuple containing batch_x and batch_y.
If rri_output is True, batch_x is a list of NumPy arrays: batch_wave, batch_rri, and batch_rri_feat. If rri_output is False, batch_x contains batch_wave only.
If raw is False, batch_wave has the shape of (batch_size, #channels, #sub-segments, #features). Otherwise, it has the shape of (batch_size, #channels, wave_len).
batch_y has the shape of (batch_size, 1).
- Return type:
tuple
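The idea of extracting excerpts from metadata one batch at a time can be sketched with a small stand-in class. This is a simplified illustration and not the ECGSequence implementation. It assumes single-channel signals, treats record_id as an index into the data list, returns only the waveform part, and counts a final partial batch. The class name `TinySequence` is hypothetical.

```python
import math

import numpy as np


class TinySequence:
    """Hypothetical stand-in for a Keras-style sequence; not pyheartlib's code."""

    def __init__(self, data, samples_info, batch_size):
        self.data = data                  # list of record dicts with a "signal" key
        self.samples_info = samples_info  # [[record_id, onset, offset, annotation], ...]
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches per epoch, counting a final partial batch.
        return math.ceil(len(self.samples_info) / self.batch_size)

    def __getitem__(self, idx):
        rows = self.samples_info[idx * self.batch_size:(idx + 1) * self.batch_size]
        # Excerpts are sliced from the stored signals only now, per batch.
        # A channel axis is added so the shape is (batch, channels, length).
        batch_wave = np.stack(
            [self.data[rec]["signal"][on:off][np.newaxis, :] for rec, on, off, _ in rows]
        )
        batch_y = np.array([ann for _, _, _, ann in rows])
        return batch_wave, batch_y


data = [{"signal": np.arange(100, dtype=float)}]
samples_info = [[0, 0, 20, "(N"], [0, 10, 30, "(N"], [0, 20, 40, "AFIB"]]
seq = TinySequence(data, samples_info, batch_size=2)
batch_wave, batch_y = seq[0]
```

Only the raw signals and the small metadata list stay in memory. Overlapping excerpts are materialized for one batch at a time.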
- on_epoch_end()
Shuffles the samples after each epoch.
- get_integer(ann)
Converts a text label to integer.
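The mapping implied by class_labels (for example ["(N", "(VT"] => [0, 1]) can be sketched as a lookup by position. The helper below is an illustration under that assumption and not the method's actual code. The names `label_to_index` and `to_integer` are hypothetical.

```python
class_labels = ["(N", "(VT"]

# Each label maps to its position in class_labels.
label_to_index = {label: i for i, label in enumerate(class_labels)}


def to_integer(ann):
    """Illustrative only: convert a text label to its integer class."""
    return label_to_index[ann]  # raises KeyError for labels not in class_labels


ints = [to_integer(a) for a in ["(N", "(VT", "(N"]]
```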
- get_rri(rec_id, start, end)
Computes RR-intervals.
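A minimal sketch of computing RR-intervals for an excerpt and zero-padding them on the right to a fixed length, as described for rri_length. It is not the library's implementation. It assumes that intervals are expressed in samples, that only R-peaks inside [start, end) are used, and that longer lists are truncated. The helper name `rr_intervals` is hypothetical.

```python
def rr_intervals(r_locations, start, end, rri_length):
    """Illustrative only: RR-intervals (in samples) inside [start, end), right-padded."""
    peaks = [r for r in r_locations if start <= r < end]
    # Differences between consecutive R-peak locations.
    rri = [b - a for a, b in zip(peaks, peaks[1:])]
    rri = rri[:rri_length]                      # truncate if too long (assumption)
    return rri + [0] * (rri_length - len(rri))  # zero-pad on the right


rri = rr_intervals([100, 380, 650, 940, 1500], start=0, end=1000, rri_length=5)
```

Padding to a fixed length lets excerpts with different numbers of beats be stacked into one array per batch.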
- compute_rri_features(rri_array)
Computes some statistical features for RR-intervals.
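The statistical features are not enumerated in this section, so the sketch below uses a hypothetical set (mean, standard deviation, minimum, maximum) purely to illustrate the kind of summary that can be derived from a list of RR-intervals. The helper name `rri_features` is an assumption.

```python
import statistics


def rri_features(rri):
    """Hypothetical feature set; the features pyheartlib computes are not listed here."""
    return {
        "mean": statistics.fmean(rri),
        "std": statistics.pstdev(rri),  # population standard deviation
        "min": min(rri),
        "max": max(rri),
    }


feats = rri_features([280, 270, 290])
```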
- get_rri_features_names()
Gets the names of the RR-interval features.
- compute_wf_feats(seq)
Computes waveform features.
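Sub-segmenting an excerpt by a fixed interval and computing features per sub-segment can be sketched as below. This illustration assumes non-overlapping sub-segments of `interval` samples and uses placeholder features (max, min, mean). The features the library computes are not listed in this section, and the helper name `subsegment_features` is hypothetical.

```python
import numpy as np


def subsegment_features(excerpt, interval):
    """Illustrative only: per-sub-segment features for a 1D excerpt.

    The features (max, min, mean) are hypothetical placeholders.
    """
    excerpt = np.asarray(excerpt, dtype=float)
    n_sub = len(excerpt) // interval  # trailing samples that do not fill a segment are dropped
    segments = excerpt[: n_sub * interval].reshape(n_sub, interval)
    # One row per sub-segment, one column per feature.
    return np.stack(
        [segments.max(axis=1), segments.min(axis=1), segments.mean(axis=1)], axis=1
    )


feats = subsegment_features(np.arange(3600), interval=36)
```

Under these assumptions, an excerpt of 3600 samples with interval=36 yields 100 sub-segments, giving an array of shape (number of sub-segments, number of features) for one channel.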
- get_wf_feats_names()
Gets the names of the waveform features.
- pyheartlib.data_rhythm.load_dataset(file_path=None)
Loads the dataset.
- Parameters:
file_path (str, optional) – Path of the dataset, by default None