CPSC2021Dataset

class torch_ecg.databases.datasets.CPSC2021Dataset(config: CFG, task: str, training: bool = True, lazy: bool = True, **reader_kwargs: Any)

Bases: ReprMixin, Dataset

Data generator for feeding data into pytorch models using the CPSC2021 database.

Strategies for generating data and labels:

1. ECGs are preprocessed and stored in one folder.
2. Preprocessed ECGs are sliced with overlap to generate data and labels for different tasks:

the data files store fixed-length segments of the preprocessed ECGs,

the annotation files contain “qrs_mask” and “af_mask”.
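The overlapping slicing described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the helper name `slice_with_overlap`, the stride of 3000 samples, and the synthetic recording are assumptions; only the segment length of 6000 samples and the 2 leads are taken from this page.

```python
import numpy as np


def slice_with_overlap(signal, mask, seglen=6000, stride=3000):
    """Cut a (n_lead, n_total) signal and its (n_total, 1) mask into
    fixed-length segments; consecutive segments overlap by seglen - stride.
    Hypothetical helper, not part of torch_ecg."""
    n_total = signal.shape[1]
    segments = []
    for start in range(0, n_total - seglen + 1, stride):
        end = start + seglen
        segments.append((signal[:, start:end], mask[start:end]))
    return segments


# Synthetic 2-lead recording of 15000 samples with an all-zero AF mask.
rng = np.random.default_rng(0)
ecg = rng.standard_normal((2, 15000))
af_mask = np.zeros((15000, 1))

segs = slice_with_overlap(ecg, af_mask)
print(len(segs), segs[0][0].shape, segs[0][1].shape)
# prints: 4 (2, 6000) (6000, 1)
```

A trailing remainder shorter than `seglen` is dropped in this sketch; whether the actual dataset pads or discards such remainders is not stated here.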

The tuple returned by __getitem__() depends on the task:

“qrs_detection”: (data, qrs_mask, None)

“rr_lstm”: (rr_seq, rr_af_mask, rr_weight_mask)

“main”: (data, af_mask, weight_mask)

where

data shape: (n_lead, n_sample)

qrs_mask shape: (n_sample, 1)

af_mask shape: (n_sample, 1)

weight_mask shape: (n_sample, 1)

rr_seq shape: (n_rr, 1)

rr_af_mask shape: (n_rr, 1)

rr_weight_mask shape: (n_rr, 1)

Typical values of n_sample and n_rr are 6000 and 30, respectively.

n_lead is typically 2, which is the number of leads in the ECG signal of the CPSC2021 database.
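The task-to-shape mapping listed above can be written down as a small lookup, which is handy for sanity-checking a batch before training. The function name `expected_shapes` is hypothetical and the defaults are the typical values quoted above; this is a sketch for illustration only.

```python
def expected_shapes(task, n_lead=2, n_sample=6000, n_rr=30):
    """Return the per-item shapes of the tuple listed above for a task.
    A None entry marks a tuple item that is itself None.
    Hypothetical helper, not part of torch_ecg."""
    if task == "qrs_detection":
        return ((n_lead, n_sample), (n_sample, 1), None)
    if task == "rr_lstm":
        return ((n_rr, 1), (n_rr, 1), (n_rr, 1))
    if task == "main":
        return ((n_lead, n_sample), (n_sample, 1), (n_sample, 1))
    raise ValueError(f"unknown task: {task!r}")


for task in ("qrs_detection", "rr_lstm", "main"):
    print(task, expected_shapes(task))
```

Given a real dataset item, each non-None entry could then be compared against the `shape` attribute of the corresponding array.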

Parameters:

config (dict) –

Configuration for the dataset; see CPSC2021TrainCfg for reference. A simple example:

>>> from copy import deepcopy
>>> from torch_ecg.databases.datasets import CPSC2021Dataset, CPSC2021TrainCfg
>>> config = deepcopy(CPSC2021TrainCfg)
>>> config.db_dir = "some/path/to/db"
>>> dataset = CPSC2021Dataset(config, task="main", training=True, lazy=False)

task (str) – Name of the task; the tasks described above are “qrs_detection”, “rr_lstm”, and “main”.
training (bool, default True) – If True, the training set is loaded; otherwise the test (validation) set is loaded.
lazy (bool, default True) – If True, the data is not loaded immediately but on demand.
**reader_kwargs (dict, optional) – Keyword arguments for the database reader class.

extra_repr_keys() → List[str]

Extra keys for __repr__() and __str__().

load_preprocessed_data(rec: str) → ndarray

Load the preprocessed data of the record.

Parameters:

rec (str) – Name of the record.

Returns:

The pre-computed preprocessed ECG data of the record.

Return type:

numpy.ndarray

persistence(force_recompute: bool = False, verbose: int = 0) → None

Save the preprocessed data to disk.

Parameters:

force_recompute (bool, default False) – Whether to force recomputation of the preprocessed data.
verbose (int, default 0) – Verbosity level for printing the progress.

Return type:

None
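The internals of persistence() are not documented on this page. As a generic sketch of what a force_recompute flag typically controls, the following writes one file per record and skips files that already exist unless recomputation is forced. The function `persist`, the record names, the `.npy` file format, and the mean-removal stand-in for preprocessing are all assumptions made for illustration.

```python
import tempfile
from pathlib import Path

import numpy as np


def persist(records, preprocess, out_dir, force_recompute=False, verbose=0):
    """Write one .npy file per record, skipping files that already exist
    unless force_recompute is True. Returns the names actually computed.
    Hypothetical helper, not part of torch_ecg."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    computed = []
    for name, raw in records.items():
        path = out_dir / f"{name}.npy"
        if path.exists() and not force_recompute:
            if verbose >= 1:
                print(f"skip {name}: already on disk")
            continue
        np.save(path, preprocess(raw))
        computed.append(name)
    return computed


def remove_mean(x):
    # Stand-in for real ECG preprocessing.
    return x - x.mean()


with tempfile.TemporaryDirectory() as tmp:
    recs = {"rec_a": np.ones((2, 100)), "rec_b": np.zeros((2, 100))}
    first = persist(recs, remove_mean, tmp)    # computes both records
    second = persist(recs, remove_mean, tmp)   # finds both on disk, skips
    forced = persist(recs, remove_mean, tmp, force_recompute=True)
```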

plot_seg(seg: str, ticks_granularity: int = 0) → None

Plot the segment.

Parameters:

seg (str) – Name of the segment, following a pattern like “S_1_1_0000193”.
ticks_granularity (int, default 0) – Granularity of the axis ticks; higher values draw more ticks: 0 (no ticks), 1 (major ticks), 2 (major and minor ticks).

Return type:

None

reset_task(task: str, lazy: bool = True) → None

Reset the task of the data generator.

Parameters:

task (str) – The task to be set.
lazy (bool, default True) – If True, the data is not loaded immediately but on demand.

Return type:

None
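How the task and lazy arguments interact can be illustrated with a small stand-in class. This is not the CPSC2021Dataset implementation: the class `LazyTaskData`, its `_load` placeholder, and the choice to load everything on first access are assumptions used only to show the eager versus on-demand distinction.

```python
class LazyTaskData:
    """Stand-in that mimics the documented `task` / `lazy` behaviour.
    Hypothetical class, not part of torch_ecg."""

    def __init__(self, task, lazy=True):
        self.reset_task(task, lazy)

    def reset_task(self, task, lazy=True):
        self.task = task
        self._cache = None  # drop anything loaded for the previous task
        if not lazy:
            self._cache = self._load()  # eager: load right away

    def _load(self):
        # Placeholder for reading segment files from disk.
        return [f"{self.task}_item_{i}" for i in range(3)]

    def __getitem__(self, index):
        if self._cache is None:  # first access when lazy=True
            self._cache = self._load()
        return self._cache[index]


ds = LazyTaskData("main", lazy=True)
loaded_before_access = ds._cache is not None  # False: nothing loaded yet
first_item = ds[0]                            # triggers loading
ds.reset_task("rr_lstm", lazy=False)          # switches task, loads eagerly
```

With lazy=True, construction and task switching stay cheap and the loading cost is paid on first access; with lazy=False the cost is paid up front.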