CPSC2021Dataset
- class torch_ecg.databases.datasets.CPSC2021Dataset(config: CFG, task: str, training: bool = True, lazy: bool = True, **reader_kwargs: Any)
-
Data generator for feeding data into PyTorch models using the CPSC2021 database.
Strategies for generating data and labels:
1. ECGs are preprocessed and stored in one folder.
2. Preprocessed ECGs are sliced with overlap to generate data and labels for different tasks:
the data files store fixed-length segments of the preprocessed ECGs,
the annotation files contain “qrs_mask” and “af_mask”.
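The slicing-with-overlap strategy described above can be sketched as follows. This is a minimal illustration and not the torch_ecg implementation: the helper name `slice_with_overlap`, the stride of 3000 samples, and the synthetic signal are all assumptions made for the example.

```python
import numpy as np


def slice_with_overlap(signal, mask, seglen, stride):
    """Cut a (n_lead, n_total) signal and its (n_total, 1) mask into
    fixed-length windows. Consecutive windows overlap by seglen - stride
    samples. Hypothetical helper, not part of torch_ecg."""
    n_total = signal.shape[1]
    segments = []
    for start in range(0, n_total - seglen + 1, stride):
        end = start + seglen
        segments.append((signal[:, start:end], mask[start:end]))
    return segments


# Synthetic 2-lead "preprocessed ECG" with an AF episode marked in the mask.
rng = np.random.default_rng(0)
ecg = rng.standard_normal((2, 15000))
af_mask = np.zeros((15000, 1), dtype=np.float32)
af_mask[4000:9000] = 1.0

# seglen=6000 matches the typical n_sample; stride=3000 is an assumed value.
segments = slice_with_overlap(ecg, af_mask, seglen=6000, stride=3000)
for data, label in segments:
    print(data.shape, label.shape, int(label.sum()))
```

Each window keeps the signal and its mask aligned on the same sample range, which is what lets one stored segment serve as both data and label.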
The tuple returned by __getitem__() depends on the task:
“qrs_detection”: (data, qrs_mask, None)
“rr_lstm”: (rr_seq, rr_af_mask, rr_weight_mask)
“main”: (data, af_mask, weight_mask)
where
data shape: (n_lead, n_sample)
qrs_mask shape: (n_sample, 1)
af_mask shape: (n_sample, 1)
weight_mask shape: (n_sample, 1)
rr_seq shape: (n_rr, 1)
rr_af_mask shape: (n_rr, 1)
rr_weight_mask shape: (n_rr, 1)
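To make the per-task tuples and shapes concrete, here is a small sketch that builds placeholder arrays with the documented shapes. It does not touch the real database: `dummy_item` is a hypothetical stand-in for `__getitem__()`, the arrays are zeros, and the sizes n_lead = 2, n_sample = 6000, n_rr = 30 are assumed values.

```python
import numpy as np

# Assumed sizes for illustration only.
n_lead, n_sample, n_rr = 2, 6000, 30


def dummy_item(task):
    """Return a placeholder tuple shaped like the documented output
    for the given task. Hypothetical helper, not part of torch_ecg."""
    if task == "qrs_detection":
        # (data, qrs_mask, None)
        return (np.zeros((n_lead, n_sample)), np.zeros((n_sample, 1)), None)
    if task == "rr_lstm":
        # (rr_seq, rr_af_mask, rr_weight_mask)
        return (np.zeros((n_rr, 1)), np.zeros((n_rr, 1)), np.ones((n_rr, 1)))
    if task == "main":
        # (data, af_mask, weight_mask)
        return (
            np.zeros((n_lead, n_sample)),
            np.zeros((n_sample, 1)),
            np.ones((n_sample, 1)),
        )
    raise ValueError(f"unknown task: {task!r}")


for task in ("qrs_detection", "rr_lstm", "main"):
    data, label, weight = dummy_item(task)
    weight_shape = None if weight is None else weight.shape
    print(task, data.shape, label.shape, weight_shape)

# Stacking several "main" items along a new first axis gives a batch.
batch = np.stack([dummy_item("main")[0] for _ in range(4)])
print(batch.shape)
```

Note that the third element is None for “qrs_detection”, so code consuming these tuples has to handle a missing weight mask.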
Typical values of n_sample and n_rr are 6000 and 30, respectively. n_lead is typically 2, which is the number of leads in the ECG signals of the CPSC2021 database.
- Parameters:
config (dict) – Configurations for the dataset, ref. CPSC2021TrainCfg. A simple example is as follows:
>>> from copy import deepcopy
>>> from torch_ecg.databases.datasets import CPSC2021Dataset
>>> config = deepcopy(CPSC2021TrainCfg)
>>> config.db_dir = "some/path/to/db"
>>> dataset = CPSC2021Dataset(config, task="main", training=True, lazy=False)
task (str) – Name of the task, one of “qrs_detection”, “rr_lstm”, “main”.
training (bool, default True) – If True, the training set will be loaded, otherwise the test (val) set will be loaded.
lazy (bool, default True) – If True, the data will not be loaded immediately; it will be loaded on demand instead.
**reader_kwargs (dict, optional) – Keyword arguments for the database reader class.
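The difference between lazy and eager loading can be illustrated with a toy class. This is only a sketch of the general pattern: `LazySegments`, its cache, and the `n_loads` counter are hypothetical, and whether CPSC2021Dataset caches loaded data in this way is an assumption, not something stated in this documentation.

```python
import numpy as np


class LazySegments:
    """Toy dataset showing the lazy-loading pattern.
    Hypothetical stand-in, not torch_ecg code."""

    def __init__(self, n_segments, lazy=True):
        self.n_segments = n_segments
        self._cache = {}
        self.n_loads = 0  # counts how often the simulated disk read ran
        if not lazy:
            # Eager mode: load every segment up front.
            for idx in range(n_segments):
                self._load(idx)

    def _load(self, idx):
        # Simulates reading one preprocessed segment from disk.
        self.n_loads += 1
        rng = np.random.default_rng(idx)
        self._cache[idx] = rng.standard_normal((2, 6000))
        return self._cache[idx]

    def __len__(self):
        return self.n_segments

    def __getitem__(self, idx):
        # Lazy mode: load on first access, then reuse the cached array.
        if idx not in self._cache:
            self._load(idx)
        return self._cache[idx]


lazy_ds = LazySegments(5, lazy=True)
print(lazy_ds.n_loads)   # nothing loaded at construction time
_ = lazy_ds[3]
_ = lazy_ds[3]
print(lazy_ds.n_loads)   # the repeated access reuses the cached segment

eager_ds = LazySegments(5, lazy=False)
print(eager_ds.n_loads)  # everything loaded at construction time
```

Lazy loading keeps construction fast and memory low at the cost of slower first access to each item; eager loading does the opposite.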
- load_preprocessed_data(rec: str) → ndarray
Load the preprocessed data of the record.
- Parameters:
rec (str) – Name of the record.
- Returns:
The pre-computed preprocessed ECG data of the record.
- Return type:
ndarray
- persistence(force_recompute: bool = False, verbose: int = 0) → None
Save the preprocessed data to disk.