CACHET_CADB
- class torch_ecg.databases.CACHET_CADB(db_dir: str | bytes | PathLike | None = None, working_dir: str | bytes | PathLike | None = None, verbose: int = 1, **kwargs: Any)
Bases: _DataBase
CACHET-CADB: A Contextualized Ambulatory Electrocardiography Arrhythmia Dataset
ABOUT
The database has 259 days of contextualized ECG recordings from 24 patients and 1,602 manually annotated 10 s heart-rhythm samples.
The length of the ECG records in the CACHET-CADB varies from 24 h to 3 weeks.
The patient’s ambulatory context information (activities, movement acceleration, body position, etc.) is extracted for every 10 s interval cumulatively.
Nearly 11% of the ECG data in the database was found to be noisy.
The database [1] and the short-format database [2] are available for download; see also the GitHub repository [3].
Usage
ECG arrhythmia detection
Self-Supervised Learning
References
Citation
10.3389/fcvm.2022.893090 10.11583/DTU.14547264 10.11583/DTU.14547330
- Parameters:
db_dir (path-like, optional) – Storage path of the database. If not specified, data will be fetched from the DTU website.
working_dir (path-like, optional) – Working directory, to store intermediate files and log files.
verbose (int, default 1) – Level of logging verbosity.
kwargs (dict, optional) – Auxiliary keyword arguments.
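The constructor parameters above can be exercised with a small wrapper. This is a minimal sketch, not part of torch_ecg: `open_cachet_cadb` is a hypothetical helper name, and it assumes a local copy of the database already exists. It checks the directory first so that a mistyped path fails with a clear message, then passes `db_dir`, `working_dir`, and `verbose` through unchanged.

```python
from pathlib import Path
from typing import Any, Optional, Union


def open_cachet_cadb(
    db_dir: Union[str, Path],
    working_dir: Optional[Union[str, Path]] = None,
    verbose: int = 1,
) -> Any:
    """Hypothetical helper: create a CACHET_CADB reader for a local copy."""
    db_path = Path(db_dir).expanduser()
    # Fail early with a clear message instead of passing a bad path along.
    if not db_path.is_dir():
        raise FileNotFoundError(f"database directory not found: {db_path}")
    # Imported lazily so this module can be imported without torch_ecg installed.
    from torch_ecg.databases import CACHET_CADB

    return CACHET_CADB(db_dir=str(db_path), working_dir=working_dir, verbose=verbose)
```

Requiring an existing directory is a design choice of this sketch; drop the check if the library's own handling of a missing `db_dir` is preferred.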
- property database_info: DataBaseInfo
The DataBaseInfo object of the database.
- property df_metadata: DataFrame
The table of metadata of the records.
- download(files: str | Sequence[str] | None) → None
Download the database from the DTU website.
- get_absolute_path(rec: str | int, extension: str = 'signal-ecg') → Path
Get the absolute path of the signal folder of the record.
- Parameters:
- Returns:
Absolute path of the file.
- Return type:
- get_subject_info(rec_or_sid: str | int, items: List[str] | None = None) → Dict[str, str]
Read auxiliary information of a subject (a record) stored in the header files.
- Parameters:
- Returns:
subject_info – Information about the subject, including “age”, “gender”, “height”, “weight”.
- Return type:
- load_ann(rec: str | int, ann_format: str = 'pd') → DataFrame | ndarray | Dict[int | str, ndarray]
Load annotation from the metadata file.
- Parameters:
- Returns:
ann – The annotation of the record.
- Return type:
pandas.DataFrame or numpy.ndarray or dict
- load_context_ann(rec: str | int, sheet_name: str | None = None) → DataFrame | Dict[str, DataFrame]
Load context annotation.
- Parameters:
- Returns:
context_ann – Context annotations of the record.
- Return type:
pandas.DataFrame or dict
- load_context_data(rec: str | int, context_name: str, sampfrom: int | None = None, sampto: int | None = None, channels: str | int | List[str] | List[int] | None = None, units: str | None = None, fs: Real | None = None) → ndarray | DataFrame
Load context data (e.g. accelerometer, heart rate, etc.).
- Parameters:
rec (str or int) – Record name or index of the record in all_records.
context_name (str) – Context name, can be one of “acc”, “angularrate”, “hr_live”, “hrvrmssd_live”, “movementacceleration_live”, “press”, “marker”.
sampfrom (int, optional) – Start index of the data to be loaded.
sampto (int, optional) – End index of the data to be loaded.
channels (str or int or List[str] or List[int], optional) – Channels (names or indices) to be loaded. If None, all channels will be loaded.
units (str, optional) – Units of the output signal, currently can only be “default”; None for digital data, without digital-to-physical conversion.
fs (numbers.Real, optional) – Sampling frequency of the output signal. If not None, the loaded data will be resampled to this frequency, otherwise, the original sampling frequency will be used.
- Returns:
context_data – Context data in the “channel_first” format.
- Return type:
numpy.ndarray or pandas.DataFrame
Note
If the record does not have the specified context data, an empty array or DataFrame will be returned.
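Because a missing context stream comes back as an empty array or DataFrame instead of raising, callers may want to check for that case explicitly. The following is a minimal sketch under stated assumptions: `load_context_or_none` and `is_empty` are hypothetical helpers, not part of torch_ecg, and `reader` is assumed to be a `CACHET_CADB` instance (or any object with a compatible `load_context_data` method). The list of context names is copied from the parameter description above.

```python
from typing import Any, Optional

# Context names listed in the parameter description above.
CONTEXT_NAMES = (
    "acc", "angularrate", "hr_live", "hrvrmssd_live",
    "movementacceleration_live", "press", "marker",
)


def is_empty(data: Any) -> bool:
    """Return True for an empty DataFrame, array, or other sized container."""
    if hasattr(data, "empty"):  # pandas.DataFrame exposes .empty
        return bool(data.empty)
    size = getattr(data, "size", None)  # numpy.ndarray exposes .size
    if size is None:
        size = len(data)
    return size == 0


def load_context_or_none(reader: Any, rec: Any, context_name: str, **kwargs: Any) -> Optional[Any]:
    """Hypothetical wrapper: validate the name, return None when no data exists."""
    if context_name not in CONTEXT_NAMES:
        raise ValueError(f"unknown context_name {context_name!r}; expected one of {CONTEXT_NAMES}")
    data = reader.load_context_data(rec, context_name, **kwargs)
    return None if is_empty(data) else data
```

Extra keyword arguments such as `sampfrom`, `sampto`, `channels`, `units`, and `fs` are forwarded unchanged, so the wrapper adds only the name check and the empty-result check.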
- load_data(rec: str | int, sampfrom: int | None = None, sampto: int | None = None, data_format: str = 'channel_first', units: str | None = 'mV', fs: Real | None = None, return_fs: bool = False) → ndarray | Tuple[ndarray, Real]
Load physical (converted from digital) ECG data, or load digital signal directly.
- Parameters:
rec (str or int) – Record name or index of the record in all_records, or “short_format” (-1) to load data from the short format file.
sampfrom (int, optional) – Start index of the data to be loaded.
sampto (int, optional) – End index of the data to be loaded.
data_format (str, default "channel_first") – Format of the ECG data, “channel_last” (alias “lead_last”), or “channel_first” (alias “lead_first”), or “flat” (alias “plain”).
units (str or None, default "mV") – Units of the output signal, can also be “μV” (aliases “uV”, “muV”); None for digital data, without digital-to-physical conversion.
fs (numbers.Real, optional) – Sampling frequency of the output signal. If not None, the loaded data will be resampled to this frequency, otherwise, the original sampling frequency will be used.
return_fs (bool, default False) – Whether to return the sampling frequency of the output signal.
- Returns:
data (numpy.ndarray) – The loaded ECG data.
data_fs (numbers.Real, optional) – Sampling frequency of the output signal. Returned if return_fs is True.
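Since `sampfrom` and `sampto` are sample indices, a time window in seconds has to be converted before calling `load_data`. The sketch below shows one way to do that. `seconds_to_samples` and `load_window` are hypothetical helpers, `reader` is assumed to be a `CACHET_CADB` instance, and `native_fs` is a caller-supplied value: the sketch assumes the indices refer to the record's original sampling frequency, which the text above does not state.

```python
from typing import Any, Tuple


def seconds_to_samples(start_s: float, end_s: float, fs: float) -> Tuple[int, int]:
    """Convert a [start_s, end_s) window in seconds to sample indices."""
    if fs <= 0:
        raise ValueError("fs must be positive")
    if not 0 <= start_s < end_s:
        raise ValueError("expected 0 <= start_s < end_s")
    return int(round(start_s * fs)), int(round(end_s * fs))


def load_window(reader: Any, rec: Any, start_s: float, end_s: float, native_fs: float) -> Any:
    """Hypothetical wrapper: load one time window of ECG in mV, channel first.

    Assumes sampfrom/sampto index the signal at native_fs (not confirmed above).
    """
    sampfrom, sampto = seconds_to_samples(start_s, end_s, native_fs)
    # return_fs=True makes load_data return (data, data_fs) instead of data alone.
    return reader.load_data(
        rec,
        sampfrom=sampfrom,
        sampto=sampto,
        data_format="channel_first",
        units="mV",
        return_fs=True,
    )
```

For instance, a 10 s window starting at zero with a hypothetical `native_fs` of 250 Hz maps to `sampfrom=0` and `sampto=2500`. The `fs` argument of `load_data` is left unset here so that no resampling takes place.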