SPH
- class torch_ecg.databases.SPH(db_dir: str | bytes | PathLike | None = None, working_dir: str | bytes | PathLike | None = None, verbose: int = 1, **kwargs: Any)
Bases: _DataBase
Shandong Provincial Hospital Database
ABOUT
contains 25770 ECG records from 24666 patients (55.36% male and 44.64% female), with record lengths between 10 and 60 seconds
sampling frequency is 500 Hz
records were acquired from Shandong Provincial Hospital (SPH) between 2019/08 and 2020/08
diagnostic statements of all ECG records are in full compliance with the AHA/ACC/HRS recommendations, consisting of 44 primary statements and 15 modifiers
46.04% of the records in the dataset contain ECG abnormalities, and 14.45% of the records have multiple diagnostic statements
(IMPORTANT) noise caused by power line interference, baseline wander, and muscle contraction has been removed by the machine
(Label production) The ECG analysis system automatically calculates nine ECG features for reference: heart rate, P wave duration, P-R interval, QRS duration, QT interval, corrected QT (QTc) interval, QRS axis, the amplitude of the R wave in lead V5 (RV5), and the amplitude of the S wave in lead V1 (SV1). A cardiologist made the final diagnosis in consideration of the patient health record.
The database is described in [1], [2]. Data can be downloaded from [3]. The annotation system is described in [4].
Usage
ECG arrhythmia detection
References
Citation
10.1038/s41597-022-01403-5 10.6084/m9.figshare.c.5779802.v1
- Parameters:
db_dir (path-like, optional) – Storage path of the database. If not specified, data will be fetched from Physionet.
working_dir (path-like, optional) – Working directory, to store intermediate files and log files.
verbose (int, default 1) – Level of logging verbosity.
kwargs (dict, optional) – Auxiliary keyword arguments.
- property database_info: DataBaseInfo
The DataBaseInfo object of the database.
- download(files: str | Sequence[str] | None) → None
Download the database from the figshare website.
- get_subject_info(rec_or_sid: str | int, items: List[str] | None = None) → dict
Read auxiliary information of a subject (a record) from the header files.
- load_ann(rec: str | int, ann_format: str = 'c', ignore_modifier: bool = True) → List[str]
Load annotation from the metadata file.
- Parameters:
rec (int or str) – Record name or index of the record in all_records.
ann_format (str, default "c") –
Format of labels, one of the following (case insensitive):
“a”: abbreviations
“f”: full names
“c”: AHACode
ignore_modifier (bool, default True) – Whether to ignore the modifiers of the annotations or not. For example, “60+310” will be converted to “60”
- Returns:
labels – The list of labels.
- Return type:
List[str]
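The effect of ignore_modifier can be illustrated with a minimal sketch. It assumes that modifiers are appended to the primary statement with a "+" separator, as in the “60+310” example above. The helper strip_modifiers is hypothetical and is not part of torch_ecg.

```python
from typing import List


def strip_modifiers(codes: List[str], sep: str = "+") -> List[str]:
    """Keep only the primary statement of each annotation code.

    Hypothetical helper for illustration; assumes modifiers follow the
    primary statement and are joined with `sep`.
    """
    primary = []
    for code in codes:
        # "60+310" -> "60": everything after the first separator is a modifier
        primary.append(code.split(sep, 1)[0].strip())
    return primary


print(strip_modifiers(["60+310", "22"]))  # ['60', '22']
```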
- load_data(rec: str | int, leads: str | int | List[str | int] | None = None, data_format: str = 'channel_first', units: str = 'mV', return_fs: bool = False) → ndarray
Load ECG data from h5 file of the record.
- Parameters:
rec (str or int) – Record name or index of the record in all_records.
leads (str or int or List[str] or List[int], optional) – The leads of the ECG data to be loaded.
data_format (str, default "channel_first") – Format of the ECG data, “channel_last” (alias “lead_last”), or “channel_first” (alias “lead_first”).
units (str, default "mV") – Units of the output signal, can also be “μV” (alias “uV”, “muV”).
return_fs (bool, default False) – Whether to return the sampling frequency of the output signal.
- Returns:
data (numpy.ndarray) – The loaded ECG data.
data_fs (numbers.Real, optional) – Sampling frequency of the output signal. Returned if return_fs is True.
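The data_format and units options can be pictured with a small NumPy sketch. It assumes the stored signal is a channel-first array in mV. The helper convert_signal is hypothetical and only mirrors the documented option names and aliases; it is not the library's implementation.

```python
import numpy as np


def convert_signal(sig, data_format="channel_first", units="mV"):
    """Convert a channel-first signal in mV to the requested layout and units.

    Hypothetical helper for illustration; not part of torch_ecg.
    """
    fmt = data_format.lower()
    if fmt not in ("channel_first", "lead_first", "channel_last", "lead_last"):
        raise ValueError(f"unsupported data_format: {data_format}")
    unit = units.lower()
    if unit not in ("mv", "μv", "uv", "muv"):
        raise ValueError(f"unsupported units: {units}")

    out = np.asarray(sig, dtype=float)
    if unit != "mv":
        out = out * 1000.0  # 1 mV = 1000 μV
    if fmt in ("channel_last", "lead_last"):
        out = out.T  # (n_leads, n_samples) -> (n_samples, n_leads)
    return out


# A 10-second, 12-lead record at 500 Hz has shape (12, 5000) when channel-first.
x = np.zeros((12, 5000))
x[0, 0] = 0.5
y = convert_signal(x, data_format="channel_last", units="uV")
print(y.shape)
```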
- plot(rec: str | int, data: ndarray | None = None, ann: Sequence[str] | None = None, ticks_granularity: int = 0, leads: str | int | List[str | int] | None = None, same_range: bool = False, waves: Dict[str, Sequence[int]] | None = None, **kwargs: Any) → None
Plot the signals of a record or external signals (units in μV), with metadata (fs, labels, tranche, etc.), possibly also along with wave delineations.
- Parameters:
rec (int or str) – Record name or index of the record in all_records.
data (numpy.ndarray, optional) – (12-lead) ECG signal to plot. Should be of the format “channel_first”, and compatible with leads. If not None, data of rec will not be used. This is useful when plotting filtered data.
ann (Sequence[str], optional) – Annotations for data. Ignored if data is None.
ticks_granularity (int, default 0) – Granularity to plot axis ticks, the higher the more ticks. 0 (no ticks) –> 1 (major ticks) –> 2 (major + minor ticks)
leads (str or int or List[str] or List[int], optional) – The leads of the ECG signal to plot.
same_range (bool, default False) – If True, forces all leads to have the same y range.
waves (dict, optional) – A dictionary containing the indices of the wave critical points, including “p_onsets”, “p_peaks”, “p_offsets”, “q_onsets”, “q_peaks”, “r_peaks”, “s_peaks”, “s_offsets”, “t_onsets”, “t_peaks”, “t_offsets”.
kwargs (dict, optional) – Additional keyword arguments to pass to matplotlib.pyplot.plot().
TODO
Slice too long records, and plot separately for each segment.
Plot waves using matplotlib.pyplot.axvspan().
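The first TODO item could be approached along the following lines. This is a minimal sketch, not the library's implementation. It assumes that the 40-second plotting limit comes from minor ticks placed every 0.04 s, so that 1000 ticks cover 1000 × 0.04 s = 40 s. The helper split_for_plot and the constant MINOR_TICK_SEC are hypothetical names.

```python
from typing import List, Tuple

MAXTICKS = 1000        # default of matplotlib.ticker.Locator.MAXTICKS
MINOR_TICK_SEC = 0.04  # assumed minor grid spacing (1 mm at 25 mm/s)


def split_for_plot(
    n_samples: int,
    fs: int = 500,
    max_sec: float = MAXTICKS * MINOR_TICK_SEC,
) -> List[Tuple[int, int]]:
    """Split a record into (start, end) sample ranges of at most max_sec seconds."""
    seg_len = int(round(max_sec * fs))
    if seg_len <= 0:
        raise ValueError("max_sec * fs must be positive")
    return [
        (start, min(start + seg_len, n_samples))
        for start in range(0, n_samples, seg_len)
    ]


# A 60-second record at 500 Hz (30000 samples) needs two segments.
print(split_for_plot(30000))  # [(0, 20000), (20000, 30000)]
```

Each range could then be plotted on its own figure, which keeps every segment within the tick limit without changing MAXTICKS.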
Note
The Locator of plt has a default MAXTICKS of 1000. Unless this number is modified, at most 40 seconds of signal can be plotted at once.
Contributors: Jeethan, and WEN Hao