SPH

class torch_ecg.databases.SPH(db_dir: str | bytes | PathLike | None = None, working_dir: str | bytes | PathLike | None = None, verbose: int = 1, **kwargs: Any)[source]

Bases: _DataBase

Shandong Provincial Hospital Database

ABOUT

  1. contains 25770 ECG records from 24666 patients (55.36% male and 44.64% female), with record lengths between 10 and 60 seconds

  2. sampling frequency is 500 Hz

  3. records were acquired from Shandong Provincial Hospital (SPH) between 2019/08 and 2020/08

  4. diagnostic statements of all ECG records are in full compliance with the AHA/ACC/HRS recommendations, consisting of 44 primary statements and 15 modifiers

  5. 46.04% records in the dataset contain ECG abnormalities, and 14.45% records have multiple diagnostic statements

  6. (IMPORTANT) noises caused by the power line interference, baseline wander, and muscle contraction have been removed by the machine

  7. (Label production) The ECG analysis system automatically calculates nine ECG features for reference: heart rate, P wave duration, P-R interval, QRS duration, QT interval, corrected QT (QTc) interval, QRS axis, the amplitude of the R wave in lead V5 (RV5), and the amplitude of the S wave in lead V1 (SV1). A cardiologist made the final diagnosis, taking the patient's health record into consideration.

  8. The dataset is described in [1], [2]. Data can be downloaded from [3]. The annotation system is described in [4].

Usage

  1. ECG arrhythmia detection

References

Citation

10.1038/s41597-022-01403-5 10.6084/m9.figshare.c.5779802.v1

Parameters:
  • db_dir (path-like, optional) – Storage path of the database. If not specified, data will be fetched from figshare.

  • working_dir (path-like, optional) – Working directory, to store intermediate files and log files.

  • verbose (int, default 1) – Level of logging verbosity.

  • kwargs (dict, optional) – Auxiliary keyword arguments.

property database_info: DataBaseInfo

The DataBaseInfo object of the database.

download(files: str | Sequence[str] | None) None[source]

Download the database from the figshare website.

get_age(rec: str | int) int[source]

Get the age of the subject that the record belongs to.

Parameters:

rec (int or str) – Record name or index of the record in all_records.

Returns:

age – Age of the subject.

Return type:

int

get_sex(rec: str | int) str[source]

Get the sex of the subject that the record belongs to.

Parameters:

rec (int or str) – Record name or index of the record in all_records.

Returns:

sex – Sex of the subject.

Return type:

str

get_siglen(rec: str | int) int[source]

Get the length of the ECG signal of the record.

Parameters:

rec (int or str) – Record name or index of the record in all_records.

Returns:

siglen – Length of the ECG signal of the record.

Return type:

int

get_subject_id(rec: str | int) str[source]

Get the unique subject ID attached to the record.

Parameters:

rec (str or int) – Record name or index of the record in all_records

Returns:

sid – Subject ID associated with the record.

Return type:

str

get_subject_info(rec_or_sid: str | int, items: List[str] | None = None) dict[source]

Read auxiliary information of a subject (a record) from the header files.

Parameters:
  • rec_or_sid (int or str) – Record name, or index of the record in all_records, or the subject ID.

  • items (List[str], optional) – Items of information to be returned (e.g. age, sex, etc.).

Returns:

subject_info – Information about the subject, including “age”, “sex”.

Return type:

dict

load_ann(rec: str | int, ann_format: str = 'c', ignore_modifier: bool = True) List[str][source]

Load annotation from the metadata file.

Parameters:
  • rec (int or str) – Record name or index of the record in all_records.

  • ann_format (str, default "c") –

    Format of labels, one of the following (case insensitive):

    • “a”: abbreviations

    • “f”: full names

    • “c”: AHACode

  • ignore_modifier (bool, default True) – Whether to ignore the modifiers of the annotations or not. For example, “60+310” will be converted to “60”

Returns:

labels – The list of labels.

Return type:

List[str]
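
The effect of ignore_modifier can be illustrated in isolation. This is a sketch under one assumption, not the library's implementation: modifiers are taken to be appended to the primary statement code with "+", as the "60+310" example suggests. strip_modifiers is a hypothetical helper name, and the input strings are illustrative only.

```python
from typing import List


def strip_modifiers(codes: List[str]) -> List[str]:
    """Keep only the primary statement code of each label.

    Assumes modifiers follow the primary code, separated by "+",
    so that "60+310" becomes "60". Labels without "+" are unchanged.
    """
    return [code.split("+", 1)[0] for code in codes]


# Illustrative inputs; only "60+310" appears in the documentation.
print(strip_modifiers(["60+310", "22"]))
```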

load_data(rec: str | int, leads: str | int | List[str | int] | None = None, data_format: str = 'channel_first', units: str = 'mV', return_fs: bool = False) ndarray[source]

Load ECG data from h5 file of the record.

Parameters:
  • rec (str or int) – Record name or index of the record in all_records.

  • leads (str or int or List[str] or List[int], optional) – The leads of the ECG data to be loaded.

  • data_format (str, default "channel_first") – Format of the ECG data, “channel_last” (alias “lead_last”), or “channel_first” (alias “lead_first”).

  • units (str, default "mV") – Units of the output signal, can also be “μV” (alias “uV”, “muV”).

  • return_fs (bool, default False) – Whether to return the sampling frequency of the output signal.

Returns:

  • data (numpy.ndarray) – The loaded ECG data.

  • data_fs (numbers.Real, optional) – Sampling frequency of the output signal. Returned if return_fs is True.
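
The data_format and units options amount to a transpose and a rescaling. The sketch below shows that relationship on a synthetic array; it is not the library's code. format_signal is a hypothetical helper, and it assumes the input is stored as (n_leads, n_samples) in mV.

```python
import numpy as np


def format_signal(
    data_mv: np.ndarray,
    data_format: str = "channel_first",
    units: str = "mV",
) -> np.ndarray:
    """Return `data_mv` in the requested layout and units.

    `data_mv` is assumed to have shape (n_leads, n_samples), in mV.
    """
    fmt = data_format.lower()
    if fmt in ("channel_last", "lead_last"):
        out = data_mv.T  # (n_samples, n_leads)
    elif fmt in ("channel_first", "lead_first"):
        out = data_mv  # (n_leads, n_samples)
    else:
        raise ValueError(f"unsupported data_format: {data_format!r}")

    unit = units.lower()
    if unit in ("μv", "uv", "muv"):
        out = out * 1000.0  # 1 mV = 1000 μV
    elif unit != "mv":
        raise ValueError(f"unsupported units: {units!r}")
    return out


# Synthetic 12-lead signal: 10 seconds at 500 Hz, constant 1 mV.
signal = np.ones((12, 5000))
converted = format_signal(signal, data_format="channel_last", units="uV")
print(converted.shape)
```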

plot(rec: str | int, data: ndarray | None = None, ann: Sequence[str] | None = None, ticks_granularity: int = 0, leads: str | int | List[str | int] | None = None, same_range: bool = False, waves: Dict[str, Sequence[int]] | None = None, **kwargs: Any) None[source]

Plot the signals of a record or external signals (units in μV), with metadata (fs, labels, tranche, etc.), possibly also along with wave delineations.

Parameters:
  • rec (int or str) – Record name or index of the record in all_records.

  • data (numpy.ndarray, optional) – (12-lead) ECG signal to plot. Should be of the format “channel_first”, and compatible with leads. If not None, data of rec will not be used. This is useful when plotting filtered data.

  • ann (Sequence[str], optional) – Annotations for data. Ignored if data is None.

  • ticks_granularity (int, default 0) – Granularity to plot axis ticks, the higher the more ticks. 0 (no ticks) –> 1 (major ticks) –> 2 (major + minor ticks)

  • leads (str or int or List[str] or List[int], optional) – The leads of the ECG signal to plot.

  • same_range (bool, default False) – If True, forces all leads to have the same y range.

  • waves (dict, optional) – A dictionary containing the indices of the wave critical points, including “p_onsets”, “p_peaks”, “p_offsets”, “q_onsets”, “q_peaks”, “r_peaks”, “s_peaks”, “s_offsets”, “t_onsets”, “t_peaks”, “t_offsets”.

  • kwargs (dict, optional) – Additional keyword arguments to pass to matplotlib.pyplot.plot().

TODO

  1. Slice too long records, and plot separately for each segment.

  2. Plot waves using matplotlib.pyplot.axvspan().

Note

Locator of plt has a default MAXTICKS of 1000. Unless this number is modified, at most 40 seconds of signal can be plotted at once.
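
The 40-second limit is consistent with one tick every 0.04 s (one small square on standard 25 mm/s ECG paper), since 1000 × 0.04 s = 40 s. That tick spacing is an assumption here and is not stated in this documentation. Under the same assumption, the first TODO item could be approached by splitting a record into index ranges that each fit within the limit. split_for_plot is a hypothetical helper, not an existing method.

```python
from typing import List, Tuple

MAXTICKS = 1000        # default of matplotlib.ticker.Locator.MAXTICKS
TICK_SPACING_S = 0.04  # assumed minor-tick spacing, in seconds
MAX_PLOT_S = MAXTICKS * TICK_SPACING_S  # longest span plottable at once


def split_for_plot(
    siglen: int, fs: int = 500, max_seconds: float = MAX_PLOT_S
) -> List[Tuple[int, int]]:
    """Split a signal of `siglen` samples into (start, end) index pairs.

    Each segment covers at most `max_seconds` of signal; `end` is exclusive.
    """
    seg_len = int(round(max_seconds * fs))
    if seg_len <= 0:
        raise ValueError("max_seconds * fs must be at least one sample")
    return [
        (start, min(start + seg_len, siglen))
        for start in range(0, siglen, seg_len)
    ]


# A 60-second record at 500 Hz has 30000 samples.
print(split_for_plot(30000))
```

Each (start, end) pair could then be used to slice the signal before passing it to plot through the data argument.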

Contributors: Jeethan and WEN Hao

property url: Dict[str, str]

URL(s) for downloading the database.