SHHS

class torch_ecg.databases.SHHS(db_dir: str | bytes | PathLike | None = None, working_dir: str | bytes | PathLike | None = None, verbose: int = 1, **kwargs: Any)

Bases: NSRRDataBase, PSGDataBaseMixin

Sleep Heart Health Study

ABOUT

ABOUT the dataset (Main webpage [1]):

  1. shhs1 (Visit 1):

    • the baseline clinic visit and polysomnogram performed between November 1, 1995 and January 31, 1998

    • in all, 6,441 men and women aged 40 years and older were enrolled

    • 5,804 rows, down from the original 6,441 due to data sharing rules on certain cohorts and subjects

  2. shhs-interim-followup (Interim Follow-up):

    • an interim clinic visit or phone call 2-3 years after baseline (shhs1)

    • 5,804 rows; although some subjects do not have complete data, all original subjects are present in the dataset

  3. shhs2 (Visit 2):

    • the follow-up clinic visit and polysomnogram performed between January 2001 and June 2003

    • during this exam cycle 3, a second polysomnogram was obtained in 3,295 of the participants

    • 4,080 rows, not all cohorts and subjects took part

  4. shhs-cvd (CVD Outcomes):

    • the tracking of adjudicated heart health outcomes (e.g. stroke, heart attack) between baseline (shhs1) and 2008-2011 (varies by parent cohort)

    • 5,802 rows, outcomes data were not provided on all subjects

  5. shhs-cvd-events (CVD Outcome Events):

    • event-level details for the tracking of heart health outcomes (shhs-cvd)

    • 4,839 rows, representing individual events

  6. ECG was sampled at 125 Hz in shhs1 and 250/256 Hz in shhs2

  7. annotations-events-nsrr and annotations-events-profusion: both annotation sets consist of XML files; the former were processed in the EDF Editor and Translator tool, and the latter were exported from Compumedics Profusion

  8. about 10% of the records have HRV (including sleep stages and sleep events) annotations

DATA Analysis Tips:

  1. Respiratory Disturbance Index (RDI):

    • A number of RDI variables exist in the data set. These variables are highly skewed.

    • A log transformation is recommended. Of the transformations tried, the following performed best, at least in some subsets:

      \[NEWVAR = \log(OLDVAR + 0.1)\]

  2. Obstructive Apnea Index (OAI):

    • There is one OAI index in the data set. It reflects obstructive events associated with a 4% desaturation or arousal. Nearly 30% of the cohort has a zero value for this variable

    • Dichotomization is suggested (e.g. >=3 or >=4 events per hour indicates positive)

  3. Central Apnea Index (CAI):

    • Several variables describe central breathing events, with different thresholds for desaturation and requirement/non-requirement of arousals. ~58% of the cohort have zero values

    • Dichotomization is suggested (e.g. >=3 or >=4 events per hour indicates positive)

  4. Sleep Stages:

    • Stage 1 and stage 3-4 are not normally distributed, but stage 2 and REM sleep are.

    • To use these data as continuous dependent variables, stages 1 and 3-4 must be transformed. The following formula is suggested:

      \[-\log(-\log(val/100 + 0.001))\]

  5. Sleep time below 90% O2:

    • The percentages of total sleep time with oxygen levels below 75%, 80%, 85% and 90% were recorded

    • Dichotomization is suggested (e.g. >5% and >10% of sleep time with oxygen levels below a specific O2 level indicates positive)
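The transformations and cut-offs suggested in the tips above can be sketched in a few lines of plain Python. This is only an illustration and not part of torch_ecg: the function names are hypothetical, the natural logarithm is assumed because the source does not state the base, and the default threshold of 3 events per hour is just one of the suggested values.

```python
import math


def log_transform_rdi(oldvar: float) -> float:
    """NEWVAR = log(OLDVAR + 0.1); the natural log is an assumption."""
    return math.log(oldvar + 0.1)


def transform_stage_percent(val: float) -> float:
    """-log(-log(val/100 + 0.001)) for a sleep-stage percentage `val`.

    The expression is only defined while 0 < val/100 + 0.001 < 1.
    """
    p = val / 100 + 0.001
    if not 0.0 < p < 1.0:
        raise ValueError("val/100 + 0.001 must lie strictly between 0 and 1")
    return -math.log(-math.log(p))


def dichotomize(value: float, threshold: float = 3.0) -> bool:
    """Return True ("positive") when the index reaches the threshold."""
    return value >= threshold
```

The same `dichotomize` helper covers the OAI, CAI and oxygen-level tips once the threshold is set to the cut-off of interest.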

ABOUT signals: (ref. [9])

  1. C3/A2 and C4/A1 EEGs, sampled at 125 Hz

  2. right and left electrooculograms (EOGs), sampled at 50 Hz

  3. a bipolar submental electromyogram (EMG), sampled at 125 Hz

  4. thoracic and abdominal excursions (THOR and ABDO), recorded by inductive plethysmography bands and sampled at 10 Hz

  5. “AIRFLOW” detected by a nasal-oral thermocouple, sampled at 10 Hz

  6. finger-tip pulse oximetry sampled at 1 Hz

  7. ECG from a bipolar lead, sampled at 125 Hz for most SHHS-1 studies and 250 (and 256?) Hz for SHHS-2 studies

  8. Heart rate (PR) derived from the ECG and sampled at 1 Hz

  9. body position (using a mercury gauge sensor)

  10. ambient light (on/off, by a light sensor secured to the recording garment)

ABOUT annotations (NOT including “nsrrid”, “visitnumber”, “pptid” etc.):

  1. hrv annotations: (in csv files, ref. [2])

    Start__sec_

    5 minute window start time

    NN_RR

    Ratio of consecutive normal sinus beats (NN) over all cardiac inter-beat (RR) intervals (NN/RR)

    AVNN

    Mean of all normal sinus to normal sinus interbeat intervals (NN)

    IHR

    Instantaneous heart rate

    SDNN

    Standard deviation of all normal sinus to normal sinus interbeat (NN) intervals

    SDANN

    Standard deviation of the averages of normal sinus to normal sinus interbeat (NN) intervals in all 5-minute segments

    SDNNIDX

    Mean of the standard deviations of normal sinus to normal sinus interbeat (NN) intervals in all 5-minute segments

    rMSSD

    Square root of the mean of the squares of difference between adjacent normal sinus to normal sinus interbeat (NN) intervals

    pNN10

    Percentage of differences between adjacent normal sinus to normal sinus interbeat (NN) intervals that are >10 ms

    pNN20

    Percentage of differences between adjacent normal sinus to normal sinus interbeat (NN) intervals that are >20 ms

    pNN30

    Percentage of differences between adjacent normal sinus to normal sinus interbeat (NN) intervals that are >30 ms

    pNN40

    Percentage of differences between adjacent normal sinus to normal sinus interbeat (NN) intervals that are >40 ms

    pNN50

    Percentage of differences between adjacent normal sinus to normal sinus interbeat (NN) intervals that are >50 ms

    tot_pwr

    Total normal sinus to normal sinus interbeat (NN) interval spectral power up to 0.4 Hz

    ULF

    Ultra-low frequency power, the normal sinus to normal sinus interbeat (NN) interval spectral power between 0 and 0.003 Hz

    VLF

    Very low frequency power, the normal sinus to normal sinus interbeat (NN) interval spectral power between 0.003 and 0.04 Hz

    LF

    Low frequency power, the normal sinus to normal sinus interbeat (NN) interval spectral power between 0.04 and 0.15 Hz

    HF

    High frequency power, the normal sinus to normal sinus interbeat (NN) interval spectral power between 0.15 and 0.4 Hz

    LF_HF

    The ratio of low to high frequency

    LF_n

    Low frequency power (normalized)

    HF_n

    High frequency power (normalized)

  2. wave delineation annotations: (in csv files, NOTE: see “CAUTION” by the end of this part, ref. [2])

    RPoint

    Sample Number indicating R Point (peak of QRS)

    Start

    Sample Number indicating start of beat

    End

    Sample Number indicating end of beat

    STLevel1

    Level of ECG 1 in raw data (65536 peak-to-peak raw data = 10 mV peak-to-peak)

    STSlope1

    Slope of ECG 1, stored as an int; divide the raw value by 1000.0 to convert it to a double

    STLevel2

    Level of ECG 2 in raw data (65536 peak-to-peak raw data = 10 mV peak-to-peak)

    STSlope2

    Slope of ECG 2, stored as an int; divide the raw value by 1000.0 to convert it to a double

    Manual

    (True / False) True if record was manually inserted

    Type

    Type of beat (0 = Artifact / 1 = Normal Sinus Beat / 2 = VE / 3 = SVE)

    Class

    no longer used

    PPoint

    Sample Number indicating peak of the P wave (-1 if no P wave detected)

    PStart

    Sample Number indicating start of the P wave

    PEnd

    Sample Number indicating end of the P wave

    TPoint

    Sample Number indicating peak of the T wave (-1 if no T wave detected)

    TStart

    Sample Number indicating start of the T wave

    TEnd

    Sample Number indicating end of the T wave

    TemplateID

    The ID of the template to which this beat has been assigned (-1 if not assigned to a template)

    nsrrid

    nsrrid of this record

    samplingrate

    frequency of the ECG signal of this record

    seconds

    Number of seconds from beginning of recording to R-point (RPoint / sampling rate)

    epoch

    Epoch (30 second) number

    rpointadj

    R Point adjusted sample number (RPoint * (samplingrate/256))

CAUTION that all the above sampling numbers except for rpointadj assume 256 Hz, while the rpointadj column has been added to provide an adjusted sample number based on the actual sampling rate.
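The adjustment described in this caution can be written out as follows. It is a sketch of the stated formula only: the function names are hypothetical, and obtaining seconds by dividing the 256 Hz sample number by 256 is an inference from the caution rather than something the source states.

```python
ASSUMED_FS = 256.0  # sampling rate that the annotation sample numbers assume


def adjust_rpoint(rpoint: int, samplingrate: float) -> float:
    """rpointadj = RPoint * (samplingrate / 256)."""
    return rpoint * (samplingrate / ASSUMED_FS)


def rpoint_to_seconds(rpoint: int) -> float:
    """Time of the R point in seconds, assuming RPoint counts 256 Hz samples."""
    return rpoint / ASSUMED_FS


# An R point annotated at sample 512 in a record whose ECG is sampled at 125 Hz
print(adjust_rpoint(512, 125.0))  # 250.0
print(rpoint_to_seconds(512))     # 2.0
```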

  3. event annotations: (in xml files) TODO

  4. event_profusion annotations: (in xml files) TODO
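As a rough illustration of how some of the time-domain HRV columns listed above are defined, the sketch below computes AVNN, SDNN, rMSSD and pNN50 from a list of NN intervals in milliseconds. It is not the code that produced the SHHS files: the function name is hypothetical, the population standard deviation is an assumption, and successive differences are compared by absolute value.

```python
import math
import statistics
from typing import Dict, List


def time_domain_hrv(nn_ms: List[float]) -> Dict[str, float]:
    """Compute a few time-domain HRV measures from NN intervals given in ms."""
    if len(nn_ms) < 2:
        raise ValueError("at least two NN intervals are required")
    # Differences between adjacent NN intervals
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    return {
        "AVNN": sum(nn_ms) / len(nn_ms),
        "SDNN": statistics.pstdev(nn_ms),
        "rMSSD": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        "pNN50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
    }
```

The other pNNx columns follow by replacing the 50 ms cut-off with 10, 20, 30 or 40 ms.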

DEFINITION of concepts in sleep study (mainly apnea and arousal, ref. [8] for corresponding knowledge):

  1. Arousal: (ref. [3], [4])

    • interruptions of sleep lasting 3 to 15 seconds

    • can occur spontaneously or as a result of sleep-disordered breathing or other sleep disorders

    • sends you back to a lighter stage of sleep

    • if the arousal lasts more than 15 seconds, it becomes an awakening

    • the higher the arousal index (occurrences per hour), the more tired you are likely to feel, though people vary in their tolerance of sleep disruptions

  2. Central Sleep Apnea (CSA): (ref. [3], [5], [6])

    • breathing repeatedly stops and starts during sleep

    • occurs because your brain (central nervous system) doesn’t send proper signals to the muscles that control your breathing, which is what distinguishes it from obstructive sleep apnea

    • may occur as a result of other conditions, such as heart failure, stroke, high altitude, etc.

  3. Obstructive Sleep Apnea (OSA): (ref. [3], [7])

    • occurs when throat muscles intermittently relax and block the upper airway during sleep

    • a noticeable sign of obstructive sleep apnea is snoring

  4. Complex (Mixed) Sleep Apnea: (ref. [3])

    • combination of both CSA and OSA

    • exact mechanism of the loss of central respiratory drive during sleep in OSA is unknown but is most likely related to incorrect settings of the CPAP (Continuous Positive Airway Pressure) treatment and other medical conditions the person has

  5. Hypopnea: overly shallow breathing or an abnormally low respiratory rate. Hypopnea is defined by some to be less severe than apnea (the complete cessation of breathing)

  6. Apnea Hypopnea Index (AHI):

    • used to indicate the severity of OSA

    • number of apneas or hypopneas recorded during the study per hour of sleep

    • based on the AHI, the severity of OSA is classified as follows

      • none/minimal: AHI < 5 per hour

      • mild: AHI ≥ 5, but < 15 per hour

      • moderate: AHI ≥ 15, but < 30 per hour

      • severe: AHI ≥ 30 per hour

  7. Oxygen Desaturation:

    • used to indicate the severity of OSA

    • reductions in blood oxygen levels (desaturation)

    • at sea level, a normal blood oxygen level (saturation) is usually 96 - 97%

    • (no generally accepted classifications for severity of oxygen desaturation)

      • mild: >= 90%

      • moderate: 80% - 89%

      • severe: < 80%
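The AHI severity bands listed above translate directly into a small lookup. The following is a minimal sketch; `classify_ahi` is a hypothetical helper and not a torch_ecg function.

```python
def classify_ahi(ahi: float) -> str:
    """Map an Apnea Hypopnea Index (events per hour of sleep) to an OSA severity band."""
    if ahi < 0:
        raise ValueError("AHI cannot be negative")
    if ahi < 5:
        return "none/minimal"
    if ahi < 15:
        return "mild"
    if ahi < 30:
        return "moderate"
    return "severe"
```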

Usage

  1. Sleep stage

  2. Sleep apnea

Issues

  1. Start__sec_ might not be the start time, but rather the end time, of the 5 minute windows in some records

  2. the current version “0.15.0” removed EEG spectral summary variables

References

Citation

10.1093/jamia/ocy064

Parameters:
  • db_dir (path-like, optional) – Storage path of the database. If not specified, data will be fetched from Physionet.

  • working_dir (path-like, optional) – Working directory, to store intermediate files and log files.

  • verbose (int, default 1) – Level of logging verbosity.

  • kwargs (dict, optional) – Auxiliary keyword arguments.

property database_info: DataBaseInfo

The DataBaseInfo object of the database.

form_paths() -> None

Form paths to the database files.

get_absolute_path(rec: str | int, rec_path: str | bytes | PathLike | None = None, rec_type: str = 'psg') -> Path

Get the absolute path of specific type of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • rec_path (path-like, optional) – Path of the file which contains the desired data. If is None, default path will be used.

  • rec_type (str, default "psg") – Record type, either data (psg, etc.) or annotations.

Returns:

rp – Absolute path of the record rec with type rec_type.

Return type:

pathlib.Path

get_available_signals(rec: str | int | None) -> List[str] | None

Get available signals for a record.

If rec is None, this function finds the available signals for all records and assigns them to self._df_records['available_signals'].

Parameters:

rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

Returns:

available_signals – Names of available signals for rec.

Return type:

List[str]

get_chn_num(rec: str | int, sig: str = 'ECG', rec_path: str | bytes | PathLike | None = None) -> int

Get the index of the channel of the signal in the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • sig (str, default "ECG") – Signal name.

  • rec_path (path-like, optional) – Path of the file which contains the PSG data. If is None, default path will be used.

Returns:

chn_num – Index of channel of the signal sig of the record rec. Returns -1 if corresponding signal (.edf) file is not available, or the signal file does not contain the signal sig.

Return type:

int

get_fs(rec: str | int, sig: str = 'ECG', rec_path: str | bytes | PathLike | None = None) -> Real

Get the sampling frequency of a signal of a record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • sig (str, default "ECG") – Signal name or annotation name (e.g. “rpeak”). Some annotation files (*-rpeak.csv) have a sampling frequency column.

  • rec_path (path-like, optional) – Path of the file which contains the PSG data. If is None, default path will be used.

Returns:

fs – Sampling frequency of the signal sig of the record rec. If corresponding signal (.edf) file is not available, or the signal file does not contain the signal sig, -1 will be returned.

Return type:

numbers.Real

get_nsrrid(rec: str | int) -> int

Get nsrrid from rec.

Parameters:

rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

Returns:

nsrrid extracted from rec.

Return type:

int

get_subject_id(rec: str | int) -> int

Attach a unique subject ID for the record.

Parameters:

rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

Returns:

pid – Subject ID derived from (attached to) the record.

Return type:

int

get_table(table_name: str) -> DataFrame

Get table by name.

Parameters:

table_name (str) – Table name. For available table names, call method list_table_names().

Returns:

table – The loaded table.

Return type:

pandas.DataFrame

get_tranche(rec: str | int) -> str

Get tranche (“shhs1” or “shhs2”) from rec.

Parameters:

rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

Returns:

Tranche extracted from rec.

Return type:

str

get_visitnumber(rec: str | int) -> int

Get visitnumber from rec.

Parameters:

rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

Returns:

Visit number extracted from rec.

Return type:

int
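The three accessors above (get_nsrrid, get_tranche, get_visitnumber) all read fields out of the record name. The sketch below shows one way such a name could be split, assuming the form “shhs1-200001” with the visit number as the digit after “shhs” and the nsrrid as the number after the hyphen. `parse_record_name` and `RecordName` are hypothetical and the library's own parsing may differ.

```python
import re
from typing import NamedTuple


class RecordName(NamedTuple):
    tranche: str
    visitnumber: int
    nsrrid: int


# Assumed layout: "shhs<visit>-<nsrrid>", e.g. "shhs1-200001"
_RECORD_PATTERN = re.compile(r"^(shhs(\d))-(\d+)$")


def parse_record_name(rec: str) -> RecordName:
    """Split a record name into tranche, visit number and nsrrid."""
    match = _RECORD_PATTERN.match(rec)
    if match is None:
        raise ValueError(f"unexpected record name: {rec!r}")
    tranche, visit, nsrrid = match.groups()
    return RecordName(tranche, int(visit), int(nsrrid))
```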

list_table_names() -> List[str]

List available table names.

load_ann(rec: str | int, ann_type: str, ann_path: str | bytes | PathLike | None = None, **kwargs: Any) -> ndarray | DataFrame | dict

Load annotations of specific type of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • ann_type (str) – Type of the annotation, can be “event”, “event_profusion”, “hrv_summary”, “hrv_detailed”, “sleep”, “sleep_stage”, “sleep_event”, “apnea” (alias “sleep_apnea”), “wave_delineation”, “rpeak”, “rr”, “nn”.

  • ann_path (path-like, optional) – Path of the file which contains the annotations. If is None, default path will be used.

  • kwargs (dict, optional) – Other arguments for specific annotation type.

Returns:

annotations – The loaded annotations.

Return type:

numpy.ndarray or pandas.DataFrame or dict

load_apnea_ann(rec: str | int, source: str = 'event', apnea_types: List[str] | None = None, apnea_ann_path: str | bytes | PathLike | None = None, **kwargs: Any) -> DataFrame

Load annotations on apnea events of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • source ({"event", "event_profusion"}, optional) – Source of the annotations, case insensitive, by default “event”.

  • apnea_types (List[str], optional) – Types of apnea events to load, should be a subset of “CSA”, “OSA”, “MSA”, “Hypopnea”. If is None, then all types of apnea will be loaded.

  • apnea_ann_path (path-like, optional) – Path of the file which contains the apnea event annotations. If is None, default path will be used.

Returns:

df_apnea_ann – Apnea event annotations of the record.

Return type:

pandas.DataFrame

load_data(rec: str | int, rec_path: str | bytes | PathLike | None = None, sampfrom: int | None = None, sampto: int | None = None, data_format: str = 'channel_first', units: str | None = 'mV', fs: int | None = None, return_fs: bool = True) -> ndarray | Tuple[ndarray, Real]

Load ECG data of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • rec_path (path-like, optional) – Path of the file which contains the ECG data. If is None, default path will be used.

  • sampfrom (int, optional) – Start index of the data to be loaded.

  • sampto (int, optional) – End index of the data to be loaded.

  • data_format (str, default "channel_first") – Format of the ECG data, “channel_last” (alias “lead_last”), or “channel_first” (alias “lead_first”), or “flat” (alias “plain”) which is valid only when leads is a single lead.

  • units (str or None, default "mV") – Units of the output signal, can also be “μV” (aliases “uV”, “muV”). None for digital data, without digital-to-physical conversion.

  • fs (numbers.Real, optional) – Sampling frequency of the loaded data. If not None, the loaded data will be resampled to this frequency, otherwise, the original sampling frequency will be used.

  • return_fs (bool, default True) – Whether to return the sampling frequency of the output signal.

Returns:

  • data (numpy.ndarray) – The loaded ECG data.

  • data_fs (numbers.Real) – Sampling frequency of the loaded ECG data. Returned if return_fs is True.

NOTE: one should call load_psg_data to load other channels.
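A usage sketch for load_data, built only from the signatures documented on this page. It needs torch_ecg installed and a local copy of SHHS to actually run; `load_first_minute_ecg` is a hypothetical wrapper, and the default record name is taken from the examples in the parameter descriptions.

```python
def load_first_minute_ecg(db_dir: str, rec: str = "shhs1-200001"):
    """Load the first 60 s of ECG of `rec` in mV, together with its sampling frequency."""
    # Imported lazily so that this sketch can be defined without torch_ecg installed.
    from torch_ecg.databases import SHHS

    db = SHHS(db_dir=db_dir)
    fs = db.get_fs(rec, sig="ECG")
    if fs <= 0:
        # get_fs is documented to return -1 when the file or the signal is missing
        raise FileNotFoundError(f"no ECG signal available for {rec}")
    # For load_data, sampfrom and sampto are sample indices, not seconds
    data, data_fs = db.load_data(
        rec,
        sampfrom=0,
        sampto=int(60 * fs),
        data_format="channel_first",
        units="mV",
        return_fs=True,
    )
    return data, data_fs
```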

load_ecg_data(rec: str | int, rec_path: str | bytes | PathLike | None = None, sampfrom: int | None = None, sampto: int | None = None, data_format: str = 'channel_first', units: str | None = 'mV', fs: int | None = None, return_fs: bool = True) -> ndarray | Tuple[ndarray, Real]

Load ECG data of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • rec_path (path-like, optional) – Path of the file which contains the ECG data. If is None, default path will be used.

  • sampfrom (int, optional) – Start index of the data to be loaded.

  • sampto (int, optional) – End index of the data to be loaded.

  • data_format (str, default "channel_first") – Format of the ECG data, “channel_last” (alias “lead_last”), or “channel_first” (alias “lead_first”), or “flat” (alias “plain”) which is valid only when leads is a single lead.

  • units (str or None, default "mV") – Units of the output signal, can also be “μV” (aliases “uV”, “muV”). None for digital data, without digital-to-physical conversion.

  • fs (numbers.Real, optional) – Sampling frequency of the loaded data. If not None, the loaded data will be resampled to this frequency, otherwise, the original sampling frequency will be used.

  • return_fs (bool, default True) – Whether to return the sampling frequency of the output signal.

Returns:

  • data (numpy.ndarray) – The loaded ECG data.

  • data_fs (numbers.Real) – Sampling frequency of the loaded ECG data. Returned if return_fs is True.

load_eeg_band_ann(rec: str | int, eeg_band_ann_path: str | bytes | PathLike | None = None, **kwargs: Any) -> DataFrame

Load annotations on EEG bands of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • eeg_band_ann_path (path-like, optional) – Path of the file which contains EEG band annotations. If is None, default path will be used.

Returns:

A DataFrame of EEG band annotations.

Return type:

pandas.DataFrame

load_eeg_spectral_ann(rec: str | int, eeg_spectral_ann_path: str | bytes | PathLike | None = None, **kwargs: Any) -> DataFrame

Load annotations on EEG spectral summary of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • eeg_spectral_ann_path (path-like, optional) – Path of the file which contains EEG spectral summary annotations. If is None, default path will be used.

Returns:

A DataFrame of EEG spectral summary annotations.

Return type:

pandas.DataFrame

load_event_ann(rec: str | int, event_ann_path: str | bytes | PathLike | None = None, simplify: bool = False, **kwargs: Any) -> DataFrame

Load event annotations of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • event_ann_path (path-like, optional) – Path of the file which contains the events-nsrr annotations. If is None, default path will be used.

Returns:

df_events – Event annotations of the record.

Return type:

pandas.DataFrame

load_event_profusion_ann(rec: str | int, event_profusion_ann_path: str | bytes | PathLike | None = None, **kwargs: Any) -> dict

Load events-profusion annotations of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • event_profusion_ann_path (path-like, optional) – Path of the file which contains the events-profusion annotations. If is None, default path will be used.

Returns:

Event-profusions annotations of the record, with items “sleep_stage_list”, “df_events”.

Return type:

dict

TODO

Merge “sleep_stage_list” and “df_events” into one DataFrame.

load_hrv_detailed_ann(rec: str | int, hrv_ann_path: str | bytes | PathLike | None = None, **kwargs: Any) -> DataFrame

Load detailed HRV annotations of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • hrv_ann_path (path-like, optional) – Path of the detailed HRV annotation file. If is None, default path will be used.

Returns:

df_hrv_ann – Detailed HRV annotations of the record.

Return type:

pandas.DataFrame

load_hrv_summary_ann(rec: str | int | None = None, hrv_ann_path: str | bytes | PathLike | None = None, **kwargs: Any) -> DataFrame

Load summary HRV annotations of the record.

Parameters:
  • rec (str or int, optional) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records. If None, the summary HRV annotations of all records are loaded.

  • hrv_ann_path (path-like, optional) – Path of the summary HRV annotation file. If is None, default path will be used.

Returns:

df_hrv_ann – If rec is not None, df_hrv_ann is the summary HRV annotations of rec; if rec is None, df_hrv_ann is the summary HRV annotations of all records that had HRV annotations (about 10% of all the records in SHHS).

Return type:

pandas.DataFrame

load_nn_ann(rec: str | int, rpeak_ann_path: str | bytes | PathLike | None = None, units: str | None = 's', **kwargs: Any) -> ndarray

Load annotations on NN intervals of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • rpeak_ann_path (path-like, optional) – Path of the file which contains R peak annotations. If is None, default path will be used.

  • units ({None, "s", "ms"}, optional) – Units of the returned R peak locations, by default “s”, case insensitive. None for no conversion, using indices of samples.

Returns:

nn – Array of NN intervals, of shape (n, 2). Each row is an NN interval, and the first column is the location of the R peak.

Return type:

numpy.ndarray

load_psg_data(rec: str | int, channel: str = 'all', rec_path: str | bytes | PathLike | None = None, sampfrom: Real | None = None, sampto: Real | None = None, fs: int | None = None, physical: bool = True) -> Dict[str, Tuple[ndarray, Real]] | Tuple[ndarray, Real]

Load PSG data of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • channel (str, default "all") – Name of the channel of PSG. If is “all”, then all channels will be returned.

  • rec_path (path-like, optional) – Path of the file which contains the PSG data. If is None, default path will be used.

  • sampfrom (numbers.Real, optional) – Start time (units in seconds) of the data to be loaded, valid only when channel is some specific channel.

  • sampto (numbers.Real, optional) – End time (units in seconds) of the data to be loaded, valid only when channel is some specific channel

  • fs (numbers.Real, optional) – Sampling frequency of the loaded data. If not None, the loaded data will be resampled to this frequency, otherwise, the original sampling frequency will be used. Valid only when channel is some specific channel.

  • physical (bool, default True) – If True, then the data will be converted to physical units, otherwise, the data will be in digital units.

Returns:

If channel is “all”, then a dictionary will be returned:

  • keys: PSG channel names;

  • values: PSG data and sampling frequency

Otherwise, a 2-tuple will be returned: (numpy.ndarray, numbers.Real), which is the PSG data of the channel channel and its sampling frequency.

Return type:

dict or tuple
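Note that load_psg_data takes sampfrom and sampto in seconds, whereas load_data and load_ecg_data take sample indices. The hypothetical helpers below convert between the two for a given sampling frequency; rounding to the nearest sample is an assumption of this sketch and not necessarily what the library does internally.

```python
def seconds_to_index(t: float, fs: float) -> int:
    """Convert a time in seconds to the nearest sample index at sampling frequency fs."""
    return int(round(t * fs))


def index_to_seconds(idx: int, fs: float) -> float:
    """Convert a sample index to a time in seconds at sampling frequency fs."""
    return idx / fs
```

For example, a 30-second epoch of a 125 Hz ECG spans 3750 samples, so epoch k covers the sample indices from `seconds_to_index(30 * k, 125)` up to, but not including, `seconds_to_index(30 * (k + 1), 125)`.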

load_rpeak_ann(rec: str | int, rpeak_ann_path: str | bytes | PathLike | None = None, exclude_artifacts: bool = True, exclude_abnormal_beats: bool = True, units: str | None = None, **kwargs: Any) -> ndarray

Load annotations on R peaks of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • rpeak_ann_path (path-like, optional) – Path of the file which contains R peak annotations. If is None, default path will be used.

  • exclude_artifacts (bool, default True) – Whether to exclude beats (R peaks) that are labelled as artifacts.

  • exclude_abnormal_beats (bool, default True) – Whether to exclude beats (R peaks) that are labelled as abnormal (“VE” and “SVE”).

  • units ({None, "s", "ms"}, optional) – Units of the returned R peak locations, case insensitive. None for no conversion, using indices of samples.

Returns:

Locations of R peaks of the record, of shape (n_rpeaks, ).

Return type:

numpy.ndarray

load_rr_ann(rec: str | int, rpeak_ann_path: str | bytes | PathLike | None = None, units: str | None = 's', **kwargs: Any) -> ndarray

Load annotations on RR intervals of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • rpeak_ann_path (path-like, optional) – Path of the file which contains R peak annotations. If is None, default path will be used.

  • units ({None, "s", "ms"}, optional) – Units of the returned R peak locations, by default “s”, case insensitive. None for no conversion, using indices of samples.

Returns:

rr – Array of RR intervals, of shape (n_rpeaks - 1, 2). Each row is an RR interval, and the first column is the location of the R peak.

Return type:

numpy.ndarray
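The (n_rpeaks - 1, 2) shape described for the RR array can be illustrated with a short sketch that derives RR intervals from R peak locations. It assumes that the second column holds the interval duration and that each interval is located at its earlier R peak; neither detail is stated on this page, and `rr_from_rpeaks` is a hypothetical helper.

```python
from typing import List, Sequence, Tuple


def rr_from_rpeaks(rpeaks: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (R peak location, RR interval) pairs for consecutive R peaks."""
    return [(start, end - start) for start, end in zip(rpeaks, rpeaks[1:])]


# Three R peaks (in samples) give two RR intervals
print(rr_from_rpeaks([0, 100, 225]))  # [(0, 100), (100, 125)]
```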

load_sleep_ann(rec: str | int, source: str = 'event', sleep_ann_path: str | bytes | PathLike | None = None, **kwargs: Any) -> DataFrame | dict

Load sleep annotations of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • source ({"hrv", "event", "event_profusion"}, optional) – Source of the annotations, case insensitive, by default “event”.

  • sleep_ann_path (path-like, optional) – Path of the file which contains the sleep annotations. If is None, default path will be used.

Returns:

df_sleep_ann – All sleep annotations of the record.

Return type:

pandas.DataFrame or dict

load_sleep_event_ann(rec: str | int, source: str = 'event', event_types: List[str] | None = None, sleep_event_ann_path: str | bytes | PathLike | None = None) -> DataFrame

Load sleep event annotations of a record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • source ({"hrv", "event", "event_profusion"}, optional) – Source of the annotations, case insensitive, by default “event”.

  • event_types (List[str], optional) – List of event types to be loaded, by default None. The event types are: “Respiratory” (including “Apnea”, “SpO2”), “Arousal”, “Apnea” (including “CSA”, “OSA”, “MSA”, “Hypopnea”), “SpO2”, “CSA”, “OSA”, “MSA”, “Hypopnea”. Used only when source is “event” or “event_profusion”.

  • sleep_event_ann_path (path-like, optional) – Path of the file which contains the sleep event annotations. If is None, default path will be used.

Returns:

df_sleep_event_ann – Sleep event annotations of the record.

Return type:

pandas.DataFrame
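The event_types parameter of load_sleep_event_ann mixes group names (“Respiratory”, “Apnea”) with individual event types. The sketch below expands a selection into individual types using only the grouping given in the parameter description; `EVENT_GROUPS` and `expand_event_types` are hypothetical names, and how the library resolves groups internally is not shown on this page.

```python
from typing import Dict, Iterable, List, Set

# Grouping as stated in the event_types description
EVENT_GROUPS: Dict[str, List[str]] = {
    "Respiratory": ["Apnea", "SpO2"],
    "Apnea": ["CSA", "OSA", "MSA", "Hypopnea"],
}


def expand_event_types(event_types: Iterable[str]) -> Set[str]:
    """Replace group names by the individual event types they include."""
    expanded: Set[str] = set()
    stack = list(event_types)
    while stack:
        name = stack.pop()
        if name in EVENT_GROUPS:
            stack.extend(EVENT_GROUPS[name])  # a group: expand its members
        else:
            expanded.add(name)  # an individual event type
    return expanded
```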

load_sleep_stage_ann(rec: str | int, source: str = 'event', sleep_stage_ann_path: str | bytes | PathLike | None = None, sleep_stage_protocol: str = 'aasm', with_stage_names: bool = True, **kwargs: Any) -> DataFrame

Load sleep stage annotations of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • source ({"hrv", "event", "event_profusion"}, optional) – Source of the annotations, case insensitive, by default “event”.

  • sleep_stage_ann_path (path-like, optional) – Path of the file which contains the sleep stage annotations. If is None, default path will be used.

  • sleep_stage_protocol (str, default "aasm") – The protocol to classify sleep stages. Currently can be “aasm”, “simplified”, “shhs”. The only difference lies in the number of different stages of the NREM periods.

  • with_stage_names (bool, default True) – If True, an additional column “sleep_stage_name” will be added to the returned DataFrame.

Returns:

df_sleep_stage_ann – Sleep stage annotations of the record.

Return type:

pandas.DataFrame

load_wave_delineation_ann(rec: str | int, wave_deli_path: str | bytes | PathLike | None = None, **kwargs: Any) -> DataFrame

Load annotations on wave delineations of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • wave_deli_path (path-like, optional) – Path of the file which contains wave delineation annotations. If is None, default path will be used.

Returns:

df_wave_delineation – Wave delineation annotations of the record.

Return type:

pandas.DataFrame

Note

See the part describing wave delineation annotations of the docstring of the class, or call self.database_info(detailed=True).

locate_abnormal_beats(rec: str | int, wave_deli_path: str | bytes | PathLike | None = None, abnormal_type: str | None = None, units: str | None = None) -> Dict[str, ndarray] | ndarray

Locate “abnormal beats” in the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • wave_deli_path (path-like, optional) – Path of the file which contains wave delineation annotations. If None, the default path is used.

  • abnormal_type ({"VE", "SVE"}, optional) – Type of abnormal beat to locate. If None, both “VE” and “SVE” beats are located.

  • units ({None, "s", "ms"}, optional) – Units of the returned R peak locations, by default None, case insensitive. None for no conversion, using indices of samples.

Returns:

abnormal_rpeaks – If abnormal_type is None, a dictionary of abnormal beat locations with keys “VE” and/or “SVE”, whose values are arrays of sample indices (or times) of shape (n,). If abnormal_type is not None, an ndarray of abnormal beat locations of shape (n,).

Return type:

dict or numpy.ndarray
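
The units option amounts to a conversion from sample indices to time. A minimal sketch follows, assuming the 125 Hz ECG sampling rate of shhs1 stated in the class description; the helper convert_units is hypothetical and uses plain lists instead of numpy arrays.

```python
def convert_units(indices, fs, units=None):
    """Convert sample indices to seconds or milliseconds.

    ``units=None`` means no conversion (sample indices are returned as-is).
    """
    if units is None:
        return list(indices)
    units = units.lower()  # the documented option is case insensitive
    if units == "s":
        return [i / fs for i in indices]
    if units == "ms":
        return [i * 1000 / fs for i in indices]
    raise ValueError(f"units must be None, 's' or 'ms', got {units!r}")


fs = 125  # Hz, ECG sampling frequency of shhs1
rpeaks = [125, 250, 1000]  # made-up sample indices for illustration
print(convert_units(rpeaks, fs))        # sample indices
print(convert_units(rpeaks, fs, "s"))   # seconds
print(convert_units(rpeaks, fs, "MS"))  # milliseconds
```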

locate_artifacts(rec: str | int, wave_deli_path: str | bytes | PathLike | None = None, units: str | None = None) ndarray[source]

Locate “artifacts” in the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • wave_deli_path (path-like, optional) – Path of the file which contains wave delineation annotations. If None, the default path is used.

  • units ({None, "s", "ms"}, optional) – Units of the returned artifact locations, by default None, case insensitive. None for no conversion, using indices of samples.

Returns:

artifacts – Array of indices (or times) of artifact locations, of shape (n_artifacts,).

Return type:

numpy.ndarray

match_channel(channel: str, raise_error: bool = True) str[source]

Match the channel name to the standard channel name.

Parameters:
  • channel (str) – Channel name.

  • raise_error (bool, default True) – Whether to raise error if no match is found. If False, returns the input channel directly.

Returns:

sig – Standard channel name in SHHS. If no match is found, and raise_error is False, returns the input channel directly.

Return type:

str
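
The documented behavior of raise_error can be sketched as below. The channel list STANDARD_CHANNELS, the case-insensitive matching rule, and the function name match_channel_sketch are all assumptions made for illustration; the actual matching logic of match_channel is not described in this section.

```python
# Hypothetical list of standard channel names, for illustration only.
STANDARD_CHANNELS = ["ECG", "EEG", "SaO2"]


def match_channel_sketch(channel, raise_error=True):
    """Return the standard name matching ``channel``.

    Matching here is case insensitive (an assumption). When nothing matches,
    either raise or hand back the input unchanged, depending on ``raise_error``.
    """
    for std in STANDARD_CHANNELS:
        if std.lower() == channel.lower():
            return std
    if raise_error:
        raise ValueError(f"no standard channel matches {channel!r}")
    return channel


print(match_channel_sketch("ecg"))
print(match_channel_sketch("unknown", raise_error=False))
```

Passing raise_error=False is convenient when iterating over channel names of mixed origin, since unmatched names pass through unchanged.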

plot_ann(rec: str | int, stage_source: str | None = None, stage_kw: dict = {}, event_source: str | None = None, event_kw: dict = {}, plot_format: str = 'span') None[source]

Plot annotations of the record.

Plot the sleep stage annotations and sleep event annotations of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • stage_source ({"hrv", "event", "event_profusion"}, optional) – Source of the sleep stage annotations, case insensitive. If None, sleep stage annotations of rec are not plotted.

  • stage_kw (dict, optional) – Keyword arguments passed to load_sleep_stage_ann().

  • event_source ({"hrv", "event", "event_profusion"}, optional) – Source of the sleep event annotations, case insensitive. If None, sleep event annotations of rec are not plotted.

  • event_kw (dict, optional) – Keyword arguments passed to load_sleep_event_ann().

  • plot_format ({"span", "hypnogram"}, optional) – Format of the plot, case insensitive, by default “span”.

TODO

  1. ~~Implement the “hypnogram” format.~~

  2. Implement plotting of sleep events.

show_rec_stats(rec: str | int, rec_path: str | bytes | PathLike | None = None) None[source]

Print the statistics of the record.

Parameters:
  • rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

  • rec_path (path-like, optional) – Path of the file which contains the PSG data. If None, the default path is used.

split_rec_name(rec: str | int) Dict[str, str | int][source]

Split rec into tranche, visitnumber, and nsrrid.

Parameters:

rec (str or int) – Record name, typically in the form “shhs1-200001”, or index of the record in all_records.

Returns:

Dictionary with keys “tranche”, “visitnumber”, “nsrrid”.

Return type:

dict
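
For a record name such as “shhs1-200001”, the split can be sketched with a regular expression. The pattern REC_PATTERN, the function name split_rec_name_sketch, and the choice to return visitnumber and nsrrid as integers are assumptions; only the three key names come from the documentation above.

```python
import re

# Assumed record-name layout: "shhs" + visit number, a hyphen, then the NSRR ID.
REC_PATTERN = re.compile(r"^(?P<tranche>shhs(?P<visitnumber>[12]))-(?P<nsrrid>\d+)$")


def split_rec_name_sketch(rec):
    """Split a record name into tranche, visit number and NSRR ID."""
    m = REC_PATTERN.match(rec)
    if m is None:
        raise ValueError(f"unexpected record name: {rec!r}")
    return {
        "tranche": m.group("tranche"),
        "visitnumber": int(m.group("visitnumber")),
        "nsrrid": int(m.group("nsrrid")),
    }


print(split_rec_name_sketch("shhs1-200001"))
```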

str_to_real_number(s: str | Real) Real[source]

Convert a string to a real number.

Some columns in the annotations might have been incorrectly converted from numbers.Real to string by xmltodict.

Parameters:

s (str or numbers.Real) – The string to be converted.

Returns:

The converted number.

Return type:

numbers.Real
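
One plausible way to perform such a conversion is shown below. This is a sketch under stated assumptions, not the library's implementation: the function name str_to_real_number_sketch and the int-before-float strategy are hypothetical.

```python
from numbers import Real


def str_to_real_number_sketch(s):
    """Return ``s`` as a number; values that are already numeric pass through."""
    if isinstance(s, Real):
        return s
    try:
        # Prefer an integer when the string is a plain integer literal.
        return int(s)
    except ValueError:
        # Fall back to float; this raises ValueError for non-numeric strings.
        return float(s)


print(str_to_real_number_sketch("30"))
print(str_to_real_number_sketch("30.5"))
print(str_to_real_number_sketch(2.5))
```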

update_sleep_stage_names() None[source]

Update self.sleep_stage_names according to self.sleep_stage_protocol.

property url: str

URL(s) for downloading the database.