Sleep Stage Scoring with Artificial Intelligence

Sleep stage scoring is a fundamental task when objectively evaluating sleep with polysomnography (PSG), the gold standard method for sleep assessment. The current rules for sleep stage scoring are defined in the Manual for Scoring Sleep and Related Events, which is regularly updated by the American Academy of Sleep Medicine (AASM). Using these rules, sleep experts and technicians manually score the PSG recordings, labeling each 30-second period of sleep (also called an “epoch”) as wakefulness, rapid eye movement (REM) sleep, or one of the three non-REM sleep stages. Manual sleep stage scoring is time-consuming and can be inconsistent because the rules for scoring sleep stages are prone to subjective interpretation. When a PSG recording is given to two different technicians, they tend to agree on only 85-90% of the epochs. The agreement decreases significantly for recordings of patients with neurodegenerative diseases.
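
To make the agreement figure concrete, here is a minimal sketch of how the agreement between two scorers could be computed epoch by epoch. The two hypnograms are invented for illustration, and Cohen's kappa (a standard chance-corrected agreement statistic) is an addition not mentioned in the text above.

```python
from collections import Counter

def percent_agreement(scorer_a, scorer_b):
    """Fraction of epochs on which two scorers assigned the same stage."""
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("Both scorings must cover the same, non-empty set of epochs")
    matches = sum(a == b for a, b in zip(scorer_a, scorer_b))
    return matches / len(scorer_a)

def cohens_kappa(scorer_a, scorer_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(scorer_a)
    observed = percent_agreement(scorer_a, scorer_b)
    counts_a, counts_b = Counter(scorer_a), Counter(scorer_b)
    stages = set(counts_a) | set(counts_b)
    # Chance agreement: probability that both scorers pick the same stage
    # if each labelled epochs at random with their own stage frequencies.
    expected = sum(counts_a[s] * counts_b[s] for s in stages) / (n * n)
    if expected == 1.0:
        return 1.0  # both scorers used a single identical stage throughout
    return (observed - expected) / (1.0 - expected)

# Invented example: ten 30-second epochs scored by two technicians.
scorer_a = ["W", "N1", "N2", "N2", "N3", "N3", "REM", "REM", "N2", "W"]
scorer_b = ["W", "N2", "N2", "N2", "N3", "N2", "REM", "REM", "N2", "W"]

print(f"Agreement: {percent_agreement(scorer_a, scorer_b):.0%}")
print(f"Cohen's kappa: {cohens_kappa(scorer_a, scorer_b):.2f}")
```

In this toy example the technicians disagree on two of the ten epochs, both involving a neighboring non-REM stage.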

Automatic methods would be highly beneficial in this field. Algorithms are faster and provide consistent and objective scoring, addressing both the time required and the subjective interpretation of PSG recordings. The first attempts to automatically score sleep stages began in the 1960s, when simple rule-based algorithms were proposed. Such algorithms consisted of a series of human-defined rules that tried to mimic manual scoring. However, because of their simplistic approach and the many exceptions that the rules could not foresee, these algorithms did not perform as well as human scorers.

Artificial intelligence (AI) is a branch of computer science in which the computer, to complete a required task, learns the patterns linking the data to the desired output on its own, without rules explicitly defined by a programmer. In the context of sleep stage scoring, the computer is given thousands of sleep epochs previously scored by humans and learns the relationship between the data and the expected sleep stage. Once the learning is complete, the AI algorithm is tested on new, unseen epochs and compared with human scoring to determine whether it performs as well as humans. In recent years, different research groups have proposed new AI algorithms that reached the same accuracy as human experts in scoring sleep stages. These achievements are due to the increased computational power of computers as well as to the availability of large datasets, which allow computers to learn even the most difficult and rare patterns.
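
The learn-then-test procedure described above can be sketched in a few lines. This is only a toy illustration under loud assumptions: the two features per epoch, their per-stage means, and the nearest-centroid classifier are all hypothetical stand-ins, and real systems learn from the recorded PSG signals with far more capable models.

```python
import random

# Hypothetical feature means per stage: (slow-wave power, muscle tone).
# These numbers are invented for illustration, not physiological values.
STAGE_MEANS = {
    "W": (0.0, 5.0),
    "N1": (2.0, 3.0),
    "N2": (4.0, 2.0),
    "N3": (8.0, 1.0),
    "REM": (1.0, 0.0),
}

def make_synthetic_epochs(n_per_stage, rng):
    """Create (features, stage) pairs standing in for human-scored epochs."""
    epochs = []
    for stage, (m1, m2) in STAGE_MEANS.items():
        for _ in range(n_per_stage):
            features = (rng.gauss(m1, 0.4), rng.gauss(m2, 0.4))
            epochs.append((features, stage))
    rng.shuffle(epochs)
    return epochs

def fit_centroids(training_epochs):
    """'Learning': average the features of all training epochs of each stage."""
    sums = {}
    for (f1, f2), stage in training_epochs:
        total = sums.setdefault(stage, [0.0, 0.0, 0])
        total[0] += f1
        total[1] += f2
        total[2] += 1
    return {stage: (t[0] / t[2], t[1] / t[2]) for stage, t in sums.items()}

def predict(centroids, features):
    """Assign the stage whose learned centroid is closest to the epoch."""
    f1, f2 = features
    return min(
        centroids,
        key=lambda s: (f1 - centroids[s][0]) ** 2 + (f2 - centroids[s][1]) ** 2,
    )

rng = random.Random(0)  # fixed seed so the sketch is reproducible
epochs = make_synthetic_epochs(100, rng)
split = int(0.8 * len(epochs))
train, test = epochs[:split], epochs[split:]  # test epochs stay unseen

centroids = fit_centroids(train)
accuracy = sum(predict(centroids, f) == stage for f, stage in test) / len(test)
print(f"Agreement with the reference labels on unseen epochs: {accuracy:.1%}")
```

The key design point is that the held-out epochs play no part in the learning step, so the reported agreement reflects performance on data the algorithm has never seen. With real recordings, one would typically hold out entire recordings rather than individual epochs.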

The results recently achieved by AI algorithms are promising and could make AI-based sleep scoring a routine process in sleep labs around the world. This would allow sleep labs to evaluate PSGs faster, making it possible to record more patients and/or reduce the costs related to PSGs. It would also increase consistency between sleep labs, making evaluation more standardized. Combined with the increasing development of more portable at-home PSG systems, AI-based sleep stage scoring has the potential to enable even more sleep studies and to make waiting times for PSGs significantly shorter.

In addition to scoring PSGs, research studies show that AI-based sleep stage scoring might provide new information. AI algorithms not only provide the sleep stage for each sleep epoch but also estimate the probability of that stage, which helps to better understand the transitions between stages and the stability of sleep structure. Research shows that instability measures derived from these probabilities might be useful for improved diagnosis of narcolepsy type 1. Furthermore, since sleep is a continuous and dynamic process, automatic AI-based scoring could be adapted to score sleep epochs shorter than 30 seconds. Shorter epochs are better suited to capturing the underlying information than 30-second epochs, a holdover from the time when sleep was scored on paper.

Recent evidence shows that AI-based sleep stage scoring could potentially be used in clinics and, in the not so distant future, to improve the diagnosis of sleep disorders. The integration of AI in sleep medicine, however, requires clear guidelines to build trust in these automatic procedures. The AASM is actively working in this direction with a dedicated taskforce. To certify the reliability of AI-based sleep stage scoring algorithms, the AASM is developing a certification program in which the algorithms will be tested on unseen recordings owned by the AASM and scored by several scorers. Based on their agreement with the human scorings, AI algorithms will receive a certification ensuring their reliability.
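
As a rough illustration of the per-epoch stage probabilities mentioned earlier, the sketch below turns a probability vector into a scored stage and a simple uncertainty value. The probabilities are invented, and the entropy measure is only an illustrative assumption. It is not the instability measure used in the research on narcolepsy.

```python
import math

STAGES = ("W", "N1", "N2", "N3", "REM")

def score_epoch(probabilities):
    """Return the most probable stage and its probability for one epoch."""
    if len(probabilities) != len(STAGES):
        raise ValueError("Expected one probability per stage")
    if abs(sum(probabilities) - 1.0) > 1e-6:
        raise ValueError("Stage probabilities must sum to 1")
    best = max(range(len(STAGES)), key=lambda i: probabilities[i])
    return STAGES[best], probabilities[best]

def entropy_bits(probabilities):
    """Uncertainty of the stage estimate: 0 when certain, highest when uniform."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Invented output of a hypothetical AI scorer for four consecutive epochs.
stage_probabilities = [
    (0.90, 0.05, 0.03, 0.01, 0.01),  # clearly wake
    (0.40, 0.35, 0.15, 0.05, 0.05),  # ambiguous between wake and N1
    (0.05, 0.10, 0.80, 0.04, 0.01),  # clearly N2
    (0.02, 0.03, 0.45, 0.48, 0.02),  # ambiguous between N2 and N3
]

for index, probs in enumerate(stage_probabilities, start=1):
    stage, confidence = score_epoch(probs)
    print(f"Epoch {index}: {stage} (p={confidence:.2f}, "
          f"uncertainty={entropy_bits(probs):.2f} bits)")
```

A conventional hypnogram would keep only the stage label for each epoch. Retaining the full probability vector preserves how close the decision was, which is the extra information that a single label per epoch discards.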

The time for moving from manual to AI-based sleep stage scoring is approaching, and clinics and sleep centers need to be prepared for this shift, which will reshape their organizational structure as well as their approach to patients.

Matteo Cesari, PhD, has been working in the field of sleep medicine and research since 2016. Dr. Cesari is currently a postdoctoral fellow at the Sleep Disorders Unit, Department of Neurology, Medical University of Innsbruck in Austria.

Alexander Wachter, MSc, has been in the field of computer science since 2020. He is currently a PhD student at the Sleep Disorders Unit, Department of Neurology, Medical University of Innsbruck in Austria.

References

Berry RB, Brooks R, Gamaldo CE, et al. The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications: Version 2.6. American Academy of Sleep Medicine; 2020.

Lee YJ, Lee JY, Cho JH, Choi JH. Interrater reliability of sleep stage scoring: a meta-analysis. J Clin Sleep Med. 2022;18(1):193-202. doi:10.5664/JCSM.9538

Danker-Hopfe H, Kunz D, Gruber G, et al. Interrater reliability between scorers from eight European sleep laboratories in subjects with different sleep disorders. J Sleep Res. 2004;13(1):63-69. doi:10.1046/j.1365-2869.2003.00375.x

Itil TM. Automatic classification of sleep stages and the discrimination of vigilance changes using digital computer methods. Agressologie. 1969;10(suppl):603-610.

Fiorillo L, Puiatti A, Papandrea M, et al. Automated sleep scoring: A review of the latest approaches. Sleep Med Rev. 2019;48:101204. doi:10.1016/j.smrv.2019.07.007

Stephansen JB, Olesen AN, Olsen M, et al. Neural network analysis of sleep stages enables efficient diagnosis of narcolepsy. Nat Commun. 2018;9(1):1-15. doi:10.1038/s41467-018-07229-3

Cesari M, Stefani A, Mitterling T, Frauscher B, Schönwald SV, Högl B. Sleep modelled as a continuous and dynamic process predicts healthy ageing better than traditional sleep scoring. Sleep Med. 2021;77:136-146. doi:10.1016/j.sleep.2020.11.033

AASM. AASM AI/Autoscoring Pilot Certification is coming soon. Accessed October 5, 2022. https://aasm.org/aasm-ai-autoscoring-pilot-certification-is-coming-soon/
