Event Abstract

A Big-Data Approach to Automated EEG Labeling

  • 1 Temple University, Electrical and Computer Engineering, United States
  • 2 Temple University Hospital, United States

Electroencephalograms (EEGs) are valuable indicators of neural activity, both because they are non-invasive and inexpensive and because a wealth of prior knowledge exists on their interpretation. Software for automatically reading EEGs has long been a focus of neuroinformatics research, since such a tool would be a boon to neuroscientists studying brain function as well as to neurologists who must manually scan hours of patient recordings. Prior research has relied either on heuristic rules for signal interpretation or on pattern recognition algorithms trained on insufficient datasets. Owing to the variability and complexity of neural function, both approaches are fundamentally limited, and the resulting tools have not been transformative for either clinicians or researchers. In response, we have created a new data-rich EEG resource by amassing archival clinical EEG data recorded at Temple University Hospital over the past decade. The resulting data corpus (TUH-EEG) comprises some 22,000 EEG records from approximately 15,000 unique patients, and includes medical histories and clinical diagnoses along with the raw EEG traces. Data have been de-identified appropriately, and all work has been approved by the Temple University IRB. The size and scope of the TUH-EEG dataset are enabling us to apply a new generation of machine learning technology based on deep learning. This technology automatically self-organizes knowledge in a data-driven manner and learns to emulate a physician’s decision-making process. We are combining deep learning with unsupervised training so that detailed transcriptions of the data are not required. Unsupervised training on vast amounts of data has recently been shown to approach or even exceed the performance of supervised training on much less data, giving rise to the notion of big data: learning from vast archives of noisy, poorly transcribed data.
We have validated this approach on the CHB-MIT scalp EEG database and have achieved a 94% seizure detection rate, which compares favorably with the state of the art on this task. We are presently using unsupervised deep learning on the TUH-EEG corpus to train a system that can differentiate between six so-called signal primitives: (1) focal epileptiform, (2) general epileptiform, (3) focal abnormal, (4) general abnormal, (5) artifacts, and (6) background. Once these primitives can be reliably detected, they can be used to assess the presence of higher-level phenomena specific to certain disease states or conditions. We are implementing an unsupervised training technique in which the data are iteratively re-annotated using the system trained in the previous iteration. We believe this approach is the most promising for TUH-EEG since we do not have access to manually transcribed labels. It will not only demonstrate our ability to learn from data automatically, but also provide time-aligned marks for physicians to review. Preliminary results suggest a primitive sensitivity of 74% with a false alarm rate of 0.6 per session; our goal for operational performance is 95% sensitivity.
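
The iterative annotation loop described above can be illustrated with a minimal sketch. This is not the system used in this work: a toy nearest-centroid classifier stands in for the deep learning model, the feature vectors and the two labels are made up for illustration, and all function names (`fit_centroids`, `annotate`, `iterative_annotation`) are hypothetical. The sketch only shows the control flow: train on the current labels, re-annotate the data with the resulting model, and repeat until the labels stop changing.

```python
def fit_centroids(features, labels):
    """Train the toy stand-in model: one mean feature vector per label."""
    grouped = {}
    for vec, lab in zip(features, labels):
        grouped.setdefault(lab, []).append(vec)
    return {
        lab: tuple(sum(col) / len(col) for col in zip(*vecs))
        for lab, vecs in grouped.items()
    }


def annotate(features, centroids):
    """Re-label every epoch with the label of its nearest centroid."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        min(centroids, key=lambda lab: dist2(vec, centroids[lab]))
        for vec in features
    ]


def iterative_annotation(features, initial_labels, max_iters=20):
    """Alternate training and re-annotation until the labels are stable."""
    labels = list(initial_labels)
    for iteration in range(1, max_iters + 1):
        model = fit_centroids(features, labels)   # train on current labels
        new_labels = annotate(features, model)    # annotate with that model
        if new_labels == labels:                  # nothing changed: stop
            return labels, iteration
        labels = new_labels
    return labels, max_iters


# Hypothetical one-dimensional features for six epochs, with noisy
# starting labels (assumed here to come from some crude heuristic).
features = [(0.1,), (0.2,), (0.3,), (5.0,), (5.2,), (4.9,)]
noisy = ["background", "background", "spike",
         "background", "spike", "spike"]

labels, n_iters = iterative_annotation(features, noisy)
print(labels, n_iters)
```

In this toy case the two mislabeled epochs are corrected on the first pass and the loop stops once a second pass leaves the labels unchanged. A real system would replace the centroid model with the actual classifier and would likely need a softer stopping criterion than exact label equality.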

Figure 1

Acknowledgements

Portions of this work were sponsored by the DARPA MTO (Contract No. D13AP00065), Temple University’s College of Engineering and Office of the Senior Vice-Provost for Research, the University City Science Center (QED Award No. S1313) and the NSF (Grant No. CNS-09-58854, Grant No. CNS-1305190).

Keywords: EEG, big data, machine learning, HMM, Epilepsy

Conference: Neuroinformatics 2014, Leiden, Netherlands, 25 Aug - 27 Aug, 2014.

Presentation Type: Poster, to be considered for oral presentation

Topic: Clinical neuroscience

Citation: Obeid I, Harati A, Jacobson M and Picone J (2014). A Big-Data Approach to Automated EEG Labeling. Front. Neuroinform. Conference Abstract: Neuroinformatics 2014. doi: 10.3389/conf.fninf.2014.18.00094

Copyright: The abstracts in this collection have not been subject to any Frontiers peer review or checks, and are not endorsed by Frontiers. They are made available through the Frontiers publishing platform as a service to conference organizers and presenters.

The copyright in the individual abstracts is owned by the author of each abstract or his/her employer unless otherwise stated.

Each abstract, as well as the collection of abstracts, are published under a Creative Commons CC-BY 4.0 (attribution) licence (https://creativecommons.org/licenses/by/4.0/) and may thus be reproduced, translated, adapted and be the subject of derivative works provided the authors and Frontiers are attributed.

Received: 15 Apr 2014; Published Online: 27 Jun 2014.

* Correspondence: Dr. Iyad Obeid, Temple University, Electrical and Computer Engineering, Philadelphia, PA, 19122, United States, iobeid@temple.edu