TY - GEN
T1 - Interpretable representation learning for healthcare via capturing disease progression through time
AU - Bai, Tian
AU - Egleston, Brian L.
AU - Zhang, Shanshan
AU - Vucetic, Slobodan
N1 - Publisher Copyright:
© 2018 Association for Computing Machinery.
PY - 2018/7/19
Y1 - 2018/7/19
N2 - Various deep learning models have recently been applied to predictive modeling of Electronic Health Records (EHR). In medical claims data, which is a particular type of EHR data, each patient is represented as a sequence of temporally ordered irregularly sampled visits to health providers, where each visit is recorded as an unordered set of medical codes specifying patient's diagnosis and treatment provided during the visit. Based on the observation that different patient conditions have different temporal progression patterns, in this paper we propose a novel interpretable deep learning model, called Timeline. The main novelty of Timeline is that it has a mechanism that learns time decay factors for every medical code. This allows the Timeline to learn that chronic conditions have a longer lasting impact on future visits than acute conditions. Timeline also has an attention mechanism that improves vector embeddings of visits. By analyzing the attention weights and disease progression functions of Timeline, it is possible to interpret the predictions and understand how risks of future visits change over time. We evaluated Timeline on two large-scale real world data sets. The specific task was to predict what is the primary diagnosis category for the next hospital visit given previous visits. Our results show that Timeline has higher accuracy than the state of the art deep learning models based on RNN. In addition, we demonstrate that time decay factors and attentions learned by Timeline are in accord with the medical knowledge and that Timeline can provide a useful insight into its predictions.
AB - Various deep learning models have recently been applied to predictive modeling of Electronic Health Records (EHR). In medical claims data, which is a particular type of EHR data, each patient is represented as a sequence of temporally ordered irregularly sampled visits to health providers, where each visit is recorded as an unordered set of medical codes specifying patient's diagnosis and treatment provided during the visit. Based on the observation that different patient conditions have different temporal progression patterns, in this paper we propose a novel interpretable deep learning model, called Timeline. The main novelty of Timeline is that it has a mechanism that learns time decay factors for every medical code. This allows the Timeline to learn that chronic conditions have a longer lasting impact on future visits than acute conditions. Timeline also has an attention mechanism that improves vector embeddings of visits. By analyzing the attention weights and disease progression functions of Timeline, it is possible to interpret the predictions and understand how risks of future visits change over time. We evaluated Timeline on two large-scale real world data sets. The specific task was to predict what is the primary diagnosis category for the next hospital visit given previous visits. Our results show that Timeline has higher accuracy than the state of the art deep learning models based on RNN. In addition, we demonstrate that time decay factors and attentions learned by Timeline are in accord with the medical knowledge and that Timeline can provide a useful insight into its predictions.
KW - Attention model
KW - Deep learning
KW - Electronic Health Records
KW - Healthcare
UR - http://www.scopus.com/inward/record.url?scp=85051526108&partnerID=8YFLogxK
UR - https://www.webofscience.com/api/gateway?GWVersion=2&SrcApp=purepublist2023&SrcAuth=WosAPI&KeyUT=WOS:000455346400005&DestLinkType=FullRecord&DestApp=WOS
U2 - 10.1145/3219819.3219904
DO - 10.1145/3219819.3219904
M3 - Conference contribution
C2 - 31037221
SN - 9781450355520
VL - 2018
T3 - Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
SP - 43
EP - 51
BT - KDD 2018 - Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
PB - Association for Computing Machinery
T2 - 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2018
Y2 - 19 August 2018 through 23 August 2018
ER -