Machine Learning for Data-Driven Decisions

Overview. Hospitals today collect an immense amount of patient data (e.g., images, lab tests, vital sign measurements) but still ignore the vast majority of it. Although health data are messy and often incomplete, they are useful and can help improve patient care. To this end, we have pioneered work in leveraging machine learning (ML) and electronic health records for predicting adverse outcomes or events (e.g., infections). Based on collaborations with 30+ clinicians, we have identified key characteristics for the safe and meaningful adoption of ML in healthcare. Beyond accuracy, models must be actionable (tell a clinician how to reduce a patient’s risk, not just who’s at risk) and robust (capable of adapting to changes across populations and time). Achieving accurate models with these characteristics presents unique technical challenges. Specifically, in healthcare, one often deals with high-dimensional data (i.e., many covariates) but has few examples to learn from: ‘high D small N.’ To address these challenges, we have developed new ML techniques.


Shengpu Tang, Parmida Davarmanesh, Yanmeng Song, Danai Koutra, Michael Sjoding, Jenna Wiens. MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing. PhysioNet, April 2021.

Jiaxuan Wang, Jenna Wiens, Scott Lundberg. Shapley Flow: A Graph-based Approach to Interpreting Model Predictions. AISTATS, April 2021. (Preprint on arXiv, November 2020) [Code]

Jaewon Hur*, Shengpu Tang*, Vidhya Gunaseelan, Joceline Vu, Chad M. Brummett, Michael Englesbe, Jennifer Waljee, Jenna Wiens. Predicting postoperative opioid use with machine learning and insurance claims in opioid-naïve patients. The American Journal of Surgery, March 2021. (*co-first authors, †co-senior authors) [Code]

Fahad Kamran, Jenna Wiens. Estimating Calibrated Individualized Survival Curves with Deep Learning. AAAI, February 2021. [Code]

Donna Tjandra, Jenna Wiens. A Guided Approach to Multi-Event Survival Analysis. AAAI, February 2021. [Code]


Karandeep Singh, Thomas S. Valley, Shengpu Tang, Benjamin Y. Li, Fahad Kamran, Michael W. Sjoding, Jenna Wiens, Erkin Otles, John P. Donnelly, Melissa Y. Wei, Jonathon P. McBride, Jie Cao, Carleen Penoza, John Z. Ayanian, Brahmajee K. Nallamothu. Evaluating a Widely Implemented Proprietary Deterioration Index Model Among Hospitalized COVID-19 Patients. Annals of the American Thoracic Society. December 2020. (Preprint first appeared on medRxiv, June 2020) [Code]

Shengpu Tang, Parmida Davarmanesh, Yanmeng Song, Danai Koutra, Michael W. Sjoding, Jenna Wiens. Democratizing EHR Analyses with FIDDLE – a Flexible Preprocessing Pipeline for Structured Clinical Data. Journal of the American Medical Informatics Association (JAMIA), October 2020. [Code]

Harry Rubin-Falcone, Ian Fox, Jenna Wiens. Deep Residual Time-Series Forecasting: Application to Blood Glucose Prediction. KDH (Knowledge Discovery in Healthcare Data), August 2020. [Code] [Video]

Sarah Jabbour, David Fouhey, Ella Kazerooni, Michael W. Sjoding, Jenna Wiens. Deep Learning Applied to Chest X-Rays: Exploiting and Preventing Shortcuts. MLHC, August 2020. [Code] [Video]

Ian Fox, Joyce Lee, Rodica Busui, Jenna Wiens. Deep Reinforcement Learning for Closed-Loop Blood Glucose Control. MLHC, August 2020. [Code] [Video]

Shengpu Tang, Aditya Modi, Michael W. Sjoding, Jenna Wiens. Clinician-in-the-loop Decision Making: Reinforcement Learning with Near-Optimal Set-Valued Policies. ICML, July 2020. [Website] [Code] [Video]

Jiaxuan Wang, Jenna Wiens. AdaSGD: Bridging the gap between SGD and Adam. arXiv, June 2020.

Donna Tjandra, Raymond Migrino, Bruno Giordani, and Jenna Wiens. Cohort discovery and risk stratification for Alzheimer’s disease: An electronic health record-based approach. Alzheimer’s and Dementia: Translational Research & Clinical Interventions. June 2020.

Shengpu Tang, Grant T. Chappell, Amanda Mazzoli, Muneesh Tewari, Sung Won Choi*, and Jenna Wiens*. Predicting Acute Graft-versus-Host Disease Using Machine Learning and Longitudinal Vital Sign Data from Electronic Health Records. JCO Clinical Cancer Informatics, February 2020. (*co-senior authors) [Code]

Vincent X. Liu, Jenna Wiens. ‘No growth to date’? Predicting positive blood cultures in critical illness. Intensive Care Medicine, January 2020.

Jenna Wiens, W. Nicholson Price II, and Michael W. Sjoding. Diagnosing bias in data-driven algorithms for healthcare. Nature Medicine, January 2020.


Vincent X. Liu, David Bates, Jenna Wiens, and Nigam Shah. The Number Needed to Benefit: Estimating the Value of Predictive Analytics in Healthcare. Journal of the American Medical Informatics Association (JAMIA), December 2019.

Tom Pollard, Irene Chen, Jenna Wiens, Steven Horng, Danny J Wong, Marzyeh Ghassemi, Heather Mattie, Emily Lindemer, and Trishan Panch. Turning the Crank for Machine Learning: Ease, at What Expense? The Lancet Digital Health, September 2019.

Jenna Wiens, Suchi Saria, Mark Sendak, Marzyeh Ghassemi, Vincent Liu, Finale Doshi-Velez, Kenneth Jung, Katherine Heller, David Kale, Mohammed Saeed, Pilar Ossorio, Sonoo Thadaney-Israni, and Anna Goldenberg. Do No Harm: A Roadmap for Responsible ML for Health Care. Nature Medicine, September 2019.

Ian Fox and Jenna Wiens. Advocacy Learning: Learning through Competition and Class-Conditional Representations. IJCAI, August 2019. [Code]

Michael W. Sjoding, Shengpu Tang, Parmida Davarmanesh, Yanmeng Song, Danai Koutra, Jenna Wiens. Democratizing EHR Analyses – A Comprehensive Pipeline for Learning from Clinical Data. MLHC (Clinical Abstract), August 2019. [Code]

Erkin Ötleş, Haozhu Wang, Suyanpeng Zhang, Brian Denton, Jon Seymour, and Jenna Wiens. Return to Work After Injury: A Sequential Prediction and Prescription Problem. MLHC (Clinical Abstract), August 2019.

Jeeheh Oh, Jiaxuan Wang, Shengpu Tang, Michael Sjoding, and Jenna Wiens. Relaxed Parameter Sharing: Effectively Modeling Time-Varying Relationships in Clinical Time-Series. MLHC, June 2019. [Code]

Donna Tjandra, Raymond Migrino, Bruno Giordani, and Jenna Wiens. An EHR-based Cohort Discovery Tool for Identifying Probable AD. Alzheimer’s Association International Conference (AAIC), May 2019.

Ian Fox and Jenna Wiens. Reinforcement Learning for Blood Glucose Control: Challenges and Opportunities. ICML RL4RealLife Workshop, May 2019.

Ben Li, Jeeheh Oh, Vincent Young, Krishna Rao, and Jenna Wiens. Using Machine Learning and the Electronic Health Record to Predict Complicated Clostridium difficile Infection. Open Forum Infectious Diseases 6(5), April 2019. [Code]

Daniel Zeiberg, Tejas Prahlad, Brahmajee Nallamothu, Theodore J. Iwashyna, Jenna Wiens* and Michael W. Sjoding*. Machine learning for patient risk stratification for acute respiratory distress syndrome. PLOS ONE, March 2019. (*co-senior authors) [Code]

Tian Bao, Brooke N. Klatt, Susan L. Whitney, Kathleen H. Sienko, and Jenna Wiens. Automatically evaluating balance: a machine learning approach. IEEE Transactions on Neural Systems and Rehabilitation Engineering, February 2019.

Saige Rutherford, Pascal Sturmfels, Mike Angstadt, Jasmine Hect, Jenna Wiens, Marion I. van den Heuvel, Dustin Scheinost, Moriah Thomason, Chandra Sripada. Observing the origins of human brain development: Automated processing of fetal fMRI. bioRxiv, January 2019.


Devendra Goyal, Donna Tjandra, Raymond Migrino, Bruno Giordani, Zeeshan Syed, and Jenna Wiens. Characterizing heterogeneity in the progression of Alzheimer’s disease using longitudinal clinical and neuroimaging biomarkers. Alzheimer’s & Dementia: Diagnosis, Assessment & Disease Monitoring, August 2018. [Code]

Jeeheh Oh, Jiaxuan Wang, and Jenna Wiens. Learning to Exploit Invariances in Clinical Time-Series Data using Sequence Transformer Networks. MLHC, August 2018. [Code]

Pascal Sturmfels, Saige Rutherford, Mike Angstadt, Mark Peterson, Chandra Sripada, Jenna Wiens. A Domain Guided CNN Architecture for Predicting Age from Structural Brain Images. MLHC, August 2018.

Ian Fox, Lynn Ang, Mamta Jaiswal, Rodica Pop-Busui, Jenna Wiens. Deep Multi-Output Forecasting: Learning to Accurately Predict Blood Glucose Trajectories. KDD, August 2018. [Code]

Jiaxuan Wang, Jeeheh Oh, Haozhu Wang, Jenna Wiens. Learning Credible Models. KDD, August 2018. [Code]

Jenna Wiens and James Fackler. Striking the Right Balance – Applying Machine Learning to Pediatric Critical Care Data. Pediatric Critical Care Medicine, July 2018.

David Yeh, Sankar Basu, Ruchir Puri, Sanjit A. Seshia, Jenna Wiens, Li-C. Wang. Autonomous Systems and the Challenges in Verification, Validation, and Test. IEEE Design & Test, 35(3), 89-97, June 2018.

Jeeheh Oh, Maggie Makar, Christopher Fusco, Robert McCaffrey, Krishna Rao, Erin E. Ryan, Laraine Washer, Lauren R. West, Vincent B. Young, John Guttag, David C. Hooper, Erica S. Shenoy, Jenna Wiens. A Generalizable, Data-Driven Approach to Predict Daily Risk of Clostridium difficile Infection at Two Large Academic Health Centers. Infection Control and Hospital Epidemiology, March 2018.

Devendra Goyal, Zeeshan Syed, and Jenna Wiens. Clinically Meaningful Comparisons Over Time: An Approach to Measuring Patient Similarity based on Subsequence Alignment. arXiv:1803.00744, March 2018.

Jiaxuan Wang, Ian Fox, Jonathan Skaza, Nick Linck, Satinder Singh, and Jenna Wiens. The Advantage of Doubling: A Deep Reinforcement Learning Approach to Studying the Double Team in the NBA. Sloan Sports Analytics Conference, February 2018. [Poster]

Maggie Makar, John Guttag, and Jenna Wiens. Learning the Probability of Activation in the Presence of Latent Spreaders. AAAI (oral presentation), February 2018.

Jenna Wiens, Graham M Snyder, Samuel Finlayson, Monica V. Mahoney, and Leo A. Celi. Potential Adverse Effects of Broad-Spectrum Antimicrobial Exposure in the Intensive Care Unit. Open Forum Infectious Diseases, February 2018.


Eli Sherman, Hitinder Gurm, Ulysses Balis, Scott Owens, Jenna Wiens. Leveraging Clinical Time-Series Data for Prediction: A Cautionary Tale. AMIA Annual Symposium (oral presentation), November 2017.

Jeeheh Oh, Maggie Makar, Christopher Fusco, Robert McCaffrey, Krishna Rao, Erin E. Ryan, Laraine Washer, Lauren R. West, Vincent B. Young, John Guttag, David C. Hooper, Erica S. Shenoy, and Jenna Wiens. A Data-Driven Approach to Predict Daily Risk of Clostridium difficile Infection at Two Large Academic Health Centers. Infectious Disease Week, October 2017. [Code]

Jenna Wiens and Erica Shenoy. Machine Learning for Healthcare: On the Verge of a Major Shift in Healthcare Epidemiology. Clinical Infectious Diseases, August 2017.

Ian Fox, Lynn Ang, Mamta Jaiswal, Rodica Pop-Busui, Jenna Wiens. Contextual Motifs: Increasing the Utility of Motifs using Contextual Data. KDD, August 2017. [Code]


Jose Javier Gonzalez Ortiz, Cheng Perng Phoo, and Jenna Wiens. Heart Sound Classification Based on Temporal Alignment Techniques. Computing in Cardiology, September 2016. [Code]

Mason Wright and Jenna Wiens. Method to their March Madness: Insights from Mining a Novel Large-Scale Dataset of Pool Brackets. KDD Workshop on Large-Scale Sports Analytics, August 2016.

Jenna Wiens, John Guttag, and Eric Horvitz. Patient Risk Stratification with Time-Varying Parameters: A Multitask Learning Approach. JMLR, April 2016.

Jenna Wiens and Byron C Wallace. Editorial: Special Issue on Machine Learning for Health and Medicine. Machine Learning, March 2016.

Avery McIntyre, Joel Brooks, John Guttag, and Jenna Wiens. Recognizing and Analyzing Ball Screen Defense in the NBA. Sloan Sports Analytics Conference, March 2016.

Abhishek Bafna and Jenna Wiens. Automated Feature Learning: Mining Unstructured Data for Useful Abstractions. ICDM, November 2015.

Abhishek Bafna and Jenna Wiens. Learning Useful Abstractions from the Web. AMIA Annual Symposium, November 2015.

Sai R. Gouravajhala, Sree Sesha Aravind Vadrevu, Matthew Hicks, Jenna Wiens, and Kevin Fu. An LED Blink is Worth a Thousand Packets: Inferring a Networked Device’s Activity from its LED Blinks. USENIX Summit on Information Technologies for Health (poster), August 2015.

Devendra Goyal, Zeeshan Syed, and Jenna Wiens. Predicting Disease Progression in Alzheimer’s Disease. MUCMD, August 2015.

Jenna Wiens, Wayne N. Campbell, Ella S. Franklin, John V. Guttag, and Eric Horvitz. Learning Data-Driven Patient Risk Stratification Models for Clostridium difficile, Open Forum Infectious Diseases, July 2014.

Jenna Wiens. Learning to Prevent Healthcare-Associated Infections: Leveraging Data Across Time and Space to Improve Local Predictions, Ph.D. Thesis, MIT, May 2014.

Armand McQueen, Jenna Wiens, and John Guttag. Automatically Recognizing On-Ball Screens. Sloan Sports Analytics Conference, February 2014.

Jenna Wiens, John Guttag, Eric Horvitz. A Study in Transfer Learning: Leveraging Data from Multiple Hospitals to Enhance Hospital-Specific Predictions. Journal of the American Medical Informatics Association (JAMIA), January 2014.

Jenna Wiens, Guha Balakrishnan, Joel Brooks, John Guttag. To Crash or Not to Crash: A quantitative look at the relationship between offensive rebounding and transition defense in the NBA. Sloan Sports Analytics Conference, March 2013.

Jenna Wiens, Eric Horvitz, John V. Guttag. Patient Risk Stratification for Hospital-Associated C. diff as a Time-Series Classification Task. Neural Information Processing Systems (NIPS), December 2012. [Video]

Jenna Wiens, John V. Guttag, Eric Horvitz. Learning Evolving Patient Risk Processes for C. diff Colonization. ICML Workshop on Clinical Data Analysis, June 2012. [Slides]

Jenna Wiens, John V. Guttag, Eric Horvitz. On the Promise of Topic Models for Abstracting Complex Medical Data: A Study of Patients and their Medications. NIPS Workshop on Personalized Medicine, December 2011.

Jenna Wiens and John Guttag. Patient-Specific Ventricular Beat Classification without Patient-Specific Expert Knowledge: A Transfer Learning Approach. IEEE EMBS Conference, September 2011.

Jenna Wiens and John Guttag. Active Learning Applied to Patient-Adaptive Heartbeat Classification. Neural Information Processing Systems (NIPS), December 2010.

Jenna Wiens, Machine Learning for Ectopic Beat Classification, Master’s thesis, MIT, May 2010.

Jenna Wiens and John Guttag. Patient-Adaptive Ectopic Beat Classification using Active Learning. Computing in Cardiology (CinC), September 2010.