TY - JOUR
T1 - Multiscale PHATE identifies multimodal signatures of COVID-19
AU - Yale IMPACT Team
AU - Kuchroo, Manik
AU - Huang, Jessie
AU - Wong, Patrick
AU - Grenier, Jean Christophe
AU - Shung, Dennis
AU - Tong, Alexander
AU - Lucas, Carolina
AU - Klein, Jon
AU - Burkhardt, Daniel B.
AU - Gigante, Scott
AU - Godavarthi, Abhinav
AU - Rieck, Bastian
AU - Israelow, Benjamin
AU - Simonov, Michael
AU - Mao, Tianyang
AU - Oh, Ji Eun
AU - Silva, Julio
AU - Takahashi, Takehiro
AU - Odio, Camila D.
AU - Casanovas-Massana, Arnau
AU - Fournier, John
AU - Obaid, Abeer
AU - Moore, Adam
AU - Lu-Culligan, Alice
AU - Nelson, Allison
AU - Brito, Anderson
AU - Nunez, Angela
AU - Martin, Anjelica
AU - Wyllie, Anne L.
AU - Watkins, Annie
AU - Park, Annsea
AU - Venkataraman, Arvind
AU - Geng, Bertie
AU - Kalinich, Chaney
AU - Vogels, Chantal B.F.
AU - Harden, Christina
AU - Todeasa, Codruta
AU - Jensen, Cole
AU - Kim, Daniel
AU - McDonald, David
AU - Shepard, Denise
AU - Courchaine, Edward
AU - White, Elizabeth B.
AU - Song, Eric
AU - Silva, Erin
AU - Kudo, Eriko
AU - DeIuliis, Giuseppe
AU - Wang, Haowei
AU - Rahming, Harold
AU - Fauver, Joseph
N1 - Funding Information:
We thank the Mila COVID-19 task force for fruitful discussions and feedback during the conception, development and application of the methods presented here. This study was supported by the Beatrice Kleinberg Neuwirth Fund; the Sendas Family Fund, Yale School of Public Health; and the Department of Internal Medicine at the Yale School of Medicine. This work was partially funded by the Institute for Data Valorisation: IVADO Professor funds (G.W.) and COVID-19 Rapid Response grant CVD19-030 (J.G.H.); the Montreal Heart Institute Foundation (J.G.H.); the Canadian Institute for Advanced Research (Canada CIFAR AI Chair) and the Natural Sciences and Engineering Research Council of Canada (NSERC Discovery grant 03267) (G.W.); Merck Investigator Studies Program 60293 (S.F.); the Chan-Zuckerberg Initiative (grants CZF2019-182702 and CZF2019-002440) (S.K.); the National Institutes of Health grants 1F30AI157270-01 (M.K.), R01GM135929 (G.W., M.J.H. and S.K.), R01GM130847 (G.W. and S.K.), R01AI157488 and K23MH118999 (S.F.), 1K23DK125718-01A1 (D.S.) and R01DK113191, P30DK079310 and R01HS027626 (F.P.W.); the National Science Foundation (grant DMS1845856) (M.J.H.) and NSF CAREER grant 2047856 (S.K.); and the Sloan Fellowship FG-2021-15883 (S.K.). The content provided here is solely the responsibility of the authors and does not necessarily represent the official views of the funding agencies.
Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer Nature America, Inc.
PY - 2022/5
Y1 - 2022/5
N2 - As the biomedical community produces datasets that are increasingly complex and high dimensional, there is a need for more sophisticated computational tools to extract biological insights. We present Multiscale PHATE, a method that sweeps through all levels of data granularity to learn abstracted biological features directly predictive of disease outcome. Built on a coarse-graining process called diffusion condensation, Multiscale PHATE learns a data topology that can be analyzed at coarse resolutions for high-level summarizations of data and at fine resolutions for detailed representations of subsets. We apply Multiscale PHATE to a coronavirus disease 2019 (COVID-19) dataset with 54 million cells from 168 hospitalized patients and find that patients who die show CD16hiCD66blo neutrophil and IFN-γ+ granzyme B+ Th17 cell responses. We also show that population groupings from Multiscale PHATE directly fed into a classifier predict disease outcome more accurately than naive featurizations of the data. Multiscale PHATE is broadly generalizable to different data types, including flow cytometry, single-cell RNA sequencing (scRNA-seq), single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq), and clinical variables.
AB - As the biomedical community produces datasets that are increasingly complex and high dimensional, there is a need for more sophisticated computational tools to extract biological insights. We present Multiscale PHATE, a method that sweeps through all levels of data granularity to learn abstracted biological features directly predictive of disease outcome. Built on a coarse-graining process called diffusion condensation, Multiscale PHATE learns a data topology that can be analyzed at coarse resolutions for high-level summarizations of data and at fine resolutions for detailed representations of subsets. We apply Multiscale PHATE to a coronavirus disease 2019 (COVID-19) dataset with 54 million cells from 168 hospitalized patients and find that patients who die show CD16hiCD66blo neutrophil and IFN-γ+ granzyme B+ Th17 cell responses. We also show that population groupings from Multiscale PHATE directly fed into a classifier predict disease outcome more accurately than naive featurizations of the data. Multiscale PHATE is broadly generalizable to different data types, including flow cytometry, single-cell RNA sequencing (scRNA-seq), single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq), and clinical variables.
UR - http://www.scopus.com/inward/record.url?scp=85127383099&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85127383099&partnerID=8YFLogxK
U2 - 10.1038/s41587-021-01186-x
DO - 10.1038/s41587-021-01186-x
M3 - Article
C2 - 35228707
AN - SCOPUS:85127383099
VL - 40
SP - 681
EP - 691
JO - Nature Biotechnology
JF - Nature Biotechnology
SN - 1087-0156
IS - 5
ER -