clusterMLD: An Efficient Hierarchical Clustering Method for Multivariate Longitudinal Data

Junyi Zhou, Ying Zhang, Wanzhu Tu

Research output: Contribution to journal › Article › peer-review

1 Scopus citation


Longitudinal data clustering is challenging because the grouping has to account for the similarity of individual trajectories in the presence of sparse and irregular times of observation. This paper puts forward a hierarchical agglomerative clustering method based on a dissimilarity metric that quantifies the cost of merging two distinct groups of curves, which are depicted by B-splines for the repeatedly measured data. Extensive simulations show that the proposed method has superior performance in determining the number of clusters, classifying individuals into the correct clusters, and in computational efficiency. Importantly, the method is not only suitable for clustering multivariate longitudinal data with sparse and irregular measurements but also for intensely measured functional data. Towards this end, we provide an R package for the implementation of such analyses. To illustrate the use of the proposed clustering method, two large clinical data sets from real-world clinical studies are analyzed.
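
The merging idea in the abstract can be illustrated with a minimal Python sketch. It is not the clusterMLD R package API and not necessarily the paper's exact dissimilarity metric. As an assumption, the cost of merging two groups is taken to be the increase in residual sum of squares when one B-spline curve per outcome is fitted to the pooled observations instead of one curve per group. The number of clusters is passed in as a fixed input rather than estimated, and the names `bspline_basis`, `group_rss`, `cluster`, and `simulate` are hypothetical helpers introduced only for this illustration.

```python
import numpy as np

def bspline_basis(t, knots, degree):
    """Evaluate every B-spline basis function at times t (Cox-de Boor recursion)."""
    knots = np.asarray(knots, dtype=float)
    # Keep t inside the half-open knot span so the right endpoint is covered.
    t = np.clip(np.asarray(t, dtype=float), knots[0], np.nextafter(knots[-1], knots[0]))
    # Degree 0: indicator of each knot interval.
    B = ((knots[:-1] <= t[:, None]) & (t[:, None] < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        nxt = np.zeros((len(t), len(knots) - 1 - d))
        for j in range(nxt.shape[1]):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                nxt[:, j] += (t - knots[j]) / left * B[:, j]
            if right > 0:
                nxt[:, j] += (knots[j + d + 1] - t) / right * B[:, j + 1]
        B = nxt
    return B

def group_rss(subjects, members, knots, degree):
    """Residual sum of squares after fitting one spline per outcome to pooled data."""
    t = np.concatenate([subjects[i][0] for i in members])
    y = np.vstack([subjects[i][1] for i in members])
    X = bspline_basis(t, knots, degree)
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return float(np.sum((y - X @ coef) ** 2))

def cluster(subjects, n_clusters, knots, degree=3):
    """Greedy agglomeration: repeatedly merge the pair with the smallest merge cost."""
    clusters = [[i] for i in range(len(subjects))]
    rss = [group_rss(subjects, c, knots, degree) for c in clusters]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                merged = group_rss(subjects, clusters[a] + clusters[b], knots, degree)
                # Assumed merge cost: extra misfit caused by sharing one curve.
                cost = merged - rss[a] - rss[b]
                if best is None or cost < best[0]:
                    best = (cost, a, b, merged)
        _, a, b, merged = best
        clusters[a] = clusters[a] + clusters[b]
        rss[a] = merged
        del clusters[b], rss[b]
    labels = np.empty(len(subjects), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

# Toy data: two outcomes per visit, 10 to 12 irregularly timed visits per subject.
rng = np.random.default_rng(0)
knots = np.array([0, 0, 0, 0, 0.5, 1, 1, 1, 1], dtype=float)  # clamped cubic, one interior knot

def simulate(group):
    bins = np.sort(rng.choice(12, size=rng.integers(10, 13), replace=False))
    t = (bins + rng.uniform(0.1, 0.9, size=len(bins))) / 12
    mean = [t, 1 - t] if group == 0 else [6 + t**2, t**3]
    y = np.column_stack(mean) + rng.uniform(-0.05, 0.05, size=(len(t), 2))
    return t, y

subjects = [simulate(g) for g in (0,) * 6 + (1,) * 6]
labels = cluster(subjects, n_clusters=2, knots=knots)
print(labels)
```

Each subject is a pair of an observation-time vector and a matrix with one column per outcome, so subjects do not need to share a time grid. The pairwise search refits the pooled spline for every candidate pair at every step, which keeps the sketch short but is far from the computational efficiency the abstract reports for the proposed method.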

Original language: English (US)
Pages (from-to): 1131-1144
Number of pages: 14
Journal: Journal of Computational and Graphical Statistics
Issue number: 3
State: Published - 2023

Keywords


  • B-splines
  • Dissimilarity metric
  • Functional data
  • Longitudinal data
  • Multiple outcomes

ASJC Scopus subject areas

  • Discrete Mathematics and Combinatorics
  • Statistics and Probability
  • Statistics, Probability and Uncertainty

