Abstract
We used STARE indexing to package partitioned data chunks from diverse datasets into netCDF files, distributed the files across a cluster of 16 lightweight nodes so that their placements are spatiotemporally co-aligned, and demonstrated several integrative analyses using netCDF parallel I/O and Python MPI. Single-user performance and scalability were comparable to, or better than, those of a parallel array database management system (ADBMS) such as SciDB. To achieve this performance and scalability, however, records of the node location and STARE index range of each data chunk, similar to SciDB's chunk maps, must be maintained and consulted by the I/O and analysis code to coordinate the analytic operations in parallel.
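The chunk-record lookup mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes, for illustration only, that each chunk's STARE coverage can be summarized as a single inclusive integer range; the names `ChunkRecord`, `chunks_overlapping`, and `plan_by_node`, the file names, and the index values are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class ChunkRecord(NamedTuple):
    """One chunk-map entry (hypothetical layout, not from the paper)."""
    node: int      # cluster node that stores the chunk
    path: str      # netCDF file holding the chunk
    index_lo: int  # lowest index value covered (assumed inclusive)
    index_hi: int  # highest index value covered (assumed inclusive)


def chunks_overlapping(chunk_map, query_lo, query_hi):
    """Return the records whose index range intersects [query_lo, query_hi]."""
    return [c for c in chunk_map
            if c.index_lo <= query_hi and query_lo <= c.index_hi]


def plan_by_node(chunk_map, query_lo, query_hi):
    """Group the matching files by node so each node reads only local data."""
    plan = defaultdict(list)
    for c in chunks_overlapping(chunk_map, query_lo, query_hi):
        plan[c.node].append(c.path)
    return dict(plan)


# Toy chunk map: two datasets partitioned identically, so chunks that cover
# the same index range are placed on the same node (co-alignment).
chunk_map = [
    ChunkRecord(0, "dataset_a_000.nc", 0, 99),
    ChunkRecord(0, "dataset_b_000.nc", 0, 99),
    ChunkRecord(1, "dataset_a_001.nc", 100, 199),
    ChunkRecord(1, "dataset_b_001.nc", 100, 199),
]

plan = plan_by_node(chunk_map, 50, 120)
```

In a parallel run, each process would look up its own node in `plan` and open only the listed files, which is how co-aligned placement lets an analysis across datasets proceed without moving chunks between nodes.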
Original language | English (US) |
---|---|
Pages | 10063-10066 |
Number of pages | 4 |
State | Published - 2019 |
Event | 39th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2019 - Yokohama, Japan (Jul 28 2019 → Aug 2 2019) |
Conference
Conference | 39th IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2019 |
---|---|
Country/Territory | Japan |
City | Yokohama |
Period | 7/28/19 → 8/2/19 |
Keywords
- Big Data
- Data-intensive analysis
- Interoperability
- Parallel processing
- Scalability
ASJC Scopus subject areas
- Computer Science Applications
- General Earth and Planetary Sciences