Section: Overall Objectives
Objective: efficient support for scalable data-intensive computing
Our research activities focus on data-intensive high-performance applications that need to handle:
- massive data BLOBs (Binary Large OBjects), on the order of Terabytes,
- stored on a large number of nodes, thousands to tens of thousands,
- accessed under heavy concurrency by a large number of processes, thousands to tens of thousands at a time,
- with a relatively fine access grain, on the order of Megabytes.
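To give a sense of the scales involved, the following minimal sketch works through the arithmetic for one hypothetical configuration. The specific sizes (a 1 TiB BLOB, a 1 MiB access grain, 1000 storage nodes), the fixed-size chunking, and the round-robin placement are all assumptions chosen for illustration; the text above does not prescribe them.

```python
# Hypothetical sizing sketch; all constants and the round-robin placement are
# illustrative assumptions, not values or policies taken from the text.
BLOB_SIZE = 1 << 40      # 1 TiB BLOB
CHUNK_SIZE = 1 << 20     # 1 MiB access grain
NUM_NODES = 1000         # storage nodes


def chunk_count(blob_size: int, chunk_size: int) -> int:
    """Number of fixed-size chunks needed to cover the BLOB (ceiling division)."""
    return -(-blob_size // chunk_size)


def chunks_for_range(offset: int, length: int, chunk_size: int) -> range:
    """Indices of the chunks touched by the byte range [offset, offset + length)."""
    if length <= 0:
        return range(0)
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    return range(first, last + 1)


def node_for_chunk(chunk_index: int, num_nodes: int) -> int:
    """Assumed round-robin placement of chunks over storage nodes."""
    return chunk_index % num_nodes


total = chunk_count(BLOB_SIZE, CHUNK_SIZE)
print(total)                      # 1048576 chunks
print(total // NUM_NODES)         # 1048 chunks per node (rounded down)
touched = chunks_for_range(5 * CHUNK_SIZE + 10, 2 * CHUNK_SIZE, CHUNK_SIZE)
print(list(touched))              # [5, 6, 7]
print([node_for_chunk(i, NUM_NODES) for i in touched])  # [5, 6, 7]
```

Under these assumptions a single BLOB breaks down into about a million chunks, and even a small unaligned access of two chunk sizes touches three chunks on three different nodes, which is why metadata lookup and concurrent access to chunks matter at this scale.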
Examples of such applications are:
- Massively parallel cloud data-mining applications (e.g., MapReduce-based data analysis);
- Advanced Platform-as-a-Service (PaaS) cloud data services requiring efficient data sharing under heavy concurrency;
- Advanced concurrency-optimized, versioning-oriented cloud services for virtual-machine-image storage and management at the IaaS (Infrastructure-as-a-Service) level;
- Scalable storage solutions for I/O-intensive HPC simulations on post-Petascale architectures;
- Storage and I/O stacks for big-data analysis in applications that manipulate structured scientific data (e.g., very large multi-dimensional arrays).