COEUS
Accelerating scientific insights using enriched metadata.
Coeus is a partnership with Sandia National Laboratories (SNL) and Oak Ridge National Laboratory (ORNL) in which our research team investigates advanced metadata management techniques to accelerate complex queries on scientific data while optimizing data placement across storage hierarchies. As a co-PI on this DOE ASCR-funded project, I lead the storage and data placement research thrust, where we develop novel approaches for intelligent data movement and enhanced metadata management.
Research Vision & Leadership
Working closely with Dr. Jay Lofstead (SNL) and Dr. Scott Klasky (ORNL), we identified critical challenges in scientific data management that led to the Coeus project. Our team at IIT focuses on developing innovative solutions for:
- Storage-driven data movement: novel techniques for intelligent data placement
- ML-guided optimization: advanced prediction models for data access patterns
- Metadata enhancement: new approaches for derived quantity management
- Hierarchical storage management: efficient use of modern storage tiers
Technical Innovations
Under our team's research direction, several breakthrough technologies have emerged:
- Context-aware active storage: a novel framework adapting storage behavior based on application context, developed by PhD candidate Jaime Cernuda
- Global file heatmaps: an innovative system for tracking cross-process data access patterns, implemented by PhD student Luke Logan
- ML-based prediction engine: advanced models achieving over 90% accuracy in predicting data access patterns
- Adaptive data movement: dynamic policies responding to changing workload characteristics
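To make the heatmap idea above concrete, here is a minimal sketch of how per-process access counts might be merged into a global view and used to pick blocks for a faster tier. It is an illustration only, not the Coeus implementation: the function names, the `(path, offset, length)` trace format, the 4096-byte block size, and the capacity-based placement rule are all assumptions made for this example.

```python
from collections import Counter

BLOCK_SIZE = 4096  # hypothetical block granularity in bytes (assumption)


def local_heatmap(accesses, block_size=BLOCK_SIZE):
    """Count accesses per (file, block) for a single process.

    `accesses` is an iterable of (path, offset, length) tuples,
    an assumed trace format chosen for this sketch.
    """
    heat = Counter()
    for path, offset, length in accesses:
        if length <= 0:
            continue  # ignore empty accesses
        first = offset // block_size
        last = (offset + length - 1) // block_size
        for block in range(first, last + 1):
            heat[(path, block)] += 1
    return heat


def global_heatmap(local_heatmaps):
    """Merge per-process heatmaps by summing their counts."""
    total = Counter()
    for heat in local_heatmaps:
        total.update(heat)
    return total


def place_blocks(heat, fast_tier_capacity):
    """Assign the hottest blocks to the fast tier, the rest to the slow tier.

    Ties are broken by (path, block) so the result is deterministic.
    """
    ranked = sorted(heat.items(), key=lambda kv: (-kv[1], kv[0]))
    fast = [key for key, _ in ranked[:fast_tier_capacity]]
    slow = [key for key, _ in ranked[fast_tier_capacity:]]
    return fast, slow


# Two hypothetical processes touching overlapping regions of the same file.
rank0 = local_heatmap([("a.bp", 0, 4096), ("a.bp", 0, 8192)])
rank1 = local_heatmap([("a.bp", 4096, 4096), ("b.bp", 0, 1)])
merged = global_heatmap([rank0, rank1])
fast, slow = place_blocks(merged, fast_tier_capacity=2)
```

The point of merging before ranking is that a block touched lightly by many processes can be hotter globally than one touched heavily by a single process, which a per-process view alone would miss.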
Mentorship & Team Development
The success of Coeus relies heavily on the dedication and innovation of our outstanding research team:
- PhD students: leading core research thrusts in machine learning and storage optimization
- Post-doctoral researchers: bridging theoretical foundations with practical implementations
- Visiting researchers: contributing diverse perspectives from partner institutions
- Undergraduate researchers: gaining valuable exposure to cutting-edge research
Impact on Scientific Applications
Through collaborative efforts with domain scientists at DOE facilities, our research has enhanced several critical applications:
- Fusion research: supporting XGC with efficient data query capabilities
- Particle physics: optimizing I/O performance for WarpX
- Climate modeling: enabling complex queries on large-scale climate data
- Scientific visualization: accelerating data access for visualization tools
Knowledge Dissemination
Our team actively shares research findings through:
- Guest lectures at partner institutions
- Technical workshops at major conferences
- Open-source software releases
- Peer-reviewed publications
Project Resources
The team maintains and continues to develop the core framework:
Publications
- Hades: A Context-Aware Active Storage Framework for Accelerating Large-Scale Data Analysis
  Cernuda, J., Logan, L., Gainaru, A., Klasky, S., Lofstead, J., Kougkas, A., & Sun, X.-H.
  The 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 2024
  DOI: 10.1109/ccgrid59990.2024.00070
- To Derive or Not to Derive: I/O Libraries Take Charge of Derived Quantities Computation
  Gainaru, A., Eisenhauer, G., Podhorszki, N., Dulac, L., Gong, Q., Kougkas, A., Lofstead, J., Sun, X.-H., & Klasky, S.
  IEEE/SBC 36th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD), 2024
Acknowledgements
This research is supported by the U.S. Department of Energy, Office of Science, under award number DE-SC0023386. We are grateful to our collaborators at Sandia National Laboratories and Oak Ridge National Laboratory for this partnership in advancing the state of scientific data management. Special thanks to our talented students and post-doctoral researchers whose dedication and innovation drive this project forward.
Interested in research opportunities or potential collaborations? Feel free to reach out!