COEUS

Accelerating scientific insights using enriched metadata.

COEUS is a partnership with Sandia National Laboratories (SNL) and Oak Ridge National Laboratory (ORNL) in which our research team investigates advanced metadata management techniques to accelerate complex queries on scientific data while optimizing data placement across storage hierarchies. As a co-PI on this DOE ASCR-funded project, I lead the storage and data placement research thrust, where we develop novel approaches for intelligent data movement and enhanced metadata management.

Research Vision & Leadership

Working closely with Dr. Jay Lofstead (SNL) and Dr. Scott Klasky (ORNL), we identified critical challenges in scientific data management that led to the COEUS project. Our team at IIT focuses on developing innovative solutions for:

  • Storage-driven data movement: novel techniques for intelligent data placement
  • ML-guided optimization: advanced prediction models for data access patterns
  • Metadata enhancement: new approaches for derived quantity management
  • Hierarchical storage management: efficient use of modern storage tiers

Technical Innovations

Under our team’s research direction, several breakthrough technologies have emerged:

  • Context-aware active storage: a novel framework adapting storage behavior based on application context, developed by PhD candidate Jaime Cernuda
  • Global file heatmaps: an innovative system for tracking cross-process data access patterns, implemented by PhD student Luke Logan
  • ML-based prediction engine: advanced models achieving over 90% accuracy in predicting data access patterns
  • Adaptive data movement: dynamic policies responding to changing workload characteristics
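
To make the heatmap idea concrete, the sketch below shows one way per-process access records could be merged into a global, per-block heat count and then used to rank blocks for placement across storage tiers. It is an illustrative sketch only, not the COEUS implementation: the 1 MiB block granularity, the function names, the tier names, and the greedy hottest-first policy are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable, List, Optional, Tuple

BLOCK_SIZE = 1 << 20  # assumed heatmap granularity: 1 MiB blocks

Access = Tuple[str, int, int]   # (file path, byte offset, byte length)
BlockKey = Tuple[str, int]      # (file path, block index)


def local_heatmap(accesses: Iterable[Access]) -> Counter:
    """Count how often one process touches each (file, block)."""
    heat: Counter = Counter()
    for path, offset, length in accesses:
        if length <= 0:
            continue  # ignore empty accesses
        first = offset // BLOCK_SIZE
        last = (offset + length - 1) // BLOCK_SIZE
        for block in range(first, last + 1):
            heat[(path, block)] += 1
    return heat


def global_heatmap(per_process: Iterable[Counter]) -> Counter:
    """Merge per-process heatmaps by summing counts per block."""
    total: Counter = Counter()
    for heat in per_process:
        total.update(heat)
    return total


def place_blocks(
    heat: Counter, tiers: List[Tuple[str, Optional[int]]]
) -> Dict[BlockKey, str]:
    """Greedy placement: hottest blocks go to the earliest (fastest) tier.

    Each tier is (name, capacity in blocks); a capacity of None means
    unbounded. Blocks that fit in no tier are left out of the result.
    """
    # Sort by descending heat, breaking ties by key for determinism.
    ranked = sorted(heat, key=lambda key: (-heat[key], key))
    placement: Dict[BlockKey, str] = {}
    start = 0
    for name, capacity in tiers:
        end = len(ranked) if capacity is None else start + capacity
        for key in ranked[start:end]:
            placement[key] = name
        start = end
    return placement
```

Calling `global_heatmap` on the output of `local_heatmap` for each process yields a single view of which blocks are shared hot spots, something no individual process can see on its own. A production system would also need to age counts over time and account for the cost of moving data, both of which this sketch omits.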

Mentorship & Team Development

The success of COEUS relies heavily on the dedication and innovation of our outstanding research team:

  • PhD students: leading core research thrusts in machine learning and storage optimization
  • Post-doctoral researchers: bridging theoretical foundations with practical implementations
  • Visiting researchers: contributing diverse perspectives from partner institutions
  • Undergraduate researchers: gaining valuable exposure to cutting-edge research

Impact on Scientific Applications

Through collaborative efforts with domain scientists at DOE facilities, our research has enhanced several critical applications:

  • Fusion research: supporting XGC with efficient data query capabilities
  • Particle physics: optimizing I/O performance for WarpX
  • Climate modeling: enabling complex queries on large-scale climate data
  • Scientific visualization: accelerating data access for visualization tools

Knowledge Dissemination

Our team actively shares research findings through:

  • Guest lectures at partner institutions
  • Technical workshops at major conferences
  • Open-source software releases
  • Peer-reviewed publications

Project Resources

The team maintains and continues to develop the core framework:

Acknowledgements

This research is supported by the U.S. Department of Energy, Office of Science, under Award Number DE-SC0023386. We are grateful to our collaborators at Sandia National Laboratories and Oak Ridge National Laboratory for this partnership in advancing the state of scientific data management. Special thanks to our talented students and post-doctoral researchers whose dedication and innovation drive this project forward.


Interested in research opportunities or potential collaborations? Feel free to reach out!

Related Publications

2024

  1. Jaime Cernuda, Jie Ye, Anthony Kougkas, and Xian-He Sun
    In Proceedings of the 53rd International Conference on Parallel Processing, Aug 2024
  2. Jaime Cernuda, Luke Logan, Ana Gainaru, Scott Klasky, Jay Lofstead, Anthony Kougkas, and Xian-He Sun
    In Proceedings of the 24th International Symposium on Cluster, Cloud and Internet Computing, May 2024