IOWarp

Bridging the gap between HPC and storage.

As the creator and co-principal investigator of this $5M NSF-funded project, I’m working to improve scientific data management. IOWarp aims to enhance how we handle data in modern scientific workflows, especially those involving AI. This project builds on my previous work in storage systems and I/O optimization, focusing on practical solutions for current challenges in scientific computing.


Research Vision & Innovation

IOWarp emerged from a critical observation: modern scientific workflows, particularly those integrating AI, are severely constrained by traditional data management approaches. I lead a team of researchers across Illinois Institute of Technology, the HDF Group, and the University of Utah, and together we are developing a comprehensive platform that:

  • Bridges multiple worlds: seamlessly integrates HPC, big data, and AI workflows
  • Enables intelligence: incorporates LLM-driven data exploration with WarpGPT
  • Optimizes performance: leverages advanced hardware like CXL and GPUDirect
  • Ensures adaptability: provides a flexible, plugin-based architecture

Technical Breakthroughs

Building on the success of the Hermes I/O buffering system, I designed several key innovations in IOWarp:

  • Content Assimilation Engine: a novel approach for unifying diverse data formats
  • Advanced storage integration: direct support for emerging storage technologies
  • ML-guided data placement: intelligent data movement across storage tiers
  • Content Exploration Interface: natural language-driven data analytics

System Architecture

The IOWarp architecture, which I conceptualized and developed with my team, consists of four major components:

  • Content Assimilation Engine (CAE): transforms diverse data formats into a unified representation
  • Content Transfer Engine (CTE): manages efficient data movement across storage tiers
  • Content Exploration Interface (CEI): provides LLM-powered data discovery capabilities
  • Platform Plugins Interface (PPI): enables seamless integration with external services
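
To make the division of labor among the four components concrete, here is a minimal toy sketch in Python. It is not IOWarp's API: every name in it (Record, Tier, register_parser, assimilate, place, explore) is hypothetical and invented for illustration, the placement rule is a plain "first tier with free space" policy rather than the ML-guided one, and a keyword match stands in for LLM-powered discovery.

```python
"""Toy illustration of the four component roles; none of this is IOWarp's real API."""
import csv
import io
import json
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Record:
    """Hypothetical unified representation produced by the assimilation step."""
    name: str
    fields: Dict[str, str]
    size_bytes: int


# PPI role (assumed): a plugin registry mapping a format name to a parser.
PARSERS: Dict[str, Callable[[str, str], Record]] = {}


def register_parser(fmt: str):
    """Decorator that registers a parser plugin for one format."""
    def wrap(fn):
        PARSERS[fmt] = fn
        return fn
    return wrap


@register_parser("json")
def parse_json(name: str, text: str) -> Record:
    data = json.loads(text)
    return Record(name, {k: str(v) for k, v in data.items()}, len(text.encode()))


@register_parser("csv")
def parse_csv(name: str, text: str) -> Record:
    # Treat the first data row of a headed CSV as the record's fields.
    row = next(csv.DictReader(io.StringIO(text)))
    return Record(name, dict(row), len(text.encode()))


def assimilate(name: str, fmt: str, text: str) -> Record:
    """CAE role: turn a format-specific payload into a unified Record."""
    return PARSERS[fmt](name, text)


@dataclass
class Tier:
    name: str
    capacity_bytes: int
    used_bytes: int = 0
    records: List[Record] = field(default_factory=list)


def place(record: Record, tiers: List[Tier]) -> str:
    """CTE role: put the record in the first (fastest) tier with free space."""
    for tier in tiers:
        if tier.used_bytes + record.size_bytes <= tier.capacity_bytes:
            tier.records.append(record)
            tier.used_bytes += record.size_bytes
            return tier.name
    raise RuntimeError("no tier has room for " + record.name)


def explore(tiers: List[Tier], keyword: str) -> List[str]:
    """CEI role: keyword match standing in for LLM-powered discovery."""
    return [
        r.name
        for t in tiers
        for r in t.records
        if any(keyword in k or keyword in v for k, v in r.fields.items())
    ]


# Tiers are listed fastest first; capacities are made-up numbers.
tiers = [Tier("dram", capacity_bytes=30), Tier("nvme", capacity_bytes=1000)]
a = assimilate("run1", "json", '{"temp": 300}')
b = assimilate("run2", "csv", "temp,pressure\n310,1.2\n")
placed = [place(a, tiers), place(b, tiers)]
hits = explore(tiers, "pressure")
```

In this sketch the small JSON record fits in the fast tier while the CSV record spills to the slower one, and the query for "pressure" finds only the CSV record. Supporting a new format means registering one more parser, which is the kind of extensibility a plugin-based design is meant to provide.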

Impact on Scientific Computing

Under my leadership, IOWarp is already demonstrating significant impact across several scientific domains:

  • Materials science: accelerating X-ray tomography analysis workflows by 7x
  • Climate modeling: enabling real-time data analysis for climate simulations
  • AI/ML research: supporting efficient model training and inference operations
  • Bioinformatics: streamlining large-scale genomic data processing

Team Development & Mentorship

A core aspect of my leadership in this project involves mentoring the next generation of researchers:

  • 2 PhD students specializing in storage systems and AI
  • 1 postdoctoral researcher in advanced data management
  • 3 master's students as research assistants in system development
  • Fostering collaboration with industry partners

Project Resources

  • 💻 Source code: IOWarp framework
  • 📖 Documentation: comprehensive guides and API references
  • 📚 Educational materials: training modules and tutorials
  • 🛠️ Development tools: CI/CD pipeline and testing infrastructure

Acknowledgements 🙏

This material is based upon work supported by the National Science Foundation. I thank my collaborators at the HDF Group and the University of Utah for their invaluable contributions.


For inquiries about collaboration opportunities or to learn more about the project, please feel free to reach out!
