IOWarp
Bridging the gap between HPC and storage.
As the creator and co-principal investigator of this $5M NSF-funded project, I am leading an effort to rethink scientific data management. IOWarp reimagines how data is handled in the era of AI-driven scientific discovery, building on my decade of research in storage systems and I/O optimization.
🎧 Audio Overview
A deep dive into IOWarp's vision, architecture, and impact on scientific computing.
Duration: 15:54
Research Vision & Innovation 💡
IOWarp emerged from a critical observation: modern scientific workflows, particularly those integrating AI, are severely constrained by traditional data management approaches. I lead a team of researchers across Illinois Institute of Technology, The HDF Group, and the University of Utah, and together we are developing a comprehensive platform that:
- Bridges multiple worlds: seamlessly integrates HPC, big data, and AI workflows
- Enables intelligence: incorporates LLM-driven data exploration with WarpGPT
- Optimizes performance: leverages advanced hardware like CXL and GPUDirect
- Ensures adaptability: provides a flexible, plugin-based architecture
Technical Breakthroughs 🔧
Building on my previous success with the Hermes I/O buffering system, I designed several key innovations in IOWarp:
- Content Assimilation Engine: a novel approach for unifying diverse data formats
- Advanced storage integration: direct support for emerging storage technologies
- ML-guided data placement: intelligent data movement across storage tiers
- Content Exploration Interface: natural language-driven data analytics
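To make the idea of data placement across storage tiers concrete, here is a minimal Python sketch. It is not IOWarp code: the `Tier` class, the `place` function, and the `hot_threshold` parameter are hypothetical names chosen for illustration, and a fixed threshold on a predicted access count stands in for the learned model that an ML-guided policy would use.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Tier:
    """Hypothetical storage tier with a fixed capacity in bytes."""
    name: str
    capacity_bytes: int
    used_bytes: int = 0

    def has_room(self, size_bytes: int) -> bool:
        return self.used_bytes + size_bytes <= self.capacity_bytes


def place(size_bytes: int, predicted_accesses: float, tiers: list[Tier],
          hot_threshold: float = 10.0) -> str:
    """Pick a tier for one data object and record the space it uses.

    `tiers` is ordered fastest first. Data predicted to be hot tries the
    fastest tier first; cold data starts from the slowest tier. The
    threshold is an illustrative stand-in for a learned predictor.
    """
    order = tiers if predicted_accesses >= hot_threshold else list(reversed(tiers))
    for tier in order:
        if tier.has_room(size_bytes):
            tier.used_bytes += size_bytes
            return tier.name
    raise RuntimeError("no tier has enough free capacity")


# Example: a small fast tier in front of a larger slow tier.
tiers = [Tier("nvme", capacity_bytes=100), Tier("hdd", capacity_bytes=1000)]
first = place(80, predicted_accesses=50.0, tiers=tiers)   # hot, fits in nvme
second = place(80, predicted_accesses=50.0, tiers=tiers)  # hot, nvme is full
third = place(10, predicted_accesses=1.0, tiers=tiers)    # cold, goes to hdd
```

In a real system the predicted access count would come from a model trained on I/O traces, and placement would also weigh bandwidth, latency, and eviction cost; the sketch only shows where such a prediction plugs into the decision.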
System Architecture
The IOWarp architecture, which I conceptualized and developed with my team, consists of four major components:
- Content Assimilation Engine (CAE): transforms diverse data formats into a unified representation
- Content Transfer Engine (CTE): manages efficient data movement across storage tiers
- Content Exploration Interface (CEI): provides LLM-powered data discovery capabilities
- Platform Plugins Interface (PPI): enables seamless integration with external services
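The way the four components fit together can be sketched with a toy Python pipeline. This is an illustration under stated assumptions, not the IOWarp API: every class and function name below (`PluginRegistry`, `Assimilator`, `TieredStore`, `Explorer`, `parse_csv_record`) is hypothetical, the sample records are invented, and a plain substring search stands in for LLM-powered discovery.

```python
import json
from typing import Callable, Dict, Iterator, List, Optional, Tuple


class PluginRegistry:
    """PPI stand-in: maps a format name to a parser plugin."""

    def __init__(self) -> None:
        self._parsers: Dict[str, Callable[[str], dict]] = {}

    def register(self, fmt: str, parser: Callable[[str], dict]) -> None:
        self._parsers[fmt] = parser

    def parser_for(self, fmt: str) -> Callable[[str], dict]:
        if fmt not in self._parsers:
            raise ValueError(f"no plugin registered for format {fmt!r}")
        return self._parsers[fmt]


class Assimilator:
    """CAE stand-in: turns raw text in any registered format into a dict."""

    def __init__(self, plugins: PluginRegistry) -> None:
        self._plugins = plugins

    def assimilate(self, fmt: str, raw: str) -> dict:
        return self._plugins.parser_for(fmt)(raw)


class TieredStore:
    """CTE stand-in: keeps records in named tiers, searched in the order given."""

    def __init__(self, tier_names: List[str]) -> None:
        self._tiers: Dict[str, Dict[str, dict]] = {name: {} for name in tier_names}

    def put(self, key: str, record: dict, tier: str) -> None:
        self._tiers[tier][key] = record

    def get(self, key: str) -> Optional[dict]:
        for records in self._tiers.values():
            if key in records:
                return records[key]
        return None

    def items(self) -> Iterator[Tuple[str, dict]]:
        for records in self._tiers.values():
            yield from records.items()


class Explorer:
    """CEI stand-in: case-insensitive substring search, not an LLM."""

    def __init__(self, store: TieredStore) -> None:
        self._store = store

    def search(self, term: str) -> List[str]:
        term = term.lower()
        return sorted(
            key for key, record in self._store.items()
            if any(term in str(value).lower() for value in record.values())
        )


def parse_csv_record(raw: str) -> dict:
    """Parse a two-line CSV snippet: a header line, then one value line."""
    header, values = raw.strip().splitlines()
    return dict(zip(header.split(","), values.split(",")))


# Wire the pieces together with two made-up records in different formats.
plugins = PluginRegistry()
plugins.register("json", json.loads)
plugins.register("csv", parse_csv_record)

cae = Assimilator(plugins)
store = TieredStore(["fast", "slow"])
store.put("scan-1", cae.assimilate("json", '{"sample": "alloy", "kind": "tomography"}'), tier="fast")
store.put("run-7", cae.assimilate("csv", "model,region\nCESM,arctic"), tier="slow")

cei = Explorer(store)
matches = cei.search("alloy")
```

The point of the sketch is the separation of concerns: new formats arrive as plugins, the assimilation step hides format differences from everything downstream, and exploration only ever sees the unified representation.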
Impact on Scientific Computing 🌍
Under my leadership, IOWarp is already demonstrating significant impact across various scientific domains:
- Materials science: accelerating X-ray tomography analysis workflows by 7x
- Climate modeling: enabling real-time data analysis for climate simulations
- AI/ML research: supporting efficient model training and inference operations
- Bioinformatics: streamlining large-scale genomic data processing
Team Development & Mentorship 👥
A core aspect of my leadership in this project involves mentoring the next generation of researchers:
- 2 PhD students specializing in storage systems and AI
- 1 postdoctoral researcher in advanced data management
- 3 master's students as research assistants in system development
- Fostering collaboration with industry partners
Project Resources 🛠️
- Source code: IOWarp framework
- Documentation: comprehensive guides and API references
- Educational materials: training modules and tutorials
- Development tools: CI/CD pipeline and testing infrastructure
Acknowledgements 🙏
This material is based upon work supported by the National Science Foundation. I thank my collaborators at The HDF Group and the University of Utah for their invaluable contributions.
For inquiries about collaboration opportunities or to learn more about the project, please feel free to reach out!