Hermes
Bridging the Gap Between HPC and Storage
Modern scientific applications are generating unprecedented amounts of data, pushing traditional storage systems to their limits. Hermes is my solution to this challenge: an intelligent middleware system that bridges high-performance computing and complex storage hierarchies.
🎧 Audio Overview
A deep dive into Hermes' architecture, capabilities, and impact on scientific computing.
Duration: 17:05
What Makes Hermes Special? 💡
Imagine your data storage as a multi-lane highway system. Some lanes are incredibly fast but expensive (like RAM), while others are slower but can handle more traffic (like traditional storage). Hermes acts as an intelligent traffic control system, automatically routing your data through the most efficient paths. Unlike traditional solutions that treat all storage the same, Hermes understands and adapts to different storage technologies, making it uniquely powerful for modern scientific computing.
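To make the traffic-control analogy concrete, here is a toy sketch of tier-aware placement. It is not Hermes' API or its actual placement algorithm: the `Tier` class, the `place` function, the tier names, and the capacities are all hypothetical, and the policy shown (fastest tier with enough free space) is only the simplest possible stand-in for the idea.

```python
from dataclasses import dataclass, field


@dataclass
class Tier:
    """One level of a storage hierarchy (hypothetical model, not Hermes' API)."""
    name: str
    capacity: int                      # bytes the tier can hold
    used: int = 0                      # bytes currently placed in the tier
    blobs: dict = field(default_factory=dict)

    def free(self) -> int:
        return self.capacity - self.used


def place(tiers, key, size):
    """Place a blob in the fastest tier that still has room.

    `tiers` must be ordered fastest first. Returns the chosen tier's name.
    """
    for tier in tiers:
        if tier.free() >= size:
            tier.blobs[key] = size
            tier.used += size
            return tier.name
    raise RuntimeError(f"no tier can hold {size} bytes for {key!r}")


# Assumed hierarchy: small fast tier first, large slow tier last.
hierarchy = [
    Tier("ram", capacity=100),
    Tier("nvme", capacity=1_000),
    Tier("pfs", capacity=1_000_000),
]

# Three 60-byte writes: the first fits in RAM, the rest spill to NVMe.
placements = [place(hierarchy, f"step-{i}", 60) for i in range(3)]
print(placements)  # ['ram', 'nvme', 'nvme']
```

A real system would also weigh bandwidth, access patterns, and eviction, which this sketch deliberately leaves out.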
Key Innovations
- Smart data management: using machine learning and advanced algorithms to automatically place data where it performs best
- Seamless integration: works with existing scientific tools through POSIX and HDF5 interfaces
- Adaptive performance: learns from your application’s behavior to continuously improve performance
- Distributed architecture: scales efficiently across computing clusters
Real-World Impact 🌍
Hermes isn’t just a research project. It is making a real difference in scientific discovery:
- 2-10x performance boost for common scientific workflows
- Adopted by major labs including PNNL, LLNL, and ORNL
- Powering discoveries in climate modeling, particle physics, and AI-driven research
Featured Applications
- DeepDriveMD: accelerating molecular dynamics simulations
- HACC: enabling faster cosmological simulations
- AI storm tracking: improving severe weather predictions
- LAMMPS: enhancing materials science research
Behind the Innovation
As the principal architect of Hermes, I led its development from initial concept through successful NSF funding and deployment at major research facilities. Working with The HDF Group, we’ve transformed Hermes into a thriving open-source project that’s continually evolving to meet the needs of the scientific computing community.
Join the Hermes Community 🤝
Hermes is designed as a community-driven project, and contributions are welcome from both academic and industrial researchers. We invite you to explore the source code on GitHub, where you can join our growing network of contributors, browse our documentation, and get involved in ongoing development.
Together, we can continue to push the boundaries of high-performance I/O, making complex storage systems more accessible and efficient for the entire HPC community.
Get Started 🔧
- Repository: GitHub - Hermes project
- Documentation: comprehensive guides and tutorials available in our wiki.
- Community forums: join discussions, share ideas, and collaborate on our Zulip channel.
Looking Forward
The future of scientific computing demands increasingly sophisticated data management solutions. Hermes continues to evolve, with ongoing development in areas like:
- Advanced machine learning integration
- Enhanced support for emerging storage technologies
- Expanded application integration
- Improved automation and self-optimization
Acknowledgements 🙏
The development of Hermes would not have been possible without the support of the National Science Foundation (NSF), whose funding played a crucial role in transforming preliminary research into a full-scale project. Additionally, our collaboration with The HDF Group was instrumental in refining the Hermes architecture and maintaining the open-source codebase, ensuring its robustness and continued growth.
Want to learn more about Hermes or discuss potential collaborations? Feel free to reach out!
Related Publications
2024
- cluster24 DaYu: Optimizing Distributed Scientific Workflows by Decoding Dataflow Semantics and Dynamics. In Proceedings of the IEEE International Conference on Cluster Computing, Sep 2024
- In Proceedings of the 53rd International Conference on Parallel Processing, Aug 2024
2023
- In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Nov 2023
2022
- In Proceedings of the 29th International Conference on High Performance Computing, Data, and Analytics, Dec 2022
- In Proceedings of the 22nd IEEE International Symposium on Cluster, Cloud and Internet Computing, May 2022
2021
- In Proceedings of the International Conference on Cluster Computing, Sep 2021
- In Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing, Jun 2021
2020
- bigdata20 Hreplica: a dynamic data replication engine with adaptive compression for multi-tiered storage. In Proceedings of the International Conference on Big Data, Dec 2020
- In Proceedings of the International Conference on Cluster Computing, Sep 2020
- ipdps20.1 Hfetch: Hierarchical data prefetching for scientific workflows in multi-tiered storage environments. In Proceedings of the International Parallel and Distributed Processing Symposium, Jul 2020
- In Proceedings of the International Parallel and Distributed Processing Symposium, Jul 2020
- In Proceedings of the International Journal of Computer Science and Technology, Jan 2020
2019
- In Proceedings of the 19th International Symposium on Cluster, Cloud and Grid Computing, May 2019
2018
- In Proceedings of the 25th International Conference on High Performance Computing, Dec 2018
- cluster18 Harmonia: An interference-aware dynamic I/O scheduler for shared non-volatile burst buffers. In Proceedings of the International Conference on Cluster Computing, Sep 2018
- In Proceedings of the 27th International Symposium on High-Performance Parallel and Distributed Computing, Jun 2018
2016
- In Proceedings of the 12th International Conference on e-Science, Jun 2016