A novel task-driven approach to scalable and distributed I/O services.
DTIO (DataTask I/O) is a groundbreaking framework that introduces a new I/O paradigm to support a wide variety of conflicting I/O workloads under a single scalable platform, ready for the convergence of high-performance computing (HPC), AI, and big data. As scientific applications become increasingly data-intensive, with diverse and often conflicting I/O requirements, DTIO provides a unified solution through an innovative task-based I/O approach.
What makes DTIO special?
DTIO acts as an intelligent mediator between different I/O workloads. The core innovation is the concept of DataTasks (DTs): a tuple of an operation and a pointer to data that allows applications to treat data as active objects capable of performing operations on themselves or on other DTs. This enables seamless integration of compute-centric and data-centric environments while providing robust I/O optimization capabilities.
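To make the abstraction concrete, here is a minimal sketch of how a DT might be represented in code. The names and fields (DataTask, Operation, the dependency list) are assumptions made for illustration and are not DTIO's actual API.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Illustrative sketch only: the real DTIO interfaces may differ.
// A DataTask (DT) pairs an operation with a pointer to the data it acts on,
// so the data itself can "carry" the work to be performed on it.
enum class Operation { kWrite, kRead, kTransform };

struct DataTask {
  Operation op;                       // what to do
  void* data;                         // pointer to the data the task operates on
  std::size_t size;                   // size of the data in bytes
  std::function<void(DataTask&)> fn;  // optional user-defined transformation
  std::vector<DataTask*> deps;        // other DTs this task composes with
};

int main() {
  // Composing DTs: a transform task that feeds a write task.
  double buffer[1024] = {};
  DataTask transform{Operation::kTransform, buffer, sizeof(buffer),
                     [](DataTask& dt) { (void)dt; /* e.g., filter or reduce in place */ }, {}};
  DataTask write{Operation::kWrite, buffer, sizeof(buffer), nullptr, {&transform}};
  // A DTIO-like runtime would schedule `transform` before `write`.
  (void)write;
  return 0;
}
```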
Behind the innovation
DTIO emerged from the recognition that modern scientific workflows require a diverse and often conflicting set of storage features and semantics. Our research shows that a unified approach is essential to address the challenges of data-intensive computing. The project, funded by the Department of Energy's Advanced Scientific Computing Research (DOE ASCR) program, aims to create a novel data movement infrastructure that can efficiently handle the convergence of HPC, big data, and AI workloads.
Key innovations
DataTask abstraction: a novel concept enabling atomic, distributed, and composable data transformations
QoS-aware scheduling: intelligent scheduling with asynchronous I/O capabilities (see the sketch after this list)
Unified I/O solution: seamless support for both legacy and modern I/O interfaces
High performance: achieves significant speedup on real scientific applications
Resilience: built-in fault tolerance and data lineage tracking
Active storage: programmable, locality-aware storage capabilities
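Below is a small, hypothetical sketch of what QoS-aware, asynchronous submission could look like from an application's perspective; the QoSHint structure and submit() call are invented for this example rather than taken from DTIO.

```cpp
#include <chrono>
#include <future>
#include <iostream>
#include <string>

// Illustrative sketch of QoS-aware, asynchronous task submission.
// The QoSHint fields and submit() are assumptions, not DTIO's interface.
struct QoSHint {
  int priority;                        // higher = more urgent
  std::chrono::milliseconds deadline;  // soft latency target
};

// Stand-in for handing a task off to an I/O scheduler.
std::future<bool> submit(const std::string& task_name, const QoSHint& hint) {
  return std::async(std::launch::async, [task_name, hint] {
    std::cout << "running " << task_name << " (priority " << hint.priority
              << ", deadline " << hint.deadline.count() << " ms)\n";
    return true;  // pretend the I/O completed successfully
  });
}

int main() {
  // A latency-sensitive checkpoint and a background analysis flush;
  // a QoS-aware scheduler would order them by priority and deadline.
  auto checkpoint = submit("checkpoint_write", {10, std::chrono::milliseconds(50)});
  auto flush = submit("analysis_flush", {1, std::chrono::milliseconds(5000)});
  checkpoint.get();
  flush.get();
  return 0;
}
```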
Real-world impact
DTIO is making significant contributions across various scientific domains:
Climate modeling: supporting applications like CM1 with efficient data analysis integration
Molecular dynamics: enabling seamless I/O for applications like LAMMPS
Astronomical data processing: accelerating workflows like Montage
Weather forecasting: enhancing I/O performance for WRF simulations
High-performance data analytics: bridging the gap between computing and data processing
Technical architecture
DTIO consists of several key components, sketched together after this list:
DataTask manager: handles DT specification and composition
QoS scheduler: manages DT scheduling and resource allocation
Asynchronous I/O engine: enables efficient overlapping of I/O operations
Resilience manager: provides fault tolerance and data lineage
Active storage layer: supports programmable storage capabilities
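The following sketch wires a few of these components together to show the intended flow of a request; every class and method name is a placeholder chosen to mirror the list above, not DTIO's real implementation.

```cpp
#include <iostream>
#include <string>

// Hypothetical component interfaces, for illustration only.
class DataTaskManager {  // handles DT specification and composition
 public:
  std::string compose(const std::string& spec) { return "dt:" + spec; }
};

class QoSScheduler {  // decides when and where a DT runs
 public:
  void schedule(const std::string& dt) { std::cout << "scheduling " << dt << "\n"; }
};

class AsyncIOEngine {  // overlaps I/O with computation
 public:
  void issue(const std::string& dt) { std::cout << "issuing async I/O for " << dt << "\n"; }
};

class ResilienceManager {  // records lineage so failed DTs can be replayed
 public:
  void record(const std::string& dt) { std::cout << "recording lineage of " << dt << "\n"; }
};

int main() {
  DataTaskManager manager;
  QoSScheduler scheduler;
  AsyncIOEngine engine;
  ResilienceManager resilience;

  // A write request flows through the stack as a DataTask.
  std::string dt = manager.compose("write(simulation_output)");
  scheduler.schedule(dt);
  engine.issue(dt);
  resilience.record(dt);
  return 0;
}
```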
Looking forward
DTIO continues to evolve, with exciting developments in:
Extended support for various I/O interfaces and patterns
Enhanced QoS-aware scheduling techniques
Advanced resilience and fault tolerance capabilities
Deeper integration with active storage technologies
Join the DTIO community
DTIO is an open-source project that welcomes contributions from both academic and industrial researchers.
This material is based upon work supported by the U.S. Department of Energy, Office of Science, under Award Number DE-SC0023386. This project is a collaborative effort between Illinois Institute of Technology and Argonne National Laboratory. I am grateful to my partners at ANL and the broader DOE community, whose expertise has been instrumental in advancing this project.
Interested in learning more about DTIO or discussing potential collaborations? Feel free to reach out!
Jie Ye, Jaime Cernuda, Neeraj Rajesh, Keith Bateman, Orcun Yildiz, Tom Peterka, Arnur Nigmetov, Dmitriy Morozov, Xian-He Sun, Anthony Kougkas, and Bogdan Nicolae
In Proceedings of the 53rd International Conference on Parallel Processing, Aug 2024
Scientific workflows increasingly need to train a DNN model in real-time during an experiment (e.g., using ground truth from a simulation), while using it at the same time for inferences. Instead of sharing the same model instance, the training (producer) and inference server (consumer) often use different model replicas that are kept synchronized. In addition to efficient I/O techniques to keep the model replica of the producer and consumer synchronized, there is another important trade-off: frequent model updates enhance inference quality but may slow down training; infrequent updates may lead to less precise inference results. To address these challenges, we introduce Viper: a new I/O framework designed to determine a near-optimal checkpoint schedule and accelerate the delivery of the latest model updates. Viper builds an inference performance predictor to identify the optimal checkpoint schedule to balance the trade-off between training slowdown and inference quality improvement. It also creates a memory-first model transfer engine to accelerate model delivery through direct memory-to-memory communication. Our experiments show that Viper can reduce the model update latency by ~9x using the GPU-to-GPU data transfer engine and ~3x using the DRAM-to-DRAM host data transfer. The checkpoint schedule obtained from Viper's predictor also demonstrates improved cumulative inference accuracy compared to the baseline of epoch-based solutions.
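As a rough illustration of the trade-off the paper describes (not Viper's actual predictor), the toy model below picks an update frequency by balancing checkpoint overhead against model staleness, using made-up costs:

```cpp
#include <iostream>

// Toy illustration of the checkpoint-frequency trade-off: more frequent
// model updates reduce staleness but add training overhead. All numbers
// below are invented for illustration only.
int main() {
  const double epoch_time_s = 100.0;     // assumed training time per epoch
  const double checkpoint_cost_s = 5.0;  // assumed cost per model update
  const double staleness_penalty = 2.0;  // assumed penalty per second of staleness

  double best_score = -1e18;
  int best_updates_per_epoch = 1;
  for (int updates = 1; updates <= 20; ++updates) {
    double slowdown = updates * checkpoint_cost_s;          // training slowdown
    double avg_staleness = epoch_time_s / (2.0 * updates);  // average model staleness
    double score = -(slowdown + staleness_penalty * avg_staleness);
    if (score > best_score) {
      best_score = score;
      best_updates_per_epoch = updates;
    }
  }
  std::cout << "toy model picks " << best_updates_per_epoch
            << " model updates per epoch\n";
  return 0;
}
```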
Traditionally, I/O systems have been developed within the confines of a centralized OS kernel. This led to monolithic and rigid storage systems that are limited by low development speed, expressiveness, and performance. Various assumptions are imposed, including reliance on the UNIX-file abstraction, the POSIX standard, and a narrow set of I/O policies. However, this monolithic design philosophy makes it difficult to develop and deploy new I/O approaches to satisfy the rapidly evolving I/O requirements of modern scientific applications. To this end, we propose LabStor: a modular and extensible platform for developing high-performance, customized I/O stacks. Single-purpose I/O modules (e.g., I/O schedulers) can be developed in the comfort of userspace and released as plug-ins, while end-users can compose these modules to form workload- and hardware-specific I/O stacks. Evaluations show that by switching to a fully modular design, tailored I/O stacks can yield performance improvements of up to 60% in various applications.
As the diversity of big data applications increases, their requirements diverge and often conflict with one another. Managing this diversity in any supercomputer or data center is a major challenge for system designers. Data replication is a popular approach to meet several of these requirements, such as low latency, read availability, and durability. This approach can be enhanced using modern heterogeneous hardware and software techniques such as data compression. However, these two enhancements typically work in isolation, to the detriment of both. In this work, we present HReplica: a dynamic data replication engine that harmoniously leverages data compression and hierarchical storage to increase the effectiveness of data replication. We have developed a novel dynamic selection algorithm that facilitates the optimal matching of replication schemes, compression libraries, and tiered storage. Our evaluation shows that HReplica can improve scientific and cloud application performance by 5.2x when compared to other state-of-the-art replication schemes.
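To illustrate the kind of matching HReplica automates, a selection step could score candidate combinations of storage tier, compressor, and replication factor; the options and weights below are invented for this sketch and are not the paper's actual model.

```cpp
#include <iostream>
#include <string>
#include <vector>

// Toy scoring of (tier, compressor, replication) combinations against a
// workload that weights write bandwidth vs. durability. Illustrative only.
struct Option {
  std::string tier;
  std::string compressor;
  int replicas;
  double write_bw_gbps;  // assumed effective write bandwidth after compression
  double durability;     // assumed durability score in [0, 1]
};

int main() {
  std::vector<Option> options = {
      {"NVMe", "lz4", 2, 5.0, 0.90},
      {"SSD", "zstd", 3, 2.5, 0.99},
      {"HDD", "none", 3, 0.8, 0.99},
  };
  const double w_bw = 0.6, w_dur = 0.4;  // arbitrary workload preference
  const Option* best = nullptr;
  double best_score = -1.0;
  for (const auto& o : options) {
    double score = w_bw * (o.write_bw_gbps / 5.0) + w_dur * o.durability;
    if (score > best_score) { best_score = score; best = &o; }
  }
  std::cout << "selected: " << best->tier << " + " << best->compressor
            << " x" << best->replicas << "\n";
  return 0;
}
```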
Most parallel programs use irregular control flow and data structures, which are a natural fit for one-sided communication paradigms such as MPI or PGAS programming languages. However, these environments lack efficient function-based application libraries that can utilize popular communication fabrics such as TCP, InfiniBand (IB), and RDMA over Converged Ethernet (RoCE). Additionally, there is a lack of high-performance data structure interfaces. We present the Hermes Container Library (HCL), a high-performance distributed data structures library that offers high-level abstractions including hash maps, sets, and queues. HCL uses an RPC-over-RDMA technology that implements a novel procedural programming paradigm. In this paper, we argue that an RPC-over-RDMA technology can serve as a high-performance, flexible, and coordination-free backend for implementing complex data structures. Evaluation results from testing real workloads show that HCL programs are 2x to 12x faster compared to BCL, a state-of-the-art distributed data structure library.
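The sketch below shows the style of programming such a library enables; DistributedMap is a local stand-in written for this example, not HCL's actual class or API, and in HCL the put/get calls would become RPC-over-RDMA operations against the node owning the key.

```cpp
#include <iostream>
#include <string>
#include <unordered_map>

// Generic sketch of a distributed hash-map programming model.
// A real implementation would shard keys across nodes instead of
// storing everything in a local std::unordered_map.
template <typename K, typename V>
class DistributedMap {
 public:
  void Put(const K& key, const V& value) { local_[key] = value; }
  bool Get(const K& key, V* out) const {
    auto it = local_.find(key);
    if (it == local_.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  std::unordered_map<K, V> local_;  // placeholder for remote shards
};

int main() {
  DistributedMap<std::string, int> particle_counts;
  particle_counts.Put("rank_0", 1024);  // would target the owning server
  int value = 0;
  if (particle_counts.Get("rank_0", &value)) {
    std::cout << "rank_0 -> " << value << "\n";
  }
  return 0;
}
```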
In the age of data-driven computing, integrating High Performance Computing (HPC) and Big Data (BD) environments may be the key to increasing productivity and to driving scientific discovery forward. Scientific workflows consist of diverse applications (i.e., HPC simulations and BD analysis), each with distinct representations of data that introduce a semantic barrier between the two environments. To solve scientific problems at scale, accessing semantically different data from different storage resources is the biggest unsolved challenge. In this work, we aim to address a critical question: "How can we exploit the existing resources and efficiently provide transparent access to data from/to both environments?" We propose the iNtelligent I/O Bridging Engine (NIOBE), a new data integration framework that enables integrated data access for scientific workflows with asynchronous I/O and data aggregation. NIOBE performs the data integration using available I/O resources, in contrast to existing optimizations that ignore the I/O nodes present on the data path. In NIOBE, data access is optimized to consider both the ongoing production and the future consumption of the data. Experimental results show that with NIOBE, an integrated scientific workflow can be accelerated by up to 10x when compared to a no-integration baseline and by up to 133% compared to other state-of-the-art integration solutions.
The data explosion phenomenon in modern applications causes tremendous stress on storage systems. Developers use data compression, a size-reduction technique, to address this issue. However, each compression library exhibits different strengths and weaknesses when considering the input data type and format. We present Ares, an intelligent, adaptive, and flexible compression framework which can dynamically choose a compression library for a given input based on the type of the workload, and which provides an appropriate infrastructure for users to fine-tune the chosen library. Ares is a modular framework which unifies several compression libraries while allowing the addition of more compression libraries by the user. Ares is a unified compression engine that abstracts the complexity of using different compression libraries for each workload. Evaluation results show that under real-world applications, from both scientific and cloud domains, Ares performed 2-6x faster than competitive solutions with a low cost of additional data analysis (i.e., overheads of around 10%) and up to 10x faster than a baseline of no compression at all.
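The core idea can be illustrated with a simple dispatcher that maps a workload's data type to a compression library; the mapping below is hard-coded for illustration, whereas Ares selects and tunes libraries dynamically.

```cpp
#include <iostream>
#include <string>

// Illustrative type-to-library dispatcher; the choices are examples only.
std::string choose_compressor(const std::string& data_type) {
  if (data_type == "float_array") return "zfp";         // scientific floating-point data
  if (data_type == "text_log") return "zstd";           // general-purpose text
  if (data_type == "already_compressed") return "none"; // skip recompression
  return "lz4";                                          // fast default
}

int main() {
  for (const std::string t : {"float_array", "text_log", "already_compressed"}) {
    std::cout << t << " -> " << choose_compressor(t) << "\n";
  }
  return 0;
}
```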
Understanding, characterizing, and tuning scientific applications' I/O behavior is an increasingly complicated process in HPC systems. Existing tools use either offline profiling or online analysis to gain insights into applications' I/O patterns. However, there is a lack of a clear formula to characterize applications' I/O. Moreover, these tools are application-specific and do not account for multi-tenant systems. This paper presents Vidya, an I/O profiling framework which can predict an application's I/O intensity using a new formula called Code-Block I/O Characterization (CIOC). Using CIOC, developers and system architects can tune an application's I/O behavior and better match the underlying storage system to maximize performance. Evaluation results show that Vidya can predict an application's I/O intensity with a variance of 0.05%. Vidya can profile applications with a high accuracy of 98% while reducing profiling time by 9x. We further show how Vidya can optimize an application's I/O time by 3.7x.
There is an ocean of available storage solutions in modern high-performance and distributed systems. These solutions consist of Parallel File Systems (PFS) for the more traditional high-performance computing (HPC) systems and of Object Stores for emerging cloud environments. More often than not, these storage solutions are tied to specific APIs and data models and thus bind developers, applications, and entire computing facilities to certain interfaces. Each storage system is designed and optimized for certain applications but does not perform well for others. Furthermore, modern applications have become more and more complex, consisting of a collection of phases with different computation and I/O requirements. In this paper, we propose a unified storage access system, called IRIS (i.e., I/O Redirection via Integrated Storage). IRIS enables unified data access and seamlessly bridges the semantic gap between file systems and object stores. With IRIS, emerging high-performance data analytics software has capable and diverse I/O support. IRIS can bring us closer to the convergence of HPC and Cloud environments by combining the best storage subsystems from both worlds. Experimental results show that IRIS can deliver more than a 7x improvement in performance over existing solutions.
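Conceptually, IRIS translates between file-style calls and object operations; the placeholder classes below sketch that mapping under assumed names and are not IRIS's implementation.

```cpp
#include <cstddef>
#include <iostream>
#include <map>
#include <string>
#include <vector>

// Placeholder object store: a map from object keys to byte buffers.
class ObjectStore {
 public:
  void put(const std::string& key, const std::vector<char>& value) { objects_[key] = value; }
  std::vector<char> get(const std::string& key) const { return objects_.at(key); }

 private:
  std::map<std::string, std::vector<char>> objects_;
};

// Placeholder unified layer: file-style write/read translated into object
// put/get keyed by (path, offset), illustrating the semantic bridging idea.
class UnifiedIO {
 public:
  void write(const std::string& path, std::size_t offset, const std::vector<char>& buf) {
    store_.put(path + ":" + std::to_string(offset), buf);
  }
  std::vector<char> read(const std::string& path, std::size_t offset) {
    return store_.get(path + ":" + std::to_string(offset));
  }

 private:
  ObjectStore store_;
};

int main() {
  UnifiedIO io;
  io.write("/results/run1.dat", 0, {'h', 'p', 'c'});
  auto data = io.read("/results/run1.dat", 0);
  std::cout << "read " << data.size() << " bytes back\n";
  return 0;
}
```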