ChronoLog

Distributed log store that manages activity/event data.

Modern applications, from scientific instruments to IoT devices, generate activity data at high volume and velocity. Managing, storing, and processing this data efficiently is a significant challenge. ChronoLog is an open-source log store built for this kind of workload, with the goal of making storage and retrieval of activity data faster, more efficient, and more scalable.

What is ChronoLog?

ChronoLog is a distributed, tiered, shared log store that leverages physical time as a natural ordering mechanism for data. By using time itself to organize data, ChronoLog eliminates the need for complex synchronization, enabling high concurrency and efficient data retrieval. It automatically manages data across multiple storage tiers, optimizing for both performance and capacity.
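To make the idea of time as the ordering mechanism concrete, here is a minimal Python sketch. It is not ChronoLog's API: `LocalWriter`, `merged_view`, and `fake_clock` are hypothetical names used only for illustration. The sketch assumes that each writer's clock never goes backwards and that clocks across writers are synchronized closely enough for the application. Each writer stamps events with its own clock and appends to its own buffer, so no shared sequence counter is consulted; a reader recovers one time-ordered view with a k-way merge.

```python
import heapq
import time
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, Tuple

# (timestamp_ns, writer_id, payload); tuples compare by timestamp first.
Event = Tuple[int, int, str]


@dataclass
class LocalWriter:
    """Hypothetical writer: appends to a private buffer, never talks to peers."""
    writer_id: int
    clock: Callable[[], int] = time.time_ns  # assumed non-decreasing per writer
    buffer: List[Event] = field(default_factory=list)

    def append(self, payload: str) -> None:
        # Stamp with local physical time instead of asking a central sequencer.
        self.buffer.append((self.clock(), self.writer_id, payload))


def merged_view(writers: List[LocalWriter]) -> Iterator[Event]:
    # Each buffer is already time-sorted, so a k-way merge yields one
    # time-ordered stream; equal timestamps fall back to writer_id.
    return heapq.merge(*(w.buffer for w in writers))


def fake_clock(ticks: List[int]) -> Callable[[], int]:
    # Deterministic stand-in for a physical clock, used by the demo below.
    it = iter(ticks)
    return lambda: next(it)


sensor = LocalWriter(1, clock=fake_clock([100, 130]))
camera = LocalWriter(2, clock=fake_clock([110, 120]))
sensor.append("temp=21.5")
camera.append("frame-0")
camera.append("frame-1")
sensor.append("temp=21.7")

for event in merged_view([sensor, camera]):
    print(event)
```

The two writers never exchange messages, yet the merged stream interleaves their events by timestamp. In a real deployment the quality of that ordering depends on how well the physical clocks agree, which this toy example sidesteps by injecting fixed tick values.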

✨ What makes ChronoLog special?

• ⏰ Time-based organization

Like a well-organized diary, ChronoLog uses physical time to naturally order data. This means:

No expensive synchronization needed
Natural, intuitive data ordering
Efficient storage and retrieval operations

• 🔄 Smart storage management

Think of it as an intelligent librarian that:

🚀 Keeps recent books (data) on the front desk (memory)
📚 Moves older volumes to easily accessible shelves (SSDs)
📦 Archives historical records in the basement (HDDs)
⚡ Auto-tiering happens seamlessly in the background

• 🌐 High concurrency and scalability

Like a busy library that:

📝 Allows multiple people to write in different journals
📖 Enables countless others to read simultaneously
🔄 Scales up or out automatically as needed
🎯 No manual intervention required

• 🔍 Efficient data retrieval

Imagine finding exactly what you need:

⚡ Lightning-fast range queries
📊 Perfect for time-series analysis
🎯 Precise temporal data access
📈 Optimized partial log processing
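As a rough illustration of why time-ordered data makes range queries cheap, the sketch below runs a binary search over a log that is already sorted by timestamp. The `range_query` function and the `(timestamp, payload)` layout are assumptions made for this example, not ChronoLog's actual interface or storage format.

```python
from bisect import bisect_left
from typing import List, Tuple

Event = Tuple[int, str]  # (timestamp, payload)


def range_query(log: List[Event], start_ts: int, end_ts: int) -> List[Event]:
    """Return events with start_ts <= timestamp < end_ts from a time-sorted log."""
    # A 1-tuple (ts,) sorts before every (ts, payload), so bisect_left finds
    # the index of the first event at or after the given timestamp.
    lo = bisect_left(log, (start_ts,))
    hi = bisect_left(log, (end_ts,))
    return log[lo:hi]


log = [(10, "boot"), (20, "login"), (20, "read"), (30, "write"), (40, "logout")]
print(range_query(log, 20, 40))
```

Because the log is ordered by time, locating both ends of the window takes two binary searches, and only the events inside the window are touched. That is the property that makes partial log processing inexpensive in this toy model.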

Technical Architecture

ChronoLog’s architecture consists of three main components working seamlessly:

ChronoVisor: the central control unit that manages system operations, ensuring smooth coordination among components.
ChronoKeeper: handles recent data with lightning-fast access, maintaining high performance for the most frequently accessed data.
ChronoStore: manages long-term storage across multiple tiers, ensuring data durability and optimal storage utilization.

ChronoLog’s architecture featuring ChronoVisor, ChronoKeeper, and tiered ChronoStore components.

Key Innovations

Physical time as global truth: utilizing physical time for data ordering eliminates synchronization overhead.
3D log distribution: scales both horizontally and vertically, distributing data across nodes and storage tiers.
Synchronization-free data distribution: simplifies the system design and improves performance.
Elastic storage with auto-tiering: automatically adjusts storage allocation based on data age and access patterns.
Native plugins for high-level interfaces: integrates seamlessly with various applications and services.
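The auto-tiering idea in the list above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than ChronoLog's placement policy: the thresholds `HOT_WINDOW_S` and `WARM_WINDOW_S` and the functions `tier_for` and `place` are hypothetical, and the sketch considers data age only, whereas the list above also mentions access patterns.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str]  # (timestamp_s, payload)

# Hypothetical thresholds; a real deployment would tune these.
HOT_WINDOW_S = 60.0      # younger than this stays in memory
WARM_WINDOW_S = 3600.0   # younger than this (but not hot) goes to SSD


def tier_for(event_ts: float, now: float) -> str:
    """Pick a storage tier from the event's age alone."""
    age = now - event_ts
    if age < HOT_WINDOW_S:
        return "memory"
    if age < WARM_WINDOW_S:
        return "ssd"
    return "hdd"


def place(events: List[Event], now: float) -> Dict[str, List[Event]]:
    # Group events by the tier that their age maps to.
    placement: Dict[str, List[Event]] = {"memory": [], "ssd": [], "hdd": []}
    for ts, payload in events:
        placement[tier_for(ts, now)].append((ts, payload))
    return placement


events = [(9990.0, "recent"), (9500.0, "older"), (1000.0, "historical")]
print(place(events, now=10000.0))
```

Re-running `place` with a later `now` moves the same events toward colder tiers, which is the essence of age-based tiering in this sketch.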

Impact and Applications

ChronoLog serves as a foundation for a wide range of data-intensive applications:

Scientific research: supporting data collection and analysis from telescopes, particle accelerators, and other instruments.
IoT and edge computing: managing massive streams of sensor data from distributed devices.
Financial systems: enabling fast processing of time-sensitive trading data and time-series analysis.
System monitoring: tracking and analyzing system performance in real-time for telemetry and diagnostics.
Performance analysis tools: providing detailed logs for debugging and optimization.
NoSQL databases and querying systems: enhancing data retrieval capabilities with efficient range queries.

Featured Applications

IceCube Neutrino Observatory: ChronoLog captures monitoring information from the IceCube detector, aiding in the study of neutrinos and cosmic events.
CyberGIS: supports spatial data synthesis and GIS analytics, enabling complex geospatial computations and visualizations.
Dark Energy Science Collaboration: assists in monitoring large-scale scientific workflows, contributing to our understanding of dark energy and the universe’s expansion.
Financial Computing: processes market transaction data in real-time, supporting high-frequency trading and financial analysis.

Performance Highlights

Performance comparison with existing solutions.

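ChronoLog outperforms existing log storage solutions, offering higher throughput and lower latency. In the evaluation reported in the MSST 2020 paper listed under Related Publications, it handled more than a million tail operations per second and outperformed existing solutions by an order of magnitude. Its architecture is designed so that applications can handle increasing data volumes without sacrificing performance.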

Looking Forward

ChronoLog continues to evolve with exciting developments:

Advanced querying capabilities: enhancing data retrieval and analysis features.
Machine learning support: optimizing for workloads involving AI and machine learning.
Improved automation: incorporating self-optimization techniques for better resource management.
Expanded integration: developing plugins and interfaces for broader application support.

Community and Development

ChronoLog is an open-source project under the BSD license, welcoming contributions from academic and industrial researchers alike. We adhere to best practices in software development, ensuring a robust and reliable platform.

Get Involved

Repository: GitHub - ChronoLog project
Documentation: comprehensive guides and tutorials available in our wiki.
Community Forums: join discussions, share ideas, and collaborate on our Zulip channel.

Acknowledgements 🙏

ChronoLog’s development is supported by the National Science Foundation (NSF) under grant CSSI-2104013. I extend my gratitude to my collaborators at the Illinois Institute of Technology and the University of Chicago.

Join Us!

Are you passionate about distributed systems, data storage, or big data analytics? We’re looking for talented individuals to join our team. Whether you’re a student seeking research opportunities, a professional exploring new challenges, or a collaborator interested in integrating ChronoLog into your applications, we’d love to hear from you.

Interested in learning more about ChronoLog or discussing potential collaborations? Don’t hesitate to reach out!

Related Publications

2024

icpp24 HStream: A hierarchical data streaming engine for high-throughput scientific applications

Jaime Cernuda, Jie Ye, Anthony Kougkas, and Xian-He Sun

In Proceedings of the 53rd International Conference on Parallel Processing, Aug 2024

Data streaming is gaining traction in high-performance computing (HPC) as a mechanism for continuous data transfer, but remains underutilized as a processing paradigm due to the inadequacy of existing technologies, which are primarily designed for cloud architectures and ill-equipped to tackle HPC-specific challenges. This work introduces HStream, a novel data management design for out-of-core data streaming engines. Central to the HStream design is the separation of data and computing planes at the task level. By managing them independently, issues such as memory thrashing and back-pressure, caused by the high volume, velocity, and burstiness of I/O in HPC environments, can be effectively addressed at runtime. Specifically, HStream utilizes adaptive parallelism and hierarchical memory management, enabled by this design paradigm, to alleviate memory pressure and enhance system performance. These improvements enable HStream to match the performance of state-of-the-art HPC streaming engines and achieve up to a 1.5x reduction in latency under high data loads.
@inproceedings{cernuda2024hstream,
  entry_type = {conference},
  author = {Cernuda, Jaime and Ye, Jie and Kougkas, Anthony and Sun, Xian-He},
  booktitle = {Proceedings of the 53rd International Conference on Parallel Processing},
  title = {HStream: A hierarchical data streaming engine for high-throughput scientific applications},
  year = {2024},
  month = aug,
  publisher = {ACM},
  volume = {},
  number = {},
  pages = {231--240},
  keywords = {Data Movement Optimization, Elastic Storage, Data Integration Frameworks, Hierarchical Buffering},
  doi = {10.1145/3673038.3673150},
  url = {https://dl.acm.org/doi/abs/10.1145/3673038.3673150},
}

2022

hipc22 LuxIO: Intelligent Resource Provisioning and Auto-Configuration for Storage Services

Keith Bateman, Neeraj Rajesh, Jaime Cernuda Garcia, Luke Logan, Jie Ye, Stephen Herbein, Anthony Kougkas, and Xian-He Sun

In Proceedings of the 29th International Conference on High Performance Computing, Data, and Analytics, Dec 2022

Storage in HPC is typically a single Remote and Static Storage (RSS) resource. However, applications demonstrate diverse I/O requirements that can be better served by a multi-storage approach. Current practice employs ephemeral storage systems running on either node-local or shared storage resources. Yet, the burden of provisioning and configuring intermediate storage falls solely on the users, while global job schedulers offer little to no support for custom deployments. This lack of support often leads to over- or under-provisioning of resources and poorly configured storage systems. To mitigate this, we present LuxIO, an intelligent storage resource provisioning and auto-configuration service. LuxIO constructs storage deployments configured to best match I/O requirements. LuxIO-tuned storage services show performance improvements up to 2× across common applications and benchmarks, while introducing minimal overhead of 93.40 ms on top of existing job scheduling pipelines. LuxIO improves resource utilization by up to 25% in select workflows.
@inproceedings{bateman2022luxio,
  entry_type = {conference},
  author = {Bateman, Keith and Rajesh, Neeraj and Garcia, Jaime Cernuda and Logan, Luke and Ye, Jie and Herbein, Stephen and Kougkas, Anthony and Sun, Xian-He},
  booktitle = {Proceedings of the 29th International Conference on High Performance Computing, Data, and Analytics},
  title = {LuxIO: Intelligent Resource Provisioning and Auto-Configuration for Storage Services},
  year = {2022},
  month = dec,
  publisher = {IEEE},
  volume = {},
  number = {},
  pages = {246--255},
  keywords = {Storage Resource Provisioning, Data Management in HPC, Elastic Storage, Workflow Optimization},
  doi = {10.1109/HiPC56025.2022.00041},
  url = {https://ieeexplore.ieee.org/abstract/document/10106285},
}

2021

hpdc21 Apollo: An ML-assisted real-time storage resource observer

Neeraj Rajesh, Hariharan Devarajan, Jaime Cernuda Garcia, Keith Bateman, Luke Logan, Jie Ye, Anthony Kougkas, and Xian-He Sun

In Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing, Jun 2021

Applications and middleware services, such as data placement engines, I/O scheduling, and prefetching engines, require low-latency access to telemetry data in order to make optimal decisions. However, typical monitoring services store their telemetry data in a database in order to allow applications to query them, resulting in significant latency penalties. This work presents Apollo: a low-latency monitoring service that aims to provide applications and middleware libraries with direct access to relational telemetry data. Monitoring the system can create interference and overhead, slowing down raw performance of the resources for the job. However, having a current view of the system can aid middleware services in making more optimal decisions which can ultimately improve the overall performance. Apollo has been designed from the ground up to provide low latency, using publish-subscribe (Pub-Sub) semantics, and low overhead, using adaptive intervals to change the length of time between polls of the resource for telemetry data and machine learning to predict changes to the telemetry data between actual resource polls. This work also provides some high-level abstractions called I/O curators, which can further aid middleware libraries and applications to make optimal decisions. Evaluations showcase that Apollo can achieve sub-millisecond latency for acquiring complex insights with a memory overhead of 57 MB and CPU overhead being only 7% more than existing state-of-the-art systems.
@inproceedings{rajesh2021apollo,
  entry_type = {conference},
  author = {Rajesh, Neeraj and Devarajan, Hariharan and Garcia, Jaime Cernuda and Bateman, Keith and Logan, Luke and Ye, Jie and Kougkas, Anthony and Sun, Xian-He},
  booktitle = {Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing},
  title = {Apollo: An ML-assisted real-time storage resource observer},
  year = {2021},
  month = jun,
  publisher = {ACM},
  volume = {},
  number = {},
  pages = {147--159},
  keywords = {Storage Resource Provisioning, I/O Profiling, Data Management in HPC, High-Performance Computing},
  doi = {10.1145/3431379.3460640},
  url = {https://dl.acm.org/doi/abs/10.1145/3431379.3460640},
}

2020

msst20 ChronoLog: A Distributed Shared Tiered Log Store with Time-based Data Ordering

Anthony Kougkas, Hariharan Devarajan, Keith Bateman, Jaime Cernuda, Neeraj Rajesh, and Xian-He Sun

In Proceedings of the 36th International Conference on Massive Storage Systems and Technology, Oct 2020

Modern applications produce and process massive amounts of activity (or log) data. Traditional storage systems were not designed with an append-only data model and a new storage abstraction aims to fill this gap: the distributed shared log store. However, existing solutions struggle to provide a scalable, parallel, and high-performance solution that can support a diverse set of conflicting log workload requirements. Finding the tail of a distributed log is a centralized point of contention. In this paper, we show how using physical time can help alleviate the need for centralized synchronization points. We present ChronoLog, a new, distributed, shared, and multi-tiered log store that can handle more than a million tail operations per second. Evaluation results show ChronoLog’s potential, outperforming existing solutions by an order of magnitude.
@inproceedings{kougkas2020chronolog,
  entry_type = {conference},
  author = {Kougkas, Anthony and Devarajan, Hariharan and Bateman, Keith and Cernuda, Jaime and Rajesh, Neeraj and Sun, Xian-He},
  booktitle = {Proceedings of the 36th International Conference on Massive Storage Systems and Technology},
  title = {ChronoLog: A Distributed Shared Tiered Log Store with Time-based Data Ordering},
  year = {2020},
  month = oct,
  publisher = {Santa Clara University - School of Engineering},
  volume = {},
  number = {},
  pages = {},
  keywords = {Shared Log Storage Systems, Multi-Tiered Storage Hierarchy, Data-Intensive Applications, Storage Architectures},
  doi = {},
  url = {https://msstconference.org/MSST-history/2020/Papers/06.ChronoLog.pdf},
}

cluster20 HCL: Distributing parallel data structures in extreme scales

Hariharan Devarajan, Anthony Kougkas, Keith Bateman, and Xian-He Sun

In Proceedings of the International Conference on Cluster Computing, Sep 2020

Most parallel programs use irregular control flow and data structures, which are perfect for one-sided communication paradigms such as MPI or PGAS programming languages. However, these environments lack efficient function-based application libraries that can utilize popular communication fabrics such as TCP, InfiniBand (IB), and RDMA over Converged Ethernet (RoCE). Additionally, there is a lack of high-performance data structure interfaces. We present Hermes Container Library (HCL), a high-performance distributed data structures library that offers high-level abstractions including hash-maps, sets, and queues. HCL uses an RPC over RDMA technology that implements a novel procedural programming paradigm. In this paper, we argue that an RPC over RDMA technology can serve as a high-performance, flexible, and coordination-free backend for implementing complex data structures. Evaluation results from testing real workloads show that HCL programs are 2x to 12x faster compared to BCL, a state-of-the-art distributed data structure library.
@inproceedings{devarajan2020hcl,
  entry_type = {conference},
  author = {Devarajan, Hariharan and Kougkas, Anthony and Bateman, Keith and Sun, Xian-He},
  booktitle = {Proceedings of the International Conference on Cluster Computing},
  title = {HCL: Distributing parallel data structures in extreme scales},
  year = {2020},
  month = sep,
  publisher = {IEEE},
  volume = {},
  number = {},
  pages = {248--258},
  keywords = {Distributed Data Structures, Parallel I/O Optimization, Storage Architectures, High-Performance Computing},
  doi = {10.1109/CLUSTER49012.2020.00035},
  url = {https://ieeexplore.ieee.org/abstract/document/9229595},
}

2019

ccgrid19 An intelligent, adaptive, and flexible data compression framework

Hariharan Devarajan, Anthony Kougkas, and Xian-He Sun

In Proceedings of the 19th International Symposium on Cluster, Cloud and Grid Computing, May 2019

The data explosion phenomenon in modern applications causes tremendous stress on storage systems. Developers use data compression, a size-reduction technique, to address this issue. However, each compression library exhibits different strengths and weaknesses when considering the input data type and format. We present Ares, an intelligent, adaptive, and flexible compression framework which can dynamically choose a compression library for a given input data based on the type of the workload and provides an appropriate infrastructure to users to fine-tune the chosen library. Ares is a modular framework which unifies several compression libraries while allowing the addition of more compression libraries by the user. Ares is a unified compression engine that abstracts the complexity of using different compression libraries for each workload. Evaluation results show that under real-world applications, from both scientific and Cloud domains, Ares performed 2-6x faster than competitive solutions with a low cost of additional data analysis (i.e., overheads around 10%) and up to 10x faster against a baseline of no compression at all.
@inproceedings{devarajan2019intelligent,
  entry_type = {conference},
  author = {Devarajan, Hariharan and Kougkas, Anthony and Sun, Xian-He},
  booktitle = {Proceedings of the 19th International Symposium on Cluster, Cloud and Grid Computing},
  title = {An intelligent, adaptive, and flexible data compression framework},
  year = {2019},
  month = may,
  publisher = {IEEE},
  volume = {},
  number = {},
  pages = {82--91},
  keywords = {Data Compression Techniques, Data Management in HPC, I/O Acceleration, Storage Resource Provisioning},
  doi = {10.1109/CCGRID.2019.00019},
  url = {https://ieeexplore.ieee.org/abstract/document/8752926},
}