HFT Data (2+ PB, Low-Latency Analytics — POC)
Aug 10, 2024

Project Type: Direct Engagement
Project Timeline: May 2024 – Aug 2024
Client Region: APAC
Industry: Finance
Context & Objectives
The client is a leading high-frequency trading (HFT) firm specializing in equities and derivatives markets, operating at extreme scale and processing millions of tick, order, and trade events every second. The firm’s success depends on its ability to access and analyze real-time market data to power risk assessment, strategy execution, and market surveillance across global exchanges.
With trading strategies that rely on split-second decisions, even minor delays in data processing can result in missed opportunities or increased risk exposure. While the firm had advanced trading systems, its existing analytics and reporting platform was batch-oriented, causing delays of more than a day for daily reporting. This limited the ability of risk and strategy teams to react promptly to market changes and make informed decisions in real time.
To address these challenges, the firm sought a proof-of-concept architecture capable of delivering near real-time analytics, handling petabyte-scale data, and ensuring accuracy, consistency, and operational reliability for mission-critical trading workflows.
Business Challenge:
The client’s existing data platform was primarily batch-oriented, which caused significant delays in reporting and real-time monitoring. Generating daily reports often took up to 1.5 days, leaving analysts and trading teams with stale data that could not support fast-paced decision-making.
Risk and strategy teams lacked intraday visibility into trading activity, relying instead on end-of-day summaries that were often outdated by the time they were reviewed. Critical metrics, including hot-path aggregates such as symbol-venue performance and latency histograms, were computed with manual scripts, creating operational bottlenecks.
Project Goal:
The objective of this initiative was to validate a high-performance, streaming + columnar OLAP architecture capable of handling petabyte-scale trading data while delivering low-latency analytics for mission-critical operations. Specifically, the proof-of-concept aimed to:
Reduce reporting cycle time from days to minutes, enabling risk, strategy, and operational teams to access timely insights.
Support sub-second aggregation queries on recent data (1–7 day windows), allowing rapid decision-making for high-frequency trading strategies (illustrated by the query sketch below).
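To make the second goal concrete, the sketch below shows the shape of a hot-window aggregation run directly in ClickHouse. The table and column names (ticks_enriched, symbol, venue, qty, event_time) are illustrative, not the client's actual schema.

    -- ClickHouse SQL (illustrative schema): per symbol/venue activity over the last 24 hours
    SELECT
        symbol,
        venue,
        count()  AS trades,
        sum(qty) AS traded_qty
    FROM ticks_enriched
    WHERE event_time >= now() - INTERVAL 1 DAY
    GROUP BY symbol, venue
    ORDER BY traded_qty DESC
    LIMIT 20;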
Challenges
The firm faced a complex set of challenges arising from the scale, speed, and criticality of high-frequency trading systems. These challenges spanned data volume, latency, operational stability, and correctness, creating significant bottlenecks for analytics and decision-making.
High Data Volume & Velocity
Trading and market data generated millions of events per second, including ticks, orders, fills, and risk logs. We observed that the firm’s existing batch-oriented systems could not handle this continuous inflow of data, resulting in delayed ingestion and limited real-time visibility.
Latency Bottlenecks
Analysts and strategy teams required sub-second query performance for recent data (1–7 day windows), while still needing fast access to broader historical datasets. We identified that existing relational and Hadoop-based architectures could not meet these latency requirements, restricting the firm’s ability to act on live market events.
Operational Instability
The reliance on manual processes for merges, compactions, and aggregations led to frequent slowdowns and operational incidents, especially during peak trading periods. We recognized that SRE teams were spending significant time handling query performance issues and recovery tasks, reducing focus on strategic improvements.
Data Correctness & Replay
Streaming computations lacked exactly-once guarantees, occasionally producing duplicate or missing aggregates after job restarts or backfills. We needed to ensure that any architecture we proposed would provide deterministic replay and maintain data integrity, which was critical for trading decisions and risk reporting.
Constraints & Non-Functional Requirements
For this proof-of-concept, we needed to design a system that could meet the rigorous performance, reliability, and correctness standards expected of a production-grade HFT platform.
Key requirements included:
Throughput: We had to ensure that the platform could sustain millions of rows per second during peak trading hours, including tick events, orders, fills, and risk logs, without causing ingestion backlogs or delays in downstream processing.
Latency:
For recent data (1–7 day windows), we aimed for P95 query latency under 1 second to support near-real-time decision-making.
For broader historical queries, which span weeks or months, we targeted P95 latency under 4 seconds to ensure timely analytics without overloading the system (a measurement sketch follows this list).
Storage & Cost Efficiency: The platform needed to scale to over 2 PB of data while keeping storage costs predictable. We addressed this with tiered storage, keeping hot data on high-performance disks and offloading cold data to object storage.
Correctness & Reliability: We enforced exactly-once processing semantics to ensure that each event contributed precisely once to downstream aggregates. Deterministic replay was achieved through Kafka offsets, and strict schema validation guaranteed that streaming and batch data remained consistent, even after restarts or backfills.
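To track the latency targets during the POC, P95 query latency can be read straight from ClickHouse's own query log, as sketched below. This assumes query logging is enabled and that the 1-second and 4-second targets are checked against SELECT traffic only.

    -- ClickHouse SQL: P95 latency of finished SELECT queries over the last 24 hours
    SELECT
        quantile(0.95)(query_duration_ms) AS p95_ms,
        count()                           AS queries
    FROM system.query_log
    WHERE type = 'QueryFinish'
      AND query_kind = 'Select'
      AND event_time >= now() - INTERVAL 1 DAY;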
Solution & Implementation
To meet the client’s ultra-low-latency and correctness requirements, we designed and implemented a streaming OLAP architecture that combined Apache Flink for real-time processing and ClickHouse for high-performance columnar storage and analytics.
The architecture was designed as a high-throughput, end-to-end pipeline:
Key Highlights of Our Approach:
Real-Time Ingestion:
Kafka topics ingested ticks, orders, fills, and risk logs at millions of events per second, forming the backbone of the streaming pipeline.
Stream Processing:
We used Flink to parse incoming data, validate schemas, enrich events, and maintain exactly-once semantics, ensuring correctness even under restarts or failures.
Columnar Analytics:
ClickHouse MergeTree tables stored structured, time-series data, while materialized views and projections accelerated common aggregations and hot-path queries.
Interactive Serving:
We exposed dashboards via Grafana and ad-hoc SQL queries via ClickHouse, enabling analysts and trading teams to interact with live and historical data seamlessly.
Phase 1: Data Ingestion & Stream Processing
We ingested millions of events per second from multiple Kafka topics (ticks, orders, fills, risk logs), partitioned by symbol and venue. Using Apache Flink, we performed parsing, schema validation, and enrichment, joining streams for a unified event context.
Checkpointing and Kafka offsets ensured exactly-once processing, while watermarks maintained correct event-time ordering for late-arriving data, providing a fault-tolerant, real-time ingestion layer.
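A minimal sketch of this ingestion layer in Flink SQL is shown below, assuming a JSON-encoded ticks topic; the topic name, fields, checkpoint interval, and watermark bound are illustrative rather than the client's actual configuration, and a similar source table would be declared per topic (orders, fills, risk logs).

    -- Flink SQL (illustrative): checkpointed, exactly-once Kafka source with an event-time watermark
    SET 'execution.checkpointing.interval' = '10s';
    SET 'execution.checkpointing.mode' = 'EXACTLY_ONCE';

    CREATE TABLE ticks_src (
        symbol     STRING,
        venue      STRING,
        price      DOUBLE,
        qty        BIGINT,
        event_time TIMESTAMP(3),
        WATERMARK FOR event_time AS event_time - INTERVAL '2' SECOND  -- tolerate late-arriving events
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'ticks',
        'properties.bootstrap.servers' = 'kafka:9092',
        'properties.group.id' = 'poc-analytics',
        'scan.startup.mode' = 'group-offsets',
        'format' = 'json'
    );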
Phase 2: Storage & Lakehouse Layer
We used ClickHouse (MergeTree engine) to design time-series tables partitioned by event date and keyed by (symbol, venue, event_time). Leveraging columnar storage enabled efficient compression and vectorized reads.
Tiered storage retained hot data locally while older partitions were offloaded to object storage using TTL policies. PREWHERE pruning optimized scans by symbol and time, merge scheduling balanced throughput with query performance, and ZSTD compression minimized disk and network I/O.
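The DDL below sketches what such a table can look like, assuming a hypothetical ticks_enriched schema and a storage policy named 'tiered' with a 'cold' object-storage volume; the partition key, TTL window, and codecs are illustrative.

    -- ClickHouse SQL (illustrative): partitioned, tiered, compressed time-series table
    CREATE TABLE ticks_enriched
    (
        event_date Date DEFAULT toDate(event_time),
        event_time DateTime64(3) CODEC(Delta, ZSTD(3)),
        symbol     LowCardinality(String),
        venue      LowCardinality(String),
        price      Float64 CODEC(ZSTD(3)),
        qty        UInt64  CODEC(ZSTD(3))
    )
    ENGINE = MergeTree
    PARTITION BY event_date
    ORDER BY (symbol, venue, event_time)
    TTL event_date + INTERVAL 7 DAY TO VOLUME 'cold'   -- hot data stays on local disk for ~7 days
    SETTINGS storage_policy = 'tiered';

    -- PREWHERE narrows the scan by symbol and time before full column reads
    SELECT count()
    FROM ticks_enriched
    PREWHERE symbol = 'XYZ' AND event_time >= now() - INTERVAL 1 DAY;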
Phase 3: Transformation & Modeling
We built materialized views to calculate key metrics like trade volumes, spreads, and order depths. For frequent group-bys (symbol, venue, minute), we defined projections to reduce the need for full-table scans. All transformations were deterministic, ensuring consistent results during replays or backfills.
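A hedged sketch of this pattern, reusing the hypothetical ticks_enriched table from the previous phase: a per-minute symbol/venue rollup maintained by a materialized view, plus a projection for the frequent (venue, symbol, minute) group-by.

    -- ClickHouse SQL (illustrative): rollup table, materialized view, and projection
    CREATE TABLE trade_volume_1m
    (
        minute DateTime,
        symbol LowCardinality(String),
        venue  LowCardinality(String),
        trades UInt64,
        volume UInt64
    )
    ENGINE = SummingMergeTree
    PARTITION BY toDate(minute)
    ORDER BY (symbol, venue, minute);

    CREATE MATERIALIZED VIEW trade_volume_1m_mv TO trade_volume_1m AS
    SELECT
        toStartOfMinute(event_time) AS minute,
        symbol,
        venue,
        count()  AS trades,
        sum(qty) AS volume
    FROM ticks_enriched
    GROUP BY minute, symbol, venue;

    ALTER TABLE ticks_enriched
        ADD PROJECTION by_venue_minute
        (
            SELECT venue, symbol, toStartOfMinute(event_time) AS minute_bucket, count(), sum(qty)
            GROUP BY venue, symbol, minute_bucket
        );
    ALTER TABLE ticks_enriched MATERIALIZE PROJECTION by_venue_minute;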
This approach enabled high-speed analytics, allowing real-time dashboards and intraday summaries to run directly within ClickHouse.
Phase 4: Serving & Analytics
We delivered real-time insights through Grafana dashboards, visualizing risk, latency, and operational metrics, while enabling ad-hoc analysis using ClickHouse SQL for quants, risk analysts, and data engineers.
Access controls were implemented with ClickHouse roles, quotas, and network allowlists to ensure secure and controlled query access.
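A minimal sketch of such controls follows, assuming a hypothetical analytics database, a risk_analyst role, and illustrative quota values; the IP range in the allowlist is likewise a placeholder.

    -- ClickHouse SQL (illustrative): role, quota, and network-restricted user
    CREATE ROLE IF NOT EXISTS risk_analyst;
    GRANT SELECT ON analytics.* TO risk_analyst;

    CREATE QUOTA IF NOT EXISTS analyst_quota
        FOR INTERVAL 1 hour MAX queries = 2000, read_rows = 50000000000
        TO risk_analyst;

    CREATE USER IF NOT EXISTS analyst_user
        IDENTIFIED WITH sha256_password BY 'change-me'
        HOST IP '10.0.0.0/8'        -- network allowlist
        DEFAULT ROLE risk_analyst;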
Phase 5: Observability, Quality & Operations
We monitored system health using ClickHouse system tables to track ingestion lag, merge performance, and compaction load. Grafana dashboards displayed Flink job lag, checkpoint latency, and storage tier transitions, while runbooks guided recovery from compaction backlogs or replay issues.
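Two of the checks this monitoring relied on can be expressed directly against ClickHouse system tables, as in the sketch below; the exact tables watched and any alert thresholds are illustrative.

    -- ClickHouse SQL: active part counts per table (high counts signal merge/compaction pressure)
    SELECT database, table, count() AS active_parts
    FROM system.parts
    WHERE active
    GROUP BY database, table
    ORDER BY active_parts DESC
    LIMIT 10;

    -- Currently running merges and their progress
    SELECT table, elapsed, progress, is_mutation
    FROM system.merges
    ORDER BY elapsed DESC;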
To ensure data quality, we enforced schema checks in Flink pipelines and used statistical drift detection to flag abnormal volumes.
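One way to express such a volume check (a simple ratio against a trailing baseline rather than a full statistical test) is sketched below in ClickHouse SQL; the hypothetical ticks_enriched table, the 5-minute window, and the 3x threshold are all illustrative, and the actual detection could equally run inside the Flink pipeline.

    -- ClickHouse SQL (illustrative): flag symbols whose recent volume deviates from a 24-hour baseline
    WITH
        recent AS (
            SELECT symbol, count() AS recent_events
            FROM ticks_enriched
            WHERE event_time >= now() - INTERVAL 5 MINUTE
            GROUP BY symbol
        ),
        baseline AS (
            SELECT symbol, count() / (24 * 12) AS avg_events_per_5m   -- 288 five-minute buckets per day
            FROM ticks_enriched
            WHERE event_time >= now() - INTERVAL 1 DAY
            GROUP BY symbol
        )
    SELECT symbol, recent_events, avg_events_per_5m
    FROM recent
    INNER JOIN baseline USING (symbol)
    WHERE recent_events > 3 * avg_events_per_5m
       OR recent_events < avg_events_per_5m / 3;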
For security, we implemented role-based access, IP allowlists, and mTLS where supported, and confirmed that datasets contained no PII, maintaining regulatory compliance.
Implementation Highlights
We delivered a high-performance, production-ready analytics platform with the following key achievements:
Cycle Time: Reduced reporting from 1.5 days to ~15 minutes.
Low Latency: Achieved sub-second query performance for 7-day hot data windows.
High Throughput: Sustained ingestion of millions of rows per second.
Operational Stability: Enabled predictable merge and compaction behavior, reducing SRE interventions.
Cost Optimization: Implemented TTL-based tiering, ZSTD compression, and projections, cutting storage costs by ~40%.
Governance & Security: Enforced roles, quotas, and network-based access controls to maintain secure, compliant operations.
Deliverables
A complete set of artifacts was provided to support analytics, operations, and maintainability:
ClickHouse DDL Catalog: Table definitions, materialized views, projections, and rollups.
Flink Job Templates & Deployment Scripts: Stream processors with checkpointing and deterministic replay logic.
Grafana Dashboards: Monitoring for ingest lag, merge health, and risk/operations metrics.
SRE Runbook: Step-by-step procedures for incident response, replay recovery, and capacity planning.
Outcomes & Business Impact
The project delivered significant improvements across reporting, analytics, and operations. In POC tests we reduced reporting latency from 36 hours to roughly 15 minutes, and risk and strategy teams gained sub-second query access to hot-path data, greatly increasing analyst productivity.
Streamlined ingestion and merge management improved operational stability, minimizing fire drills and downtime during peak trading periods.
The architecture was validated to scale to petabyte-level growth with predictable resource utilization and cost, while our deployment patterns provided a clear path to production.
Tech Stack
Data Sources:
Kafka (ticks, orders, fills, risk/ops logs)
Ingestion & Processing:
Apache Flink (parse, validate, enrich, checkpointing)
Storage / OLAP:
ClickHouse (MergeTree engine)
Transformation & Modeling:
ClickHouse materialized views, projections
Query & Serving:
Grafana dashboards, ClickHouse SQL
Governance & Security:
ClickHouse roles & quotas, allowlists, TLS/mTLS
Observability & Quality:
ClickHouse system tables, Grafana panels, drift detection
DevOps & CI/CD:
Versioned ClickHouse DDLs, Flink job templates, controlled rollouts
Cost & FinOps:
TTL tiering, ZSTD compression, projections to minimize scans
Target Architecture
The platform follows a streaming OLAP pipeline designed for high-throughput, low-latency analytics.
Raw data from Kafka topics (ticks, orders, fills, and risk logs) is ingested into Flink stream processors, where events are parsed, validated, and enriched.
Processed streams are stored in ClickHouse OLAP tables using the MergeTree engine, with materialized views and projections to accelerate common queries.
The serving layer exposes data via Grafana dashboards for operational and risk monitoring, and ClickHouse SQL for ad-hoc analytics.
Observability is maintained through system tables, lag and merge dashboards, and alerts/runbooks to monitor pipeline health and manage operational issues.
Conclusion: Enabling Low-Latency Analytics at Petabyte Scale
Through this proof-of-concept, we demonstrated a streaming + OLAP architecture capable of handling petabyte-scale data while delivering sub-second query performance. By combining Apache Flink for real-time processing with ClickHouse for fast analytical storage, we achieved reliable, high-throughput ingestion and instant insights for trading and risk management.
The solution proved to be technically robust and operationally sustainable, with tiered storage for cost control and strong observability. We laid the foundation for a production-ready, low-latency analytics platform, giving traders, analysts, and risk teams immediate visibility into fast-moving markets and empowering smarter, faster decisions.