By mid-2024 our data platform had a split personality. Snowflake powered classic BI and the occasional heavy export, while Postgres — our customer-facing OLTP — was taking an OLAP beating from ad-hoc analytics and product dashboards. It worked… until it didn't: slowdowns during traffic spikes, unpredictable dashboard latency, and a creeping fear that one more "quick" team query would tip an important transaction path over.

It was fine… right up until it wasn't.

We needed two things at once:

  1. a way to work with large volumes of historical data without punishing Postgres, and
  2. an analytics engine that engineers could adopt quickly and that would scale predictably as our usage grew.

StarRocks fit that gap unusually well. The MySQL wire protocol meant every engineer — and most BI tools — could connect with zero ceremony. Federated queries let us keep historical data in open table formats while accelerating the hot paths. And for our most time-sensitive paths we also write into StarRocks' internal columnar format, which gives us the lowest end-to-end latency for near real-time product and operational analytics.

After a short evaluation we shipped our first production workloads. To our knowledge, that made Fresha one of the early UK adopters, running StarRocks in production since spring 2025 — not because we chased novelty, but because the problem demanded a tool that balanced speed, simplicity, and open data.

In this post, I'll share why we chose StarRocks, the architecture we landed on, what improved (and what didn't), plus a few traps we'd sidestep if we were starting today.

Quick primer: What StarRocks is

StarRocks is a vectorized, MPP SQL engine designed for sub-second to single-digit-second analytics at scale. A few traits that mattered for us:

  • Native StarRocks storage (internal tables). A columnar format with key models (duplicate/aggregate/primary) and fast ingest (e.g., via Flink or routine loads). This is our shortest path to "near real-time" availability for the hottest metrics.
  • Lakehouse acceleration. Reads external tables in open formats (e.g., Iceberg/Paimon/Hive) via catalogs and pushes down filters/projections to cut object-store scans — perfect for historical data.
  • Materialized views with transparent rewrite. Define incremental rollups or pre-joins; the optimizer rewrites eligible queries automatically to hit those MVs.
  • Shared-data architecture. Compute scales elastically without shuffling data between nodes, which keeps costs and latency predictable during bursts.
  • MySQL protocol & ecosystem fit. Works out of the box with common BI tools and familiar client libraries, so engineers can adopt it fast.
Diagram of client → FE (Leader/Follower/Observer) → CN execution/caches → distributed storage.
StarRocks shared-data architecture — clients connect over MySQL to FE nodes (Leader, Followers/Observers) for catalog + query coordination, while CNs execute queries and cache data; durable data sits in distributed storage, so compute scales by adding CNs without reshuffling storage.

In short: we run a hybrid model — external catalogs for breadth over the lakehouse, and internal tables for the most time-critical slices — keeping data open where it makes sense while making common queries fast.
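
To make the federation half concrete, here is a minimal sketch of attaching an Iceberg catalog and querying it next to an internal table. It assumes a Glue-backed Iceberg catalog on S3; the catalog, database, and table names (iceberg_lake, analytics.sales_history, rt.providers) are illustrative, not our production objects, and real deployments usually need a few more connection properties.

    -- Attach the lakehouse once; StarRocks then sees Iceberg tables without copying data
    CREATE EXTERNAL CATALOG iceberg_lake
    PROPERTIES (
        "type" = "iceberg",
        "iceberg.catalog.type" = "glue",   -- could also be hive or rest, depending on the metastore
        "aws.s3.region" = "eu-west-1"
    );

    -- One query can mix a hot internal table with open-format history
    SELECT h.provider_id, sum(h.total) AS lifetime_revenue
    FROM iceberg_lake.analytics.sales_history AS h
    JOIN default_catalog.rt.providers AS p ON p.id = h.provider_id
    WHERE p.is_active
    GROUP BY h.provider_id;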

Where We Started: Postgres & Snowflake

Our stack worked, but each piece was doing a job it wasn't designed for:

  • Postgres (OLTP) was powering customer-facing systems and, increasingly, ad-hoc analytics. Wide joins and heavy aggregates created head-of-line blocking, noisy neighbors, and the occasional "why is checkout slow?" incident.
  • Snowflake (BI/exports) handled classic dashboards and heavy batch exports well, but it wasn't cost-effective — or fast enough — for chatty, near-real-time product and operational analytics.

So we wrote down the constraints for the tool we were missing:

  1. Protect PostgreSQL by shifting historical analytics off the OLTP path.
  2. Stay open-format first (Iceberg/Paimon on object storage) to use our lakehouse as the source of truth and operate with historical data without bloating PostgreSQL.
  3. Be engineer-friendly: MySQL protocol, standard drivers, minimal re-tooling.
  4. Predictable scale: handle spikes without days of capacity planning.
  5. Tiered freshness: seconds-to-minutes for the hottest paths, minutes for everything else.
  6. Low-ops: fewer bespoke pipelines, fewer "special" systems to babysit.

StarRocks matched that checklist unusually well. It let us federate over open tables for breadth, and for the most time-sensitive metrics we could ingest into StarRocks internal tables to get the shortest end-to-end path.

Architecture at a Glance

Think of the platform as a single ingestion spine with three lanes branching out: a real-time lane into StarRocks internal tables, a historical lane into Iceberg/Paimon, and a search lane into Elasticsearch. StarRocks sits in the middle as the one SQL surface. Engineers connect with the MySQL protocol and can join across all three lanes without thinking about where the data physically lives.

Diagram showing Sources→Kafka→Compute (Flink/Spark) ↔ Lake (Iceberg/Paimon) and Sinks where StarRocks federates to the lake via external catalogs, alongside Kafka and Postgres sinks.
High-level data flow at Fresha — Sources (Postgres → Debezium + Schema Registry → Kafka), Compute (Flink & Spark), Lake (Iceberg + Paimon), and Sinks with StarRocks as the unified SQL endpoint. StarRocks reaches the lake via external catalogs, while compute reads/writes the lake for real-time and historical paths.

Ingestion spine. Everything begins with Debezium streaming change events from Postgres into Kafka using Avro with a Schema Registry. That gives us a strongly typed, evolvable envelope for CDC and a single, reliable backbone for downstream consumers. Kafka is the fan-out point: from here, Flink and Spark pick up the same truth and drive it into the stores that are best for each access pattern.
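
For flavour, a Flink SQL source over one of those CDC topics might look roughly like this. The topic, columns, and Schema Registry URL are illustrative, and the exact format option names vary slightly between Flink versions.

    -- Kafka topic carrying Debezium change events, decoded as Avro via the Schema Registry
    CREATE TABLE sales_cdc (
        id          BIGINT,
        provider_id BIGINT,
        service_id  BIGINT,
        employee_id BIGINT,
        status      STRING,
        total       DECIMAL(18, 2),
        sold_at     TIMESTAMP(3)
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'pg.public.sales',
        'properties.bootstrap.servers' = 'kafka:9092',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'debezium-avro-confluent',
        'debezium-avro-confluent.url' = 'http://schema-registry:8081'
    );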

Real-time lane (StarRocks internal). For paths where freshness is measured in seconds and user experience depends on consistent tail latency, Flink writes directly into StarRocks' internal columnar format. We use primary-key tables where rows need to be kept current (change-log style updates), aggregate-key tables when we can pre-roll metrics (sums, counts, mins/maxes), and duplicate-key tables for append-heavy streams that are later compacted or summarized. This arrangement is intentionally "short": Kafka → Flink → StarRocks → dashboards/API. It removes object storage from the critical path and lets us absorb spikes by scaling StarRocks backends rather than waiting on remote listings and GETs. On top of those internal tables we define materialized views (MVs) for common rollups and pre-joins — StarRocks' optimizer will rewrite eligible user queries to hit those MVs transparently, which keeps the SQL our teams write pleasantly boring.
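
As a sketch of the hot path (columns, bucket count, and refresh interval are illustrative, and it assumes a recent StarRocks release with expression partitioning): a primary-key table for upserts plus an async rollup MV. Note that on primary-key tables the key has to include the partition and distribution columns.

    -- Primary-key table: Flink upserts keep each sale row current
    CREATE TABLE rt_sales (
        id          BIGINT NOT NULL,
        sold_at     DATETIME NOT NULL,
        provider_id BIGINT NOT NULL,
        service_id  BIGINT,
        employee_id BIGINT,
        status      VARCHAR(32),
        total       DECIMAL(18, 2)
    )
    PRIMARY KEY (id, sold_at, provider_id)
    PARTITION BY date_trunc('day', sold_at)          -- time partitions for pruning
    DISTRIBUTED BY HASH(provider_id) BUCKETS 16;     -- business key keeps a tenant's rows together

    -- Async rollup; the optimizer transparently rewrites eligible queries onto it
    CREATE MATERIALIZED VIEW mv_sales_by_provider_day
    REFRESH ASYNC EVERY (INTERVAL 1 MINUTE)
    AS
    SELECT provider_id,
           date_trunc('day', sold_at) AS day,
           count(*)   AS sales_cnt,
           sum(total) AS revenue
    FROM rt_sales
    GROUP BY provider_id, date_trunc('day', sold_at);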

Historical lane (Iceberg/Paimon). Not all queries are urgent, and almost none of them are only about "right now." Flink lands operational CDC into Paimon, and both Flink and Spark write long-term facts and slowly-changing dimensions into Iceberg on object storage. Spark handles the heavier work: backfills, repairs, compaction, and producing consistent snapshots across large time ranges. This gives us cheap, durable history with proper schema evolution and partitioning; it also lets us keep the lakehouse as the source of truth. StarRocks attaches external catalogs to both Iceberg and Paimon, so historical queries can federate over open formats without copying data. When a backfill lands, we can rebuild or refresh the relevant MVs inside StarRocks to make the historical slice just as snappy as the real-time slice.
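
A hedged sketch of that last step — an async MV defined over an Iceberg table in the external catalog and refreshed on demand after a backfill (catalog and table names are illustrative):

    -- Pre-aggregate open-format history inside StarRocks; the source of truth stays in Iceberg
    CREATE MATERIALIZED VIEW mv_sales_history_monthly
    REFRESH DEFERRED MANUAL
    AS
    SELECT provider_id,
           date_trunc('month', sold_at) AS month,
           count(*)   AS sales_cnt,
           sum(total) AS revenue
    FROM iceberg_lake.analytics.sales_history
    GROUP BY provider_id, date_trunc('month', sold_at);

    -- After Spark lands a backfill, pick up the new snapshot explicitly
    REFRESH MATERIALIZED VIEW mv_sales_history_monthly;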

Search lane (Elasticsearch). Some workloads are not strictly relational: fuzzy match, prefix/suffix search, tokenization, relevance scoring. We index those shapes into Elasticsearch from the same Kafka/lakehouse truth (via Flink or Spark), then expose them to engineers through StarRocks' experimental Elasticsearch catalog. The value isn't that ES exists — we could have told teams to call ES directly — it's that they don't have to. From their perspective, a search-heavy index is just another table they can join to, using SQL, inside the same connection they already use for analytics. That keeps the cognitive load low and the infra surface area centralized.
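
Roughly, the wiring looks like this — a sketch with illustrative host, index, database, and column names, relying on the ES catalog exposing each index as a table:

    -- Register the Elasticsearch cluster as a catalog
    CREATE EXTERNAL CATALOG es_search
    PROPERTIES (
        "type" = "es",
        "hosts" = "http://es-node-1:9200"
    );

    -- From a product engineer's point of view, the index is just another table to join
    SELECT s.id, s.total, s.sold_at, srch.service_name
    FROM es_search.default_db.service_search AS srch        -- ES index surfaced as a table
    JOIN default_catalog.rt.rt_sales AS s ON s.service_id = srch.service_id
    WHERE srch.service_name LIKE '%massage%';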

Horizontal bands for Ingestion, Real-time, Historical, and Search. Boxes show Postgres → Debezium/Schema Registry → Kafka, then flows to Flink→StarRocks internal→MVs→clients; Flink→Paimon and Spark/Flink→Iceberg with StarRocks catalogs; and Spark/Flink→Elasticsearch with StarRocks ES catalog. Arrows connect the fan-out from Kafka into each lane.
Ingestion spine with Kafka (typed CDC via Debezium/Schema Registry) fans out into three lanes: Real-time (Flink → StarRocks internal + MVs → dashboards/APIs), Historical (Flink→Paimon; Spark/Flink→Iceberg; StarRocks external catalogs for federation/MV refresh), and Search (Spark/Flink→Elasticsearch; joined from SQL via the ES catalog).

StarRocks is our front door: one MySQL endpoint that unifies hot data, history, and search. We keep the fastest-changing slices in StarRocks' internal columnar tables, long-term facts and dims in Iceberg/Paimon, and text-heavy shapes in Elasticsearch. StarRocks serves the internal tables natively and reaches the lake and Elasticsearch via external catalogs, so engineers write plain SQL and don't think about where the bytes live.

The optimizer and materialized-view rewrite send each query to the fastest path — internal tables or MVs when possible, pushdown to the lakehouse or ES when it isn't. We run StarRocks in shared-data mode to decouple compute from storage, letting us scale backends for bursts without reshuffling data, which keeps tail latency predictable and operations simple. Backfills land in the lakehouse and show up through federation or refreshed MVs, so the interface stays stable while the data underneath evolves.

Three stacked cards labeled Hot, Warm, and Deep history, summarizing our real-time, lakehouse, and backfill paths.
Freshness tiers we run in production — Hot (seconds) via Kafka → Flink → StarRocks internal; Warm (minutes) via StarRocks over Iceberg/Paimon (federation or optional MVs); Deep history via Iceberg/Paimon with Spark backfills as versioned snapshots.

Case study: fixing slow home-page analytics

Our home page surfaces customer-facing analytics — "top employees" (two-month comparison), "top services," and a feed of recent sales. Served from Postgres, these worked for small accounts but collapsed for our largest partners: 15–20 s loads or outright timeouts, plus collateral damage to OLTP. The failure pattern was classic: a cold miss blew the buffer cache, the first request timed out after dragging in piles of pages, and the next request "succeeded" while polluting RAM and slowing unrelated transactions.

We moved these views to StarRocks with a non-negotiable requirement: minute-level staleness so users don't wonder where a just-completed sale went. Our first try was Iceberg. Functionally fine; operationally not — frequent small files and compaction pressure made minute-grade freshness unpredictable. We shifted the hot path to StarRocks internal tables and kept Iceberg/Paimon as the historical record.

Critically, we didn't use materialized views. Instead, we modeled the domain as layered SQL views on top of the internal tables so developers reuse semantics rather than re-invent them. The stack looks like this:

  • a base rt_sales table ingested by Flink (Debezium→Kafka→Flink→StarRocks),
  • a canonical vw_sales_enriched view that applies business joins and status rules,
  • a vw_recent_sales view that defines "recent" (time window, eligible statuses),
  • and higher-level views like vw_top_employees_2m and vw_top_services that build on those layers.
Layered SQL views for the home page: rt_sales (StarRocks internal, CDC via Debezium→Kafka→Flink) → vw_sales_enriched (business joins, status rules, partition predicate, derived filter fields) → vw_recent_sales (recent window + eligible statuses) → vw_top_employees_2m and vw_top_services. It shows a day-based predicate (e.g., date_trunc('day', sold_at)) and derived fields (day, is_eligible, provider_bucket, sale_status_normalized) to prevent full scans and simplify queries.
Layered views example for the home page — rt_sales (CDC upserts) → vw_sales_enriched (business joins, status rules, partition predicate, and derived filter fields like day, is_eligible, provider_bucket) → vw_recent_sales → vw_top_*.

Because the semantics live in views, product teams just query vw_top_* and vw_recent_*; they don't have to remember which statuses count, what "recent" means, or how a sale is "enriched." StarRocks' optimizer pushes filters and projections through the view stack, so we still get plan quality without managing MV refresh schedules.
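
In sketch form, the stack reads like this. Columns, the join to a hypothetical dim_services table, the eligibility rule, and the "recent" window are all simplified for illustration; the production definitions carry more business logic.

    -- Canonical enrichment: business joins, normalized statuses, derived filter fields
    CREATE VIEW vw_sales_enriched AS
    SELECT s.id,
           s.provider_id,
           s.employee_id,
           s.total,
           s.sold_at,
           date_trunc('day', s.sold_at) AS day,                 -- partition-friendly predicate column
           sv.name AS service_name,
           s.status IN ('completed', 'paid') AS is_eligible     -- eligibility rule lives here, once
    FROM rt_sales AS s
    JOIN dim_services AS sv ON sv.id = s.service_id;

    -- "Recent" is defined exactly once
    CREATE VIEW vw_recent_sales AS
    SELECT *
    FROM vw_sales_enriched
    WHERE is_eligible
      AND day >= date_trunc('day', now()) - INTERVAL 60 DAY;

    -- Product-facing views compose the layers below them
    CREATE VIEW vw_top_services AS
    SELECT provider_id, service_name,
           count(*)   AS sales_cnt,
           sum(total) AS revenue
    FROM vw_recent_sales
    GROUP BY provider_id, service_name;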

Result: home page analytics queries now render in ~200 ms even with the heaviest filters and aggregations, with the minute-level freshness users expect. Postgres is no longer an accidental cache; OLTP stays hot for transactions, and StarRocks absorbs the analytical burstiness of the home page. Deep history remains in the lakehouse (Spark backfills Iceberg/Paimon), and those same layered views work over longer windows by federating when needed — no codepaths forked, just one set of reusable definitions.

Time-series chart of p50–Max latencies for the home-page analytics endpoint; after the StarRocks feature flag is enabled, the curves flatten sharply with sub-second p95 and far fewer tail spikes.
Latency percentiles before vs. after enabling the StarRocks path (feature flag). Left: legacy Postgres-backed queries with frequent multi-second spikes; right: StarRocks on — p95 collapses to ~sub-second and long-tail (p99/p99.9) spikes disappear.

Pitfalls we hit — and how we fixed them

200 ms on the surface; several lifetimes of infra underneath xD

DDL migrations without foot-guns. We built an ActiveRecord-style migration tool: hierarchical naming conventions, explicit up/down SQL for every change, and a declarative schema version tracked in StarRocks (a single source of truth we bump atomically). Because StarRocks applies many DDLs asynchronously, the tool polls alter status and won't advance the version until all background jobs reach a terminal FINISHED state; on failure it rolls back via the paired down. The result: reversible, team-safe schema evolution aligned with StarRocks' semantics.
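
Concretely, an "up" step is a plain DDL plus a poll loop. A hedged sketch of what the tool executes (table and column names illustrative):

    -- The migration's "up" SQL
    ALTER TABLE rt_sales ADD COLUMN sale_channel VARCHAR(32) DEFAULT "unknown";

    -- StarRocks applies many schema changes as background jobs; the tool keeps polling the latest one
    SHOW ALTER TABLE COLUMN WHERE TableName = "rt_sales" ORDER BY CreateTime DESC LIMIT 1;

    -- ...and only bumps the tracked schema version once State = FINISHED;
    -- any other terminal state triggers the paired "down" SQL.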

Reasoning about query performance. People can't optimize what they can't see. We standardized on EXPLAIN ANALYZE profiles and a small set of common-sense metrics (scanned bytes, partitions touched, join type, P50/P95). That gave everyone the same mental model for "what got slow": too many partitions, the wrong join strategy, or a filter that couldn't be pushed down.
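
The habit, in practice — keeping in mind that EXPLAIN ANALYZE actually executes the statement and returns the runtime profile we read those metrics from:

    -- Executes the query and reports per-operator rows/bytes scanned, join strategies, and timings
    EXPLAIN ANALYZE
    SELECT provider_id, sum(total) AS revenue
    FROM vw_recent_sales
    GROUP BY provider_id;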

Partitioning that doesn't leak into product code. We partition by time and distribute by business keys (e.g., provider_id), but we hide predicates inside views so developers don't accidentally scan everything. vw_recent_sales encodes "recent" and eligible statuses; higher-level views build on it. The planner still pushes filters through the stack, but callers don't have to remember the exact partition math.

Dimensional joins without massive shuffles. Large fact ↔ small dim uses broadcast; large ↔ large uses colocate (align distribution keys and bucket counts) or bucket-shuffle when colocation isn't possible. We version dims and keep them narrow for broadcastability; when a dim grows, we promote it to a colocate group and update the distribution to match the main fact.
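
A sketch of the large ↔ large case with illustrative schemas; the rule is identical distribution columns and bucket counts for every member of a colocation group:

    CREATE TABLE fact_sales (
        id          BIGINT,
        provider_id BIGINT,
        service_id  BIGINT,
        total       DECIMAL(18, 2)
    )
    DUPLICATE KEY (id)
    DISTRIBUTED BY HASH(provider_id) BUCKETS 32
    PROPERTIES ("colocate_with" = "sales_group");

    CREATE TABLE dim_providers (
        provider_id   BIGINT,
        provider_name VARCHAR(255),
        country       VARCHAR(2)
    )
    DUPLICATE KEY (provider_id)
    DISTRIBUTED BY HASH(provider_id) BUCKETS 32
    PROPERTIES ("colocate_with" = "sales_group");

    -- Buckets line up, so this join runs locally on each node with no shuffle;
    -- small dims instead get (or are hinted into) a broadcast join, e.g. JOIN [broadcast]
    SELECT p.provider_name, sum(f.total) AS revenue
    FROM fact_sales AS f
    JOIN dim_providers AS p ON p.provider_id = f.provider_id
    GROUP BY p.provider_name;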

Data skipping & the "right" indexes. We leaned on StarRocks' zone maps (per-segment min/max) and the prefix/short-key index on sort keys to make range/point probes cheap, plus selective bloom/bitmap indexes only where they moved the needle. The rule: prove with a profile that an index saves scanned bytes before adding it, and remove stale ones.
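
Two of those knobs, for reference — column choices are illustrative, and an index only stays if a profile shows it cutting scanned bytes:

    -- Bloom filter on a high-cardinality column used in point lookups
    ALTER TABLE rt_sales SET ("bloom_filter_columns" = "employee_id");

    -- Bitmap index on a low-cardinality column hit by equality filters
    CREATE INDEX idx_rt_sales_status ON rt_sales (status) USING BITMAP;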

Schema evolution without breaking ingestion. All changes start in Avro Schema Registry with compatibility checks; writers roll out last. Internal tables take additive columns first; views use versioned definitions (vw_sales_enriched_v2) and a vw_sales_enriched pointer we flip after backfill. Flink sinks are idempotent or reconciled by PK, and CI gates any change that would invalidate downstream models.
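
The view-versioning part in miniature (columns simplified): the v2 definition ships first, and the stable name is repointed only after the backfill completes.

    -- New definition, including the additive column, lives under a versioned name
    CREATE VIEW vw_sales_enriched_v2 AS
    SELECT id, provider_id, total, sold_at, sale_channel
    FROM rt_sales;

    -- Flip the stable pointer in one statement once downstream checks pass
    ALTER VIEW vw_sales_enriched AS
    SELECT * FROM vw_sales_enriched_v2;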

Closing Notes

StarRocks is becoming our reliable, everyday tool for analytics: one SQL surface that unifies hot paths, history, and search; predictable performance in shared-data mode; and developer-friendly ergonomics that let teams ship with plain SQL instead of bespoke pipelines. It unlocks the patterns we care about — near-real-time reads on internal tables, federated history on open formats, and search joins via the ES catalog — while keeping the lakehouse as the source of truth.

There's much more under the hood, and we'll share it next:

  • How we tame "monstrous" multi-way joins
  • How full-text search threads through StarRocks
  • Where we flatten in Flink vs. leave shape to StarRocks
  • The guardrails that keep schema changes safe
  • Post-mortem: the sudden 10× S3 cost spike after a table delete


We've only warmed up — more tales to spill, more queries to break and tame, more wows ahead. :)


Credits

  • Anton Borisov — chief instigator; visionary when useful, rule-breaker when necessary. Operating motto: "I'll find a way or make one" — then open a PR.
  • Emiliano Mancuso — head of the hand-waving department; insists there's a simpler way and, annoyingly, is sometimes right.
  • Nicoleta Lazar — resident Java enjoyer; keeper of the "great success" stamp, applied sparingly.
  • Samuel Valente — Terraform alchemist; conjures Snowflake out of terraform and data out of Spark, refuses to explain the trick twice.
  • Paritosh Anand — DevEx whisperer; YAML by day, Spark tamer by night; turns infra riddles into runnable things.
  • Piotr Królikowski — patron saint of shared-data; carried our StarRocks cluster through several "learning experiences".
  • Emil Smoleński — Datadog realist-in-chief; turns dashboards into budget interventions and brings everyone back to earth with the bill-shaped reality check.

Huge thanks to everyone at Fresha who made this possible — Data Platform & DevEx, Product & Design, Analytics, SRE & Infra, Security, Finance/Procurement, Legal/Compliance, and Support. You reviewed schemas, asked the hard questions, kept on-call sane, unblocked access, watched costs, and generally kept the lights on while we iterated.

If I missed your name or team, blame word count, not appreciation. Thank you for the rigor, patience, and partnership — this only works because of you.

And a special thanks to Fresha for consistently choosing leadership over comfort.