modeller · Updated 2026-04-25

Cold-start Latency

What this covers

Every deploy resets the warm cache. The Cold-start Latency dashboard, on the Model Health tab, shows the modeller what happens to query latency in the seconds and minutes after a deploy: whether predictive aggregates absorbed the first BI users' queries, or whether those queries fell through to the source.

Where it lives

Open Model Builder → Model Health tab → Cold-start latency section. It sits between Model information and Model alerts.

The section is empty until the model has been deployed at least once.

What you see

Stat cards

Card                          Meaning
Deployed                      Timestamp of the most recent deploy.
Samples                       The number of queries observed since that deploy, capped by the configured sample_size (default 25).
Median (post-deploy)          Median execution time across the sampled queries.
P95 (post-deploy)             95th-percentile execution time across the sampled queries.
Baseline median (Nd before)   Median execution time across the N days of queries before the deploy (default 14). The bar to compare against.
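
The card values can be reproduced from raw execution times. The following is a minimal sketch, not the product's implementation: it assumes times are in milliseconds and that P95 uses the nearest-rank method, neither of which this page states. The helper names summarize and percentile_nearest_rank are hypothetical.

```python
import math
from statistics import median


def percentile_nearest_rank(values, pct):
    """Return the pct-th percentile using the nearest-rank method (an assumption)."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    # 1-based rank of the smallest value with at least pct% of samples at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def summarize(post_deploy_ms, baseline_ms, sample_size=25):
    """Compute the stat-card values from execution times (hypothetical helper).

    post_deploy_ms must already be in deploy order, oldest first.
    """
    sampled = post_deploy_ms[:sample_size]  # first queries after the deploy, capped
    return {
        "samples": len(sampled),
        "median": median(sampled) if sampled else None,
        "p95": percentile_nearest_rank(sampled, 95) if sampled else None,
        "baseline_median": median(baseline_ms) if baseline_ms else None,
    }
```

Comparing "median" with "baseline_median" in the returned dictionary mirrors the comparison the cards invite: a post-deploy median far above the baseline suggests the first queries were not served from aggregates.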

The chart

An inline SVG bar chart, one bar per sample in deploy order.

Hover any bar for a tooltip with the sample sequence, execution time, and aggregate flag.

How to read it

A healthy post-deploy chart looks like:

A predictive-aggregate misfire looks like:

A capacity issue looks like:

API

The section reads from GET /api/v1/models/{model_id}/cold-start:

Query param     Default   Meaning
sample_size     25        Cap on the number of post-deploy queries to render.
baseline_days   14        Window of pre-deploy queries used to compute the baseline median.
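
As an illustration of how the two query parameters attach to the endpoint, here is a small sketch that only builds the request URL. The base URL is a placeholder, and authentication and the response shape are not covered on this page, so the sketch leaves them out. The function name cold_start_url is hypothetical.

```python
from urllib.parse import quote, urlencode


def cold_start_url(base_url, model_id, sample_size=25, baseline_days=14):
    """Build the cold-start request URL.

    base_url is a placeholder supplied by the caller; it is not given in the docs.
    The defaults mirror the documented parameter defaults.
    """
    # Percent-encode the model id so it is safe as a single path segment.
    path = f"/api/v1/models/{quote(str(model_id), safe='')}/cold-start"
    query = urlencode({"sample_size": sample_size, "baseline_days": baseline_days})
    return f"{base_url.rstrip('/')}{path}?{query}"
```

For example, cold_start_url("https://example.invalid", "sales_model", sample_size=50) yields a URL ending in /api/v1/models/sales_model/cold-start?sample_size=50&baseline_days=14, which can then be fetched with any HTTP client using a GET request.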

The endpoint reads Model.last_deployed_at and joins it against query_logs.created_at to separate pre-deploy from post-deploy queries. The pre-deploy percentiles (median and p95) are computed inline on at most 2,000 rows from the baseline window.
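
The windowing described above can be sketched in memory as follows. This is an assumption-laden illustration rather than the endpoint's actual query: rows are modelled as (created_at, execution_ms) tuples, a query logged exactly at the deploy timestamp is counted as post-deploy, and when the baseline window holds more than 2,000 rows the sketch keeps the most recent ones. The page does not specify any of these choices, and split_windows is a hypothetical name.

```python
from datetime import datetime, timedelta

BASELINE_ROW_CAP = 2000  # "at most 2,000 rows" from the baseline window


def split_windows(last_deployed_at, query_logs, sample_size=25, baseline_days=14):
    """Split (created_at, execution_ms) rows into post-deploy samples and baseline rows."""
    window_start = last_deployed_at - timedelta(days=baseline_days)
    ordered = sorted(query_logs, key=lambda row: row[0])
    # Post-deploy samples: the first sample_size queries at or after the deploy.
    post = [row for row in ordered if row[0] >= last_deployed_at][:sample_size]
    # Baseline: queries inside the N-day window immediately before the deploy.
    baseline = [row for row in ordered if window_start <= row[0] < last_deployed_at]
    # Which rows survive the cap is unspecified; this sketch keeps the newest.
    return post, baseline[-BASELINE_ROW_CAP:]
```

The execution times in the two returned lists are the inputs for the post-deploy median and P95 and for the baseline median, respectively.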

When the section is empty

Related