Understanding Kleinberg’s Burst Detection Algorithm

In the previous blog we looked at Kleinberg’s Burst Detection Algorithm. In this blog we will break down cost computation in burst detection algorithm in simple terms.

Quick Recap of the Burst Detection Concept

You’re tracking event frequencies over time (like keyword counts per day).
You want to detect bursts, meaning sudden spikes in activity.
Kleinberg models this with states (normal state, burst state)
Each state has:
- An expected frequency of activity
- A cost to stay or switch between states
- A cost for how well the state fits the observed data

The goal is to find the sequence of states (one per time window) that explains the data with minimum total cost.

In Kleinberg’s burst detection model, the expected frequency for each state is not guessed—it’s calculated based on a base rate and scaled exponentially.

Key Parameters That Affect Results:

Gamma (γ): Controls how costly it is to switch states. High gamma = fewer switches.
Alpha (α): A scaling factor that controls the base rate increase between state

Here’s the idea:

Choose the number of states S depending on how granular you want burst detection to be.
Choose α empirically (e.g., 1.5 to 3.0): α is a constant > 1 (like 2.0), which controls how quickly expected frequency increases across states
We define a base rate, r, which is the average frequency per unit time (e.g., average number of keyword mentions per day). Compute r from the data (mean frequency per time unit)
Each state i has an expected frequency rᵢ defined by:
rᵢ = r × αⁱ
Where:
- r is the base rate (average over time)
- α is a constant > 1
- i is the state level (0 for normal, 1 for low burst, etc.)

Observed Event Counts

Let’s say we have the following observation:

Day	Count of “earthquake”
1	2
2	3
3	25
4	30
5	4

We’re monitoring how many times the word “earthquake” appears in daily news over 5 days. We want to detect if and when a burst happened.

Choose Parameters

Number of states (S): Let us model this with 2 states
- State 0 = normal
- State 1 = burst
Base rate (r): Average over all days: Over 5 days the word was mentioned 64 times
→ base rate r = (2 + 3 + 25 + 30 + 4) / 5 = 12.8
Alpha (α): 2 (We choose alpha as 2; scaling factor between states)
So expected rates per state:
- State 0: 12.8 × 2⁰ = 12.8
- State 1: 12.8 × 2¹ = 25.6
Gamma (γ): Controls penalty for switching states (we’ll assume 1 for now)

Build the Cost Table

We now calculate the cost of being in each state on each day using:

Emission cost: Determined based on how well the expected rate matches the observed rate
Transition cost: A penalty if we switch from one state to another

We build a dynamic programming table that keeps track of the minimum total cost to be in each state for each day.

Emisstion Cost

We’ll use a simple log-likelihood cost function: Emission cost = -(observed count × log(expected rate) – expected rate)

We go day-by-day, for each state:

Calculate the emission cost using the observed count and that state’s expected frequency.

The emission cost for each state is the cost incurred in emitting the words at the observed frequency. The higher the deviation between the expected frequency of a state and the observed count, higher the cost incurred.

Total Cost

The system attempts to minimise the total cost which is the sum of Emission Cost and Transition Cost.

Transition cost is the cost incurred by the system when it transitions from one state to another. Higher transition cost means syetm will enter burst state only when there is significant increase in the frequency of the keyword. In this example, we are taking the cost of transition from one state to another as 1.

Let us compute the Total Cost for the given observations.

Day1: Transition cost is not applicable for Day1, therefore Total cost is same as emission cost. Since the minimum cost is incurred in State 0 (as highlighted), we predict that system was in Normal State

Day2: Since the system is in state S0 on Day 1, there is no transition cost if the system stays in S0. Therefore Total cost for S0 is the same as the emission cost. For the system to switch to state S1, the total cost = Emission Cost + Transition Cost = 15.9 + 1 = 16.9

Since the minimum cost is incurred in State 0, the predicted state for Day2 is S0

Day 3: Since the system is in state S0 on Day 3, there is no transition cost if the system stays in S0. Therefore Total cost for S0 is the same as the emission cost.

For the system to transition to state S1, the total cost = Emission Cost + Transition Cost = -55.5 + 1 = -54.5

Since the minimum cost is incurred in State 1, the predicted state for Day3 is S1

Day 4: Since the system is in state S1 on Day 3, there is a transition cost of +1 if the system transitions to S0. Therefore Total cost for S0 = Emission Cost + Transition Cost = -63.7 + 1 = -62.7

For the system to be in state S1, the total cost = Emission Cost + Transition Cost. Since the system was in S1 previously, Transition cost is 0.

Since the minimum cost is incurred in State 1, the predicted state for Day2 is S1. Similarly, the predicted state for Day 5 is S0.

The final outcome is shown below which shows that burst occurred on Day 3 and 4.

This example illustrates how the cost are computed and states are predicted. We can use R and python packages to do burst detection for us.

Implementation

Burst detection can be applied to both continuous event streams—like incoming emails—and discrete event batches, such as titles submitted to a yearly conference.

Kleinberg describes both these scenarios – streaming and batch in his paper.

The first half of the paper, involves detecting bursts in continuous streams of events. Package ‘bursts‘ in R implements the algorithms on burst detection in streaming events.

Nikki Marinsek has implemented the burst detection in batch events in the python. Check out her blog for more details.

In conclusion, understanding the cost computation in Kleinberg’s burst detection algorithm is key to grasping how it effectively identifies bursts in event streams. By modeling the data as a hidden Markov process and balancing transition and state costs, the algorithm smartly pinpoints periods of unusually high activity. This cost-based framework ensures both precision and flexibility, making it a powerful tool for detecting meaningful patterns in time-series data.

Check out this piece India in the News: Words That Defined the UPA and NDA Years – which uses Burst Detection to compare news headlines from two decades.

Quick Recap of the Burst Detection Concept

Key Parameters That Affect Results:

Here’s the idea:

Observed Event Counts

Choose Parameters

Build the Cost Table

We build a dynamic programming table that keeps track of the minimum total cost to be in each state for each day.

Emisstion Cost

Total Cost

Implementation

Like this:

Leave a Reply Cancel reply

Recent Posts

Stories

Categories

Quick Recap of the Burst Detection Concept

Key Parameters That Affect Results:

Here’s the idea:

Observed Event Counts

Choose Parameters

Build the Cost Table

We build a dynamic programming table that keeps track of the minimum total cost to be in each state for each day.

Emisstion Cost

Total Cost

Implementation

Share this:

Like this:

Leave a Reply Cancel reply

Recent Posts

Stories

Categories