Wavelet trees are data structures that support efficient queries for the k-th minimum element in a range by maintaining a segment tree over values instead of indices.

Range K-th Smallest

YS - Normal

Focus Problem – try your best to solve this problem before continuing!

Resources
	IOI	Wavelet Trees for Competitive Programming	Introduces Wavelet Tree
	CF	Intro to New DS: Wavelet Trees	Link in blog post is broken, check my comment.

Wavelet Tree Structure

To answer value-based queries efficiently, we'll create a segment tree where each node represents a range of values instead of indices. Just like a normal segment tree, each subsequent level splits the range into two halves. Note that each index is stored in at most one node per level, so it appears in $\mathcal{O}(\log M)$ nodes in total, where $M$ is the maximum value.

Wavelet Tree Visualization

Let's say our array is $[3,5,3,1,2,2,3,4,5,5]$. Each node stores the indices of every element whose value lies between that node's $l$ and $r$.
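A minimal sketch of how one node's index list could be split between its two children. It assumes a node sends values smaller than its midpoint to the left child and everything else to the right child. The helper name `split_node` is hypothetical and not taken from the module's implementation.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Stable partition of a node's indices by value:
// indices whose value is < mid go left, the rest go right.
std::pair<std::vector<int>, std::vector<int>>
split_node(const std::vector<int> &a, const std::vector<int> &indices, int mid) {
	std::vector<int> left, right;
	for (int i : indices) { (a[i] < mid ? left : right).push_back(i); }
	return {left, right};
}
```

For the example array with `mid = 3`, the left child receives indices 3, 4, 5 (the values 1, 2, 2) and the right child receives the remaining seven indices, each list still in increasing order.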

Solution - Range K-th Smallest

Before we solve this problem, let's consider a simpler version where we are asked, given a range, to count the number of occurrences of value $x$.

Given a range $l$, $r$, count the number of occurrences of value $x$.

To calculate the number of occurrences from $l$ to $r$, we can use the following formula, where $\texttt{occurrences}(i)$ counts the occurrences of $x$ among the first $i$ elements:

$$\texttt{occurrences}(l, r) = \texttt{occurrences}(r) - \texttt{occurrences}(l)$$

This reduces the problem to counting the number of occurrences in a prefix.
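As a quick worked check, assume 0-indexed positions, a half-open range $[l, r)$, and the example array $[3,5,3,1,2,2,3,4,5,5]$ with $x = 3$, $l = 2$, $r = 7$:

$$\texttt{occurrences}(2, 7) = \texttt{occurrences}(7) - \texttt{occurrences}(2) = 3 - 1 = 2$$

The first seven elements contain three $3$s and the first two contain one, which leaves the two occurrences at indices $2$ and $6$.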

One way to solve the problem is to go to the leaf node and perform a binary search for the number of indices less than $r$. However, let's explore a different approach that can also be extended to the second type of query.

Instead of binary searching on the leaf, we update $r$ as we recurse down the tree. If we can determine the position (index) of $r$ in the left and right children of a node, we can recurse down the tree and determine its position in the leaf node.

To find the position of $r$ in a node's left and right children, we need to determine how many of the indices preceding $r$ hold a value smaller than the node's middle value ($\texttt{mid}$). This can be done using a prefix sum.

Let's define:

$c[i]$ as $1$ if the value at $\texttt{index}[i]$ is smaller than $\texttt{mid}$, otherwise $0$
$\texttt{prefix\_b}[i]$ as the prefix sum of $c$ up to position $i$

Formally, writing $a$ for the original array:

$$c[i] = \begin{cases} 1, & \text{if } a[\texttt{index}[i]] < \texttt{mid} \\ 0, & \text{otherwise} \end{cases}$$

$$\texttt{prefix\_b}[i] = \texttt{prefix\_b}[i - 1] + c[i]$$

To update $r$ as we recurse down, we do the following:

To know the value of $r$ if we recurse left, we use $\texttt{prefix\_b}[r]$
If we recurse right, we use $r - \texttt{prefix\_b}[r]$

Now let's try to solve our main problem.

Given a range $l$, $r$, find the $k$-th smallest element

At each node, we determine whether the answer lies in the left or the right segment. Using our first type of query, we can count how many elements of our range $(l, r)$ have values within a segment's value range. Note that this also works for non-leaf nodes, where $l$ and $r$ are the positions updated for that node:

$$\texttt{occurrences}(l, r) = r - l$$

Similar

This is similar to counting how many times a value appears up to index $r$ in our previous query, where we used the updated $r$ at the leaf node. Now we instead consider the difference between the updated $r$ and the updated $l$.

Therefore, the number of occurrences in the left node is

$$\texttt{left\_occurrences} = \texttt{prefix\_b}[r] - \texttt{prefix\_b}[l]$$

Note that $\texttt{left\_occurrences}$ is the number of indices between $l$ and $r$ whose value is less than $\texttt{mid}$.

If $\texttt{left\_occurrences}$ is greater than or equal to $k$, the $k$-th smallest element is in the left subtree, so we update our range and recurse into the left child.
If $\texttt{left\_occurrences}$ is less than $k$, the $k$-th smallest element is in the right subtree. We subtract $\texttt{left\_occurrences}$ from $k$, update our range, and recurse into the right child.

Notice

Notice that we still update $l$ and $r$ accordingly when we go left or right.

The answer is then the value of the leaf node we end up on.

In conclusion, we maintain our range $l$, $r$ as we recurse down to a child. For the counting query, once we reach the leaf node for $x$ we return $r - l$; for the $k$-th smallest query, we return the leaf's value.
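Both queries can be sketched on a small static tree. This is an illustration under stated assumptions rather than the module's implementation: ranges are half-open and 0-indexed, $k$ is 1-indexed, each node covers values in `[lo, hi)`, and the node stores only `prefix_b` instead of the index list. The names `Node`, `kth`, and `count` are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Static wavelet tree node covering values in [lo, hi).
struct Node {
	int lo, hi, mid;
	std::vector<int> prefix_b;  // prefix_b[i] = how many of the first i values are < mid
	std::unique_ptr<Node> left, right;

	Node(const std::vector<int> &vals, int lo, int hi)
	    : lo(lo), hi(hi), mid(lo + (hi - lo) / 2) {
		if (hi - lo <= 1 || vals.empty()) { return; }  // leaf or empty node
		std::vector<int> l_vals, r_vals;
		prefix_b.assign(vals.size() + 1, 0);
		for (size_t i = 0; i < vals.size(); i++) {
			bool goes_left = vals[i] < mid;
			prefix_b[i + 1] = prefix_b[i] + (goes_left ? 1 : 0);
			(goes_left ? l_vals : r_vals).push_back(vals[i]);
		}
		left.reset(new Node(l_vals, lo, mid));
		right.reset(new Node(r_vals, mid, hi));
	}

	// k-th smallest value among positions [l, r); requires 1 <= k <= r - l
	int kth(int l, int r, int k) const {
		if (hi - lo <= 1) { return lo; }  // leaf: its single value is the answer
		int left_occurrences = prefix_b[r] - prefix_b[l];
		if (left_occurrences >= k) { return left->kth(prefix_b[l], prefix_b[r], k); }
		return right->kth(l - prefix_b[l], r - prefix_b[r], k - left_occurrences);
	}

	// number of occurrences of x among positions [l, r)
	int count(int l, int r, int x) const {
		if (l >= r || x < lo || x >= hi) { return 0; }
		if (hi - lo <= 1) { return r - l; }  // leaf: every remaining position holds x
		if (x < mid) { return left->count(prefix_b[l], prefix_b[r], x); }
		return right->count(l - prefix_b[l], r - prefix_b[r], x);
	}
};
```

With the example array built as `Node root(a, 1, 6)`, the range $[2, 7)$ holds the values $3,1,2,2,3$, so `root.kth(2, 7, 3)` returns 2 and `root.count(2, 7, 3)` returns 2. Empty nodes stop the construction early, which keeps the number of nodes proportional to $N \log M$ even for a large value range.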

Implementation

Time Complexity: $\mathcal{O}(Q \cdot \log(M))$

C++

#include <bits/stdc++.h>

using namespace std;
constexpr int MAX_VAL = 1e9 + 2;

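// One node of the tree; it covers a range of values rather than indices.
struct Segment {
	Segment *left = nullptr, *right = nullptr;  // child nodes, null until created
	int l, r, mid;                              // value range of this node and its midpoint
	bool children = false;
	vector<pair<int, int>> indices;  // index, value
};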

Supporting updates

Let's support updates that change the value at index $i$ to $y$.

We can traverse down to the leaf to remove the old element, then traverse down again to add the new element.

Let's first solve for adding a new element; erasing works the same way in reverse.

So what do the updates change?

Our indices vector
Our prefix vector

At each step of our recursion, the indices vector needs to be modified because we have to insert the new index. Since we can no longer efficiently maintain a sorted vector of the indices, we switch to a set.

The prefix vector is harder to maintain: a single update can shift every prefix sum after the affected position, so a plain vector is too slow. Instead, we could use a sparse segment tree:

Erasing and inserting can be done by setting the value at the specific index to $0$ or $1$.
Querying for a prefix can be done by querying the segment tree from $0$ to $i$.

This approach is not memory efficient and requires a segment tree implementation. A friendlier approach is to use an order statistics tree, a binary search tree available in C++ through GCC's policy-based data structures that supports efficient rank queries on a set. Querying for a prefix is then equivalent to $\texttt{order\_of\_key}(i)$.
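A minimal sketch of the per-node bookkeeping with an order statistics tree, assuming a GCC toolchain (the `ext/pb_ds` headers are GCC-specific). The wrapper `LeftIndices` and its method names are hypothetical; the assumption here is that each node keeps the original indices whose value is smaller than its midpoint, so that the prefix sum up to $i$ becomes a rank query.

```cpp
#include <cassert>
#include <functional>
#include <ext/pb_ds/assoc_container.hpp>
#include <ext/pb_ds/tree_policy.hpp>

template <class T>
using Tree = __gnu_pbds::tree<T, __gnu_pbds::null_type, std::less<T>,
                              __gnu_pbds::rb_tree_tag,
                              __gnu_pbds::tree_order_statistics_node_update>;

// Hypothetical per-node structure: the indices whose value is < mid.
struct LeftIndices {
	Tree<int> indices;

	void insert(int i) { indices.insert(i); }
	void erase(int i) { indices.erase(i); }

	// how many stored indices are strictly smaller than i (the prefix sum up to i)
	int prefix(int i) { return (int)indices.order_of_key(i); }

	// how many stored indices fall in [l, r)
	int count(int l, int r) { return prefix(r) - prefix(l); }
};
```

Each operation takes $\mathcal{O}(\log N)$ time, which is where the extra logarithmic factor in the update-supporting version comes from.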

Implementation

Time Complexity: $\mathcal{O}(Q \cdot \log(M) \cdot \log(N))$

C++

#include <bits/stdc++.h>

using namespace std;

#include <ext/pb_ds/assoc_container.hpp>
using namespace __gnu_pbds;
template <class T>
using Tree =
    tree<T, null_type, less<T>, rb_tree_tag, tree_order_statistics_node_update>;

Problems

Source	Problem Name	Difficulty	Tags
CF	Destiny	Normal	Wavelet
SPOJ	I Love Kd-Trees	Normal	Wavelet
COCI	2021 - Index	Normal	Wavelet, Persistent Segtree
Kattis	Easy Query	Very Hard	Wavelet
GlobeX Cup	Ninjaclasher's Wrath 2	Very Hard	Wavelet
