De Bruijn Sequence
Problem Overview
| Aspect | Details |
|---|---|
| Problem | Find the shortest binary string containing all n-bit patterns as substrings |
| Input | Integer n (1 <= n <= 15) |
| Output | Binary string of length 2^n + n - 1 |
| Core Technique | Eulerian circuit on de Bruijn graph |
| Time Complexity | O(2^n) |
| Space Complexity | O(2^n) |
Learning Goals
After studying this problem, you should understand:
- De Bruijn Sequence: A cyclic sequence where every possible n-length substring appears exactly once
- Reduction to Eulerian Circuit: How to model string construction as a graph traversal problem
- Hierholzer’s Algorithm: Finding Eulerian circuits efficiently
Problem Statement
Given an integer n, find the shortest binary string that contains every possible n-bit binary pattern as a substring.
Example: For n = 2, the 2-bit patterns are: 00, 01, 10, 11
One valid answer: 00110 (length 5 = 2^2 + 2 - 1)
- Position 0-1:
00 - Position 1-2:
01 - Position 2-3:
11 - Position 3-4:
10
Key Insight: Reduction to Eulerian Circuit
The brilliant insight is to model this as a graph problem:
Graph Construction:
- Nodes: All (n-1)-bit patterns (there are 2^(n-1) nodes)
- Edges: Each n-bit pattern becomes a directed edge
- Edge Rule: Pattern “abcd” creates edge from “abc” to “bcd”, labeled with ‘d’
Why Eulerian Circuit?
- Each n-bit pattern must appear exactly once -> each edge traversed exactly once
- An Eulerian circuit visits every edge exactly once
- Finding an Eulerian circuit gives us our de Bruijn sequence
Edge Construction Explained
For an n-bit pattern, we create an edge from its first (n-1) bits to its last (n-1) bits.
Example for n=3, pattern “011”:
Pattern: 0 1 1
-----
First 2: 0 1 -> Source node: "01"
Last 2: 1 1 -> Target node: "11"
Edge label: 1 -> The last bit
Edge: “01” --[1]--> “11”
This edge represents including pattern “011” in our sequence.
Why Eulerian Circuit Exists
For an Eulerian circuit to exist, every node must have equal in-degree and out-degree.
Proof for de Bruijn graph:
- Each node is an (n-1)-bit pattern
- From any node “abc”, there are exactly 2 outgoing edges: “abc0” and “abc1”
- To any node “abc”, there are exactly 2 incoming edges: “0abc” and “1abc”
- Therefore: in-degree = out-degree = 2 for every node
Since the graph is connected and balanced, an Eulerian circuit always exists.
Visual Diagram for n=3
Nodes: 2-bit patterns (00, 01, 10, 11)
Edges: 3-bit patterns
[0]
+----------+
| |
v |
+---[00]---+ |
| | | |
[0] | |[1] | |
| v | |
| [01]----+ |
| | |
| |[0] [1] |
| v | |
+--[10]<----+ |
| |
[0] |[1] |
v |
[11]----------+
| [0]
+------+
[1]
Edges (3-bit patterns):
000: 00 -> 00 100: 10 -> 00
001: 00 -> 01 101: 10 -> 01
010: 01 -> 10 110: 11 -> 10
011: 01 -> 11 111: 11 -> 11
Building the Result
Once we find an Eulerian circuit, the de Bruijn sequence is:
Result = Starting node + All edge labels in order
If circuit visits: 00 -[0]-> 00 -[1]-> 01 -[0]-> 10 -[0]-> 00 …
Result starts with: “00” + “0” + “1” + “0” + “0” + …
Length: (n-1) for starting node + 2^n edges = 2^n + n - 1
Dry Run for n=3
Step 1: Build Graph
Nodes: {00, 01, 10, 11}
Adjacency list (each node -> list of (neighbor, edge_label)):
00: [(00, '0'), (01, '1')]
01: [(10, '0'), (11, '1')]
10: [(00, '0'), (01, '1')]
11: [(10, '0'), (11, '1')]
Step 2: Hierholzer’s Algorithm
Start at node "00"
Stack: [00]
Path: []
Iteration 1: At 00, go to 01 via edge '1'
Stack: [00, 01], Path: []
Iteration 2: At 01, go to 11 via edge '1'
Stack: [00, 01, 11], Path: []
Iteration 3: At 11, go to 11 via edge '1'
Stack: [00, 01, 11, 11], Path: []
Iteration 4: At 11, go to 10 via edge '0'
Stack: [00, 01, 11, 11, 10], Path: []
Iteration 5: At 10, go to 01 via edge '1'
Stack: [00, 01, 11, 11, 10, 01], Path: []
Iteration 6: At 01, go to 10 via edge '0'
Stack: [00, 01, 11, 11, 10, 01, 10], Path: []
Iteration 7: At 10, go to 00 via edge '0'
Stack: [00, 01, 11, 11, 10, 01, 10, 00], Path: []
Iteration 8: At 00, go to 00 via edge '0'
Stack: [00, 01, 11, 11, 10, 01, 10, 00, 00], Path: []
Iteration 9: At 00, no more edges, pop to path
Stack: [00, 01, 11, 11, 10, 01, 10, 00], Path: [(00, -)]
Continue popping until stack empty...
Step 3: Build Result
Path (reversed): 00 -> 00 -> 01 -> 10 -> 01 -> 10 -> 11 -> 11 -> 00
Edge labels: 0 1 0 1 0 1 1 0
Result: "00" + "01010110" = "0001010110"
^^^^ ^^^^^^^^
start edge labels
Verify: Contains all 3-bit patterns:
000 at pos 0, 001 at pos 1, 010 at pos 2, 101 at pos 3
010 at pos 4, 101 at pos 5, 011 at pos 6, 110 at pos 7
Wait - we have duplicates! Let me recalculate...
Correct traversal (one valid circuit):
00 -[0]-> 00 -[1]-> 01 -[1]-> 11 -[1]-> 11 -[0]-> 10 -[1]-> 01 -[0]-> 10 -[0]-> 00
Result: "00" + "01110100" = "0001110100"
Python Implementation
def solve():
n = int(input())
if n == 1:
print("01")
return
# Number of (n-1)-bit patterns (nodes)
num_nodes = 1 << (n - 1)
mask = num_nodes - 1 # Mask for (n-1) bits
# Each node has edges to node*2 % num_nodes (append 0)
# and (node*2 + 1) % num_nodes (append 1)
# We track which edges are used: edge_used[node][0] and edge_used[node][1]
edge_used = [[False, False] for _ in range(num_nodes)]
# Hierholzer's algorithm
stack = [0] # Start from node 0 (all zeros)
path = []
while stack:
node = stack[-1]
# Try edge 0, then edge 1
if not edge_used[node][0]:
edge_used[node][0] = True
next_node = (node << 1) & mask # Append 0
stack.append(next_node)
elif not edge_used[node][1]:
edge_used[node][1] = True
next_node = ((node << 1) | 1) & mask # Append 1
stack.append(next_node)
else:
path.append(stack.pop())
# Reverse path to get correct order
path.reverse()
# Build result: start with first node in binary, then append edge bits
result = []
# First node as (n-1)-bit string
first_node = path[0]
for i in range(n - 2, -1, -1):
result.append('0' if (first_node >> i) & 1 == 0 else '1')
# For each edge in path, append the last bit of destination
for i in range(1, len(path)):
result.append('1' if path[i] & 1 else '0')
print(''.join(result))
solve()
Why This Works
Correctness Proof:
- Each n-bit pattern appears exactly once:
- Each n-bit pattern is represented by exactly one edge in the graph
- An Eulerian circuit traverses each edge exactly once
- Therefore, each pattern appears exactly once as a substring
- The result is the shortest possible:
- There are 2^n different n-bit patterns
- Any string containing all of them must have at least 2^n + n - 1 characters
- Our construction achieves exactly this length
- Every window of n bits is unique:
- When we traverse edge from node A to node B, we’re adding a character
- The n-bit window at that position is: last (n-1) bits of A + edge label
- This exactly corresponds to the edge (the n-bit pattern)
- Since each edge is traversed once, each window is unique
Intuition:
- The (n-1)-bit nodes represent “context” - what we’ve seen recently
- Each edge extends the context by one bit while maintaining the window size
- The Eulerian circuit ensures we use every possible extension exactly once
Complexity Analysis
| Operation | Complexity |
|---|---|
| Graph construction | O(2^n) nodes and edges |
| Hierholzer’s algorithm | O(2^n) - visits each edge once |
| Result construction | O(2^n) |
| Total Time | O(2^n) |
| Total Space | O(2^n) for edge tracking |
Related Problems
- CSES Teleporters Path - Eulerian path
- CSES Mail Delivery - Eulerian circuit
- LeetCode 753: Cracking the Safe - Same concept, different framing