Prufer Code
Problem Overview
| Attribute | Value |
|---|---|
| CSES Link | Prufer Code |
| Difficulty | Medium |
| Category | Graph Theory / Tree Encoding |
| Time Limit | 1 second |
| Key Technique | Degree Tracking, Leaf Removal |
Learning Goals
After solving this problem, you will be able to:
- Understand the bijection between labeled trees and Prufer sequences
- Encode a tree into its unique Prufer sequence
- Decode a Prufer sequence back into the original tree
- Apply Cayley’s formula: n^(n-2) labeled trees on n vertices
Problem Statement
Problem: Given a labeled tree with n nodes, construct its Prufer code (encoding). Given a Prufer code, reconstruct the original tree (decoding).
Input (Encoding):
- Line 1: n (number of nodes)
- Lines 2 to n: Two integers a, b representing an edge
Output (Encoding):
- n-2 integers: The Prufer code sequence
Constraints:
- 3 <= n <= 10^5
- Tree is connected with exactly n-1 edges
- Nodes are labeled 1 to n
Example
Input:
5
1 2
2 3
3 4
3 5
Output:
2 3 3
Explanation:
Tree structure:
1 - 2 - 3 - 4
|
5
Step 1: Smallest leaf is 1, parent is 2. Code: [2]. Remove node 1.
Step 2: Smallest leaf is 4, parent is 3. Code: [2, 3]. Remove node 4.
Step 3: Smallest leaf is 5, parent is 3. Code: [2, 3, 3]. Remove node 5.
Stop when 2 nodes remain (nodes 2 and 3).
Intuition: How to Think About This Problem
Pattern Recognition
Key Insight: A Prufer sequence uniquely encodes a labeled tree. Each node’s frequency in the sequence equals (degree - 1).
The Prufer code establishes a bijection between labeled trees on n vertices and sequences of length n-2 where each element is in [1, n]. This proves Cayley’s formula: there are exactly n^(n-2) labeled trees.
Breaking Down the Problem
- What are we looking for? A sequence that uniquely represents the tree structure.
- What information do we have? The tree edges (adjacency).
- What’s the relationship? Repeatedly remove the smallest leaf and record its neighbor.
The Core Algorithm
Encoding: Remove leaves in order, recording their parent. Decoding: Reconstruct edges using degree information from the sequence.
Solution 1: Brute Force - O(n^2)
Idea
Repeatedly find the smallest leaf, record its parent, and remove it from the tree.
Code
def encode_brute_force(n, edges):
"""
Generate Prufer code by repeatedly finding smallest leaf.
Time: O(n^2), Space: O(n)
"""
adj = [set() for _ in range(n + 1)]
for a, b in edges:
adj[a].add(b)
adj[b].add(a)
prufer = []
for _ in range(n - 2):
# Find smallest leaf (node with degree 1)
for v in range(1, n + 1):
if len(adj[v]) == 1:
parent = next(iter(adj[v]))
prufer.append(parent)
adj[v].remove(parent)
adj[parent].remove(v)
break
return prufer
Complexity
| Metric | Value | Explanation |
|---|---|---|
| Time | O(n^2) | n iterations, each scanning up to n nodes |
| Space | O(n) | Adjacency storage |
Solution 2: Optimal - O(n) Linear Time
Key Insight
The Trick: Track node degrees and use a pointer to find the next smallest leaf efficiently.
Instead of scanning for leaves each time, maintain degree counts and process leaves in order.
Algorithm
- Compute degree of each node
- Find smallest leaf (degree = 1)
- Remove leaf, add its parent to code, update degrees
- If parent becomes a leaf AND is smaller than current pointer, process it immediately
- Otherwise, continue to next smallest unprocessed leaf
Dry Run Example
Tree: 1-2-3-4, 3-5
Edges: (1,2), (2,3), (3,4), (3,5)
Initial degrees: [-, 1, 2, 3, 1, 1] (1-indexed, node 3 has degree 3)
Step 1: ptr=1, degree[1]=1 (leaf)
Parent = 2, Code = [2]
degree[2] = 2-1 = 1 (becomes leaf, but 2 > ptr)
Step 2: ptr=2, degree[2]=1 (leaf)
But wait - we already have ptr at 1, continue scan
ptr=4, degree[4]=1 (leaf)
Parent = 3, Code = [2, 3]
degree[3] = 3-1 = 2
Step 3: ptr=5, degree[5]=1 (leaf)
Parent = 3, Code = [2, 3, 3]
degree[3] = 2-1 = 1
Stop: Only 2 nodes remain (2 and 3)
Final Prufer Code: [2, 3, 3]
Code
Python:
def encode_optimal(n, edges):
"""
Generate Prufer code in O(n) time.
Time: O(n), Space: O(n)
"""
adj = [[] for _ in range(n + 1)]
degree = [0] * (n + 1)
for a, b in edges:
adj[a].append(b)
adj[b].append(a)
degree[a] += 1
degree[b] += 1
# Find first parent of each node (for leaf removal)
parent = [0] * (n + 1)
for v in range(1, n + 1):
if adj[v]:
parent[v] = adj[v][0]
prufer = []
ptr = 1
leaf = 0
# Find first leaf
while ptr <= n and degree[ptr] != 1:
ptr += 1
leaf = ptr
for _ in range(n - 2):
# Find parent of current leaf
p = 0
for neighbor in adj[leaf]:
if degree[neighbor] > 0:
p = neighbor
break
prufer.append(p)
degree[leaf] = 0
degree[p] -= 1
# Check if parent became a smaller leaf
if degree[p] == 1 and p < ptr:
leaf = p
else:
ptr += 1
while ptr <= n and degree[ptr] != 1:
ptr += 1
leaf = ptr
return prufer
def decode_optimal(n, prufer):
"""
Reconstruct tree from Prufer code in O(n) time.
Time: O(n), Space: O(n)
"""
# Count occurrences (degree - 1 for each node)
degree = [1] * (n + 1)
degree[0] = 0
for p in prufer:
degree[p] += 1
edges = []
ptr = 1
leaf = 0
# Find first leaf (degree = 1)
while ptr <= n and degree[ptr] != 1:
ptr += 1
leaf = ptr
for p in prufer:
edges.append((leaf, p))
degree[leaf] -= 1
degree[p] -= 1
if degree[p] == 1 and p < ptr:
leaf = p
else:
ptr += 1
while ptr <= n and degree[ptr] != 1:
ptr += 1
leaf = ptr
# Add final edge between remaining two nodes
remaining = [v for v in range(1, n + 1) if degree[v] == 1]
if len(remaining) == 2:
edges.append((remaining[0], remaining[1]))
return edges
Complexity
| Metric | Value | Explanation |
|---|---|---|
| Time | O(n) | Each node processed exactly once |
| Space | O(n) | Degree array and adjacency list |
Common Mistakes
Mistake 1: Wrong Leaf Selection Order
# WRONG - Using heap adds O(log n) overhead
import heapq
leaves = [v for v in range(1, n+1) if degree[v] == 1]
heapq.heapify(leaves) # Unnecessary!
Problem: Using a priority queue makes this O(n log n). Fix: Use the pointer technique shown above for O(n).
Mistake 2: Not Handling Immediate Leaf Creation
# WRONG - Always incrementing pointer
degree[p] -= 1
ptr += 1 # Wrong! Parent might be smaller leaf
Problem: When a parent becomes a leaf, it might be smaller than current pointer.
Fix: Check if degree[p] == 1 and p < ptr: leaf = p
Mistake 3: Forgetting Final Edge in Decoding
# WRONG - Missing final edge
for p in prufer:
edges.append((leaf, p))
# ... update degrees
return edges # Missing one edge!
Problem: Prufer code has n-2 elements but tree has n-1 edges. Fix: After processing, add edge between the two remaining degree-1 nodes.
Edge Cases
| Case | Input | Expected Behavior | Why |
|---|---|---|---|
| n = 3 (minimum) | Line: 1-2-3 | Code: [2] | Smallest valid tree for Prufer |
| Star graph | 1 connected to 2,3,4,5 | Code: [1,1,1] | Center appears n-2 times |
| Path graph | 1-2-3-4-5 | Code: [2,3,4] | Each internal node once |
| Node n is leaf | Any tree with n as leaf | n never in code | Leaves don’t appear |
When to Use This Pattern
Use Prufer Codes When:
- Counting labeled trees (Cayley’s formula applications)
- Generating random labeled trees uniformly
- Encoding tree structures compactly
- Proving bijective results in combinatorics
Don’t Use When:
- Working with unlabeled/rooted trees (use different encodings)
- Need to preserve parent-child relationships (use parent arrays)
- Tree has additional attributes (weights, colors)
Pattern Recognition Checklist:
- Need to count labeled trees? -> Cayley’s formula via Prufer
- Need unique tree encoding? -> Prufer sequence
- Need to generate random tree? -> Random Prufer code + decode
Related Problems
Easier (Do These First)
| Problem | Why It Helps |
|---|---|
| Tree Diameter | Basic tree traversal |
| Tree Distances I | Tree DFS/BFS fundamentals |
Similar Difficulty
| Problem | Key Concept |
|---|---|
| Counting Trees | Cayley’s formula application |
| New Roads Queries | Tree construction |
Harder (Do These After)
| Problem | New Concept |
|---|---|
| Tree Isomorphism I | Tree hashing, canonical forms |
| Tree Isomorphism II | Rooted tree isomorphism |
Key Takeaways
- The Core Idea: Prufer codes create a bijection between labeled trees and (n-2)-length sequences.
- Time Optimization: Pointer technique avoids repeated scanning for O(n) complexity.
- Mathematical Significance: Proves Cayley’s formula: n^(n-2) labeled trees exist.
- Pattern: Degree tracking + ordered leaf removal = efficient tree encoding.
Practice Checklist
Before moving on, make sure you can:
- Encode a tree to Prufer code without looking at solution
- Decode a Prufer code back to tree edges
- Explain why the bijection works (node appears degree-1 times)
- Derive Cayley’s formula from Prufer code properties
Additional Resources
- CP-Algorithms: Prufer Code
- Wikipedia: Cayley’s Formula
- CSES Prufer Code - Tree encoding