Word Combinations
Problem Overview
| Attribute | Value |
|---|---|
| Problem Link | CSES 1731 - Word Combinations |
| Difficulty | Medium |
| Category | String Algorithms / Dynamic Programming |
| Time Limit | 1 second |
| Key Technique | Trie + DP |
Learning Goals
After solving this problem, you will be able to:
- Build and traverse a Trie data structure for dictionary lookups
- Combine Trie traversal with dynamic programming for string segmentation
- Count ways to partition a string using dictionary words (modular arithmetic)
- Recognize when Trie-based DP is more efficient than naive approaches
Problem Statement
Problem: Given a string and a dictionary of words, count the number of ways to form the string by concatenating dictionary words. Each word can be used any number of times.
Input:
- Line 1: String s (the target string)
- Line 2: Integer k (number of words in dictionary)
- Next k lines: Dictionary words
Output:
- Number of ways to construct the string, modulo 10^9 + 7
Constraints:
-
1 <= s <= 5000 - 1 <= k <= 10^5
-
1 <= word <= 5000 - Total length of all dictionary words <= 10^6
Example
Input:
ababab
4
ab
abab
b
bab
Output:
6
Explanation: The six ways to form “ababab” are:
- “ab” + “ab” + “ab”
- “ab” + “abab”
- “abab” + “ab”
- “ab” + “b” + “ab” + “ab” (invalid - “b” alone doesn’t work here)
Wait - let’s verify: The valid combinations are based on how dictionary words can concatenate to form “ababab”.
Intuition: How to Think About This Problem
Pattern Recognition
Key Question: How do we efficiently check all possible word segmentations?
This is a word segmentation counting problem. At each position in the string, we ask: “Which dictionary words start here?” For each match, we recursively count ways to form the remaining suffix.
Breaking Down the Problem
- What are we counting? Number of ways to partition string into dictionary words.
- What makes this hard? Naive check at each position is O(k * max_word_len).
-
What’s the insight? Use a Trie to check all dictionary words starting at position i in O( s ) total.
Analogy
Think of building the string like climbing stairs where each step can be of varying heights (word lengths). The Trie tells us which step sizes are available at each position.
Solution 1: Naive DP with Hash Set
Idea
Use DP where dp[i] = number of ways to form s[i:]. For each position, check all possible word endings.
Algorithm
- Store dictionary words in a hash set
- For each position i from end to start
- Try all substrings s[i:j] and check if in dictionary
- If yes, add dp[j] to dp[i]
Code
def solve_naive(s, words):
"""
Naive DP solution using hash set.
Time: O(n^2 * max_word_len) - substring creation and hashing
Space: O(n + total_word_length)
"""
MOD = 10**9 + 7
n = len(s)
word_set = set(words)
max_len = max(len(w) for w in words) if words else 0
dp = [0] * (n + 1)
dp[n] = 1 # Base case: empty suffix has 1 way
for i in range(n - 1, -1, -1):
for j in range(i + 1, min(i + max_len + 1, n + 1)):
if s[i:j] in word_set:
dp[i] = (dp[i] + dp[j]) % MOD
return dp[0]
Complexity
| Metric | Value | Explanation |
|---|---|---|
| Time | O(n^2 * L) | n positions, up to n lengths, L for hashing |
| Space | O(n + W) | DP array + word storage |
Why This Is Slow
Creating substrings and hashing them repeatedly is inefficient. For each of n positions, we might create O(n) substrings.
Solution 2: Optimal - Trie + DP
Key Insight
The Trick: Build a Trie from dictionary words. At each position i, traverse the Trie following characters s[i], s[i+1], … and collect DP contributions whenever we hit a word ending.
DP State Definition
| State | Meaning |
|---|---|
dp[i] |
Number of ways to form the suffix s[i:] using dictionary words |
In plain English: dp[i] answers “how many ways can I build the string starting from position i?”
State Transition
dp[i] = sum of dp[j] for all j where s[i:j] is a dictionary word
Why? If s[i:j] is a valid word, we can use it and then have dp[j] ways to complete the rest.
Base Case
| Case | Value | Reason |
|---|---|---|
dp[n] |
1 | Empty suffix can be formed in exactly 1 way (use no words) |
Algorithm
- Build Trie from all dictionary words
- Initialize dp[n] = 1
- For i from n-1 down to 0:
- Start at Trie root
- For j from i to n-1:
- Move to child node for character s[j]
- If no such child, break
- If current node is end of word, add dp[j+1] to dp[i]
- Return dp[0]
Dry Run Example
Input: s = "aba", words = ["a", "ab", "ba"]
Trie structure:
root
/ \
a b
| |
(end) a
| (end)
b
(end)
DP computation (right to left):
Initial: dp = [0, 0, 0, 1]
^ dp[3] = 1 (base case)
i = 2: Check s[2:] = "a"
- Traverse: root -> 'a' (is end!)
- dp[2] += dp[3] = 1
dp = [0, 0, 1, 1]
i = 1: Check s[1:] = "ba"
- Traverse: root -> 'b' -> 'a' (is end!)
- dp[1] += dp[3] = 1
dp = [0, 1, 1, 1]
i = 0: Check s[0:] = "aba"
- Traverse: root -> 'a' (is end!) -> dp[0] += dp[1] = 1
- Continue: 'a' -> 'b' (is end!) -> dp[0] += dp[2] = 2
- Continue: 'b' -> 'a'? No 'a' child from 'b' under 'a'. Stop.
dp = [2, 1, 1, 1]
Answer: dp[0] = 2
Ways: "a" + "ba" and "ab" + "a"
Visual Diagram
String: a b a
Index: 0 1 2 3
dp[3] = 1 (base)
Position 2: "a" matches
dp[2] = dp[3] = 1
Position 1: "ba" matches
dp[1] = dp[3] = 1
Position 0: "a" matches -> dp[1] = 1
"ab" matches -> dp[2] = 1
dp[0] = 1 + 1 = 2
Code
def solve_trie_dp(s, words):
"""
Optimal solution using Trie + DP.
Time: O(n^2) worst case, but typically much better
Space: O(n + total_word_length)
"""
MOD = 10**9 + 7
# Build Trie
class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
root = TrieNode()
for word in words:
node = root
for c in word:
if c not in node.children:
node.children[c] = TrieNode()
node = node.children[c]
node.is_end = True
# DP
n = len(s)
dp = [0] * (n + 1)
dp[n] = 1
for i in range(n - 1, -1, -1):
node = root
for j in range(i, n):
c = s[j]
if c not in node.children:
break
node = node.children[c]
if node.is_end:
dp[i] = (dp[i] + dp[j + 1]) % MOD
return dp[0]
# Main I/O
import sys
input = sys.stdin.readline
def main():
s = input().strip()
k = int(input())
words = [input().strip() for _ in range(k)]
print(solve_trie_dp(s, words))
if __name__ == "__main__":
main()
Complexity
| Metric | Value | Explanation |
|---|---|---|
| Time | O(n^2) | For each of n positions, traverse at most n characters |
| Space | O(n + W) | DP array + Trie with total word length W |
Common Mistakes
Mistake 1: Forgetting Modular Arithmetic
# WRONG
dp[i] = dp[i] + dp[j + 1] # Can overflow or give wrong answer
# CORRECT
dp[i] = (dp[i] + dp[j + 1]) % MOD
Problem: The count can be astronomically large. Fix: Always apply modulo at each addition.
Mistake 2: Wrong DP Direction
# WRONG - Forward DP without proper setup
dp[0] = 1
for i in range(n):
for j in range(i + 1, n + 1):
if s[i:j] in word_set:
dp[j] += dp[i] # Might miss some cases
# CORRECT - Backward DP is cleaner
dp[n] = 1
for i in range(n - 1, -1, -1):
# ... check words starting at i
Problem: Forward DP requires careful handling of which states are “complete”. Fix: Backward DP naturally builds on completed suffix counts.
Mistake 3: Not Breaking Early in Trie Traversal
# WRONG - Continues even when no match possible
for j in range(i, n):
word = s[i:j+1]
if word in word_set:
dp[i] += dp[j + 1]
# CORRECT - Stop when Trie path ends
for j in range(i, n):
if s[j] not in node.children:
break # No dictionary word continues this way
node = node.children[s[j]]
Problem: Wasting time on impossible prefixes. Fix: Trie naturally allows early termination.
Edge Cases
| Case | Input | Expected Output | Why |
|---|---|---|---|
| Empty match only | s="" |
1 | Empty string has one way (use nothing) |
| No valid segmentation | s="abc", words=["ab","cd"] |
0 | Cannot form “abc” |
| Single character | s="a", words=["a"] |
1 | One way |
| Overlapping words | s="aa", words=["a","aa"] |
3 | “a”+”a”, “aa” wait… |
| Large repeated pattern | s="aaa...a" (5000 a’s), words=["a"] |
1 | Only one way |
When to Use This Pattern
Use Trie + DP When:
- Dictionary contains many words with shared prefixes
- Need to check multiple words starting at each position
- String length is moderate (up to ~5000-10000)
- Words can be reused multiple times
Alternative Approaches:
- Aho-Corasick: When you need to find all occurrences of dictionary words in text (more complex setup)
- Hash Set DP: When dictionary is small and words are short
- Rolling Hash: When you need O(1) substring comparison
Pattern Recognition Checklist:
- Counting string segmentations? -> Trie + DP
- Dictionary words with shared prefixes? -> Trie helps
- Need all occurrences of multiple patterns? -> Aho-Corasick
- Just checking if segmentation exists? -> Simpler DP suffices
Related Problems
Easier (Do These First)
| Problem | Link | Why It Helps |
|---|---|---|
| String Matching | CSES 1753 | Basic pattern matching |
| Finding Borders | CSES 1732 | String prefix-suffix understanding |
Similar Difficulty
| Problem | Link | Key Difference |
|---|---|---|
| Finding Periods | CSES 1733 | Period detection in strings |
| Minimal Rotation | CSES 1110 | Lexicographically smallest rotation |
Harder (Do These After)
| Problem | Link | New Concept |
|---|---|---|
| Pattern Positions | CSES 1757 | Aho-Corasick for multiple patterns |
| Distinct Substrings | CSES 2105 | Suffix array techniques |
Key Takeaways
-
The Core Idea: Use a Trie to efficiently enumerate all dictionary words starting at each position, combined with DP to count segmentations.
-
Time Optimization: From O(n^2 * L) with hash set to O(n^2) with Trie (and often better in practice due to early termination).
-
Space Trade-off: Trie uses O(total word length) space but enables O(1) character transitions.
-
Pattern: This is the “Trie + DP for String Segmentation” pattern - useful for word break, word combinations, and similar problems.
Practice Checklist
Before moving on, make sure you can:
- Implement a basic Trie with insert and search
- Explain why backward DP works for this problem
- Trace through the algorithm on paper with a small example
- Implement the solution in under 15 minutes
- Handle the modular arithmetic correctly
Additional Resources
- CP-Algorithms: Trie
- CP-Algorithms: Aho-Corasick
- CSES Word Combinations - Trie-based DP