Word Combinations

Problem Overview

Attribute Value
Problem Link CSES 1731 - Word Combinations
Difficulty Medium
Category String Algorithms / Dynamic Programming
Time Limit 1 second
Key Technique Trie + DP

Learning Goals

After solving this problem, you will be able to:

  • Build and traverse a Trie data structure for dictionary lookups
  • Combine Trie traversal with dynamic programming for string segmentation
  • Count ways to partition a string using dictionary words (modular arithmetic)
  • Recognize when Trie-based DP is more efficient than naive approaches

Problem Statement

Problem: Given a string and a dictionary of words, count the number of ways to form the string by concatenating dictionary words. Each word can be used any number of times.

Input:

  • Line 1: String s (the target string)
  • Line 2: Integer k (number of words in dictionary)
  • Next k lines: Dictionary words

Output:

  • Number of ways to construct the string, modulo 10^9 + 7

Constraints:

  • 1 <= s <= 5000
  • 1 <= k <= 10^5
  • 1 <= word <= 5000
  • Total length of all dictionary words <= 10^6

Example

Input:
ababab
4
ab
abab
b
bab

Output:
6

Explanation: The six ways to form “ababab” are:

  1. “ab” + “ab” + “ab”
  2. “ab” + “abab”
  3. “abab” + “ab”
  4. “ab” + “b” + “ab” + “ab” (invalid - “b” alone doesn’t work here)

Wait - let’s verify: The valid combinations are based on how dictionary words can concatenate to form “ababab”.


Intuition: How to Think About This Problem

Pattern Recognition

Key Question: How do we efficiently check all possible word segmentations?

This is a word segmentation counting problem. At each position in the string, we ask: “Which dictionary words start here?” For each match, we recursively count ways to form the remaining suffix.

Breaking Down the Problem

  1. What are we counting? Number of ways to partition string into dictionary words.
  2. What makes this hard? Naive check at each position is O(k * max_word_len).
  3. What’s the insight? Use a Trie to check all dictionary words starting at position i in O( s ) total.

Analogy

Think of building the string like climbing stairs where each step can be of varying heights (word lengths). The Trie tells us which step sizes are available at each position.


Solution 1: Naive DP with Hash Set

Idea

Use DP where dp[i] = number of ways to form s[i:]. For each position, check all possible word endings.

Algorithm

  1. Store dictionary words in a hash set
  2. For each position i from end to start
  3. Try all substrings s[i:j] and check if in dictionary
  4. If yes, add dp[j] to dp[i]

Code

def solve_naive(s, words):
  """
  Naive DP solution using hash set.

  Time: O(n^2 * max_word_len) - substring creation and hashing
  Space: O(n + total_word_length)
  """
  MOD = 10**9 + 7
  n = len(s)
  word_set = set(words)
  max_len = max(len(w) for w in words) if words else 0

  dp = [0] * (n + 1)
  dp[n] = 1  # Base case: empty suffix has 1 way

  for i in range(n - 1, -1, -1):
    for j in range(i + 1, min(i + max_len + 1, n + 1)):
      if s[i:j] in word_set:
        dp[i] = (dp[i] + dp[j]) % MOD

  return dp[0]

Complexity

Metric Value Explanation
Time O(n^2 * L) n positions, up to n lengths, L for hashing
Space O(n + W) DP array + word storage

Why This Is Slow

Creating substrings and hashing them repeatedly is inefficient. For each of n positions, we might create O(n) substrings.


Solution 2: Optimal - Trie + DP

Key Insight

The Trick: Build a Trie from dictionary words. At each position i, traverse the Trie following characters s[i], s[i+1], … and collect DP contributions whenever we hit a word ending.

DP State Definition

State Meaning
dp[i] Number of ways to form the suffix s[i:] using dictionary words

In plain English: dp[i] answers “how many ways can I build the string starting from position i?”

State Transition

dp[i] = sum of dp[j] for all j where s[i:j] is a dictionary word

Why? If s[i:j] is a valid word, we can use it and then have dp[j] ways to complete the rest.

Base Case

Case Value Reason
dp[n] 1 Empty suffix can be formed in exactly 1 way (use no words)

Algorithm

  1. Build Trie from all dictionary words
  2. Initialize dp[n] = 1
  3. For i from n-1 down to 0:
    • Start at Trie root
    • For j from i to n-1:
      • Move to child node for character s[j]
      • If no such child, break
      • If current node is end of word, add dp[j+1] to dp[i]
  4. Return dp[0]

Dry Run Example

Input: s = "aba", words = ["a", "ab", "ba"]

Trie structure:
    root
    /  \
   a    b
   |    |
  (end) a
   |   (end)
   b
  (end)

DP computation (right to left):

Initial: dp = [0, 0, 0, 1]
                        ^ dp[3] = 1 (base case)

i = 2: Check s[2:] = "a"
  - Traverse: root -> 'a' (is end!)
  - dp[2] += dp[3] = 1
  dp = [0, 0, 1, 1]

i = 1: Check s[1:] = "ba"
  - Traverse: root -> 'b' -> 'a' (is end!)
  - dp[1] += dp[3] = 1
  dp = [0, 1, 1, 1]

i = 0: Check s[0:] = "aba"
  - Traverse: root -> 'a' (is end!) -> dp[0] += dp[1] = 1
  - Continue: 'a' -> 'b' (is end!) -> dp[0] += dp[2] = 2
  - Continue: 'b' -> 'a'? No 'a' child from 'b' under 'a'. Stop.
  dp = [2, 1, 1, 1]

Answer: dp[0] = 2
Ways: "a" + "ba" and "ab" + "a"

Visual Diagram

String: a b a
Index:  0 1 2 3

dp[3] = 1 (base)

Position 2: "a" matches
            dp[2] = dp[3] = 1

Position 1: "ba" matches
            dp[1] = dp[3] = 1

Position 0: "a" matches -> dp[1] = 1
            "ab" matches -> dp[2] = 1
            dp[0] = 1 + 1 = 2

Code

def solve_trie_dp(s, words):
  """
  Optimal solution using Trie + DP.

  Time: O(n^2) worst case, but typically much better
  Space: O(n + total_word_length)
  """
  MOD = 10**9 + 7

  # Build Trie
  class TrieNode:
    def __init__(self):
      self.children = {}
      self.is_end = False

  root = TrieNode()
  for word in words:
    node = root
    for c in word:
      if c not in node.children:
        node.children[c] = TrieNode()
      node = node.children[c]
    node.is_end = True

  # DP
  n = len(s)
  dp = [0] * (n + 1)
  dp[n] = 1

  for i in range(n - 1, -1, -1):
    node = root
    for j in range(i, n):
      c = s[j]
      if c not in node.children:
        break
      node = node.children[c]
      if node.is_end:
        dp[i] = (dp[i] + dp[j + 1]) % MOD

  return dp[0]


# Main I/O
import sys
input = sys.stdin.readline

def main():
  s = input().strip()
  k = int(input())
  words = [input().strip() for _ in range(k)]
  print(solve_trie_dp(s, words))

if __name__ == "__main__":
  main()

Complexity

Metric Value Explanation
Time O(n^2) For each of n positions, traverse at most n characters
Space O(n + W) DP array + Trie with total word length W

Common Mistakes

Mistake 1: Forgetting Modular Arithmetic

# WRONG
dp[i] = dp[i] + dp[j + 1]  # Can overflow or give wrong answer

# CORRECT
dp[i] = (dp[i] + dp[j + 1]) % MOD

Problem: The count can be astronomically large. Fix: Always apply modulo at each addition.

Mistake 2: Wrong DP Direction

# WRONG - Forward DP without proper setup
dp[0] = 1
for i in range(n):
  for j in range(i + 1, n + 1):
    if s[i:j] in word_set:
      dp[j] += dp[i]  # Might miss some cases

# CORRECT - Backward DP is cleaner
dp[n] = 1
for i in range(n - 1, -1, -1):
  # ... check words starting at i

Problem: Forward DP requires careful handling of which states are “complete”. Fix: Backward DP naturally builds on completed suffix counts.

Mistake 3: Not Breaking Early in Trie Traversal

# WRONG - Continues even when no match possible
for j in range(i, n):
  word = s[i:j+1]
  if word in word_set:
    dp[i] += dp[j + 1]

# CORRECT - Stop when Trie path ends
for j in range(i, n):
  if s[j] not in node.children:
    break  # No dictionary word continues this way
  node = node.children[s[j]]

Problem: Wasting time on impossible prefixes. Fix: Trie naturally allows early termination.


Edge Cases

Case Input Expected Output Why
Empty match only s="" 1 Empty string has one way (use nothing)
No valid segmentation s="abc", words=["ab","cd"] 0 Cannot form “abc”
Single character s="a", words=["a"] 1 One way
Overlapping words s="aa", words=["a","aa"] 3 “a”+”a”, “aa” wait…
Large repeated pattern s="aaa...a" (5000 a’s), words=["a"] 1 Only one way

When to Use This Pattern

Use Trie + DP When:

  • Dictionary contains many words with shared prefixes
  • Need to check multiple words starting at each position
  • String length is moderate (up to ~5000-10000)
  • Words can be reused multiple times

Alternative Approaches:

  • Aho-Corasick: When you need to find all occurrences of dictionary words in text (more complex setup)
  • Hash Set DP: When dictionary is small and words are short
  • Rolling Hash: When you need O(1) substring comparison

Pattern Recognition Checklist:

  • Counting string segmentations? -> Trie + DP
  • Dictionary words with shared prefixes? -> Trie helps
  • Need all occurrences of multiple patterns? -> Aho-Corasick
  • Just checking if segmentation exists? -> Simpler DP suffices

Easier (Do These First)

Problem Link Why It Helps
String Matching CSES 1753 Basic pattern matching
Finding Borders CSES 1732 String prefix-suffix understanding

Similar Difficulty

Problem Link Key Difference
Finding Periods CSES 1733 Period detection in strings
Minimal Rotation CSES 1110 Lexicographically smallest rotation

Harder (Do These After)

Problem Link New Concept
Pattern Positions CSES 1757 Aho-Corasick for multiple patterns
Distinct Substrings CSES 2105 Suffix array techniques

Key Takeaways

  1. The Core Idea: Use a Trie to efficiently enumerate all dictionary words starting at each position, combined with DP to count segmentations.

  2. Time Optimization: From O(n^2 * L) with hash set to O(n^2) with Trie (and often better in practice due to early termination).

  3. Space Trade-off: Trie uses O(total word length) space but enables O(1) character transitions.

  4. Pattern: This is the “Trie + DP for String Segmentation” pattern - useful for word break, word combinations, and similar problems.


Practice Checklist

Before moving on, make sure you can:

  • Implement a basic Trie with insert and search
  • Explain why backward DP works for this problem
  • Trace through the algorithm on paper with a small example
  • Implement the solution in under 15 minutes
  • Handle the modular arithmetic correctly

Additional Resources