Hash Map
Problems that leverage hash maps (dictionaries) for efficient lookups, counting, and data organization with average O(1) time complexity.
Problems
Camp Schedule
Problem Information
- Source: Codeforces
- Difficulty: Secret
- Time Limit: 1000ms
- Memory Limit: 512MB
Problem Statement
A programming camp has schedule s (binary string where ‘1’ = contest day, ‘0’ = day off). Gleb wants the optimal schedule t to appear as a substring as many times as possible, while keeping the same number of contest days and days off.
Rearrange the characters of s to maximize occurrences of t as a substring.
Input Format
-
First line: string s (1 ≤ s ≤ 500000) -
Second line: string t (1 ≤ t ≤ 500000) - Both strings contain only ‘0’ and ‘1’
Output Format
Print the rearranged schedule with maximum occurrences of t. Must have same count of 0s and 1s as s.
Solution
Approach
Use KMP to find the failure function of t, which tells us the overlap when placing t consecutively. Greedily place as many copies of t as possible using the available 0s and 1s.
Python Solution
def compute_lps(pattern):
"""KMP failure function"""
m = len(pattern)
lps = [0] * m
length = 0
i = 1
while i < m:
if pattern[i] == pattern[length]:
length += 1
lps[i] = length
i += 1
elif length != 0:
length = lps[length - 1]
else:
lps[i] = 0
i += 1
return lps
def solve():
s = input().strip()
t = input().strip()
# Count available 0s and 1s
zeros = s.count('0')
ones = s.count('1')
# Count 0s and 1s needed for one copy of t
t_zeros = t.count('0')
t_ones = t.count('1')
if t_zeros == 0 and t_ones == 0:
print(s)
return
# Compute LPS for overlap
lps = compute_lps(t)
overlap = lps[-1] if t else 0
# The non-overlapping suffix of t
suffix = t[overlap:]
suffix_zeros = suffix.count('0')
suffix_ones = suffix.count('1')
result = []
# Place first copy of t if possible
if zeros >= t_zeros and ones >= t_ones:
result.append(t)
zeros -= t_zeros
ones -= t_ones
# Place more copies using overlap
while zeros >= suffix_zeros and ones >= suffix_ones:
result.append(suffix)
zeros -= suffix_zeros
ones -= suffix_ones
# Append remaining characters
result.append('0' * zeros)
result.append('1' * ones)
print(''.join(result))
if __name__ == "__main__":
solve()
Alternative
def solve():
s = input().strip()
t = input().strip()
cnt0 = s.count('0')
cnt1 = s.count('1')
t0 = t.count('0')
t1 = t.count('1')
# KMP failure function
def kmp_fail(p):
n = len(p)
fail = [0] * n
j = 0
for i in range(1, n):
while j > 0 and p[i] != p[j]:
j = fail[j-1]
if p[i] == p[j]:
j += 1
fail[i] = j
return fail
fail = kmp_fail(t)
overlap = fail[-1] if t else 0
# After first t, each additional t adds only t[overlap:]
add = t[overlap:]
add0 = add.count('0')
add1 = add.count('1')
ans = []
# Try to place first t
if cnt0 >= t0 and cnt1 >= t1:
ans.append(t)
cnt0 -= t0
cnt1 -= t1
# Keep adding overlapped copies
while cnt0 >= add0 and cnt1 >= add1:
ans.append(add)
cnt0 -= add0
cnt1 -= add1
# Fill remaining
ans.append('0' * cnt0 + '1' * cnt1)
print(''.join(ans))
if __name__ == "__main__":
solve()
Complexity Analysis
-
Time Complexity: O( s + t ) -
Space Complexity: O( s + t )
Key Insight
Using KMP’s failure function, we know how much of t overlaps with itself. After placing the first copy of t, each additional copy only needs the non-overlapping suffix, saving characters for more copies.
Cipher
Problem Information
- Source: HackerRank
- Difficulty: Secret
- Time Limit: 1000ms
- Memory Limit: 512MB
Problem Statement
Jack and Daniel encrypt messages using a cipher. A message b is encoded by writing it k times (shifted by 0, 1, …, k-1 positions) and XORing all columns together.
Given encoded string s and parameter k, decode the original message.
Example: b = 1001011, k = 4
1001011 shift 0
01001011 shift 1
001001011 shift 2
0001001011 shift 3
----------
1110101001 XORed result s
Input Format
- First line: n and k (length of original, number of shifts)
- Second line: encoded string s of length n + k - 1
Constraints
- 1 ≤ n ≤ 10⁶
- 1 ≤ k ≤ 10⁶
Output Format
Decoded message of length n.
Solution
Approach
For each position i in the encoded string s:
- s[i] = XOR of b[i], b[i-1], …, b[i-k+1] (for valid indices)
We can decode left to right:
- b[0] = s[0]
- For i ≥ 1: b[i] = s[i] XOR b[i-1] XOR b[i-2] XOR … (up to k-1 previous bits)
Use a running XOR of the last min(k, i+1) bits of b.
Python Solution
def solve():
n, k = map(int, input().split())
s = input().strip()
b = []
xor_sum = 0 # Running XOR of last k bits of b
for i in range(n):
# s[i] = xor of b[max(0,i-k+1)] to b[i]
# So b[i] = s[i] XOR (xor of b[max(0,i-k+1)] to b[i-1])
# The "xor of previous" is our running xor_sum
bit = int(s[i]) ^ xor_sum
b.append(str(bit))
# Update running XOR
xor_sum ^= bit
# Remove contribution of b[i-k+1] if it exists (sliding window)
if i >= k - 1:
xor_sum ^= int(b[i - k + 1])
print(''.join(b))
if __name__ == "__main__":
solve()
Alternative
def solve():
n, k = map(int, input().split())
s = input().strip()
result = []
window_xor = 0
for i in range(n):
# Decode b[i]
# s[i] contains XOR of b[i-k+1..i], but we only have b[0..i-1]
# s[i] = window_xor ^ b[i], so b[i] = s[i] ^ window_xor
b_i = int(s[i]) ^ window_xor
result.append(b_i)
# Add b[i] to window
window_xor ^= b_i
# Remove b[i-k+1] from window if window exceeds k
if i - k + 1 >= 0:
window_xor ^= result[i - k + 1]
print(''.join(map(str, result)))
if __name__ == "__main__":
solve()
One-liner
def solve():
n, k = map(int, input().split())
s = input().strip()
b = [0] * n
xor = 0
for i in range(n):
b[i] = int(s[i]) ^ xor
xor ^= b[i]
if i >= k - 1:
xor ^= b[i - k + 1]
print(''.join(map(str, b)))
if __name__ == "__main__":
solve()
Complexity Analysis
- Time Complexity: O(N)
- Space Complexity: O(N)
Key Insight
The encoded string at position i is the XOR of a window of k bits from the original. By maintaining a sliding window XOR, we can decode each bit in O(1) time. When the window slides past position k, we XOR out the oldest bit.
Good Substrings
Problem Information
- Source: Codeforces
- Difficulty: Secret
- Time Limit: 2000ms
- Memory Limit: 512MB
Problem Statement
You’ve got string s, consisting of small English letters. Some of the English letters are good, the rest are bad.
| A substring s[l…r] (1 ≤ l ≤ r ≤ | s | ) of string s = s1s2…s | s | (where | s | is the length of string s) is string sl, sl+1, …, sr. |
The substring s[l…r] is good, if among the letters sl, sl+1, …, sr there are at most k bad ones.
Your task is to find the number of distinct good substrings of the given string s. Two substrings s[x…y] and s[p…q] are considered distinct if their content is different, i.e. s[x…y] ≠ s[p…q].
Input Format
- The first line of the input is the non-empty string s, consisting of small English letters, the string’s length is at most 1500 characters.
- The second line of the input is the string of characters 0 and 1, the length is exactly 26 characters. Character equals 1 means the i-th English letter is good, otherwise it’s bad.
-
The third line of the input consists a single integer k (0 ≤ k ≤ s ) - the maximum acceptable number of bad characters in a good substring.
Output Format
Print a single integer - the number of distinct good substrings of string s.
Solution
Approach
We need to count distinct substrings with at most k bad characters. Two approaches:
- Hash Set: Store all good substrings in a set and count unique ones
- Trie: Use a Trie to efficiently store and count unique substrings
The Trie approach is more efficient for this problem.
Python Solution
def solve():
s = input().strip()
good_bad = input().strip()
k = int(input())
def is_bad(c):
return good_bad[ord(c) - ord('a')] == '0'
n = len(s)
good_substrings = set()
for i in range(n):
bad_count = 0
substring = ""
for j in range(i, n):
substring += s[j]
if is_bad(s[j]):
bad_count += 1
if bad_count > k:
break
good_substrings.add(substring)
print(len(good_substrings))
if __name__ == "__main__":
solve()
Python Solution
class TrieNode:
def __init__(self):
self.children = {}
self.is_end = False
def solve():
s = input().strip()
good_bad = input().strip()
k = int(input())
def is_bad(c):
return good_bad[ord(c) - ord('a')] == '0'
n = len(s)
root = TrieNode()
count = 0
for i in range(n):
bad_count = 0
node = root
for j in range(i, n):
c = s[j]
if is_bad(c):
bad_count += 1
if bad_count > k:
break
if c not in node.children:
node.children[c] = TrieNode()
count += 1 # New unique substring found
node = node.children[c]
print(count)
if __name__ == "__main__":
solve()
Python Solution
def solve():
s = input().strip()
good_bad = input().strip()
k = int(input())
def is_bad(c):
return good_bad[ord(c) - ord('a')] == '0'
n = len(s)
MOD = 10**18 + 9
BASE = 31
good_substrings = set()
for i in range(n):
bad_count = 0
hash_val = 0
power = 1
for j in range(i, n):
if is_bad(s[j]):
bad_count += 1
if bad_count > k:
break
# Rolling hash
hash_val = (hash_val * BASE + ord(s[j]) - ord('a') + 1) % MOD
good_substrings.add((j - i + 1, hash_val)) # (length, hash) pair
print(len(good_substrings))
if __name__ == "__main__":
solve()
Complexity Analysis
- Time Complexity: O(n²) for generating all substrings
- Space Complexity: O(n²) in worst case for storing substrings
Key Insight
Using a Trie allows us to efficiently detect when a new unique substring is found - it happens exactly when we create a new node in the Trie.
N meetings in one room
Problem Information
- Source: GeeksforGeeks
- Difficulty: Secret
- Time Limit: 1000ms
- Memory Limit: 512MB
Problem Statement
There is one meeting room in Flipkart. There are n meetings in the form of (Sᵢ, Fᵢ) where Sᵢ is the start time and Fᵢ is the finish time of meeting i.
Find the maximum number of meetings that can be accommodated in the meeting room.
Input Format
- First line: T (number of test cases)
- For each test case:
- First line: N (number of meetings)
- Second line: N start times Sᵢ
- Third line: N finish times Fᵢ
Output Format
Print the order in which meetings should take place (1-indexed), separated by spaces.
Solution
Approach
This is the classic Activity Selection problem. Greedy approach:
- Sort meetings by finish time
- Select the meeting with earliest finish time
- Skip all meetings that overlap with selected meeting
- Repeat until no meetings left
Python Solution
def solve():
t = int(input())
for _ in range(t):
n = int(input())
starts = list(map(int, input().split()))
ends = list(map(int, input().split()))
# Create list of (end_time, start_time, original_index)
meetings = [(ends[i], starts[i], i + 1) for i in range(n)]
# Sort by end time
meetings.sort()
result = []
last_end = -1
for end, start, idx in meetings:
if start > last_end:
result.append(idx)
last_end = end
print(' '.join(map(str, result)))
if __name__ == "__main__":
solve()
Alternative
def solve():
t = int(input())
for _ in range(t):
n = int(input())
starts = list(map(int, input().split()))
ends = list(map(int, input().split()))
# Create meetings with 1-based index
meetings = []
for i in range(n):
meetings.append((ends[i], starts[i], i + 1))
# Sort by end time, then by start time
meetings.sort(key=lambda x: (x[0], x[1]))
selected = []
prev_end = 0
for end, start, idx in meetings:
if start >= prev_end:
selected.append(idx)
prev_end = end
print(' '.join(map(str, selected)))
if __name__ == "__main__":
solve()
Complexity Analysis
- Time Complexity: O(N log N) for sorting
- Space Complexity: O(N) for storing meetings
Key Insight
The greedy choice is to always pick the meeting that ends earliest among non-conflicting meetings. This leaves maximum room for subsequent meetings. Sorting by end time and greedily selecting non-overlapping meetings gives the optimal solution.
Suffix Equal Prefix
Problem Information
- Source: SPOJ
- Difficulty: Secret
- Time Limit: 2000ms
- Memory Limit: 512MB
Problem Statement
There is a curious boy in Noapara village, his name is “MODON”. He wrote a line of small letters only. Now, he wants to know how many proper suffixes are also prefixes of that line. A proper suffix is a suffix of a text whose length is less than the original text.
Help “MODON” find the answer.
Input Format
- First line: T (T ≤ 50), the number of test cases
- Each case: A string S where 1 ≤ length(S) ≤ 10^6
Output Format
For each case, print the case number and the answer.
Solution
Approach
Use the KMP prefix function (failure function). The prefix function pi[i] gives the length of the longest proper prefix of s[0..i] that is also a suffix.
To count all proper suffixes that are also prefixes:
- Compute prefix function for the string
- Start from
pi[n-1]and follow the chain:pi[n-1],pi[pi[n-1]-1], etc. - Each non-zero value in this chain represents a valid suffix=prefix length
Python Solution
def compute_prefix(s):
"""KMP prefix function"""
n = len(s)
pi = [0] * n
for i in range(1, n):
j = pi[i - 1]
while j > 0 and s[i] != s[j]:
j = pi[j - 1]
if s[i] == s[j]:
j += 1
pi[i] = j
return pi
def count_suffix_prefix(s):
"""Count proper suffixes that are also prefixes"""
n = len(s)
if n == 0:
return 0
pi = compute_prefix(s)
# Follow the chain from pi[n-1]
count = 0
length = pi[n - 1]
while length > 0:
count += 1
length = pi[length - 1]
return count
def solve():
t = int(input())
for case in range(1, t + 1):
s = input().strip()
result = count_suffix_prefix(s)
print(f"Case {case}: {result}")
if __name__ == "__main__":
solve()
Alternative
def z_function(s):
"""Z-function: z[i] = length of longest substring starting from i that matches prefix"""
n = len(s)
z = [0] * n
z[0] = n
l, r = 0, 0
for i in range(1, n):
if i < r:
z[i] = min(r - i, z[i - l])
while i + z[i] < n and s[z[i]] == s[i + z[i]]:
z[i] += 1
if i + z[i] > r:
l, r = i, i + z[i]
return z
def count_suffix_prefix_z(s):
"""Count using Z-function"""
n = len(s)
if n == 0:
return 0
z = z_function(s)
# A suffix starting at position i is also a prefix if z[i] == n - i
count = 0
for i in range(1, n):
if z[i] == n - i:
count += 1
return count
def solve():
t = int(input())
for case in range(1, t + 1):
s = input().strip()
result = count_suffix_prefix_z(s)
print(f"Case {case}: {result}")
if __name__ == "__main__":
solve()
Complexity Analysis
-
Time Complexity: O( S ) for computing prefix/Z function -
Space Complexity: O( S ) for the prefix/Z array
Key Insight
The KMP prefix function naturally encodes information about suffix-prefix matches. If pi[n-1] = k, then the suffix of length k equals the prefix of length k. Following the chain pi[k-1], pi[pi[k-1]-1], etc. gives us all such matching lengths.
Example: For “abcab”, pi = [0,0,0,1,2], so pi[4]=2 means “ab” is both suffix and prefix. Following: pi[1]=0, so we stop. Answer = 1.
The Monk and Prateek
Problem Information
- Source: HackerEarth
- Difficulty: Secret
- Time Limit: 1000ms
- Memory Limit: 256MB
Problem Statement
The Monk is testing Prateek with a hash function called the r3gz3n function:
f(n) = N ⊕ (sum of digits of N)
For example, f(81) = 81 ⊕ (8 + 1) = 81 ⊕ 9 = 88.
Given a list of N numbers, find:
- The value of the r3gz3n function which occurs the maximum number of times
- The number of collisions with the hash function
Notes:
- If all values are unique, print the maximum hashed value
- If multiple hashed values have the same maximum count, print the smallest one
Input Format
- First line: N (number of numbers)
- Second line: N integers separated by space
Constraints
- 1 ≤ N ≤ 10^5
- 1 ≤ Nᵢ ≤ 10^7
Output Format
Print two integers: the value with maximum occurrences (or max value if all unique) and the number of collisions.
Solution
Approach
- Compute the hash value for each number:
f(n) = n XOR digit_sum(n) - Use a hash map to count occurrences of each hash value
- Count collisions (when a hash value already exists)
- Find the most frequent hash value (smallest if tie)
Python Solution
def digit_sum(n):
total = 0
while n:
total += n % 10
n //= 10
return total
def r3gz3n(n):
return n ^ digit_sum(n)
def solve():
n = int(input())
numbers = list(map(int, input().split()))
hash_count = {}
collisions = 0
max_hash = float('-inf')
for num in numbers:
h = r3gz3n(num)
max_hash = max(max_hash, h)
if h in hash_count:
collisions += 1
hash_count[h] += 1
else:
hash_count[h] = 1
# Find most frequent hash value
if collisions == 0:
# All unique - return max hash value
print(max_hash, 0)
else:
# Find value with maximum count (smallest if tie)
max_count = max(hash_count.values())
most_frequent = min(h for h, c in hash_count.items() if c == max_count)
print(most_frequent, collisions)
if __name__ == "__main__":
solve()
Alternative
from collections import Counter
def solve():
n = int(input())
nums = list(map(int, input().split()))
def hash_func(x):
return x ^ sum(int(d) for d in str(x))
hashes = [hash_func(x) for x in nums]
count = Counter(hashes)
collisions = len(hashes) - len(count)
if collisions == 0:
result = max(hashes)
else:
max_freq = max(count.values())
result = min(h for h, c in count.items() if c == max_freq)
print(result, collisions)
if __name__ == "__main__":
solve()
One-liner
from collections import Counter
def solve():
n = int(input())
nums = list(map(int, input().split()))
h = lambda x: x ^ sum(map(int, str(x)))
hashes = [h(x) for x in nums]
cnt = Counter(hashes)
colls = len(hashes) - len(cnt)
if colls == 0:
print(max(hashes), 0)
else:
mx = max(cnt.values())
print(min(k for k, v in cnt.items() if v == mx), colls)
if __name__ == "__main__":
solve()
Complexity Analysis
- Time Complexity: O(N × D) where D is the average number of digits (at most 8 for 10^7)
- Space Complexity: O(N) for the hash count map
Key Insight
A collision occurs when two different input numbers produce the same hash value. The total number of collisions is total_numbers - unique_hash_values. When finding the most frequent hash value with ties, we need to return the smallest one.
Watto and Mechanism
Problem Information
- Source: Codeforces
- Difficulty: Secret
- Time Limit: 1000ms
- Memory Limit: 256MB
Problem Statement
Watto has a mechanism with n strings in memory. For each query string s, determine if there exists a string t in memory that:
- Has the same length as s
- Differs from s in exactly one position
Strings contain only letters ‘a’, ‘b’, ‘c’.
Input Format
- First line: n and m (number of strings and queries)
- Next n lines: strings in memory
- Next m lines: query strings
- Total length ≤ 6×10⁵
Output Format
For each query, print “YES” if such string exists, “NO” otherwise.
Solution
Approach
Use hashing. For each string, compute hash values for all possible single-character changes. Store original hashes in a set. For queries, check if any single-character change produces a hash in the set.
Python Solution
def solve():
import sys
input = sys.stdin.readline
n, m = map(int, input().split())
# Store hashes of all strings in memory, grouped by length
from collections import defaultdict
memory = defaultdict(set)
BASE = 31
MOD = 10**18 + 9
def compute_hash(s):
h = 0
for c in s:
h = (h * BASE + ord(c) - ord('a') + 1) % MOD
return h
# Precompute powers of BASE
max_len = 600001
pw = [1] * max_len
for i in range(1, max_len):
pw[i] = (pw[i-1] * BASE) % MOD
for _ in range(n):
s = input().strip()
h = compute_hash(s)
memory[len(s)].add(h)
results = []
for _ in range(m):
s = input().strip()
length = len(s)
h = compute_hash(s)
found = False
current_hash = h
# Try changing each position
for i in range(length):
pos = length - 1 - i # Position from right
old_val = ord(s[i]) - ord('a') + 1
for c in 'abc':
new_val = ord(c) - ord('a') + 1
if new_val != old_val:
# Compute new hash
diff = (new_val - old_val) * pw[pos]
new_hash = (current_hash + diff) % MOD
if new_hash in memory[length]:
found = True
break
if found:
break
results.append("YES" if found else "NO")
print('\n'.join(results))
if __name__ == "__main__":
solve()
Alternative
def solve():
n, m = map(int, input().split())
from collections import defaultdict
BASE = 31
MOD = 2**61 - 1
strings_by_len = defaultdict(set)
def poly_hash(s):
h = 0
for c in s:
h = (h * BASE + ord(c)) % MOD
return h
# Read and hash all strings
for _ in range(n):
s = input().strip()
strings_by_len[len(s)].add(poly_hash(s))
# Precompute powers
pow_base = [1]
for _ in range(600005):
pow_base.append((pow_base[-1] * BASE) % MOD)
for _ in range(m):
s = input().strip()
L = len(s)
h = poly_hash(s)
found = False
for i in range(L):
old_c = ord(s[i])
for new_c in [ord('a'), ord('b'), ord('c')]:
if new_c != old_c:
diff = (new_c - old_c) * pow_base[L - 1 - i]
new_h = (h + diff) % MOD
if new_h in strings_by_len[L]:
found = True
break
if found:
break
print("YES" if found else "NO")
if __name__ == "__main__":
solve()
Complexity Analysis
- Time Complexity: O(N×L + M×L×3) where L is average string length
- Space Complexity: O(N) for storing hashes
Key Insight
Instead of comparing strings directly (too slow), use polynomial hashing. For each query, generate hashes of all strings that differ by exactly one character and check if any exists in our hash set. With only 3 possible characters, each position has 2 alternative values to try.