Distinct Values Queries II

Problem Overview

Attribute	Value
CSES Link	Distinct Values Queries II
Difficulty	Hard
Category	Range Queries
Time Limit	1.00 s
Key Technique	Segment Tree / Sqrt Decomposition

Learning Goals

After solving this problem, you will be able to:

Apply segment trees for range distinctness queries with updates
Use coordinate compression to handle large value ranges
Implement efficient data structures for “all distinct in range” queries
Optimize between segment tree and sqrt decomposition based on constraints

Problem Statement

Problem: Given an array of n integers, process q queries of two types:

Update: Change the value at position k to u
Query: Check if every value in range [a, b] is distinct (no duplicates)

Input:

Line 1: Two integers n and q (array size and number of queries)
Line 2: n integers x1, x2, …, xn (array values)
Next q lines: Either “1 k u” (update) or “2 a b” (query)

Output:

For each type-2 query: Print “YES” if all values in range [a, b] are distinct, “NO” otherwise

Constraints:

1 <= n, q <= 2 x 10^5
1 <= xi, u <= 10^9
1 <= k <= n
1 <= a <= b <= n

Example

Input:
5 4
3 2 7 2 8
2 3 5
2 2 5
1 2 9
2 2 5

Output:
YES
NO
YES

Explanation:

Initial array: [3, 2, 7, 2, 8]
Query 1: Range [3,5] = [7, 2, 8] - all distinct -> YES
Query 2: Range [2,5] = [2, 7, 2, 8] - value 2 appears twice -> NO
Update: Change position 2 to 9, array becomes [3, 9, 7, 2, 8]
Query 3: Range [2,5] = [9, 7, 2, 8] - all distinct -> YES

Intuition: How to Think About This Problem

Pattern Recognition

Key Question: How can we efficiently check if a range contains duplicates, while supporting point updates?

The key insight is that instead of checking if all values are distinct (which requires O(range) time naively), we can track for each position i the previous occurrence of the same value. If prev[i] >= a (start of query range), then there is a duplicate.

Breaking Down the Problem

What are we looking for? Whether any value appears more than once in range [a, b]
What information do we have? Array values, with point updates
What’s the relationship between input and output? A range has all distinct values if and only if for every position i in [a, b], the previous occurrence of arr[i] is before position a.

Core Insight

For each position i, define prev[i] = the largest index j < i where arr[j] = arr[i], or 0 if no such j exists.

A range [a, b] has all distinct values if and only if: max(prev[a], prev[a+1], …, prev[b]) < a

This transforms the problem into: range maximum query with point updates.

Solution 1: Brute Force

Idea

For each query, iterate through the range and use a set to check for duplicates.

Algorithm

For update query: Simply update arr[k] = u
For distinctness query: Use a set to track seen values in range [a, b]

Code

def solve_brute_force():
  """
  Brute force solution - check each range with a set.

  Time: O(q * n) per query
  Space: O(n)
  """
  import sys
  input = sys.stdin.readline

  n, q = map(int, input().split())
  arr = [0] + list(map(int, input().split()))  # 1-indexed

  results = []
  for _ in range(q):
    query = list(map(int, input().split()))

    if query[0] == 1:  # Update
      k, u = query[1], query[2]
      arr[k] = u
    else:  # Query
      a, b = query[1], query[2]
      seen = set()
      is_distinct = True
      for i in range(a, b + 1):
        if arr[i] in seen:
          is_distinct = False
          break
        seen.add(arr[i])
      results.append("YES" if is_distinct else "NO")

  print('\n'.join(results))

Complexity

Metric	Value	Explanation
Time	O(q * n)	Each query may scan entire range
Space	O(n)	Set for checking duplicates

Why This Works (But Is Slow)

Correctness is guaranteed since we explicitly check each value in the range. However, with n, q up to 2 x 10^5, worst case is 4 x 10^10 operations - far too slow.

Solution 2: Optimal Solution - Segment Tree with prev[] Array

Key Insight

The Trick: Transform “all distinct” into “range max of prev[] < a” and use segment tree for efficient range max queries with point updates.

Data Structure Design

Component	Purpose
`arr[i]`	Current value at position i
`prev[i]`	Previous occurrence index of arr[i], or 0
`pos[v]`	Current positions where value v appears (sorted set)
Segment Tree	Range maximum query on prev[] array

In plain English: We maintain for each position where the same value last appeared. A range has duplicates iff any position in that range has its previous occurrence within the range.

Algorithm

Preprocessing:
- Build pos[v] = sorted set of positions containing value v
- Compute prev[i] = predecessor of i in pos[arr[i]]
- Build segment tree on prev[] for range max
Update (position k to value u):
- Remove k from pos[arr[k]], update prev[] for k’s successor
- Add k to pos[u], update prev[k] and prev[] for k’s successor
- Update segment tree at affected positions
Query (range [a, b]):
- Return max(prev[a..b]) < a ? “YES” : “NO”

Dry Run Example

Let’s trace through with input n=5, arr=[3,2,7,2,8]:

Initial Setup:
  arr:  [_, 3, 2, 7, 2, 8]  (1-indexed)

  pos[3] = {1}
  pos[2] = {2, 4}
  pos[7] = {3}
  pos[8] = {5}

  prev[1] = 0  (no previous 3)
  prev[2] = 0  (no previous 2 before index 2)
  prev[3] = 0  (no previous 7)
  prev[4] = 2  (previous 2 is at index 2)
  prev[5] = 0  (no previous 8)

  prev array: [_, 0, 0, 0, 2, 0]

Query 1: Range [3, 5]
  max(prev[3], prev[4], prev[5]) = max(0, 2, 0) = 2
  Is 2 < 3? YES -> all distinct

Query 2: Range [2, 5]
  max(prev[2], prev[3], prev[4], prev[5]) = max(0, 0, 2, 0) = 2
  Is 2 < 2? NO -> has duplicates (value 2 at positions 2 and 4)

Update: arr[2] = 9
  Remove 2 from pos[2]: pos[2] = {4}
  Update prev[4] = 0 (no previous 2 now)
  Add 2 to pos[9]: pos[9] = {2}
  prev[2] = 0 (no previous 9)

  New prev array: [_, 0, 0, 0, 0, 0]

Query 3: Range [2, 5]
  max(prev[2], prev[3], prev[4], prev[5]) = max(0, 0, 0, 0) = 0
  Is 0 < 2? YES -> all distinct

Visual Diagram

Array:    [3]  [2]  [7]  [2]  [8]
Index:     1    2    3    4    5

prev[]:    0    0    0    2    0
                         ^
                         |
              duplicate of index 2

Query [2,5]: max(prev[2..5]) = 2 >= 2 -> NO (has duplicates)
Query [3,5]: max(prev[3..5]) = 2 >= 3? No, 2 < 3 -> YES (all distinct)

Code (Python)

import sys
from collections import defaultdict
from sortedcontainers import SortedList

def solve():
  """
  Optimal solution using segment tree for range max queries.

  Time: O((n + q) * log n)
  Space: O(n)
  """
  input = sys.stdin.readline

  n, q = map(int, input().split())
  arr = [0] + list(map(int, input().split()))  # 1-indexed

  # Segment tree for range maximum
  tree = [0] * (4 * n)

  def build(node, start, end):
    if start == end:
      tree[node] = prev[start]
      return
    mid = (start + end) // 2
    build(2 * node, start, mid)
    build(2 * node + 1, mid + 1, end)
    tree[node] = max(tree[2 * node], tree[2 * node + 1])

  def update(node, start, end, idx, val):
    if start == end:
      tree[node] = val
      return
    mid = (start + end) // 2
    if idx <= mid:
      update(2 * node, start, mid, idx, val)
    else:
      update(2 * node + 1, mid + 1, end, idx, val)
    tree[node] = max(tree[2 * node], tree[2 * node + 1])

  def query(node, start, end, l, r):
    if r < start or end < l:
      return 0
    if l <= start and end <= r:
      return tree[node]
    mid = (start + end) // 2
    return max(query(2 * node, start, mid, l, r),
        query(2 * node + 1, mid + 1, end, l, r))

  # pos[v] = sorted list of positions where value v appears
  pos = defaultdict(SortedList)
  prev = [0] * (n + 2)  # prev[i] = previous occurrence of arr[i]

  # Initialize
  for i in range(1, n + 1):
    v = arr[i]
    # Find predecessor in pos[v]
    idx = pos[v].bisect_left(i)
    if idx > 0:
      prev[i] = pos[v][idx - 1]
    pos[v].add(i)

  build(1, 1, n)

  results = []
  for _ in range(q):
    line = list(map(int, input().split()))

    if line[0] == 1:  # Update arr[k] = u
      k, u = line[1], line[2]
      old_val = arr[k]

      if old_val == u:
        continue  # No change

      # Remove k from pos[old_val]
      pos[old_val].remove(k)
      idx = pos[old_val].bisect_left(k)
      # Update successor's prev if exists
      if idx < len(pos[old_val]):
        succ = pos[old_val][idx]
        if idx > 0:
          prev[succ] = pos[old_val][idx - 1]
        else:
          prev[succ] = 0
        update(1, 1, n, succ, prev[succ])

      # Add k to pos[u]
      arr[k] = u
      idx = pos[u].bisect_left(k)
      # Update prev[k]
      if idx > 0:
        prev[k] = pos[u][idx - 1]
      else:
        prev[k] = 0
      update(1, 1, n, k, prev[k])

      pos[u].add(k)
      idx = pos[u].bisect_left(k)
      # Update successor's prev if exists
      if idx + 1 < len(pos[u]):
        succ = pos[u][idx + 1]
        prev[succ] = k
        update(1, 1, n, succ, prev[succ])

    else:  # Query: check if range [a, b] has all distinct values
      a, b = line[1], line[2]
      max_prev = query(1, 1, n, a, b)
      results.append("YES" if max_prev < a else "NO")

  print('\n'.join(results))

if __name__ == "__main__":
  solve()

Complexity

Metric	Value	Explanation
Time	O((n + q) log n)	Each query/update is O(log n)
Space	O(n)	Segment tree and position sets

Common Mistakes

Mistake 1: Not Handling Update Cascades

# WRONG - Only updating the changed position
def update_wrong(k, u):
  arr[k] = u
  prev[k] = find_prev(k)
  update_tree(k, prev[k])  # Missing: update successor's prev!

Problem: When you change arr[k], both k’s prev AND the successor’s prev need updating. Fix: After removing k from old value’s set and adding to new value’s set, update the successor in both sets.

Mistake 2: Wrong Distinctness Condition

# WRONG - Using <= instead of <
if max_prev <= a:  # Should be max_prev < a
  print("YES")

Problem: If max_prev == a, it means some element at position >= a has its duplicate exactly at position a, so there IS a duplicate in range. Fix: The condition must be strictly less than: max_prev < a.

Mistake 3: Off-by-One in 1-Indexing

# WRONG - Using 0-indexed array with 1-indexed queries
arr = list(map(int, input().split()))  # 0-indexed
# Query asks for range [1, n], but arr[0] is first element

Problem: CSES uses 1-indexed positions in queries. Fix: Pad array with dummy element at index 0: arr = [0] + list(...).

Edge Cases

Case	Input	Expected Output	Why
Single element	`arr=[5], query [1,1]`	YES	Single element is always distinct
All same values	`arr=[3,3,3], query [1,3]`	NO	All duplicates
Update to same value	`update 2 to arr[2]`	(no change)	Optimization: skip if value unchanged
Adjacent duplicates	`arr=[1,1], query [1,2]`	NO	prev[2] = 1 >= 1
Large values	`arr=[10^9, 10^9]`	NO	Handle values up to 10^9

When to Use This Pattern

Use This Approach When:

You need to check range distinctness with point updates
You can transform “all distinct” into “range max of prev[] < start”
Updates are frequent and queries span varying ranges

Don’t Use When:

No updates (offline Mo’s algorithm may be simpler)
Only need count of distinct values (different problem)
Memory is extremely limited (sqrt decomposition uses less)

Pattern Recognition Checklist:

Need to check for duplicates in range? -> Consider prev[] array
Range queries with point updates? -> Consider segment tree
Transform “all X” into “range min/max satisfies condition”? -> This pattern applies

Easier (Do These First)

Problem	Why It Helps
Static Range Minimum Queries	Basic segment tree range queries
Dynamic Range Minimum Queries	Segment tree with point updates

Similar Difficulty

Problem	Key Difference
Distinct Values Queries	Count distinct (no updates), use Mo’s algorithm
Range Update Queries	Segment tree with lazy propagation

Harder (Do These After)

Problem	New Concept
Polynomial Queries	Complex segment tree operations
Range Queries and Copies	Persistent segment tree

Key Takeaways

The Core Idea: Transform “all distinct in range” into “range max of prev[] < start”
Time Optimization: From O(n) per query to O(log n) using segment tree
Space Trade-off: O(n) extra space for prev[] array and position tracking
Pattern: Range property queries can often be reduced to range min/max queries

Practice Checklist

Before moving on, make sure you can:

Explain why max(prev[a..b]) < a implies all values are distinct
Implement segment tree for range max with point updates
Handle the cascade updates when modifying a position
Solve this problem without looking at the solution

Distinct Values Queries II

Problem Overview

Learning Goals

Problem Statement

Example

Intuition: How to Think About This Problem

Pattern Recognition

Breaking Down the Problem

Core Insight

Solution 1: Brute Force

Idea

Algorithm

Code

Complexity

Why This Works (But Is Slow)

Solution 2: Optimal Solution - Segment Tree with prev[] Array

Key Insight

Data Structure Design

Algorithm

Dry Run Example

Visual Diagram

Code (Python)

Complexity

Common Mistakes

Mistake 1: Not Handling Update Cascades

Mistake 2: Wrong Distinctness Condition

Mistake 3: Off-by-One in 1-Indexing

Edge Cases

When to Use This Pattern

Use This Approach When:

Don’t Use When:

Pattern Recognition Checklist:

Related Problems

Easier (Do These First)

Similar Difficulty

Harder (Do These After)

Key Takeaways

Practice Checklist

Additional Resources