Longest Substring Without Repeating Characters
Problem Overview
| Attribute | Value |
|---|---|
| Difficulty | Medium |
| Category | Sliding Window / Two Pointers |
| Time Limit | 1 second |
| Key Technique | Sliding Window with Hash Set |
| LeetCode | Problem #3 |
Learning Goals
After solving this problem, you will be able to:
- Recognize when to apply the sliding window technique
- Use a hash set to track distinct elements in a window
- Implement variable-size sliding window with character constraints
- Shrink windows efficiently when constraints are violated
Problem Statement
Problem: Given a string s, find the length of the longest substring without repeating characters.
Input:
- A string
scontaining characters
Output:
- A single integer: the length of the longest substring with all unique characters
Constraints:
-
0 <= s <= 10^5 - s consists of English letters, digits, symbols, and spaces
Example
Input:
abcabcbb
Output:
3
Explanation: The longest substring without repeating characters is “abc”, which has length 3. Other valid substrings of length 3 include “bca” and “cab”.
Intuition: How to Think About This Problem
Pattern Recognition
Key Question: How can we efficiently find the longest contiguous segment with all unique elements?
When you see “longest/shortest substring/subarray” with some constraint, think sliding window. The constraint here is “no repeating characters,” which we can track with a hash set.
Breaking Down the Problem
- What are we looking for? Maximum length of a contiguous substring
- What constraint must be satisfied? All characters in the substring must be unique
- What changes as we scan? The window expands when we see a new character, shrinks when we see a duplicate
Analogy
Think of this like a caterpillar crawling along a string. The caterpillar’s body (window) can stretch to include new unique characters, but when it encounters a character already in its body, it must pull up its tail until the duplicate is removed.
Solution 1: Brute Force
Idea
Check every possible substring and verify if it contains all unique characters.
Algorithm
- For each starting position
ifrom 0 to n-1 - For each ending position
jfromito n-1 - Check if substring
s[i..j]has all unique characters - Track the maximum length found
Code (Python)
def longest_substring_brute(s: str) -> int:
"""
Brute force: check all substrings.
Time: O(n^3), Space: O(n)
"""
n = len(s)
max_len = 0
for i in range(n):
for j in range(i, n):
substring = s[i:j+1]
if len(set(substring)) == len(substring): # All unique
max_len = max(max_len, j - i + 1)
return max_len
Complexity
| Metric | Value | Explanation |
|---|---|---|
| Time | O(n^3) | O(n^2) substrings, O(n) to check each |
| Space | O(n) | Set to store substring characters |
Why This Works (But Is Slow)
Correctness is guaranteed because we check every possible substring. However, we do redundant work: when expanding from s[i..j] to s[i..j+1], we rebuild the entire set instead of incrementally updating it.
Solution 2: Optimal - Sliding Window with Hash Set
Key Insight
The Trick: Use two pointers to maintain a window of unique characters. When we find a duplicate, shrink from the left until the duplicate is removed.
Algorithm
- Initialize
left = 0,max_len = 0, and an empty hash setseen - For each
rightfrom 0 to n-1:- While
s[right]is already inseen:- Remove
s[left]fromseen - Increment
left
- Remove
- Add
s[right]toseen - Update
max_len = max(max_len, right - left + 1)
- While
- Return
max_len
Dry Run Example
Let’s trace through with input s = "abcabcbb":
Initial: left=0, seen={}, max_len=0
right=0, char='a':
'a' not in seen
seen = {'a'}
window = "a", max_len = 1
right=1, char='b':
'b' not in seen
seen = {'a','b'}
window = "ab", max_len = 2
right=2, char='c':
'c' not in seen
seen = {'a','b','c'}
window = "abc", max_len = 3
right=3, char='a':
'a' IN seen! Shrink from left
remove 'a', left=1, seen = {'b','c'}
'a' not in seen now
seen = {'b','c','a'}
window = "bca", max_len = 3
right=4, char='b':
'b' IN seen! Shrink from left
remove 'b', left=2, seen = {'c','a'}
'b' not in seen now
seen = {'c','a','b'}
window = "cab", max_len = 3
right=5, char='c':
'c' IN seen! Shrink from left
remove 'c', left=3, seen = {'a','b'}
'c' not in seen now
seen = {'a','b','c'}
window = "abc", max_len = 3
right=6, char='b':
'b' IN seen! Shrink from left
remove 'a', left=4, seen = {'b','c'}
'b' still in seen!
remove 'b', left=5, seen = {'c'}
'b' not in seen now
seen = {'c','b'}
window = "cb", max_len = 3
right=7, char='b':
'b' IN seen! Shrink from left
remove 'c', left=6, seen = {'b'}
'b' still in seen!
remove 'b', left=7, seen = {}
'b' not in seen now
seen = {'b'}
window = "b", max_len = 3
Final: max_len = 3
Visual Diagram
String: a b c a b c b b
Index: 0 1 2 3 4 5 6 7
Step 3: [a b c] max=3
L R
Step 4: [b c a] max=3 (shrink left, add 'a')
L R
Step 5: [c a b] max=3
L R
Step 6: [a b c] max=3
L R
Final answer: 3 (substrings "abc", "bca", "cab" all work)
Code (Python)
def longest_substring_optimal(s: str) -> int:
"""
Sliding window with hash set.
Time: O(n), Space: O(min(n, alphabet_size))
"""
seen = set()
left = 0
max_len = 0
for right in range(len(s)):
# Shrink window while duplicate exists
while s[right] in seen:
seen.remove(s[left])
left += 1
# Add current character and update max
seen.add(s[right])
max_len = max(max_len, right - left + 1)
return max_len
Complexity
| Metric | Value | Explanation |
|---|---|---|
| Time | O(n) | Each character is added and removed from set at most once |
| Space | O(min(n, m)) | Set stores at most min(n, alphabet_size) characters |
Alternative: Optimized Jump with Hash Map
Instead of shrinking one character at a time, we can jump directly to the position after the last occurrence of the duplicate character.
Code (Python)
def longest_substring_jump(s: str) -> int:
"""
Optimized: jump left pointer directly using last index map.
Time: O(n), Space: O(min(n, alphabet_size))
"""
last_index = {} # char -> last seen index
left = 0
max_len = 0
for right, char in enumerate(s):
if char in last_index and last_index[char] >= left:
left = last_index[char] + 1 # Jump past the duplicate
last_index[char] = right
max_len = max(max_len, right - left + 1)
return max_len
Common Mistakes
Mistake 1: Forgetting to Remove Character Before Incrementing Left
# WRONG
while s[right] in seen:
left += 1 # Moved left but didn't remove s[left-1]!
seen.remove(s[left]) # Wrong index
# CORRECT
while s[right] in seen:
seen.remove(s[left]) # Remove BEFORE incrementing
left += 1
Problem: The set becomes inconsistent with the actual window.
Fix: Always remove the character at left before incrementing left.
Mistake 2: Not Checking if Duplicate is Within Current Window
# WRONG (for the jump optimization)
if char in last_index:
left = last_index[char] + 1 # May jump backwards!
# CORRECT
if char in last_index and last_index[char] >= left:
left = last_index[char] + 1
Problem: The duplicate might be from before the current window started.
Fix: Only update left if the duplicate is within the current window.
Mistake 3: Using Dictionary Instead of Set (Overcomplicated)
# OVERCOMPLICATED
char_count = {}
char_count[s[right]] = char_count.get(s[right], 0) + 1
while char_count[s[right]] > 1:
char_count[s[left]] -= 1
if char_count[s[left]] == 0:
del char_count[s[left]]
left += 1
# SIMPLER (use a set when you only need presence/absence)
seen = set()
while s[right] in seen:
seen.remove(s[left])
left += 1
seen.add(s[right])
Problem: Tracking counts when you only need to track presence. Fix: Use a set for simpler “contains duplicate” checks.
Edge Cases
| Case | Input | Expected Output | Why |
|---|---|---|---|
| Empty string | "" |
0 | No characters |
| Single character | "a" |
1 | One unique character |
| All same characters | "aaaa" |
1 | Only one unique at a time |
| All unique | "abcd" |
4 | Entire string is the answer |
| Duplicate at end | "abca" |
3 | “abc” or “bca” |
| Space character | " " |
1 | Space counts as a character |
| Mixed characters | "a1!a" |
3 | “a1!” or “1!a” |
When to Use This Pattern
Use Sliding Window + Hash Set When:
- Finding longest/shortest substring with distinct elements
- Counting subarrays/substrings with unique constraint
- Problems involving “at most K distinct” elements
Don’t Use When:
- Looking for subsequences (not contiguous)
- Order of elements matters beyond position
- Need to track exact counts, not just presence
Pattern Recognition Checklist:
- Contiguous subarray/substring? --> Consider sliding window
- Constraint on distinct/unique elements? --> Use hash set
- “At most K distinct”? --> Track count of distinct, shrink when > K
- “Longest without repeating”? --> This exact pattern
Related Problems
Easier (Do These First)
| Problem | Why It Helps |
|---|---|
| Maximum Subarray | Intro to sliding/expanding window concept |
| Contains Duplicate | Basic hash set usage |
Similar Difficulty
| Problem | Key Difference |
|---|---|
| Longest Substring with At Most Two Distinct | Allow 2 distinct instead of 0 duplicates |
| Longest Substring with At Most K Distinct | Generalized to K distinct characters |
| Fruit Into Baskets | Same as “at most 2 distinct” |
Harder (Do These After)
| Problem | New Concept |
|---|---|
| Minimum Window Substring | Shrinking to find minimum |
| Longest Repeating Character Replacement | Window with modification budget |
| Substring with Concatenation of All Words | Fixed-size word matching |
CSES Related Problems
| Problem | Concept |
|---|---|
| Playlist | Same pattern - longest with unique elements |
| Subarray Distinct Values | Count subarrays with at most K distinct |
Key Takeaways
-
The Core Idea: Maintain a window of unique characters using a hash set, expanding right and shrinking left when duplicates appear.
-
Time Optimization: From O(n^3) brute force to O(n) by avoiding redundant set rebuilding.
-
Space Trade-off: O(min(n, alphabet_size)) space for O(n) time.
-
Pattern: This is the classic “sliding window with hash set” pattern for distinct element constraints.
Practice Checklist
Before moving on, make sure you can:
- Solve this problem in under 10 minutes without looking at the solution
- Explain why each character is processed at most twice (O(n) time)
- Implement both the set-based and map-based approaches
- Adapt this pattern for “at most K distinct” problems
- Identify this pattern when you see “longest substring with unique/distinct”
Complete Solution Template
# Python - Copy-paste ready
def lengthOfLongestSubstring(s: str) -> int:
seen = set()
left = 0
max_len = 0
for right in range(len(s)):
while s[right] in seen:
seen.remove(s[left])
left += 1
seen.add(s[right])
max_len = max(max_len, right - left + 1)
return max_len