Stick Lengths

Problem Overview

Attribute Value
Source CSES Problem Set - Stick Lengths
Difficulty Easy
Category Sorting and Searching
Time Limit 1 second
Key Technique Median / Sorting

Learning Goals

After solving this problem, you will be able to:

  • Understand why the median minimizes the sum of absolute deviations
  • Apply sorting to find the optimal target value in optimization problems
  • Recognize problems where median is the optimal choice over mean
  • Handle potential integer overflow when summing large values

Problem Statement

Problem: Given n sticks with various lengths, modify all sticks to have the same length. You can increase or decrease each stick’s length by 1 unit at a cost of 1 unit. Find the minimum total cost.

Input:

  • Line 1: Integer n (number of sticks)
  • Line 2: n integers representing stick lengths

Output:

  • The minimum total cost to make all sticks equal length

Constraints:

  • 1 <= n <= 2 x 10^5
  • 1 <= a[i] <= 10^9

Example

Input:
5
2 3 1 5 2

Output:
5

Explanation:

  • Target length = 2 (the median)
  • Stick 1: 2 - 2 = 0
  • Stick 2: 3 - 2 = 1
  • Stick 3: 1 - 2 = 1
  • Stick 4: 5 - 2 = 3
  • Stick 5: 2 - 2 = 0
  • Total cost: 0 + 1 + 1 + 3 + 0 = 5

Intuition: How to Think About This Problem

Pattern Recognition

Key Question: Given a set of numbers, what single value minimizes the sum of absolute differences to all numbers?

The answer is the median. This is a fundamental property in statistics and optimization. The median is the “balance point” that minimizes total movement.

Why Median, Not Mean?

Consider this simple example: [1, 2, 100]

Target Cost Calculation Total
Mean (34.3) |1-34| + |2-34| + |100-34| = 33 + 32 + 66 131
Median (2) |1-2| + |2-2| + |100-2| = 1 + 0 + 98 99

The median wins because it is robust to outliers. The mean gets “pulled” toward extreme values.

Mathematical Proof (Intuitive)

Imagine moving the target point t along a number line:

  • Moving t one unit right: Cost increases by (count of values <= t) and decreases by (count of values > t)
  • At the median: roughly half the values are on each side, so moving in either direction increases total cost
Values: 1  2  2  3  5
        |  |  |  |  |
        ←-2-→ ←--2--→   (2 values left of median, 2 values right)
              ↑
           median (2)

Analogies

Think of this problem like finding the optimal meeting point for a group of friends on a street. Each friend wants to minimize their travel distance. The median location ensures the total travel for everyone is minimized.


Solution 1: Brute Force

Idea

Try every possible target length (from minimum to maximum stick length) and calculate the total cost for each. Return the minimum.

Algorithm

  1. Find the range of stick lengths [min, max]
  2. For each possible target in this range:
    • Calculate sum of stick[i] - target for all sticks
  3. Return the minimum cost found

Code

def solve_brute_force(sticks):
  """
  Brute force: try all possible target lengths.

  Time: O(n * (max - min))
  Space: O(1)
  """
  min_len = min(sticks)
  max_len = max(sticks)
  min_cost = float('inf')

  for target in range(min_len, max_len + 1):
    cost = sum(abs(s - target) for s in sticks)
    min_cost = min(min_cost, cost)

  return min_cost

Complexity

Metric Value Explanation
Time O(n * R) R = max - min, can be up to 10^9
Space O(1) Only tracking minimum cost

Why This Works (But Is Slow)

This guarantees finding the optimal solution by exhaustive search, but with stick lengths up to 10^9, iterating through all possible targets is far too slow.


Solution 2: Optimal Solution (Median)

Key Insight

The Trick: The median minimizes the sum of absolute deviations. Sort the array and pick the middle element as the target.

Why This Works

For any set of numbers, the median has this special property:

  • Moving the target away from the median always increases total cost
  • At the median, the number of values below roughly equals the number above
  • Any movement benefits one side but hurts the other side more

Algorithm

  1. Sort the stick lengths
  2. Find the median (middle element for odd n, either middle element for even n)
  3. Calculate total cost: sum of stick[i] - median

Dry Run Example

Let’s trace through with input sticks = [2, 3, 1, 5, 2]:

Step 1: Sort the sticks
  Original: [2, 3, 1, 5, 2]
  Sorted:   [1, 2, 2, 3, 5]
            indices: 0  1  2  3  4

Step 2: Find median
  n = 5 (odd)
  median_index = n // 2 = 2
  median = sorted[2] = 2

Step 3: Calculate cost
  |1 - 2| = 1
  |2 - 2| = 0
  |2 - 2| = 0
  |3 - 2| = 1
  |5 - 2| = 3
  ─────────────
  Total = 5

Visual Diagram

Sorted sticks: [1, 2, 2, 3, 5]

Number line:
    1     2     3     4     5
    |     |           |     |
    *    **           *     *
          ↑
       median = 2

Cost visualization:
    1 ←─1─→ 2 (median)
          2 ←─0─→ 2
          2 ←─0─→ 2
          2 ←──1──→ 3
          2 ←────3────→ 5
                        ─────
                        Total: 5

Code

Python Solution:

def solve(sticks):
  """
  Optimal solution using median.

  Time: O(n log n) - dominated by sorting
  Space: O(n) - for sorted array (or O(1) if sorting in-place)
  """
  sticks.sort()
  n = len(sticks)
  median = sticks[n // 2]

  return sum(abs(s - median) for s in sticks)


def main():
  n = int(input())
  sticks = list(map(int, input().split()))
  print(solve(sticks))


if __name__ == "__main__":
  main()

Complexity

Metric Value Explanation
Time O(n log n) Sorting dominates; sum calculation is O(n)
Space O(n) For sorted array; O(1) if sorting in-place

Note on Even-Length Arrays

For arrays with even length, both middle elements are valid medians. Any value between them (inclusive) gives the same minimum cost. We simply pick sticks[n // 2].


Common Mistakes

Mistake 1: Using Mean Instead of Median

# WRONG
target = sum(sticks) // len(sticks)  # This is the mean!
cost = sum(abs(s - target) for s in sticks)

Problem: The mean minimizes sum of SQUARED differences, not absolute differences.

Fix: Use the median (middle element of sorted array).

Mistake 2: Integer Overflow

Problem: With n = 2 x 10^5 sticks and values up to 10^9, the sum can reach ~2 x 10^14.

Fix: Use long long for the cost variable.

Mistake 3: Forgetting to Sort

# WRONG
def solve(sticks):
  n = len(sticks)
  median = sticks[n // 2]  # This is NOT the median!
  return sum(abs(s - median) for s in sticks)

Problem: The middle element of an unsorted array is not the median.

Fix: Sort the array first.

Mistake 4: Wrong Median Index for Even Length

# Not wrong, but worth understanding
sticks = [1, 2, 3, 4]  # n = 4
# Both sticks[1] = 2 and sticks[2] = 3 are valid medians
# Any target in [2, 3] gives the same minimum cost

Note: For this problem, using sticks[n // 2] always works. You don’t need to average two middle values.


Edge Cases

Case Input Expected Output Why
Single stick n=1, [5] 0 Already equal to itself
Two sticks n=2, [1, 100] 99 Either median works
All same n=4, [7,7,7,7] 0 No modification needed
Large values n=2, [1, 10^9] 999999999 Test overflow handling
Already sorted n=3, [1,2,3] 2 Median is 2: |1-2|+|2-2|+|3-2|=2

When to Use This Pattern

Use Median When:

  • Minimizing sum of absolute differences/deviations
  • Finding optimal meeting point in 1D
  • Robust estimation needed (resistant to outliers)
  • Problem asks for minimum total “movement” or “cost”

Use Mean When:

  • Minimizing sum of squared differences
  • Variance/standard deviation calculations

Pattern Recognition Checklist:

  • Need to find a target value that minimizes total distance? -> Consider median
  • Problem involves making elements equal with unit cost? -> Median is optimal
  • Seeing “minimum operations” or “minimum cost” language? -> Think median vs mean

Easier (Do These First)

Problem Why It Helps
Distinct Numbers Basic sorting practice

Similar Difficulty

Problem Key Difference
Minimum Moves to Equal Array Elements II Same problem, different platform
Apartments Sorting + matching
Ferris Wheel Sorting + greedy pairing

Harder (Do These After)

Problem New Concept
Best Position in 2D Extend to 2D grid (x and y medians independently)
Array Division Binary search on answer
Allocate Mailboxes Multiple medians with DP

Key Takeaways

  1. The Core Idea: The median minimizes the sum of absolute differences - this is a mathematical fact worth memorizing.

  2. Time Optimization: We improved from O(n * R) brute force to O(n log n) by using the median property directly.

  3. Space Trade-off: O(n) for sorting is acceptable; can be O(1) with in-place sort.

  4. Pattern: This is a classic “minimization with absolute differences” problem - always think median!

  5. Watch for Overflow: Sum of differences can be large; use appropriate data types.


Practice Checklist

Before moving on, make sure you can:

  • Explain why median minimizes sum of absolute deviations
  • Implement the solution in under 5 minutes
  • Handle the integer overflow edge case
  • Recognize this pattern in disguised problems

Additional Resources