C# Core Optimization - Part 4: The "Performance Trap" of Boxing and Unboxing

This time, we'll dissect a classic performance trap in C# that ties directly into our earlier lessons on the Stack, Heap, struct, and class: boxing and unboxing.


The Scenario 📝

  • System: A piece of code needs to process a large list of integers (int). Due to old habits or working with legacy code, a developer uses ArrayList instead of List<int>.
  • The Problem: This code runs surprisingly slowly and puts a lot of pressure on the Garbage Collector (GC).

What Are Boxing and Unboxing? 🧐

Boxing and unboxing are the conversions between a value type (like int, double, or any struct) and a reference type (object, or an interface that the value type implements).

  1. Boxing:

    • This is the process of converting a value type (stored inline wherever it is declared, which for a local variable typically means the Stack) into a reference type (object).
    • Analogy: You have a small diamond (int). To send it through a shipping service that only accepts standard packages (object), you first have to put the diamond inside a box.
    • The Cost:
      • Heap Allocation: This "box" is a new object that gets allocated on the Heap.
      • Data Copying: The value of the diamond is copied into the box.
      • GC Pressure: This box later becomes garbage that the GC has to clean up.
  2. Unboxing:

    • This is the reverse process: converting the object (the box on the Heap) back into a value type.
    • The Cost:
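      • Type Checking: The runtime has to check that the thing inside the box is actually a diamond. If the boxed type does not match the type in the cast, an InvalidCastException is thrown.
      • Data Copying: The value is copied out of the box on the Heap into a value-type variable (on the Stack, if that variable is a local).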
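
To make the two conversions concrete, here is a minimal sketch. The class and method names (BoxingDemo, BoxedCopyIsIndependent, UnboxToWrongTypeThrows) are made up for illustration. It shows that the box holds its own copy of the value, and that unboxing has to name the exact type that was boxed.

```csharp
using System;

public static class BoxingDemo
{
    // Boxes a value, changes the original variable, and returns what the box holds.
    public static int BoxedCopyIsIndependent()
    {
        int diamond = 42;
        object box = diamond;   // boxing: a new object is allocated and 42 is copied into it
        diamond = 7;            // only the original variable changes, not the box
        return (int)box;        // unboxing: type check, then the value is copied out
    }

    // A boxed int can only be unboxed as int, even though int converts to long implicitly.
    public static bool UnboxToWrongTypeThrows()
    {
        object box = 42;        // boxed int
        try
        {
            _ = (long)box;      // the type check fails here
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(BoxedCopyIsIndependent());   // 42
        Console.WriteLine(UnboxToWrongTypeThrows());   // True
    }
}
```

The first method returns 42 rather than 7 because the box was filled with a copy at the moment of boxing. That copy is the "Data Copying" cost listed above.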

The Problematic Code (Using ArrayList)

ArrayList is an old, non-generic collection that stores every element as object, so any value type added to it has to be boxed first.

csharp
using System.Collections;

// ArrayList stores items of type `object`.
var list = new ArrayList();

for (int i = 0; i < 1_000_000; i++)
{
    // BOXING happens here!
    // Each `int` (a value type) must be "boxed" into an `object`
    // to be added to the ArrayList.
    // -> This creates 1 million objects on the HEAP.
    list.Add(i);
}

long sum = 0;
foreach (object item in list)
{
    // UNBOXING happens here!
    // Each `object` must be "unboxed" to get the `int` value out.
    // -> This requires 1 million type checks and data copies.
    sum += (int)item;
}

The Solution: Always Use Generic Collections ✅

  • The Logic: Generic collections (like List<T>), introduced in .NET 2.0, address exactly this problem. They are strongly typed: a List<int> keeps its elements in an int[] internally, so values go in and come out without ever being converted to object.

  • The Optimized Code:

    csharp
    using System.Collections.Generic;

    // List<int> knows it is storing `int` values.
    var list = new List<int>();
    
    for (int i = 0; i < 1_000_000; i++)
    {
        // NO BOXING. The `int` is stored directly.
        list.Add(i);
    }
    
    long sum = 0;
    foreach (int item in list)
    {
        // NO UNBOXING. The `item` is already an `int`.
        sum += item;
    }

Analyzing the Results ✨

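  1. No Per-Element Heap Allocations: The generic version does not create 1 million "boxes" on the heap. The only allocations left are the list's internal int[] buffer, which is replaced by a larger one as the list grows.
  2. Far Less GC Pressure: Because so little garbage is created, the GC has much less work to do, which reduces the CPU time spent on collection.
  3. Faster Execution: The loops no longer pay the per-element cost of allocating, copying, and type-checking. How much faster the code runs depends on the runtime and hardware, so measure it on your own workload.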
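
One way to check the allocation claim is to compare the bytes allocated by each version. This is a rough sketch, not a proper benchmark. It assumes a runtime that provides GC.GetAllocatedBytesForCurrentThread (for example .NET Core 3.0 or later), and the helper names (AllocationProbe, MeasureAllocatedBytes, SumWithArrayList, SumWithGenericList) are made up for illustration.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class AllocationProbe
{
    // Returns the number of bytes allocated on the current thread while `action` runs.
    public static long MeasureAllocatedBytes(Action action)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        action();
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Non-generic version: every Add boxes an int, every read unboxes it.
    public static long SumWithArrayList(int count)
    {
        var list = new ArrayList();
        for (int i = 0; i < count; i++) list.Add(i);
        long sum = 0;
        foreach (object item in list) sum += (int)item;
        return sum;
    }

    // Generic version: ints are stored directly in the list's internal array.
    public static long SumWithGenericList(int count)
    {
        var list = new List<int>();
        for (int i = 0; i < count; i++) list.Add(i);
        long sum = 0;
        foreach (int item in list) sum += item;
        return sum;
    }

    public static void Main()
    {
        const int Count = 1_000_000;
        long boxedBytes = MeasureAllocatedBytes(() => SumWithArrayList(Count));
        long genericBytes = MeasureAllocatedBytes(() => SumWithGenericList(Count));
        Console.WriteLine($"ArrayList: {boxedBytes:N0} bytes allocated");
        Console.WriteLine($"List<int>: {genericBytes:N0} bytes allocated");
    }
}
```

The exact figures vary by runtime and platform, but the ArrayList figure should be several times larger. It pays for a separate heap object per element on top of an array of references, while List<int> only pays for its int[] buffers. For timing comparisons, a dedicated benchmarking tool is a better fit than a hand-rolled stopwatch.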

Conclusion:

  • The Golden Rule: "Avoid boxing and unboxing in performance-sensitive code."
  • The easiest way to do this is to always prefer generic collections (List<T>, Dictionary<TKey, TValue>) over their old, non-generic counterparts (ArrayList, Hashtable).
  • Be wary of any API that requires you to pass a value type into a parameter of type object, as this is where boxing will silently occur.
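
The last point is easy to miss because nothing in the calling code looks like a conversion. The sketch below uses two made-up helpers: SameReference stands in for any API that takes object parameters, and AreEqual is a generic alternative. Neither comes from a real library.

```csharp
using System;

public static class SilentBoxingDemo
{
    // Hypothetical API with `object` parameters: each value-type argument
    // is boxed at the call site, without any visible cast.
    public static bool SameReference(object a, object b) => ReferenceEquals(a, b);

    // Generic alternative: T is bound to the argument type, and the constraint
    // lets the call use IEquatable<T>.Equals(T), so no box is created.
    public static bool AreEqual<T>(T a, T b) where T : IEquatable<T> => a.Equals(b);

    public static void Main()
    {
        int count = 5;

        // Prints False: the same int was boxed twice, producing two distinct objects.
        Console.WriteLine(SameReference(count, count));

        // Prints True: the values are compared directly as ints.
        Console.WriteLine(AreEqual(count, count));
    }
}
```

The False result shows that boxing happened: passing the same variable twice produced two separate heap objects. When you control the API, a generic parameter is usually the simplest way to keep value types out of object.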