HashSet

Intro

HashSet<T> stores unique values with O(1) average membership checks. Use it when uniqueness and lookup speed matter more than ordering; enumeration order is not guaranteed. A concrete use case is deduplicating 500K event IDs in a message processing pipeline: HashSet<string>.Contains checks each ID in sub-microsecond time, while List<string>.Contains scans up to 500K entries per check, which can turn a 200 ms job into a 40-minute one.
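A minimal sketch of the deduplication case, assuming event IDs arrive as strings. The `incoming` array and its values are hypothetical sample data. It relies on Add returning false for a value that is already present, so no separate Contains call is needed.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample input; a real pipeline would read IDs from a queue or log.
string[] incoming = { "evt-1", "evt-2", "evt-1", "evt-3", "evt-2" };

var seen = new HashSet<string>();
var unique = new List<string>();

foreach (var id in incoming)
{
    // Add returns false when the ID is already in the set, so the
    // membership check and the insert are one O(1) average step.
    if (seen.Add(id))
    {
        unique.Add(id);
    }
}

Console.WriteLine(string.Join(", ", unique)); // evt-1, evt-2, evt-3
```

The separate `unique` list is there only to preserve arrival order, which the set itself does not promise.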

Deeper Explanation

HashSet<T> is hash-based: GetHashCode determines the bucket; Equals resolves collisions within a bucket. Core operations (Add, Contains, Remove) are O(1) average.
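To illustrate how GetHashCode and Equals cooperate, here is a sketch with a hypothetical `GridPoint` class (not part of the original notes). Two instances with the same coordinates are treated as one element only because both members are overridden consistently.

```csharp
using System;
using System.Collections.Generic;

var points = new HashSet<GridPoint>();
bool first = points.Add(new GridPoint(1, 2));   // true: new element
bool second = points.Add(new GridPoint(1, 2));  // false: equal by value

Console.WriteLine($"{first} {second} {points.Count}"); // True False 1

// Hypothetical type used only for this illustration.
sealed class GridPoint : IEquatable<GridPoint>
{
    public int X { get; }
    public int Y { get; }

    public GridPoint(int x, int y) { X = x; Y = y; }

    // Equals decides whether two instances that land in the same bucket are the same element.
    public bool Equals(GridPoint? other) => other is not null && X == other.X && Y == other.Y;

    public override bool Equals(object? obj) => Equals(obj as GridPoint);

    // GetHashCode selects the bucket; equal objects must return equal hash codes.
    public override int GetHashCode() => HashCode.Combine(X, Y);
}
```

Without the overrides, a class falls back to reference equality, and the second Add would succeed.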

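HashSet<T> also exposes the standard set algebra, each operation running in time linear in the sizes of the sets involved: UnionWith, IntersectWith, ExceptWith, and SymmetricExceptWith.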

The set-algebra methods mutate the set in place. IsSubsetOf, IsSupersetOf, and SetEquals test structural relationships without modifying the set. A common production pattern: accumulate processed IDs in a HashSet<int>, copy the full batch into a second HashSet<int>, and call ExceptWith(processed) on that copy. What remains is the unprocessed items, found in O(n) instead of the O(n²) of nested list scans.
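The pattern can be sketched as follows. The `batch` and `processed` values are hypothetical. Note that ExceptWith is called on a copy of the batch, since it removes elements from the set it is invoked on.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical data: IDs in the current batch and IDs already handled.
int[] batch = { 1, 2, 3, 4, 5 };
var processed = new HashSet<int> { 2, 4 };

// Copy the batch into its own set so the in-place ExceptWith leaves the source untouched.
var unprocessed = new HashSet<int>(batch);
unprocessed.ExceptWith(processed); // keeps 1, 3, 5

// Relationship tests leave both operands unchanged.
bool allFromBatch = processed.IsSubsetOf(batch);              // true
bool sameElements = unprocessed.SetEquals(new[] { 5, 3, 1 }); // true, order is irrelevant

Console.WriteLine(unprocessed.Count); // 3
```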

Structure

graph TD
    H1[value dotnet hash] --> B0[bucket zero]
    H2[value csharp hash] --> B1[bucket one]
    B0 --> V1[dotnet]
    B1 --> V2[csharp]

Example

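using System;
using System.Collections.Generic;

// Case-insensitive comparer: "dotnet" and "DOTNET" count as the same element.
var tags = new HashSet<string>(StringComparer.OrdinalIgnoreCase)
{
    "dotnet",
    "csharp"
};

var added = tags.Add("DOTNET"); // false, already exists by comparer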

Pitfalls

Tradeoffs

Questions

Hash-Based Collections Comparison

Type                      Stores            Thread-safe   When to use
HashSet<T>                Values only       No            Unique membership checks, set operations
Dictionary<TKey,TValue>   Key-value pairs   No            Key-based lookup
SortedSet<T>              Values only       No            Sorted uniqueness, O(log n) ops

Decision rule: use HashSet<T> when you only need to track membership or perform set operations (union, intersect, except). Use Dictionary when you need to associate a value with each key.
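A small sketch of the decision rule, using hypothetical sample words: the HashSet answers only whether a word was seen, while the Dictionary carries a count per word.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample input.
string[] words = { "a", "b", "a", "c", "a" };

// Membership only: a HashSet answers "have I seen this word?"
var distinct = new HashSet<string>(words);

// A value per key: a Dictionary answers "how many times did I see it?"
var counts = new Dictionary<string, int>();
foreach (var w in words)
{
    counts.TryGetValue(w, out var n); // n is 0 when the key is absent
    counts[w] = n + 1;
}

int aCount = counts["a"];
Console.WriteLine($"{distinct.Count} distinct, a seen {aCount} times");
// 3 distinct, a seen 3 times
```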

