HashMap
Intro
A hash map stores key-value pairs and uses hashing to find a bucket quickly. The goal is fast insert, lookup, and delete by key — O(1) average, O(n) worst case when all keys collide into one bucket. In .NET, the primary hash map implementation is Dictionary<TKey, TValue>, while Hashtable is the older non-generic version. A concrete example: a user session cache storing 50K active sessions by session ID achieves sub-microsecond lookups with Dictionary<string, Session>, where the same lookup in a List<Session> would scan 25K entries on average.
Deeper Explanation
Hash maps use two rules together: hash distribution and equality checks.
- The key hash chooses an index or bucket.
- If multiple keys land in the same bucket, equality checks resolve collisions.
- Good hash distribution keeps buckets short and operations fast.
Structure
graph TD
K1[key user one] --> H1[hash to bucket one]
K2[key user two] --> H2[hash to bucket three]
H1 --> E1[user one value ann]
H2 --> E2[user two value bob]Example
var usersById = new Dictionary<int, string>
{
[1001] = "Ann",
[1002] = "Bob"
};
if (usersById.TryGetValue(1002, out var name))
{
Console.WriteLine(name);
}
Pitfalls
- Mutable key fields break lookups — if you insert a key, then mutate a field that participates in
GetHashCode, the entry becomes orphaned in the wrong bucket. Lookups returnfalseeven though the entry exists. Use immutable key types (string,int, records withinitproperties) or never mutate key fields after insertion. - Poor
GetHashCodecreates O(n) degradation — aGetHashCodethat returns the same value for all instances (e.g.,return 0;) puts every entry in one bucket, turning the hash map into a linked list. In .NET, the defaultGetHashCodefor value types uses reflection-based field hashing which is slow; always override it for custom struct keys. - Hash map when sorted iteration is needed —
Dictionarydoes not guarantee enumeration order. If you insert keys 3, 1, 2 and iterate, you might get 3, 1, 2 or any permutation. UseSortedDictionaryfor ordered keys, or sort after retrieval.
Tradeoffs
- Hash map vs sorted map: hash map favors fast point lookups, sorted map favors ordered iteration.
- Hash map vs list scan: hash map wins for repeated lookups by key, list scan can be simpler for very small fixed data sets.
Questions
Excessive collisions put many keys in the same bucket, so operations must compare more entries.
GetHashCode implementation create correctness and performance risk?Hash maps rely on hash and equality contracts. If equal keys do not produce equal hashes, lookups can fail. If hashes are poorly distributed, bucket chains grow and performance drops.
Dictionary<TKey, TValue> is the default hash map in modern .NET.
Hash-Based Collections Comparison
| Type | Key type | Thread-safe | When to use |
|---|---|---|---|
Dictionary<TKey,TValue> |
Generic | No | Default hash map in modern .NET |
Hashtable |
object |
No | Legacy interop only |
HashSet<T> |
N/A (values only) | No | Unique value membership |
ConcurrentDictionary<TKey,TValue> |
Generic | Yes | Concurrent read/write |
Decision rule: use Dictionary<TKey,TValue> by default. Hashtable only for legacy API compatibility.
Links
- Dictionary TKey TValue class — API reference; the primary hash map in modern .NET.
- Selecting a collection class — Microsoft decision guide for choosing between hash-based and sorted collections.
- Anatomy of the .NET dictionary — practitioner deep-dive into bucket layout, collision handling, and resize behavior.
Parent
02 Computer Science
Pages