Caching

Intro

Caching stores a copy of data closer to where it is consumed — in process memory, in a shared out-of-process store like Redis, or both — so that repeated reads skip the slower origin. The mechanism is simple: check the cache first; on a miss, fetch from the source, store the result, and return it. On a hit, return the stored copy without touching the source at all.

Most systems layer two cache tiers. An in-process cache (L1) sits inside the application and returns data in nanoseconds with no network hop. A distributed cache (L2) like Redis or SQL Server sits outside the process, survives restarts, and is shared across instances — but every read costs a network round-trip and deserialization. When both tiers are present, L1 is checked first; an L1 miss falls through to L2; an L2 miss falls through to the origin.
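The L1-then-L2 read path can be sketched manually with `IMemoryCache` and `IDistributedCache` (a sketch, assuming string values; `loadFromOrigin` and the TTL choices are illustrative):

```csharp
using Microsoft.Extensions.Caching.Distributed;
using Microsoft.Extensions.Caching.Memory;

public static class TwoTierReads
{
    public static async Task<string> GetAsync(
        string key,
        IMemoryCache l1,
        IDistributedCache l2,
        Func<CancellationToken, Task<string>> loadFromOrigin,
        CancellationToken ct)
    {
        // L1 hit: no network hop, no deserialization.
        if (l1.TryGetValue(key, out string? local) && local is not null)
            return local;

        // L1 miss: try the shared L2 over the network.
        var remote = await l2.GetStringAsync(key, ct);
        if (remote is not null)
        {
            l1.Set(key, remote, TimeSpan.FromSeconds(30)); // short L1 TTL
            return remote;
        }

        // L2 miss: hit the origin and populate both tiers.
        var value = await loadFromOrigin(ct);
        await l2.SetStringAsync(
            key,
            value,
            new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5) },
            ct);
        l1.Set(key, value, TimeSpan.FromSeconds(30));
        return value;
    }
}
```

The L1 TTL is deliberately shorter than the L2 TTL: a per-instance copy cannot be invalidated remotely, so keeping it short bounds how long instances can disagree.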

The hard part is never the read path — it is deciding when cached data is no longer valid. Every caching bug in production traces back to invalidation: serving stale prices, leaking one tenant's data to another, or slamming the database when a hot key expires across all instances simultaneously. The rest of this note covers how to choose patterns, invalidation strategies, and operational guardrails that keep the cache correct.

flowchart TD
  A[Request] --> B{Cache hit}
  B -->|Yes| C[Return cached]
  B -->|No| D[Fetch from source]
  D --> E[Store in cache]
  E --> F[Return]

Cache Patterns

Cache-aside — the application reads and writes the cache explicitly. On a miss, the app fetches from the source, writes to the cache, and returns. The app owns both paths.

Read-through / write-through — a cache layer sits between the app and the source. On a read miss, the cache itself fetches from the source. On a write, the cache writes through to the source synchronously. The app talks only to the cache.
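A write-through layer can be sketched as a thin wrapper that the app calls instead of the source (the `ISource` interface and method names here are illustrative, not a real library API):

```csharp
using Microsoft.Extensions.Caching.Distributed;

public interface ISource
{
    Task<string?> LoadAsync(string key, CancellationToken ct);
    Task SaveAsync(string key, string value, CancellationToken ct);
}

// The app talks only to this class; it never sees the source directly.
public class WriteThroughCache(IDistributedCache cache, ISource source)
{
    public async Task<string?> ReadAsync(string key, CancellationToken ct)
    {
        // Read-through: on a miss, the cache layer itself loads the value.
        var cached = await cache.GetStringAsync(key, ct);
        if (cached is not null)
            return cached;

        var value = await source.LoadAsync(key, ct);
        if (value is not null)
            await cache.SetStringAsync(key, value, ct);
        return value;
    }

    public async Task WriteAsync(string key, string value, CancellationToken ct)
    {
        // Write-through: source first, synchronously, then the cache copy —
        // so a read never returns a value the source does not have.
        await source.SaveAsync(key, value, ct);
        await cache.SetStringAsync(key, value, ct);
    }
}
```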

Write-behind (write-back) — like write-through, but the cache writes to the source asynchronously. This reduces write latency but risks data loss if the cache fails before flushing.
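Write-behind can be sketched with a channel and a background flush loop (a sketch; the unbounded queue and `saveToSource` delegate are illustrative simplifications):

```csharp
using System.Threading.Channels;
using Microsoft.Extensions.Caching.Distributed;

public class WriteBehindCache(
    IDistributedCache cache,
    Func<string, string, CancellationToken, Task> saveToSource)
{
    private readonly Channel<(string Key, string Value)> _pending =
        Channel.CreateUnbounded<(string, string)>();

    public async Task WriteAsync(string key, string value, CancellationToken ct)
    {
        // Cache first: the caller gets a fast acknowledgement.
        await cache.SetStringAsync(key, value, ct);
        // Source write is deferred. If the process dies before the
        // flush, this write is lost — the core write-behind risk.
        await _pending.Writer.WriteAsync((key, value), ct);
    }

    // Run once in the background, e.g. from a BackgroundService.
    public async Task FlushLoopAsync(CancellationToken ct)
    {
        await foreach (var (key, value) in _pending.Reader.ReadAllAsync(ct))
            await saveToSource(key, value, ct);
    }
}
```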

Cache-aside with IDistributedCache:

public static async Task<string> GetUserName(
    string userId,
    IDistributedCache cache,
    Func<string, Task<string>> loadFromDb,
    CancellationToken ct)
{
    var key = $"user-name:{userId}";
    var cached = await cache.GetStringAsync(key, ct);
    if (cached is not null)
        return cached;

    var value = await loadFromDb(userId);
    await cache.SetStringAsync(
        key,
        value,
        new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5) },
        ct);
    return value;
}

The same operation with HybridCache (.NET 9+) — stampede protection and L1/L2 layering are built in:

public class UserService(HybridCache cache)
{
    public async Task<string> GetUserNameAsync(string userId, CancellationToken ct)
    {
        return await cache.GetOrCreateAsync(
            $"user-name:{userId}",
            async cancel => await LoadFromDbAsync(userId, cancel),
            cancellationToken: ct);
    }
}

Invalidation Strategies

Invalidation strategy is a correctness decision, not an optimization detail. Start by writing down your staleness contract, then pick the simplest strategy that meets it.

Decision rule of thumb:

flowchart TD
  A[Need cached reads] --> B{Max staleness is small}
  B -->|Yes| C{Can emit change events}
  B -->|No| D[TTL only]
  C -->|Yes| E[Event-driven plus TTL]
  C -->|No| F[Versioned keys plus TTL]
  D --> G{Hot keys exist}
  G -->|Yes| H[Add jitter and coalescing]
  G -->|No| I[Simple cache-aside]
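The "versioned keys plus TTL" branch can be sketched like this: a per-entity version number is part of the cache key, so a write bumps the version and orphans every old entry, and the TTL garbage-collects them. The class and helper names are illustrative:

```csharp
using Microsoft.Extensions.Caching.Distributed;

public class ProfileCache(IDistributedCache cache)
{
    public async Task<string> GetProfileAsync(string userId, CancellationToken ct)
    {
        var version = await cache.GetStringAsync($"user-ver:{userId}", ct) ?? "0";
        var key = $"user-profile:{userId}:v{version}";

        var cached = await cache.GetStringAsync(key, ct);
        if (cached is not null)
            return cached;

        var value = await LoadProfileFromDbAsync(userId, ct);
        await cache.SetStringAsync(
            key,
            value,
            // The TTL reclaims entries orphaned by version bumps.
            new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(10) },
            ct);
        return value;
    }

    public async Task OnProfileWrittenAsync(string userId, CancellationToken ct)
    {
        // Read-modify-write is racy; a real implementation would use an
        // atomic increment (e.g. Redis INCR) for the version counter.
        var version = long.Parse(await cache.GetStringAsync($"user-ver:{userId}", ct) ?? "0");
        await cache.SetStringAsync($"user-ver:{userId}", (version + 1).ToString(), ct);
    }

    private static Task<string> LoadProfileFromDbAsync(string userId, CancellationToken ct)
        => Task.FromResult($"profile-of-{userId}"); // placeholder
}
```

Nothing is ever deleted, which makes invalidation race-free; the cost is one extra cache read per lookup to fetch the current version.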

Correctness and Staleness

Treat cached data as a replica with its own consistency model.

Stale-while-revalidate sketch — dual-TTL with background refresh:

// Envelope wraps the value with a freshness timestamp.
var json = await cache.GetStringAsync(key, ct);
var envelope = json is null ? null : JsonSerializer.Deserialize<Envelope<T>>(json);

if (envelope is not null)
{
    if (DateTimeOffset.UtcNow <= envelope.FreshUntilUtc)
        return envelope.Value; // Fresh — serve immediately.

    // Soft-expired: serve stale, trigger background refresh.
    _ = Task.Run(() => RefreshAsync(key));
    return envelope.Value;
}

// Hard miss: block and refill.
var value = await LoadFromSourceAsync(ct);
await WriteCacheAsync(key, value, softTtl, hardTtl, ct);
return value;
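The Envelope<T> and WriteCacheAsync pieces the sketch assumes could look like this: the envelope carries the soft deadline, while the cache entry's absolute expiration acts as the hard one (names match the sketch; the helper itself is illustrative):

```csharp
using System.Text.Json;
using Microsoft.Extensions.Caching.Distributed;

public sealed record Envelope<T>(T Value, DateTimeOffset FreshUntilUtc);

public static class SwrCache
{
    public static async Task WriteCacheAsync<T>(
        IDistributedCache cache,
        string key,
        T value,
        TimeSpan softTtl, // freshness window; after this, serve stale + refresh
        TimeSpan hardTtl, // absolute expiry; after this, a hard miss
        CancellationToken ct)
    {
        var envelope = new Envelope<T>(value, DateTimeOffset.UtcNow + softTtl);
        await cache.SetStringAsync(
            key,
            JsonSerializer.Serialize(envelope),
            new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = hardTtl },
            ct);
    }
}
```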

Notes:

- The soft TTL (FreshUntilUtc) bounds how long a value is served before a refresh is attempted; the hard TTL bounds worst-case staleness.
- Only a hard miss blocks the caller; soft-expired reads stay fast because the refresh runs off the request path.
- The fire-and-forget refresh should be coalesced per key, otherwise a popular stale key triggers many concurrent refreshes.

Cache Stampede

Cache stampede (thundering herd, dogpile) happens when many requests miss at once and all recompute the same expensive value. The result is a burst that can overwhelm the database or downstream service — often right when the cache is least helpful.

Mitigations:

- Request coalescing (singleflight): concurrent misses for the same key share a single load.
- TTL jitter: randomize expirations so hot keys do not expire everywhere at the same instant.
- Stale-while-revalidate: serve the soft-expired value while one background refresh runs.
- Distributed locking: a short-lived per-key lock in the shared cache so only one instance recomputes.

sequenceDiagram
  participant C as Clients
  participant A as Api
  participant R as Cache
  participant D as Db

  Note over C,A: Stampede
  loop Many requests
    C->>A: Read item
    A->>R: Get key
    R-->>A: Miss
    A->>D: Load item
  end

  Note over C,A: Coalescing
  C->>A: Read item
  A->>R: Get key
  R-->>A: Miss
  A->>A: Singleflight join
  A->>D: Load item once
  D-->>A: Item
  A->>R: Set key
  A-->>C: Serve all callers
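The "singleflight join" step in the diagram can be sketched in-process with a per-key Lazy<Task<T>> (a common pattern, not a library API; real code would also layer this under the cache lookup):

```csharp
using System.Collections.Concurrent;

public class Singleflight<T>
{
    private readonly ConcurrentDictionary<string, Lazy<Task<T>>> _inFlight = new();

    public async Task<T> RunAsync(string key, Func<Task<T>> load)
    {
        // Lazy<T> guarantees load() starts at most once per key,
        // even when several callers race on GetOrAdd.
        var lazy = _inFlight.GetOrAdd(key, _ => new Lazy<Task<T>>(load));
        try
        {
            return await lazy.Value;
        }
        finally
        {
            // Remove the entry so the next miss (or a failed load)
            // triggers a fresh attempt rather than a cached fault.
            _inFlight.TryRemove(key, out _);
        }
    }
}
```

This coalesces within one process; across instances you still need a distributed lock or HybridCache's built-in protection, but in-process coalescing alone already cuts a stampede by a factor of the per-instance concurrency.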

Tradeoffs

| Dimension | IMemoryCache (L1) | IDistributedCache (L2) | HybridCache (.NET 9+) |
|---|---|---|---|
| Latency | Nanoseconds, no network | Milliseconds, network round-trip + deserialization | Nanoseconds for L1 hit, milliseconds for L2 miss |
| Capacity | Bounded by app process memory | Bounded by cache cluster (Redis, SQL) | L1 bounded by process, L2 by cluster |
| Sharing | Per-instance, no sharing across pods | Shared across all instances | Shared L2, per-instance L1 |
| Stampede protection | Manual (singleflight pattern) | Manual (distributed lock) | Built-in |
| Survivability | Wiped on restart or deploy | Survives app restarts | L1 wiped, L2 survives |
| Tag-based invalidation | Not supported | Not supported | Built-in |
| Best for | Single-instance apps, hot-path data | Multi-instance apps, shared state | Default choice for new .NET 9+ apps |

Decision rule: start with HybridCache for new .NET 9+ projects — it handles L1/L2 layering, stampede protection, and serialization out of the box. Fall back to IDistributedCache when you need explicit control over cache writes, or IMemoryCache for single-instance scenarios where distributed state is unnecessary.
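Registration is a sketch away, assuming the Microsoft.Extensions.Caching.Hybrid and Microsoft.Extensions.Caching.StackExchangeRedis packages; the "redis" connection-string name is illustrative:

```csharp
using Microsoft.Extensions.Caching.Hybrid;

var builder = WebApplication.CreateBuilder(args);

// Optional L2: if an IDistributedCache is registered,
// HybridCache picks it up automatically behind its in-process L1.
builder.Services.AddStackExchangeRedisCache(o =>
    o.Configuration = builder.Configuration.GetConnectionString("redis"));

builder.Services.AddHybridCache(o =>
{
    o.DefaultEntryOptions = new HybridCacheEntryOptions
    {
        Expiration = TimeSpan.FromMinutes(5),          // L2 lifetime
        LocalCacheExpiration = TimeSpan.FromMinutes(1) // shorter L1 lifetime
    };
});

var app = builder.Build();
app.Run();
```

Keeping LocalCacheExpiration shorter than Expiration bounds how long instances can serve diverging L1 copies, mirroring the per-tier TTL advice above.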

Pitfalls

- Stale reads: serving an outdated value (a price, a permission) after the source has changed. Write down the staleness contract before tuning TTLs.
- Missing tenant scoping: a cache key that omits the tenant id serves one tenant's data to another.
- Synchronized expiry: a hot key expiring on every instance at once slams the database. Add jitter to TTLs.
- Caching failures: storing a null or an error response turns one transient fault into minutes of bad reads.
