📘 Correctness Under Concurrency

3 Apr 2022 · 3 min read

When your system is fast, scalable — and still wrong

Most production bugs are not caused by slowness.
They are caused by multiple things happening at the same time.

Concurrency bugs exist:

  • At low traffic

  • On one machine

  • Even with perfect hardware

They are logic bugs, not capacity bugs.

1️⃣ What “Correctness” Actually Means

A system is correct if it behaves as expected under all valid interleavings of operations.

Not:

  • “Works on my machine”

  • “Passes unit tests”

  • “Works at low traffic”

But:

Works when operations overlap in unpredictable ways.

2️⃣ Why Concurrency Breaks Correctness

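Because operations do not run one after another. Each request is a sequence of steps, and steps from different requests can interleave, even on one machine.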

You think this happens:

A → B → C

Reality:

A ↘
     B
A ↗     C

Operations interleave.

3️⃣ The Simplest Concurrency Bug: Lost Update

❌ Naive Code

async function likePost(postId) {
  const post = await db.get(postId);
  post.likes += 1;
  await db.save(post);
}

What you expect

2 likes → likes + 2

What actually happens

2 likes → likes + 1 ❌

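Both requests read the same likes value before either one saves, so the second save overwrites the first.

Concurrency overwrote correctness.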
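To see the lost update without a real database, here is a minimal sketch. The in-memory `createDb`, the `tick` helper, and the `"post-1"` row are hypothetical stand-ins for the `db` in the naive code above; only `likePost` mirrors the original, with `db` passed in as an argument.

```javascript
// Hypothetical in-memory stand-in for the article's `db`. Every call waits
// one timer tick, imitating the round trip to a real database.
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

function createDb() {
  const rows = new Map([["post-1", { id: "post-1", likes: 0 }]]);
  return {
    async get(id) {
      await tick();
      return { ...rows.get(id) }; // a copy, like a row fetched over the wire
    },
    async save(row) {
      await tick();
      rows.set(row.id, { ...row });
    },
  };
}

// The naive handler from above, with `db` passed in explicitly.
async function likePost(db, postId) {
  const post = await db.get(postId);
  post.likes += 1;
  await db.save(post);
}

async function demoLostUpdate() {
  const db = createDb();
  // Two likes arrive together: both read likes = 0 before either one saves.
  await Promise.all([likePost(db, "post-1"), likePost(db, "post-1")]);
  return (await db.get("post-1")).likes;
}

// logs: likes after 2 requests: 1
demoLostUpdate().then((likes) => console.log("likes after 2 requests:", likes));
```

Note that this runs in a single process with only two requests, which is all the bug needs.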

4️⃣ Why This Happens (Key Insight)

The bug is not “missing a lock”.
The bug is read-modify-write without protection.

This pattern is unsafe under concurrency unless the read, the modification, and the write happen as one atomic step.

5️⃣ Naive Fix #1: Application Locks ❌

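await mutex.lock();
try {
  const post = await db.get(postId);
  post.likes += 1;
  await db.save(post);
} finally {
  // Release even if get/save throws; otherwise every later caller waits forever.
  await mutex.unlock();
}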

Why this fails

  • Only works in one process

  • Fails with multiple instances

  • Kills throughput: every like waits behind the same lock
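The first two points can be sketched in a few lines. Everything here is a hypothetical stand-in: `createMutex` is a tiny promise-chain lock with a `run(task)` method rather than the `lock()`/`unlock()` pair above, and `createInstance` plays the role of one app process talking to a shared database.

```javascript
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

// Shared "database": a single row that every app instance talks to.
function createDb() {
  let row = { likes: 0 };
  return {
    async get() { await tick(); return { ...row }; },
    async save(next) { await tick(); row = { ...next }; },
  };
}

// Minimal promise-chain mutex (hypothetical helper). It lives in one
// process's memory, so other processes cannot see it.
function createMutex() {
  let tail = Promise.resolve();
  return {
    run(task) {
      const result = tail.then(task);
      tail = result.catch(() => {}); // keep the chain usable after a failure
      return result;
    },
  };
}

// One app instance = one private mutex in front of the shared database.
function createInstance(db) {
  const mutex = createMutex();
  return {
    likePost: () =>
      mutex.run(async () => {
        const post = await db.get();
        post.likes += 1;
        await db.save(post);
      }),
  };
}

async function demoMutex() {
  // One instance: its mutex serializes both likes.
  const db1 = createDb();
  const solo = createInstance(db1);
  await Promise.all([solo.likePost(), solo.likePost()]);

  // Two instances: each holds its own lock, so the writes still collide.
  const db2 = createDb();
  const a = createInstance(db2);
  const b = createInstance(db2);
  await Promise.all([a.likePost(), b.likePost()]);

  return {
    oneInstance: (await db1.get()).likes,
    twoInstances: (await db2.get()).likes,
  };
}

// logs: { oneInstance: 2, twoInstances: 1 }
demoMutex().then((result) => console.log(result));
```

The lock is correct inside one process and invisible to every other one, which is exactly the situation after you scale out.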

6️⃣ Correct Fix #1: Atomic Operations (Best)

UPDATE posts
SET likes = likes + 1
WHERE id = ?

Why this works

  • Atomic at DB level

  • No read-modify-write

  • Stays correct with any number of app instances

Push correctness down to the lowest layer possible.
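As a rough illustration of what "atomic at DB level" buys you, the sketch below moves the increment inside the store. The in-memory `createDb` and its `incrementLikes` method are hypothetical; `incrementLikes` only plays the role of the `UPDATE ... SET likes = likes + 1` statement and is not a real driver API.

```javascript
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

function createDb() {
  const rows = new Map([["post-1", { likes: 0 }]]);
  return {
    // Stand-in for: UPDATE posts SET likes = likes + 1 WHERE id = ?
    // The read and the write happen inside the store in one uninterrupted step.
    async incrementLikes(id) {
      await tick(); // the network round trip
      const row = rows.get(id);
      if (!row) return 0; // rows affected
      row.likes += 1; // synchronous: no other request can run in between
      return 1;
    },
    async get(id) {
      await tick();
      return { ...rows.get(id) };
    },
  };
}

// The handler no longer reads the value at all; it only states its intent.
async function likePost(db, postId) {
  return db.incrementLikes(postId);
}

async function demoAtomic() {
  const db = createDb();
  await Promise.all([likePost(db, "post-1"), likePost(db, "post-1")]);
  return (await db.get("post-1")).likes;
}

// logs: likes after 2 requests: 2
demoAtomic().then((likes) => console.log("likes after 2 requests:", likes));
```

The application never holds a stale copy of `likes`, so there is nothing for a concurrent request to overwrite.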

7️⃣ Correct Fix #2: Optimistic Concurrency Control (OCC)

Version-based update

const post = await db.get(postId);

await db.update(
  { id: postId, version: post.version },
  { likes: post.likes + 1, version: post.version + 1 }
);

If the version no longer matches, the update changes nothing → re-read and retry.
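The retry step is easy to leave out, so here is a minimal sketch of the full loop. It assumes `db.update(match, changes)` resolves to the number of rows it changed (0 or 1); that contract, the in-memory `createDb`, and the `maxAttempts` limit are assumptions for illustration, not something the snippet above specifies.

```javascript
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

function createDb() {
  const rows = new Map([["post-1", { id: "post-1", likes: 0, version: 1 }]]);
  return {
    async get(id) {
      await tick();
      return { ...rows.get(id) };
    },
    // Compare-and-set: apply `changes` only if id and version still match.
    // Resolves to the number of rows updated (assumed contract).
    async update(match, changes) {
      await tick();
      const row = rows.get(match.id);
      if (!row || row.version !== match.version) return 0;
      Object.assign(row, changes);
      return 1;
    },
  };
}

async function likePost(db, postId, maxAttempts = 5) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const post = await db.get(postId);
    const updated = await db.update(
      { id: postId, version: post.version },
      { likes: post.likes + 1, version: post.version + 1 }
    );
    if (updated === 1) return attempt; // how many attempts it took
    // Someone else bumped the version first: loop to re-read and try again.
  }
  throw new Error(`likePost: gave up after ${maxAttempts} attempts`);
}

async function demoOcc() {
  const db = createDb();
  const attempts = await Promise.all([
    likePost(db, "post-1"),
    likePost(db, "post-1"),
  ]);
  const post = await db.get("post-1");
  return { likes: post.likes, version: post.version, attempts };
}

demoOcc().then((result) => console.log(result));
```

Both likes land here, and one of the two calls needs a second attempt. A bounded retry count is a design choice: under heavy contention on one row, OCC wastes work, and the atomic update is the better fit.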

8️⃣ Correctness Bug #2: Check-Then-Act

❌ Broken Logic

if (balance >= amount) {
  balance -= amount;
}

Why it fails

  • The balance can change between the check and the write

  • The write then acts on a condition that is no longer true

9️⃣ Correct Fix: Combine Check + Act

UPDATE accounts
SET balance = balance - 100
WHERE id = ? AND balance >= 100

Rows affected:

  • 1 → success

  • 0 → insufficient funds
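Here is a small sketch that puts the broken check-then-act next to the combined version. The in-memory `createAccounts` store, the `"acct-1"` account with a balance of 100, and the `debitIfSufficient` method are all hypothetical; `debitIfSufficient` only stands in for the conditional `UPDATE` above and reports rows affected the same way.

```javascript
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

function createAccounts() {
  const balances = new Map([["acct-1", 100]]);
  return {
    async getBalance(id) { await tick(); return balances.get(id); },
    async setBalance(id, value) { await tick(); balances.set(id, value); },
    // Stand-in for:
    //   UPDATE accounts SET balance = balance - ? WHERE id = ? AND balance >= ?
    // Check and write happen in one step; resolves to rows affected (0 or 1).
    async debitIfSufficient(id, amount) {
      await tick();
      const balance = balances.get(id);
      if (balance === undefined || balance < amount) return 0;
      balances.set(id, balance - amount);
      return 1;
    },
  };
}

// Broken: the check and the write are separate round trips.
async function withdrawNaive(db, id, amount) {
  const balance = await db.getBalance(id);
  if (balance < amount) return false;
  await db.setBalance(id, balance - amount);
  return true;
}

// Fixed: one conditional write, then inspect rows affected.
async function withdrawGuarded(db, id, amount) {
  return (await db.debitIfSufficient(id, amount)) === 1;
}

async function demoWithdraw() {
  const naiveDb = createAccounts();
  const naive = await Promise.all([
    withdrawNaive(naiveDb, "acct-1", 80),
    withdrawNaive(naiveDb, "acct-1", 80),
  ]);
  const guardedDb = createAccounts();
  const guarded = await Promise.all([
    withdrawGuarded(guardedDb, "acct-1", 80),
    withdrawGuarded(guardedDb, "acct-1", 80),
  ]);
  return { naive, guarded };
}

// logs: { naive: [ true, true ], guarded: [ true, false ] }
demoWithdraw().then((result) => console.log(result));
```

The naive version approves both withdrawals of 80 from a balance of 100. The guarded version approves one and reports insufficient funds for the other.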

🔟 Correctness Bug #3: Idempotency

❌ Duplicate Request Bug

createOrder(order);
createOrder(order); // retry

Result:

  • Two orders ❌

1️⃣1️⃣ Correct Fix: Idempotency Keys

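// Caution: seen() followed by store() is itself check-then-act.
// In real code, reserve the key atomically (for example, a unique constraint).
if (seen(idempotencyKey)) return cachedResult;

const result = createOrder(order);
store(idempotencyKey, result);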

Correctness is about recognizing repeated intent.
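A minimal single-process sketch of that idea follows. `createOrderService`, `createOrderOnce`, and the key names are hypothetical. The one deliberate design choice is storing the in-flight promise under the key before any `await`, so the check and the store cannot be separated by a retry that arrives mid-flight.

```javascript
const tick = () => new Promise((resolve) => setTimeout(resolve, 0));

function createOrderService() {
  const orders = [];
  const resultsByKey = new Map(); // idempotencyKey -> Promise of the result

  // The non-idempotent operation: every call creates a new order.
  async function createOrder(order) {
    await tick();
    const created = { orderId: orders.length + 1, ...order };
    orders.push(created);
    return created;
  }

  // The promise is stored synchronously, before any await, so "check" and
  // "store" are one step and a concurrent retry joins the first call.
  function createOrderOnce(idempotencyKey, order) {
    if (!resultsByKey.has(idempotencyKey)) {
      resultsByKey.set(idempotencyKey, createOrder(order));
    }
    return resultsByKey.get(idempotencyKey);
  }

  return { createOrderOnce, orderCount: () => orders.length };
}

async function demoIdempotency() {
  const service = createOrderService();
  const order = { sku: "book", qty: 1 };

  // Original request and its retry share a key and arrive together.
  const [first, retry] = await Promise.all([
    service.createOrderOnce("key-123", order),
    service.createOrderOnce("key-123", order),
  ]);
  // A genuinely new intent uses a new key.
  await service.createOrderOnce("key-456", order);

  return { sameResult: first === retry, orderCount: service.orderCount() };
}

// logs: { sameResult: true, orderCount: 2 }
demoIdempotency().then((result) => console.log(result));
```

This only covers one process. With several instances the key has to live in shared storage that rejects duplicates, and a real version would also decide what to do with a key whose first attempt failed.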

1️⃣2️⃣ Why Correctness Is Harder Than Performance

Performance     | Correctness
----------------|-------------------------
Gradual failure | Binary failure
Can degrade     | Cannot be “almost right”
Measurable      | Often invisible
Testable        | Needs reasoning

1️⃣3️⃣ Why Load Doesn’t Matter Here

Concurrency bugs:

  • Can happen with just 2 requests

  • Reproduce rarely

  • Often pass load tests

  • Fail in prod

This is why senior engineers fear them.

🎯 Golden Rules for Correctness Under Concurrency

  1. Avoid unprotected read-modify-write

  2. Use atomic primitives

  3. Prefer database-enforced correctness

  4. Make operations idempotent

  5. Assume retries happen

  6. Design for reordering