DDIA Notes - Transactions - Weak Isolation Level

Notes for Designing Data-Intensive Applications - Chapter 7 - Transactions - Topic Weak Isolation Level 

Weak Isolation Levels

When two transactions T1 and T2 run concurrently, issues can arise when:

  1. T1 reads data being modified by T2.
  2. T1 and T2 both modify data concurrently.

Databases use transaction isolation to hide concurrency issues from the application.

Serializability

Serializability addresses concurrency issues by guaranteeing that transactions running concurrently have the same effect as if they had run serially, one at a time.

In practice, it's not that simple: serializable isolation has a performance cost, so many databases use weaker isolation levels, which protect against some concurrency issues but not all.

Types of Weak Isolation Levels:

Read Committed

A very common isolation level (the default in many databases), which makes two guarantees:

  1. No dirty reads - Reads see only committed data.
  2. No dirty writes - Writes overwrite only committed data.

Implementing Read Committed

  • Row-level locks - A write transaction acquires a lock on each row it modifies and holds it until commit/abort. Other write transactions have to wait for the lock, which ensures that only committed data is overwritten.
    Requiring readers to take the same lock would also ensure that only committed data is read, but the downside is that reads get blocked by long-running write transactions.

  • Save old and new value - To prevent reads from being blocked by long-running writes, the database keeps both the old committed value and the new value set by the transaction that holds the write lock. Until that transaction commits, any other transaction reads the old value.
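The two approaches above can be combined in a small sketch. This is a toy in-memory store, not how any particular database implements it: the class name `ReadCommittedStore` and its methods are hypothetical, and a conflicting writer gets an error here where a real database would make it wait for the lock.

```python
class ReadCommittedStore:
    """Toy key-value store in which readers only ever see committed values."""

    def __init__(self):
        self.committed = {}  # key -> last committed value
        self.pending = {}    # key -> (txid, uncommitted new value)

    def write(self, txid, key, value):
        # The pending slot doubles as a row-level write lock: only one
        # transaction at a time may hold an uncommitted value for a key.
        owner = self.pending.get(key)
        if owner is not None and owner[0] != txid:
            # Simplification: a real database would block here instead.
            raise RuntimeError(f"key {key!r} is locked by transaction {owner[0]}")
        self.pending[key] = (txid, value)

    def read(self, txid, key):
        # A transaction sees its own uncommitted write. Everyone else
        # gets the old committed value without waiting for the lock.
        owner = self.pending.get(key)
        if owner is not None and owner[0] == txid:
            return owner[1]
        return self.committed.get(key)

    def commit(self, txid):
        # Publish this transaction's new values and release its locks.
        for key in [k for k, (t, _) in self.pending.items() if t == txid]:
            self.committed[key] = self.pending.pop(key)[1]

    def abort(self, txid):
        # Discard the new values; the old committed values stay in place.
        for key in [k for k, (t, _) in self.pending.items() if t == txid]:
            del self.pending[key]


store = ReadCommittedStore()
store.write(1, "x", 10)
store.commit(1)
store.write(2, "x", 20)    # T2 has written but not committed
print(store.read(3, "x"))  # 10: T3 sees the old committed value
store.commit(2)
print(store.read(3, "x"))  # 20: T2's write is now committed
```

Note that T3 reads two different values for the same key within one transaction, which is allowed under Read Committed.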

Snapshot Isolation

Concurrency bugs can still arise under Read Committed. One example is the nonrepeatable read.

A nonrepeatable read, or read skew, happens when a transaction reads data at different points in time and sees inconsistent values because another transaction committed in between.

This can cause issues when:
  • Taking backups.
  • Running analytics queries and integrity checks.

The solution is snapshot isolation. The idea is that each transaction reads from a consistent snapshot of the database: it sees all the data that was committed at the start of the transaction, and nothing committed later.

Implementing Snapshot Isolation

Write locks are used to prevent dirty writes: a transaction that writes an object blocks other transactions from writing the same object until it commits or aborts.

Read transactions do not take any locks. Instead, the database keeps several committed versions of each object side by side, a technique called multi-version concurrency control (MVCC). This ensures that reads are not blocked by writes.

Readers never block writers and writers never block readers.

One option for indexes is to have each index point to all versions of an object. An index query then filters out the versions that are not visible to the current transaction.
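The visibility filtering can be sketched as follows. This is a simplified model under stated assumptions: `Version`, `MVCCStore` and their fields are hypothetical names, the per-key version list stands in for an index that points to all versions, and write locks, aborts and garbage collection of old versions are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Version:
    value: object
    created_by: int                   # txid that wrote this version
    deleted_by: Optional[int] = None  # txid that overwrote it, if any


class MVCCStore:
    def __init__(self):
        self.versions = {}    # key -> list of Version, oldest first
        self.next_txid = 1
        self.committed = set()
        self.snapshots = {}   # txid -> txids already committed when it began

    def begin(self):
        txid = self.next_txid
        self.next_txid += 1
        self.snapshots[txid] = set(self.committed)
        return txid

    def _visible(self, txid, writer):
        # A write is visible if this transaction made it, or if the
        # writer had already committed when this transaction began.
        return writer == txid or writer in self.snapshots[txid]

    def read(self, txid, key):
        for v in reversed(self.versions.get(key, [])):
            if not self._visible(txid, v.created_by):
                continue  # written after the snapshot was taken
            if v.deleted_by is not None and self._visible(txid, v.deleted_by):
                continue  # already overwritten as far as this snapshot knows
            return v.value
        return None

    def write(self, txid, key, value):
        chain = self.versions.setdefault(key, [])
        for v in reversed(chain):
            if v.deleted_by is None:
                v.deleted_by = txid  # the old version is kept, only marked
                break
        chain.append(Version(value, created_by=txid))

    def commit(self, txid):
        self.committed.add(txid)


db = MVCCStore()
setup = db.begin()
db.write(setup, "a", 500)
db.write(setup, "b", 500)
db.commit(setup)

reader = db.begin()          # snapshot taken here
transfer = db.begin()
db.write(transfer, "a", 400)
db.write(transfer, "b", 600)
db.commit(transfer)

# The reader still sees both balances as of its snapshot.
print(db.read(reader, "a"), db.read(reader, "b"))  # 500 500
```

The reader never waits for the transfer and never sees a half-applied transfer, which is the read skew that Read Committed allows.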

Preventing Lost Updates

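Lost updates can occur when two transactions concurrently perform a read-modify-write (RMW) cycle on the same object: each reads the value, modifies it, and writes back the result. The later write overwrites the earlier one without incorporating its modification, so that modification is lost.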
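A minimal illustration of the problem, with the interleaving of two counter increments written out by hand instead of relying on real concurrency. The function names are made up for this example.

```python
def interleaved_increments(start):
    """Two read-modify-write cycles whose reads both happen before either write."""
    value = start
    t1_read = value      # T1 reads
    t2_read = value      # T2 reads the same value
    value = t1_read + 1  # T1 writes back
    value = t2_read + 1  # T2 writes back and clobbers T1's update
    return value


def serial_increments(start):
    """The same two cycles, run one after the other."""
    value = start
    value = value + 1    # T1 reads, modifies, writes
    value = value + 1    # T2 reads T1's result, modifies, writes
    return value


print(interleaved_increments(42))  # 43: one increment was lost
print(serial_increments(42))       # 44
```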

Atomic Write Operations

Instead of performing a read-modify-write cycle in the application, we can use an atomic write operation provided by the database, which removes the need to read the data into the application and modify it there.

MongoDB provides atomic operations for making local modifications to part of a document. Redis provides atomic operations for modifying data structures.
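In a relational database the same idea is usually expressed as a single `UPDATE` statement. The sketch below uses SQLite only because it ships with Python; the `counters` table and the `foo` key are illustrative assumptions.

```python
import sqlite3

# In-memory database with one hypothetical counter row.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counters (key TEXT PRIMARY KEY, value INTEGER NOT NULL)")
conn.execute("INSERT INTO counters (key, value) VALUES ('foo', 42)")
conn.commit()


def atomic_increment(conn, key):
    # The read and the write happen inside a single UPDATE statement,
    # so the application never holds a possibly stale copy of the value.
    with conn:  # commits on success, rolls back on an exception
        conn.execute("UPDATE counters SET value = value + 1 WHERE key = ?", (key,))


def current_value(conn, key):
    row = conn.execute("SELECT value FROM counters WHERE key = ?", (key,)).fetchone()
    return row[0]


atomic_increment(conn, "foo")
atomic_increment(conn, "foo")
print(current_value(conn, "foo"))  # 44
```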

Atomic operations can be implemented by:
  • Taking an exclusive lock on the object when it is read, so that no other transaction can read it until the update has been applied. This is sometimes called cursor stability.
  • Or executing all atomic operations on a single thread.
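The exclusive-lock approach can be sketched in application code with a mutex. This is an analogy, not a database implementation: `AtomicCounter` and `run_workers` are hypothetical names, and a `threading.Lock` plays the role of the lock on the object.

```python
import threading


class AtomicCounter:
    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def increment(self, delta=1):
        # The whole read-modify-write happens while holding the lock,
        # so no other thread can read the value in between.
        with self._lock:
            self._value += delta
            return self._value

    def get(self):
        with self._lock:
            return self._value


def run_workers(counter, workers=4, increments=1000):
    def work():
        for _ in range(increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.get()


print(run_workers(AtomicCounter()))  # 4000: no increments are lost
```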

Compare-and-set

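If the database doesn't provide transactions, an alternative is an atomic compare-and-set operation.

This avoids lost updates by allowing an update to happen only if the value has not changed since it was last read. If the value has changed, the update has no effect and the read-modify-write cycle must be retried.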
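One way to express compare-and-set in SQL is to put the previously read value into the `WHERE` clause and check how many rows were updated. The sketch below again uses SQLite for convenience; the `wiki_pages` table, the page id and the `compare_and_set` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wiki_pages (id INTEGER PRIMARY KEY, content TEXT NOT NULL)")
conn.execute("INSERT INTO wiki_pages (id, content) VALUES (1234, 'old content')")
conn.commit()


def compare_and_set(conn, page_id, expected, new):
    """Write `new` only if the content is still `expected`. Returns True on success."""
    with conn:
        cur = conn.execute(
            "UPDATE wiki_pages SET content = ? WHERE id = ? AND content = ?",
            (new, page_id, expected),
        )
    # rowcount is 0 when the content changed since it was read.
    return cur.rowcount == 1


# Two users read the same version of the page, then both try to save.
seen_by_a = seen_by_b = "old content"
first_ok = compare_and_set(conn, 1234, seen_by_a, "edit by user A")
second_ok = compare_and_set(conn, 1234, seen_by_b, "edit by user B")
print(first_ok, second_ok)  # True False
```

User B's edit is rejected instead of silently overwriting user A's, so B has to re-read the page and try again.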
