Database interview questions
Lock-Based Concurrency Control
Lock-based protocols in DBMS ensure that a transaction cannot read or write a data item until it obtains the appropriate lock. By controlling access in this way, these protocols maintain serializability and prevent concurrency problems such as lost updates or dirty reads.
A lock is a mechanism used to manage concurrent access to database items. Each transaction must acquire the necessary lock before performing read or write operations. There are two main types of locks:

- Shared lock (lock-S): allows a transaction to read a data item. Multiple transactions can hold shared locks on the same item at the same time.
- Exclusive lock (lock-X): allows a transaction to both read and write a data item. Only one transaction can hold an exclusive lock on an item at a time.

A transaction can acquire a lock only if it is compatible with existing locks held by other transactions: shared locks are compatible with other shared locks, while an exclusive lock is incompatible with any other lock.
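This compatibility rule can be sketched as a small matrix check. The sketch below is hypothetical and does not reflect any particular DBMS's lock manager:

```python
# Hypothetical sketch of a lock compatibility check between a requested
# lock mode and the modes already held on a data item by other transactions.

# lock-S (shared) is compatible with other shared locks;
# lock-X (exclusive) is incompatible with every other lock.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}

def can_grant(requested: str, held_modes: list) -> bool:
    """A lock is granted only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(requested, held)] for held in held_modes)

print(can_grant("S", ["S", "S"]))  # True: shared locks coexist
print(can_grant("X", ["S"]))       # False: exclusive conflicts with shared
print(can_grant("X", []))          # True: no existing locks to conflict with
```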
Types of Lock-Based Protocols
In the simplest protocol, a transaction must lock a data item before performing insert, delete, or update operations. The lock is released only when the transaction completes.

The Two-Phase Locking (2PL) protocol divides a transaction's execution into a growing phase, during which locks may only be acquired, and a shrinking phase, during which locks may only be released. The moment a transaction has acquired all the locks it requires is called the lock point. After this point, it transitions from the growing to the shrinking phase.
Lock Upgrading and Downgrading
Upgrading converts a shared lock (lock-S) into an exclusive lock (lock-X) and is permitted only during the growing phase. Downgrading converts an exclusive lock into a shared lock and is permitted only during the shrinking phase.
By enforcing these rules, the Two-Phase Locking protocol ensures consistency and prevents conflicts between concurrent transactions. Restricting when locks can be acquired, upgraded, or released helps maintain a serializable schedule, preserving the integrity of the database.
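The two phases can be illustrated with a minimal sketch. The class below is hypothetical and only models the acquire/release discipline, not a real lock manager:

```python
# Minimal two-phase locking sketch: a transaction may acquire locks only
# while in the growing phase; the first release moves it to the shrinking
# phase, after which any further acquisition is a 2PL violation.
class TwoPhaseTransaction:
    def __init__(self):
        self.locks = set()
        self.shrinking = False  # False = growing phase

    def acquire(self, item):
        if self.shrinking:
            raise RuntimeError("2PL violation: cannot acquire after first release")
        self.locks.add(item)

    def release(self, item):
        self.shrinking = True   # the lock point has passed; shrinking begins
        self.locks.discard(item)

txn = TwoPhaseTransaction()
txn.acquire("A")
txn.acquire("B")   # lock point: all required locks are now held
txn.release("A")   # shrinking phase starts here
# txn.acquire("C") would now raise RuntimeError
```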
Optimistic vs Pessimistic Locking
Pessimistic locking is a concurrency control mechanism that assumes data conflicts are likely when multiple users or processes access the same resource simultaneously. To prevent these conflicts, it acquires locks on resources before performing operations, ensuring exclusive access for the duration of a transaction. Once a lock is obtained, no other transaction can modify or read the locked resource until the lock is released. This approach guarantees consistency but can lead to waiting times, reduced concurrency, and potential performance bottlenecks under heavy load.
Pessimistic locking can be applied at different levels of granularity — from database-level and table-level locks to row-level or even column-level locks. It is commonly used in environments where data integrity is critical and the cost of conflicts is high, such as in banking systems and financial transactions.
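At the application level, row-level pessimistic locking behaves like taking an exclusive lock before reading. The sketch below is a simplified in-memory model; real databases typically take such locks with statements like SELECT ... FOR UPDATE:

```python
import threading
from collections import defaultdict

# Simplified in-memory sketch of row-level pessimistic locking:
# each row id maps to a lock that must be held for the whole operation.
row_locks = defaultdict(threading.Lock)
accounts = {1: 100}  # illustrative data: account id -> balance

def deposit(row_id, amount):
    with row_locks[row_id]:         # acquire before reading; blocks other writers
        balance = accounts[row_id]  # read happens under the lock
        accounts[row_id] = balance + amount
    # the lock is released only when the "transaction" completes

deposit(1, 20)
print(accounts[1])  # 120
```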
Optimistic locking takes the opposite approach to traditional locking mechanisms. It assumes that write conflicts are rare and that most transactions can proceed concurrently without interfering with one another. Instead of preventing access to shared data, the database allows multiple transactions to read and modify the same row at the same time.
Rather than locking rows during reads, optimistic locking detects conflicts only at the time an update is attempted. Each row is associated with a version identifier, such as a numeric counter, a timestamp, or a hash. This version is read together with the row and later used to verify that the data has not been modified by another transaction.
When an update occurs, the database checks whether the version stored in the table is still the same as the version that was originally read. If the versions match, the update succeeds and the version is updated atomically along with the data. If the versions do not match, it means another transaction has modified the row in the meantime, and the update is rejected to prevent lost updates.
Version identifiers can be implemented in several ways. A common approach is an integer version field that increments on every successful update. Some systems use a timestamp field such as updated_at, which is overwritten on modification, while others use a checksum or hash derived from the row contents that is recomputed whenever the data changes.
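The hash-based variant can be sketched in a few lines using Python's hashlib; the field names are illustrative:

```python
import hashlib

# Derive a version token from the row contents; any change to the data
# produces a different token, so a stale token reveals a concurrent update.
def row_version(row: dict) -> str:
    payload = repr(sorted(row.items())).encode()
    return hashlib.sha256(payload).hexdigest()

row = {"id": 1, "balance": 100}
v1 = row_version(row)
row["balance"] = 120     # the data changes...
v2 = row_version(row)
print(v1 != v2)          # True: ...so the version changes with it
```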
The optimistic locking workflow begins by reading the row together with its current version.
```sql
SELECT id, balance, version
FROM accounts
WHERE id = 1;
```

The application receives the row data along with the version value.
```
id = 1, balance = 100, version = 5
```

When updating the row, the application includes the previously read version in the update condition. This ensures that the update will only succeed if no other transaction has modified the row since it was read.
```sql
UPDATE accounts
SET balance = 120, version = version + 1
WHERE id = 1 AND version = 5;
```

If exactly one row is affected, the update succeeds and the version is incremented. If zero rows are affected, the version check has failed, indicating a conflict. The application can then decide whether to retry the operation using the latest data or abort the transaction altogether.
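This read/check/retry flow can be exercised end to end. The sketch below uses Python's sqlite3 module and an illustrative accounts table; the retry policy is an assumption, not a prescribed one:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100, 5)")

def optimistic_update(conn, account_id, new_balance, retries=3):
    for _ in range(retries):
        # 1. Read the row together with its current version.
        balance, version = conn.execute(
            "SELECT balance, version FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        # 2. The update succeeds only if nobody changed the row since the read.
        cur = conn.execute(
            "UPDATE accounts SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (new_balance, account_id, version),
        )
        if cur.rowcount == 1:   # exactly one row affected: success
            return True
        # zero rows affected: version check failed, loop retries with fresh data
    return False

print(optimistic_update(conn, 1, 120))  # True
print(conn.execute("SELECT balance, version FROM accounts WHERE id = 1").fetchone())
# (120, 6): balance updated, version incremented atomically
```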
Optimistic locking minimizes locking overhead and maximizes concurrency, making it well suited for high-read systems and workloads with low write contention, such as REST APIs, microservices, and distributed applications.
In summary, pessimistic locking prioritizes safety and consistency by restricting access through locks, while optimistic locking prioritizes performance and scalability by assuming that conflicts are infrequent and resolving them only when necessary.
Write-Ahead Logging (WAL)
Write-Ahead Logging (WAL) is a core durability mechanism in databases. Before any change is made to the actual data files on disk, the change must first be written to a log that is persisted safely. This design allows PostgreSQL to guarantee durability without slowing down every transaction with expensive disk writes.
When you commit a transaction, PostgreSQL does not immediately flush the modified data pages to disk. Instead, it follows a carefully optimized sequence. The database first records all changes made by the transaction in the WAL. These WAL records are then flushed to disk using fsync, ensuring the log is safely persisted. Once this flush completes, PostgreSQL reports a successful commit back to the application. The actual data pages are written to disk later by background processes.
The key insight is that the WAL contains enough information to fully reconstruct every committed transaction. As long as the WAL is durable, the database can always recover to a consistent state, even if the data files themselves are temporarily out of date.
If a crash occurs after the commit has been acknowledged but before the modified data pages are written to disk, PostgreSQL will replay the WAL during startup. This recovery process re-applies all committed changes, restoring the database to the exact state it was in before the crash. Although recovery may take longer in this scenario, crashes are relatively rare, making this trade-off highly effective in practice.
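The log-first discipline and replay-based recovery can be sketched with a toy model. A list stands in for the durable, fsync'ed log and a dict for the data files; nothing here reflects PostgreSQL's actual WAL record format:

```python
# Toy write-ahead log: every change is appended to the log before the
# data pages are touched, so recovery can replay the log after a crash.
wal = []          # stands in for the durable, flushed log
data_pages = {}   # stands in for data files on disk (written lazily)

def commit(txn_id, changes):
    # 1. Record all changes in the WAL and (conceptually) flush it to disk.
    for key, value in changes.items():
        wal.append((txn_id, key, value))
    # 2. Only now acknowledge the commit; data pages are written later
    #    by background processes, not on the commit path.
    return "committed"

def recover():
    # Replay every logged change to rebuild a consistent state,
    # even if data_pages was never written before the crash.
    pages = {}
    for _txn, key, value in wal:
        pages[key] = value
    return pages

commit(1, {"accounts/1": 120})
commit(2, {"accounts/2": 50})
# Crash before background writes: data_pages is still empty,
# but replaying the WAL restores every committed change.
print(recover())  # {'accounts/1': 120, 'accounts/2': 50}
```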
From a performance perspective, WAL provides several important advantages. Writes to the WAL are sequential, which is significantly faster than random writes to data pages scattered across disk. Multiple transactions can also be grouped together during WAL flushes, reducing the number of expensive disk sync operations. As a result, commit latency becomes more predictable and primarily depends on WAL write throughput rather than complex data page updates.
In short, WAL allows PostgreSQL to decouple transaction durability from data page persistence. By making the log durable first and deferring data writes, PostgreSQL achieves both strong crash safety and high transaction throughput.