Concurrent Access and Locking — Chapter 8
A proper locking mechanism is necessary to ensure data consistency when multiple clients may access, and possibly modify, the same data at the same time. There are three main approaches to solving this problem: table-level locks, page-level locks, and row-level locks. Each approach has its advantages and disadvantages.
Table-level locks have the simplest logic, which results in fewer bugs and better lock-acquisition performance. Deadlocks can be fairly easily avoided. On the other hand, locking the entire table results in poor performance for applications that perform a large number of concurrent reads and writes.
Row-level locks allow high performance at a very high level of concurrency, at the cost of greater complexity in the implementation. That complexity translates into slower performance for applications with a low probability of lock contention, as well as a higher probability of bugs. It is also very difficult to avoid deadlocks completely, and many implementations perform deadlock detection instead.
As the granularity of a lock decreases, the amount of memory required to lock the same amount of data generally increases, as do the complexity of the algorithm and the potential for a deadlock. However, the decrease in lock granularity also increases the potential for concurrent access: the unfortunate application that would otherwise have to wait for the lock now has a better chance of proceeding without waiting.
The row-level lock has the smallest granularity; the table-level lock has the greatest; and the page-level lock comes in between, offering a compromise.
MyISAM and MEMORY storage engines can only work with table-level locks.
InnoDB supports row-level locks, and Berkeley DB supports page-level locks.
When the parser processes a query, it determines somewhat simplistically what type of table locks need to be acquired based on the query type. Once the execution reaches the lock manager, each associated storage engine is allowed to update the type of lock for each table. Then the table locks are acquired according to a generic, storage-engine-independent algorithm. The storage engines that support finer-granularity locks request a lock that permits concurrent reads and writes and then handle the locking issues internally.
Table Lock Manager
As explained earlier, all queries involving tables of all storage engines go through the table lock manager regardless of the granularity levels of the locks supported by the storage engine. For example, even if row-level locks are supported, a special table lock is acquired that permits concurrent write access. Table locks are divided into two groups: read locks and write locks. The table lock manager maintains four queues for each table:
• Current read-lock queue (lock->read)
• Pending read-lock queue (lock->read_wait)
• Current write-lock queue (lock->write)
• Pending write-lock queue (lock->write_wait)
Threads that currently hold a read lock are found in the current read-lock queue in the order of lock acquisition. Threads currently waiting for a read lock are found in the pending read-lock queue.
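Conceptually, the per-table state can be pictured as in the following sketch. This is a simplified illustration only; the real THR_LOCK structure carries considerably more bookkeeping, but the field names below mirror the four queues just described.

    /* A simplified sketch of the per-table lock state kept by the table
     * lock manager.  Illustrative only; not the server's actual THR_LOCK. */

    struct lock_request {
        struct lock_request *next;       /* next request in the same queue       */
        unsigned long        thread_id;  /* thread holding or awaiting the lock  */
        int                  type;       /* TL_READ, TL_WRITE, ...               */
    };

    struct table_lock {
        struct lock_request *read;        /* current read locks   (lock->read)       */
        struct lock_request *read_wait;   /* pending read locks   (lock->read_wait)  */
        struct lock_request *write;       /* current write locks  (lock->write)      */
        struct lock_request *write_wait;  /* pending write locks  (lock->write_wait) */
    };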
Read Lock Request
A read lock is always granted as long as there are no current write locks on the table and no higher-priority write locks in the pending write-lock queue. If the lock request can be granted immediately, the corresponding lock descriptor is placed in the current read-lock queue. Otherwise, the lock request enters the pending read-lock queue, and the requesting thread suspends itself to wait for the lock. The requested read and pending write locks are prioritized according to the following rules (a simplified decision sketch follows the list):
• A TL_WRITE lock in the pending write-lock queue has precedence over all read locks except for TL_READ_HIGH_PRIORITY.
• A request for TL_READ_HIGH_PRIORITY has precedence over any pending write lock.
• All write locks in the pending write-lock queue that are not a TL_WRITE have a lower priority than a read lock.
The presence of a current write lock causes the requesting thread to suspend itself and wait for the lock to become available, except in the following cases:
• With the approval of the storage engine, accomplished through the check_status() function pointer call in the THR_LOCK descriptor, all read locks except TL_READ_NO_INSERT permit one TL_WRITE_CONCURRENT_INSERT lock.
• TL_WRITE_ALLOW_WRITE permits all read and write locks except TL_WRITE_ONLY.
• TL_WRITE_ALLOW_READ permits all read locks except TL_READ_NO_INSERT.
• TL_WRITE_DELAYED permits all read locks except TL_READ_NO_INSERT.
• TL_WRITE_CONCURRENT_INSERT permits all read locks except TL_READ_NO_INSERT.
• The conflicting write lock belongs to the requesting thread.
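These rules can be condensed into a small decision function. The sketch below uses invented names and types and is not the server's actual code; in particular, the meaning of the check_status() return value is an assumption made for illustration.

    /* Simplified sketch: can a read lock be granted right now? */

    enum lock_type {
        TL_READ, TL_READ_HIGH_PRIORITY, TL_READ_NO_INSERT,
        TL_WRITE_ALLOW_WRITE, TL_WRITE_ALLOW_READ, TL_WRITE_DELAYED,
        TL_WRITE_CONCURRENT_INSERT, TL_WRITE_ONLY, TL_WRITE
    };

    struct current_write {
        enum lock_type type;
        unsigned long  owner_thread;
        int          (*check_status)(void *param);  /* engine veto hook; assumed
                                                        to return 0 when a concurrent
                                                        insert is acceptable */
        void          *status_param;
    };

    /* Returns 1 if the read request can be granted now, 0 if it must wait. */
    static int read_lock_grantable(enum lock_type read_type,
                                   unsigned long requester,
                                   const struct current_write *w,   /* NULL if none */
                                   const enum lock_type *pending_writes,
                                   int n_pending_writes)
    {
        int i;

        /* A pending TL_WRITE blocks every read except TL_READ_HIGH_PRIORITY;
           pending writes of any other type do not block read requests. */
        if (read_type != TL_READ_HIGH_PRIORITY) {
            for (i = 0; i < n_pending_writes; i++)
                if (pending_writes[i] == TL_WRITE)
                    return 0;
        }

        if (w == NULL)                       /* no current write lock at all   */
            return 1;
        if (w->owner_thread == requester)    /* conflicting lock is our own    */
            return 1;

        switch (w->type) {
        case TL_WRITE_ALLOW_WRITE:           /* compatible with any read       */
            return 1;
        case TL_WRITE_ALLOW_READ:
        case TL_WRITE_DELAYED:
            return read_type != TL_READ_NO_INSERT;
        case TL_WRITE_CONCURRENT_INSERT:
            /* needs both a compatible read type and the engine's approval */
            return read_type != TL_READ_NO_INSERT &&
                   w->check_status && !w->check_status(w->status_param);
        default:
            return 0;                        /* plain TL_WRITE and the rest: wait */
        }
    }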
Write Lock Request
When a write lock is requested, the table lock manager first checks whether there are any write locks already in the current write-lock queue. If there are none, the pending write-lock queue is checked. If the pending write-lock queue is not empty, the request is placed in the pending write-lock queue and the thread suspends itself to wait for the lock. Otherwise, if the pending write-lock queue is also empty, the current read-lock queue is checked.
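A sketch of the corresponding decision path is shown below. The branches that the text does not spell out (what happens when a current write or read lock is found) are filled in with the obvious behavior, waiting unless the lock types are compatible, and should be read as an assumption rather than a transcription of the real code.

    /* Simplified write-lock decision path; names are illustrative only. */

    enum lock_decision { GRANT_NOW, WAIT_IN_PENDING_QUEUE };

    static enum lock_decision write_lock_request(int have_current_writes,
                                                 int have_pending_writes,
                                                 int have_current_reads,
                                                 int compatible_with_holders)
    {
        if (have_current_writes)        /* step 1: current write-lock queue      */
            return compatible_with_holders ? GRANT_NOW : WAIT_IN_PENDING_QUEUE;

        if (have_pending_writes)        /* step 2: queue behind earlier writers  */
            return WAIT_IN_PENDING_QUEUE;

        if (have_current_reads)         /* step 3: current read-lock queue       */
            return compatible_with_holders ? GRANT_NOW : WAIT_IN_PENDING_QUEUE;

        return GRANT_NOW;               /* all queues empty: grant immediately   */
    }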
Storage engine interaction with the table lock manager
The locking mechanism provided by the table lock manager is insufficient for several storage engines. MyISAM, InnoDB, NDB, and Berkeley DB storage engines provide some form of an internal locking mechanism.
MyISAM. MyISAM mostly depends on the table lock manager to ensure proper concurrent access. However, there is one exception: a concurrent insert. If the insert operation results in writing the record at the end of the datafile, reading can be done without a lock. In this case, the table lock manager permits one concurrent insert lock, and many read locks. The storage engine ensures consistency by remembering the old end of the file before the start of the concurrent insert, and by not permitting the reads to read past the old end of the file until the concurrent insert is complete.
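The bookkeeping behind a concurrent insert can be pictured with a toy sketch like the one below. The structure and names are invented for illustration; MyISAM's actual implementation is, of course, more involved.

    /* Toy illustration of the concurrent-insert idea: readers are clamped to
     * the end-of-file position saved when the insert began, so they never
     * read past the old end of the datafile. */

    #include <stddef.h>

    struct datafile {
        size_t length;          /* current physical end of the datafile       */
        size_t visible_length;  /* end of file as seen by concurrent readers  */
    };

    static void concurrent_insert_begin(struct datafile *f)
    {
        f->visible_length = f->length;   /* remember the old end of the file   */
    }

    static void concurrent_insert_commit(struct datafile *f, size_t record_size)
    {
        f->length += record_size;        /* record appended past the old end   */
        f->visible_length = f->length;   /* now readers may see it             */
    }

    /* A reader only scans up to visible_length while the insert is running. */
    static size_t readable_limit(const struct datafile *f)
    {
        return f->visible_length;
    }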
InnoDB. InnoDB asks the table lock manager to defer locking to the storage engine by changing the lock type to TL_WRITE_ALLOW_WRITE for write locks. Internally, it implements a complex row-level locking system that includes deadlock detection.
NDB. NDB is a distributed storage engine that also supports row-level locks. It handles table locks the same way InnoDB does.
Berkeley DB. Berkeley DB internally supports page-level locks, and thus needs the write locks to become TL_WRITE_ALLOW_WRITE just like NDB and InnoDB.
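All three of these engines rely on the same trick: rewrite the requested table lock so that the table lock manager stays out of the way. The real hook for this is the handler's store_lock() method; the sketch below is a heavily simplified illustration of the idea, and the single condition shown stands in for the more involved checks the actual code performs.

    /* Rough sketch of how a row- or page-level engine rewrites the requested
     * table lock before the table lock manager sees it.  Illustrative only. */

    enum lock_type { TL_READ, TL_WRITE_ALLOW_WRITE, TL_WRITE };

    static enum lock_type store_lock_sketch(enum lock_type requested,
                                            int inside_lock_tables)
    {
        if (requested == TL_WRITE && !inside_lock_tables)
            return TL_WRITE_ALLOW_WRITE;   /* defer real locking to the engine */
        return requested;
    }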
InnoDB Locking
Although InnoDB is not the only storage engine that supports some internal locking mechanism, it is perhaps the most interesting, being the most stable and mature of all the transactional storage engines in MySQL.
Lock types
There are two types of row-level locks: shared and exclusive. InnoDB supports both. To support the coexistence of row- and table-level locks, InnoDB also uses so-called intention locks on a table. There are also two types of intention table locks, shared and exclusive.
As the names shared and exclusive suggest, several transactions can hold a shared lock on the same object at the same time, while an exclusive lock can be held by only one transaction at any one time.
A transaction must acquire the appropriate intention lock on the table before locking a row in it. Shared row locks are possible under either a shared or an exclusive intention lock. However, only an exclusive intention lock allows a transaction to acquire an exclusive row lock.
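The interaction between the four table-level modes can be summarized with the textbook compatibility matrix for intention shared (IS), intention exclusive (IX), shared (S), and exclusive (X) locks. The sketch below is an illustration of that matrix, not a dump of InnoDB's internal code.

    /* Standard compatibility matrix between table-level lock modes. */

    enum table_lock_mode { LOCK_IS, LOCK_IX, LOCK_S, LOCK_X };

    static int table_locks_compatible(enum table_lock_mode held,
                                      enum table_lock_mode requested)
    {
        /* rows: held lock, columns: requested lock (IS, IX, S, X) */
        static const int compat[4][4] = {
            /* IS */ { 1, 1, 1, 0 },
            /* IX */ { 1, 1, 0, 0 },
            /* S  */ { 1, 0, 1, 0 },
            /* X  */ { 0, 0, 0, 0 },
        };
        return compat[held][requested];
    }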
Record locking
Record, or row, locking occurs as InnoDB searches for the records requested by the optimizer. What InnoDB locks is the index entry itself, the gap before it, and the gap after the last record examined. This method is called next-key locking.
Next-key locking is necessary to avoid the phantom row problem in transactions. If we did not lock the gap before the record, it would be possible for another transaction to insert another record in between. Thus, if we were to run the same query again, we would see a record that was not there the first time we ran the query. This would make it impossible to meet the requirements of the serializable transaction isolation level.
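To make the idea concrete, here is a toy model of which gaps a range scan would lock. It operates on a sorted array standing in for an index and is invented code, not InnoDB's algorithm; it only shows why an insert into the scanned range would have to wait.

    #include <stdio.h>

    /* A next-key lock in this toy model covers the half-open interval
     * (gap_low, key]: the gap before an index entry plus the entry itself. */
    struct next_key_lock {
        int gap_low;    /* exclusive lower bound of the locked gap        */
        int key;        /* the index entry itself (SUPREMUM = "past end") */
    };

    #define SUPREMUM 0x7fffffff

    static int lock_range(const int *index, int n, int low, int high,
                          struct next_key_lock *out)
    {
        int i, n_locks = 0, prev = -0x7fffffff;

        for (i = 0; i < n; i++) {
            if (index[i] >= low && index[i] <= high) {
                out[n_locks].gap_low = prev;       /* gap before the entry */
                out[n_locks].key     = index[i];   /* the entry itself     */
                n_locks++;
            }
            if (index[i] > high) {                 /* first entry past the range:  */
                out[n_locks].gap_low = prev;       /* lock the gap before it too,  */
                out[n_locks].key     = index[i];   /* so no phantom can slip in    */
                return n_locks + 1;
            }
            prev = index[i];
        }
        /* range extends past the last entry: lock the gap up to "supremum" */
        out[n_locks].gap_low = prev;
        out[n_locks].key     = SUPREMUM;
        return n_locks + 1;
    }

    int main(void)
    {
        int idx[] = { 10, 20, 30, 40 };
        struct next_key_lock locks[8];
        int i, n = lock_range(idx, 4, 15, 30, locks);

        for (i = 0; i < n; i++)     /* prints (10,20], (20,30], (30,40] */
            printf("locked gap (%d, %d]\n", locks[i].gap_low, locks[i].key);
        return 0;
    }

In this example, a second transaction trying to insert the key 25 would need a lock inside the already locked gap (20, 30] and would therefore block, so rerunning the range query cannot reveal a phantom row.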
Dealing with deadlocks
What happens if transaction A locks record R1 and then tries to lock R2, while transaction B simultaneously locks record R2 first and then tries to lock R1? Row-level locking naturally introduces the problem of deadlocks.
InnoDB has an automatic deadlock detection algorithm. It will usually roll back the last transaction involved in a deadlock. The deadlock detection algorithm fails in some cases; for example, if tables from other storage engines are used, or if some tables were locked with LOCK TABLES. Additionally, some transactions may be considered to be in a virtual deadlock. For example, if a query is written in such a way that it examines several billion records, it may not release its locks for weeks, although from a theoretical point of view it eventually will. For such situations, InnoDB uses a lock timeout, which is controlled by the configuration variable innodb_lock_wait_timeout.
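The core of any such detector is a cycle check on the wait-for graph. Below is a minimal sketch under the simplifying assumption that each transaction waits on at most one other transaction; InnoDB's real detector works on its own lock data structures and is considerably more involved.

    /* Minimal wait-for-graph cycle check for deadlock detection. */

    #define NO_WAIT (-1)

    /* waits_for[t] is the transaction that t is blocked on, or NO_WAIT. */
    static int creates_deadlock(const int *waits_for, int n_trx, int start)
    {
        int current = waits_for[start];
        int steps;

        for (steps = 0; steps < n_trx && current != NO_WAIT; steps++) {
            if (current == start)
                return 1;                /* cycle back to where we began */
            current = waits_for[current];
        }
        return 0;                        /* chain ends: no deadlock involving start */
    }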
Any transaction can potentially be caught in a deadlock. The application programmer needs to write code that deals with this possibility. Usually, retrying a rolled-back transaction is sufficient. It is also possible to minimize the chance of a deadlock by careful programming. Accessing records always in the same index order, writing properly optimized queries, and committing transactions frequently are some of the techniques that help prevent potential deadlocks.
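A typical retry loop can be sketched with the MySQL C API as below. The table, the statements, and the retry policy are placeholders chosen for illustration; error code 1213 (ER_LOCK_DEADLOCK) is what the server reports when InnoDB has chosen the transaction as a deadlock victim and rolled it back.

    #include <mysql/mysql.h>

    /* Runs a small transaction, retrying it if it is rolled back as a
     * deadlock victim.  Returns 0 on success, -1 on failure. */
    static int transfer_with_retry(MYSQL *conn, int max_attempts)
    {
        int attempt;

        for (attempt = 0; attempt < max_attempts; attempt++) {
            if (mysql_query(conn, "START TRANSACTION"))
                return -1;

            if (mysql_query(conn, "UPDATE accounts SET balance = balance - 10 WHERE id = 1") ||
                mysql_query(conn, "UPDATE accounts SET balance = balance + 10 WHERE id = 2")) {
                if (mysql_errno(conn) == 1213 /* ER_LOCK_DEADLOCK */)
                    continue;            /* InnoDB already rolled us back: retry */
                mysql_query(conn, "ROLLBACK");
                return -1;               /* some other error: give up */
            }

            if (mysql_query(conn, "COMMIT") == 0)
                return 0;                /* success */

            if (mysql_errno(conn) != 1213 /* ER_LOCK_DEADLOCK */)
                return -1;               /* commit failed for another reason */
            /* deadlock reported at commit time: loop around and retry */
        }
        return -1;                       /* gave up after max_attempts */
    }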