Transaction in Database
- First, we need to understand the Transaction in the Database. A Transaction in the database has 4 properties ACID stands for: atomicity, consistency, isolation and durability.
- One transaction contains many queries that run by sequence. When a query has an error, the entire transaction will be an error, and the database will roll back.
Isolation levels in Database
Read Uncommitted Isolation Level
- This is quite a loose level, transaction will read data that haven’t been committed yet. Any updates and inserts that are not committed will be reflected in our transaction.
- This isolation level finds heavy use in the booking systems, where if any other transaction is trying to update the availability of a seat, even though that transaction is not committed, we should be able to see those changes.
- This isolation level does not ensure the integrity of data, however, if the deviation is acceptable, we can use this to decrease the deadlock situation, … (for example in case count all the Viet Nam population, we can accept the deviation is 100-200 people)
Read Committed Isolation Level
- This isolation level guarantees 2 things:
- No Dirty Reads: The database will not allow reading data that haven’t been committed yet to the database
- No Dirty Writes: This means that the database will accept any transaction on a particular row that already has a transaction running on it. The other transaction has to wait till the point the previous transaction on the rows is committed and only after that, any other transaction will be able to perform a write operation for the specific rows.
- In Spring, if using @Transactional without any config, the isolation level is Read Committed.
- No Dirty Reads: The database will not allow reading data that haven’t been committed yet to the database
Repeatable Reads Isolation Level
- For example, we have a table containing the employee’s salary information, and we have a transaction performing 2 tasks:
- Query 1: Count the number of employees.
- Query 2: Count the sum of the employee’s salary.
-
Now, let’s assume a scenario where any value was inserted on this table after the execution of Query 1. Now, the result of Query 2 will be impacted by this. In such scenarios, we prefer the use of the Repeatable reads Isolation Level which guarantees that the number of rows that are participating in the transaction will retain their current values throughout the transaction. Databases implement the same using the Multi-Version Concurrency Control.
- This guarantees the number of employee does not change from the start of transaction, so this new employee (1 new row) will not be count, and guarantees the value of task 2 remain true.
Serializable Isolation Level
- This is the strongest of all the isolation levels and guarantees pure isolation.
- No other transaction will be able to read or write value until this transaction is complete.
- This isolation level solves all the problems that were in all 3 previously discussed isolation levels but as this allows only 1 query to execute on the database, it restricts the scalability of the system.
Example
-
Assume that we are building a snowman, we add in sequence its head, its hand, then its leg.
-
Read Uncommitted Isolation Level: do not have any limit, example, we build its head at minute 0, and at minute 10, transaction be rolled back (delete query build its head). So at minute 5, we still read the uncommited query(building its head).
-
Read Committed Isolation Level: if we’ve done the job of building its head, and committed it (or rolled back). That make other transaction can read the change from this transaction.
- Repeatable Read Isolation Level: Assume we have 2 transactions:
- Transaction 1: At minute 0, read all the database, and do it again at minute 10, close the transaction at minute 11
- Transaction 2: Add data to the database at minute 5, and close the transaction at minute 6
- In the previous isolation level, because of the transaction 2 completion, so of course, data will be different between minute 0 and minute 10 of transaction 1.
- But, in this isolation level, the data in the reading area will always be guaranteed to be the same in each read in the session.
- Serializable Isolation Level:
- First, we must understand that data in Repeatable Read is not locked, which means it only ensures that the data will always be the same when we read it at different times in a session but does not guarantee that other sessions cannot change it, which means that even if the other party changes the data, the database still allows changes, but in Repeatable Read it only ensures that the data read in the session remains unchanged (even though it has actually changed) but does not prevent it.
- At this level, we will ensure that the data area is locked, and the other party is not allowed to change anything anymore.
- It sounds like Serializable helps us feel comfortable and ensures that the data is correct, but in essence, it prevents all multi-threaded actions in the database on a data area, and we can only run 1 session on 1 area at a time.