0%
Exam Sidemann Logo

DATABASE ADMINISTRATION

Learning Outcome 5

LEARNING OUTCOME 5: MANAGE DATA CONCURRENCY

Database Transactions: Maintaining Data Integrity

In the world of databases, transactions are the fundamental unit of work that ensures data consistency and integrity. They act as a single, indivisible unit of operations that modify the database. Here's a breakdown of what transactions are and the core principles that govern them.

What is a Database Transaction?

A database transaction is a logical sequence of database operations treated as a single unit. It either succeeds completely, updating the database as intended, or fails entirely, leaving the database in its original state. This ensures data consistency by preventing partial or incomplete modifications.

ACID Properties: The Pillars of Transaction Reliability

To guarantee data integrity and reliability, database transactions adhere to the ACID properties, an acronym that stands for:

  1. Atomicity: This property ensures that a transaction is treated as an indivisible unit. Either all the operations within the transaction are completed successfully, or none of them are. Imagine a transaction transferring funds between two accounts. Atomicity ensures that either both accounts are updated (debiting one and crediting the other), or neither is modified, preventing inconsistencies.
  2. Consistency: This property guarantees that a transaction moves the database from one valid state to another. It enforces data integrity rules and constraints defined within the database schema. For instance, a transaction updating a customer's age might have a consistency rule that the age cannot be negative.
  3. Isolation: This property ensures that concurrent transactions do not interfere with each other's data. It guarantees that the outcome of a transaction is the same as if it were executed alone, even if multiple transactions are happening simultaneously. Isolation mechanisms like locking prevent one transaction from "seeing" or modifying data being used by another transaction until the first one completes.
  4. Durability: This property ensures that once a transaction commits (successfully finishes), the changes made to the database are permanent and persist even in the event of system failures like crashes or power outages. This is often achieved through techniques like transaction logging.

Database Transaction States

Database transactions undergo various states as they progress through their execution. Understanding these states is crucial for troubleshooting issues and ensuring data consistency.

  • Active State: This is the initial state where the database operations are being executed. Changes are not yet permanent.
  • Partially Committed State: This state exists in some systems where some, but not all, operations have completed. However, most modern databases skip this state, following an "all-or-nothing" approach.
  • Committed State: This signifies a successful transaction. All operations have been executed correctly, and the changes are permanently reflected in the database.
  • Failed State (or Aborted State): This indicates an error or violation. The database rolls back any changes made, ensuring the database remains in a consistent state.
  • Terminated State: This signifies an abnormal termination (e.g., system crash). The database might need to perform recovery procedures to determine the transaction's outcome.

Concurrency Control in Databases

In the fast-paced world of databases, multiple users might need to access and modify data concurrently. While this concurrency can improve efficiency, it can also lead to data inconsistencies if not managed properly. This is where concurrency control comes in.

Why Concurrency Control?

Imagine a scenario where two users are updating the same bank account balance simultaneously. Without concurrency control, one user's update might overwrite the other's, leading to inaccurate data. Concurrency control mechanisms prevent such issues by ensuring Data Consistency and Serializability.

Database Concurrency Control Problems

Even with concurrency control, potential problems can arise:

  • Lost Update: Occurs when two transactions update the same data item, and one update is entirely lost.
  • Uncommitted Dependency (Dirty Read): Arises when one transaction reads uncommitted data written by another transaction. If the second transaction fails, the first one has read incorrect data.
  • Inconsistent Retrievals/Analysis (Non-Repeatable Read): Occurs when a transaction reads data that is being modified by another concurrent transaction, resulting in an inconsistent view of the data if read multiple times.

Database Concurrency Control Protocols

Concurrency control protocols establish guidelines for managing data access and preventing inconsistencies. Two prominent types are lock-based protocols and the Two-Phase Locking (2PL) protocol.

Lock-Based Protocols: Securing Data Access

  • Exclusive Lock (X Lock): Grants a transaction exclusive access. No other transaction can read or write the data item.
  • Shared Lock (S Lock): Allows multiple transactions to read the same data item concurrently, but none can modify it.

Two-Phase Locking (2PL) Protocol

The 2PL protocol is a specific type of lock-based protocol that enforces a structured approach to acquiring and releasing locks.

  • Growing Phase: A transaction can acquire locks but cannot release any.
  • Shrinking Phase: Once the transaction has acquired all needed locks, it enters this phase, where it can only release locks and cannot acquire new ones.

Alternative Concurrency Control Protocols

While lock-based protocols are a cornerstone, alternatives exist.

1. Timestamp-Based Protocols

These protocols assign unique timestamps to transactions and use them to order transactions and ensure serializability. This is a form of Optimistic Concurrency Control (OCC) where transactions proceed without locks, and validation occurs at commit time.

2. Validation-Based Protocols

Also known as pessimistic concurrency control, these protocols rely on data validation to ensure consistency. Before a transaction modifies data, it checks for any conflicting modifications from other transactions.

Deadlock vs. Starvation

Both deadlocks and starvation can hinder the smooth operation of a database system, but they represent distinct problems.

Deadlock

A deadlock occurs when two or more transactions are permanently blocked, waiting for resources held by each other, creating a circular dependency. Four conditions must be met for a deadlock: Mutual Exclusion, Hold and Wait, No Preemption, and Circular Wait.

Starvation

Starvation occurs when a transaction is continuously denied access to resources, often because higher-priority transactions are always chosen first. The starved transaction waits indefinitely, never able to complete its execution.

Key Differences Between Deadlock and Starvation

Feature Deadlock Starvation
Nature of Blocking Circular dependency One-sided blocking
Resource Acquisition All involved transactions hold resources Starved transaction cannot acquire resources
Recovery Strategies Rollback involved transactions Prioritize starved transaction, adjust resource allocation

End of Outcome Quiz

1 of 20

Question text will load here.

    Quiz Results

    Score: 0%

    Answered: 0 of 0

    Correct: 0

    Review Incorrect/Skipped Answers: