BCA / B.Tech 20 min read

Normalization

What is Normalization in DBMS?

Normalization is a database design technique whose main objective is to reduce data redundancy and ensure data integrity. Through the normalization process, a large table is divided into smaller, well-structured tables, and they are linked together with appropriate relationships. This process organizes the database structure, preventing unnecessary complexities and anomalies when inserting, updating, or deleting data. The main goal is to ensure that data for each entity is stored only once, which improves database performance and saves storage space.

Benefits of Normalization:
  • Reduces Data Redundancy: Normalization reduces the duplication of data in the database, which saves storage space.
  • Improves Data Integrity: By properly defining relationships between tables, data integrity is maintained.
  • Ensures Data Consistency: Anomalies do not occur when updating data in a normalized database, thus maintaining data consistency.
  • Better Performance: The size of the database is reduced, which makes query processing faster.

Normal Forms:

Normalization is divided into several stages, called Normal Forms. Each normal form has specific rules for dividing tables to make the database more efficient and organized.

  • First Normal Form (1NF): A table is in 1NF if:
    • All columns in the table contain atomic values (i.e., each column holds a single value).
    • There are no repeating groups of columns.
    • Example: A table that stores a customer's information and their multiple purchases in a single column would not be in 1NF. In 1NF, each column must contain only a single value.
  • Second Normal Form (2NF): A table is in 2NF if:
    • It is in 1NF.
    • All non-prime attributes are fully functionally dependent on the entire primary key, meaning there is no partial dependency.
  • Third Normal Form (3NF): A table is in 3NF if:
    • It is in 2NF.
    • There is no transitive dependency for non-prime attributes. This means a non-prime attribute does not depend on another non-prime attribute.
    • Example: If a table has `Employee ID`, `Department ID`, and `Department Name`, where `Department Name` depends on `Department ID` (which is not the primary key), it violates 3NF. This should be split into two tables: one for employees and one for departments.
  • Boyce-Codd Normal Form (BCNF): BCNF is a stricter version of 3NF. A table is in BCNF if:
    • It is in 3NF.
    • For every functional dependency (X → Y), X must be a superkey.
  • Fourth Normal Form (4NF): A table is in 4NF if:
    • It is in BCNF.
    • It has no multi-valued dependencies. 4NF aims to eliminate multi-valued dependencies, where one attribute depends on multiple other attributes.
  • Fifth Normal Form (5NF): A table is in 5NF if:
    • It is in 4NF.
    • No data anomalies arise due to join dependencies. 5NF aims to divide tables in such a way that they can be rejoined without any data loss.