Normalization is a systematic approach of decomposing tables to eliminate data redundancy and undesirable characteristics like Insertion, Update and Deletion Anamolies. It is a multi-step process that puts data into tabular form by removing duplicated data from the relation tables.
Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy. Normalization usually involves dividing large tables into smaller tables and defining relationships between them. It also deals with the insertion, update and deletion anomalies.
First Normal Form:
Deals with atomicity. Columns should not have multiple values.
Student Age Subject Adam 15 Biology, Maths Alex 14 Maths
Second Normal Form:
Removes partial dependency.
a table is in 2NF if and only if it is in 1NF and no non-prime attribute is dependent on any proper subset of any candidate key of the table.
As per the Second Normal Form there must not be any partial dependency of any column on primary key. It means that for a table that has concatenated primary key, each column in the table that is not part of the primary key must depend upon the entire concatenated key for its existence. If any column depends only on one part of the concatenated key, then the table fails Second normal form.
Student Age Subject Adam 15 Biology Adam 15 Maths Alex 14 Maths
Candidate Key is {Student, Subject}. Age is dependent only on Student.
Third Normal Form:
Removes transitive dependency. Table is in 3NF if
– It is in second normal form
– There is no transitive functional dependency
By transitive functional dependency, we mean we have the following relationships in the table: A is functionally dependent on B, and B is functionally dependent on C. In this case, C is transitively dependent on A via B
Student_id Student_name Student_DOB Student_Zip City State
Student Zip is dependent on Student_id and City / State are dependent on Zip.
BCNF (Boyce Codd Normal Form):
– Strict form of 3NF
–
Fourth Normal Form:
-Under fourth normal form, a record type should not contain two or more independent multi-valued facts about an entity. In addition, the record must satisfy third normal form.
Consider employees, skills, and languages, where an employee may have several skills and several languages. We have here two many-to-many relationships, one between employees and skills, and one between employees and languages. Under fourth normal form, these two relationships should not be represented in a single record such as
——————————-
| EMPLOYEE | SKILL | LANGUAGE |
===============================
Fifth Normal Form: