Database Design table structure normalization explained
Normalization in simple terms, is the process of evaluating and correcting database table's structure in order to minimize data redundancy. The normalization process involves assigning attributes to tables based on the concept of determination.
In order to normalize a database table's structure, we need to follow series of stages called normal forms. They are three major normal forms; these are common in a business environment database. However, some database can require a forth normal form.
Although normalization is a very important database design ingredient it's not recommended to achieve the highest level of normalization. Because, the more you normalize the table, the more it's require the use of join statements in order to return results to end user. And this can lead to the consumption of computer resources by the RDBMS. A successful database design must also consider end-user demand for fast performance.
Process that involves normalization
The main goal of normalization is to reduce data redundancy in the tables. To achieve this, we need to follow stages; these stages will depend on the need of the database designer and how the tables are structured.
The first step is to identify functional dependency: functional dependency occurs when one attribute value which is not a primary key can be used to identify other attributes values in a row.
They are two types of functional dependencies that are special interest in normalization process, these are partial dependencies and transitive dependencies.
- Partial dependency exists when there is a functional dependence in which the determinant is only part of the primary key.
- Transitive dependency exists when there is a functional dependence in which the determinant is not part of the primary key.
Converting to the first normal form
In order to convert database table's structure into a first normal form, first we need to identify and remove repeating groups. Repeating groups are collection of multiple entries of the same type that are grouped under one attribute. Example of repeating group is when name is used in the table. This attribute can take one name as a single value, or two or even three names. To correct this, we need to create two other attributes in the table and five each kind of name its own attribute. Like first_name, middle_name and last_name.
The second step of converting database table into first normal form is by identifying the primary key. After removing repeating groups, we need to identify, which attribute or attributes (composed primary key) will be used to locate other attributes' value in the row.
The third step is to identify all dependencies. In this step we need to make sure only primary key can be used to identify other attributes' values in the row. When we find dependency in the table, we need to remove those attributes that are dependent to that attribute, and create another table with that attribute as the primary key of the newly created table; and also this attribute will remain in the main table and act as the foreign key for the table.
In the third step we remove transitive dependency from the table structure. When all this is achieved now we can say the table is in the first normal form.
Converting to second normal form
Converting in second normal form is very easy. When the table is in first normal it can be converted to second normal form is one step.
The step that involves converting a table structure into a second normal form is by removing partial dependency. This is when one part of compound primary key can be used to identify other attributes' value on its own.
Process that involves eliminating partial dependencies is by creating a new table and transfer all attributes that their values can be identified by the part of compound primary key and assign the attributes as the table primary key. And also, the attribute must remain in the main table and acts as foreign key to the newly created table.
You can skip second normal when your table structure does not use compound primary key. The table will be automatically in the second normal form.
Converting to third normal form
To convert table structure into third normal we need to locate and remove any remaining transitive dependency. This can be achieved by first remove all attributes that are dependent to another attribute in a row, which is not primary key. Second, we create a new table with all dependent attributes and the determinant attribute in the main table to acts as the foreign key to the newly created table.
A determinant, this is the attribute whose value can be used to determine other attributes values in a row.
You can achieve more than third normal form, but it's uncommon in a business data environment. And also more than third normal form can be a drawback to database performance in general; because in order for database to return result to the end user it will require more join statements to do so, which required more computer resources.