007 Computing Jargon of the Week

Normalize

Normalization is the process of reorganizing data in a database so that it meets two basic requirements: (1) There is no redundancy of data (all data is stored in only one place), and (2) data dependencies are logical (all related data items are stored together)

https://www.techopedia.com/definition/1221/normalization

What does this mean?

Consider an online retailer which has a database of customer purchases. The database may consist of:

  • a Customer table containing details including customer name, address, account ID number; and
  • a Purchases table containing item details, order number and customer details (as per the details in the Customer table)

The Purchases table could have details such as customer name, address and ID number. However this would be duplicating data and would require extra effort to a maintain these tables. Entries stored in the Customer table would have to be replicated in the Purchases table. Also, if a change was required to some of the customer information then it would need to be maintained across multiple tables.

Whilst this may not sound too bad in this small example, it can grow tedious and troublesome for larger scale databases.

As such, normalization is the process of reducing as much of the duplication as possible. In this instance we can take out a customers name and address from the Purchases table and only reference the customer’s unique ID number. This way when we want to query for a customer’s purchases, we can JOIN the tables together, using the customer ID as a reference between the two tables.

Normalization can be desirable as it reduces disk space required as it eliminates the need of holding duplicated data. However, for complex queries, joining many tables together can be expensive in terms of the time it takes to execute a query.

Further reading

https://www.techopedia.com/definition/1221/normalization

https://www.studytonight.com/dbms/database-normalization.php

http://agiledata.org/essays/dataNormalization.html


If you enjoyed reading this post, consider reading some of my other definition posts:

Propagate

Tautology

Idempotent