Why separate data?

Novice database designers tend to put all their information into one table on the grounds that "it makes things easier". To illustrate the problem with this approach, I created a table of people in companies, where each row contains one person, their home address, the company they work for, and that company's address.
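As a rough sketch of what such a single table might look like, here is a version built with Python's built-in sqlite3 module. The people and company names are the ones used in this chapter; the table name, column names, and every address are invented placeholders, not the original data.

```python
import sqlite3

# Hypothetical flat table: names come from the chapter text,
# all addresses are made-up placeholders.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE people (
        name            TEXT,
        home_address    TEXT,
        company         TEXT,
        company_address TEXT
    )
""")

rows = [
    ("Joe Bloggs",  "1 Example Road", "BloggsCo",  "10 Placeholder Street"),
    ("Anna Carter", "2 Example Road", "BloggsCo",  "10 Placeholder Street"),
    ("Laura Bond",  "3 Example Road", "Microsoft", "20 Placeholder Avenue"),
]
conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", rows)

# Each row repeats the full company name and address for every employee.
for row in conn.execute("SELECT * FROM people ORDER BY name"):
    print(row)
```

Note how the BloggsCo address already appears twice with only two employees; with thousands of employees it would appear thousands of times.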

Looking at the table, what happens if, say, BloggsCo (the company Joe Bloggs and Anna Carter work for, as well as potentially thousands of other people) changes address? We'd have to go through every record for that company and update the address in each one. Or, worse, what happens if an operator mistypes the company name and enters "BlogsCo"?
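To make the problem concrete, here is a minimal sketch using Python's sqlite3 module, assuming a cut-down version of the single table in which Anna Carter's company was mistyped as "BlogsCo". The table layout and all addresses are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, company TEXT, company_address TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Joe Bloggs",  "BloggsCo",  "10 Placeholder Street"),
        ("Anna Carter", "BlogsCo",   "10 Placeholder Street"),  # mistyped company name
        ("Laura Bond",  "Microsoft", "20 Placeholder Avenue"),
    ],
)

# BloggsCo moves: every row that mentions the company has to be rewritten.
cursor = conn.execute(
    "UPDATE people SET company_address = ? WHERE company = ?",
    ("99 New Placeholder Lane", "BloggsCo"),
)
rows_updated = cursor.rowcount

# Anna's row did not match because of the typo, so the same
# real-world company now appears with two different addresses.
addresses = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT company_address FROM people "
        "WHERE company IN ('BloggsCo', 'BlogsCo') ORDER BY company_address"
    )
]
print(rows_updated, addresses)
```

The update itself succeeds without complaint, which is exactly the danger: nothing in the single-table design can tell the database that "BloggsCo" and "BlogsCo" were meant to be the same company.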

By having everything in one table, we end up with large amounts of data duplication, because the company address is repeated for each worker. We're also leaving ourselves wide open to human error: if an office worker has to type in the same address a hundred (thousand?) times over, the chances of them making a mistake are quite high.

Furthermore, we experience what is called a "deletion anomaly" when we delete the last person from a company. That is, if we delete Laura Bond, we not only lose her record, but we also lose the last copy of Microsoft's address.
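The deletion anomaly can be sketched in a few lines with Python's sqlite3 module. As before, the table layout, the `company_address` helper, and the addresses are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, company TEXT, company_address TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Joe Bloggs",  "BloggsCo",  "10 Placeholder Street"),
        ("Anna Carter", "BloggsCo",  "10 Placeholder Street"),
        ("Laura Bond",  "Microsoft", "20 Placeholder Avenue"),
    ],
)

def company_address(company):
    """Return the stored address for a company, or None if no row mentions it."""
    row = conn.execute(
        "SELECT company_address FROM people WHERE company = ? LIMIT 1",
        (company,),
    ).fetchone()
    return row[0] if row else None

before = company_address("Microsoft")

# Laura Bond is the only Microsoft employee in the table.
conn.execute("DELETE FROM people WHERE name = ?", ("Laura Bond",))

# Her row held the only copy of the company address, so it is gone too.
after = company_address("Microsoft")
print(before, after)
```

We only asked to remove a person, yet the database has also forgotten everything it knew about the company.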

 


Copyright ©2015 Paul Hudson. Follow me: @twostraws.