So, what is the solution here?

Normalisation is the process of producing a set of tables that have the same properties we would have had in our large table, except split neatly up into grouped elements. This is how our master table looks right now:

  • ID

  • Name

  • Address 1

  • Address 2

  • Company Name

  • Company Address 1

  • Company Address 2

  • Company City

When you think about it, the city a company is in really has no relation to the city a person lives in - the company and the person are different things entirely, linked merely by the fact that a person works for a company. So, we could split our table into two smaller tables, People and Companies, like this:

People:

  • ID

  • Name

  • Address1

  • Address2

  • CompanyNum (new)

Companies

  • ID

  • Name

  • Address1

  • Address2

  • City

Structured like this, companies are stored separately to people, with only an ID number linking the two - each person has a CompanyNum, which should be set to the ID of the company (in the Companies table) that they work for.

Now if we change the address of Microsoft, we only need to update it in one place. Or if we're adding a new person to BloggsCo, we just need to enter "1" into the CompanyNum box rather than copy all the information across. This eliminates both the data duplication and the chance of error - operators can be restricted to selecting a company by name from the Companies table.

 

Want to learn PHP 7?

Hacking with PHP has been fully updated for PHP 7, and is now available as a downloadable PDF. Get over 1200 pages of hands-on PHP learning today!

If this was helpful, please take a moment to tell others about Hacking with PHP by tweeting about it!

Next chapter: Why not separate data? >>

Previous chapter: Why separate data?

Jump to:

 

Home: Table of Contents

Copyright ©2015 Paul Hudson. Follow me: @twostraws.