![]() |
FoxPop |
www.foxpop.co.uk |
POWERBASE TUTORIAL 11B
Rules of Normalization.
We left off last time, looking at the different ways we could organize the tables in our Contact Database. I also introduced you to a new term: Normalization. In this lesson we shall learn about the first rule of Normalization.
But before we start I think we must first define a new concept know as an "Entity". This concept helps us to understand the rules of normalization. To begin with, the purpose of database design is to sufficiently describe chosen aspects of real world things such that some desired function can be effectively performed. A function might be finding the phone number of your favorite Pizza shop, paying an employee their correct salary, or ensuring that a customer receives and pays for the goods they have ordered. The issue is of course what things do you need in your database, what aspects of these things do you need and how are these things related? To assist this process a distinct type of "real world thing" is often referred to as an "entity"; the various aspects of an individual entity that distinguish it from others of the same type or further describe it are referred to as "attributes". For example an "Employee" entity might have descriptive attributes of Title, Given name, Family name and Gender and a Social Security number attribute that uniquely distinguishes different employees.
Now that you have learned this new concept "entity", we can use it in our discussion below.......
First Normal Form - also known as Rule Number One The first normal form dictates that every instance of an entity referenced in a Table will be unique and that all fields in the Table contain data that is nondecomposable.
Whew, that is very complicated, but what does it mean? So let's look at each part of the rule:
Every record much be unique. This means that every time you enter information about an entity, that instance will be unique to the Table it is being entered into. The easiest way to identify a record as being unique is to declare a primary key for the table. Primary keys must meet several different conditions. Any data entered into a primary key field (or fields, as a primary key can be made up of 1 or more fields) any where in that specific Table MUST be unique to that table. NO DUPLICATION is allowed and there must be a primary key for every record.
OK, so each record in the Table should contain a unique identifier such that you have a way of safely identifying the record. This unique identifier is called a primary key - but how do you create a primary key? Databases like Access, FoxPro, Oracle, SQL Server and the like, have the DBA to determine the primary key. Once this key is declared they set some buttons or flags that actually let the database system know that "these fields are the primary key". From then on, any violation of the key will cause an error. Such a violation may be trying to enter the same information twice or not entering any data in those fields. For an order entry system the primary key may be Order Number. For an employee database, many companies in the USA use the employee's social security number as the primary key for the employee table. Though, in the past few years, with the proliferation of stolen and fake social security numbers, many companies have given up on the social security number being the primary key as they may not be unique.
The primary key is a field or fields that uniquely identify the record. Sometimes you can assign a natural primary key. Using the example above - the Order Number or the Social Security number might uniquely identify the record. At other times, you might have to create that primary key. In a customer table, what do you do if you have 2 customers with the same name? In that case, you should create an identifier - thus the Customer ID field is added to the Customer Table. The Employee Table may have an employee number or other ID that is used in place of the social security number.
The primary key should be short, stable and simple. Short means it should be small in size. It should not be a key consisting of 9 fields, nor should it be a 50-character field either. Stable means the primary key should be a field whose value does not change very often, if at all. An employee number does not usually change but could an employee change their name? And finally simple means it should be easy for a user to work with - again, if it is a composite key, 2 or 3 fields are easy to work with than 9.
Now for the next part - nondecomposable. - What does that mean?
nondecomposable data is data that cannot be broken down into smaller pieces. For example a field called Name - in which the first, middle and last name is entered, is a violation of this rule. This data can be broken down further.
Now, decomposable not only applies to fields, but to the record also. If the fields that make up a record are "mixed" (like having home and work addresses in a record) we can break this down further. Another example is for an orders table: you can have fields for order number, order date, customer number; and fields like product1, product2, product3. This is in violation of the rule because the record could be decomposed further (what would happen if someone wanted to order 4 products, would you add another field?) In this Orders table, each product ordered should be in a separate record. Again, information should not repeat in a record.
Let's now move on to testing our 3 cases to see if they meet the first rule of normalization. Check carefully -
![]() | Do we have primary keys defined? I don't think we do. We should have an "ID" in each Table that would be unique so we can identify each record from another. We don't want to use a composite key of first name/last name because what if we have 2 people with the same name? For case 3 we would also have a primary key for the Business Table. |
![]() | Have we decomposed our fields? Case1, Case 2 and Case 3 all meet this rule. Using this method or rule makes the data much easier to work with. Going back to our fields for the last and first name, you now have the ability to sort or search by first name, by last name or by both. |
![]() | Have we decomposed our records? Case 1 - NO, we have personal and company fields. Case 2 - well, we kinda have decomposed it. This Table either has a record for a person or a company. It is better than Case 1 but it is still not great - and what about those phone fields? Case 3 - This also fails for several reasons. In the Contact Table we have repeating information for the email fields and in the Contacts Table and the Company Table we have all these telephones again: phone, fax, mobile.... these have to go. |
Keep all of this in the back of your brain. Next time we move on to the Second Normal Form and the Third Normal Form will follow that. When we are all done, we will actually design our database tables, and even break some of the rules (I will explain why later!)
See you all next time!
Case 1: Personal and Business fields ALL in one Table |
Case 2: A combined Contact/Business Table |
Case 3: Separate People and Business Tables. |
|||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
|
|
© LJKern and FoxPop 1999