Commonwealth Computer Navigator's Certificate/Milestone 4/Re-structure of template/Example Tutorial How a database is organised
Database Concepts |
What is a Database | How a Database is Organized | Data Types | Normalization | Self Assessment | Summary & FAQs |
Tutorial 5.1.1.2 - How a Database is Organized
Upon completion of this tutorial the learner will:
|
Tables, Records and Fields (or Entities, Data and Attributes)
A database is a collection or tables. The tables contain records that are organized into fields. Or using database terminology; A database is a collection of entities. The tables contain data that are organized into attributes. The use of these two terminologies is frequent when working with databases; become familiar with the terms table and entity, data and record and field and attribute being used interchangeably. Another important concept is the idea of a design view and a query view. Consider the two tables below, one being design view the second being query view;
A design view is a look at the structure of a table as it is being designed.
field name | data type | length | allow nulls |
---|---|---|---|
first_name | Text | 32 | No |
last_name | Text | 64 | No |
birth_date | Date | 10 | No |
gender | Char | 1 | Yes |
- field name is the name given to the field as it will be stored in the database.
- data type is the type of data stored in the database. Usually; Numeric, Text, Date or Char.
- length is the number of characters long the field will be.
- allow nulls indicates if the field can be left blank or empty
A query view is a look at the data in rows and columns. Notice the field names from design view are the column headers in query view.
first_name | last_name | birth_date | gender |
---|---|---|---|
Peter | Rawsthorne | 28/12/1963 | M |
David | Rawsthorne | 26/10/1961 | |
Lisa | Rawsthorne | 11/03/1965 | F |
Malcolm | Rawsthorne | 03/05/1987 | M |
Hannah | Rawsthorne | 23/09/1989 | F |
Note: it is common for field names to not contain spaces. There are a number of different conventions for naming fields which can change depending on the standards defined by the organization you are working. The convention used in this course is to be clear with your field names and use underscores for spaces. (i.e. 'first name' becomes first_name)
Design an address entity;
|
Identifying entities and attributes
This is where we start to use the small school scenario introduced at the beginning of this module. From now on all the analysis and design we do will be based upon this scenario.
Identifying entities and attributes could be considered both and art and a science. It is an art for you need to bring together and analyze information from many different sources. The ability to understand a subject domain and identify the entities with their attributes challenges you to see something virtual from many perspectives and groupings, this can be a very creative process. It is also a science for we use proven analysis processes based upon relational algebra, this process refines how our database is structured into entities and attributes. Analyzing a subject domain for the purpose of designing a database means you identify three things; the entities, which are objects you seek information, the attributes, which are the data collected for an entity and the relationships among entities.
This simple diagram shows the relationship between the students and addresses entities.
And once we begin to add attributes to the entities, the diagram will look like this;
The symbols used to create these entity relationship diagrams are as follows;
Read over the small school scenario and using a paper and pencil identify all the entities described.
|
Data types
When designing a database table you will need to decide on the data type for each field. There are three primary data types available, they are;
- Text
- Numbers
- Date Time
- Binary
These data types then break down into more specific types within each primary type. For the OpenOffice Database the following types are available;
Text - character data
- fixed - a fixed length string of text characters
- varchar - a variable length string of text characters
Numbers - numeric data
- Boolean - single digit representing either yes or no
- Integer - and integer or whole number (i.e. without decimal places)
- Float - a number with decimal places
Date Time date and time data
- Date - a character string specific for storing dates
- Time - a character string specific for storing time
- Date/Time (timestamp) - a field specifically designed to store a date and time stamp
Binary - large amounts of binary data
- Image - a large amount of binary data in the form of an image (i.e. gif, jpg, etc.)
- Memo - a large amount of text
- Binary - similar to an image field, used to store anything binary (i.e. mp3, ogg, avi, etc.)
It should be noted that this is not a full description of the available data types and how they can be used. It is the beginning of an extensive topic and beyond the scope of this course.
Mandatory and optional fields
When designing a database table you should always ask yourself if the field (or attribute) is mandatory or optional. In other words, can the field be left empty? The answer to this question varies from database to database depending on the information needs. An example would be a persons birth date; if the situation requires that decisions are made on the age of the person, then birth date would be a mandatory field. The other side of this issue would be when would you like a field to be optional? Answer: When not all data is available at time of data entry... Remember to give the mandatory and optional issue some thought as you build database tables.
Entity relationships
An understanding of a subject domain increases as entities and their attributes are identified. The relationships among the entities becomes an essential part of your database design. When identifying entities and their relationships, think in terms of nouns and verbs. The entities are nouns and relationships are verbs. In the previous example of the Students and their Addresses; "Students" and "Addresses" are the entities (nouns) and "live at" and "contain" are the relationships (verbs).
Wikipedia has a number of entity relationship and normalization based entries, here are a few of the most relevant;
A review of object role modeling can also provide a good method for identifying entities and relationships. |
Exploring first and second normal form
A solid understanding of normalization and normals forms is essential for database design. During database normalization you analyze entities and attributes through a series of rules known as normal forms. There are five basic normal forms, and each builds upon the rules of the previous. In other words, you cannot have a database design in third normal form without also meeting all the rules of first and second normal form. Consider the process of normalization as the process of simplification and the removal of redundancy. For our first review of normalization we are only going to look at the first and second normal forms. What you learn from these two normal forms will provide enough foundation to begin normalizing a database design.
First Normal Form
A database is said to be in first normal form when the tables;
- are guaranteed to not have any duplicate records (i.e. have a way to identify uniqueness within each record)
- do not have any repeating groups (i.e. {grade-id, course1, course2, course3} where course is the repeating group)
Second Normal Form
A database is said to be in second normal form when the tables;
- are in first normal form
- have fields (or attributes) that create a logical entity. And the fields relate back to the unique identifier of the record. (i.e. Students consist of first-name, last-name, gender, and birth-date. A phone number field would NOT be a part of the Student)
Normalize this
Normalize the following table into second normal form.
person_id | first_name | last_name | birth_date | gender | phone_numbers |
---|---|---|---|---|---|
001 | Bill | Smith | 28/12/1963 | M | home: (604) 555-1234; fax : (604) 555-2345 |
001 | Bill | Smith | 28/12/1963 | M | cell: (778) 555-2345 |
005 | Fiona | Jones | 26/10/1961 | F | home: (604) 555-1234; pager (605) 555-9988 |
006 | John | Morrison | 14/07/1951 | M | home: (604) 555-0987; fax: (604) 555-2345; cell: (877) 555-7654 |
012 | Lisa | Ballard | 11/03/1965 | F | home: (604) 555-1234 |
Using these 5 steps modify the above table into first, then second normal form;
|
After Step 1 (entity relationship diagram)
After Step 2 (first normal form)
person_id | first_name | last_name | birth_date | gender | phone_type | phone_number |
---|---|---|---|---|---|---|
001 | Bill | Smith | 28/12/1963 | M | home | (604) 555-1234 |
001 | Bill | Smith | 28/12/1963 | M | fax | (604) 555-2345 |
001 | Bill | Smith | 28/12/1963 | M | cell | (778) 555-2333 |
005 | Fiona | Jones | 26/10/1961 | F | home | (604) 555-4455 |
005 | Fiona | Jones | 26/10/1961 | F | pager | (604) 555-9988 |
006 | John | Morrison | 14/07/1951 | M | home | (604) 555-0987 |
006 | John | Morrison | 14/07/1951 | M | fax | (604) 555-0096 |
006 | John | Morrison | 14/07/1951 | M | cell | (877) 555-7654 |
012 | Lisa | Ballard | 11/03/1965 | F | home | (604) 555-6543 |
After Step 5 (second normal form)
person_id | first_name | last_name | birth_date | gender |
---|---|---|---|---|
001 | Bill | Smith | 28/12/1963 | M |
005 | Fiona | Jones | 26/10/1961 | F |
006 | John | Morrison | 14/07/1951 | M |
012 | Lisa | Ballard | 11/03/1965 | F |
phone_type | phone_number | person_id |
---|---|---|
home | (604) 555-1234 | 001 |
fax | (604) 555-2345 | 001 |
cell | (778) 555-2333 | 001 |
home | (604) 555-4455 | 005 |
pager | (604) 555-9988 | 005 |
home | (604) 555-0987 | 006 |
fax | (604) 555-0096 | 006 |
cell | (877) 555-7654 | 006 |
home | (604) 555-6543 | 012 |
- You will notice that the person_id is present in both the People and PhoneNumbers tables. This allows you to join the phone numbers to their people 'owners'. More on keys in the next section.
If your getting bored with all this theory
Well we've covered enough material where you could jump into actually building a database. So go to tutorial 5.2 (Designing and creating tables) and get your hands dirty. I would strongly recommend you come back to this section as we are going to get deeper into normalization and how to build a database in third normal form.