Considerations When Designing A Database
(Jul 13 2007 - 05:21:43 PM
by Timothy Khouri
) - [print article
The most powerful applications, whether web or windows apps, usually utilize large amounts of relational data. However, many developers skip the design process of their database, or design as they code which often paints them into a corner. This article will explain steps all developers should take when designing a database.
While there are many aspects of database design that can be covered, and there are many types of databases that can be used, we are going to focus on general concepts that should be thought of first before choosing a technology.
We will assume that we already have the desired application in mind, and that the feature set is at least roughly planned out. Often times though, features and code are decided upon "in concrete" and the database is detrimentally "squeezed" into the mold.
In light of this, we'll talk about the three areas that should be taken into account during the planning phase. These are:
- Labeling and defining all objects that will need to be stored in the database.
- Mapping all of the possible relationships for each object in writing.
- Defining how your application will be retrieving, editing and reporting on the data.
Labeling and Defining All Objects
The first and most critical piece to database design, is clearly defining what you consider the data to be. There should be no surprises as you are in the middle of the code as to what you are dealing with. Instead of thinking of a database as some binary and text data stored in a file, think of it as a house of entities; each entity representing an object (or a class in programming terms).
For instance, in the case of a simple e-commerce web site, your database would logically contain 'customers', 'products' and 'orders' (or sales receipts). Now, with this knowledge of what you want to build, you could go down one of two routes. Either 1) starting to build your tables or 2) defining all aspects of your entities.
Many people will go down the first choice because they can quickly rattle off field names: "FirstName, LastName, Address, City, State, Zip - Done!". But this kind of 'quick code' thinking is very near-sighted.
For instance, have you given consideration to non-U.S. orders, such as Canada whose postal-codes are 6 characters (not 5 like U.S. zip codes)? Are you going to accommodate different billing addresses verses shipping addresses? For that matter, will the website allow a customer have multiple orders and multiple shipping addresses?
You don't have to cover every possible question, but you do need to decide on things like the above. Notice, those questions above are important because the answers would change the structure of the database all together. This is why it's important to define your entities (which will ultimately be tables). Once you have decided on the semantic meaning and schema of your tables, we move on to the next part - mapping relationships.
Mapping Relationships in Writing
Now that the structure of the website is created, you need to define all the possible relationships (or connections) between the tables. Using the example of the e-commerce website above we can make the following assumptions:
- Customers are unique and they are at the top.
- A customer can have zero or more orders (sales receipts).
- Orders consist of one or more products.
This means that there is no direct relationship between a customer and a product. However, there might be a need to report on a customer to product basis. We'll talk about that a bit more in the third section of this article. Until then it's important to remember to put the data relationships in writing (making it more concrete).
Once you've defined the above relationships you can understand better how to make your indexes and foreign keys. For instance, the 'orders' table will need to have a CustomerID field which will be part of a foreign key constraint that will require that a 'customer' record exists in the 'customers' table with that same ID.
Also, to associate what products were ordered, we would need a 'OrderProducts' table consisting of two fields: OrderID and ProductID. This would act as a bridge between the Orders table and the Products table. It would be wise to make another foreign key constraint to ensure that the order and the product exists before they can be tied together. The last area of consideration we will discuss is how the application (web site in this case) will use the data.
Querying, Editing and Reporting on Data
While we can define all of our foreign key constraints above just by looking at the schema of the tables, we need to look at our application to define the next crucial part - indexes. It's logical to assume that each table will have a simple primary key column called "ID". By default (and habit) most database designers would give this field a clustered index.
If you are not already familiar with clustered indexes, you might want to read this article. But basically a clustered index is tells the database how to group items together. Grouping items by a unique key makes no sense as every record will be in a group of only one record (which can hardly be called a group).
Again, focus on the application and the business need. It is likely that we will want to display all sales for a customer; therefore grouping (adding a clustered index) on the "CustomerID" field would make more sense. Not only does this make sense, but this will also drastically increase any queries involving sales for a particular customer or set of customers.
By keeping these three points in mind, you'll have a stronger database design that is likely able to grow with future updates. You won't be caught off guard by incredibly slow queries once your database begins to grow. And you'll be able to build your application in a more object oriented way as you are already thinking of the data in terms of objects (entities).
As a side note: the development world in general is aiming towards more logical designs and methods of viewing / retrieving data. The introduction to LINQ and the Entity Framework in Visual Studio 2008 (.NET 3.5) will even more so make you think of your data in an object oriented way.