SingingEels : Development Community & Resource

Login

Articles

  • ADO.NET (2)
  • ASP.NET (35)
  • LINQ (4)
  • Security (2)
  • Silverlight (2)
  • SQL (7)
  • Standards (5)
  • WCF (2)

Syndication

  • Articles RSS
  • Blogs RSS

Contribute

  • Our Authors List
  • Member Sign-Up
  • Suggestions Box
ASP.NET Hosting with MS SQL 2008 – Click Here!

Understanding SQL: Many to Many Relationships

(Aug 03 2007 - 08:18:15 PM by Timothy Khouri) - [print article]
ASP.NET Hosting with MS SQL 2008 – Click Here!

In the world of relational data, many-to-many relationships are one of the hardest concepts to understand and implement correctly. Quite likely the scenario will arise for a developer to decide whether to support a one-to-many or a many-to-many schema, and out of fear we crumble to an "easy" design. This article will use real world scenarios to show how and why to use a many-to-many relationship as well as how to achieve perfect performance.

There are many situations that would call for a many-to-many schema. You might have a table of "Stores" and a table of "Managers" and your company might to allow managers to switch between stores. This would require that a store could have more than one manger, and a manger could belong to more than one store (thus, many to many). Or you could consider this scenario:

You're building an application for a customer support center that takes phone calls for a cable company. When a customer calls in to complain about their fuzzy T.V. reception and also they want to upgrade their high-speed internet access, the customer service representative will need to open up two "tickets"; one for the fuzzy reception problem, and one for the high-speed upgrade.

In this scenario we are dealing with two things: phone calls and support tickets. So when the customer calls back the next day about the internet upgrade, we will want to associate that phone call with the second ticket we opened. Likewise, when they call back again the next day about the fuzzy reception, we want to associate that phone call with the first ticket we opened. Let's see how we can accomplish this many-to-many relationship in SQL. We'll learn:

  • How to create tables that support a many-to-many relationship.
  • What indexes should be created for the best performance.
  • How to report by phone calls or by tickets (either side of the relationship).

How to Make a Many-to-Many Schema

The first piece in making our tables is to define the two entities (or tables) and then we can join them together by their IDs in a third "linker" table. For the example we are going to use we'll need to very simple tables: PhoneCalls and Tickets. We'll keep it simple so as not to get lost in the illustration, but instead to focus on the many-to-many relationship concept. Here is the schema of the two tables:

-- This table will hold our phone calls.
CREATE TABLE dbo.PhoneCalls
(
   ID INT IDENTITY(1, 1) NOT NULL,
   CallTime DATETIME NOT NULL DEFAULT GETDATE(),
   CallerPhoneNumber CHAR(10) NOT NULL
)

-- This table will hold our "tickets" (or cases).

CREATE TABLE dbo.Tickets
(
   ID INT IDENTITY(1, 1) NOT NULL,
   CreatedTime DATETIME NOT NULL DEFAULT GETDATE(),
   Subject VARCHAR(250) NOT NULL,
   Notes VARCHAR(8000) NOT NULL,
   Completed BIT NOT NULL DEFAULT 0
)

If we only wanted to support one-to-one functionality we would simply put a "PhoneCallID" field in the Tickets table and make the rule that you can't have a ticket without a phone call. But instead, we want to allow multiple tickets to be created on one phone call, and we want to allow many phone calls to be associated with a single ticket. To do that we'll need to build this linker table:

-- This table will link a phone call with a ticket.
CREATE TABLE dbo.PhoneCalls_Tickets
(
   PhoneCallID INT NOT NULL,
   TicketID INT NOT NULL
)

That's all we need to make our many-to-many schema work. First we would insert a record into the PhoneCalls table. Then, when the caller explains the situation, the customer service rep would enter a record into the Tickets table and the system would insert a link (the ID of the phone call and the ID of the ticket) into the linker table. Eventually though this table is going to be big, and will begin to perform slowly in views and reports. Let’s now look at how to make the perfect indexes for performance on both sides of the relationship:

Perfect Many-to-Many Indexes

Without a proper understanding of SQL indexes, most people would either add an arbitrary column to the linker table and make that the primary key (which doesn't help us at all), or they might make a composite primary key of PhoneCallID and TicketID. This index will help a lot; it makes sure that there are no duplicates and it gives us a lot of speed when searching or joining to the PhoneCalls table. But what if we wanted to query by the TicketID?

The problem with just one index when dealing with a many-to-many linker table is that you are only tuning performance for one side of the relationship (which makes sense: one index only helps one side). So how do we add an index that will give us the best performance possible? Well, a regular index is ok, but nothing beats a unique index in terms of speed. Here's the trick!

A linker table (a table of IDs from two other tables for the purpose of a many-to-many relationship) is always unique whether you look at it from left to right or right to left. Meaning, PhoneCallID_TicketID is unique and TicketID_PhoneCallID is also unique, so we can add another unique index to the table!

Perfect SQL indexes for a many-to-many linker table Another unique SQL index for a many-to-many linker table

How Does Another Unique Index Help?

Well, think about your table as if it was a library filled with thousands of books. How would you find a book called "SQL Performance Tips"? Well, you would use the library's index and skip ahead to the letter "S" and continue in alphabetical order until you found your book. But what if you wanted to find a book by the author "Billy McGee"? You can't use the index that you were just in because that is sorted by book title (which doesn't help you because you only know the author's name). So you would use the library's other index, the one that is sorted by author's name.

The same is true for SQL. If you do a query like this:

SELECT * FROM dbo.PhoneCalls_Tickets WHERE PhoneCallID = 123

Then SQL will realize that it should use the PhoneCallID_TicketID index to find your results. But if you change the query only slightly to do this:

SELECT * FROM dbo.PhoneCalls_Tickets WHERE TicketID = 123

Then SQL will understand that the first index doesn't help, so it will use the TicketID_PhoneCallID index to find your results. That's how to achieve the best performance with many-to-many relationships.

How Do I JOIN These Tables Together?

When you are dealing with JOINing more than two tables in a query it can sometimes get confusing. So when you're dealing with a many-to-many query (which requires three tables), it helps to first break up your FROM/JOIN clauses and explain them out loud to yourself as you type. The JOINs should make sense to SQL and to you. Let's look at a query that would JOIN both of our tables together with the help of the linker table.

-- It doesn't matter which table we start with, so I'll just
-- pick the PhoneCalls table.


SELECT
   *
FROM
   dbo.PhoneCalls

-- If I stop here, then all I have is the phone call data. Now

-- we can JOIN to the helper table to get the IDs of all tickets

-- associated with each phone call in the query above.


INNER JOIN dbo.PhoneCalls_Tickets ON
   PhoneCalls.ID = PhoneCalls_Tickets.PhoneCallID

-- That's simple enough. Now that we've joined to the linker

-- table we are able to use the PhoneCallID or the TicketID

-- fields from the linker table. Now let's get the tickets!


INNER JOIN dbo.Tickets ON
   PhoneCalls_Tickets.TicketID = Tickets.ID

You can also use the linker table in your WHERE clause. An example of this would be if you wanted to see all calls associated with a ticket, or all tickets associated with a call. Here is how you would achieve those two queries:

-- This will give you all calls for ticket number 123.
SELECT * FROM dbo.PhoneCalls WHERE ID IN
   (SELECT PhoneCallID FROM dbo.PhoneCalls_Tickets WHERE TicketID = 123)

-- This will give you all tickets for phone call number 123.

SELECT * FROM dbo.Tickets WHERE ID IN
   (SELECT TicketID FROM dbo.PhoneCalls_Tickets WHERE PhoneCallID = 123)

Closing Remarks

Now that we have examined the facts we realize that many-to-many relationships are not difficult to build, and they can perform very well when indexed properly. Also, if you name your tables appropriately, then you're queries will also be easy to read. So the next time you are designing a database schema for one of your projects, try to squeeze in some of this powerful functionality. Remember, you can easily support a one-to-many relationship with a many-to-many schema, but you can't go the other way around. So if you think you might need it, why not support it from the ground up.

  • Aug 31 2007 - 06:42:47 AM scottm

    Great article, What would be the best way to insert, update and delete data from this structure?

  • Aug 31 2007 - 10:09:37 PM Timothy Khouri

    It's nothing too complicated... You'd probably first insert a record into the "PhoneCalls" table first. Then, when the caller opens a "ticket", you'd insert a record into the "Tickets" table, retrieve the ID of that record (by selecting SCOPE_IDENTITY()) and then inserting a record of the two ID's into the linker table, "PhoneCalls_Tickets".

    If you wanted to do a DELETE, then you'd just have to make sure to delete the records out of all three tables. Let's say you wanted to delete all data for PhoneCall ID #7, then you'd do this:

    DELETE FROM dbo.Tickets WHERE ID IN (SELECT TicketID FROM dbo.PhoneCall_Tickets WHERE PhoneCallID = @PhoneCallID)

    DELETE FROM dbo.PhoneCall_Tickets WHERE PhoneCallID = @PhoneCallID

    DELETE FROM dbo.PhoneCalls WHERE ID = @PhoneCallID

    And that's it :)

  • Sep 05 2007 - 06:19:31 AM scottm

    Thanks for the info.

  • Nov 05 2007 - 07:29:59 PM hvolcy

    This article is just what I've been looking for. I tried the examples but it seems that the last select statment may be incorrect. Shouldn't it say "(SELECT TicketID..." instead of "(SELECT PhoneCallID..."

  • Mar 22 2008 - 08:22:51 PM Timothy Khouri

    I realize that this is way late for me to be fixing this... but you're right... I've fixed the typo.

You must be logged in to add comments. If you have not already done so, you can create an account here. If you already are a member, you first need to login before you can comment.

Developer / Architect / Author

People to Follow

Experts in the categories related to this article.

There are no experts related to this article.

Related Blogs

These are the most recent blog posts related to this article.

There are no blog posts related to this article.

Related Ads

SingingEels.com as of Nov 21 2008 - 08:02:33 PM - (0.0624984)