Earlier this week, I attended a meeting of the National Capital Oracle Users Group (
NATCAP-OUG) at which I saw a presentation about validating your database design. It got me to thinking about database design, why we should do it, and how to use it to improve applications. And it all starts with an Entity-Relationship Diagram (ERD).
ERD
So, here's an ERD for the Oracle HR sample schema:
I do an ERD very early in the requirements definition phase of an application, and I refine it throughout the process. Why? Because this can help you discover things about your subject area and use cases that you don't get any other way. After all, the application is designed to help your business collect information about the things they do, and to use that information to inform and govern what they do next. Without data, your application is just a game. Not that I have a problem with games. And even games need to collect information about score, progress, items the player's avatar is carrying, etc.
Unfortunately, a very important part of an ERD is missing from this one - the names of the relationships. Relationship names help you to read and understand your diagram, and particularly to read the diagram to your users. What? You don't read your ERDs to your users? You SHOULD, if just because your users shouldn't be expected to read and understand the diagram for themselves. So the rest of this post is to explain why and how to read an ERD to your users. A lot of what I'm going to say is not original with me. I read an article a long time ago called "Why we say Each" which has guided my database designs ever since. Unfortunately, I have somehow lost my copy of the article, and have forgotten who wrote it. So to the author of this article - thank you.
Reading the ERD
I'll start at the very top with the Departments entity:
- Each Department may be staffed by one or more Employees.
- Each Employee may work for one and only one Department.
"staffed by" and "work for" are two of the missing relationship names. Notice that I have a name for both sides of the relationship. This is important, because it will help us discover when we have a relationship wrong. The word, "may" is also important because it indicates that the relationship is optional. The use of "may" in the above two statements means that it is possible to have a Department that has no Employees, and it is possible to have an Employee who is not part of the staff of any Department.
Here you may begin to see why it is useful to read the ERD to your future application users. I would use this as an opportunity to ask them, "Is this correct - is it true that you can have a Department with no Employees? Is it possible to have an Employee who doesn't work for a Department?" Finally, there are the words, "one or more" and "one and only one". Once again, I can ask the users, "Can an Employee work for more than one Department at the same time?" Let's continue:
- Each Department must be managed by one and only one Employee.
- Each Employee may be the manager of one or more Departments.
Once again, I have added "managed by" and "manager of" as relationship names and I name each side of the relationship. I've arbitrarily decided that it is mandatory that there be a Department Manager, but I need to check this with my users. By the way, by making this a mandatory relationship, I seem to have decided that a Department must have at least one Employee, but notice that the manager of a Department does NOT need to "work for" that Department. In fact, since an Employee may manage one or more Departments, we can't force the manager to "work for" each Department he/she manages, UNLESS we change the relationship so that Employees may work for one or more Departments. Finally, we'll look at the self-referencing relationship in the Employee entity:
- Each Employee may manage one or more other Employees.
- Each Employee may be managed by one and only one other Employee.
These relationships are fairly straight forward, except we may want to ask the users if every Employee MUST have a manager. That would be possible, EXCEPT that there is probably at least one Employee, the President, who has no manager. Exceptions to the rules are possible, but we need to know about them and clearly define them. Also, see that I have added a rule in these relationship statements, the word "other". This is to make clear that an Employee may not manage her/himself. Or maybe that is the solution to the problem about the company President - he manages himself. By the way, did you notice that manager_id in the Employee entity and manager_id in the Department entity are not really the same thing, even though they probably will contain employee_ids as a foreign key? We may want to give these attributes different names.
Further Study
See what you can do with reading the rest of the diagram. Hope this has convinced you that database design is an important part of requirements definition, and that you need to read your ERDs to your users.
No comments:
Post a Comment