University of Cincinnati
College of Evening and Continuing Education
Database Management Systems I
(30-IT-451-002)
Database Design Methodology Summary
This page gives a summary of the
Database Design Methodology for Relational Databases and key success factors.
Critical Success Factors in Database Design
The following guidelines may
prove to be critical to the success of database design:
- Work interactively with the users as much as possible.
- Follow a structured methodology throughout the data modeling process.
- Employ a data-driven approach.
- Incorporate structural and integrity considerations into the data models.
- Combine conceptualization, normalization, and transaction validation
techniques into the data modeling methodology.
- Use diagrams to represent as much of the data models as possible.
- Use a Database Design Language (DBDL) to represent additional data
semantics.
- Build a data dictionary to supplement the data model diagrams.
- Be willing to repeat steps.
Database Design Phases
- conceptual,
- logical, and
- physical database design.
The steps involved in the main phases of
the database design methodology are outlined below.
Step 1 Build Local Conceptual Data Model
for Each User View
Build a local conceptual data model of an enterprise for each specific user view.
Step 1.1 Identify entity types
Identify the main entity types in
the user's view of the enterprise. Document entity types.
Step 1.2 Identify relationship types
Identify the important
relationships that exist between the entity types that we have identified.
Determine the cardinality and participation constraints of relationship types.
Document relationship types. Use Entity-Relationship (ER) modeling where
necessary.
Step 1.3 Identify and associate attributes with entity or relationship
types
Associate attributes with the appropriate entity or relationship
types. Identify simple and composite attributes, single-valued and multi-valued
attributes, and derived attributes. Document attributes, capturing information
such as:
- Attribute name and description.
- Any aliases or synonyms by which the attribute is known.
- Data type and length.
- Default values for the attribute (if specified).
- Whether the attribute must always be specified (in other words, whether
the attribute allows or disallows nulls).
- Whether the attribute is composite and if so, what are the simple
attributes that make up the composite attribute.
- Whether the attribute is derived and if so, how it should be computed.
- Whether the attribute is multi-valued.
Step 1.4 Determine attribute domains
Determine domains for the
attributes in the local conceptual model. Document attribute domains.
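For example, the 1992 SQL standard (SQL2) lets a domain be defined once and
reused across relations; a minimal sketch (the domain and table names are
illustrative, not from the course material):

    CREATE DOMAIN BranchNumber AS CHAR(4)
        CHECK (VALUE LIKE 'B%');        -- every branch number begins with 'B'

    CREATE TABLE Branch (
        branchNo BranchNumber NOT NULL  -- the column inherits the domain's constraint
    );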
Step 1.5 Determine candidate and primary key attributes
Identify the
candidate key(s) for each entity and, if there is more than one candidate key,
choose one to be the primary key. Document primary and alternate keys for each
strong entity.
A candidate key is an attribute or minimal set of attributes of an
entity that uniquely identifies each occurrence of that entity. We may
identify more than one candidate key. However, in this case, we must
choose one to be the primary key; the remaining candidate keys are called
alternate keys. When choosing a primary key from among the candidate keys,
use the following guidelines to help make the selection:
- The candidate key with the minimal set of attributes.
- The candidate key that is less likely to have its values changed.
- The candidate key that is less likely to lose uniqueness in the future.
- The candidate key with fewest characters (for those with textual
attribute(s)).
- The candidate key that is easiest to use from the users' point of view.
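In SQL terms, the chosen primary key becomes a PRIMARY KEY constraint, and each
remaining candidate key is documented as an alternate key and enforced with
UNIQUE plus NOT NULL. A minimal sketch with a hypothetical Staff entity (the
attributes are assumptions for illustration):

    CREATE TABLE Staff (
        staffNo CHAR(5)     NOT NULL,   -- chosen primary key: short and stable
        ssn     CHAR(9)     NOT NULL,   -- remaining candidate key, kept as alternate key
        name    VARCHAR(50) NOT NULL,
        PRIMARY KEY (staffNo),
        UNIQUE (ssn)                    -- alternate key (AK)
    );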
Step 1.6 Consider use of enhanced modeling concepts
Identify
superclass and subclass entity types, where appropriate.
Step 1.7 Check model for redundancy
Reexamine one-to-one relationships
and remove redundant relationships.
Step 1.8 Validate local conceptual model against user transactions
Check that the local conceptual data model supports the transactions required
by this user view.
Step 1.9 Review local conceptual data model with user
Review the local
conceptual data model with the user to ensure that the model is a 'true'
representation of the user's view of the enterprise.
Before completing Step 1, we should review the local conceptual data model
with the user. The conceptual data model includes the ER diagram and the
supporting documentation that describes the data model. If any anomalies are
present in the data model, we must make the appropriate changes, which may
require repeating the previous step(s).
We repeat this process until the user is prepared to 'sign off' the model as
being a 'true' representation of the part of the enterprise that we are
attempting to model.
Step 2 Build and Validate Local Logical
Data Model for each User View
Build a logical data model based on the
conceptual data model for each user view of the enterprise, and then validate
the model using the technique of normalization and against the required
transactions.
- A logical database schema is a model of the structures in a DBMS.
- Logical design is the process of defining a system's data requirements and
grouping elements into logical units.
Step 2.1 Map local conceptual data model to local logical data
model
Refine the local conceptual data model to remove undesirable features
and to map this model to a local logical data model. Remove the following by
decomposing the attribute, entity, or relationship into an intermediate entity:
- M:N relationships.
- Complex relationships.
- Recursive relationships.
- Multi-valued attributes.
- Relationships with attributes.
Re-examine relationships to see if
there are redundancies.
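For instance, an M:N 'Student enrolls in Course' relationship, together with
its own grade attribute, can be decomposed into two 1:M relationships through
an intermediate entity (all names here are illustrative, not from the course
material):

    -- Minimal assumed base relations:
    CREATE TABLE Student (studentNo CHAR(7) PRIMARY KEY);
    CREATE TABLE Course  (courseNo  CHAR(7) PRIMARY KEY);

    -- The M:N relationship becomes an intermediate entity:
    CREATE TABLE Enrollment (
        studentNo CHAR(7) NOT NULL REFERENCES Student(studentNo),
        courseNo  CHAR(7) NOT NULL REFERENCES Course(courseNo),
        grade     CHAR(2),                  -- the relationship's own attribute lives here
        PRIMARY KEY (studentNo, courseNo)   -- each student/course pair appears once
    );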
Step 2.2 Derive relations from local logical data model
Derive relations
from the local logical data model to represent the entities and relationships
described in the user's view of the enterprise. Document relations and foreign
key attributes. Also, document any new primary or candidate keys that have been
formed as a result of the process of deriving relations from the logical data
model. [Turn the relationships into entities]
Step 2.3 Validate model using normalization
Validate a local logical
data model using the technique of normalization. The objective of this step is
to ensure that each relation derived from the logical data model is in at least
Boyce-Codd Normal Form (BCNF). [Most just validate to 3rd normal form: 1st
normal form removes repeating groups, 2nd normal form removes partial
dependencies, and 3rd normal form removes transitive dependencies.]
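As an illustration (the relation and attributes are assumptions), a Staff
relation in which staffNo determines branchNo, and branchNo in turn determines
branchAddress, contains a transitive dependency; reaching 3NF splits it in two:

    -- Before: Staff(staffNo, name, branchNo, branchAddress)
    --         branchAddress depends on branchNo, not directly on the key staffNo.

    -- After removing the transitive dependency:
    CREATE TABLE Branch (
        branchNo      CHAR(4)     NOT NULL PRIMARY KEY,
        branchAddress VARCHAR(80) NOT NULL
    );

    CREATE TABLE Staff (
        staffNo  CHAR(5)     NOT NULL PRIMARY KEY,
        name     VARCHAR(50) NOT NULL,
        branchNo CHAR(4)     NOT NULL REFERENCES Branch(branchNo)
    );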
Step 2.4 Validate model against user transactions
Ensure that the
logical data model supports the transactions that are required by the user view.
The transactions that are required by each user view can be determined from the
user's requirements specification. Using the ER diagram, the data dictionary and
the primary key/foreign key links shown in the relations, attempt to perform
the operations manually. [Use CASE tool to prototype db tables, use sample data
to validate]
Step 2.5 Define integrity constraints
Identify the integrity constraints
given in the user's view of the enterprise. These include specifying the
required data, attribute domain constraints, entity integrity, referential
integrity, and enterprise constraints. Document all integrity constraints.
[Result is a high level view of all constraints]
How do we ensure referential integrity?
See Cases 1 to 6 (pages 458-459).
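In SQL, those cases are handled by choosing referential actions on each foreign
key; a sketch (the PropertyForRent and Staff relations are illustrative):

    CREATE TABLE PropertyForRent (
        propertyNo CHAR(5) NOT NULL PRIMARY KEY,
        staffNo    CHAR(5),                       -- nullable: a property may be unassigned
        FOREIGN KEY (staffNo) REFERENCES Staff(staffNo)
            ON DELETE SET NULL    -- deleting a staff member leaves the property unassigned
            ON UPDATE CASCADE     -- renumbering a staff member propagates automatically
    );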
Step 2.6 Review local logical data model with user
Ensure that the local
logical data model is a true representation of the user view.
Step 3 Build and Validate Global Logical Data Model
Combine the
individual local logical data models into a single global logical data model
that can be used to represent the part of the enterprise that we are interested
in modeling.
Step 3.1 Merge local logical data models into global model
Merge the
individual local logical data models into a single global logical data model of
the enterprise. Some typical tasks in this approach are as follows:
- Review names of entities and their primary keys.
- Review the names of relationships.
- Merge entities from the local views.
- Include (without merging) entities unique to each local view.
- Merge relationships from the local views.
- Include (without merging) relationships unique to each local view.
- Check for missing entities and relationships.
- Check foreign keys.
- Draw the global logical data model.
- Update the documentation.
Step 3.2 Validate global logical data model
Validate the global logical
data model using normalization and against the required transactions, if
necessary. This step is equivalent to Steps 2.3 and 2.4, where we validated each
local logical data model.
Step 3.3 Check for future growth
Determine whether there are any
significant changes likely in the foreseeable future, and assess whether the
global logical data model can accommodate these changes.
Step 3.4 Review global logical data model with users
Ensure that the
global logical data model is a true representation of the enterprise.
Step 4 Translate Global Logical Data Model for Target DBMS
Produce a
basic working relational database schema from the global logical data model.
In this step you will find out:
- Whether the system supports the definition of primary keys, foreign keys,
and alternate keys.
- Whether the system supports the definition of required data (that is,
whether the system allows attributes to be defined as NOT NULL).
- Whether the system supports the definition of enterprise constraints.
- How to create base relations.
Step 4.1 Design base relations for target DBMS
Decide how to represent
the base relations we have identified in the global logical data model in the
target DBMS. Document design of relations. Collate and assimilate the
information about relations produced during logical data modeling. For each
relation identified
in the global logical data model, we have a definition
consisting of:
- The name of the relation.
- A list of simple attributes in brackets.
- The primary key and, where appropriate, alternate keys (AK) and
foreign keys (FK).
- Integrity constraints for any foreign keys identified.
From the data dictionary, we also have for each attribute:
- Its domain, consisting of a data type, length, and any constraints on the
domain.
- An optional default value for the attribute.
- Whether the attribute can hold nulls.
- Whether the attribute is derived and, if so, how it should be computed.
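Translating one such definition into SQL might look like the sketch below (the
Staff relation, its default value, and the Branch relation it references are
assumptions for illustration):

    CREATE TABLE Staff (
        staffNo  CHAR(5)      NOT NULL,                      -- primary key
        name     VARCHAR(50)  NOT NULL,
        position VARCHAR(20)  NOT NULL DEFAULT 'Assistant',  -- documented default value
        salary   DECIMAL(8,2),                               -- nulls allowed
        branchNo CHAR(4)      NOT NULL,                      -- foreign key
        PRIMARY KEY (staffNo),
        FOREIGN KEY (branchNo) REFERENCES Branch(branchNo)
    );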
Step 4.2 Design representation of derived data
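One common representation is to compute derived data on demand rather than
store it. For example, a derived 'number of staff at each branch' attribute
could be kept out of the base relations and exposed as a view (a sketch,
assuming the illustrative Staff relation above):

    CREATE VIEW BranchStaffCount AS
        SELECT branchNo, COUNT(*) AS staffCount  -- derived attribute, computed at query time
        FROM Staff
        GROUP BY branchNo;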
Step 4.3 Design enterprise constraints for target DBMS
Design the
enterprise constraint rules for the target DBMS. Document the design of
enterprise constraints. Particular ways to create relations and integrity
constraints are:
- The 1992 ISO SQL standard (SQL2).
- Triggers.
- INGRES 6.4.
- Unique indexes.
- Application code (stored procedures, GUI validation, ...).
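For example, a rule such as 'no member of staff may manage more than 100
properties at one time' could be written as an SQL2 assertion; few systems
implement CREATE ASSERTION, so in practice the same rule is often coded as a
trigger or in application code (the relation names continue the earlier
illustration):

    CREATE ASSERTION StaffNotOverloaded
        CHECK (NOT EXISTS (
            SELECT staffNo
            FROM PropertyForRent
            GROUP BY staffNo
            HAVING COUNT(*) > 100    -- enterprise rule: at most 100 properties per staff
        ));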
Step 5 Design Physical
Representation
Determine the file organizations and access methods that will
be used to store the base relations: that is, the way in which relations and
tuples will be held on secondary storage. A physical database schema is a plan
of how to store data on a particular system.
Step 5.1 Analyze transactions
Understand the functionality of the
transactions that will run on the database and analyze the important
transactions.
For each transaction, we should determine:
- The expected frequency at which the transaction will run.
- The relations and attributes accessed by the transaction and the type of
access; that is, query, insert, update, or delete.
- The attributes used in any predicates (in SQL, the predicates are the
conditions specified in the WHERE clause). Check whether the predicates
involve pattern matching, range searches, or exact match key retrieval.
- For a query, the attributes that are involved in the join of two or more
relations.
- The time constraints imposed on the transaction.
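For example, analyzing the query below would record an exact-match predicate on
branchNo, a join on staffNo, and a read-only access pattern (the relations are
the illustrative ones used earlier):

    SELECT s.staffNo, s.name, COUNT(p.propertyNo) AS managed
    FROM Staff s
    JOIN PropertyForRent p ON p.staffNo = s.staffNo  -- join attribute: staffNo
    WHERE s.branchNo = 'B003'                        -- exact-match predicate
    GROUP BY s.staffNo, s.name;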
Step 5.2 Choose file organizations
Determine an efficient file
organization for each base relation.
Choices of file organization include
the following types:
- Heap.
- Hash.
- Indexed Sequential Access Method (ISAM).
- B+-Tree.
Step 5.3 Choose secondary indexes
Determine whether adding secondary
indexes will improve the performance of the system. Secondary indexes provide a
mechanism for specifying an additional key for a base relation that can be used
to retrieve data more efficiently.
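Continuing the illustration, if transaction analysis shows frequent exact-match
retrieval of Staff by branchNo, a secondary index on that attribute is a likely
candidate:

    CREATE INDEX idxStaffBranch ON Staff (branchNo);  -- speeds WHERE branchNo = ... lookups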
Step 5.4 Estimate disk space requirements
Estimate the amount of
disk space that will be required by the database.
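A rough first estimate multiplies tuple size by expected cardinality; the
figures below are assumptions for illustration:

    Staff tuple ~ 5 + 50 + 20 + 5 + 4 bytes  ~ 84 bytes
    10,000 staff x 84 bytes                  ~ 840 KB, before indexes and DBMS overhead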
Step 6 Design User Views
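A user view can often be implemented directly as an SQL view over the base
relations; a sketch (assuming the illustrative Staff relation):

    CREATE VIEW StaffContact AS
        SELECT staffNo, name, branchNo  -- omits salary for non-payroll users
        FROM Staff;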
Step 7 Design Security Mechanisms
Design the security measures for the
database implementation as specified by the users.
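In a relational DBMS, much of this reduces to granting SQL privileges, ideally
on views rather than base relations; a sketch continuing the example above (the
user name is illustrative):

    GRANT SELECT ON StaffContact TO receptionist;  -- read-only, no direct access to Staff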
Step 8 Consider the Introduction of Controlled Redundancy
Determine
whether introducing redundancy in a controlled manner by relaxing the
normalization rules will improve the performance of the system. Consider
introducing derived data and duplicating attributes or joining relations
together.
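For example, duplicating a frequently joined attribute trades update complexity
for read performance (a sketch; the column is illustrative, and application
code must now keep the copy consistent):

    -- Copy the branch's city into Staff so common reports avoid the join to Branch:
    ALTER TABLE Staff ADD COLUMN branchCity VARCHAR(30);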
Step 9 Monitor and Tune the Operational System
Monitor the operational
system and improve the performance of the system to correct inappropriate design
decisions or reflect changing requirements.