Wednesday, March 17, 2010

Database Concepts

General Concepts
A constraint is a property assigned to a column or the set of columns in a table that prevents certain types of inconsistent data values from being placed in the column(s). Constraints are used to enforce the data integrity. This ensures the accuracy and reliability of the data in the database. The following categories of the data integrity exist:
• Entity Integrity
• Domain Integrity
• Referential integrity
• User-Defined Integrity

Entity Integrity ensures that there are no duplicate rows in a table.
Domain Integrity enforces valid entries for a given column by restricting the type, the format, or the range of possible values.
Referential integrity ensures that rows cannot be deleted, which are used by other records (for example, corresponding data values between tables will be vital).
User-Defined Integrity enforces some specific business rules that do not fall into entity, domain, or referential integrity categories.

Each of these categories of the data integrity can be enforced by the appropriate constraints. Microsoft SQL Server supports the following constraints:
• PRIMARY KEY
• UNIQUE
• FOREIGN KEY
• CHECK
• NOT NULL

A PRIMARY KEY constraint is a unique identifier for a row within a database table. Every table should have a primary key constraint to uniquely identify each row and only one primary key constraint can be created for each table. The primary key constraints are used to enforce entity integrity.

A UNIQUE constraint enforces the uniqueness of the values in a set of columns, so no duplicate values are entered. The unique key constraints are used to enforce entity integrity as the primary key constraints.

A FOREIGN KEY constraint prevents any actions that would destroy link between tables with the corresponding data values. A foreign key in one table points to a primary key in another table. Foreign keys prevent actions that would leave rows with foreign key values when there are no primary keys with that value. The foreign key constraints are used to enforce referential integrity.

A CHECK constraint is used to limit the values that can be placed in a column. The check constraints are used to enforce domain integrity.

A NOT NULL constraint enforces that the column will not accept null values. The not null constraints are used to enforce domain integrity, as the check constraints.

You can create constraints when the table is created, as part of the table definition by using the CREATE TABLE statement.
Conclusion
Constraints are the built-in mechanism for enforcing data integrity. Using constraints is preferred to using triggers, rules, and defaults because built-in integrity features use much less overhead and perform faster than the ones you can create. When you write your own code to realize the same actions the constraints can make you can make some errors, so the constraints are not only faster, but also are more consistent and reliable. So, you should use triggers and rules only when the constraints do not provide all the needed functionality.
Difference between Primary key and Candidate Key
Candidate Key - A Candidate Key can be any column or a combination of columns that can qualify as unique key in database. There can be multiple Candidate Keys in one table. Each Candidate Key can qualify as Primary Key.
Primary Key - A Primary Key is a column or a combination of columns that uniquely identify a record. Only one Candidate Key can be Primary Key.

One needs to be very careful in selecting the Primary Key as an incorrect selection can adversely impact the database architect and future normalization. For a Candidate Key to qualify as a Primary Key, it should be Non-NULL and unique in any domain
Select a key that does not contain NULL
It may be possible that there are Candidate Keys that presently do not contain value (not null) but technically they can contain null. In this case, they will not qualify for Primary Key. In the following table structure, we can see that even though column [name] does not have any NULL value it does not qualify as it has the potential to contain NULL value in future.
Select a key that is unique and does not repeat
It may be possible that Candidate Keys that are unique at this moment may contain duplicate value. These kinds of Candidate Keys do not qualify for Primary Key. Let us understand this scenario by looking into the example given above. It is absolutely possible that two Manufacturers can create products with the same name; the resulting name will be a duplicate and only the name of the Manufacturer will differ in the table. This disqualifies Name in the table to be a Primary Key.
Make sure that Primary Key does not keep changing
This is not a hard and fast rule but rather a general recommendation: Primary Key values should not keep changing. It is quite convenient for a database if Primary Key is static. Primary Keys are referenced in numerous places in the database, from Index to Foreign Keys. If they keep changing then they can adversely affect database integrity, data statistics as well as internal of Indexes.
Summary
A table can have multiple Candidate Keys that are unique as single column or combined multiple columns to the table. They are all candidates for Primary Key. Candidate keys that follow all the three rules - 1) Not Null, 2) Unique Value in Table and 3) Static - are the best candidates for Primary Key. If there are multiple candidate keys that are satisfying the criteria for Primary Key, the decision should be made by experienced DBAs who should keep performance in mind.


Command Type
DML= Data manipulation language like insert, update, delete, select
DCL- Data control language like grant, revoke
DDL - Data Defination language - like create

No comments:

Post a Comment