Sunday, March 6, 2016

Designing a Database


Before analyzing most important factors to consider when designing a database, one must be aware of what is database design. It is the process of generating detailed data modal of a database. A data model organizes data elements and standardizes how the data elements relate to one another.
Data model may be conceptual, logical or physical in nature. A conceptual schema specifies the kinds of facts or propositions that can be expressed using the model. The logical data model captures the detailed business solution. The logical data model looks the same regardless of whether we are implementing in MongoDB or Oracle. This consists of descriptions of tables and columns, object oriented classes, and XML tags, among other things. Physical data model describes the physical means by which data are stored.
While designing a database, designer must follow these steps,
·         Determine the data to be stored in the database. This involves understanding the business and how the application is being proposed to behave.
·         Determine the relationships between the different data elements. Designer must be aware of and document how the business entities being discussed are interrelated.
·         Superimpose a logical structure upon the data on the basis of these relationships. Now designer must map business entities and logics to tables, views, primary key, foreign keys, normalization rules etc. In an Object database the storage objects correspond directly to the objects used by the Object-oriented programming language used to write the applications that will manage and access the data.
Some of the most important things to keep in mind while designing database are as follows:
1.      Understand Business
As mentioned above in steps of database design, understanding the business rules is most important, rest comes automatically by experience.
2.      Load on application and Volume of Data
You must be very much sure of how much load and concurrent users you are expecting. Do you need multiple servers or single server for both update and read? Always plan for much higher loads and data volume than anticipated as of today.
3.      Indexes
The application is going to be to more towards read operations or update/add? Indexes are likely to help more for a fast retrieval of records from a table. But if the update/add is going to be too much for the server to perform, it may actually reduce the performance.
4.      Normalization
The database designed must be structurally correct and optimal. Normalization rules may help a lot in this, but sometimes for the sake of performance, design being very specific to an application, these rules may be forgiven.
5.      Information  Integrity
Most of the big names database engines automatically enforce data integrity, but yes as a database designer, you may not be able to enforce certain business validation in DB itself, these must be clearly specified in the design documents. As a database designer, things like primary keys, foreign keys, transactions, triggers etc. are your responsibility.
6.      Security
Specifying Mode of authentication and hiding sensitive information via encryption is essential if business needs it. I have seen many deployments where multiple projects were using same credentials to access multiple databases. This is clearly not a standard practice to be followed. You must be aware of all the auditing features provided by the database engine being used.
7.      Backup and deployment policies
You must be aware of backup of deployment techniques and policies the organization is using, for whom you are designing the database.
8.      Programming platform
Database designer should be aware of the consuming application technology.
If you follow above mentioned thumb rules, most of the issues in production environment may be avoided even before they are encountered. Never forget those restless nights when a production issue comes and do the best you can well in time.
References
Chapple, M. (n.d.). Database Normalization Basics. Retrieved February 17, 2016, from http://databases.about.com/od/specificproducts/a/normalization.htm
Chuan, C. H. (2010, September). A Quick-Start Tutorial on Relational Database Design. Retrieved February 17, 2016, from http://www.ntu.edu.sg/home/ehchua/programming/sql/relational_database_design.html
Database design tutorial. (n.d.). Retrieved February 17, 2016, from http://en.tekstenuitleg.net/articles/software/database-design-tutorial/intro.html
Feasel, K. (2014, April 24). SQL Injection: How it Works and How to Thwart it. Retrieved February 17, 2016, from https://www.simple-talk.com/sql/database-administration/sql-injection-how-it-works-and-how-to-thwart-it/
IBM. (n.d.). DB2 Version 9.7 for Linux, UNIX, and Windows. Retrieved February 17, 2016, from http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r0001029.html
Maulo, F., Bohlen, S., Maughan, J., Brown, R., Zaytsev, A., & Berggren, O. (2015, August 17). NHibernate (Version 4.0.4.4000) [Computer software]. Retrieved February 17, 2016, from https://www.nuget.org/packages/NHibernate/
Microsoft. (n.d.). Maximum Capacity Specifications for SQL Server. Retrieved February 17, 2016, from https://msdn.microsoft.com/en-us/library/ms143432(v=sql.120).aspx
Oracle Corporation. (n.d.). 8 Database Auditing: Security Considerations. Retrieved February 17, 2016, from https://docs.oracle.com/cd/B14117_01/network.101/b10773/auditing.htm
Oracle Corporation. (2016, February 5). MySQL (Version 5.7.11) [Computer software]. Retrieved from http://dev.mysql.com/doc/refman/5.7/en/introduction.html
Oracle Corporation. (n.d.). Oracle Database Limits. Retrieved February 17, 2016, from https://docs.oracle.com/cd/B19306_01/server.102/b14237/limits.htm#REFRN004
PostgreSQL Global Development Group. (2016, February 11). PostgreSQL (Version 9.5.1) [Computer software]. Retrieved from http://www.postgresql.org/about/
Rojansky, S., Lenngren, E., Figueiredo, F., Jr., Uno, K., Asher, J., Cooley, J., . . . Saito, H. (2015, September 18). Npgsql for Entity Framework (Version 2.2.7) [Computer software]. Retrieved from https://www.nuget.org/packages/Npgsql.EntityFramework/
SoftwareInsider. (n.d.). MySQL. Retrieved February 17, 2016, from http://database-management.softwareinsider.com/l/30/MySQL

Telly, M. (2009, February 24). What are the most important considerations when designing a database? Retrieved February 17, 2016, from http://stackoverflow.com/questions/580233/what-are-the-most-important-considerations-when-designing-a-database

1 comment:

  1. COLLAPSE
    One important thing I forgot to mention here in my post : beware of database locks, they must be planned for very carefully, especially when you are doing a transaction.


    Suppose in production, you are trying to retrieve a record which is being edited right now. Or may be multiple users want to edit same record at the same point of time. These cases screw up live if not planned properly your design.

    In pessimistic locking a record or page is locked immediately when the lock is requested, while in an optimistic lock the record or page is only locked when the changes made to that record are updated. The latter situation is only appropriate when there is less chance of someone needing to access the record while it is locked; otherwise it cannot be certain that the update will succeed because the attempt to update the record will fail if another user updates the record first. With pessimistic locking it is guaranteed that the record will be updated.

    Suggested read on this : https://technet.microsoft.com/en-us/library/ms189122(v=SQL.105).aspx

    ReplyDelete