Sunday, March 6, 2016

Designing a Database


Before analyzing most important factors to consider when designing a database, one must be aware of what is database design. It is the process of generating detailed data modal of a database. A data model organizes data elements and standardizes how the data elements relate to one another.
Data model may be conceptual, logical or physical in nature. A conceptual schema specifies the kinds of facts or propositions that can be expressed using the model. The logical data model captures the detailed business solution. The logical data model looks the same regardless of whether we are implementing in MongoDB or Oracle. This consists of descriptions of tables and columns, object oriented classes, and XML tags, among other things. Physical data model describes the physical means by which data are stored.
While designing a database, designer must follow these steps,
·         Determine the data to be stored in the database. This involves understanding the business and how the application is being proposed to behave.
·         Determine the relationships between the different data elements. Designer must be aware of and document how the business entities being discussed are interrelated.
·         Superimpose a logical structure upon the data on the basis of these relationships. Now designer must map business entities and logics to tables, views, primary key, foreign keys, normalization rules etc. In an Object database the storage objects correspond directly to the objects used by the Object-oriented programming language used to write the applications that will manage and access the data.
Some of the most important things to keep in mind while designing database are as follows:
1.      Understand Business
As mentioned above in steps of database design, understanding the business rules is most important, rest comes automatically by experience.
2.      Load on application and Volume of Data
You must be very much sure of how much load and concurrent users you are expecting. Do you need multiple servers or single server for both update and read? Always plan for much higher loads and data volume than anticipated as of today.
3.      Indexes
The application is going to be to more towards read operations or update/add? Indexes are likely to help more for a fast retrieval of records from a table. But if the update/add is going to be too much for the server to perform, it may actually reduce the performance.
4.      Normalization
The database designed must be structurally correct and optimal. Normalization rules may help a lot in this, but sometimes for the sake of performance, design being very specific to an application, these rules may be forgiven.
5.      Information  Integrity
Most of the big names database engines automatically enforce data integrity, but yes as a database designer, you may not be able to enforce certain business validation in DB itself, these must be clearly specified in the design documents. As a database designer, things like primary keys, foreign keys, transactions, triggers etc. are your responsibility.
6.      Security
Specifying Mode of authentication and hiding sensitive information via encryption is essential if business needs it. I have seen many deployments where multiple projects were using same credentials to access multiple databases. This is clearly not a standard practice to be followed. You must be aware of all the auditing features provided by the database engine being used.
7.      Backup and deployment policies
You must be aware of backup of deployment techniques and policies the organization is using, for whom you are designing the database.
8.      Programming platform
Database designer should be aware of the consuming application technology.
If you follow above mentioned thumb rules, most of the issues in production environment may be avoided even before they are encountered. Never forget those restless nights when a production issue comes and do the best you can well in time.
References
Chapple, M. (n.d.). Database Normalization Basics. Retrieved February 17, 2016, from http://databases.about.com/od/specificproducts/a/normalization.htm
Chuan, C. H. (2010, September). A Quick-Start Tutorial on Relational Database Design. Retrieved February 17, 2016, from http://www.ntu.edu.sg/home/ehchua/programming/sql/relational_database_design.html
Database design tutorial. (n.d.). Retrieved February 17, 2016, from http://en.tekstenuitleg.net/articles/software/database-design-tutorial/intro.html
Feasel, K. (2014, April 24). SQL Injection: How it Works and How to Thwart it. Retrieved February 17, 2016, from https://www.simple-talk.com/sql/database-administration/sql-injection-how-it-works-and-how-to-thwart-it/
IBM. (n.d.). DB2 Version 9.7 for Linux, UNIX, and Windows. Retrieved February 17, 2016, from http://www-01.ibm.com/support/knowledgecenter/SSEPGG_9.7.0/com.ibm.db2.luw.sql.ref.doc/doc/r0001029.html
Maulo, F., Bohlen, S., Maughan, J., Brown, R., Zaytsev, A., & Berggren, O. (2015, August 17). NHibernate (Version 4.0.4.4000) [Computer software]. Retrieved February 17, 2016, from https://www.nuget.org/packages/NHibernate/
Microsoft. (n.d.). Maximum Capacity Specifications for SQL Server. Retrieved February 17, 2016, from https://msdn.microsoft.com/en-us/library/ms143432(v=sql.120).aspx
Oracle Corporation. (n.d.). 8 Database Auditing: Security Considerations. Retrieved February 17, 2016, from https://docs.oracle.com/cd/B14117_01/network.101/b10773/auditing.htm
Oracle Corporation. (2016, February 5). MySQL (Version 5.7.11) [Computer software]. Retrieved from http://dev.mysql.com/doc/refman/5.7/en/introduction.html
Oracle Corporation. (n.d.). Oracle Database Limits. Retrieved February 17, 2016, from https://docs.oracle.com/cd/B19306_01/server.102/b14237/limits.htm#REFRN004
PostgreSQL Global Development Group. (2016, February 11). PostgreSQL (Version 9.5.1) [Computer software]. Retrieved from http://www.postgresql.org/about/
Rojansky, S., Lenngren, E., Figueiredo, F., Jr., Uno, K., Asher, J., Cooley, J., . . . Saito, H. (2015, September 18). Npgsql for Entity Framework (Version 2.2.7) [Computer software]. Retrieved from https://www.nuget.org/packages/Npgsql.EntityFramework/
SoftwareInsider. (n.d.). MySQL. Retrieved February 17, 2016, from http://database-management.softwareinsider.com/l/30/MySQL

Telly, M. (2009, February 24). What are the most important considerations when designing a database? Retrieved February 17, 2016, from http://stackoverflow.com/questions/580233/what-are-the-most-important-considerations-when-designing-a-database

Get or Post ? - Yes it does matter!!

Get and Post are two of the ways to request server from a client using Hypertext Transfer Protocol (HTTP).
Get
Get means retrieving information from server (in format as defined by the agreement,) identified based on Request-URI.
We may opt for Conditional get Request, where header may contain If-Modified-Since, If-Match, If-None-Match, or If-Range. This is to reduce load over the network. So, partial Get is also supported here with range header.
Important points to be noted here to choose Get or Post:
1.      As a general rule, in a typical form submission with METHOD="GET", the browser constructs a URL by taking the value of the action attribute, appending a “?” to it, then appending the form data set. The encoding type used in Get request may be “application/x-www-form-urlencoded”.
2.      GET requests can be cached. Also, you may bookmark and look into browser history to retrieve complete request later, but bookmarking and browser history is more of the client features and vary as per what client you are using to make Get request.
3.      Please note, only ASCII characters are allowed while using Get.
4.      You may not be able to hide sensitive information (query string parameters) and even if you use https, still the server logs will contain this information. So for transferring sensitive information Post discussed below is a better choice.
5.      Amount of information you may pass on to server will be limited in case of Get, URL length limit is 2083(1024 in certain cases).It is recommended to keep number of parameters in query string less than 2K, but some servers even handle up to 64K. All in all, you should have a justification like caching or anything like that to keep url’s so large if you want to stick to Get.
Off the topic, in PHP, you may use QUERY_STRING environment variable to retrieve the parameters passed in url. You may also use $_GET to get the array of sent data.
Post
In this data posted is part of message body. This is used for data submission and caching is not an option here unless the response includes appropriate Cache-Control or Expires header fields.
Important points to be noted here to choose Get or Post:
1.      As a general rule, in a typical form submission with METHOD="POST",  a POST request is sent, using the value of the action attribute and a message created according to the content type specified by the enctype attribute. The encoding type used in Post request may be “application/x-www-form-urlencoded or multipart/form-data. Use multipart encoding for binary data”. You may pass query string parameters in a Post request too, if you wish to.
2.      Typically caching is not an option here under common scenarios. The most common browsers don't support bookmarking or history of the complete request.
3.      There are no restrictions on data type in the Post request. Binary data is also allowed.
4.      Post is a little bit safer than Get, since parameters are in message body as a thumb rule. But in case of Get on https, web travel is equally safe. Still server logs may be one of the reasons here to switch to Post in case of sensitive information passed on.
5.      Length of information posted may be huge in case of Post.

Off the topic, in PHP, you may use $_POST to get the array of sent data, based on complexity of the data.

References
Conditional GET Request. (2005). Retrieved February 02, 2016, from https://ruturajv.wordpress.com/2005/12/27/conditional-get-request/
GET vs POST. (n.d.). Retrieved February 02, 2016, from http://www.diffen.com/difference/GET_(HTTP)_vs_POST_(HTTP)
HTTP Methods: GET vs. POST. (n.d.). Retrieved February 02, 2016, from http://www.w3schools.com/tags/ref_httpmethods.asp
HTTP/1.1: Method Definitions. (n.d.). Retrieved February 02, 2016, from https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html
Nottingham, M. (2012, September 24). Caching POST. Retrieved February 02, 2016, from https://www.mnot.net/blog/2012/09/24/caching_POST
PHP GET and POST Method. (n.d.). Retrieved February 02, 2016, from http://www.tutorialspoint.com/php/php_get_post.htm