Friday, September 23, 2016

Modeling Change: Time

  • ·         A constraint or data restriction that results from the time dimension. - Time-related Constraint
  • ·         Refers to a relationship that may or may not be transferable, depending on time.- Conditional Non-transferability
  • ·         Property of a relationship where an instance of A is related to an instance of B, and the association cannot be moved to another instance of B non- transferability
1.  Speeding Tickets
Each police officer may issue speeding tickets to motorists in an assigned area. Originally, the attribute date was modeled as part of the SPEEDING TICKET entity. However, the city police department wants to see if there is a relationship between weather and the frequency of speeding tickets -- do people drive faster on nice sunny days? Are there more tickets in hot weather or cool weather?


Modify the ERD to address this new requirement.



2. Examine the ERD that represents classroom assignments for different exams.


a. Why is start time part of the UID of ASSIGNMENT?
Each assignment identified by StartTime, amongst instances with same exam, date of exam, and class.

Say, no of students in class are more than no of computers available on which exam will be presented. So I have timeslots mentioned in assignment instances.

b. Name at least three time-related constraints. For example: End time must be later than start time. Indicate if the constraint represents conditional non-transferability.
  • ·         An exam cannot be unassigned b/w starttime and end time. This one is a conditional non-transferability.

StartTime field is not dateTime, its only time here e.g. 20:30, not 21September 2016 20:30. Same is true for rnd time.  [Why didn’t we store it as a datetime rather than date and time separately in two attributes?]
  • ·         Start time and end time are times occurring in date of exam.
  • ·         Also, Below mentioned ERD restrict exam to finish on the same day. Since if end time is on the next day, it is not recorded anywhere. [ better is store end datetime if I want to allow exam to continue to next day.]
  • ·         There must be at least one day and start time in that same day when an exam instance must be assigned to any class.





Modeling Historical Data

1. In the scenario below, what words or phrases indicate to you the need to track data that changes over time?

Actors at the local theater sign contracts that specify their pay and working conditions. These contracts are renewed annually. Each year the contracts may change through new language and pay rates. In order to resolve labor disputes, the theater must sometimes go back several years in order to see the contract provisions in place on a particular date.

2. Given the scenario below, construct a conceptual model.
The Seville, Spain soccer association is renovating their soccer arena. They are adding luxury boxes that will be offered to companies. These boxes contain varying numbers of seats. Each box is assigned a section number and level. The boxes are leased by the year, and the prices are set individually according to the location and number of seats in the box. The soccer association needs to keep track of the company name, address, phone, and contact person. Sometimes the companies also provide a logo. Companies that lease a box have the right to renew their leases each year, but they often do not. When a lease for a box opens up, the association contacts companies that previously held leases to see if they want to lease one again.
Assumptions:
  • ·         If a box is booked say in Nov, it doesn’t mean company gets it only for Nov & Dec, they should get it till next Nov.  “The boxes are leased by the year” only helps here, I need not store end date, just start date of lease is sufficient. End date is automatically same date next year minus 1day.
  • ·         Since corporate discounts are also given based on negotiations, price should be stored in lease instance. Some may suggest, keep base price in box and discount % in company instance, but problem here is that sometimes returning customers get extra discounts on lease renewals. So, better keep price decided for next year in Lease instance.
  • ·         So each company may have multiple lease instances. Lease is identified by box, company and dateofbooking.
  • ·         A lease must be undertaken by a company and must be for a box.
  • ·         A box may be booked multiple times in many years.
  • ·         Section is identified by level and sectionnumber [with same number every level has a section]
  • ·         Box is identified by boxNumber [ with same number every section has a box]


Arcs

  • ·         A curved line used to represent an exclusive relationship in an entity-relationship diagram - Arc
  • ·         A logical operator that returns a true value if one, but not both, of its operands is true. - Exclusive OR
  • ·         A relationship between one entity and two (or more) other entities when only one of the relationships can exist at a time. - Exclusive OR Relationship/ mutually exclusive relationship
  • ·         A restriction that applies to data, which is often dictated by the business rules. - constraint

1. A show ticket is purchased from an agent, the box office, or the Internet. A ticket has a description, an event, a date and a price. An agent has a name and a phone number. The box office has an address and a phone number. The Internet has a URL address.
Draw the entities and represent the mutually exclusive relationship.
Assume that all type of sellers it for same price.  One Ticket Instance must be booked by one and only of (one Agent Instance, one box office instance or one website instance)
2. Draw the MENU entity as a supertype of the PROMOTIONAL, REGULAR, and OTHER entities. The UID of MENU is code. MENU is related to FOOD ITEM through this relationship: each MENU may contain one or more FOOD ITEMs, and each FOOD ITEM must be listed on one and only one MENU. The UID of FOOD ITEM is a barred UID using its attribute “number”. Add appropriate attributes to the other entities and be ready to defend your choices.






Second Normal Form

  • The outcome of the second step of database normalization – Second Normal Form (2NF)
1. The UID of a library book includes its BOOK number and SHELF number. Does this ERD follow the rules of Second Normal Form? If you spot a violation, correct it.


Sugession: 2NF on shelf. [Assume that a book won’t have multiple author, even if it has we only track major one so, 1NF is not violated their, problem specifically targets 2NF here.]
2. .      Class Enrollment is the intersection entity that resolves the M:M between STUDENT and CLASS. Does the ERD follow the rules of Second Normal Form? If you spot a violation, correct it.

Suggestion:

3. A store can be located in several shopping malls, and a shopping mall may house several stores. To locate a particular store in a specific neighborhood, you will need to know the name and address of the nearby shopping mall, plus the name of the store. Does the ERD follow the rules of Second Normal Form? If you spot a violation, correct it.

Answer:
To locate a store, name and address of mall + store name is needed, then why location is present in store listing entity. Ok, let’s be realistic here, inside mall, where is store? In shop no 100. This is what here location means. I say no harm in storing a location value. Some may argue why location is not in store entity, what if this store has another branch in another mall and it is located in shop no 200 there?
 Now we come to store specialty, this may be the reason why I have this problem here. Store franchises follow exactly same recipes’ and if there is a specialty, then it’s of the whole chain. Since specialty is of store not only the one located in specific mall, I want to move it to Store Entity. 2NF wants specialty to be dependent on Store UID.



Third Normal Form

  • ·         The form of database normalization where all non-key fields are dependent on the key, the whole key, and nothing but the key. - Third Normal Form (3NF)
  • ·         A condition that exists when any attribute in an entity is dependent upon any other non-UID attribute in that entity.- Transitive Dependency

1.      Identify the transitive dependency in the model below. State which attributes violate Third Normal Form.

Store address being dependent on store name, is a transitive dependency and violates 3NF.


2. Define the rule of Third Normal Form.
  • ·         1NF is satisfied by resolving multi-valued attributes.
  • ·         2NF is satisfied by - 1NF  + Each non UID attribute is dependent on the whole UID,
  • ·         #NF is satisfied by - 2NF + there should be no transitive dependency present.

3. A color scheme for a car includes specifications for paint color for the body and the interior colors and materials. For example: The “Desert” color scheme includes silver paint and gray leather interior; the “Sunburst” color scheme includes gold paint and cream leather interior. Does the model below follow the rules of Third Normal Form? If you spot a violation, correct it.

Suggestion:

  • ·         Model I choose depend on manufacturer (make) – not all manufacturer make every model.
  • ·         Color scheme available really depends on model I choose to buy.
  • ·         Paint color and interior color depends on color scheme I choose.

Assumptions:
  • 1.      A model must have at least one color scheme designed for it
  • 2.      A Color scheme must be created with at least one model in mind.
  • 3.      Once car is manufactured, I can’t change scheme and model of car.



4. Assume the following business rules:
• An athlete employs one agent.
• An agent may work for one or more athletes.
• An athlete may play for one team.
• A team may have one or more players
Does the model below contain one or more transitive dependencies? Identify the attributes involved in the transitive dependencies.


The transitive dependencies identified:
  • ·         Team coach is dependent on team. And if player changes team, coach changes for player, he won't drag his coach with him to new team J
  • Agent commission is dependent on what agent player chooses, if he changes agent, commission may also change (unless agents have a labor union who decides there rates J ). Or if agent revises his commission, 'all the players using him' instances will have to be updated. 

5. Now that you have an idea of what makes a good UID, you need to be aware of the controversy, benefits, and difficulty of uniquely identifying someone. Consider the idea of a national ID card. What kinds of problems would the card create and what kinds of problems would it solve? If your country already uses a national ID card, what are the benefits and issues associated with this? In your opinion, could DNA mapping become the national ID card? Why or why not?

I am more satisfied with the concept of single identity of every person than afraid of. 
Why satisfied:
  • ·         In all government transactions I just need to give SSN and done, say no paper work while getting a new credit card.
  • ·         Say I update my name at one place, ideally it should get propagated everywhere ( in reality it’s not so simple, since theoretically possible in future, so putting here under why satisfied)

Why afraid of:
  • ·         If someone has my SSN, he can even get a credit card that is another story that all credit cards have fraud protections but still…
  • ·         It’s bad for me in a way that, only by knowing SSN they know how many payments of credit card I missed ( I didn’t do it intentionally J )

DNA thing is a good idea but we are far way behind as of today. It won’t be an artificial UID like SSN but costly to collect this data as of today.





Normalization and First Normal Form

  • ·         The outcome of the first step of database normalization. 1NF eliminates repeating groups by putting each into a separate table and connecting them with a 1:M relationship – First Normal Form ( 1NF)
  • ·         A series of steps followed to obtain a database design that allows for efficient access and storage of data in a relational database. These steps reduce data redundancy and the chances of data be-coming inconsistent. - Normalization

1. When checking a database model for first normal form, what are you doing exactly?
Check entities for multi-valued attributes. Verifying that each entity contains attributes that are appropriate for it; insuring that the entity has a UID; checking each attribute against its UID and verifying that the attribute only contains a single value.

2. What is the rule of 1NF in the normalization process?
First Normal Form requires that there be no multi-valued attributes and no repeating groups. To check for First Normal Form, validate that each attribute has a single value for each instance of the entity.
  • ·         identify the related data parts (attributes) and place them together in one entity; if multiple entities exist, attributes should be listed one time only and in the best possible location
  • ·         identify the UID of the entity
  • ·         remove any attribute that contains repeating groups (multiple values) and place it in its own entity; create a 1:M relationship between the two entities

3. Check to see if each ERD is in 1NF. If not, make the necessary changes to correct it.
a.

Suggestion: passenger is multi-valued. A passenger rides a bus, he comes out of bus and go in another bus, at same time he is in single bus. (I don’t see passenger must ride bus, he may ride anything else too, so dotted on passenger side also)

b.



Suggestion: dormitory may have multiple residents. Resident may live in dormitory or something special. Dormitory has multiple beds to accommodate many residents.


c.  


Suggestion: 1NF OK here.

 




UIDs and Normalization

  • ·         Unique identifier - UID
  • ·         A UID that is a single attribute. -  Simple UID
  • ·         A UID that is a combination of attributes. – Composite UID
  • ·         A UID that does not occur in the natural world but is created for identification purposes in a system. – Artificial UID
  • ·         One of several UIDs that could identify something. – Candidate UID
  • ·         A candidate UID that is the primary identifier of something. Primary UID
  • ·         A candidate UID that also identifies something, but is not the primary UID. – Secondary UID

1. Match the type of Unique Identifier to the entity.
a. Simple UID





b. Composite UID
c. Composite UID comprising an attribute and a relationship

d. Primary Key & Secondary Key

2. Consider an entity for Library Member. Sometimes an entity can have more than one candidate UID. Describe a scenario where you would want more than one candidate UID for a Library Member – why would you want to find the member in the database using alternative information?
In libraries here, we are issued a library card with unique member ID on it, but it is quite possible that, I lose my library card. But when we open library card, they took our address, name, email address and phone no.  With same address there could be multiple family members joining the same library system. Two members may have same name. But email address or mobile no should be sufficient to identify a person, so two secondary UID’s [candidate UID which is not marked primary] are Mobile no, email address.

3. Describe how you would identify the entities listed below, making up any attributes and relationships you consider appropriate. For example, for the Student entity you may say that the UID is comprised of the first name, last name, and address. Or you might suggest that the Student entity contains an artificial UID called student number.

a. A city
#City
#State
#Country
In final design, we may observe Hierarchal relationship, rather than city and country names stored in the same Entity. Like, #CityId and a barred relationship from State Entity making a composite UID

b. A contact person for a customer
#emailaddress
(#)directphonenumber
*name

c. A train
#TrainId
It’s artificial UID. I am not using factors like routes, start point, end point here, because I may want to run different trains on same routes on different days and timings. So schedule entity may use route and train from two entities.

d. A road
#RoadName
#City
#Sate
#Country
I never saw two roads in the same city which have same name.

e. A financial transaction (for example: transfer funds)
#SenderAccount
#SenderBank
#ReceiverAccount
#ReceiverBank
#Amount
In final design, there may be an artificial UID to identify transaction and nontransferable relationships from sender bank account and receiver bank account Entities, with amount marked mandatory. Sender and receiver bank accounts Entities will have barred relationships with bank entity.

f. An award (for example: movies)
#AwardID
*Name
○Description
AwardId is artificial UID here.
AwardId from AWARD Entity,  PersonId from PERSON Entity and year make UID for DISTRIBUTION Entity.

g. A painting
#PaintingName
#AuthorId from Author Entity as non-transferrable barred relationship.


4. Identify one of the issues with respect to a national identification system. Provide your point of view on the issue.
They use our SSN as UID. Its artificial UID, what if you know my SSN, think what all things you could do!!! It’s not profiling of my speech, not my fingerprint and not my eyes.

5. You have previously reviewed the video store business requirements stated below:
“I’m the owner of a small movie hire store. We have over 3,000 DVDs that we need to keep track of.
“Each of our DVDs has a disk number. For each movie, we need to know its title and category (e.g., comedy, suspense, drama, action, war, or sci-fi). Yes, we do have multiple copies of many of our movies. We give each movie a specific ID, and then we track which movie a disk contains. A disk may be either High Definition (HD) or BlueRay format. We always have at least one disk for each movie we track, and each disk is always a copy of a single, specific movie. Our disks are very big, although we don’t have any movies that require multiple disks.”
For this activity read the new business requirements below and resolve the M:M relationship. Identify UIDs for all the entities that require multiple disks.
“We are frequently asked for movies starring specific actors. John Wayne and Kathe-rine Hepburn are always popular. So we’d like to keep track of the star actors appearing in each movie. Not all of our movies have star actors. Customers like to know each actor’s “real” birth name and date of birth. We track only actors who appear in the movies in our inventory.
“We have lots of customers. We only rent DVDs to people who have joined our DVD club. To belong to our club, they must have good credit. For each club member, we’d like to keep the first and last name, current phone number, and current address. And, of course, each club member has a membership number.
“Then we need to keep track of what DVDs each customer currently has checked out. A customer may check out multiple DVDs at any given time. We just track current rentals. We don’t keep track of any rental histories.
Assumption:
  • ·         we’d like to:: I consider this not mandatory, real name may be different than popular name, but unknown. DOB of actor may be unknown.
  • ·         we’d like : I consider this must, because I am giving my property to the person. So while giving club membership, I will take all these details.
  • ·         Actors who don’t appear in a movie coming up in future/already in stock: We don’t track ( so solid line in purple ‘Present  in’ in diagram). Also, there may be a low budget  movie which don’t have any star actor which we want to track ( so dotted line in purple ‘performed by’ in diagram).
  • ·         I may have a movie, for which I don’t have a media available, but it is coming in stock, say tomorrow. If you want to reject this assumption, please consider pink dotted line in below diagram as solid.
  • ·         I assume that a movie belongs to only one category and must be under one category; for special movies, others is one instance of category Entity.
  • ·         Under IssueLog, I may have put a composite UID on mediaInventoryNumber, ClubmemberId and IssueTime. But, I have chosen artificial UID here a separate IssueLogId.
  • ·         Even though problem says, ‘Issue time’, ‘issued till’ and ‘is returned’ is not required here, simply delete the row in issue log on return, but it doesn’t make sense in real world, if you want to stick to what problem says, I don’t even need ISSUELOG Entity, Simply create a  relationship b/w MEDIA and Clubmember will be sufficient, optional on Media side (dotted on media side because my all movies are not out at all times J), many on media side ( crowfoot on media side since a customer may take multiple movies), optional on clubmember side ( dotted on clubmemberside, may take ‘not must’), single tow on clubmemberside ( a media instance cannot be taken by multiple customer instances at same time).
  • ·         Here we don’t track disk number but InventoryNumber,  an inventory may have multiple disks. But all media under same inventory number have same format, so no worries. And it won’t be rented partiallyJ.