Friday, September 16, 2016

Entities, Instances, Attributes and Identifiers

  • ·         Required – Mandatory
  • ·         Incapable of being perceived by the senses - Intangible
  • ·         Any combination of attributes and/or relationships that serves, in all cases, to uniquely identify an instance of an entity - Identifiers
  • ·         An occurrence or example of an entity - Instance
  • ·         A characteristic; something that describes, quantifies, or specifies an entity - Attribute
  • ·         Not required - Optional
  • ·         A named thing or category of things that is significant to the business and about which data must be known - Entity
  • ·         A value that is unavailable, unassigned, unknown, or empty; however, it is neither a zero nor a space - Null
  • ·         Highly changeable - Volatile
  • ·         Perceptible to the senses, especially the sense of touch - Tangible
  • ·         A classification identifying one of various types of data, stating the possible values for that type, the operations that can be done on that type, and the way the values of that type are stored – Data Type
  • ·         Can only have one value at any point for each instance in the entity – Single Valued


1.       Consider the entity STUDENT. You are all students in a class. However, each one of you is a unique instance of that entity. Your teacher will lead you through an exercise to demonstrate this.
Our attribute values are different, say first name, last name etc.
2.       Think about the last time you went to get a haircut, permanent, manicure, or other service from a barbershop or hair salon.

Discuss the business of a barbershop or hair salon. Name the entities that are the “main things” about this business. Give examples of instances for each entity.
a)      BARBER: name, specialty(haircut/threading/massage/other), time of working
b)      COUNTER: Type (main/ individual of barber’s/hair washing counter/Sample Product sales/others), color
c)       HAIR CUTTING DEVICE: Type (seasor/electric machine/other)

3. The goal of this activity is to learn to make a distinction between an entity, an attribute, and an instance of an entity.
List which of the following concepts you think is an:
• entity
• attribute
• instance.
If you determine that one is an entity, then give an example of an instance. If you deter-mine that one is an attribute or an instance, give a sample entity name. For the last three rows, find a concept that fits.
Concept
Entity? Attribute? Instance?
Example instance or Entity
Vehicle
Entity
corolla, school bus, civic
Diet Cola
Instance
Drink
Price
Attribute
Toy,Drink,Vehicle
Customer number
Attribute
customer
Green
Attribute
car, train, grass

4. The goal of this practice is to recognize attributes for an entity.
The three entities that play a role in a DJ business: SONG, EVENT, and CUSTOMER are listed as the first three column headings of the table below. The fourth column contains a collection of attributes. Use a check mark to indicate if the attribute listed could be an at-tribute for the entities listed. (For example, could Title be an attribute for Song, for Event, and for Customer?)
SONG
EVENT
CUSTOMER

Yes
Yes

Title
Yes
Yes

Description

Yes

Venue


Yes
First Name


Yes
Phone Number
Yes


Release date


Yes
Last Name
Yes
Yes

Type


Yes
Email address

5. From the list of attributes provided, identify which attribute could be the unique identifier of each entity.
Entity: STUDENT
Attributes: student id, first name, last name, address
Entity: MOVIE
Attributes: title, date released, producer, director
Entity: LOCKER
Attributes: size, location, number
6. Read the given business scenario and walk through the steps below.
“I’m the owner of a small movie rental store. We have over 3,000 movies that we need to keep track of.
“Each of our movies has a DVD or VHS tape number. For each movie, we need to know its title and category (e.g., comedy, suspense, drama, action, war, or sci-fi).
“Yes, we do have multiple copies of many of our movies.
“We give each movie a specific ID, and then track which DVD or VHS contains the movie. A movie can be either DVD or VHS format.
“We always have at least one DVD or VHS tape for each movie we track, and each DVD or VHS tape is always a copy of a single, specific movie.
“Our DVDs and VHS tapes are very long. We don’t have any movies that require multi-ple DVDs or VHS tapes.”
a.       Write a list of nouns used in the scenario.
MOVIE, DVD, VHS
b.      Name each entity.
MOVIE, CATEGORY, MEDIA, MEDIAFORMAT
c.       Is each instance of the entity uniquely identifiable? Which attribute or attributes could serve as the unique attribute to identify the entity?
MOVIE (Attributes:  #Id,* Title, * CategoryId, *Year ) - generally with same Title two movies may not be released in same year, but what if? So, Id is primary key. I won’t even make a composite unique key on Title and year. CategoryId is foreign key to CATEGORY (Id)
CATEGORY (Attributes: #Id, #Name).  Even though Name of category is uniquely identifiable, but in physical model, I will make ID as a serial and primary key. On Name will put unique constraint.
 MEDIA (Attributes: #Id, *FormatId, *MovieId),  Id is the unique serial primary key. FormatId is the foreign key to MEDIAFORMAT.
MEDIAFORMAT( Attributes : #Id, #Format)  FORMAT column will be unique and sufficient , under physical model, it may have three instances of MEDIAFORMAT (DVD, VHS, Others). But I will also create an Id column as primary key (+ Serial) to refer as foreign key in MEDIA.
7. Identify the business rules contained in the following scenario.
“We are frequently asked for movies starring specific actors. John Wayne and Julia Roberts are always popular. So we’d like to keep track of the star actors appearing in each movie. Not all of our movies have star actors. Customers like to know each actor’s “real” birth name and date of birth. We track only actors who appear in the movies in our inventory.”
This adds two more entities here one actor and one intersection entity, we need intersection entity, because we cannot stop one actor to work in multiple movies, and almost always a movie has multiple actors.
ACTOR(Attributes: #Id, *Name, ◦RealName, ◦DOB) it is quite possible that some stars are not revealing DOB and real name anywhere. Name (screen name) may be same for two actors
ACTORMOVIEMAP ( Attributes:  #ActorID, #MovieId, *IsMainCastingActor) We got barred relationships here, so combination of ActorId and MovieId will make composite primary key,  IsMainCastingActor is required to show main names on the movie posters, but in full casting list of movie other actor names may also be shown.

8. Identify the business rules contained in the following scenario.
“We have lots of customers. We rent videos only to people who have joined our ‘video club.’ To belong to our club, they must have good credit. For each club member, we’d like to keep their first and last name, current phone number, and current address. And, of course, each club member has a membership number.
CUSTOMER( Attributes:  #clubMemberID, *firstName, *lastName, *phone, *address), since if customer is not member of club, we don’t rent him, and club membered is always unique, it is right candidate to be primary key here.  “Must have good credit” need to be programmatically handled by monthly / quarterly jobs running on server. If not good credit found, send notice to customer and throw him out.
“Then we need to keep track of what movie each customer currently has checked out. A customer may check out multiple video tapes at any given time. We just track current rentals. We don’t keep track of any rental histories.”
This makes us to add an optional attribute to MEDIA entity called ◦RenterId which is a foreign key to customer table. It is optional because at a given point of time, all the media in store will not be rented out, if it does I will become billionaire soon. J.  RenterID can not be kept unique because a customer can rent out multiple media if he wish to.

9. Age is an example of what type of attribute?
Volatile

10.Birth date is an example of what type of attribute?
nonvolatile

11.Read the business scenario of a fast-food restaurant below:
We are a small fast food business. Our menu features food items that can be ordered by a customer. A customer places an order at the counter and indicates what food items he/she would like on that order.
Lately we've noticed that we have some regular customers, so we started asking them for information -- such as name and address, so we can mail them coupons when we have specials.
a.       Using the restaurant entities ORDER, FOOD ITEM, AND CUSTOMER, identify the attributes for each entity.
ORDER ( #Id, ◦CustId, *DateTimeOfOrder): some customers may not be present in CUSTOMER, so CustId is optional. CustId is foreign key to Customer, it is not unique because it can be Null or present in multiple orders. Id is UId  and right candidate for primary key here.
CUSTOMER( #Id, ◦Name, ◦Address). Current scenario says a customer comes into Customer collection only if he gives personal details: Name and Address which means, these two may be marked as mandatory. Just to cater future scenarios, I have kept name and address as optional. Id is UId and right candidate for primary key here.
FOODITEM ( #Id, *Name, *Price) price is not optional, for free food items, give value as zero. A food item cannot be without a name.  . Id is UId and right candidate for primary key here.
ORDERFOODMAP ( #OrderId,#ItemId, *Count) we have composite primary key ( and unique identifier) here on order and item Ids. One order can have multiple fooditems , even multiple instances of same food item.  Rather than repetition of items saved as separate rows, I will prefer to have mandatory count column.

b.      Where possible, indicate whether the attribute is mandatory or optional.
See above
c.       Where possible, pick out the UIDs for each entity
See above




Conceptual and Physical Models

  • A design for an object (a car, a house, a database, etc.) which includes implementation details such as size, volume, weight, etc. - Physical model
  •  A data model, usually represented by an entity-relationship diagram. conceptual model
  •  A collection of facts from which conclusions may be drawn. Data
  • The process of capturing the important concepts and rules that shape a business and depicting them visually on a conceptual model- data modelling

1.            You will be working in pairs for this activity.

•             One student describes his/her “dream house” while the other student attempts to draw it.

You can discuss specific details, but the student describing the house is not allowed to see what is being drawn until after time is called.

•             After sharing your drawing, describe the importance of accurately describing information requirements.
To accurately design physical model, capture all business rules and take in account rules governing the future system.

2.            Review the scenario below. Identify the conceptual model and the physical model from the scenario.
2
Conceptual:
Zoe was about to go into a store to purchase drinks for the birthday party scheduled for that evening. Zoe knows that she needs drinks for 48 people and is expecting the store to accept a check for payment and to provide her with some assistance carrying the product to her car. Zoe wants to have carbonated drinks, non-carbonated drinks, and sugar free drinks. She is expecting to purchase eight six-packs.
Physical:
 Zoe enters the store and discovers the entire drink distribution system is automated. She also discovers that the drinks come in varying package sizes and that she must choose the correct vend-ing option for the products to be disbursed. Drinks are packaged in four-packs, six-packs, and ten-packs.

3.            Provide five reasons for creating a conceptual data model.
1.            Describes exactly the information needs of the business
2.            Facilitates discussion
3.            Prevents mistakes and misunderstandings
4.            Forms important “ideal system” documentation
5.            Forms a sound basis for physical database design.


4. List two examples of conceptual models and physical models.

Conceptual Model:




Physical model:





Major Transformations in Computing

  • ·         The basic framework or features of a system - Infrastructure
  • ·         A global effort to develop an environment in which individual users can access computers, databases, and experimental facilities simply and transparently, without having to consider where those facilities are located – Cloud Computing
  • ·         The programs, routines, and symbolic languages that control the functioning of the hardware and direct its operation. -Software
  • ·         A computer and the associated physical equipment directly involved in the performance of data-processing or communications functions. -hardware
  • ·         Software designed to control the hardware of a specific data-processing system in order to allow users and application programs to make use of it. - Operating system
  • ·         A software program which carries out specific tasks on behalf of other computer users - application
  • ·         A workstation or desktop computer including a screen, keyboard, and mouse; communicates directly with the user - client
  • ·         A more powerful computer which accepts work re-quests from clients, does the work, and sends results back to the client - server

1. Provide a definition and an example of each of these:
a. Hardware
A computer and the associated physical equipment directly involved in the performance of data-processing or communications functions. E.g. RAM, processor, DVD drive etc
b. Operating system
Software designed to control the hardware of a specific data-processing system in order to allow users and application programs to make use of it. E.g. Linux, Windows 10 etc
c. Software
The programs, routines, and symbolic languages that control the functioning of the hardware and direct its operation. E.g. Browsers,  visual studio, notepad etc.
2. How has the major transformation of cell phones impacted day-to-day activities of a service repair company – what can the service repair person do today that they could not do before this transformation?
Now they can give support on phone. Even cross platform audio/video calling is available. This reduces the distance b/w remote support representative and customer

3. List three e-businesses that use database software and describe how the database software is being used.
·         Online book store
·         Online clothing store
·         Online personal shopping service
And so on
All of this kind of e-businesses now has a front end web application [deployed on front end server farms] exposed to external world, and data being stored in database servers [in farm]. Each and every bit of data which might be helpful to lure user to buy more is stored, their purchase history, their browsing behavior, payment information for easy checkout, addresses and such personal details. And so on.


4. Write down the steps of a simple credit card transaction. How many places does the in-formation go, and what happens if the transaction is not complete at any one of the steps? What role does a database play in this process?
  • ·         Making the Purchase. The customer finds a product that he or she likes and decides to make the purchase. The customer can use a credit card to pay for the item in the store, through an online payment gateway, by phone or by mail.
  • ·          Entering the Transaction. The credit card is swiped through a secure credit card terminal, or the card and transaction information is entered in. For e-commerce transactions, the cardholder keys in the payment option. Generally, online store gives the option to customer to save credit card details to their DB for future use, if customer wishes to.
  • ·         Transmitting the Data. The credit card data is transmitted for approval as the terminal, point of sales system or secure payment gateway is connected to the processing network.
  • ·         Approve or Decline. Once the data is transmitted, the credit card issuer can approve or decline the transaction. This is based on the validity of the card, the transaction, as well as the cardholder’s available funds. This is based on information retrieved from card provider database.
  • ·         Responding. If the transaction is approved, the processor and the merchant receive an authorization response (along with unique identifier). To identify this pending debit, a unique identifier is maintained in payment gateway and is mapped to entries in card provider database records too.
  • ·         Completing the Transaction. The merchant completes the transaction. And stores relevant information in orders in his DB.
  • ·         Submitting a Batch Closure. The merchant completes the credit card payment process at the end of the day with a batch closure. This closes out the transactions that have been processed on that day. The processor’s acquiring bank then collects the funds from the credit card issuers. This marks the transaction complete in merchant’s DB, payment gateway provider DB and card provider DB.
  • ·         Depositing the Funds. The processor’s acquiring bank then deposits the funds into the merchant’s business account. This typically takes up to 48 hours.

Each step mentioned above stores intermittent data in database, if the transaction is not complete, it is reverted back with help of this intermittent data. 
For example,
  • ·         I placed the order, did the payment online, but within few hours. I cancelled the order, generally before EOD. The merchant will remove that transaction from daily batch closure and the pending charges on my card will be rolled back after some days.
  • ·         Say the payment is declined by card, the merchant won’t get the authorization code back and order won’t be placed.




History of the Database

  • ·         A relational database chart that is used to map the information from the entity relationship diagram – Table Instance Chart
  • ·         The unique identifier for each row of data - Primary Key (unique +Identifier)
  • ·         Links data in one table to the data in a second table by referring to the PK column in the second table – foreign key
  • ·         Indicates if a column must contain a value - mandatory
  • ·         Indicates if the value in the column is unique within the table. - Unique
  • ·         Refers to the format and definition of the data in each column. – Data Type


1.      Complete the diagram by listing the important events in the evolution of the database. Draw a line from each event to the era in which it falls.

·         1960’s: Computers became cost effective for private companies, and storage capacity increases.
·         1970-72 – E.F. Codd proposes the relational model for databases, disconnecting the logical organization from the physical storage.
·         1976: P.Chen proposes the entity relationship model (ERM) for database design.
·         Early 1980’s: The first commercially available relational DB systems start to appear at  the beginning of 1980s with Orcale ver 2
·         Mid-1980s: structured query language becomes widely accepted standard
·         Early 1990s: industry trend changes Oracle survives many go away.
·         Mid 1990’s: www appears
·         Late 1990’s: Large investment in internet companies – helps tools-market boom in web/internet/DB connectors
·         Early 21st Century: Solid growth of DB applications continues.


2. True or False: The Personal Digital Assistant (PDA) was an important historical contribution in database development and design. True
3. Name two important historical contributions in database development and design.
·         1970-72 – E.F. Codd proposes the relational model for databases, disconnecting the logical organization from the physical storage.
·         1976: P.Chen proposes the entity relationship model (ERM) for database design.
4. If Building the database is the third major task in the database development process, then what are the first two major tasks?
·         Analyze – ( gives ERD)
·         Design – (Gives Table Instance Chart)




Data vs Information

  • ·         A collection of data arranged for ease and speed of search and retrieval. - database
  • ·         Raw material, from which you can draw conclusions - data
  • ·         Knowledge, intelligence, a particular piece of data with a special meaning or function. - Information

List specific data that the school collects about each student. What information could be gathered using all the student data?
·         Score in individual subjects > class average in individual subject / average as a whole > school average in individual subject / in all subjects
·         Address of students > maximum students from each state
And so on

a.    Write examples of data and information related to any of the sites. 

Database schema of imdb may look like i.imgur.com/pDq0n.png


Say there is a TV series called Mr. Robot

In the title table there may be a row which has title of an episode, season id , episode no and lot hell of data, but it is of no use without a proper query made to all tables

On the other hand if I query for keyword, say episode 5 season 2 , I will get what exactly happened in this season, what was the audience rating and so on, this might be an information worth reading for me.

a.    State why you believe data collections of this type could be important.

Information is something of importance, to spend my next 45 minutes watching a TV episode which is crap ( I don’t want that) – above mentioned data collection just helped me, since data is arranged in a structured manner, which may be queried to give useful information for me.



 3. Choose a website that displays data that might be stored in a database; find 10 data items on the site and list 5 pieces of information that could be gleaned from the data.

Let’s take example of auctions.godaddy.com , each row in each table mentioned below is data which has no value for me individually:
       I.            There might be a table called auction Table with a serial id as primary key, domain id being referred from domain table as foreign key , a seller id mapped to another table as foreign key , price and so on
    II.            There may be another table with seller details which are also referred in table in point 1.
 III.            There might be a table which has traffic details, each visit as a row, and a column called domain id which is mapped to domain table as foreign key
 IV.            The domain table might have domain id referred in point 1 and point 3, which is also the primary key. There might be a value column being updated by nightly or weekly evaluation jobs running on the backend, there might be a name field which also will be unique key. There might be also a tld column to group them by tld id, tld id may be a foreign key to TLD table.
    V.            There might be a TLD table with id as primary key referred in point 4 as foreign key. TLD name which may have unique key constraint, also a column to specify whether godaddy supports direct transfer or not (True/false)
 VI.            There might be a Bids table, with bdId as primary key, a foreign key reference to auctionTable to identify for which auction entry this bid was made, To identify type of bid there might be a bidcategoryId, which may be a foreign key to Bidcategory Table. There might be expireOn column also, if bidcategoryId referes to buy now, it might be set to some default value.
VII.            There will be a BidCategory Table which might have data like fixed bid, make offer, buy now and so on
VIII.            There might be table called featuredAuction, it could have ids referred as foreign key from auction table, start and end time of being featured.
 IX.            There might be a currency table to identify, the seller preffered currency details, which might have foregn key reference in point 2 . Say a seller row in point two refers to currency with id 10, it might be named as USD, and signature as $
    X.             There might be a table to identified types of bids allowed for an auction entry in point 1. Table in point 6 identifies what buyer selected, but to identify what all options a an auction entry offer, there might be a table which has each row as auctionId, AllowedBidType
And so on
list 5 pieces of information that could be gleaned from the data:
                   I.            Say I want to buy a domain which starts with “hemant” and is available under auction, I might several joins in the query to get this information from above tables.
                II.            Say I want to buy a domain with minimum price as $2 or max price as $100, I might extract this information from above tables rows.
             III.            Say I want a domain with exactly 5 characters, this information is possible to be extracted.
             IV.            Say I want to buy a domain with traffic more than 100, this information can be extracted here
                V.            Say I want to buy a domain with highest no of bids ( which I think make to valuable), I may extract this information from above tables.
And so on

4. Give examples of how data becomes information for these two industries:
• film/movie
Example: Some Entity storing a data row like, personId, MovieId  , role id is of no use to me at first look, but when queried using the right joins, it can tell me casting information of a movie and I can choose what movie I want to see, based on what Star character I like.
• hospital/healthcare
Example: Some Entity in hospital database may say, personId, VisitDate, MedicationId
It is of no use to doctor, but if queried in right way, it can tell doctor that, antiviral course of this patient is already complete and no need to give more anti-viral medicine. This is the real information which saves patient from overdose.

5. Using this Lunch Room Data Report, answer the questions that follow.



a.    What does this report mean?
If I sum up first two columns vs last three columns, - they are equal, so, I got to know:

·         Date wise, it is trying to record sales, how many items sold to student, how many to Faculty.
·         Date wise, it wants to store, sale of each type of store available over the counter in a festival for fund raising.

b.    What data was collected?

See above

c.    What information does this table provide from the data collected?

See above

d.    How do you think this information is used by those reading the report?
See above

e.    Generate at least two conclusions based on the data provided.

See above

     f. Generate at least two questions that you would ask about the data provided.
·         On a day say, 4 Dec 2003, Pizza bar sold 126 items, out of these 126, how many sold to students?
·         On a day say, 6 Dec 2003, Soup/Salad Bar sold 30 items, but on 02 Dec 2003, it sold 63 items, what food item declined in sale, what went wrong with which food item ?