Hemant's Blogs

Saturday, October 15, 2016

25 Set Operators

· operator that returns all rows from both tables and eliminates duplicates-UNION
· columns that were made up to match queries in another table that are not in both tables-TO_CHAR(NULL) or TO_DATE(NULL) or TO_NUMBER(NULL) –matching the select list
· operator that returns all rows from both tables, including duplicates-UNION ALL
· used to combine results into one single result from multiple SELECT statements-set operators
· operator that returns rows that are unique to each table-MINUS
· operator that returns rows common to both tables-INTERSECT

1. Name the different Set operators?

UNION

UNION ALL

MINUS

INTERSECT

2. Write one query to return the employee_id, job_id, hire_date, and department_id of all employees and a second query listing employee_id, job_id, start_date, and department_id from the job_hist table and combine the results as one single output. Make sure you sup-press duplicates in the output.

SELECT employee_id, job_id, hire_date, TO_DATE(NULL) start_date, department_id

FROM employees

UNION

SELECT employee_id, job_id, TO_DATE(NULL), start_date, department_id

FROM job_history

ORDER BY employee_id, hire_date, start_date NULLS FIRST;

But after reading next problem I changed it to:

SELECT employee_id, job_id, hire_date, department_id

FROM employees

UNION

SELECT employee_id, job_id, start_date, department_id

FROM job_history

ORDER BY employee_id, hire_date;

29 rows returned

3. Amend the previous statement to not suppress duplicates and examine the output. How many extra rows did you get returned and which were they? Sort the output by employee_id to make it easier to spot.

SELECT employee_id, job_id, hire_date, department_id

FROM employees

UNION ALL

SELECT employee_id, job_id, start_date, department_id

FROM job_history

ORDER BY employee_id, hire_date;

30 rows returned

The duplicate:

This can be verified by:

SELECT employee_id, job_id, hire_date, department_id, COUNT(*)

FROM

(SELECT employee_id, job_id, hire_date, department_id

FROM employees

UNION ALL

SELECT employee_id, job_id, start_date, department_id

FROM job_history)

GROUP BY employee_id, job_id, hire_date, department_id

HAVING COUNT(*) > 1;

4. List all employees who have not changed jobs even once. (Such employees are not found in the job_history table)

SELECT DISTINCT employee_id

FROM employees

MINUS

SELECT DISTINCT employee_id

FROM job_history;

15 rows returned

5. List the employees that HAVE changed their jobs at least once.

SELECT DISTINCT employee_id

FROM employees

INTERSECT

SELECT DISTINCT employee_id

FROM job_history;

5 rows returned

6. Using the UNION operator, write a query that displays the employee_id, job_id, and salary of ALL present and past employees. If a salary is not found, then just display a 0 (zero) in its place.

SELECT employee_id, job_id, NVL(salary, 0)

FROM employees

UNION

SELECT employee_id, job_id, 0

FROM job_history

ORDER BY employee_id;

24 Using ROLLUP and CUBE Operations and GROUPING SETS

· Used to create subtotals that roll up from the most detailed level to a grand total, following a grouping list specified in the clause-ROLLUP
· An extension to the GROUP BY clause like ROLLUP that produces cross-tabulation reports-CUBE
· Used to specify multiple groupings of data- GROUPING SETS

1. Within the Employees table, each manager_id is the manager of one or more employees who each have a job_id and earn a salary. For each manager, what is the total salary earned by all of the employees within each job_id? Write a query to display the Manager_id, job_id, and total salary. Include in the result the subtotal salary for each manager and a grand total of all salaries.

SELECT manager_id, job_id, SUM(salary) "total salary", GROUPING(manager_id), GROUPING(job_id)

FROM employees

GROUP BY ROLLUP(manager_id, job_id);

2. Amend the previous query to also include a subtotal salary for each job_id regardless of the manager_id.

SELECT manager_id, job_id, SUM(salary) "total salary", GROUPING(manager_id), GROUPING(job_id)

FROM employees

GROUP BY CUBE(manager_id, job_id);

Observe 12 new rows here:

3. Using GROUPING SETS, write a query to show the following groupings:

• department_id, manager_id, job_id

• manager_id, job_id

• department_id, manager_id

SELECT department_id, manager_id, job_id, SUM(salary) "total salary", GROUPING(department_id), GROUPING(manager_id), GROUPING(job_id)

FROM employees

GROUP BY GROUPING SETS((department_id, manager_id, job_id), (manager_id, job_id), (department_id, manager_id));

23 Using GROUP BY and HAVING Clauses

· Used to specify which groups are to be displayed; restricts groups that do not meet group criteria-HAVING
· Divides the rows in a table into groups-GROUP BY

1. In the SQL query shown below, which of the following is true about this query?

_______ a. Kimberly Grant would not appear in the results set.

TRUE

_______ b. The GROUP BY clause has an error because the manager_id is not listed in the SELECT clause.

FALSE – GROUP function require that any column listed in the SELECT clause that is not part of group function must be listed in a GROUP BY Clause. But it doesn’t mandate presence of all columns in the group by clause in SELECT.

_______ c. Only salaries greater than 16001 will be in the result set.

FALSE. Greater than or equal to 16001 may ~~must~~ come here. That too, ‘16001 or greater will come here’ only if the row is not filtered by WHERE before group application, AND also the whole group satisfy the HAVING clause.

_______ d. Names beginning with Ki will appear after names beginning with Ko.

FALSE, Names here means the last_name. ‘i’ is before ‘o’ if DESC is skipped.

_______ e. Last names such as King and Kochhar will be returned even if they don’t have salaries > 16000.

FALSE. The WHERE is applied even before the GROUP BY.

SELECT last_name, MAX(salary)

FROM employees

WHERE last_name LIKE 'K%'

GROUP BY manager_id, last_name

HAVING MAX(salary) >16000

ORDER BY last_name DESC ;

2. Each of the following SQL queries has an error. Find the error and correct it. Use Oracle Application Express to verify that your corrections produce the desired results.

a. SELECT manager_id

FROM employees

WHERE AVG(salary) <16000

GROUP BY manager_id;

Above query will say ORA-00934: group function is not allowed here.

If 'desired' result is getting manger ids, under whom, average salary is less than 16000:

SELECT manager_id,AVG(salary)

FROM employees

GROUP BY manager_id

HAVING AVG(salary) <16000;

b. SELECT cd_number, COUNT(title)

FROM d_cds

WHERE cd_number < 93;

Above query will say ORA-00937: not a single-group group function

If ‘desired’ result is getting row count, excluding anything greater than or equal to 93:

SELECT COUNT(*)

FROM d_cds

WHERE cd_number < 93;

c. SELECT ID, MAX(ID), artist AS Artist

FROM d_songs

WHERE duration IN('3 min', '6 min', '10 min')

HAVING ID < 50

GROUP by ID;

Above query will say ORA-00979: not a GROUP BY expression

id is primary key of the table, so most likely my intention may not be to group by id here.

Desired may have been grouping by type_code and getting maximum duration

but limiting the input to group by with duration IN('3 min', '6 min', '10 min') and ID < 50

SELECT type_code, MAX(TO_NUMBER(REPLACE(duration,' min',''))) || ' min' as "max duration"

FROM d_songs

WHERE duration IN('3 min', '6 min', '10 min') AND id < 50

GROUP BY type_code;

d. SELECT loc_type, rental_fee AS Fee

FROM d_venues

WHERE id <100

GROUP BY "Fee"

ORDER BY 2;

1) Above query will say ORA-00904: "Fee": invalid identifier

2) Even if I replace "Fee" alias in group by [don't use alias in group by clause] it will give me error : ORA-00979: not a GROUP BY expression

3) Seems to be question wanted average rental_fee of each loc_type with id less than 100 before grouping.

I assumed rental_fee is a number field:

SELECT loc_type, AVG(rental_fee) AS Fee

FROM d_venues

WHERE id <100

GROUP BY loc_type

ORDER BY 2;

But above query gives error ORA-01722: invalid number, because rental_fee is a VARCHAR2(50)

4) Now, I need assumptions to view the calculations with same eyes: /hour, /flat fee, /per person.

Let’s assume that we are group of 10 people and event I am organizing occurs for 5 hours.

SELECT loc_type, AVG(

CASE

WHEN INSTR(rental_fee, '/hour') != 0 THEN TO_NUMBER(REPLACE(rental_fee,'/hour',''))*5

WHEN INSTR(rental_fee, '/flat fee') != 0 THEN TO_NUMBER(REPLACE(rental_fee,'/flat fee',''))

WHEN INSTR(rental_fee, '/per person') != 0 THEN TO_NUMBER(REPLACE(rental_fee,'/per person',''))*10

ELSE 0

END

) AS Fee

FROM d_venues

WHERE id <100

GROUP BY loc_type

ORDER BY 2;

3. Rewrite the following query to accomplish the same result:

SELECT DISTINCT MAX(song_id)

FROM d_track_listings

WHERE track IN ( 1, 2, 3);

SELECT track, MAX(song_id)

FROM d_track_listings

WHERE track IN ( 1, 2, 3)

GROUP BY track;

4. Indicate True or False

_____ a. If you include a group function and any other individual columns in a SELECT clause, then each individual column must also appear in the GROUP BY clause.

TRUE

_____ b. You can use a column alias in the GROUP BY clause.

FALSE. query will say ORA-00904: "Some Alias": invalid identifier

_____ c. The GROUP BY clause always includes a group function.

FALSE- GROUP BY clause of SQL statement contains the column name- value of which is used to divide input into subgroups, each subgroup represented as a row in output. The group function- goes in SELECT part. HAVING clause may also contain group function. If I put group function in group by clause I get error ‘ORA-00934: group function is not allowed here’ e.g. SELECT AVG(salary) FROM employees GROUP BY AVG(salary) HAVING AVG(salary) <16000; will give error.

5. Write a query that will return both the maximum and minimum average salary grouped by department from the employees table.

SELECT ROUND(MAX(AVG(salary)),2) as "Maximum Average of Departments", ROUND(MIN(AVG(salary)),2) "Minimum Average of Departments"

FROM employees

GROUP BY department_id;

Verify MAX and MIN working against:

6. Write a query that will return the average of the maximum salaries in each department for the employees table.

SELECT AVG(MAX(salary))

FROM employees

GROUP BY department_id;

Verify AVG working against:

22 Count, Distinct, NVL

· Returns the number of non-null values in the expression column-COUNT group function
· The keyword used to return only non-duplicate values or combinations of non-duplicate values in a query.-DISTINCT
· Returns the number of unique non-null values in the expression column.-COUNT(DISTINCT expression)

1. How many songs are listed in the DJs on Demand D_SONGS table?

SELECT COUNT(DISTINCT title)

FROM d_songs; gives 6

SELECT COUNT(*)

FROM d_songs; gives 6

SELECT COUNT(id)

FROM d_songs; gives 6

2. In how many different location types has DJs on Demand had venues?

Possible venues count: (4)

SELECT COUNT(DISTINCT loc_type)

FROM d_venues;

Venue types used by events count: (2)

SELECT COUNT(DISTINCT venue_id)

FROM d_events;

3. The d_track_listings table in the DJs on Demand database has a song_id column and a cd_number column. How many song IDs are in the table and how many different CD numbers are in the table?

SELECT COUNT(song_id) AS "song with possible duplication", COUNT(distinct cd_number) "cd no. distinct"

FROM d_track_listings;

5 and 4

4. How many of the DJs on Demand customers have email addresses?

Count will skip nulls anyhow.

SELECT COUNT(email) "count with email"

FROM d_clients;

5. Some of the partners in DJs on Demand do not have authorized expense amounts (auth_expense_amt). How many partners do have this privilege?

The language of question seems to be, the one with mentioned limit as auth_expense_amt are bound by this limit, but the one who don’t have limit are free to spend anything. Say, a row has auth_expense_amt has 20, that partner can spend $20, If it is 0 he can’t spend anything, but if this field is null, it means he is out of bound .

SELECT (COUNT(*) - COUNT(auth_expense_amt)) "Free from limit count"

FROM d_partners;

COUNT(*) – all row count is 3, and COUNT(auth_expense_amt) is 1. That’s why above query says 2 are out of any bound due to auth_expense_amt specified.

COUNT(auth_expense_amt) is 1 because count skips nulls.

6. What values will be returned when the statement below is issued?

ID	type	shoe_color
456	oxford	brown
463	sandal	tan
262	heel	black
433	slipper	tan

SELECT COUNT(shoe_color), COUNT(DISTINCT shoe_color)

FROM shoes;

4 and 2

7. Create a query that will convert any null values in the auth_expense_amt column on the DJs on Demand D_PARTNERS table to 100000 and find the average of the values in this column. Round the result to two decimal places.

SELECT TO_CHAR(ROUND(AVG(NVL(auth_expense_amt,100000)),2), '$999999.99')

FROM d_partners;

$166666.67

8. Which statement(s) is/are True about the following SQL statement:

SELECT AVG(NVL(selling_bonus, 0.10))

FROM bonuses;

_____ a. The datatypes of the values in the NVL clause can be any datatype except date data.

FALSE, the data type of null value column and the new value must be the same in NVL. In the example above, its 0.10 which suggests selling_bonus is a number. And one more thing, below mentioned statement is valid too:

SELECT NVL(order_date,TO_DATE('01/01/2000' , 'MM/DD/YYYY')) from f_orders;

But one point to be noted here, AVG works on only numeric values.

_____ b. If the selling_bonus column has a null value, 0.10 will be substituted.

TRUE. The field will be assumed to contain 0.10 instead of null and hence won’t be skipped by AVG.

_____ c. There will be no null values in the selling_bonus column when the average is calculated.

TRUE, it won’t permanently change the column data in table, just in the scope of current query, AVG will get 0.10 in the row where null selling_bonus is encountered.

_____ d. This statement will cause an error. There cannot be two functions in the SELECT statement.

FALSE, AVG is a group function and NVL is single row function. And such two functions are allowed. Definitely in any case above error depicted is out of the world here in this case.

9. Which of the following statements is/are TRUE about the following query?

SELECT DISTINCT colors, sizes

FROM items;

_____ a. Each color will appear only once in the result set.

FALSE

_____ b. Each size will appear only once in the result set.

FALSE

_____ c. Unique combinations of color and size will appear only once in the result set.

TRUE

_____ d. Each color and size combination will appear more than once in the result set.

FALSE

21 Group Functions

· Calculates average value excluding nulls- AVG group function
· Returns the number of rows with non-null values for the expression- COUNT group function
· For two sets of data with approximately the same mean, the greater the spread, the greater the standard deviation.- STDDEV group function
· Operate on sets of rows to give one result per group- any group function
· Returns minimum value ignoring nulls- MIN group function
· Used with columns that store numeric data to calculate the spread of data around the mean- VARIANCE group function
· Calculates the sum ignoring null values- SUM group function
· Returns the maximum value ignoring nulls- MAX group function
· To gather into a sum or whole- Aggregate functions return a single result row based on groups of rows. Aggregate is something that is formed by combining several separate elements.

1. Define and give an example of the seven group functions: AVG, COUNT, MAX, MIN, STDDEV, SUM, and VARIANCE.

AVG, COUNT :

· Calculates average value excluding nulls- AVG group function

· Returns the number of rows with non-null values for the expression- COUNT group function

SELECT AVG(salary) || ' is Average Salary of ' || COUNT(salary) || ' employees. This table has primary key employee_id which won''t be null. So, ' || (COUNT(employee_id) - COUNT(salary)) || ' rows are skipped in Average Salary calculation.' "Example"

FROM EMPLOYEES;

(I may use COUNT(*) instead of COUNT(employee_id), if I want to)

Output:

8775 is Average Salary of 20 employees. This table has primary key employee_id which won't be null. So, 0 rows are skipped in Average Salary calculation.

Please note that in this concatenation and minus has same precedence. So () were required.

MAX, MIN:

· Returns minimum value ignoring nulls- MIN group function

· Returns the maximum value ignoring nulls- MAX group function

SELECT 'The maximum of ' || COUNT(salary) || ' salaries in employees table is ' || MAX(salary) ||'. The minimum of '|| COUNT(salary) || ' salaries in employees table is '|| MIN(salary) ||'.' "Example2"

FROM EMPLOYEES;

Output:

The maximum of 20 salaries in employees table is 24000. The minimum of 20 salaries in employees table is 2500.

STDDEV, VARIANCE:

· Used with columns that store numeric data to calculate the spread of data around the mean- VARIANCE group function

· For two sets of data with approximately the same mean, the greater the spread, the greater the standard deviation.- STDDEV group function

SELECT 'The standard deviation of ' || COUNT(salary) || ' salaries in employees table is ' || ROUND(STDDEV(salary), 4) ||'. The variance of '|| COUNT(salary) || ' salaries in employees table is '|| ROUND(VARIANCE(salary), 4) ||'.' "Example3"

FROM EMPLOYEES;

Output:

The standard deviation of 20 salaries in employees table is 5659.6331. The variance of 20 salaries in employees table is 32031447.3684.

If not rounded the values would have been inconvenient to read.

SUM:

· Calculates the sum ignoring null values- SUM group function

SELECT 'The sum of ' || COUNT(salary) || ' salaries in employees table is ' || SUM(salary) ||'.' "Example4"

FROM EMPLOYEES;

Output:

The sum of 20 salaries in employees table is 175500.

2. Create a query that will show the average cost of the DJs on Demand events. Round to two decimal places.

SELECT ROUND(AVG(cost),2) as "Average Cost"

FROM d_events;

9000

3. Find the average salary for Global Fast Foods staff members whose manager ID is 19.

SELECT TO_CHAR(ROUND(AVG(salary),2), '$999999.99') as "Average Salary"

FROM f_staffs

WHERE manager_id = 19;

$8.38

4. Find the sum of the salaries for Global Fast Foods staff members whose IDs are 12 and 9.

SELECT TO_CHAR(ROUND(SUM(salary),2), '$999999.99') as "Total Salary"

FROM f_staffs

WHERE id in (12, 19);

$66.75

5. Using the Oracle database, select the lowest salary, the most recent hire date, the last name of the person who is at the top of an alphabetical list of employees, and the last name of the person who is at the bottom of an alphabetical list of employees. Select only employees who are in departments 50 or 60.

SELECT MIN(salary) "lowest salary", MAX(hire_date) "most recent hire date", MIN(last_name) "top last name", MAX(last_name) "bottom last name"

FROM employees

WHERE department_id in (50, 60);

6. Your new Internet business has had a good year financially. You have had 1,289 orders this year. Your customer order table has a column named total_sales. If you submit the following query, how many rows will be returned?

SELECT sum(total_sales)

FROM orders;

one

7. You were asked to create a report of the average salaries for all employees in each division of the company. Some employees in your company are paid hourly instead of by salary. When you ran the report, it seemed as though the averages were not what you expected—they were much higher than you thought! What could have been the cause?

SELECT AVG(NVL(salary, hourly_rate* hrs_worked_in_yr ))

This way the null fields beings ignored will also be counted in.

8. Employees of Global Fast Foods have birth dates of July 1, 1980, March 19, 1979, and March 30, 1969. If you select MIN(birthdate), which date will be returned?

March 30, 1969

9. Create a query that will return the average order total for all Global Fast Foods orders from January 1, 2002, to December 21, 2002.

SELECT 'Average of ' || COUNT(order_number) || ' orders is : ' || AVG(NVL(order_total, 0)) as "Average"

FROM f_orders

WHERE order_date BETWEEN TO_DATE('January 1, 2002', 'fmMonth DD, YYYY') AND TO_DATE('December 21, 2002', 'fmMonth DD, YYYY');

Average of 1 orders is : 103.02

(I may use COUNT(*) instead of COUNT(order_number), if I want to)

10. What was the hire date of the last Oracle employee hired?

SELECT MAX(hire_date) as "the last"

FROM employees;

29-Jan-2000

11. In the following SELECT clause, which value returned by the SELECT statement will be larger?

SELECT SUM(operating_cost), AVG(operating_cost)

SUM must be be ‘equal or greater than’ average.

12. Refer to the DJs on Demand database D_EVENTS table:

Which clauses represent valid statements?

_______a. FROM event_date

FALSE, this is a column. It will say ORA-00942: table or view does not exist

_______b. SELECT SUM(cost)

TRUE

_______c. SELECT SUM(event_date)

FALSE. This column is not a number. ORA-00932: inconsistent datatypes: expected NUMBER got DATE

_______d. SELECT description, AVG(cost) AS "Expense"

FALSE. ORA-00937: not a single-group group function. Remove either description (to get single row output), or remove avg(cost)

_______e. WHERE MIN(id) = 100

FASLE. MIN is a group function. ORA-00934: group function is not allowed here

_______f. SELECT MAX(AVG(cost)

FALSE. ORA-00978: nested group function without GROUP BY

_______g. SELECT MIN(event_date)

TRUE