Assignment No. 4 (Graded), Semester Fall 2015, Data Warehousing (CS614), due date 8 Feb 2016

Objective:

The assignment has been designed to develop your ability to calculate the Bitmap index.

Instructions:

1. The assignment will not be accepted after the due date in any case (whether because of load shedding, an emergency electricity failure, an internet malfunction, etc.).
2. Zero marks will be awarded if the assignment file does not open or is corrupt.
3. The assignment file must be in MS Word (.doc) format; it will not be accepted in any other format.
4. Zero marks will be awarded if the assignment is copied (from another student, from the handouts, or from the internet).
5. Zero marks will be awarded if the Student ID is not mentioned in the assignment file.

For any query about the assignment, contact CS614@vu.edu.pk only.

Do not post queries related to the assignment on MDB.

GOOD LUCK

Question 1 [10 Marks]

Consider the following table:

Player_Team

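| Player_ID | Player_Name | Team_ID | Pool_ID |
|-----------|--------------|---------|---------|
| IND-06 | Gangoli | IND | A |
| AFG-05 | Najeeb | AFG | B |
| SA-07 | AB Devillier | SA | A |
| AU-01 | Steve Waugh | AU | B |
| IND-01 | Tandulker | IND | A |
| AU-04 | Maxwell | AU | B |
| AFG-01 | Nawroze | AFG | B |
| SA-09 | Dal Styn | SA | A |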

Consider the following query:

SELECT Count (Player_Team.Pool_ID) AS CountOfPool_ID

FROM Player_Team

GROUP BY Player_Team.Pool_ID;

1. Identify the number of clusters in this data.
2. Identify whether the given clustering is one-way or two-way clustering. Your answer should be supported by valid reasons.
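To see what the query returns, here is a minimal sketch that loads the table above into an in-memory SQLite database and runs the same GROUP BY. Python and SQLite are assumptions made for illustration; the assignment does not name a database engine. The helper name `count_by_pool` is hypothetical, and Pool_ID is added to the SELECT list only to label each count.

```python
import sqlite3

# Rows copied from the Player_Team table above.
ROWS = [
    ("IND-06", "Gangoli", "IND", "A"),
    ("AFG-05", "Najeeb", "AFG", "B"),
    ("SA-07", "AB Devillier", "SA", "A"),
    ("AU-01", "Steve Waugh", "AU", "B"),
    ("IND-01", "Tandulker", "IND", "A"),
    ("AU-04", "Maxwell", "AU", "B"),
    ("AFG-01", "Nawroze", "AFG", "B"),
    ("SA-09", "Dal Styn", "SA", "A"),
]


def count_by_pool(rows):
    """Return {Pool_ID: row count} using the assignment's GROUP BY query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Player_Team "
        "(Player_ID TEXT, Player_Name TEXT, Team_ID TEXT, Pool_ID TEXT)"
    )
    conn.executemany("INSERT INTO Player_Team VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(
        "SELECT Player_Team.Pool_ID, "
        "COUNT(Player_Team.Pool_ID) AS CountOfPool_ID "
        "FROM Player_Team GROUP BY Player_Team.Pool_ID"
    ).fetchall()
    conn.close()
    return dict(result)


pool_counts = count_by_pool(ROWS)
print(pool_counts)
```

Each key of the returned dictionary is one group produced by the GROUP BY, so the number of keys is the number of groups the query forms on Pool_ID.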

Question 2 [10 Marks]

Consider the following tables:

Player

| Player_ID | Player_Name | Team |
|-----------|--------------|--------------|
| PK-01 | Wasim | Pakistan |
| PK-02 | Misbah | Pakistan |
| SA-03 | AB Devillier | South Africa |

Award

| Award_ID | Match_ID | Player_ID |
|----------|----------|-----------|
| 01 | 01 | PK-01 |
| 01 | 02 | PK-01 |
| 02 | 03 | PK-02 |
| 01 | 04 | SA-03 |

Consider the following query:

Select * from Player P, Award A where P.Team = 'Pakistan' and A.Award_ID = '01' and P.Player_ID = A.Player_ID

Suppose this query is executed using a Naive Nested-Loop join (i.e. there is no index on either the Player or the Award table). State which table should be the outer table to get the minimum I/O, by manually calculating the cost in both cases, i.e. when “Player” is the outer table and when “Award” is the outer table.

Note: You need to mention the calculations in your solutions where required.
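As an illustration of how a naive nested-loop join evaluates this query, the sketch below scans the whole inner table once for every qualifying row of the outer table. It works on rows rather than disk blocks, so the scan counts it reports are not the I/O costs the question asks for; the function name `naive_nested_loop_join` and the predicate arguments are hypothetical.

```python
# Rows copied from the Player and Award tables above.
PLAYER = [
    {"Player_ID": "PK-01", "Player_Name": "Wasim", "Team": "Pakistan"},
    {"Player_ID": "PK-02", "Player_Name": "Misbah", "Team": "Pakistan"},
    {"Player_ID": "SA-03", "Player_Name": "AB Devillier", "Team": "South Africa"},
]
AWARD = [
    {"Award_ID": "01", "Match_ID": "01", "Player_ID": "PK-01"},
    {"Award_ID": "01", "Match_ID": "02", "Player_ID": "PK-01"},
    {"Award_ID": "02", "Match_ID": "03", "Player_ID": "PK-02"},
    {"Award_ID": "01", "Match_ID": "04", "Player_ID": "SA-03"},
]


def is_pakistan(player_row):
    return player_row["Team"] == "Pakistan"


def is_award_01(award_row):
    return award_row["Award_ID"] == "01"


def naive_nested_loop_join(outer, inner, outer_pred, inner_pred):
    """Join on Player_ID; scan all of `inner` once per qualifying outer row."""
    result = []
    inner_scans = 0
    for o in outer:
        if not outer_pred(o):
            continue  # outer row filtered out, no inner scan needed
        inner_scans += 1
        for i in inner:
            if inner_pred(i) and o["Player_ID"] == i["Player_ID"]:
                result.append({**o, **i})
    return result, inner_scans


rows_p, scans_p = naive_nested_loop_join(PLAYER, AWARD, is_pakistan, is_award_01)
rows_a, scans_a = naive_nested_loop_join(AWARD, PLAYER, is_award_01, is_pakistan)
print(len(rows_p), scans_p)
print(len(rows_a), scans_a)
```

Both orderings return the same joined rows; only the number of passes over the inner table changes, which is why the choice of outer table affects the cost.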

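| Player_ID | Player_Name | Team_ID | Pool_ID |
|-----------|--------------|---------|---------|
| IND-06 | Gangoli | IND | A |
| IND-01 | Tandulker | IND | A |
| SA-07 | AB Devillier | SA | A |
| SA-09 | Dal Styn | SA | A |
| AU-01 | Steve Waugh | AU | B |
| AU-04 | Maxwell | AU | B |
| AFG-01 | Nawroze | AFG | B |
| AFG-05 | Najeeb | AFG | B |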

This is the answer to question no. 1.

Complete Solution....

Is it right?

How do we identify the number of clusters?

Guys, please give the solution to the 2nd question.

Idea Solution.................

Question no. 1:

Consider the following table:

Player_Team

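| Player_ID | Player_Name | Team_ID | Pool_ID |
|-----------|--------------|---------|---------|
| WI-06 | Richordson | WI | A |
| PK-05 | Misbah | PK | A |
| SA-07 | AB Devillier | SA | B |
| AU-01 | Steve Waugh | AU | B |
| PK-01 | Hafiz | PK | A |
| AU-04 | Maxwell | AU | B |
| WI-01 | Ambrose | WI | A |
| SA-09 | Dal Styn | SA | B |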

Consider the following query:

SELECT Count (Player_Team.Pool_ID) AS CountOfPool_ID

FROM Player_Team

GROUP BY Player_Team.Pool_ID;

1. Identify the number of clusters in this data.
2. Identify whether the given clustering is one-way or two-way clustering. Your answer should be supported by valid reasons.

Cluster indexing on Team_ID

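| Player_ID | Player_Name | Team_ID | Pool_ID |
|-----------|--------------|---------|---------|
| WI-06 | Richordson | WI | A |
| WI-01 | Ambrose | WI | A |
| PK-05 | Misbah | PK | A |
| PK-01 | Hafiz | PK | A |
| SA-07 | AB Devillier | SA | B |
| SA-09 | Dal Styn | SA | B |
| AU-01 | Steve Waugh | AU | B |
| AU-04 | Maxwell | AU | B |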

Cluster indexing on Pool_ID

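| Player_ID | Player_Name | Team_ID | Pool_ID |
|-----------|--------------|---------|---------|
| WI-06 | Richordson | WI | A |
| PK-05 | Misbah | PK | A |
| PK-01 | Hafiz | PK | A |
| WI-01 | Ambrose | WI | A |
| SA-07 | AB Devillier | SA | B |
| AU-01 | Steve Waugh | AU | B |
| AU-04 | Maxwell | AU | B |
| SA-09 | Dal Styn | SA | B |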

The query groups the records on a single attribute, Pool_ID, so each record is placed in exactly one group (Pool A or Pool B). On that reading it is one-way clustering.
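The two reordered tables above can be reproduced with a small grouping routine. This is only a sketch of the idea: it gathers rows that share a column value and keeps groups in order of first appearance, which is an assumption about how the tables were built. The helper name `cluster_by` and the column index constants are hypothetical.

```python
# Rows copied from the Player_Team table in this solution.
ROWS = [
    ("WI-06", "Richordson", "WI", "A"),
    ("PK-05", "Misbah", "PK", "A"),
    ("SA-07", "AB Devillier", "SA", "B"),
    ("AU-01", "Steve Waugh", "AU", "B"),
    ("PK-01", "Hafiz", "PK", "A"),
    ("AU-04", "Maxwell", "AU", "B"),
    ("WI-01", "Ambrose", "WI", "A"),
    ("SA-09", "Dal Styn", "SA", "B"),
]
TEAM_ID, POOL_ID = 2, 3  # column positions within each row tuple


def cluster_by(rows, column):
    """Group rows by one column, keeping groups in first-appearance order."""
    groups = {}
    for row in rows:
        groups.setdefault(row[column], []).append(row)
    return groups


by_team = cluster_by(ROWS, TEAM_ID)
by_pool = cluster_by(ROWS, POOL_ID)

for key, members in by_pool.items():
    print(key, [row[0] for row in members])
```

Counting the keys of each dictionary gives the number of groups formed on that column, so `len(by_team)` and `len(by_pool)` can be compared directly.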

Question no. 2:

Consider the following tables:

Player

| Player_ID | Player_Name | Team |
|-----------|--------------|--------------|
| PK-01 | Wasim | Pakistan |
| PK-02 | Misbah | Pakistan |
| SA-03 | AB Devillier | South Africa |

Award

| Award_ID | Match_ID | Player_ID |
|----------|----------|-----------|
| 01 | 01 | PK-01 |
| 01 | 02 | PK-01 |
| 02 | 03 | PK-02 |
| 01 | 04 | SA-03 |

Consider the following query:

Select * from Player P, Award A where P.Team = 'Pakistan' and A.Award_ID = '01' and P.Player_ID = A.Player_ID

Suppose this query is executed using a Naive Nested-Loop join (i.e. there is no index on either the Player or the Award table). State which table should be the outer table to get the minimum I/O, by manually calculating the cost in both cases, i.e. when “Player” is the outer table and when “Award” is the outer table.

Qualifying blocks in table P = 3

Size of Table P=9 Blocks

Qualifying blocks in table A = 4

Size of Table A=12 Blocks

If table_P is outer and table_A is inner:

Join cost = size of table_P in blocks + (qualifying blocks of table_P * size of table_A in blocks)

Join cost1 (Player table: outer, Award table: inner)

= 9 + (3 * 12) = 9 + 36

= 45

If table_A is outer and table_P is inner:

Join cost = size of table_A in blocks + (qualifying blocks of table_A * size of table_P in blocks)

Join cost2 (Award table: outer, Player table: inner)

= 12 + (4 * 9) = 12 + 36

= 48

Since 45 < 48, with these figures the Player table should be the outer table.
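The two calculations above can be checked with a short sketch. It uses the block counts assumed in this idea solution (9 and 12 blocks in total, 3 and 4 qualifying blocks), which the assignment itself does not state; the function name `nested_loop_cost` is hypothetical.

```python
def nested_loop_cost(outer_size, outer_qualifying, inner_size):
    """Cost in block reads: read the outer table once, then scan the
    inner table once for every qualifying block of the outer table."""
    return outer_size + outer_qualifying * inner_size


# Block counts as assumed in the idea solution above.
P_SIZE, P_QUALIFYING = 9, 3   # Player table
A_SIZE, A_QUALIFYING = 12, 4  # Award table

cost_player_outer = nested_loop_cost(P_SIZE, P_QUALIFYING, A_SIZE)
cost_award_outer = nested_loop_cost(A_SIZE, A_QUALIFYING, P_SIZE)

print("Player outer:", cost_player_outer)
print("Award outer:", cost_award_outer)
```

Changing the four constants shows how sensitive the choice of outer table is to the assumed block counts.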

Thanks to all of you guys...

All solutions are wrong.............

Read the comments on the files mentioned above carefully; these are idea solutions.

Bro, I wasn't talking about yours. I know you only gave an idea. Everyone will just copy and paste it as it is, so I said that people should treat all of them as wrong, work through them, and discuss; then it will get completed.

Then give the correct one.

