Data warehousing (CS614)
Assignment # 04 (GRADED)
Total marks = 20
Deadline Date = 10-August-2015
Please carefully read the following instructions before attempting the assignment.
It should be clear that your assignment will not get any credit if:
The assignment is submitted after the due date.
The submitted assignment does not open or the file is corrupt.
The assignment is copied. Note that strict action will be taken if the submitted assignment is copied from any other student. Both students will be punished severely.
1) You should consult the recommended books to clarify your concepts, as the handouts are not sufficient.
2) You are supposed to submit your assignment in .doc or .docx format. Any other format, such as scanned images, PDF, Zip, rar, bmp etc., will not be accepted.
3) You are advised to upload your assignment at least two days before the due date.
The assignment comprises 20 marks. Note that no assignment will be accepted after the due date via email in any case (whether because of load shedding, an emergency power failure, an internet malfunction etc.). Hence, refrain from uploading the assignment in the last hour before the deadline, and try to upload solutions at least 02 days before the deadline to avoid inconvenience later on.
For any query please contact: CS614@vu.edu.pk
Question no. 1:
Consider the following table:
Player_ID   Player_Name    Team_ID   Pool_ID
WI-06       Richordson     WI        A
PK-05       Misbah         PK        A
SA-07       AB Devillier   SA        B
AU-01       Steve Waugh    AU        B
PK-01       Hafiz          PK        A
AU-04       Maxwell        AU        B
WI-01       Ambrose        WI        A
SA-09       Dal Styn       SA        B
Consider the following query:
SELECT Count(Player_Team.Pool_ID) AS CountOfPool_ID
FROM Player_Team
GROUP BY Player_Team.Pool_ID;
Answer the following questions:
1. Identify the number of clusters in this data.
2. Identify whether the given clustering is one-way or two-way clustering. Support your answer with valid reasons.
Question no. 2:
Consider the following tables:
Player:
Player_ID   Player_Name    Team
PK-01       Wasim          Pakistan
PK-02       Misbah         Pakistan
SA-03       AB Devillier   South Africa

Award:
Award_ID   Match_ID   Player_ID
01         01         PK-01
01         02         PK-01
02         03         PK-02
01         04         SA-03
Consider the following query:
Select * from Player P, Award A where P.Team = 'Pakistan' and A.Award_ID = '01' and P.Player_ID = A.Player_ID
Suppose this query is executed using a naive nested-loop join (i.e. no index has been created on either the Player or the Award table). State which table should be the outer table to get minimum I/O by manually calculating the cost in both cases, i.e. when "Player" is the outer table and when "Award" is the outer table.
Note: Show your calculations in your solution where required.
Do we also have to include a snapshot of the table in the solution?
Is this correct, or should the second table of the join be given as well?
And may Allah reward you with good as well, brother.
Please upload the first question.
One-way clustering groups objects and features separately, whereas two-way clustering takes the relationship between both into account.
In Question No. 1, we may say that there are two clusters.
What is the role of the given query in the solution to Question No. 1?
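One possible way to see what the query contributes is to run it against the data from Question 1. The sketch below is only an illustration, not the official solution: it loads the eight rows into an in-memory SQLite database and groups them by Pool_ID. The table name Player_Team in the FROM clause is taken from the column qualifier in the query, and Pool_ID is added to the SELECT list (an assumption) only to make the output readable.

```python
import sqlite3

# Rows copied from the table in Question 1.
rows = [
    ("WI-06", "Richordson", "WI", "A"),
    ("PK-05", "Misbah", "PK", "A"),
    ("SA-07", "AB Devillier", "SA", "B"),
    ("AU-01", "Steve Waugh", "AU", "B"),
    ("PK-01", "Hafiz", "PK", "A"),
    ("AU-04", "Maxwell", "AU", "B"),
    ("WI-01", "Ambrose", "WI", "A"),
    ("SA-09", "Dal Styn", "SA", "B"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Player_Team "
    "(Player_ID TEXT, Player_Name TEXT, Team_ID TEXT, Pool_ID TEXT)"
)
conn.executemany("INSERT INTO Player_Team VALUES (?, ?, ?, ?)", rows)

# Same grouping as the assignment query; Pool_ID is selected as well
# (an assumption) so each count can be matched to its group.
counts = conn.execute(
    "SELECT Player_Team.Pool_ID, COUNT(Player_Team.Pool_ID) AS CountOfPool_ID "
    "FROM Player_Team "
    "GROUP BY Player_Team.Pool_ID "
    "ORDER BY Player_Team.Pool_ID"
).fetchall()

print(counts)  # [('A', 4), ('B', 4)]
conn.close()
```

The query returns one row per distinct Pool_ID, so if each group is read as a cluster, the data yields two groups of four players each.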
In my view, for the second question we have to work through the query using the naive nested-loop join and then apply the join cost formula. What do you say?
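A rough sketch of that calculation follows. It assumes cost is counted in row reads rather than disk blocks, that the outer table is scanned once, and that the inner table is scanned in full for every outer row that passes its own filter. The function name and this cost model are illustrative assumptions, not the formula from the course handouts.

```python
# Rows copied from the two tables in Question 2.
player = [
    ("PK-01", "Wasim", "Pakistan"),
    ("PK-02", "Misbah", "Pakistan"),
    ("SA-03", "AB Devillier", "South Africa"),
]
award = [
    ("01", "01", "PK-01"),
    ("01", "02", "PK-01"),
    ("02", "03", "PK-02"),
    ("01", "04", "SA-03"),
]


def naive_nested_loop_cost(outer, inner, outer_filter):
    """Count row reads: one per outer row, plus a full inner scan
    for each outer row that satisfies outer_filter (assumed model)."""
    reads = 0
    for row in outer:
        reads += 1                  # read the outer row
        if outer_filter(row):
            reads += len(inner)     # no index, so scan the whole inner table
    return reads


# Player as outer table: filter on Team = 'Pakistan'.
cost_player_outer = naive_nested_loop_cost(
    player, award, lambda p: p[2] == "Pakistan"
)
# Award as outer table: filter on Award_ID = '01'.
cost_award_outer = naive_nested_loop_cost(
    award, player, lambda a: a[0] == "01"
)

print("Player outer:", cost_player_outer)  # 3 + 2 * 4 = 11
print("Award outer:", cost_award_outer)    # 4 + 3 * 3 = 13
```

Under these assumptions Player is the cheaper outer table (11 reads against 13). If the filter is not applied before probing the inner table, the totals become 3 + 3 × 4 = 15 and 4 + 4 × 3 = 16, so Player still comes out ahead.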
Somebody please post a solution.
Same here. If anyone knows the solution, please send it; very little time is left. Please send the .doc file; I still have to submit my CS401 assignment as well.