We are here with you hands in hands to facilitate your learning & don't appreciate the idea of copying or replicating solutions. Read More>>

Looking For Something at vustudents.ning.com? Click Here to Search

www.bit.ly/vucodes

+ Link For Assignments, GDBs & Online Quizzes Solution

www.bit.ly/papersvu

+ Link For Past Papers, Solved MCQs, Short Notes & More


Dear Students! Share your Assignments / GDBs / Quizzes files as you receive in your LMS, So it can be discussed/solved timely. Add Discussion

How to Add New Discussion in Study Group ? Step By Step Guide Click Here.

CS614 Data Warehousing Assignment No.2 (Graded) Spring 2014 Due Date: 26 May 2014

Data warehousing (CS614)

Assignment # 2 (GRADED)

Total marks = 20

                                                                                       Deadline Date = 26/05/2014

Please carefully read the following instructions before attempting the assignment.

 

Rules for Marking

It should be clear that your assignment would not get any credit if:

 

  • The assignment is submitted after due date.
  • The submitted assignment does not open or file is corrupt.
  • The assignment is copied. Note that strict action would be taken if the submitted assignment is copied from any other student. Both students will be punished severely.

 

1)      You should consult recommended books to clarify your concepts as handouts are not sufficient.

2)      You are supposed to submit your assignment in .doc format. Any other formats like scan images, PDF, Zip, rar, bmp, docx etc will not be accepted

3)      You are advised to upload your assignment at least two days before Due date.

4)      This assignment file comprises of Two (2) pages.

 

 

Important Note: 

 

Assignment comprises of 20 Marks. Note that no assignment will be accepted after due date via email in any case (whether it is the case of load shedding or emergency electric failure or internet malfunctioning etc.). Hence, refrain from uploading assignment in the last hour of the deadline, and try to upload Solutions at least 02 days before the deadline to avoid inconvenience later on.

 

For any query please contact: CS614@vu.edu.pk

Assignment No. 2

Consider the business process “exams_sessions_papers_aggregate”. The entities involved in the process are shown in above portion of ERD. The process calculates the information that can be sampled as follows:

 

“Totally 100 papers were conducted in exam center “VU001” in morning session in Fall 2013 exam”.

 

“Totally 110 papers were conducted in exam center “VU002” in evening session in Spring 2013 exam”.

 

“Totally 150 papers were conducted in exam center “VU001” in evening session in Fall 2013 exam”.

 

Question:

 

There are four steps of dimensional modeling. First step is to identify the business process. Consider the “exams_sessions_papers_aggregate” process described above. Apply the remaining steps of dimensional modeling to find out the fact and dimensional table(s).

 

Note: You are required to show these steps in solution as well.

 


+ How to Follow the New Added Discussions at Your Mail Address?

+ How to Join Subject Study Groups & Get Helping Material?

+ How to become Top Reputation, Angels, Intellectual, Featured Members & Moderators?

+ VU Students Reserves The Right to Delete Your Profile, If?


See Your Saved Posts Timeline

Views: 7465

.

+ http://bit.ly/vucodes (Link for Assignments, GDBs & Online Quizzes Solution)

+ http://bit.ly/papersvu (Link for Past Papers, Solved MCQs, Short Notes & More)

+ Click Here to Search (Looking For something at vustudents.ning.com?)

+ Click Here To Join (Our facebook study Group)

Replies to This Discussion

       PRICE:   “DUA”

 

 

STEP 2:

Gain: 
The total number of Exam with a certain session has paper within exam centre.

STEP 3:

Exam Fall 2013 Spring 2013 This is the fact because this is changing over the time peroid.

STEP4:

Exam centre, paper and etc. will be the dimension.

       REMEMBER ME IN YOUR DUA.  “ALLAH HAFIZ”

Allah bless u

please share complete idea of solution ALLAH BLESS YOU AMEEN

Idea for Assignment

 

There are four steps of dimensional modeling. First step is to identify the business process. Consider the “flight_Status_Aggregate” process described above. Apply the remaining steps of dimensional modeling to find out the fact and dimensional tables.

 

 

Solution:

 

Step-1: Identify Business Process

 

Business process:

 

“Flight_Status_Aggregate”:

 

This process involves following entities:

 

 

 

 

 

Step-2: Identify Grain           .

 

From the description of the given process and the sample output, the gain statement will be:

 

The total number of certain flight types with a certain departure status with in a certain time period.

 

The resultant table may have the structure like:

 

FlightType

Primary key

FlightStatus

Primary key

Month_Dim_ID

Primary key

TotalFligths

 

 

Step-3: Choose facts.

 

Here only involves one fact i.e. the TotalFlights. Note that this is the only attribute that keeps changing; all the other attributes remain constant for a certain grain.

 

Step-4: Identify Dimensions.

 

The dimensions will be FlightType, FlightStatus and Month_Dimension (date/duration). Note that these are part of a composite primary key that remains constant for a certain grain. So the possible fact table will be:

 

 

FlightType

Primary key

FlightStatus

Primary key

Month_Dim_ID

Primary key

TotalFligths:

 

 

And the dimension tables will be:

 

FlightADStatus

StatusID

StatusName

Description

01

OnTime

This status is set if flight departs or arrives as per schedule

02

Late

This status is set when the flight departs/arrives behind the specified schedule

03

NotDecided

This status is set when the flight is schedule first time. After the flight departs/arrives, the status is modified

For flight types, a master table with the name “Flight Type” is also defined with following structure and data:

 

Flight Type

FTypeID

FTypeName

Description

01

Direct

 

A flight is direct when it moves directly from source to destination location

02

Connected

A flight is connected with moves indirectly from source to destination involving multiple legs.

 

 

We suppose that the minimum date granularity is month so a date dimension with month granularity (as minimum) is defined as follows:

 

 

 

Month_dim

Month_Dim_ID

Month_Number

Month_Name

Year

001

1

Jan

2013

002

2

Feb

2013

003

3

Mar

2013

 

Consider the business process “exams_sessions_papers_aggregate”. The entities involved in the process are shown in above portion of ERD. The process calculates the information that can be sampled as follows:

“Totally 100 papers were conducted in exam center “VU001” in morning session in Fall 2013 exam”.

“Totally 110 papers were conducted in exam center “VU002” in evening session in Spring 2013 exam”.

“Totally 150 papers were conducted in exam center “VU001” in evening session in Fall 2013 exam”.

Question:

There are four steps of dimensional modeling. First step is to identify the business process. Consider the “exams_sessions_papers_aggregate” process described above. Apply the remaining steps of dimensional modeling to find out the fact and dimensional table(s).

Dimensional models

Before you create a dimensional model, you should understand the basic objects that you use to create dimensional models: fact tables and entities, dimension tables and entities, hierarchies, outriggers, and measures.

Just Idea for Assignment

 

There are four steps of dimensional modeling. First step is to identify the business process. Consider the “flight_Status_Aggregate” process described above. Apply the remaining steps of dimensional modeling to find out the fact and dimensional tables.

 

 

Solution:

 

Step-1: Identify Business Process

 

Business process:

 

“Flight_Status_Aggregate”:

 

This process involves following entities:

 

 

 

 

 

Step-2: Identify Grain           .

 

From the description of the given process and the sample output, the gain statement will be:

 

The total number of certain flight types with a certain departure status with in a certain time period.

 

The resultant table may have the structure like:

 

FlightType

Primary key

FlightStatus

Primary key

Month_Dim_ID

Primary key

TotalFligths

 

 

Step-3: Choose facts.

 

Here only involves one fact i.e. the TotalFlights. Note that this is the only attribute that keeps changing; all the other attributes remain constant for a certain grain.

 

Step-4: Identify Dimensions.

 

The dimensions will be FlightType, FlightStatus and Month_Dimension (date/duration). Note that these are part of a composite primary key that remains constant for a certain grain. So the possible fact table will be:

 

 

FlightType

Primary key

FlightStatus

Primary key

Month_Dim_ID

Primary key

TotalFligths:

 

 

And the dimension tables will be:

 

FlightADStatus

StatusID

StatusName

Description

01

OnTime

This status is set if flight departs or arrives as per schedule

02

Late

This status is set when the flight departs/arrives behind the specified schedule

03

NotDecided

This status is set when the flight is schedule first time. After the flight departs/arrives, the status is modified

For flight types, a master table with the name “Flight Type” is also defined with following structure and data:

 

Flight Type

FTypeID

FTypeName

Description

01

Direct

 

A flight is direct when it moves directly from source to destination location

02

Connected

A flight is connected with moves indirectly from source to destination involving multiple legs.

 

 

We suppose that the minimum date granularity is month so a date dimension with month granularity (as minimum) is defined as follows:

 

 

 

Month_dim

Month_Dim_ID

Month_Number

Month_Name

Year

001

1

Jan

2013

002

2

Feb

2013

003

3

Mar

2013

 

Grain

The grain describe the level of detail.

It can be also see as:

At the top level, there are two main options in choosing the level of granularity:

  • Unsummarized/Atomic (transaction level granularity): this is the highest level of granularity where each fact table row corresponds to a single transaction or line item
  • Summarized: transactions may be summarized by a subset of dimensions or dimensional attributes. In this case, each row in the fact table corresponds to multiple transactions

The most granular or atomic data (atomic as an indivisible unit of work) has the most dimensionality. Atomic data is highly dimensional. Preferably, you should develop dimensional models for the most atomic information captured by a Event. Atomic data is the most detailled information collected: such data cannot be subdivided further.

Example: If a high grain is the month whereas a low or detail grain can be the day

A data warehouse almost always demands data to the lowest possible grain of each dimension not because queries want to see individual low level rows but because queries need to cut through the details in very precise ways.

The lower the level of granularity (or conversely, the higher the level of summarization), the less storage space required and the faster queries will be executed.

Example of grain

  • An individual line item on a customer's retail sales ticket as measured by a scanner device
  • A line item on a bill received from a doctor
  • An individual boarding pass to get on a flight
  • A daily snapshot of the inventory levels for each product in a warehouse
  • A monthly snapshot for each bank account

The basic Table would be!

Exam Center Session Total papers Exam
VU001 Morning 100 2013
VU002 Evening 110 2013
VU002 Evening 150 2013

Now keeping in mind the qualities of Grain select Grain factor from it:

Further keeping in mind the qualioty of fact that it keep changing select Fact from it:

Finally Keeping in mind the quaity of aggregates that they might be single valued or they might be multi_valued select aggregates from it:

Hope this much info will be enough!

Best of luck!

RSS

© 2020   Created by +M.Tariq Malik.   Powered by

Promote Us  |  Report an Issue  |  Privacy Policy  |  Terms of Service

.