CS614 ALL Current Mid Term Papers Fall 2014 & Past Mid Term Papers at One Place from 10 January 2015 to 25 January 2015


You can download solved midterm papers, short notes, lecture-wise question/answer files, solved MCQs, solved quizzes, and solved midterm subjective and objective papers from this discussion to prepare for the Fall 2014 midterm papers.


You can download papers with a simple click on the file. All uploaded files are in ZIP or PDF format, so install Adobe Reader and WinRAR to open these files.

Note: If you download files with Internet Download Manager (IDM), you may face a problem of damaged files. All the files are correct; none are damaged. If you face this problem, open IDM > click on Options > click on File Types > remove PDF and ZIP from the file types and save. After that, download the files again and they will work properly. Or press Alt+click on the file ...




Replies to This Discussion

CS614 ALL Current Mid Term Papers Fall 2014

Spring 2014 All Current Midterm Papers solved by Humda
1. Additive and non-additive facts (2 marks)
There can be two types of facts, i.e. additive and non-additive.
Additive facts are those facts which give the correct result by an addition operation. In other words, additive facts are easy to work with: summing the fact values gives meaningful results. Examples of such facts could be number of items sold, sales amount, etc.
Non-additive facts can also be added, but the addition gives incorrect results. Examples of non-additive facts are averages, discounts, ratios, etc. (page 119)
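A minimal Python sketch of the difference, using made-up daily sales rows (the numbers and column layout are illustrative only, not from the handouts): summing an additive fact such as sales amount is meaningful, while summing or naively averaging a ratio-type fact such as a discount percentage gives a wrong answer; the correct overall ratio must be recomputed from the underlying additive facts.

```python
# Illustrative rows: (quantity_sold, sales_amount, discount_pct)
rows = [
    (10, 1000.0, 10.0),   # 10% discount on a 1000-rupee sale
    (40, 4000.0, 50.0),   # 50% discount on a 4000-rupee sale
]

# Additive facts: summing gives a meaningful total.
total_qty = sum(r[0] for r in rows)        # 50 items
total_amount = sum(r[1] for r in rows)     # 5000.0

# Non-additive fact: naively averaging the percentages is misleading.
wrong_discount = sum(r[2] for r in rows) / len(rows)   # 30.0%

# Correct: recompute the ratio from additive components.
discount_given = sum(r[1] * r[2] / 100 for r in rows)  # 2100.0
true_discount = 100 * discount_given / total_amount    # 42.0%

print(total_qty, total_amount, wrong_discount, true_discount)
```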
2. One-to-many transformation with example (3 marks)
To store information we need a one-to-many transformation of names. We need to transform the name of each student into 3 columns:
• First Name
• Last Name
• Student Name (middle part of name)
This type of transformation requires scripts. We will write VB Scripts for such transformations.
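The lectures use VB Script for this; the sketch below shows the same one-to-many split in Python instead. The function name and the space-separated three-part-name assumption are illustrative, not from the course material.

```python
def split_name(full_name):
    """Split one name field into (first, middle, last): a one-to-many
    transformation. Assumes space-separated name parts."""
    parts = full_name.strip().split()
    first = parts[0]
    last = parts[-1] if len(parts) > 1 else ""
    middle = " ".join(parts[1:-1])  # empty when there is no middle part
    return first, middle, last

# Illustrative usage
for name in ["Muhammad Ali Khan", "Sara Ahmed"]:
    print(split_name(name))
# ('Muhammad', 'Ali', 'Khan')
# ('Sara', '', 'Ahmed')
```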
3. A statement about timestamps was given; we had to say whether it is true or false, with a reason.
"A timestamp in the product table to record the change."
The tables in some operational systems have timestamp columns. The timestamp specifies the time and date that a given row was last modified. If the tables in an operational system have columns containing timestamps, then the latest data can easily be identified using the timestamp columns. If the timestamp information is not available in an operational source system, you will not always be able to modify the system to include timestamps. Such modification would require, first, modifying the operational system's tables to include a new timestamp column and then creating a trigger to update the timestamp column following every operation that modifies a given row.
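A small Python sketch of timestamp-driven extraction under these assumptions: each source row carries a last_modified timestamp and the warehouse remembers the time of its last extract. All names and dates here are hypothetical.

```python
from datetime import datetime

# Hypothetical source rows, each with a last_modified timestamp.
products = [
    {"id": 1, "price": 100, "last_modified": datetime(2014, 12, 1)},
    {"id": 2, "price": 250, "last_modified": datetime(2015, 1, 12)},
]

last_extract_time = datetime(2015, 1, 1)

# Incremental extraction: pick up only rows changed since the last extract.
changed = [r for r in products if r["last_modified"] > last_extract_time]
print(changed)  # only product 2 qualifies
```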
4. Diagrams of a star and a snowflake schema were given; we had to identify which schema was which.
Star schema: designs usually used to facilitate ROLAP querying.
A star schema is generally considered to be the most efficient design for two reasons. First, a design with de-normalized tables encounters fewer join operations. Second, most optimizers are smart enough to recognize a star schema and generate access plans that use efficient "star join" operations. It has been established that a "standard template" data warehouse query directly maps to a star schema.
Diagrams: pages 106 and 110.
Snowflake schema: Sometimes a pure star schema might suffer performance problems. This can occur when a de-normalized dimension table becomes very large and penalizes the star join operation. Conversely, sometimes a small outer-level dimension table does not incur a significant join cost because it can be permanently stored in a memory buffer. Furthermore, because a star structure exists at the center of a snowflake, an efficient star join can be used to satisfy part of a query. Finally, some queries will not access data from outer-level dimension tables. These queries effectively execute against a star schema that contains smaller dimension tables. Therefore, under some circumstances, a snowflake schema is more efficient than a star schema.
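To make the star join concrete, here is a minimal sketch with plain Python dicts standing in for one fact table and two dimension tables. All table and column names are invented for illustration; a real star schema lives in a relational database.

```python
# Dimension tables: one row per key (de-normalized in a star schema).
product_dim = {1: {"name": "Soap", "category": "Grocery"}}
store_dim = {7: {"city": "Lahore"}}

# Fact table: foreign keys into each dimension plus additive measures.
sales_fact = [
    {"product_id": 1, "store_id": 7, "quantity": 25, "amount": 500.0},
]

# A "star join": resolve each fact row against its dimensions.
for f in sales_fact:
    p, s = product_dim[f["product_id"]], store_dim[f["store_id"]]
    print(p["category"], s["city"], f["quantity"], f["amount"])
```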
6. (5 marks) A table was given:

Product Id   Region ID   Period   Quantity
1            N           Month    25
2            N           Month    50
2            S           Week     30

Find the primary key & dimensions.
Answer:
Primary key: Product Id, Region Id
Dimensions: Period, Quantity
1. Automatic Data Cleansing
1) Statistical
2) Pattern Based
3) Clustering
4) Association Rules
2. List down any three ways of "handling missing data" during the "data cleansing process".
Ways of handling missing data:
• Dropping records.
• "Manually" filling missing values.
• Using a global constant as filler.
• Using the attribute mean (or median) as filler.
• Using the most probable value as filler.
The data cleansing process is described as semi-automatic, but it cannot be performed without the involvement of a domain expert.
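A minimal pandas sketch of four of these fillers on a toy column; the data and column name are invented, and this is an illustration, not code from the handouts.

```python
import pandas as pd

# Toy attribute with missing values.
s = pd.Series([10.0, None, 30.0, None, 10.0], name="age")

dropped = s.dropna()                      # 1) drop records
const_filled = s.fillna(0)                # 2) global constant as filler
mean_filled = s.fillna(s.mean())          # 3) attribute mean as filler
mode_filled = s.fillna(s.mode().iloc[0])  # 4) most probable value as filler

print(mean_filled.tolist())  # [10.0, 16.67, 30.0, 16.67, 10.0] (approx.)
```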
3. Basic tasks of Data Transformation:
• Selection
• Splitting/Joining
• Conversion
• Summarization
• Enrichment
One question was regarding Data Extraction:
Extraction is the operation of extracting data from a heterogeneous source system for further use in a data warehouse environment. This is the first step of the ETL process. After the extraction, this data can be transformed, cleansed and loaded into the data warehouse.
Identify at least one factor that leads to the following relation: "Normally, performance increases when more of the disk is used."
Answer: the performance vs. space trade-off.
Q. Identify whether the given statements are correct or incorrect. (5 marks)
1. "An intelligent learning organization never shares its information with its employees."
Answer: Incorrect.
Correct is: The intelligent learning organization shares information openly across the enterprise in a way that maximizes the throughput of the entire organization. (Page 181)
2. "Orr's Law says that data which is not used is always correct."
Answer: Incorrect.
Correct is: Law 1: "Data that is not used cannot be correct!" (Page 181)
Q. Identify whether the given statement is correct or incorrect. (3 marks)
"If defects are found in the process of attribute domain validation, then it is better to fix the errors in the DWH and leave the data source as it is."
Answer: Incorrect.
Correct is: if at all possible, fix the problem in the source system. People have the tendency of applying fixes in the DWH. This is wrong, i.e. if you are fixing the problems in the DWH, you are not fixing the root cause. (Page 190)
Q. Identify whether the given statement is correct or incorrect. (3 marks)
"We can dissolve the aggregation to get the original data from which the aggregates were created."
Answer: Incorrect.
Correct is: Aggregation is one-way, i.e. you can create aggregates, but you cannot dissolve aggregates to get the original data from which the aggregates were created. (Page 113)
Q. Mention one factor that leads towards the long load time of a MOLAP cube. (2 marks)
Long load time (pre-calculating the cube may take days!). The biggest drawback is the extremely long time taken to pre-calculate the cubes; remember that in MOLAP all possible aggregates are calculated.
1) Two ways to simplify an ER model?
Answer: De-normalization and dimensional modeling.
2) Four data validation techniques?
Answer:
• Referential Integrity (RI)
• Attribute domain
• Using Data Quality Rules
• Data Histograming
3) Identify the following statement as correct or incorrect. Justify your answer in either case.
"The less likely something is to happen, the less traumatic it will be when it happens."
Answer: Incorrect.
Law #5: "The less likely something is to occur, the more traumatic it will be when it happens!" (Page 182)
6) Suppose there is a sales table whose grain is "total sales by day by store". Identify at least 3 facts.
Answer: think it through and write 3 facts, such as total products sold, total sales amount, etc.
Answer: prepare the 2nd assignment solution for this question.
Lexical error:
For example, assume the data to be stored in table form with each row representing a tuple and each column an attribute. If we expect the table to have five columns because each tuple has five attributes, but some or all of the rows contain only four columns, then the actual structure of the data does not conform to the specified format.
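A tiny Python sketch of detecting this kind of lexical error: each row is checked against the expected attribute count before loading. The five-column expectation mirrors the example above; the row data is invented.

```python
EXPECTED_COLS = 5

rows = [
    ["1", "Ali", "Lahore", "2014-12-01", "500"],  # conforms
    ["2", "Sara", "Karachi", "300"],              # lexical error: 4 columns
]

for i, row in enumerate(rows):
    if len(row) != EXPECTED_COLS:
        print(f"row {i}: expected {EXPECTED_COLS} columns, got {len(row)}")
```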
Aggregate: refers to a summarization coupled with a calculation across different business elements. An example of aggregation is the addition of bimonthly salary to monthly commission and bonus to arrive at monthly employee compensation values.
Transformation:
A statement about MOLAP was given, and we were asked whether it is correct or not, with justification.
What is the merge/purge problem?
Within the data warehousing field, data cleansing is applied especially when several databases are merged. Records referring to the same entity are represented in different formats in the different data sets, or are represented erroneously. Thus, duplicate records will appear in the merged database. The issue is to identify and eliminate these duplicates. The problem is known as the merge/purge problem.
There was a question about aggregates.
1. Data cleansing and its role. (5 marks)
Data cleansing is the 3rd step in ETL. It is the activity that is used to remove noise from the input data before bringing it into the DWH environment. Data cleansing is vitally important to the overall health of your warehouse project and ultimately affects the health of your company. Do not take this statement lightly.
The original aim of data cleansing was to eliminate duplicates in a data collection, a problem occurring already in single database applications that gets worse when integrating data from different sources.
Data cleansing is much more than simply updating a record with good data.
2. A table was given and we were required to identify the fact, dimensions, and PK from the table. (5 marks)
3. Performance vs. more use of disk space. (3 marks)
Performance vs. Space Trade-off:
"Maximum performance boost implies using lots of disk space for storing every pre-calculation."
If storage is not an issue, then just pre-compute every cube at every unique combination of dimensions at every level, as it does not cost anything. This will result in maximum query performance. But in reality, this implies a huge cost in disk space and in the time for constructing the pre-aggregates.
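A back-of-the-envelope Python sketch of why pre-computing everything explodes: a full cube pre-computes one aggregate per subset of dimensions, so d dimensions give 2^d group-bys before hierarchy levels are even counted. The dimension names are made up.

```python
from itertools import combinations

dims = ["product", "store", "time", "region"]

# Every subset of dimensions is one pre-computed aggregate (group-by).
groupbys = [c for r in range(len(dims) + 1) for c in combinations(dims, r)]
print(len(groupbys))  # 2**4 = 16 aggregate tables for just 4 dimensions
for g in groupbys[:4]:
    print(g)          # (), ('product',), ('store',), ('time',), ...
```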
Steps of the BSN method:
Basic Sorted Neighborhood (BSN) Method
Concatenate data into one sequential list of N records.
Step 1: Create keys
Compute a key for each record in the list by extracting relevant fields or portions of fields.
Step 2: Sort data
Sort the records in the data list using the key of Step 1.
Step 3: Merge
Move a fixed-size window through the sequential list of records, limiting the comparisons for matching records to those records in the window. If the size of the window is w records, then every new record entering the window is compared with the previous w-1 records.
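A compact Python sketch of the three BSN steps on toy records. The key function and the match rule here are deliberately simplistic stand-ins; real BSN uses carefully engineered keys and a fuzzy matcher.

```python
def make_key(rec):
    # Step 1: key from portions of relevant fields (first 3 letters of
    # name plus first 3 of city -- an illustrative choice only).
    return rec["name"][:3].upper() + rec["city"][:3].upper()

def bsn(records, w=3):
    # Step 2: sort by key.
    records = sorted(records, key=make_key)
    duplicates = []
    # Step 3: slide a window of size w; compare each record only
    # with the previous w-1 records.
    for i, rec in enumerate(records):
        for prev in records[max(0, i - (w - 1)):i]:
            if make_key(prev) == make_key(rec):  # naive match rule
                duplicates.append((prev["name"], rec["name"]))
    return duplicates

people = [
    {"name": "Muhammad Ali", "city": "Lahore"},
    {"name": "Muhammed Ali", "city": "Lahore"},
    {"name": "Sara Ahmed", "city": "Karachi"},
]
print(bsn(people))  # [('Muhammad Ali', 'Muhammed Ali')]
```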
Steps of transformation:
ETL: Extract, Transform, Load.
There was one 3-mark statement each on facts and transformation, and two 5-mark statements each, on transformation, cleansing, logical data extraction, etc.
LOGICAL DATA EXTRACTION:
Full Extraction
• The data is extracted completely from the source system.
• No need to keep track of changes.
• Source data is made available as-is, without any additional information.
Incremental Extraction
• Data is extracted after a well-defined point/event in time.
• A mechanism is used to reflect/record the temporal changes in data (column or table).
• Sometimes entire tables are off-loaded from the source system into the DWH.
• Can have significant performance impacts on the data warehouse server.
Q4. Identify the following statement as correct or incorrect. (3 marks)
"One of the basic functions of OLTP is to show the historical background of an organization."
Answer: Incorrect.
Correct is: OLTP systems don't keep history; you can't get a balance statement more than a year old.
Q5. Identify the given statements as correct or incorrect and justify your answer in either case.
1. "Transformation is the process in which we extract the data from single/multiple data sources."
Answer: Incorrect.
Correct is: that is the definition of extraction; extraction is the process in which we extract the data from multiple data sources, while transformation converts the extracted data into the required form.
2. "Offline Extraction is a type of Logical Data Extraction."
Answer: Incorrect.
Correct is: "Offline Extraction is a type of Physical Data Extraction."
Q. Identify the given statement as correct or incorrect and justify your answer in either case. (3 marks)
"Standard Query Language cannot be used for querying the MOLAP cube."
Answer: Correct.
Justification: standard SQL cannot be used for querying a MOLAP cube; MOLAP cubes are queried through their own multidimensional interfaces.
Q. The problems associated with the extracted data can correspond to non-primary keys. List down any four problems associated with the non-primary key. (2 marks)
Four problems associated with the non-primary key:
• Same primary key but different data
• Same entity with different keys
• Primary key in same information
• Source might contain invalid data
My CS614 paper... some new MCQs from my paper, with solutions:
1. The telecommunication data warehouse is dominated by the sheer volume of data generated at the call level _________ area.
• Subject (page 35)
• Object
• Aggregate
• Details
2. 4NF has an additional requirement, which is:
• Data is in 3NF and no null key dependency
• Data is in 2NF and no multi-valued dependency
• Data is in 3NF and no multi-valued dependency (page 48)
• Data is in 3NF and no foreign key table
3. 3NF removes even more data redundancy than 2NF, but at the cost of:
• Simplicity and performance (page 48)
• Complexity
• No. of tables
• Relations
4. In full extraction, data is extracted completely from the source. No need to keep track of changes to the _________
• Data source (page 133)
• DWH
• Data mart
• Data destination
5. Which is not a characteristic of a DWH?
• Ad-hoc access
• Complete repository
• Historical data
• Volatile (page 27)
6. Experience showed that for a single pass of a magnetic tape that scanned 100% of the records, only ________ of the records were actually used.
• 5% (page 12)
• 30%
• 50%
• 80%
7. HOLAP provides a combination of relational database access and "cube" data structures within a single framework. The goal is to get the best of both MOLAP and ROLAP:
• Scalability and high performance (page 78)
8. ____________ are created out of the data warehouse to service the needs of different departments such as marketing, sales, etc.
• MIS
• OLAPs
• Data marts (page 31)
• None of the given
Subjective
1) Two limitations of aggregation. (2 marks)
2) Realistic data quality. (2 marks)
Answer: Degree of utility or value of data to business. (page 180)
3) Check whether the given statement is correct or not. Justify in either case:
"Rollup is a cube operation used to change the view of the table." (3 marks)
4) Three drawbacks of data redundancy. (3 marks)
5) In a DWH, data is collected from heterogeneous sources, which creates redundancy. How does it affect the decisions of an organization? (5 marks)
Subjective Q3 was: "Rollup is the cube operation used to change the view of data."
Answer: Incorrect.
Rollup is used to summarize data. (page 80)
Cube operations:
• Rollup: summarize data
• Drill down: get more details
• Slice and dice: select and project
• Pivot: change the view of data
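A small pandas sketch of two of these operations on toy sales data (column names invented): rollup as a group-by aggregation up a level, and pivot as a re-orientation of the same numbers without changing them.

```python
import pandas as pd

sales = pd.DataFrame({
    "region":   ["N", "N", "S"],
    "period":   ["Jan", "Feb", "Jan"],
    "quantity": [25, 50, 30],
})

# Rollup: summarize data up a level (drop 'period', keep 'region').
rollup = sales.groupby("region")["quantity"].sum()
print(rollup)   # N -> 75, S -> 30

# Pivot: change the view of the data; the numbers stay the same.
pivot = sales.pivot_table(index="region", columns="period",
                          values="quantity", aggfunc="sum")
print(pivot)    # regions as rows, periods as columns
```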

Subjectives are:
1) Two ways to simplify ER?
Answer: De-normalization and dimensional modelling.
2) Four data validation techniques?
Answer: See Lecture 22.
3) Identify the following statement as correct or incorrect. Justify your answer in either case:
"The less likely something is to happen, the less traumatic it will be when it happens."
Answer: incorrect statement; see Orr's Law no. 5.
4) (I forgot this one.)
5) There were 2 statements; we had to say whether they are correct or not:
Statement 1: "Intelligent organizations never share their info with their employees."
Statement 2: "Data which is not used will always be correct."
Answer: both statements are incorrect.
6) Suppose there is a sales table whose grain is "total sales by day by store". Identify at least 3 facts.
Answer:
• Quantity sold
• Amount
• Sales volume
• Total Rs. sales

Best of luck. Pray for me.
Thanks

Someone please share a paper, and also tell whether anything came from the past papers or not.

Today I had my CS614 paper at 11 AM.

The MCQs were not from past papers; all were new and difficult too, with very long statements.

The long-answer part was easy.

There was a table on additive/non-additive facts. (5 marks)

In data extraction, we had to write about data capture. (2 marks)

We had to write the disadvantages of MOLAP. (2 marks)

A scenario about validation techniques was given; we had to state which technique was used and also explain it. (5 marks)

There were 2 statements:

Out of the 5 laws given, whether the 4th law is correct or not. (3 marks)

There was a statement about aggregates; we had to say whether it is correct or not. (3 marks)

Best of luck, and kindly pray for me.

thanks

Humda, thanks. Keep it up.


Which cube operation changes the view of data?

 

Ans: Pivot: change the view of data

Q.23 Change data capture (CDC) is the most important step in data extraction. Why? (3 marks)

 


Answer (page 149)
Without Change Data Capture, database extraction is a cumbersome process in which you
move the entire contents of tables into flat files, and then load the files into the data
warehouse. This ad hoc approach is expensive in a number of ways.

Q.24 Identify the following statement as correct or incorrect. (3 marks)
"One of the basic functions of OLTP is to show the historical background of an organization."
Answer (page 122): Incorrect.
OLTP & Slowly Changing Dimensions:
• OLTP systems are not good at tracking the past; history never changes.
• OLTP systems are not "static"; they are always evolving, with data changing by overwriting.
• OLTP systems are unable to track history; data is purged after 90 to 180 days.
We actually don't want to keep historical data in an OLTP system.
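A minimal sketch of the contrast, assuming a customer address change (the structure and field names are invented): the OLTP table overwrites in place and loses the old value, while a warehouse-style slowly changing dimension appends a new row so history survives.

```python
# OLTP style: update in place -- the old address is gone forever.
oltp_customer = {"id": 1, "address": "Lahore"}
oltp_customer["address"] = "Karachi"   # history lost by overwriting

# DWH style (type-2 SCD): close the old row, append a new one.
dim_customer = [
    {"id": 1, "address": "Lahore",  "valid_from": "2014-01-01",
     "valid_to": "2014-12-31"},
    {"id": 1, "address": "Karachi", "valid_from": "2015-01-01",
     "valid_to": None},   # current row
]
# Both versions remain queryable, so history is preserved.
print(dim_customer)
```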


Q.25 List down any three ways of “handling missing data” during “data cleansing process”. (5 marks)


Answer (page 162)
• Dropping records.
• “Manually” filling missing values.
• Using a global constant as filler.
• Using the attribute mean (or median) as filler.
• Using the most probable value as filler.

Q.26 Suppose there is a table "sale". The grain is "sales by day by product by store". Identify at least three facts so that the sales table can easily be built. (5 marks)


Answer (page 74)
• Quantity sold
• Amount
• Sales volume
• Total Rs.sales

 


The objective part was almost 40% from Moaaz's files; the subjective part was totally outside Moaaz's file. All subjective questions were from the last 10 lectures.

Remember me in your prayers!



Disadvantages of MOLAP:

MOLAP stands for Multidimensional Online Analytical Processing. MOLAP is the most used storage type. It is designed to offer maximum query performance to the users. The data and aggregations are stored in a multidimensional format, compressed and optimized for performance. When a cube with MOLAP storage is processed, the data is pulled from the relational database, the aggregations are performed, and the data is stored in the AS database in the form of binary files. The data inside the cube will refresh only when the cube is processed, so latency is high.

Advantages:

  • Since the data is stored on the OLAP server in optimized format, queries (even complex calculations) are faster than in ROLAP.
  • The data is compressed, so it takes up less space.
  • And because the data is stored on the OLAP server, you don't need to keep the connection to the relational database.
  • Cube browsing is fastest using MOLAP.

Disadvantages:

  • It doesn't support real time, i.e. newly inserted data will not be available for analysis until the cube is processed.

Mustafa, MCS (mc110403622@vu.edu.pk)

Mustafa Noor thanks 

Today my paper was easy, mostly MCQs from past papers.

Why is change data capture a challenging technique? (2 marks)

Two ways to simplify ER modelling. (2 marks)

One question was about merge/purge.

The 5-mark questions were new.

Thanks for sharing.

My paper was also easy, mostly from old papers. I have shared a file above as well; see my post above for that file.

My paper:

MCQs were from Moaaz's file, except these:

pb value

Which is the right option that describes market, item, etc.

In the subjective part:

1st question was: basic tasks of Data Transformation.

2nd was: the problems associated with the extracted data can correspond to non-primary keys; list down any four problems associated with the non-primary key.

3rd was: what is ranking?

4th was: which method is used to remove duplication?

5th was: what is the relation between dimensional cardinality and the cube in MOLAP?

Also: performance vs. more use of disk space.

Kiran, thanks for sharing.
