CS614 Short Notes - CS614 Short Questions Answers

Online Study Material, Short Notes, Handouts, Study Resource Links and much more

CS614 - Data Warehousing Notes

Why are both aggregation and summarization required? - Data Warehousing

Briefly describe the snowflake schema - Data Warehousing

Difference between low granularity and high granularity - Data Warehousing

CDC: time stamping, triggers and partitioning, which is the best? Give the reason

Aggregates or hardware: which is best to enhance the DWH?

Factors behind poor data quality - Data Warehousing

Differentiate between MOLAP and ROLAP implementation - Data Warehousing

How is a cube created in ROLAP? (CS614 - Data Warehousing)

How does aggregate awareness help the users? (CS614 - Data Warehousing)

Timestamp - (CS614 - Data Warehousing)

Classification Process and Accuracy Measurement - (CS614 - Data Warehousing)

Data parallelism - Data Warehousing

Purposes of Data Profiling - Data Warehousing

Real life Examples of Clustering - Data Warehousing

Explain Additive and Non-additive Facts (Data Warehousing)

Clustering and Association Rules

Reason for summarization during data transformation

One-to-One Transformation and One-to-many Transformation

Replies to This Discussion

Purposes of Data Profiling - Data Warehousing

Q. Describe the purposes of Data Profiling.

Data profiling is a powerful method for getting an idea of the quality of data. While profiling data, we run queries to identify:
• Inconsistencies in date formats
• Invalid values
• Missing date values
• Violations of business rules

Reference: CS614 - Data Warehousing - Handouts Page No. 477
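
As a rough illustration of the profiling checks listed above, here is a minimal Python sketch. The sample rows, the field names (`order_date`, `ship_date`), the expected date format and the business rule "an order cannot ship before it is placed" are all assumptions made up for this example; they are not from the handouts.

```python
from datetime import datetime

# Hypothetical sample rows; the field names are assumptions for illustration.
rows = [
    {"id": 1, "order_date": "2021-03-14", "ship_date": "2021-03-16"},
    {"id": 2, "order_date": "14/03/2021", "ship_date": "2021-03-15"},
    {"id": 3, "order_date": "", "ship_date": "2021-03-20"},
    {"id": 4, "order_date": "2021-02-30", "ship_date": "2021-03-01"},
    {"id": 5, "order_date": "2021-03-10", "ship_date": "2021-03-08"},
]

EXPECTED_FORMAT = "%Y-%m-%d"  # assumed standard date format


def parse(value, fmt=EXPECTED_FORMAT):
    """Return a datetime if value matches fmt, otherwise None."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


def profile(rows):
    """Collect the ids of rows that show each kind of quality problem."""
    report = {"missing_date": [], "inconsistent_format": [],
              "invalid_date": [], "rule_violation": []}
    for row in rows:
        raw = row["order_date"]
        if not raw:
            report["missing_date"].append(row["id"])
            continue
        order = parse(raw)
        if order is None:
            # A valid date in another format vs. a date that cannot exist.
            if parse(raw, "%d/%m/%Y") is not None:
                report["inconsistent_format"].append(row["id"])
            else:
                report["invalid_date"].append(row["id"])
            continue
        ship = parse(row["ship_date"])
        # Assumed business rule: an order cannot ship before it is placed.
        if ship is not None and ship < order:
            report["rule_violation"].append(row["id"])
    return report


print(profile(rows))
```

In a real warehouse these checks would normally be SQL queries against the source tables; the loop here only shows what each query would be looking for.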

Real life Examples of Clustering - Data Warehousing

Q. Give Some Real life Examples of clustering.

Below are some real-life examples of clustering:

Marketing:
Discovering distinct groups in customer databases, such as customers who make a lot of long-distance calls and don’t have a job. Who are they? Students. Marketers use this knowledge to develop targeted marketing programs.

Insurance:
Identifying groups of crop insurance policy holders with a high average claim rate. Farmers crash crops when it is “profitable”.

Land use: 
Identification of areas of similar land use in a GIS database.

Seismic studies: 
Identifying probable areas for oil/gas exploration based on seismic data.

Reference: CS614 - Data Warehousing - Handouts Page No. 264

Explain Additive and Non-additive Facts (Data Warehousing)

Q. Explain additive and non-additive facts with examples.

Additive Facts:-

Additive facts are those facts which give the correct result when added together.
Examples of such facts are number of items sold and sales amount.

Non-Additive Facts:-

Non-additive facts can also be added, but the addition gives incorrect results.
Examples of non-additive facts are averages, discounts, ratios etc.

Ref: Handouts Page No. 104
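
A small worked example may help here. The two store rows and their numbers below are made up for illustration only: items sold and sales amount add up correctly across stores, while the average price per item does not.

```python
# Hypothetical fact rows: (store, items_sold, sales_amount)
facts = [
    ("A", 10, 100.0),
    ("B", 40, 200.0),
]

# Additive facts: summing across stores gives correct totals.
total_items = sum(items for _, items, _ in facts)
total_sales = sum(amount for _, _, amount in facts)

# Non-additive fact: average price per item within each store.
avg_prices = [amount / items for _, items, amount in facts]

# Adding the averages is possible, but the result means nothing.
wrong_average = sum(avg_prices)

# The correct overall average is recomputed from the additive facts.
correct_average = total_sales / total_items

print(total_items, total_sales)        # 50 300.0
print(wrong_average, correct_average)  # 15.0 6.0
```

So a non-additive fact such as an average is usually rebuilt from its additive components rather than summed directly.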

Clustering and Association Rules

Clustering:

Outlier records can be identified using clustering based on Euclidean (or other) distance. Existing clustering algorithms provide little support for identifying outliers. However, in some cases clustering the entire record space can reveal outliers that are not identified by field-level inspection. The main drawback of this method is computational time: clustering algorithms have high computational complexity, so for large record spaces and large numbers of records their run time is prohibitive.
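
A minimal Python sketch of this idea follows. It is not the algorithm from the handouts: it assumes a simple single-linkage grouping with a made-up distance `threshold`, and treats any record that ends up alone in its cluster as an outlier. The sample points are hypothetical.

```python
import math


def cluster(records, threshold):
    """Group records that lie within `threshold` Euclidean distance of each
    other (single linkage). Every record is compared with earlier records,
    so the cost grows roughly quadratically with the number of records."""
    clusters = []
    for rec in records:
        near, far = [], []
        for c in clusters:
            close = any(math.dist(rec, other) <= threshold for other in c)
            (near if close else far).append(c)
        # Merge the new record with every cluster it is close to.
        merged = [rec] + [other for c in near for other in c]
        clusters = far + [merged]
    return clusters


def outliers(records, threshold, min_size=2):
    """Records whose cluster has fewer than `min_size` members."""
    return [rec for c in cluster(records, threshold)
            if len(c) < min_size for rec in c]


# Hypothetical two-field numeric records.
records = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1),
           (5.0, 5.0), (5.1, 4.9), (9.0, 0.5)]

print(outliers(records, threshold=0.5))  # [(9.0, 0.5)]
```

Each field of `(9.0, 0.5)` could look acceptable on its own; it only stands out when the whole record is compared with the others, which is the point the answer makes about field-level inspection.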

Association rules:

Association rules with high confidence and support define a different kind of pattern. As before, records that do not follow these rules are considered outliers. The power of association rules is that they can deal with data of different types. However, Boolean association rules do not provide enough quantitative and qualitative information.

Ref: Handouts Page No. 146
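
To make support and confidence concrete, here is a small Python sketch. The records, the field names and the rule `city = Lahore -> country = PK` are assumptions for illustration; records that match the left side of a high-confidence rule but not the right side are reported as possible outliers.

```python
def matches(record, condition):
    """True if the record has every field value required by the condition."""
    return all(record.get(k) == v for k, v in condition.items())


def rule_stats(records, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    n_ante = sum(1 for r in records if matches(r, antecedent))
    n_both = sum(1 for r in records
                 if matches(r, antecedent) and matches(r, consequent))
    support = n_both / len(records)
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


def violators(records, antecedent, consequent):
    """Records that satisfy the antecedent but break the consequent."""
    return [r for r in records
            if matches(r, antecedent) and not matches(r, consequent)]


# Hypothetical records.
records = [
    {"city": "Lahore", "country": "PK"},
    {"city": "Lahore", "country": "PK"},
    {"city": "Lahore", "country": "PK"},
    {"city": "Lahore", "country": "IN"},
    {"city": "Karachi", "country": "PK"},
]

rule = ({"city": "Lahore"}, {"country": "PK"})
print(rule_stats(records, *rule))  # (0.6, 0.75)
print(violators(records, *rule))   # [{'city': 'Lahore', 'country': 'IN'}]
```

Because the conditions are plain equality tests on field values, the same code works for text, numeric or categorical fields. This is a Boolean rule, so it only says whether a record follows the rule or not.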

Reason for summarization during data transformation

Q. Describe the reason for summarization during data transformation.

Summarization makes the transformation of data easier when the lowest level of detail is not needed. For example, for a grocery chain, sales data at the lowest level of detail for every transaction at the checkout may not be needed. Storing sales by product by store by day in the data warehouse may be quite adequate. So, in this case, the data transformation function includes summarization of daily sales by product and by store.

Reference: CS614 Handouts Page No. 136
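
The grocery chain example can be sketched in a few lines of Python. The transaction rows and the tuple layout are made up for illustration; the function rolls individual checkout transactions up to one total per product, store and day.

```python
from collections import defaultdict

# Hypothetical checkout transactions: (store, product, date, amount)
transactions = [
    ("S1", "milk", "2021-03-14", 2.50),
    ("S1", "milk", "2021-03-14", 2.50),
    ("S1", "bread", "2021-03-14", 1.25),
    ("S2", "milk", "2021-03-14", 2.50),
    ("S1", "milk", "2021-03-15", 2.50),
]


def summarize_daily(transactions):
    """Sum sales amounts by (product, store, day)."""
    totals = defaultdict(float)
    for store, product, date, amount in transactions:
        totals[(product, store, date)] += amount
    return dict(totals)


summary = summarize_daily(transactions)
for key, total in sorted(summary.items()):
    print(key, total)
```

Five transaction rows become four summary rows here; with real checkout data the reduction is far larger, which is why the summarized level may be adequate for the warehouse.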

One-to-One Transformation and One-to-many Transformation

Definitions by Umair Saulat:

One-to-One Transformation:-

• A simple scalar transformation is a one-to-one mapping from one set of values to another set of values.
• It is sufficient to ensure that the transformation is one-to-one.
• It provides a design environment for creating data transformation applications.
• The transformation functions are polynomials.
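
A minimal Python sketch of a scalar one-to-one transformation, assuming a made-up code table (`GENDER_CODES`) as the mapping. The helper `is_one_to_one` checks that no two source values map to the same target value.

```python
# Hypothetical code table: each source code maps to exactly one target value.
GENDER_CODES = {"M": "Male", "F": "Female"}


def is_one_to_one(mapping):
    """True if no two source values share the same target value."""
    return len(set(mapping.values())) == len(mapping)


def transform(values, mapping):
    """Apply the scalar mapping to each source value."""
    return [mapping[v] for v in values]


print(is_one_to_one(GENDER_CODES))                 # True
print(transform(["M", "F", "M"], GENDER_CODES))    # ['Male', 'Female', 'Male']
```

Each input value produces exactly one output value, so the number of columns does not change.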

  

One-to-Many Transformation:-

• A one-to-many transformation is more complex than a scalar transformation.
• One data element from the source system results in several columns in the DW.
• Code generation can also create transformations in easy-to-maintain computer languages such as Java or XSLT.
• A data transformation converts data from a source data format into a destination data format.
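
As a rough sketch of a one-to-many transformation, the Python below splits one source element into several target columns. The comma-separated address layout and the column names (`house`, `street`, `city`) are assumptions for illustration.

```python
def split_address(address):
    """Split one source address string into three target columns.

    Assumes the hypothetical layout "house, street, city".
    """
    parts = [p.strip() for p in address.split(",")]
    if len(parts) != 3:
        raise ValueError(f"unexpected address layout: {address!r}")
    house, street, city = parts
    return {"house": house, "street": street, "city": city}


print(split_address("House 12, Street 5, Lahore"))
# {'house': 'House 12', 'street': 'Street 5', 'city': 'Lahore'}
```

Real source data rarely follows a single layout, so in practice the parsing step is where most of the extra complexity of a one-to-many transformation comes from.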

© 2021 Created by M.Tariq Malik.