CS614 Current Final Term Fall 2013 Shared by Nomi File 5

Q1: Identify whether the following statements are correct or incorrect, and justify your answer in either case: (5)
1. “Hash based indexing keeps the index entries in B-tree structure”.
2. “Just like primary key primary index has to be unique always”.

The first statement is incorrect (page 227). The correct statement is:
In hash-based indexing, index entries are kept in hash-organized tables rather than in B-tree structures.
The second statement is also incorrect (page 229). The correct statement is:
Primary Key (PK) & Primary Index (PI):
PK is ALWAYS unique.
PI can be unique, but does not have to be.

Q2: What are the different issues during data acquisition and cleansing in an agricultural data warehouse? (5) page 340
Solution:
Step-6: Why the issues?
The major data cleansing issues arose because the data was processed and handled at four levels by different groups of people:
1. Hand recordings by the scouts at the field level.
2. Typing hand recordings into data sheets at the DPWQCP office.
3. Photocopying of the typed sheets by DPWQCP personnel.
4. Data entry or digitization by hired data entry operators.

Q3: How is a gender guide used to fill in missing gender values for a large number of records? (5) page 457
The Gender_guide table contains only two columns: name and gender. Populate the Gender_guide table with a query that selects all distinct first names from the student table, then enter the gender for each name manually. This table then serves as a guide that tells us the likely gender for a particular name. For example, if we have a hundred students in our database with the first name ‘Muhammed’, the Gender_guide table will have just one entry for ‘Muhammed’, and we manually set its gender to ‘Male’.
To fill the missing genders in the exception table, we simply do an inner join of the Error table and the Gender_guide table on the first name.
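
The steps above can be sketched in Python with the standard sqlite3 module. This is only an illustration under assumptions: the table and column names (student, error_table, gender_guide, st_id, first_name) and the sample rows are hypothetical and are not taken from the handouts, and the manual step of entering genders is simulated with hard-coded values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (st_id INTEGER PRIMARY KEY, first_name TEXT, gender TEXT);
    CREATE TABLE error_table (st_id INTEGER PRIMARY KEY, first_name TEXT, gender TEXT);
    CREATE TABLE gender_guide (name TEXT PRIMARY KEY, gender TEXT);
""")

# Hypothetical sample data: two records have a missing (NULL) gender.
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Muhammed", "Male"), (2, "Muhammed", None),
     (3, "Ayesha", None), (4, "Ayesha", "Female")],
)

# Records with a missing gender go to the exception (error) table.
conn.execute(
    "INSERT INTO error_table "
    "SELECT st_id, first_name, gender FROM student WHERE gender IS NULL"
)

# Step 1: populate the guide with all distinct first names.
conn.execute("INSERT INTO gender_guide (name) SELECT DISTINCT first_name FROM student")

# Step 2: a person sets the gender for each name (simulated here).
conn.executemany(
    "UPDATE gender_guide SET gender = ? WHERE name = ?",
    [("Male", "Muhammed"), ("Female", "Ayesha")],
)

# Step 3: inner join the error table with the guide on first name.
resolved = conn.execute("""
    SELECT e.st_id, g.gender
    FROM error_table AS e
    INNER JOIN gender_guide AS g ON g.name = e.first_name
    ORDER BY e.st_id
""").fetchall()

# Write the looked-up genders back to the student table.
conn.executemany(
    "UPDATE student SET gender = ? WHERE st_id = ?",
    [(gender, st_id) for st_id, gender in resolved],
)
```

However many students share a first name, the guide holds one row per distinct name, so the manual effort grows with the number of distinct names and not with the number of records.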

Q5: Data profiling is a process which involves gathering information about a column. What is the purpose of data profiling? (3) page 439
We perform data profiling to identify the degree of transformation required.
Data profiling is a process which involves gathering information about a column by executing certain queries, with the intention of identifying erroneous records. In this process we identify the following:
Total number of values in a column
Number of distinct values in a column
Domain of a column
Values out of domain of a column
Validation of business rules
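
A minimal sketch of such profiling queries, written in Python with the standard sqlite3 module, is shown here. The function name profile_column, the student table, the gender column, and the valid domain {'M', 'F'} are all assumptions chosen for illustration; business-rule validation is not covered because it depends on the rules of the particular system.

```python
import sqlite3

def profile_column(conn, table, column, domain):
    """Gather basic profiling facts about one column.

    `table` and `column` are interpolated into the SQL text, so they must
    come from trusted code, never from user input. `domain` is the set of
    values considered valid for the column (an assumption of this sketch).
    """
    # COUNT(column) counts the non-NULL values only.
    total = conn.execute(f"SELECT COUNT({column}) FROM {table}").fetchone()[0]
    distinct = conn.execute(
        f"SELECT COUNT(DISTINCT {column}) FROM {table}"
    ).fetchone()[0]
    observed = [
        row[0]
        for row in conn.execute(
            f"SELECT DISTINCT {column} FROM {table} WHERE {column} IS NOT NULL"
        )
    ]
    return {
        "total_values": total,
        "distinct_values": distinct,
        "observed_domain": sorted(observed),
        "out_of_domain": sorted(v for v in observed if v not in domain),
    }

# Hypothetical sample data with some dirty gender codes.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (st_id INTEGER PRIMARY KEY, gender TEXT)")
conn.executemany(
    "INSERT INTO student (gender) VALUES (?)",
    [("M",), ("F",), ("M",), ("m",), ("1",), (None,)],
)

report = profile_column(conn, "student", "gender", domain={"M", "F"})
print(report)
```

In this sample the values 'm' and '1' fall outside the assumed domain, which is the kind of finding that tells us how much transformation the column needs.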

Q6: Write down three dynamic attributes of cotton pest scouting. (3) page 342

Q7: What is the ranking in DSS? (3)

Q8: Is the following statement correct or incorrect? If incorrect, justify your answer. (3)
“One way clustering gives local view and two way clustering gives global view”.
The above statement is incorrect (page 271):
Bi-clustering (two-way clustering) gives a local view of your data set, while one-way clustering gives a global view.

Q9: What problems will you face if low priority is given to cube construction? (2) page 313
Low priority for OLAP Cube Construction
Make sure your OLAP cube-building or pre-calculation process is optimized and given the right priority. It is common for the data warehouse to be at the bottom of the nightly batch loads, and after loading the DWH there usually isn't much time left for the OLAP cube to be refreshed. As a result, it is worthwhile to experiment with the OLAP cube generation paths to ensure optimal performance.

Q10: Is there any fixed strategy for standardizing columns? (2) page 480
There are no fixed strategies to standardize the columns.

Q11: What is unsupervised learning in Data Mining? (2) page 271
Unsupervised learning is learning in which you do not know the number of clusters and have no idea about their attributes either. In other words, you do not guide the data mining (DM) process in any way: no guidance and no input.

Q12: Which DML operation is used in OLAP? (2) page 76
In OLAP applications the typical user is an analyst who is interested in selecting data needed for decision support. He/She is not primarily interested in detailed data, but usually in data aggregated over large data sets, as it gives the big picture. A typical OLAP query is to find the average amount of money withdrawn from an ATM by customers who are male and aged between 15 and 25 years, from (say) Jinnah Super Market Islamabad, after 8 pm. For this kind of query there are no DML operations and the DBMS contents do not change.
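
The ATM query described above can be sketched as follows in Python with the standard sqlite3 module. The table atm_withdrawal, its columns, and the sample rows are hypothetical, and "after 8 pm" is modelled here as a 24-hour value of 20 or later, which is an assumption of this sketch. The point is that the query only reads and aggregates; it does not modify any rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE atm_withdrawal (
        customer_gender TEXT,
        customer_age    INTEGER,
        location        TEXT,
        hour            INTEGER,  -- hour of day, 0 to 23 (assumed encoding)
        amount          REAL
    )
""")

# Hypothetical sample data, loaded once before any analysis is done.
conn.executemany(
    "INSERT INTO atm_withdrawal VALUES (?, ?, ?, ?, ?)",
    [
        ("Male", 20, "Jinnah Super Market Islamabad", 21, 5000.0),
        ("Male", 24, "Jinnah Super Market Islamabad", 22, 3000.0),
        ("Male", 40, "Jinnah Super Market Islamabad", 21, 9000.0),    # age outside 15 to 25
        ("Female", 22, "Jinnah Super Market Islamabad", 21, 7000.0),  # not male
        ("Male", 19, "Jinnah Super Market Islamabad", 10, 1000.0),    # before 8 pm
    ],
)
conn.commit()

changes_before = conn.total_changes

# The analytical query: a read-only SELECT with an aggregate.
avg_amount = conn.execute(
    """
    SELECT AVG(amount)
    FROM atm_withdrawal
    WHERE customer_gender = 'Male'
      AND customer_age BETWEEN 15 AND 25
      AND location = ?
      AND hour >= 20
    """,
    ("Jinnah Super Market Islamabad",),
).fetchone()[0]

# No rows were inserted, updated, or deleted by the query.
rows_changed_by_query = conn.total_changes - changes_before
print(avg_amount, rows_changed_by_query)
```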

CS614 Quiz No.4 Shared by Abdul_Mateen (Solved)

Question # 1 of 10 (Total Marks: 1)
The first step of the “12-steps data warehouse implementation approach” of Shaku Atre is:
Select correct option:
Finding user needs (Page No. 336)
Planning system resources
Finding system scope
Data acquisition and cleansing

Question # 2 of 10 (Total Marks: 1)
Users do not care how advanced the front end of your DWH is; what they care about is that:
Select correct option:
Tables should be properly denormalized
Proper partitioning technique should be used
At least star or snowflake schema should be implemented
They should get information in timely manner and the way they want

Question # 3 of 10 (Total Marks: 1)
Which of the following is NOT one of the top-10 mistakes that should be avoided during DWH development?
Select correct option:
Not interacting directly with end user
Not being an accommodating person (Page No. 316)
Isolating IT support p...

CS504 Quiz No.3 Shared by Angel

Question # 1 of 10 (Total Marks: 1)
Defining the services of an object means:
Select correct option:
What it does? ok
What it knows?
Who knows it?
Whom it knows?

Question # 2 of 10 (Total Marks: 1)
Which one of these represents the Kruchten’s 4+1 architectural view model?
Select correct option:
Logical view, Process view, Physical view, Development view, Use case view
Logical view, Dynamic view, Physical view, Development view, Use case view
Logical view, Process view, Physical view, Development view, Sequence view
Dynamic view, Process view, Physical view, Development view, Use case view

Question # 3 of 10 (Total Marks: 1)
Return values in Synchronous messages are represented by:
Select correct option:
A solid line
A dotted line with label ok
A solid line with label
Double line

Question # 4 of 10 ...

CS614 Quiz No.3 Shared by Sweety (Solved)

Question # 1 of 10 (Total Marks: 1)
Mining multi dimensional databases allow users to:
Select correct option:
Categorize the data
Analyze the data
Summarize the data
All of the given options (Correct)

Question # 2 of 10 (Total Marks: 1)
As per Bill Inmon, a data warehouse, in contrast with classical applications is:
Select correct option:
Data driven (Correct)
Resource driven
Requirement driven
Time sensitive

Question # 3 of 10 (Total Marks: 1)
In ________ learning you don’t know the number of clusters and no idea about their attributes.
Select correct option:
Supervised learning
Unsupervised learning (Correct)
Multi Dimension modeling
None of the given options

Question # 4 of 10 (Total Marks: 1)
Identify the TRUE statement:
Select correct option:
The data value increases as volume decreases (Correct)
The data value decreases ...
