Q1: Identify whether the following statements are correct or incorrect, and
justify your answer in either case: (5)
1. “Hash based indexing keeps the index entries in B-tree structure”.
2. “Just like primary key primary index has to be unique always”.
The first statement is incorrect. The correct statement is: page 227
Hash-based indexing keeps the index entries in hash-organized tables rather
than B-tree structures.
The second statement is also incorrect. The correct statement is: page 229
Primary Key (PK) & Primary Index (PI):
PK is ALWAYS unique.
PI can be unique, but does not have to be.
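Both corrections can be illustrated with a small toy example. The sketch below
is only an illustration, not how any particular DBMS implements its indexes:
the HashIndex class, the bucket count and the row ids are all hypothetical. It
places index entries in buckets chosen by hashing the key (no tree structure),
and it accepts duplicate key values, as a non-unique primary index would.

```python
class HashIndex:
    """Toy hash-organized index (hypothetical, for illustration only)."""

    def __init__(self, n_buckets=8):
        self.n_buckets = n_buckets
        # One list of (key, row_id) entries per bucket.
        self.buckets = [[] for _ in range(n_buckets)]

    def _bucket(self, key):
        # The bucket is chosen by hashing the key, not by walking a tree.
        return hash(key) % self.n_buckets

    def insert(self, key, row_id):
        # Duplicate keys are allowed: the index does not have to be unique.
        self.buckets[self._bucket(key)].append((key, row_id))

    def lookup(self, key):
        # Only the single bucket the key hashes to is scanned.
        return [rid for k, rid in self.buckets[self._bucket(key)] if k == key]


index = HashIndex()
index.insert(101, "row-a")
index.insert(102, "row-b")
index.insert(101, "row-c")  # same index value as row-a

matches = index.lookup(101)
missing = index.lookup(999)
```

A primary key, by contrast, would have to reject the second insert of 101.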
Q2: What are the different issues during data acquisition and cleansing in an
agricultural data warehouse? (5) page 340
Solution:
Step-6: Why the issues?
The major data cleansing issues arose because the data was processed and
handled at four levels by different groups of people:
1. Hand recordings by the scouts at the field level.
2. Typing hand recordings into data sheets at the DPWQCP office.
3. Photocopying of the typed sheets by DPWQCP personnel.
4. Data entry or digitization by hired data entry operators.
Q3: How is the gender guide used for a large number of records when gender is
missing? (5)
page 457
Gender_guide contains only two columns: name and gender. Populate the
Gender_guide table with a query that selects all distinct first names from the
student table, then enter the gender for each name manually. This table then
serves as a guide that tells us the likely gender for a particular name. For
example, if we have a hundred students in our database with the first name
‘Muhammed’, the Gender_guide table will contain just one entry for ‘Muhammed’,
and we will manually set its gender to ‘Male’.
To fill in the missing genders in the exception table, we simply do an inner
join between the Error table and the Gender_guide table.
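The steps above can be sketched with Python's built-in sqlite3 module. This is
a minimal sketch under stated assumptions: only the Gender_guide columns (name,
gender) come from the answer above, while the student and error_table schemas,
the sid column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schemas; only gender_guide(name, gender) is given in the text.
cur.execute("CREATE TABLE student (sid INTEGER, first_name TEXT, gender TEXT)")
cur.execute("CREATE TABLE error_table (sid INTEGER, first_name TEXT, gender TEXT)")
cur.execute("CREATE TABLE gender_guide (name TEXT PRIMARY KEY, gender TEXT)")

# Made-up sample data: three records have no gender.
cur.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [
        (1, "Muhammed", "Male"),
        (2, "Muhammed", None),
        (3, "Ayesha", None),
        (4, "Ayesha", "Female"),
        (5, "Muhammed", None),
    ],
)

# Assumption: records with a missing gender are copied to the error table.
cur.execute(
    "INSERT INTO error_table "
    "SELECT sid, first_name, gender FROM student WHERE gender IS NULL"
)

# Populate the guide with all distinct first names from the student table.
cur.execute("INSERT INTO gender_guide (name) SELECT DISTINCT first_name FROM student")
guide_rows = cur.execute("SELECT COUNT(*) FROM gender_guide").fetchone()[0]

# Manual step: a person fills in one gender per distinct name.
cur.executemany(
    "UPDATE gender_guide SET gender = ? WHERE name = ?",
    [("Male", "Muhammed"), ("Female", "Ayesha")],
)

# Inner join of the error table with the guide yields the gender per record.
fixes = cur.execute(
    "SELECT g.gender, e.sid FROM error_table e "
    "INNER JOIN gender_guide g ON e.first_name = g.name"
).fetchall()
cur.executemany("UPDATE error_table SET gender = ? WHERE sid = ?", fixes)

filled = cur.execute(
    "SELECT sid, first_name, gender FROM error_table ORDER BY sid"
).fetchall()
```

The manual effort grows with the number of distinct names, not with the number
of records, which is why the guide scales to large tables.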
Q5: Data profiling is a process which involves gathering information about a
column. What is the purpose of data profiling? (3) page 439
To identify the degree of transformation required, we perform data profiling.
Data profiling is a process which involves gathering information about a
column by executing certain queries, with the intention of identifying
erroneous records. In this process we identify the following:
Total number of values in a column
Number of distinct values in a column
Domain of a column
Values out of domain of a column
Validation of business rules
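Several of the profiling checks listed above can be sketched as queries, here
run through Python's built-in sqlite3 module. This is a minimal sketch: the
student table, its gender column, the sample rows and the expected domain
('Male', 'Female') are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table and made-up rows with some deliberately bad values.
cur.execute("CREATE TABLE student (sid INTEGER, gender TEXT)")
cur.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(1, "Male"), (2, "Female"), (3, "M"), (4, None), (5, "Male"), (6, "1")],
)

# Total number of (non-NULL) values in the column.
total_values = cur.execute("SELECT COUNT(gender) FROM student").fetchone()[0]

# Number of distinct values in the column.
distinct_values = cur.execute(
    "SELECT COUNT(DISTINCT gender) FROM student"
).fetchone()[0]

# Observed domain of the column: the set of values actually stored.
domain = [
    row[0]
    for row in cur.execute(
        "SELECT DISTINCT gender FROM student "
        "WHERE gender IS NOT NULL ORDER BY gender"
    )
]

# Values outside the assumed domain ('Male', 'Female').
# NULLs are not returned here because NULL NOT IN (...) is not true in SQL.
out_of_domain = cur.execute(
    "SELECT sid, gender FROM student "
    "WHERE gender NOT IN ('Male', 'Female') ORDER BY sid"
).fetchall()

# Missing values are counted separately.
missing_values = cur.execute(
    "SELECT COUNT(*) - COUNT(gender) FROM student"
).fetchone()[0]
```

The out-of-domain and missing rows are the candidates for transformation, so
their counts give a rough measure of how much cleansing the column needs.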
Q6: Write down three dynamic attributes of cotton pest scouting. (3) page 342
Q7: What is the ranking in DSS? (3)
Q8: Is the following statement correct or incorrect? If incorrect, justify your
answer. (3)
“One way clustering gives local view and two way clustering gives global view”.
The above statement is incorrect: page 271
Bi-clustering (Two way clustering) gives a local view of your data set while
one-way clustering gives a global view.
Q9: What problems will you face if low priority is given to cube
construction? (2) page 313
Low priority for OLAP Cube Construction
Make sure your OLAP cube-building or pre-calculation process is optimized and
given the right priority. It is common for the data warehouse to be at the
bottom of the nightly batch loads, and after loading the DWH there usually
isn't much time left for the OLAP cube to be refreshed. As a result, it is
worthwhile to experiment with the OLAP cube generation paths to ensure optimal
performance.
Q10: Is there any fixed strategy to standardize the column? (2) page 480
There are no fixed strategies to standardize the columns.
Q11: What is unsupervised learning in Data Mining? (2) page 271
Unsupervised learning is where you do not know the number of clusters and have
no idea about their attributes either. In other words, you are not guiding the
DM process in any way: no guidance and no input.
Q12: Which DML operation is used in OLAP? (2) page 76
In OLAP applications the typical user is an analyst who is interested in
selecting data needed for decision support. The analyst is not primarily
interested in detailed data, but usually in data aggregated over large sets,
as it gives the big picture. A typical OLAP query is to find the average amount
of money drawn from an ATM by customers who are male and aged between 15 and
25 years, from (say) Jinnah Super Market Islamabad after 8 pm. For this kind of
query there are no DML operations and the DBMS contents do not change.
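The ATM example can be sketched with Python's built-in sqlite3 module. This is
a minimal sketch: the atm_withdrawal table, its columns, the sample rows and
the reading of "after 8 pm" as hour >= 20 are all assumptions. The connection's
total_changes counter is read before and after the query to show that the
aggregate query only reads data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical fact table; the hour column holds the hour of day (0-23).
cur.execute(
    "CREATE TABLE atm_withdrawal "
    "(gender TEXT, age INTEGER, location TEXT, hour INTEGER, amount REAL)"
)
jsm = "Jinnah Super Market Islamabad"
cur.executemany(
    "INSERT INTO atm_withdrawal VALUES (?, ?, ?, ?, ?)",
    [
        ("Male", 20, jsm, 21, 5000.0),
        ("Male", 24, jsm, 22, 3000.0),
        ("Female", 22, jsm, 21, 4000.0),   # excluded: not male
        ("Male", 30, jsm, 23, 10000.0),    # excluded: age above 25
        ("Male", 18, jsm, 10, 2000.0),     # excluded: before 8 pm
        ("Male", 19, "Blue Area Islamabad", 21, 7000.0),  # excluded: location
    ],
)

# Rows modified so far on this connection (the loading step only).
changes_before = conn.total_changes

# The OLAP-style query: an aggregate over many rows, selecting only.
avg_amount = cur.execute(
    "SELECT AVG(amount) FROM atm_withdrawal "
    "WHERE gender = 'Male' AND age BETWEEN 15 AND 25 "
    "AND location = ? AND hour >= 20",
    (jsm,),
).fetchone()[0]

# The counter is unchanged: the query inserted, updated and deleted nothing.
changes_after = conn.total_changes
```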