Data Mining and Data Warehousing multiple choice questions with answers

Data Mining and Data Warehousing multiple choice questions with answers for preparation of IT academic and competitive exams. Before jumping to the MCQs of Data Mining and Data Warehousing, let's briefly review some related terms.

Definition of Data Mining:

Data mining is the systematic process of extracting hidden patterns, correlations, and knowledge from vast and diverse datasets, utilizing techniques from fields such as statistics, machine learning, and database management.

It involves transforming raw data into meaningful information to support informed decision-making and gain a deeper understanding of complex phenomena.

What is the data mining process?

The data mining process consists of data collection, preprocessing, exploration, modeling, evaluation, and deployment of results, with the goal of extracting meaningful patterns and knowledge.

What are the applications of data mining?

Data mining is applied for customer segmentation, fraud detection, recommendation systems, predictive analytics, image recognition, sentiment analysis, and more.

Data Warehouse Defined as:

A data warehouse is a centralised storage facility for collecting and preserving past and present data from multiple sources. It provides a structured data analysis and reporting environment, supporting business intelligence initiatives.

Definition of Data Warehouse: A data warehouse is a repository that gathers structured and unstructured data from diverse sources and offers a fundamental platform for conducting data analysis and generating reports.

Warehouse Database:

A warehouse database, or data warehouse, is a specialized database designed for analytical processing. It stores historical data and supports complex queries for reporting and analysis.

Data Warehouse Explained as:

A data warehouse acts as a data hub, integrating data from different departments and systems. It transforms raw data into meaningful insights, aiding stakeholders in understanding business performance and trends.

A Data Warehouse serves as a dedicated and consolidated storage facility that houses extensive quantities of organized and unorganized data gathered from diverse origins within a company.

Unlike operational databases, which are designed for transactional tasks, a Data Warehouse is optimized for analytical processing, enabling businesses to glean valuable insights and make informed decisions.

The primary objective of a Data Warehouse is to provide a unified and consistent platform where data from disparate systems and departments can be integrated, organized, and made easily accessible for querying and reporting. 

This structured storage facilitates complex data analysis, trend identification, and pattern recognition that are crucial for strategic planning and business intelligence.

In essence, a Data Warehouse acts as a treasure trove of historical and current data, serving as the foundation for data-driven decision-making. It empowers organizations to extract actionable insights, uncover hidden correlations, and generate comprehensive reports that drive business growth and innovation. 

Through the careful design, integration, and management of data, a Data Warehouse becomes a vital asset in the modern digital landscape, enabling businesses to turn raw information into meaningful knowledge.

Top 70 Data Mining and Data Warehousing multiple choice questions with answers

1. OLTP stands for ___.
Ans. Online Transaction Processing

2. OLTP handles day to day business transactions (true/false)
Ans. True

3. Updates on the Data Warehouse are allowed (true/false)
Ans. False

4. Data Warehouse is a database that is designed for facilitating ___ and ___.
Ans. Query and Analysis

5. Data Warehouse is defined as subject-oriented, integrated, time-variant and ___.
Ans. Non-Volatile

6. Data Warehouse contains both aggregated data and individual transactions (true/false)
Ans. True

7. List the types of the data warehouse.
Ans. Real-time, federated and distributed

8. ___ data Warehouse will allow changes in the information to be monitored and recorded over time.
Ans. time-variant

9. The Data Warehouse functions as ___ and an Executive Information System (EIS).
Ans. DSS

10. Data about data is called ___.
Ans. Metadata

11. Data Warehouse contains data for ___ purpose.
Ans. Analysis

12. Data Warehouse is a storehouse of ___ data.
Ans. Historical

13. In most organizations, two groups of people are key to the success of the project, ___ and ___.
Ans. Senior Management and Working Management

14. OLTP systems are designed for ___.
Ans. Real-time business operations

15. Data Warehouses do not require real-time validation (True / False)
Ans. True

16. In most organizations, two groups of people are key to the success of the project, ___ and ___.
Ans. Senior Management and Working Management

17. In Data Warehouse, the requirements are gathered subject area wise. (True / False)
Ans. True

18. The 3 major functions that needed to be performed for getting the data ready into the Data Warehouse are extraction, transformation and ___.
Ans. Loading

19. ___ and ___ of data take place on a large scale in the data staging area.
Ans. Sorting and Merging

20. Knowledge discovery is called ___.
Ans. Data Mining

21. The main purpose of E-R modelling is
a. To remove redundancy
b. To improve analysis for decision-making
c. To record historical data
d. None
Ans. a

22. E-R modelling and Dimensional modelling are the same (True / False)
Ans. False

23. A Dimension is an entity or subject area, which can group the data (True / False)
Ans. True

24. Dimensional model consists of ___ and ___ tables.
Ans. Dimensions and fact tables

25. ___ is often used in dimensional modelling.
Ans. Text data

26. Fact tables usually consist of ___ to ___ relationships.
Ans. Many-to-many

27. Dimensional model can be implemented with the following databases,
a. Relational database
c. Flat files
d. Excel data files
e. None
Ans. a

28. Customer name change in the dimensional model comes under ___.
Ans. Slowly-changing-dimension

29. The most popular model for the data warehouse is ___.
Ans. Multidimensional model

30. Which of the following schema supports the normalization in dimensional modelling?
a. Star Schema
b. Snow-Flake schema
c. Fact-Constellation
Ans. b

31. Each dimension table is in ___ relationship with the central fact table.
Ans. One-to-many

32. Dimensional table and a fact table can be connected with the following database keys:
a. Foreign key
b. Surrogate key
c. Candidate key
Ans. a

33. For sales analysis units sold is a ___ kind of measure.
Ans. Additive numeric measure

34. OLAP tools are data accessing and discovery tools (True / False)
Ans. True

35. In Data Warehouse a system with multiple architectures is called ___
Ans. Federated Data Warehouse architecture

36. Data marts are,
a) Department level
b) Limited in size
c) Read-only
d) All the above
Ans. d

37. Data Warehouse functions are a Decision support system and ___.
Ans. EIS

38. Data extraction, ___ and ___ encompass the areas of data acquisition and data storage.
Ans. Transformation and Loading

39. Populating all the Data Warehouse tables for the very first time is called ___.
Ans. Initial Load

40. Which of the following are open source ETL tools?
a) SAS Data Integrator
b) Ascential DataStage
c) Cognos Decision Stream
d) Microsoft DTS
e) Clover
Ans. Clover

41. Average daily balance is a ___ attribute.
Ans. Derived

42. OLAP stands for ___
Ans. Online analytical processing

43. OLAP tools enable the user to access the data in Data Warehouse in an interactive manner (True / False)
Ans. True

44. ERP and CRM are ___ kinds of systems.

45. Data cube contains ___ and ___.
Ans. Dimensions and Facts

46. A dimensional table contains hierarchies (True / False)
Ans. True

47. Which of the following are the intermediate servers that stand in between a relational back-end server and client front-end tools?
d. All the above
Ans. all

48. The advantage of using a data cube is that it allows fast indexing to precomputed summarized data. (True / False)
Ans. True

49. In Data Warehouse, linking a single record to all the duplicate records in the source systems is called ___.
Ans. De-duplication

50. Sorting the data in the given source file is a transformation (True / False).
Ans. True

51. OLTP is the abbreviation for ___
Ans. Online transaction processing

52. Query response time is ___ kind of metadata.
Ans. Operational metadata

53. Key hierarchies and key performance indicators are ___ kind of Metadata.
Ans. Business metadata

54. Storing, data mapping and transformation from source systems to the data warehouse fall into:
a. Technical metadata
b. Operational metadata
c. Business metadata
Ans. a

55. According to Ralph Kimball, Back-room metadata guides:
a. Extraction
b. Cleaning
c. Loading processes
d. All the above
Ans. d

56. One tool that can allow data warehouse managers to deal with metadata is called___.
Ans. Repository

57. Access rights, protocols are ___ metadata.
Ans. Administrative metadata

58. Data about data is called ___.
Ans. Metadata

59. Information can be converted into knowledge about ___ patterns and future trends.
Ans. Historical

60. Data about data is called ___.
Ans. Metadata

61. The ___ software gives the user the opportunity to look at the data from a variety of different dimensions.
Ans. Multidimensional Analysis

62. ___ Optimization techniques are based on the concepts of genetic combination, mutation, and natural selection.
Ans. Genetic algorithms

63. Based on the overall requirements of business intelligence, the ___ layer is required to extract, cleanse and transform data into load files for the information warehouse.
Ans. Data integration

64. Data Mining is not a business solution; it is just a technology. (True/False)
Ans. True

65. ___ is used to refer to systems and technologies that provide the business with the means for decision-makers to extract personalized meaningful information about their business and industry.
Ans. Business Intelligence

66. OLAP Supports ___ user access and multiple queries.
Ans. Multiple

67. Statistics techniques are incorporated into Data mining methods. (True/False).
Ans. True

68. The Apriori algorithm operates in ___ method
a. Bottom-up search method
b. Breadth-first search method
c. None of the above
d. Both a & b
Ans. d

69. A bi-directional search takes advantage of ___ process
a. Bottom-up process
b. Top-down process
c. None
d. Both a & b
Ans. d

70. The pincer-search has an advantage over the Apriori algorithm when the largest frequent itemset is long. (True/False)
Ans. True


FAQs related to Data Mining and Data Warehousing

Ques 1: What does a Data Warehouse allow an organization to achieve?

Answer: A data warehouse allows organizations to achieve streamlined data storage, efficient data analysis, improved decision-making, and enhanced business intelligence capabilities.

Ques 2: What is the difference between Data Warehousing and Data Mining?

Answer: Data warehousing and data mining go hand in hand. While data warehousing provides a robust infrastructure for data storage, data mining leverages advanced techniques to extract valuable insights and patterns from the stored data.

Here’s a clear explanation of the difference between Data Warehousing and Data Mining:

Data Warehousing vs. Data Mining:

  • Data Warehousing:
    • Definition: Data Warehousing involves the gathering, retention, and organization of both structured and unstructured data from diverse origins within a centralized storage location.
    • Purpose: The primary purpose of Data Warehousing is to provide a consolidated and organized storage environment for historical and current data.
    • Focus: Data Warehousing focuses on efficient data storage, retrieval, and integration to support reporting, analysis, and business intelligence.
    • Components: It involves the creation of a centralized database, data transformation through ETL processes, and designing schemas for optimal querying.
    • Usage: Data Warehousing is used for generating reports, dashboards, and visualizations, making historical comparisons, and supporting strategic decision-making.
  • Data Mining:
    • Definition: Data Mining is the process of analyzing large datasets to discover patterns, correlations, and insights, and so extract valuable knowledge.
    • Purpose: The primary purpose of Data Mining is to uncover hidden information and relationships within data that might not be immediately apparent.
    • Focus: Data Mining focuses on analyzing data to identify trends, patterns, and anomalies, often using algorithms and statistical techniques.
    • Techniques: It involves various techniques such as clustering, classification, regression, and association rule mining.
    • Usage: Data Mining is used to predict future trends, segment data, make recommendations, and gain deeper insights for decision-making.

In essence, Data Warehousing is about efficiently storing and organizing data for easy access and analysis, while Data Mining is the process of extracting meaningful insights and knowledge from the stored data. Data Warehousing provides the foundation and infrastructure for Data Mining by creating a structured environment for data analysis.
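
The ETL processes mentioned under Components can be sketched roughly as follows. This is only a minimal illustration using Python's built-in sqlite3 module with an in-memory database; the source rows, the `sales` table, and the `transform` and `load` helpers are hypothetical names invented for this example, and a real warehouse would use dedicated ETL tooling.

```python
import sqlite3

# Hypothetical rows "extracted" from an operational source system.
source_rows = [
    {"order_id": 1, "customer": " alice ", "amount": "120.50", "date": "2024-01-05"},
    {"order_id": 2, "customer": "BOB", "amount": "80", "date": "2024-01-06"},
    {"order_id": 2, "customer": "BOB", "amount": "80", "date": "2024-01-06"},  # duplicate
]


def transform(rows):
    """Clean customer names, convert amounts to numbers, drop duplicate orders."""
    seen = set()
    cleaned = []
    for row in rows:
        if row["order_id"] in seen:
            continue  # skip an order that was already processed
        seen.add(row["order_id"])
        cleaned.append(
            (
                row["order_id"],
                row["customer"].strip().title(),
                float(row["amount"]),
                row["date"],
            )
        )
    return cleaned


def load(conn, rows):
    """Create the (hypothetical) warehouse table if needed and insert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(source_rows))

# A simple analytical query over the loaded data.
total = conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()
print(total)  # (2, 200.5)
```

The transformation step is where cleaning and de-duplication happen, so the warehouse table only ever receives consistent rows.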

Ques 3: In a Data warehouse, what is a dimension?

Answer: In the realm of a data warehouse, a dimension refers to a categorical attribute or characteristic that provides context and additional information to the measures or metrics being analyzed. Dimensions help categorize and organize data, allowing users to slice, dice, and drill down into the data for deeper insights.

Dimensions typically describe the “who,” “what,” “where,” “when,” and “how” aspects of the data. For instance, in a sales data warehouse, dimensions might include attributes like “product,” “time,” “location,” “customer,” and “salesperson.” These dimensions provide valuable context to the sales metrics, allowing users to analyze sales performance across different products, time periods, locations, customers, and salespeople.

Dimensions are often used in conjunction with fact tables, which contain the numerical measures or metrics being analyzed, such as sales revenue or quantity sold. By associating dimensions with facts, data warehouses enable multidimensional analysis, allowing users to explore data from various perspectives and uncover meaningful patterns and trends.

In summary, a dimension in a data warehouse serves as a vital component that categorizes and adds context to the measures being analyzed. It allows users to navigate and explore data from different angles, enhancing the depth and breadth of insights gained from the data.
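
To make the relationship between dimensions and facts more concrete, here is a small sketch of a star-style layout, again using Python's sqlite3 module. The table names (`dim_product`, `dim_location`, `fact_sales`) and the sample rows are assumptions made up for illustration, not part of any particular product.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: descriptive context ("what" and "where").
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT);

    -- Fact table: numeric measures plus foreign keys to the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        location_id INTEGER REFERENCES dim_location(location_id),
        units_sold INTEGER,
        revenue REAL
    );

    INSERT INTO dim_product VALUES (1, 'Laptop', 'Electronics'), (2, 'Desk', 'Furniture');
    INSERT INTO dim_location VALUES (1, 'Delhi'), (2, 'Mumbai');
    INSERT INTO fact_sales VALUES (1, 1, 3, 3000.0), (1, 2, 2, 2000.0), (2, 1, 5, 500.0);
    """
)

# "Slice" the sales measures by the product dimension's category attribute.
by_category = conn.execute(
    """
    SELECT p.category, SUM(f.units_sold), SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.category
    ORDER BY p.category
    """
).fetchall()
print(by_category)  # [('Electronics', 5, 5000.0), ('Furniture', 5, 500.0)]
```

Swapping the join to `dim_location` and grouping by `city` would give the same measures from a different perspective, which is the essence of multidimensional analysis.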

Ques 4: What are the data mining techniques? 

Answer: Data mining techniques include clustering, classification, regression, association rule mining, anomaly detection, and neural networks, among others.
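
As one illustration of association rule mining, the sketch below finds frequent itemsets level by level, in the spirit of the Apriori algorithm covered in the MCQs above. It is a simplified, unoptimized version written in plain Python; the sample transactions and the `frequent_itemsets` function are hypothetical and exist only for this example.

```python
from itertools import combinations

# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    # Level 1: every single item is a candidate.
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    result = {}
    size = 1
    while candidates:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        # Join frequent itemsets into candidates one item larger, then prune any
        # candidate with an infrequent subset (the Apriori property).
        size += 1
        joined = {a | b for a in frequent for b in frequent if len(a | b) == size}
        candidates = [
            c
            for c in joined
            if all(frozenset(s) in frequent for s in combinations(c, size - 1))
        ]
    return result


found = frequent_itemsets(transactions, min_support=2)
for itemset, count in sorted(found.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), count)
```

With a minimum support of 2, every single item and every pair is frequent in this sample, while the three-item set appears in only one transaction and is dropped.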

Ques 5: How do you define data mining in DBMS (database management system)? 

Answer: In a DBMS, data mining refers to the process of extracting valuable information from the stored data to discover patterns and relationships, enhancing decision-making.

Ques 6: What does data mining involve in the context of DBMS? 

Answer: In a DBMS, data mining involves querying, analyzing, and visualizing data to uncover hidden insights and generate actionable knowledge.
