Data Mining Architecture

Data Mining can be defined as the process of extracting usable information from huge sets of data. It is also known as KDD (Knowledge Discovery in Databases), Knowledge Extraction, Data Analysis, and Information Harvesting.

Working (in short)

It begins when the user submits a specific Data Mining request. This request is sent to the Data Mining Engine.

The engine then tries to find an appropriate answer to the request or query, using the data already available in the Database.

After this, the metadata is extracted and, in the next step, sent to the engine for analysis.

Finally, the result is sent to the front end, which presents it in understandable language through an appropriate interface.
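
The flow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the DATABASE list, the "item_frequency" task name, and the mining_engine and front_end functions are all hypothetical and do not come from any particular Data Mining product.

```python
from collections import Counter

# Hypothetical in-memory "database": each row is one customer's basket.
DATABASE = [
    {"customer": "a", "items": ["bread", "milk"]},
    {"customer": "b", "items": ["bread", "butter"]},
    {"customer": "c", "items": ["bread", "milk", "butter"]},
]


def mining_engine(request, database):
    """Resolve a request against the database; return findings and metadata."""
    if request["task"] != "item_frequency":
        raise ValueError(f"unsupported task: {request['task']}")
    counts = Counter(item for row in database for item in row["items"])
    metadata = {"rows_scanned": len(database), "task": request["task"]}
    return counts, metadata


def front_end(counts, metadata):
    """Turn the raw findings into text that a user can read."""
    lines = [f"Scanned {metadata['rows_scanned']} rows for {metadata['task']}:"]
    for item, n in counts.most_common():
        lines.append(f"  {item}: {n}")
    return "\n".join(lines)


# The user's request travels to the engine, and the result to the front end.
request = {"task": "item_frequency"}
counts, metadata = mining_engine(request, DATABASE)
report = front_end(counts, metadata)
print(report)
```

A real system would place a Database Server and a Pattern Evaluation Module between these two functions; they are left out here to keep the request-to-result path visible.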

The next section of this article describes the different parts of Data Mining Architecture in detail.

Data Mining Architecture

Data Sources:

Data Sources, as the name implies, are the main sources of data in Data Mining. They include databases, data warehouses, and the WWW (World Wide Web). The data in these sources may be present in the form of plain text, spreadsheets, and so on, or in the form of videos and photos.

One of the biggest primary sources of data is the WWW (World Wide Web), which is accessed over the Internet.

Database Server:

The Database Server in Data Mining holds the real data that is ready to be processed. It is often referred to simply as the Database. Its main task is to retrieve the data that is relevant to a Data Mining request.
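
As a rough sketch of this retrieval step, the example below uses Python's built-in sqlite3 module with an in-memory database. The sales table, its columns, and the fetch_relevant_rows helper are hypothetical names chosen for illustration.

```python
import sqlite3


def fetch_relevant_rows(conn, min_amount):
    """Return only the rows that are relevant to the mining request."""
    cur = conn.execute(
        "SELECT customer, amount FROM sales WHERE amount >= ? ORDER BY amount",
        (min_amount,),
    )
    return cur.fetchall()


# Hypothetical data set standing in for the real Database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("a", 12.0), ("b", 250.0), ("c", 90.0), ("d", 400.0)],
)

# The request asks only for sales of at least 100.
rows = fetch_relevant_rows(conn, 100)
print(rows)
```

Filtering inside the query means the Data Mining Engine never has to see the rows that the request does not need.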

Data Mining Engine:

The Data Mining Engine is the core component of Data Mining Architecture. It carries out the Data Mining techniques themselves, such as association, classification, characterization, clustering, and prediction.
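
One way to picture the engine is as a dispatcher that maps a task name to a technique. The sketch below is a toy under stated assumptions: the TECHNIQUES registry and the run_engine function are hypothetical, and only two deliberately simple techniques are implemented (a numeric summary for characterization and pair counting for association).

```python
from collections import Counter
from itertools import combinations
from statistics import mean


def characterize(values):
    """Characterization: summarize a numeric attribute."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}


def associate(baskets):
    """Association: count how often each pair of items occurs together."""
    pairs = Counter()
    for basket in baskets:
        # Sort so that (bread, milk) and (milk, bread) count as one pair.
        pairs.update(combinations(sorted(set(basket)), 2))
    return pairs


# Hypothetical registry of the techniques this toy engine supports.
TECHNIQUES = {"characterization": characterize, "association": associate}


def run_engine(task, data):
    """Look up the requested technique and apply it to the data."""
    try:
        technique = TECHNIQUES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task}") from None
    return technique(data)


summary = run_engine("characterization", [10, 20, 60])
pairs = run_engine("association", [["bread", "milk"], ["milk", "bread", "eggs"]])
print(summary)
print(pairs)
```

Classification, clustering, and prediction would be registered in the same way; they are omitted to keep the example short.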

Pattern Evaluation Modules:

The main task of the Pattern Evaluation Module is to measure how interesting the discovered patterns or trends are. Each pattern is evaluated against a threshold value. The module also interacts with the Data Mining Engine so as to focus the search on the interesting patterns.
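
A minimal sketch of threshold-based evaluation follows. The article does not name a specific measure, so the use of support (the fraction of records that contain a pattern) is an assumption, as are the function names and the sample baskets.

```python
def support(itemset, baskets):
    """Fraction of baskets that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(basket) for basket in baskets) / len(baskets)


def evaluate_patterns(candidates, baskets, min_support):
    """Keep only the candidates whose support meets the threshold."""
    scored = {c: support(c, baskets) for c in candidates}
    return {c: s for c, s in scored.items() if s >= min_support}


# Hypothetical transactions and candidate patterns.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]
candidates = [("bread", "milk"), ("bread", "butter"), ("milk", "butter")]

interesting = evaluate_patterns(candidates, baskets, min_support=0.5)
print(interesting)
```

Raising min_support makes the module stricter, which is how the threshold steers the engine away from patterns that occur too rarely to matter.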

GUI (Graphical User Interface):

The GUI (Graphical User Interface) is the communication bridge between the user and the Data Mining System. Because the Data Mining processes are complex and difficult for users to understand, the Graphical User Interface lets the user communicate with the Data Mining System in an easy and understandable way.

Knowledge Base:

Last but not least in this list is the Knowledge Base. It is an important part of the Data Mining process and works alongside the Data Mining Engine. The Knowledge Base is basically a store of data drawn from user experience. Its main aim is to make the results accurate and dependable.
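
One form such stored knowledge can take is a concept hierarchy that maps specific items to broader categories, so that mining can run at a more general level. The sketch below assumes this form; the CONCEPT_HIERARCHY contents and the generalize function are hypothetical.

```python
# Hypothetical knowledge base: maps specific items to broader concepts.
CONCEPT_HIERARCHY = {
    "skim milk": "dairy",
    "cheddar": "dairy",
    "rye bread": "bakery",
}


def generalize(basket, hierarchy):
    """Replace each item with its broader concept when one is known."""
    return sorted({hierarchy.get(item, item) for item in basket})


generalized = generalize(
    ["skim milk", "cheddar", "rye bread", "pen"], CONCEPT_HIERARCHY
)
print(generalized)
```

Items that the Knowledge Base does not know about, such as "pen" here, pass through unchanged.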

Different Types of Data Mining Architecture:

There are mainly four types of Data Mining Architecture, listed below.

  1. No Coupling
  2. Loose Coupling
  3. Semi Tight Coupling
  4. Tight Coupling

No Coupling:

This type of Data Mining Architecture is the least used and also performs the worst. It is only used when performance requirements are low and the Data Mining process is very simple. In this scheme, the Data Mining system collects data from a specific data source, such as a file, and not from the Database. Retrieving data from the Database has the benefit that the results are more likely to be precise, and it is widely considered a very efficient approach. Because No Coupling does not collect its data from the Database, it is of little use to many users.
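
A No Coupling setup can be sketched as mining straight from a flat file, with no Database involved. In the example below, an io.StringIO object stands in for a file on disk; the file contents and the mine_flat_file function are hypothetical.

```python
import csv
import io
from collections import Counter

# Stand-in for a flat file on disk; a real system would open a file path.
FLAT_FILE = io.StringIO("customer,item\na,bread\nb,milk\nc,bread\n")


def mine_flat_file(handle):
    """Count item occurrences by reading every record of the file."""
    reader = csv.DictReader(handle)
    return Counter(row["item"] for row in reader)


item_counts = mine_flat_file(FLAT_FILE)
print(item_counts)
```

Every run has to read and parse the whole file, because there is no Database to filter, index, or aggregate the data first.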

Loose Coupling:

The Loose Coupling scheme, unlike No Coupling, is an efficient scheme. The Data Mining system retrieves data from the Database and holds it in its own memory, where the mining takes place. It is therefore a memory-based Data Mining Architecture.
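
The sketch below illustrates Loose Coupling with Python's built-in sqlite3 module: the Database only supplies the rows, and all of the mining happens in application memory. The purchases table and its contents are hypothetical, and writing the result back to an item_counts table is an optional step added for illustration.

```python
import sqlite3
from collections import Counter

# Hypothetical Database with a single table of purchases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("a", "bread"), ("b", "milk"), ("c", "bread")],
)

# Step 1: pull the rows out of the Database into memory.
rows = conn.execute("SELECT item FROM purchases").fetchall()

# Step 2: mine them in application code, outside the Database.
counts = Counter(item for (item,) in rows)

# Step 3 (optional): write the result back for later use.
conn.execute("CREATE TABLE item_counts (item TEXT PRIMARY KEY, n INTEGER)")
conn.executemany("INSERT INTO item_counts VALUES (?, ?)", counts.items())
stored = conn.execute("SELECT item, n FROM item_counts ORDER BY item").fetchall()
print(stored)
```

Because every row is loaded into memory before mining starts, this approach is limited by the memory available to the Data Mining system.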

Semi Tight Coupling:

This scheme is also very efficient. The Data Mining system is linked to a Database or data warehouse system and relies on its features, such as sorting, indexing, and aggregation. The user can also store results in the Database, which improves overall performance.
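
In contrast to pulling every row into memory, a Semi Tight Coupling setup hands part of the work to the Database. The sketch below uses Python's built-in sqlite3 module; the purchases and item_counts tables and the index name are hypothetical.

```python
import sqlite3

# Hypothetical Database with a single table of purchases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE purchases (customer TEXT, item TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?)",
    [("a", "bread"), ("b", "milk"), ("c", "bread")],
)

# Indexing: let the Database speed up lookups on the mined column.
conn.execute("CREATE INDEX idx_purchases_item ON purchases (item)")

# Aggregation: the Database counts the items and stores the result itself.
conn.execute(
    "CREATE TABLE item_counts AS "
    "SELECT item, COUNT(*) AS n FROM purchases GROUP BY item"
)

# Sorting: the Database returns the stored result already ordered.
top = conn.execute("SELECT item, n FROM item_counts ORDER BY n DESC, item").fetchall()
print(top)
```

Only the small, aggregated result crosses over to the application, which is where the performance gain comes from.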

Tight Coupling:

This scheme gives users scalability, performance, and integrated information. Here, the data warehouse is regarded as the most crucial component, as it is used to perform different tasks in the various processes of Data Mining.