Data mining and knowledge accumulation process models and methodologies (Data Mining Concepts) in RemLabNet.


Data mining (DM), knowledge discovery in databases (KDD), knowledge discovery, and data mining and knowledge discovery (DM & KD) are terms used to refer to the research results, techniques, and tools used to extract useful information from large volumes of data. In other words, we need to apply a procedure that prepares the experimental data for subsequent analysis by removing erroneous data, noise, data misprints, etc.; at the same time, the bulk of the data is reduced. On top of this, the data are converted for analytical purposes.
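The cleaning and conversion step described above can be sketched as follows. This is a minimal illustration, not the actual RemLabNet routine: it assumes measurements arrive as (timestamp, value) pairs and that a plausible value range for the rig is known.

```python
def clean_measurements(raw, lo=-1e3, hi=1e3):
    """Drop malformed entries and out-of-range readings, reducing data bulk.

    `lo`/`hi` are an assumed plausible range for the measured quantity.
    """
    cleaned = []
    for timestamp, value in raw:
        try:
            v = float(value)          # conversion for analytical purposes
        except (TypeError, ValueError):
            continue                  # data misprint: skip the record
        if lo <= v <= hi:             # crude noise/outlier filter
            cleaned.append((timestamp, v))
    return cleaned

raw = [(0, "1.2"), (1, "oops"), (2, "9e9"), (3, "2.5")]
print(clean_measurements(raw))  # [(0, 1.2), (3, 2.5)]
```

The misprint `"oops"` and the out-of-range reading `"9e9"` are discarded, so the cleaned set is both smaller and ready for numeric analysis.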

In remote laboratories we use data mining algorithms capable of solving the tasks related to the search for useful knowledge in large volumes of data. From the viewpoint of data mining methodologies and process models, the year 2000 marked the most important milestone, as the Cross-Industry Standard Process for Data Mining (CRISP-DM) was first published. CRISP-DM is the most widely used methodology for developing data mining projects and has since been considered the standard in the field.

This process describes the activities that must be done to develop a data mining project. Every activity is composed of tasks, and for every task the required inputs and generated outputs are detailed.

In this section, the evolution of data mining and knowledge discovery process models will be presented, together with the methodologies that provide condensed knowledge from the data stored in the RLMS.

Numerous methodologies exist for data mining. For the remote labs we chose the CRISP-DM methodology, as it applies both to the evaluation of the process model and to the model we want to explain further. This model is ideal for the analysis of data that possess a predetermined structure (in XML format), as is the case in remote laboratories. Such a data structure can be fundamentally transformed without information loss and thus may be easily integrated into an arbitrary process of acquiring knowledge.
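As a sketch of this lossless transformation, the snippet below flattens XML-structured measurement records into plain tuples suitable for any downstream analysis. The element and attribute names (`experiment`, `rig`, `sample`, `t`, `value`) are hypothetical, chosen only to illustrate the idea.

```python
import xml.etree.ElementTree as ET

XML = """<experiment rig="pendulum">
  <sample t="0.0" value="1.20"/>
  <sample t="0.1" value="1.18"/>
</experiment>"""

def xml_to_rows(xml_text):
    """Flatten XML measurement records into (rig, t, value) tuples."""
    root = ET.fromstring(xml_text)
    rig = root.get("rig")
    return [(rig, float(s.get("t")), float(s.get("value")))
            for s in root.findall("sample")]

print(xml_to_rows(XML))
# [('pendulum', 0.0, 1.2), ('pendulum', 0.1, 1.18)]
```

Because every attribute is carried over into the tuples, the tabular form preserves all the information of the original XML and could be converted back.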

[Figure: Data mining in RemLabNet — analysis of data; export of measured data to XLSX or PDF format; analysis of data from the MEASURESERVER, Diagnostic Server, and Web Server, etc.]


The RemLabNet CRISP-DM process model

Let us describe individual phases of the knowledge mining process in more detail.

Rig(s) recognition: This initial phase focuses on understanding the project objectives and requirements from an application perspective, then converting this knowledge into a data mining problem definition and a preliminary plan designed to achieve the objectives of the assignment.

Data identification: The data identification phase starts with initial data collection and proceeds with activities that assess data quality and detect interesting subsets, from which hypotheses about hidden information can later be formed [6].
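A first data-quality indicator in this phase can be as simple as the fraction of missing values per field. The sketch below assumes records are dictionaries with `None` marking a missing reading; it is an illustration, not the RemLabNet implementation.

```python
def quality_report(records):
    """Fraction of missing (None) values per field, as a data-quality check."""
    fields = records[0].keys()
    n = len(records)
    return {f: sum(1 for r in records if r.get(f) is None) / n
            for f in fields}

records = [{"t": 0.0, "value": 1.2},
           {"t": 0.1, "value": None}]
print(quality_report(records))  # {'t': 0.0, 'value': 0.5}
```

Fields with a high missing fraction can then be flagged for repair or excluded before the hypotheses are formed.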

Data preparation: The data preparation phase covers all activities needed to construct the final data set (the data that will be fed into the modeling tool(s)) from the initial raw data.
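One common preparation step is scaling raw readings into a common range so the modeling tool treats all features comparably. The min–max normalisation below is a generic sketch of such a step, not a prescribed RemLabNet procedure.

```python
def normalise(values):
    """Min-max scale raw readings into [0, 1] for the modeling tool."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

print(normalise([2.0, 4.0, 6.0]))  # [0.0, 0.5, 1.0]
```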

Phenomenon modelling: In this phase, various modeling techniques are selected and applied, and their parameters are adjusted to optimal values.
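As one example of a modeling technique with parameters fitted to optimal values, the sketch below performs an ordinary least-squares line fit on prepared data; the phenomenon being linear is an assumption for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + b to prepared data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx          # slope and intercept

print(fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0]))  # (2.0, 1.0)
```

Here the "optimal values" of the parameters a and b are those minimising the squared error; other phenomena would call for other model families and tuning criteria.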

Theoretical and measurement data comparison: Before proceeding to the final deployment of the model, it is important to evaluate it more thoroughly and to review the steps executed to construct it, to be sure it properly achieves the objectives.
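A standard way to compare theoretical predictions against measured data in this phase is the root-mean-square error, sketched below; the threshold that counts as "properly achieving the objectives" would be project-specific.

```python
import math

def rmse(theoretical, measured):
    """Root-mean-square error between model predictions and measurements."""
    return math.sqrt(sum((t - m) ** 2
                         for t, m in zip(theoretical, measured)) / len(measured))

print(rmse([1.0, 2.0, 3.0], [1.0, 2.0, 4.0]))  # ~0.577
```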

Result deployment: Generally, the creation of the model is not the end of the project. Even if the purpose of the model is to increase knowledge of the data, the extracted knowledge must be organized and presented to the client in a useful way via the web page. Depending on the requirements, the deployment phase can be as simple as generating a report or as complex as implementing a repeatable data mining process. In many cases it is the client, not the data analyst, who carries out the deployment steps in the content management system of the LMS-driven website, which is our domain.
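At the simple end of deployment, generating a report can amount to rendering the evaluation metrics as an HTML fragment to be embedded in the LMS-driven web page. The metric names below are illustrative, not RemLabNet's actual report schema.

```python
def render_report(metrics):
    """Render evaluation metrics as a small HTML table for the web page."""
    rows = "".join(f"<tr><td>{k}</td><td>{v:.3f}</td></tr>"
                   for k, v in metrics.items())
    return f"<table>{rows}</table>"

print(render_report({"rmse": 0.577, "slope": 2.0}))
# <table><tr><td>rmse</td><td>0.577</td><td>...</table>
```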