PNC2 Rule Induction System

Some Basics

 The basic task of data-based modelling
 What about a variable's type?
 How to estimate a model's prediction accuracy?

Main Index

The basic task of data-based modelling

The following figure illustrates the basic task of data-based modelling. Given is a system with an input vector x and the corresponding output y. The input vector consists of one or more individual input variables. The aim is to build a model that predicts the unknown output value from the input vector. A tuple P=(x,y) consisting of an input vector and its output value is called a data tuple.

To build a data-based model, you first need to collect data tuples from the real system by recording input values and the corresponding output values under representative operating conditions. The PNC2 cluster algorithm can then be employed to find rules in your data.
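The idea of data tuples and of a model that maps an input vector to an output value can be sketched in a few lines of Python. This is only an illustration: the names `DataTuple`, `training_data` and `predict` are hypothetical, the recorded values are made up, and the model is a plain nearest-neighbour lookup used as a stand-in, not the PNC2 cluster algorithm.

```python
from math import dist
from typing import NamedTuple, Sequence


class DataTuple(NamedTuple):
    """One recorded observation P = (x, y) of the real system."""
    x: tuple[float, ...]  # input vector
    y: str                # output value (here: a nominal class label)


# Hypothetical recordings taken under some representative operating conditions.
training_data = [
    DataTuple(x=(1.0, 1.2), y="low"),
    DataTuple(x=(0.8, 1.0), y="low"),
    DataTuple(x=(4.1, 3.9), y="high"),
    DataTuple(x=(3.8, 4.4), y="high"),
]


def predict(x: Sequence[float], data: Sequence[DataTuple]) -> str:
    """Stand-in model: return the output of the closest recorded input vector.

    This is a simple nearest-neighbour lookup, NOT the PNC2 algorithm.
    """
    nearest = min(data, key=lambda p: dist(p.x, x))
    return nearest.y


print(predict((1.1, 0.9), training_data))  # nearest recorded tuples are "low"
print(predict((4.0, 4.0), training_data))  # nearest recorded tuple is "high"
```

Whatever learning algorithm is used, the interface stays the same: data tuples go in during training, and afterwards an input vector alone is enough to obtain a predicted output.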



What about a variable's type?

With respect to their possible values, there are three different types of variables.

Depending on the type of the output variable, there are two fundamentally different types of learning. If the output is nominal, the task is a classification task. If the output is continuous, the task is a regression task. The PNC2 Rule Induction System is primarily intended for classification tasks.
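The distinction between the two learning tasks can be sketched as a small check on the recorded output values. The function name `learning_task` is hypothetical, and the rule it applies is an assumption made for this sketch: numeric outputs are treated as continuous and everything else (for example strings) as nominal.

```python
from numbers import Real


def learning_task(outputs):
    """Guess the learning task from the recorded output values.

    Assumption for this sketch: numeric outputs are continuous, all other
    values (e.g. strings) are nominal. Booleans are treated as nominal.
    """
    if not outputs:
        raise ValueError("at least one output value is required")
    numeric = all(
        isinstance(y, Real) and not isinstance(y, bool) for y in outputs
    )
    return "regression" if numeric else "classification"


print(learning_task(["red", "green", "red"]))  # classification
print(learning_task([0.3, 1.7, 2.2]))          # regression
```

Note that such a heuristic misjudges class labels stored as numbers (for example 0 and 1), so in practice the variable's type is better declared explicitly than inferred.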



How to estimate a model's prediction accuracy?

Usually, the prediction accuracy of a learned model is evaluated on a new, unseen test data sample. For each test data tuple, the model predicts an output value from the tuple's input vector. The differences between the real and the predicted output values are then evaluated and summarized into a single loss function value.
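As a minimal sketch of this evaluation, the two functions below summarize the differences between real and predicted outputs into one number. They assume two common loss functions, which are not necessarily the ones used by PNC2: the misclassification rate for nominal outputs and the mean squared error for continuous outputs. The function names and the sample values are hypothetical.

```python
def error_rate(actual, predicted):
    """Fraction of test tuples whose predicted class differs from the real one."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


def mean_squared_error(actual, predicted):
    """Average squared difference between real and predicted output values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("need two non-empty sequences of equal length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Classification: one of four test tuples is predicted wrongly.
print(error_rate(["a", "b", "a", "b"], ["a", "b", "b", "b"]))  # 0.25

# Regression: a single prediction is off by 2, so the squared errors sum to 4.
print(mean_squared_error([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 5.0, 4.0]))  # 1.0
```

In both cases a smaller value means a better model, and the value is only meaningful if the test tuples were not used to build the model.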

last updated: 22 January 2004   © 2000-2004 by Lars Haendel