The RapidMiner Performance Classification operator is used for statistical performance evaluation of classification tasks. It delivers a list of performance criteria values for the classification task.
The RapidMiner Performance Classification operator should be used for performance evaluation of classification tasks only. Other performance evaluation operators are also available in RapidMiner, for example:
- Performance operator
- Performance Binominal Classification operator
- Performance Regression operator etc.
The RapidMiner Performance Classification operator is used with classification tasks only. The Performance operator, on the other hand, automatically determines the learning task type and calculates the most common criteria for that type. The Performance User Based operator can be used if you want to write your own performance measure.
Classification is a technique for predicting the group membership of data instances. For example, you may wish to use classification to predict whether the train on a particular day will be ‘on time’, ‘late’ or ‘very late’. Predicting whether the number of people at a particular event will be ‘below-average’, ‘average’ or ‘above-average’ is another example.
To evaluate the statistical performance of a classification model, the data set must be labeled, i.e. it must have an attribute with the label role and an attribute with the prediction role. The label attribute stores the actual observed values, whereas the prediction attribute stores the label values predicted by the classification model under discussion.
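The relationship between the label attribute and the prediction attribute can be illustrated outside RapidMiner with a minimal Python sketch. It is not RapidMiner code: the function names and the five example values are hypothetical and chosen only to show how a criterion such as accuracy is derived by comparing the two attributes row by row.

```python
from collections import Counter


def accuracy(labels, predictions):
    """Return the fraction of examples whose prediction equals the label."""
    if len(labels) != len(predictions):
        raise ValueError("labels and predictions must have the same length")
    if not labels:
        raise ValueError("at least one example is required")
    correct = sum(1 for actual, predicted in zip(labels, predictions)
                  if actual == predicted)
    return correct / len(labels)


def confusion_counts(labels, predictions):
    """Count how often each (actual, predicted) pair occurs."""
    return Counter(zip(labels, predictions))


# Hypothetical values, for illustration only.
label = ["on time", "late", "very late", "on time", "late"]
prediction = ["on time", "late", "late", "on time", "on time"]

acc = accuracy(label, prediction)
print(f"accuracy: {acc:.2f}")
print(f"classification error: {1 - acc:.2f}")
for (actual, predicted), count in sorted(confusion_counts(label, prediction).items()):
    print(f"actual={actual!r} predicted={predicted!r}: {count}")
```

Without both columns no such comparison is possible, which is why the operator needs an attribute in each role.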
Input for RapidMiner Performance Classification
- Labeled Data
This input port expects a labeled Example Set. The Apply Model operator is an example of an operator that provides labeled data. Make sure that the Example Set has a label attribute and a prediction attribute.
- Performance
This input port is optional and expects a Performance Vector.
Output for RapidMiner Performance Classification
- Performance
This port delivers a Performance Vector, which is a list of performance criteria values. The Performance Vector is calculated on the basis of the label attribute and the prediction attribute of the input Example Set. The output Performance Vector contains the performance criteria calculated by this Performance Classification operator.
If a Performance Vector was also fed to the performance input port, the criteria of the input Performance Vector are added to the output Performance Vector as well. If the input Performance Vector and the calculated Performance Vector both contain the same criterion but with different values, the value from the calculated Performance Vector is delivered through the output port.
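The merge rule described above can be sketched in Python as follows. This is only an illustration under stated assumptions: a performance vector is modeled as a plain dictionary from criterion name to value, which is not RapidMiner's internal representation, and the function name, criterion names and numbers are hypothetical.

```python
def merge_performance_vectors(input_vector, calculated_vector):
    """Combine two performance vectors modeled as dicts.

    Criteria from the input vector are carried over; when both vectors
    contain the same criterion, the calculated value takes precedence.
    """
    merged = dict(input_vector)       # start with the criteria fed at the input port
    merged.update(calculated_vector)  # calculated values overwrite shared criteria
    return merged


# Hypothetical values, for illustration only.
input_vector = {"accuracy": 0.70, "kappa": 0.40}
calculated_vector = {"accuracy": 0.60, "classification_error": 0.40}

merged = merge_performance_vectors(input_vector, calculated_vector)
for criterion, value in merged.items():
    print(f"{criterion}: {value:.2f}")
```

In this sketch the shared criterion takes the calculated value, while the criterion that appears only in the input vector is carried over unchanged.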
- Example Set
The Example Set that was given as input is passed to the output through this port without any changes. This is usually used to reuse the same Example Set in further operators or to view the Example Set in the Results Workspace.
Use of the performance port in the RapidMiner Performance Classification operator
The Example Process is composed of two Subprocess operators and one Performance Classification operator.
- The first Subprocess operator ‘Subprocess’ loads the ‘Golf’ data set using the Retrieve operator and then learns a classification model using the k-NN operator.
- The second Subprocess operator ‘Subprocess’ loads the ‘Golf’ data set using the Retrieve operator and then learns a classification model using the k-NN operator.