Writing your own Classifier (up to 3.5.2)


[Figure: Build_classifier.png, mindmap for how to write a Classifier]

If you have an idea for a new classifier and want to implement it for Weka, this HOWTO will help you develop it.

The mindmap (produced with Freemind (http://freemind.sourceforge.net)) helps you decide which base classifier to start from and which methods to implement, and it gives some general guidelines.

The base classifiers are all located in the following package:

weka.classifiers

Packages

A few comments about the different classifier sub-packages:

  • bayes
    contains Bayesian classifiers, e.g., NaiveBayes
  • evaluation
    classes related to evaluation, e.g., cost matrix
  • functions
    e.g., Support Vector Machines, regression algorithms, neural nets
  • lazy
    lazy learners that do no offline training; the work is done at prediction time, e.g., k-NN
  • meta
    Meta classifiers that use a base classifier as input, e.g., boosting or bagging
  • misc
    various classifiers that don't fit in any other category
  • rules
    rule-based classifiers, e.g., ZeroR
  • trees
    tree classifiers, like decision trees

Coding

Random number generators

To keep experiments repeatable, do not use unseeded random number generators such as Math.random(). Instead, instantiate a java.util.Random object with a specific seed value in the buildClassifier(Instances) method. The seed value can of course be user supplied; the Randomizable... abstract classifiers already implement this.
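The seeding pattern can be sketched without any Weka dependency. In the minimal example below, SeededLearnerSketch is a hypothetical stand-in class, its build(int) method plays the role of buildClassifier(Instances), and the default seed of 1 is an assumption made for illustration only.

```java
import java.util.Arrays;
import java.util.Random;

/**
 * Hypothetical stand-in for a classifier that needs randomness; NOT a Weka class.
 * build(int) plays the role of buildClassifier(Instances).
 */
class SeededLearnerSketch {
    private int seed = 1;   // user-supplied seed; the default of 1 is an assumption
    private int[] order;    // visiting order of the instances, fixed during build

    public void setSeed(int seed) { this.seed = seed; }
    public int getSeed() { return seed; }

    /** Shuffles the indices 0..numInstances-1 with a freshly seeded generator. */
    public void build(int numInstances) {
        // Created inside build: the same seed always yields the same sequence.
        Random random = new Random(seed);
        order = new int[numInstances];
        for (int i = 0; i < numInstances; i++) {
            order[i] = i;
        }
        // Fisher-Yates shuffle driven by the seeded generator.
        for (int i = numInstances - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
    }

    public int[] getOrder() { return order.clone(); }

    public static void main(String[] args) {
        SeededLearnerSketch a = new SeededLearnerSketch();
        SeededLearnerSketch b = new SeededLearnerSketch();
        a.setSeed(42);
        b.setSeed(42);
        a.build(10);
        b.build(10);
        // Two runs with the same seed produce the same order.
        System.out.println(Arrays.equals(a.getOrder(), b.getOrder()));
    }
}
```

Because the generator is created inside build rather than once in the constructor, rebuilding the same object with the same seed also reproduces the same result.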

Integration

After the coding stage, it is time to integrate your classifier into the Weka framework, i.e., to make it available in the Explorer, Experimenter, etc. Starting with version 3.4.4, Weka supports automatic discovery of derived classes in your classpath, managed by the GenericPropertiesCreator.

This page shows you how to tell Weka where to find your classifier so that it is displayed in the GenericObjectEditor.
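As a rough illustration of what such a registration can look like, the fragment below sketches an entry in GenericPropertiesCreator.props. It assumes that the file maps a superclass name to a comma-separated list of packages to scan. The package my.own.classifiers is a hypothetical name, and the list of default packages may differ between Weka versions.

```properties
# Packages scanned for classes derived from weka.classifiers.Classifier.
# my.own.classifiers is a hypothetical package holding the new classifier.
weka.classifiers.Classifier=\
 weka.classifiers.bayes,\
 weka.classifiers.functions,\
 weka.classifiers.lazy,\
 weka.classifiers.meta,\
 weka.classifiers.misc,\
 weka.classifiers.rules,\
 weka.classifiers.trees,\
 my.own.classifiers
```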

Testing

Weka already provides a test framework that checks the basic functionality of a classifier. It is essential that your classifier passes these tests.

Commandline test

Use the CheckClassifier class to test your classifier from the command line:

weka.classifiers.CheckClassifier -W classname [-- additional parameters]

Only the following tests may report "no" as their result; all others must report "no (OK error message)" or "yes":

  • options
  • updateable classifier
  • weighted instances classifier
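The acceptance rule above can be written down as a small check. The sketch below is a hypothetical helper, not part of Weka, and it does not parse real CheckClassifier output; the sample results are hand-written and "some other test" is a placeholder test name.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of Weka: applies the acceptance rule to test results. */
class CheckResultRule {
    // The only tests that may report a plain "no" (taken from the list above).
    static final Set<String> MAY_BE_NO = new HashSet<>(Arrays.asList(
            "options", "updateable classifier", "weighted instances classifier"));

    /** Returns true if the result is acceptable for the given test name. */
    static boolean isAcceptable(String testName, String result) {
        if (result.equals("yes") || result.equals("no (OK error message)")) {
            return true;
        }
        return result.equals("no") && MAY_BE_NO.contains(testName);
    }

    public static void main(String[] args) {
        // Hand-written sample results; "some other test" is a placeholder name.
        Map<String, String> results = new LinkedHashMap<>();
        results.put("options", "no");
        results.put("updateable classifier", "no");
        results.put("some other test", "no");
        for (Map.Entry<String, String> e : results.entrySet()) {
            String verdict = isAcceptable(e.getKey(), e.getValue())
                    ? "acceptable" : "must be fixed";
            System.out.println(e.getKey() + ": " + e.getValue() + " -> " + verdict);
        }
    }
}
```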

Unit tests

To make sure that your classifier conforms to the Weka criteria, you should add it to the junit (http://www.junit.org/) unit test framework by creating a Test class. Starting with Weka versions 3.4.6 and 3.5.1, the AbstractClassifierTest uses the CheckClassifier class to run a battery of tests.

Instructions for checking out the unit test framework can be found here.

Links

  • Weka API (http://weka.sourceforge.net/doc/)
  • Freemind (http://freemind.sourceforge.net/)
  • junit (http://www.junit.org/)

Author: PeterReutemann