Writing your own Classifier (up to 3.5.2)
[Image: Build_classifier.png]
If you have an idea for a new classifier and want to write one for Weka, this HOWTO will help you develop it.
The mind map (Build_classifier.png, produced with Freemind (http://freemind.sourceforge.net)) helps you decide which base classifier to start from, which methods to implement, and which general guidelines to follow.
The base classifiers are all located in the following package:
weka.classifiers
Packages
A few comments about the different classifier sub-packages:
- bayes: Bayesian classifiers, e.g., NaiveBayes
- evaluation: classes related to evaluation, e.g., cost matrix
- functions: e.g., Support Vector Machines, regression algorithms, neural nets
- lazy: no offline learning; learning is done at runtime, e.g., k-NN
- meta: meta classifiers that use a base classifier as input, e.g., boosting or bagging
- misc: various classifiers that don't fit in any other category
- rules: rule-based classifiers, e.g., ZeroR
- trees: tree classifiers, like decision trees
Coding
Random number generators
To keep experiments repeatable, do not use unseeded random number generators such as Math.random(). Instead, instantiate a java.util.Random object with a specific seed value in the buildClassifier(Instances) method. The seed value can, of course, be user-supplied; the Randomizable... abstract classifiers already implement this.
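The seeding pattern can be sketched without any Weka dependency. In the minimal example below, the class SeededSketch, its default seed, and the build(int) method are hypothetical stand-ins: build(int) only plays the role of buildClassifier(Instances) and shuffles instance indices rather than training anything.

```java
import java.util.Arrays;
import java.util.Random;

/**
 * Stand-alone sketch of the seeding pattern (no Weka classes involved).
 * SeededSketch and all of its members are hypothetical names.
 */
class SeededSketch {
    private int seed = 1; // hypothetical default; meant to be user-supplied

    public void setSeed(int seed) { this.seed = seed; }
    public int getSeed() { return seed; }

    /**
     * Stands in for buildClassifier(Instances): a fresh Random is created
     * from the stored seed on every call, so repeated builds see the same
     * random sequence.
     */
    public int[] build(int numInstances) {
        Random random = new Random(seed);
        int[] order = new int[numInstances];
        for (int i = 0; i < numInstances; i++) {
            order[i] = i;
        }
        // Fisher-Yates shuffle driven by the seeded generator
        for (int i = numInstances - 1; i > 0; i--) {
            int j = random.nextInt(i + 1);
            int tmp = order[i];
            order[i] = order[j];
            order[j] = tmp;
        }
        return order;
    }
}

class SeededSketchDemo {
    public static void main(String[] args) {
        SeededSketch a = new SeededSketch();
        SeededSketch b = new SeededSketch();
        a.setSeed(42);
        b.setSeed(42);
        // Same seed, same shuffle order
        System.out.println(Arrays.equals(a.build(10), b.build(10))); // prints "true"
    }
}
```

Creating the generator inside the build method, rather than once in the constructor, means that rebuilding the same object on the same data also gives the same result.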
Integration
After the coding stage, it is time to integrate your classifier into the Weka framework, i.e., to make it available in the Explorer, Experimenter, etc. Starting with version 3.4.4, Weka supports automatic discovery of derived classes in your classpath, managed by the GenericPropertiesCreator.
This page shows you how to tell Weka where to find your classifier so that it is displayed in the GenericObjectEditor.
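As a rough illustration of what the discovery works from, the sketch below assumes that the GenericPropertiesCreator reads a properties file in which the key is a superclass name and the value is a comma-separated list of packages to search. The excerpt in the code, the package my.own.classifiers, and the class PropsSketch are hypothetical; the sketch only parses such an entry with java.util.Properties and does not call any Weka code.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper that reads a package list from a props excerpt. */
class PropsSketch {
    // Assumed layout: superclass = comma-separated packages.
    // my.own.classifiers is a made-up package holding your classifier.
    static final String EXCERPT =
          "weka.classifiers.Classifier=\\\n"
        + " weka.classifiers.bayes,\\\n"
        + " weka.classifiers.trees,\\\n"
        + " my.own.classifiers\n";

    /** Returns the packages listed for the given superclass key. */
    static List<String> packagesFor(String superclass) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(EXCERPT));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> packages = new ArrayList<>();
        String value = props.getProperty(superclass, "");
        for (String part : value.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                packages.add(trimmed);
            }
        }
        return packages;
    }
}

class PropsSketchDemo {
    public static void main(String[] args) {
        // Lists the packages that would be searched for classifiers
        System.out.println(PropsSketch.packagesFor("weka.classifiers.Classifier"));
    }
}
```

Under this assumption, making a new classifier visible amounts to adding its package to the list for the relevant superclass.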
Testing
Weka already provides a test framework that checks the basic functionality of a classifier. It is essential that your classifier passes these tests.
Command-line test
Use the CheckClassifier class to test your classifier from the command line:
weka.classifiers.CheckClassifier -W classname [-- additional parameters]
Only the following tests may report "no"; all others must report "no (OK error message)" or "yes":
- options
- updateable classifier
- weighted instances classifier
Unit tests
To make sure that your classifier meets the Weka criteria, you should add it to the JUnit (http://www.junit.org/) unit test framework by creating a Test class. (Starting with Weka versions 3.4.6 and 3.5.1, the AbstractClassifierTest uses the CheckClassifier class to run a battery of tests.)
Instructions for checking out the unit test framework can be found here.
Links
- Weka API (http://weka.sourceforge.net/doc/)
- Freemind (http://freemind.sourceforge.net/)
- junit (http://www.junit.org/)
Author: PeterReutemann
