en:Datasets
Overview
- A JAR file containing 37 classification problems, originally obtained from the UCI repository (http://www.ics.uci.edu/~mlearn/MLRepository.html) (datasets-UCI.jar (http://prdownloads.sourceforge.net/weka/datasets-UCI.jar), 1,190,961 bytes).
- A JAR file containing 37 regression problems, obtained from various sources (datasets-numeric.jar (http://prdownloads.sourceforge.net/weka/datasets-numeric.jar), 169,344 bytes).
- A JAR file containing 6 agricultural datasets obtained from agricultural researchers in New Zealand (agridatasets.jar (http://www.cs.waikato.ac.nz/~ml/weka/agridatasets.jar), 31,200 bytes).
- A JAR file containing 30 regression datasets collected by Luis Torgo (regression-datasets.jar (http://prdownloads.sourceforge.net/weka/regression-datasets.jar), 10,090,266 bytes).
- A gzipped tar archive containing UCI (http://www.ics.uci.edu/~mlearn/MLRepository.html) and UCI KDD (http://kdd.ics.uci.edu/) datasets (uci-20070111.tar.gz (http://prdownloads.sourceforge.net/weka/uci-20070111.tar.gz), 17,952,832 bytes).
- A gzipped tar archive containing StatLib (http://lib.stat.cmu.edu/datasets/) datasets (statlib-20050214.tar.gz (http://prdownloads.sourceforge.net/weka/statlib-20050214.tar.gz), 12,785,582 bytes).
- A gzipped tar archive containing ordinal, real-world datasets donated by Dr. Arie Ben David, Holon Institute of Technology, Israel (datasets-arie_ben_david.tar.gz (http://prdownloads.sourceforge.net/weka/datasets-arie_ben_david.tar.gz), 11,348 bytes).
- A zip file containing 19 multi-class (1-of-n) text datasets donated by George Forman (http://www.hpl.hp.com/personal/George_Forman/) of Hewlett-Packard Labs (http://www.hpl.hp.com/) (19MclassTextWc.zip (http://prdownloads.sourceforge.net/weka/19MclassTextWc.zip?download), 14,084,828 bytes).
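
The archives listed above come in three container formats (JAR, gzipped tar, and zip). As a rough sketch of how one might inspect a downloaded archive before loading anything into WEKA, the following Python snippet lists the ARFF files inside it. It relies on the fact that a JAR file is an ordinary zip archive; the function name `list_arff_members` is hypothetical, and the assumption that the archives store their datasets as plain `.arff` entries has not been verified for every file in the list.

```python
import tarfile
import zipfile


def list_arff_members(archive_path):
    """Return the sorted names of .arff entries in a jar/zip or tar archive.

    Hypothetical helper for illustration; it only reads the archive index
    and does not extract or parse any dataset.
    """
    archive_path = str(archive_path)
    if zipfile.is_zipfile(archive_path):
        # A .jar file is a zip archive, so zipfile can read it directly.
        with zipfile.ZipFile(archive_path) as archive:
            names = archive.namelist()
    elif tarfile.is_tarfile(archive_path):
        # tarfile.open detects gzip compression automatically.
        with tarfile.open(archive_path) as archive:
            names = archive.getnames()
    else:
        raise ValueError("unsupported archive format: " + archive_path)
    # Assumption: datasets are stored with an .arff extension.
    return sorted(n for n in names if n.lower().endswith(".arff"))
```

For example, after downloading `datasets-UCI.jar` to the current directory, `list_arff_members("datasets-UCI.jar")` would return the paths of the ARFF files stored in it, which can then be extracted with the same standard-library modules.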
Other datasets in ARFF format
- Protein datasets (http://www.csc.lsu.edu/~ji/compbio/index.htm), maintained by Shuiwang Ji, CS Department, Louisiana State University
Links
- UCI http://www.ics.uci.edu/~mlearn/MLRepository.html
- UCI KDD http://kdd.ics.uci.edu/
- StatLib http://lib.stat.cmu.edu/datasets/
- WEKA on SourceForge http://sf.net/projects/weka/
