Datasets

Overview

  • A JAR file containing 37 classification problems, originally obtained from the UCI repository (http://www.ics.uci.edu/~mlearn/MLRepository.html) (datasets-UCI.jar (http://prdownloads.sourceforge.net/weka/datasets-UCI.jar), 1,190,961 bytes).
  • A JAR file containing 37 regression problems, obtained from various sources (datasets-numeric.jar (http://prdownloads.sourceforge.net/weka/datasets-numeric.jar), 169,344 bytes).
  • A JAR file containing 6 agricultural datasets, obtained from agricultural researchers in New Zealand (agridatasets.jar (http://www.cs.waikato.ac.nz/~ml/weka/agridatasets.jar), 31,200 bytes).
  • A JAR file containing 30 regression datasets collected by Luis Torgo (regression-datasets.jar (http://prdownloads.sourceforge.net/weka/regression-datasets.jar), 10,090,266 bytes).
  • A gzipped tar archive containing UCI (http://www.ics.uci.edu/~mlearn/MLRepository.html) and UCI KDD (http://kdd.ics.uci.edu/) datasets (uci-20070111.tar.gz (http://prdownloads.sourceforge.net/weka/uci-20070111.tar.gz), 17,952,832 bytes).
  • A gzipped tar archive containing StatLib (http://lib.stat.cmu.edu/datasets/) datasets (statlib-20050214.tar.gz (http://prdownloads.sourceforge.net/weka/statlib-20050214.tar.gz), 12,785,582 bytes).
  • A gzipped tar archive containing ordinal, real-world datasets donated by Dr. Arie Ben David, Holon Institute of Technology, Israel (datasets-arie_ben_david.tar.gz (http://prdownloads.sourceforge.net/weka/datasets-arie_ben_david.tar.gz), 11,348 bytes).
  • A zip file containing 19 multi-class (1-of-n) text datasets donated by George Forman (http://www.hpl.hp.com/personal/George_Forman/), Hewlett-Packard Labs (http://www.hpl.hp.com/) (19MclassTextWc.zip (http://prdownloads.sourceforge.net/weka/19MclassTextWc.zip?download), 14,084,828 bytes).
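
The JAR and zip downloads listed above are ordinary ZIP archives, so they can be unpacked with any ZIP tool. The following is a minimal sketch, not part of Weka, of how one might list and extract the datasets from such an archive once it has been downloaded. It assumes the datasets are stored inside the archive as files ending in .arff; the internal layout of the archives is not described on this page. The function names list_arff_members and extract_arff_members are hypothetical helpers introduced for illustration.

```python
import zipfile


def list_arff_members(archive_path):
    """Return the sorted names of all .arff entries in a JAR/ZIP archive.

    Assumption: datasets inside the archive use the .arff extension.
    """
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.lower().endswith(".arff")
        )


def extract_arff_members(archive_path, target_dir):
    """Extract only the .arff entries into target_dir and return their names.

    Other entries (for example the JAR manifest) are skipped.
    """
    names = list_arff_members(archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(path=target_dir, members=names)
    return names
```

For example, after downloading datasets-UCI.jar into the current directory, extract_arff_members("datasets-UCI.jar", "datasets") would unpack any .arff files it contains into a datasets directory. The gzipped tar archives in the list are not ZIP files and would need a tar tool (or Python's tarfile module) instead.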

Other datasets in ARFF format

  • Protein data sets (http://www.csc.lsu.edu/~ji/compbio/index.htm), maintained by Shuiwang Ji, CS Department, Louisiana State University

Links