Author:Mark Hall <mhall{[at]}pentaho.com>
Category:KnowledgeFlow
Changes:Now processes incoming batch data as well as streaming data
Date:2016-09-26
Depends:weka (>=3.8.0), distributedWekaBase (>=1.0.15)
Description:Provides a Knowledge Flow step that computes summary statistics incrementally on an incoming instance stream. In addition to simple statistics (count, min, max, mean, standard deviation and frequency counts), the user can optionally enable computation of the median and quartiles, which are estimated with a T-digest streaming quantile estimator. Because the T-digest estimator is slower to execute than the other statistics, it is not turned on by default. The step can also output a summary chart (histogram, box plot or pie chart) for each attribute in the data. Statistics can be output once at the end of the stream or periodically as the stream is processed.
License:GPL 3.0
Maintainer:Mark Hall <mhall{[at]}pentaho.com>
PackageURL:http://downloads.sourceforge.net/project/weka/weka-packages/streamingUnivariateStats1.0.1.zip
URL:http://weka.sourceforge.net/doc.packages/streamingUnivariateStats
Version:1.0.1