Huge News!Announcing our $40M Series B led by Abstract Ventures.Learn More
Socket
Sign inDemoInstall
Socket

cross_validation

Package Overview
Dependencies
Maintainers
1
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

cross_validation

  • 0.0.2
  • Rubygems
  • Socket score

Version published
Maintainers
1
Created
Source

CrossValidation

Build Status Code Climate

This gem provides a k-fold cross-validation routine and confusion matrix for evaluating machine learning classifiers. See below for usage or jump to the documentation.

Installation

Add this line to your application's Gemfile:

gem 'cross_validation'

And then execute:

$ bundle install --binstubs .bin

Or install it yourself as:

$ gem install cross_validation

Usage

To cross-validate your classifier, you need to configure a run as follows:

require 'cross_validation'

runner = CrossValidation::Runner.create do |r|
  r.documents = my_array_of_documents
  r.folds = 10
  # or if you'd rather test on 10%
  # r.percentage = 0.1
  r.classifier = lambda { SpamClassifier.new }
  r.fetch_sample_class = lambda { |sample| sample.klass }
  r.fetch_sample_value = lambda { |sample| sample.value }
  r.matrix = CrossValidation::ConfusionMatrix.new(method(:keys_for))
  r.training = lambda { |classifier, doc|
    classifier.train doc.klass, doc.value
  }
  r.classifying = lambda { |classifier, doc|
    classifier.classify doc
  }
end

With the run configured, just invoke #run to return a confusion matrix:

mat = runner.run

With a confusion matrix in hand, you can compute many statistics about your classifier:

  • mat.accuracy
  • mat.f1
  • mat.fscore(beta)
  • mat.precision
  • mat.recall

Please see the respective documentation for each method for more details.

Defining keys_for

The ConfusionMatrix class requires a keys_for Proc that returns a symbol. In this method, you specify what constitutes a true positive (:tp), true negative (:tn), false positive (:fp), and false negative (:fn). For example, in spam classification, you can construct the following table to write the keys_for method:

                        actual
          +---------------------------------
 expected | correct        | not correct
----------+----------------+----------------
 spam     | true positive  | false positive
 ham      | true negative  | false negative

You can then implement this table with nested hashes or just a few conditionals:

def keys_for(expected, actual)
  if expected == :spam
    actual == :spam ? :tp : :fp
  elsif expected == :ham
    actual == :ham ? :tn : :fn
  end
end

Once you have your keys_for method implemented, pass it into the ConfusionMatrix with method(:keys_for), or if it's a class-method, MyClass.method(:keys_for). (You can also implement the method as a lambda.)

Roadmap

For v1.0:

  • Implement configurable, parallel cross-validation
  • Include more complete examples

Author

Jon-Michael Deldin, dev@jmdeldin.com

FAQs

Package last updated on 15 Apr 2013

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts

SocketSocket SOC 2 Logo

Product

  • Package Alerts
  • Integrations
  • Docs
  • Pricing
  • FAQ
  • Roadmap
  • Changelog

Packages

npm

Stay in touch

Get open source security insights delivered straight into your inbox.


  • Terms
  • Privacy
  • Security

Made with ⚡️ by Socket Inc