CsvPivot
The CsvPivot gem takes a table in the form of an array of arrays, or a file path to a csv file. It then creates a pivot table on the data from a specified column(s) and row(s) to pivot on.
It returns an array of arrays, or can be given an output path to create a pivoted csv file.
It can be given a proc for user defined aggregation methods.
Installation
Add this line to your application's Gemfile:
gem 'csv_pivot'
And then execute:
$ bundle
Or install it yourself as:
$ gem install csv_pivot
Usage
require 'csv_pivot'
The following are the args that must be specified:
input = {
:input_path => "spec/fixtures/testcsv.csv", # or :input_data
:pivot_rows => "date", # group by
:pivot_columns => "name", # new headers
:pivot_data => "baz"
}
call with:
array_of_arrays = CsvPivot::PivotTable.new(input).pivot
Example 1
assuming you have a csv that looks something like this:
foo | bar | baz | date | name |
---|
1 | 2 | 3 | 4/1/11 | mark |
4 | 5 | 6 | 5/15/14 | mark |
7 | 8 | 9 | 4/7/12 | bear |
10 | 11 | 12 | 5/11/11 | bear |
1 | 2 | 3 | 4/1/11 | mark |
4 | 5 | 6 | 5/11/11 | mark |
7 | 8 | 9 | 4/7/12 | bear |
10 | 11 | 12 | 5/15/14 | bear |
calling
array_of_arrays = CsvPivot::PivotTable.new(input).pivot
returns
[["date", "mark", "bear"], ["4/1/11", 6], ["5/15/14", 6, 12], ["4/7/12", nil, 18], ["5/11/11", 6, 12]]
which if printed to a csv row by row would be equivalent to:
date | mark | bear |
---|
4/1/11 | 6 | nil |
5/15/14 | 6 | 12 |
4/7/12 | nil | 18 |
5/11/11 | 6 | 12 |
note that nils are returned in the absence of data and not zeroes.
Example 2
A csv of the form (no headers):
| | | | |
---|
1 | 2 | 3 | 4/1/11 | mark |
4 | 5 | 6 | 5/15/14 | mark |
7 | 8 | 9 | 4/7/12 | bear |
10 | 11 | 12 | 5/11/11 | bear |
1 | 2 | 3 | 4/1/11 | mark |
4 | 5 | 6 | 5/11/11 | mark |
7 | 8 | 9 | 4/7/12 | bear |
10 | 11 | 12 | 5/15/14 | bear |
p = Proc.new do |array| # a proc that will return the count (see aggregation methods below)
array.length
end
input = {
:input_path => "spec/fixtures/testcsv.csv", # location of csv file
:pivot_rows => 3, # group by (index of the date column)
:pivot_columns => 4, # new headers (index of the name column)
:pivot_data => 2, # data to aggregate (index of baz column)
:aggregate_method => p,
:column_total => true,
:row_total => true
}
array_of_arrays = CsvPivot::PivotTable.new(input).pivot
puts array_of_arrays.inspect
=> [["date", "mark", "bear", "Total"], ["4/1/11", 2, nil, 2.0], ["5/15/14", 1, 1, 2.0], ["4/7/12", nil, 2, 2.0], ["5/11/11", 1, 1, 2.0], ["Total", 4.0, 4.0, 8.0]]
which is equivalent to the following table:
nil | mark | bear | Total |
---|
4/1/11 | 2 | nil | 2.0 |
5/15/14 | 1 | 1 | 2.0 |
4/7/12 | nil | 2 | 2.0 |
5/11/11 | 1 | 1 | 2.0 |
Total | 4.0 | 4.0 | 8.0 |
The last index of the Total row displays the sum of the Total column.
Optional Arguments
- :sort => boolean
sorts columns and rows
- :headers => boolean
- :aggregate_method => Proc
a proc that takes an array of values and returns desired output (e.g. average, max, min, sum, count, etc...)
- :column_total => boolean
return total column (default false)
- :row_total => boolean
return total row (default false)
Pivot to CSV
Calling
CsvPivot::PivotTable.new(input).pivot_to_csv(output_path)
will not return an array of arrays. It will print your data to a csv file specified by output_path.
example
output_path = "spec/fixtures/testcsv_pivoted.csv"
input = {:input_path => "spec/fixtures/testcsv.csv",
:pivot_rows => "date",
:pivot_columns => "name",
:pivot_data => "baz" }
CsvPivot::PivotTable.new(input).pivot_to_csv(output_path)
File.exists? output_path
=> true
Aggregation Methods
The default aggregation method, when no aggregate_method is specified, is sum.
The proc looks like:
p = Proc.new do |array|
array.map(&:to_i).reduce(0, :+)
end
Alternate aggregation method examples
Below are some examples of alternate aggregation methods. The csv_pivot gem makes no assumptions about the data passed to it. Data from a csv is a string. This will need to be cast to a numeric (int or float) before arithmetic can be performed on it. Casting to int (or not) is the responsibility of the proc. The default proc is sum, but note that it casts the members of the array to an int. Below are some examples of writing your own procs. Anything goes as long as it works on an array of values and returns a single value.
Average (floats)
p = Proc.new do |array|
array.map(&:to_f).reduce(0, :+)/array.length
end
Max (ints)
p = Proc.new do |array|
array.map(&:to_i).max
end
Concat an array of strings (comma separate)
p = Proc.new do |array|
array.join(",")
end
Count (type insensitive)
p = Proc.new do |array|
array.length
end
note that for user defined procs, the data to be aggregated is a string, and must be cast to numeric for mathematical aggregation.
Contributing
- Fork it ( https://github.com/[my-github-username]/csv_pivot/fork )
- Create your feature branch (
git checkout -b my-new-feature
) - Commit your changes (
git commit -am 'Add some feature'
) - Push to the branch (
git push origin my-new-feature
) - Create a new Pull Request