
ToARFF is a Ruby library that converts SQLite database files to ARFF (Attribute-Relation File Format) files. ARFF is the format used to specify datasets for WEKA, a machine learning and data mining tool.
The Weka wiki describes the format well:
"An ARFF (Attribute-Relation File Format) file is an ASCII text file that describes a list of instances sharing a set of attributes. ARFF files were developed by the Machine Learning Project at the Department of Computer Science of The University of Waikato for use with the Weka machine learning software."
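To make the format concrete, here is a minimal sketch that assembles an ARFF string by hand in plain Ruby. The `build_arff` helper is hypothetical and is not part of to-arff; it only illustrates the `@RELATION` / `@ATTRIBUTE` / `@DATA` layout and does not escape quotes inside string values.

```ruby
# Hypothetical helper, not part of to-arff: builds a minimal ARFF string by hand.
# Assumption: string values contain no double quotes (no escaping is done).
def build_arff(relation, attributes, rows)
  lines = ["@RELATION #{relation}"]
  # One @ATTRIBUTE line per column, in insertion order.
  attributes.each { |name, type| lines << "@ATTRIBUTE #{name} #{type}" }
  lines << "@DATA"
  rows.each do |row|
    # Quote strings, leave numbers bare.
    lines << row.map { |v| v.is_a?(String) ? "\"#{v}\"" : v.to_s }.join(",")
  end
  lines.join("\n")
end

arff = build_arff(
  "albums",
  { "AlbumId" => "NUMERIC", "Title" => "STRING" },
  [[1, "Balls to the Wall"], [2, "Restless and Wild"]]
)
puts arff
```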
Note: Converting from an SQLite database generates one ARFF file per table. See this Stack Overflow post.
Add this line to your application's Gemfile:
gem 'to-arff'
And then execute:
$ bundle
Or install it yourself as:
$ gem install to-arff
###Convert from an SQLite Database
Use the convert() method and pass the column/attribute types through the column_types: parameter, either as a JSON-style hash ("key": value) or as a nested hash ("key" => value).
require 'to-arff'
# Get the db file from https://github.com/dhrubomoy/to-arff/blob/master/spec/sample_db_files/sample2.db
sample = ToARFF::SQLiteDB.new "/path/to/sample2.db"
# Attribute names and types must be valid
# eg. { "table1": {"column11"=>"NUMERIC",
# "column12"=>"STRING"
# },
# "table2": {"column21"=>"class {Iris-setosa,Iris-versicolor,Iris-virginica}",
# "column22"=>"DATE \"yyyy-MM-dd HH:mm:ss\""
# }
# }
# OR { "table1" => {"column11"=>"NUMERIC",
# "column12"=>"STRING"
# },
# "table2" => {"column21"=>"class {Iris-setosa,Iris-versicolor,Iris-virginica}",
# "column22"=>"DATE \"yyyy-MM-dd HH:mm:ss\""
# }
# }
sample_column_types_param_json = {
"albums": {
"Albumid": "NUMERIC",
"Title": "STRING"
},
"employees": {
"EmployeeId": "NUMERIC",
"LastName": "STRING",
"City": "STRING",
"HireDate": "DATE 'yyyy-MM-dd HH:mm:ss'"
}
}
sample_column_types_param_hash = { "employees" => {"EmployeeId"=>"NUMERIC",
"LastName"=>"STRING",
"City"=>"STRING",
"HireDate"=>"DATE \"yyyy-MM-dd HH:mm:ss\""
},
"albums" => { "Albumid"=>"NUMERIC",
"Title"=>"STRING"
}
}
puts sample.convert column_types: sample_column_types_param_json
#OR
puts sample.convert column_types: sample_column_types_param_hash
Both produce a string similar to the following:
@RELATION employees
@ATTRIBUTE EmployeeId NUMERIC
@ATTRIBUTE LastName STRING
@ATTRIBUTE City STRING
@ATTRIBUTE HireDate DATE "yyyy-MM-dd HH:mm:ss"
@DATA
1,"Adams","Edmonton","2002-08-14 00:00:00"
2,"Edwards","Calgary","2002-05-01 00:00:00"
3,"Peacock","Calgary","2002-04-01 00:00:00"
...and so on...
@RELATION albums
@ATTRIBUTE Albumid NUMERIC
@ATTRIBUTE Title STRING
@DATA
1,"For Those About To Rock We Salute You"
2,"Balls to the Wall"
3,"Restless and Wild"
...and so on...
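The two parameter styles above differ only in key type: in Ruby, the `"key": value` syntax creates Symbol keys, while `"key" => value` keeps String keys. The sketch below shows the difference in plain Ruby. The `stringify_keys` helper is hypothetical and only illustrates how the two forms can be normalized; it makes no claim about how to-arff handles them internally.

```ruby
# "key": value creates Symbol keys; "key" => value keeps String keys.
json_style = { "albums": { "Albumid": "NUMERIC", "Title": "STRING" } }
hash_style = { "albums" => { "Albumid" => "NUMERIC", "Title" => "STRING" } }

# Hypothetical helper: recursively convert hash keys to strings
# so that both styles compare equal.
def stringify_keys(obj)
  return obj unless obj.is_a?(Hash)
  obj.each_with_object({}) { |(k, v), out| out[k.to_s] = stringify_keys(v) }
end

puts json_style.keys.first.class               # Symbol
puts hash_style.keys.first.class               # String
puts stringify_keys(json_style) == hash_style  # true
```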
require 'to-arff'
sample = ToARFF::SQLiteDB.new "/path/to/sample_sqlite.db"
# Column names must be specified like this:
# { "table1" => ["column11", "column12",...],
# "table2" => ["column21", "column22",...]
# }
# OR
# { "table1": ["column11", "column12",...],
# "table2": ["column21", "column22",...]
# }
sample_columns_json = { "albums": ["AlbumId", "Title", "ArtistId"],
"employees": ["EmployeeId", "LastName", "FirstName", "HireDate"]
}
sample_columns_hash = { "albums" => ["AlbumId", "Title", "ArtistId"],
"employees" => ["EmployeeId", "LastName", "FirstName", "HireDate"]
}
puts sample.convert columns: sample_columns_json
puts sample.convert columns: sample_columns_hash
Both the JSON-style and the hash form of the columns: parameter return a string similar to the following:
@RELATION albums
@ATTRIBUTE AlbumId NUMERIC
@ATTRIBUTE Title STRING
@ATTRIBUTE ArtistId NUMERIC
@DATA
1,"For Those About To Rock We Salute You",1
2,"Balls to the Wall",2
...and so on...
@RELATION employees
@ATTRIBUTE EmployeeId NUMERIC
@ATTRIBUTE LastName STRING
@ATTRIBUTE FirstName STRING
@ATTRIBUTE HireDate STRING
@DATA
1,"Adams","Andrew","2002-08-14 00:00:00"
2,"Edwards","Nancy","2002-05-01 00:00:00"
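...and so on...
As you can see, the "HireDate" attribute did not get the correct datatype: it should be DATE "yyyy-MM-dd HH:mm:ss", not STRING. When the type matters, pass column_types: instead of columns: so that each attribute type is stated explicitly.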
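One way to avoid writing every type by hand is to guess types from a few sample values and pass the result as `column_types:`. The sketch below is plain Ruby; `guess_arff_type` and `DATE_TIME_PATTERN` are hypothetical names, the rule set (numbers, one fixed date-time pattern, everything else a string) is an assumption, and none of it reflects how to-arff infers types internally.

```ruby
# Hypothetical helper, not part of to-arff: guess an ARFF type from sample values.
# Assumption: dates always look like "2002-08-14 00:00:00".
DATE_TIME_PATTERN = /\A\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\z/

def guess_arff_type(values)
  present = values.compact
  return "STRING" if present.empty?
  return "NUMERIC" if present.all? { |v| v.is_a?(Numeric) }
  if present.all? { |v| v.is_a?(String) && v.match?(DATE_TIME_PATTERN) }
    return 'DATE "yyyy-MM-dd HH:mm:ss"'
  end
  "STRING"
end

# Sample values taken from the employees output shown earlier.
column_samples = {
  "EmployeeId" => [1, 2],
  "LastName"   => ["Adams", "Edwards"],
  "HireDate"   => ["2002-08-14 00:00:00", "2002-05-01 00:00:00"]
}

employee_types = column_samples.transform_values { |vals| guess_arff_type(vals) }
column_types = { "employees" => employee_types }
p column_types
```

The resulting hash has the same shape as the nested-hash form of column_types: shown in the first example, so it could be passed to convert in the same way.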
require 'to-arff'
sample = ToARFF::SQLiteDB.new "/path/to/sample_sqlite.db"
sample.convert tables: ["albums","employees"]
# OR
sample.convert
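Because WEKA expects one relation per ARFF file, a string that holds several relations needs to be split before it is saved. The sketch below assumes the converted string has the shape shown in the sample outputs (each table starts with an `@RELATION` line). `split_relations` is a hypothetical helper, not part of to-arff, and the sample text is hard-coded so the example runs without a database.

```ruby
require 'tmpdir'

# Hypothetical helper, not part of to-arff: split a multi-relation string into
# one chunk per @RELATION line, keyed by relation name.
def split_relations(arff_text)
  relations = {}
  current = nil
  arff_text.each_line do |line|
    if (m = line.match(/\A@RELATION\s+(\S+)/i))
      current = m[1]
      relations[current] = +""   # unfrozen empty string to append to
    end
    relations[current] << line if current
  end
  relations
end

# Stand-in for the string returned by convert (assumed shape).
combined = <<~ARFF
  @RELATION albums
  @ATTRIBUTE AlbumId NUMERIC
  @DATA
  1
  @RELATION employees
  @ATTRIBUTE EmployeeId NUMERIC
  @DATA
  1
ARFF

# Write one .arff file per relation into a temporary directory.
Dir.mktmpdir do |dir|
  split_relations(combined).each do |name, text|
    File.write(File.join(dir, "#{name}.arff"), text)
  end
  puts Dir.children(dir).sort.inspect
end
```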
Bug reports and pull requests are welcome on GitHub at https://github.com/dhrubomoy/to-arff. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the Contributor Covenant code of conduct.
1. Create your feature branch (git checkout -b my-new-feature)
2. Run rake spec/ and make sure all the tests pass
3. Commit your changes (git commit -am 'Add some feature')
4. Push to the branch (git push origin my-new-feature)

The gem is available as open source under the terms of the MIT License.