rails_data_fix

  • 0.1.5
  • Rubygems
#+TITLE: DataFix - Data Maintenance Tasks Manager
#+AUTHOR: Adolfo Villafiorita

Manage your data maintenance tasks like migrations.

#+begin_quote
I am not sure how it slipped through, but I realized there is another gem, [[https://rubygems.org/gems/data_migrate/versions][data_migrate]], which does the same thing as =data_fix=.

... well, I should actually say it the other way around, since =data_migrate= has been around for longer and has been downloaded extensively.

We keep using and maintaining =data_fix=, but if you are starting from scratch, [[https://rubygems.org/gems/data_migrate/versions][data_migrate]] is probably a more complete and safer choice.
#+end_quote

  • Introduction

This Rails gem provides a set of tasks to manage DB data maintenance tasks as if they were migrations.

Data maintenance tasks include anything that does not fit in a schema migration, such as adding new records in production, fixing errors in existing records, or migrating data to a new schema.

Before we wrote =data_fix=, we would create a rake task or a script with the migration code, test it in development, and finally run it in production. The process was highly manual, with no record of which migrations were run and when. Although these scripts are usually one-offs and lose their value once run, we were not quite ok with the approach.

=data_fix= helps by enforcing standards and keeping track of the data fixes that have been run.

Thus, similar to Rails schema migrations, =data_fix= provides:

  • Tasks for generating time-stamped scripts, where you put the code you need to run on your data
  • A table in your DBs to keep track of which scripts have already been run
  • Tasks to manage the execution of the scripts and the table of the tasks run
  • Automatic backup of your data before running the scripts
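The time-stamped naming in the first item can be sketched in plain Ruby. This is only an illustration, not the gem's actual code: the helper names =data_fix_filename= and =data_fix_version= are hypothetical, and the use of UTC for the timestamp is an assumption.

```ruby
# Hypothetical helper (not part of the gem): builds a script name such as
# "20210730135129_prefer_british_over_american.rb".
# Assumption: the timestamp is taken in UTC.
def data_fix_filename(name, now = Time.now.utc)
  "#{now.strftime('%Y%m%d%H%M%S')}_#{name}.rb"
end

# Hypothetical helper: extracts the leading 14-digit timestamp, which can
# serve as a sortable version key. Returns nil when there is no timestamp.
def data_fix_version(filename)
  File.basename(filename)[/\A\d{14}/]
end

stamp = Time.utc(2021, 7, 30, 13, 51, 29)
file = data_fix_filename("prefer_british_over_american", stamp)
puts file                   # 20210730135129_prefer_british_over_american.rb
puts data_fix_version(file) # 20210730135129
```

Because the timestamp has a fixed width, sorting the file names as plain strings also sorts the scripts chronologically.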

Unlike schema migrations, =data_fix= does not provide a mechanism for data rollbacks. Writing reversible maintenance scripts on data is not only complex but also, in many cases, pointless: why should you write the code to fix a typo in a record and also the code to reintroduce the typo you are fixing? It is also overkill, since rolling back to a previous version using a backup is much simpler.

We now use it at [[https://shair.tech][Shair.Tech]] to perform data cleaning and data updates of our Rails apps.

  • Installation

Add this line to your application's Gemfile:

#+begin_example ruby
gem 'rails_data_fix'
#+end_example

And then execute:

#+begin_example
$ bundle install
#+end_example

Or install it yourself as:

#+begin_example
$ gem install rails_data_fix
#+end_example

Then, for each environment and DB in which you want to use =data_fix=, run the following command:

#+begin_example
rails data_fix:init
#+end_example

This means that you need to run =RAILS_ENV=production rake data_fix:init= in your production environment, if you want to use it there.

  • Usage

First create a file and write the script:

#+begin_example sh
rails data_fix:create[data_fix_name]
[... write the script in the generated file ...]
#+end_example

Then, for each environment in which you need to run the script:

#+begin_example sh
rails data_fix:run
#+end_example

=data_fix= scripts are not atomic: if you interrupt a script, the DB will be left in a state that depends on the code you wrote and on when you interrupted the script. In a typical scenario, only part of your records will have been updated/fixed.
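One way to live with this is to write scripts that are safe to run again after an interruption. The sketch below is a plain-Ruby simulation under stated assumptions: hashes stand in for database rows, and =fix_colors= with its =stop_after= parameter is a hypothetical helper used only to imitate an interrupted run.

```ruby
# Plain hashes stand in for database rows; a real script would use your models.
records = [
  { id: 1, color: "gray" },
  { id: 2, color: "grey" },
  { id: 3, color: "gray" },
]

# Hypothetical fix: it only touches rows that still need fixing, so a second
# run after an interruption simply finishes the remaining work.
# Returns the number of rows changed in this run.
def fix_colors(records, stop_after: nil)
  fixed = 0
  records.each do |record|
    next unless record[:color] == "gray"
    break if stop_after && fixed >= stop_after # simulates an interruption
    record[:color] = "grey"
    fixed += 1
  end
  fixed
end

fix_colors(records, stop_after: 1) # interrupted: row 3 is still "gray"
fix_colors(records)                # second run completes the fix
puts records.map { |r| r[:color] }.uniq.inspect # ["grey"]
```

A script that blindly rewrites every row is also repeatable, but selecting only the rows that still need the fix keeps a second run cheap and makes its effect easy to check.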

  • Testing

It is a good idea to test your scripts in development before running them on the actual data.

You can run the same script multiple times on the DB, either by restoring the DB state from a backup or by using the =rollback= task, which declares one or more migrations as not run.

For instance:

#+begin_example
rails data_fix:run
[... ERROR! ... ]
[... FIX SCRIPT ...]
rails data_fix:rollback
rails data_fix:run
[... REPEAT ...]
#+end_example

  • Example

A typical usage scenario is the following.

  1. You realize you have been inconsistent in storing color names in the =color= field of a table of your DB: some of your records use the word =gray= while others use the British spelling =grey=.

  2. You use =data_fix:create= to generate a file in =db/migrate-data=. (The file will contain your data migration/maintenance script):

    #+begin_example sh
    rails data_fix:create[prefer_british_over_american]
    #+end_example

    The script generates a file whose name is along the lines of: =db/migrate-data/20210730135129_prefer_british_over_american.rb=

  3. You now write the code to fix your data in the file just created. For instance something along the lines of:

    #+begin_example sh
    cat > db/migrate-data/20210730135129_prefer_british_over_american.rb
    Color.where(color: "gray").each do |record|
      record.color = "grey"
      record.save
    end
    ^D
    #+end_example

  4. You can test your script in development by running:

    #+begin_example sh
    rails data_fix:run
    #+end_example

  5. If you are unhappy with the result, you can declare the =data_fix= as not run, fix your script, and run it again:

    #+begin_example sh
    rails data_fix:rollback
    #+end_example

    #+begin_example sh
    cat > db/migrate-data/20210730135129_prefer_british_over_american.rb
    puts "I prefer a brute-force approach"
    Color.all.each do |record|
      record.color = "grey"
      record.save
    end
    ^D
    #+end_example

    #+begin_example sh
    rails data_fix:run
    #+end_example

    #+begin_quote
    Despite the name of the task, =data_fix:rollback= does not roll back data: for that you need to reload from a DB backup. The =data_fix:rollback= task updates the table in the DB, declaring that the latest =data_fix= has not yet been run.
    #+end_quote

You repeat the steps above for any other data fix you need. When you are ready, you can run all the migrations at once in production, with the following command:

#+begin_example
RAILS_ENV=production rails data_fix:run
#+end_example

=data_fix= keeps track of the scripts it has already run, ensuring that no script is run twice.
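The bookkeeping described in this section can be modelled with a small in-memory ledger. This is a conceptual sketch only, not the gem's implementation: the =DataFixLedger= class is hypothetical, a =Set= stands in for the tracking table, and the second version number (=20210801090000=) is made up for illustration.

```ruby
require "set"

# Hypothetical in-memory stand-in for the tracking table; not the gem's code.
class DataFixLedger
  def initialize
    @run = Set.new
  end

  # Versions that exist as scripts but are not recorded as run, oldest first.
  def pending(versions)
    versions.sort.reject { |v| @run.include?(v) }
  end

  # Runs each pending script once; the block stands in for the script body.
  def run(versions)
    pending(versions).each do |v|
      yield v
      @run << v
    end
  end

  # Forgets the most recent run. The data itself is left untouched.
  def rollback
    latest = @run.max
    @run.delete(latest) if latest
    latest
  end
end

ledger = DataFixLedger.new
scripts = %w[20210730135129 20210801090000]
executed = []
ledger.run(scripts) { |v| executed << v }
ledger.run(scripts) { |v| executed << v } # nothing pending: no script runs twice
ledger.rollback                           # marks 20210801090000 as not run
ledger.run(scripts) { |v| executed << v } # only the rolled-back script runs again
puts executed.inspect
# ["20210730135129", "20210801090000", "20210801090000"]
```

Note that the rollback step only edits the ledger, which mirrors the warning above: restoring the data itself still requires a backup.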

  • Development

To install this gem onto your local machine, run =bundle exec rake install=. To release a new version, update the version number in =version.rb=, and then run =bundle exec rake release=, which will create a git tag for the version, push git commits and the created tag, and push the =.gem= file to rubygems.org.

  • Contributing

Bug reports and pull requests are welcome on GitHub at https://github.com/shair.tech/data_fix.

  • License

The gem is available as open source under the terms of the [[https://opensource.org/licenses/MIT][MIT License]].

Package last updated on 17 Sep 2021
