
rails_data_fix
#+TITLE: DataFix - Data Maintenance Tasks Manager
#+AUTHOR: Adolfo Villafiorita
Manage your data maintenance tasks like migrations.
#+begin_quote
I am not sure how it slipped through, but I realized there is another gem, [[https://rubygems.org/gems/data_migrate/versions][data_migrate]], which does the same thing as =data_fix=.

... well, I should actually say it the other way around, since =data_migrate= has been around for longer and has been downloaded extensively.

We keep using and maintaining =data_fix=, but if you are starting from scratch, [[https://rubygems.org/gems/data_migrate/versions][data_migrate]] is probably a more complete and safer choice.
#+end_quote
This Rails gem provides a set of tasks to manage DB data maintenance tasks like they were migrations.
Data maintenance tasks include anything that does not fit in a schema migration, such as adding new records in production, fixing errors in existing records, or migrating data to a new schema.
Before we wrote =data_fix= we would create a rake task or a script with the migration code, test the script in development, and finally run it in production. The process was highly manual, with no record of which migrations were run and when. Although these scripts are usually one-offs and lose their value once run, we were not quite OK with the approach.
=data_fix= helps by enforcing standards and keeping track of the data fixes that have been run.
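Similar to Rails schema migrations, =data_fix= provides tasks to create data fix scripts (=data_fix:create=), run them (=data_fix:run=), and declare the latest one as not run (=data_fix:rollback=).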
Unlike schema migrations, =data_fix= does not provide a mechanism for data rollbacks. Writing reversible maintenance scripts on data is not only complex but also, in many cases, pointless: why should you write the code to fix a typo in a record and also the code to reintroduce the typo you are fixing? It is also overkill, since it is much simpler to roll back to a previous version using a backup.
We now use it at [[https://shair.tech][Shair.Tech]] to perform data cleaning and data updates of our Rails apps.
Add this line to your application's Gemfile:
#+begin_example ruby
gem 'rails_data_fix'
#+end_example
And then execute:
#+begin_example
$ bundle install
#+end_example
Or install it yourself as:
#+begin_example
$ gem install rails_data_fix
#+end_example
Then, for each environment and DB in which you want to use =data_fix=, run the following command:
#+begin_example
rails data_fix:init
#+end_example
This means that you need to run =RAILS_ENV=production rake data_fix:init= in your production environment, if you want to use it there.
First create a file and write the script:
#+begin_example sh
rails data_fix:create[data_fix_name]
[... write the script in the generated file ...]
#+end_example
Then, for each environment in which you need to run the script:
#+begin_example sh
rails data_fix:run
#+end_example
=data_fix= scripts are not atomic: if you interrupt a script, the DB will be left in a state that depends on the code you wrote and on when you interrupted the script. In a typical scenario, only part of your records will have been updated/fixed.
It is a good idea to test your scripts in development before running them on the actual data.
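Since scripts are not atomic, one way to make an interrupted run safe to repeat is to write the fix so that it only touches records that still need it. The following is a minimal sketch in plain Ruby, not code from the gem: the =Record= struct and the =fix_spelling= method are hypothetical names, and the array stands in for a table that a real script would query through ActiveRecord.

```ruby
# Plain-Ruby stand-in for a table; in a real data fix these would be
# ActiveRecord objects. Record and fix_spelling are hypothetical names.
Record = Struct.new(:id, :color)

# Touches only the records that still need the fix, so running it again
# after an interruption completes the job without redoing finished work.
def fix_spelling(records, from: "gray", to: "grey")
  pending = records.select { |r| r.color == from }
  pending.each { |r| r.color = to }
  pending.size # number of records changed in this run
end

records = [Record.new(1, "gray"), Record.new(2, "grey"), Record.new(3, "gray")]
puts fix_spelling(records) # first run changes the two "gray" records
puts fix_spelling(records) # second run finds nothing left to change
```

A script written this way can simply be declared as not run and executed again, without restoring a backup first.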
You can run the same script multiple times on the DB either by restoring the status from a backup or by using the =rollback= task, which declares one or more migrations as not run.
For instance:
#+begin_example
rake data_fix:run
[... ERROR! ... ]
[... FIX SCRIPT ...]
rake data_fix:rollback
rake data_fix:run
[... REPEAT ...]
#+end_example
A typical usage scenario is the following.
You realize you have been inconsistent in storing color names in the =color= field of a table of your DB: some of your records use the word =gray= while others use the British spelling =grey=.
You use =data_fix:create= to generate a file in =db/migrate-data=; the file will contain your data migration/maintenance script:
#+begin_example sh
rails data_fix:create[prefer_british_over_american]
#+end_example
The script generates a file whose name is along the lines of: =db/migrate-data/20210730135129_prefer_british_over_american.rb=
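The file name above follows the same timestamp-plus-name pattern as Rails schema migrations, which keeps scripts sorted in creation order. As a rough illustration only (the gem's actual generator is not shown in this document, and =data_fix_path= is a hypothetical helper), such a path could be built like this:

```ruby
# Hypothetical helper: builds a timestamped path like the one shown above.
# The directory and naming scheme are inferred from the example file name,
# not taken from the gem's source.
def data_fix_path(name, time = Time.now.utc)
  timestamp = time.strftime("%Y%m%d%H%M%S") # same layout as Rails migration versions
  File.join("db", "migrate-data", "#{timestamp}_#{name}.rb")
end

puts data_fix_path("prefer_british_over_american", Time.utc(2021, 7, 30, 13, 51, 29))
# prints db/migrate-data/20210730135129_prefer_british_over_american.rb
```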
You now write the code to fix your data in the file just created. For instance something along the lines of:
#+begin_example sh
cat > db/migrate-data/20210730135129_prefer_british_over_american.rb
# Replace the American spelling with the British one in the color field
Color.where(color: "gray").each do |record|
  record.color = "grey"
  record.save
end
^D
#+end_example
You can test your script in development by running:
#+begin_example sh rails data_fix:run #+end_example
If you are unhappy with the result, you can declare the data fix as not run, fix your script, and run it again:
#+begin_example sh
rails data_fix:rollback
#+end_example
#+begin_example sh
cat > db/migrate-data/20210730135129_prefer_british_over_american.rb
puts "I prefer a brute-force approach"
Color.all.each do |record|
  record.color = "grey"
  record.save
end
^D
#+end_example
#+begin_example sh
rails data_fix:run
#+end_example
#+begin_quote
Despite the name of the task, =data_fix:rollback= does not roll back data: for that you need to restore the DB from a backup. The =data_fix:rollback= task only updates the table in the DB, declaring that the latest =data_fix= has not yet been run.
#+end_quote
You repeat the steps above for any other data fix you need. When you are ready, you can run all the migrations at once in production, with the following command:
#+begin_example
RAILS_ENV=production rails data_fix:run
#+end_example
=data_fix= keeps track of the scripts it has already run, ensuring that no script is run twice.
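The bookkeeping behind =run= and =rollback= can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the gem keeps this state in a DB table, whereas the hypothetical =FixTracker= class below keeps it in an array, and the second version number is invented for the example.

```ruby
# Hypothetical in-memory tracker; the gem stores this state in a DB table.
class FixTracker
  def initialize
    @applied = [] # versions already run, in the order they were run
  end

  # Runs every pending script (a version => callable hash) in version order
  # and returns the versions that were run.
  def run(scripts)
    ran = []
    scripts.sort_by { |version, _script| version }.each do |version, script|
      next if @applied.include?(version) # already run: skip it
      script.call
      @applied << version
      ran << version
    end
    ran
  end

  # Marks the latest applied version as not run. It does not undo any data change.
  def rollback
    @applied.pop
  end
end

log = []
scripts = {
  "20210730135129" => -> { log << "spelling fix" },
  "20210801090000" => -> { log << "second fix" } # invented version
}
tracker = FixTracker.new
p tracker.run(scripts) # both versions run
p tracker.run(scripts) # nothing is pending, so nothing runs
tracker.rollback
p tracker.run(scripts) # only the latest version runs again
```

Note that after the rollback the second script runs again on data it has already changed, which is why scripts that may be re-run should tolerate being applied twice.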
To install this gem onto your local machine, run =bundle exec rake install=. To release a new version, update the version number in =version.rb=, and then run =bundle exec rake release=, which will create a git tag for the version, push git commits and the created tag, and push the =.gem= file to rubygems.org.
Bug reports and pull requests are welcome on GitHub at https://github.com/shair.tech/data_fix.
The gem is available as open source under the terms of the [[https://opensource.org/licenses/MIT][MIT License]].