AttrJson
ActiveRecord attributes stored serialized in a json column, super smooth. For Rails 6.0.x through 7.0.x. Ruby 2.7+.
Typed and cast like Active Record, with support for nested models, dirty tracking, some querying (with postgres jsonb contains), and working smoothly with form builders.
Use your database as a typed object store via ActiveRecord, in the same models right next to ordinary ActiveRecord column-backed attributes and associations. Your json-serialized attr_json
attributes use as much of the existing ActiveRecord architecture as we can.
Why might you want or not want this?
Developed for postgres, but most features should work with MySQL json columns too, although
it has not yet been tested with MySQL.
Basic Use
class CreateMyModels < ActiveRecord::Migration[5.0]
def change
create_table :my_models do |t|
t.jsonb :json_attributes
end
add_index :my_models, :json_attributes, using: :gin
end
end
class LangAndValue
include AttrJson::Model
attr_json :lang, :string, default: "en"
attr_json :value, :string
end
class MyModel < ActiveRecord::Base
include AttrJson::Record
attr_json :my_string, :string
attr_json :my_integer, :integer
attr_json :my_datetime, :datetime
attr_json :int_array, :integer, array: true
attr_json :str_array, :string, array: true, default: AttrJson::AttributeDefinition::NO_DEFAULT_PROVIDED
attr_json :str_with_default, :string, default: "default value"
attr_json :embedded_lang_and_val, LangAndValue.to_type
end
model = MyModel.create!(
my_integer: 101,
my_datetime: DateTime.new(2001,2,3,4,5,6),
embedded_lang_and_val: LangAndValue.new(value: "a sentence in default language english")
)
What will get serialized to your json_attributes
column will look like:
{
"my_integer":101,
"my_datetime":"2001-02-03T04:05:06Z",
"str_with_default":"default value",
"embedded_lang_and_val": {
"lang":"en",
"value":"a sentence in default language english"
}
}
These attributes have type-casting behavior very much like ordinary ActiveRecord values.
model = MyModel.new
model.my_integer = "12"
model.my_integer
model.int_array = "12"
model.int_array
model.my_datetime = "2016-01-01 17:45"
model.my_datetime
model.embedded_lang_and_val = { value: "val"}
model.embedded_lang_and_val
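To make the casting idea concrete, here is a minimal plain-Ruby sketch of what "cast on assignment" means for scalar and array attributes. It is not attr_json's implementation (which delegates to ActiveModel::Type classes); the CASTERS table and the cast helper are hypothetical names used only for illustration, and real Rails casting is more forgiving about malformed input than Integer() is.

```ruby
require "time"

# Hypothetical casters, for illustration only; attr_json itself relies on
# ActiveModel::Type classes rather than lambdas like these.
CASTERS = {
  integer:  ->(v) { Integer(v) },
  datetime: ->(v) { v.is_a?(Time) ? v : Time.parse(v) }
}.freeze

# Cast one value, or wrap it into an array and cast every element
# when the attribute is declared with array: true.
def cast(value, type, array: false)
  return Array(value).map { |v| CASTERS.fetch(type).call(v) } if array
  CASTERS.fetch(type).call(value)
end

single = cast("12", :integer)                 # an Integer, not a String
list   = cast("12", :integer, array: true)    # a one-element array of Integers
time   = cast("2016-01-01 17:45", :datetime)  # a Time in the local zone
```

The point of the sketch is only that the value you read back is the cast Ruby object, not the raw string you assigned.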
You can use ordinary ActiveRecord validation methods with attr_json
attributes.
All the attr_json
attributes are serialized to json as keys in a hash, in a database jsonb/json column. By default, in a column json_attributes
.
If you look at model.json_attributes
, you'll see values already cast to their ruby representations.
To see JSON representations, we can use the Rails *_before_type_cast methods, *_in_database methods, and *_for_database methods (Rails 7.0+ only).
These methods can all be called on the container json_attributes
json hash attribute (generally showing serialized JSON to string), or any individual attribute (generally showing in-memory JSON-able object). [This is a bit confusing and possibly not entirely consistent, needs more investigation.]
Specifying db column to use
While the default is to assume you want to serialize in a column called
json_attributes
, no worries, of course you can pick whatever named
jsonb column you like, class-wide or per-attribute.
class OtherModel < ActiveRecord::Base
include AttrJson::Record
attr_json_config(default_container_attribute: :some_other_column_name)
attr_json :my_int, :integer
attr_json :my_other_int, :integer, container_attribute: "yet_another_column_name"
end
Store key different than attribute name/methods
You can also specify that the serialized JSON key
should be different than the attribute name/methods, by using the store_key
argument.
class MyModel < ActiveRecord::Base
include AttrJson::Record
attr_json :special_string, :string, store_key: "__my_string"
end
model = MyModel.new
model.special_string = "foo"
model.json_attributes
model.save!
model.json_attributes_before_type_cast
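As a rough illustration of the store_key idea, the following plain-Ruby sketch renames attributes on the way into the serialized hash and back again on the way out. STORE_KEYS, to_container and from_container are hypothetical names for this sketch; attr_json does this internally and its actual code will differ.

```ruby
require "json"

# Hypothetical mapping from attribute name to the key used in the stored JSON.
STORE_KEYS = { special_string: "__my_string" }.freeze

# Build the hash that would be written to the container column:
# attributes with a store key are renamed, the rest keep their own name.
def to_container(attributes)
  attributes.each_with_object({}) do |(name, value), container|
    container[STORE_KEYS.fetch(name, name.to_s)] = value
  end
end

# Reverse lookup when reading the column back.
def from_container(container)
  names = STORE_KEYS.invert
  container.each_with_object({}) do |(key, value), attributes|
    attributes[names.fetch(key, key.to_sym)] = value
  end
end

container = to_container(special_string: "foo", my_integer: 1)
json      = JSON.generate(container)   # => {"__my_string":"foo","my_integer":1}
restored  = from_container(JSON.parse(json))
```

Callers only ever see special_string; the "__my_string" key exists only in the stored JSON.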
You can of course combine array
, default
, store_key
, and container_attribute
params however you like, with whatever types you like: symbols resolvable
with ActiveRecord::Type.lookup
, or any ActiveModel::Type::Value subclass, built-in or custom.
You can register your custom ActiveModel::Type::Value
in a Rails initializer or early on in your app boot sequence:
ActiveRecord::Type.register(:my_type, MyActiveModelTypeSubclass)
Querying
There is some built-in support for querying using postgres jsonb containment
(@>
) operator. (or see here or here). For now you need to additionally include AttrJson::Record::QueryScopes
to get this behavior.
model = MyModel.create(my_string: "foo", my_integer: 100)
MyModel.jsonb_contains(my_string: "foo", my_integer: 100).to_sql
MyModel.jsonb_contains(my_string: "foo", my_integer: 100).first
MyModel.not_jsonb_contains(my_string: "foo", my_integer: 100).to_sql
MyModel.jsonb_contains(my_string: "foo", my_integer: "100")
model = MyModel.create(int_array: [10, 20, 30])
MyModel.jsonb_contains(int_array: 10)
MyModel.jsonb_contains(int_array: [10])
MyModel.jsonb_contains(int_array: [10, 20])
MyModel.jsonb_contains(int_array: [10, 1000])
jsonb_contains
will handle any store_key
you have set -- you should specify
attribute name, it'll actually query on store_key. And properly handles any
container_attribute
-- it'll look in the proper jsonb column.
Anything you can do with jsonb_contains should be handled
by a postgres USING GIN index. Figuring out how to use indexes for jsonb
queries can be confusing; here is a good blog post.
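If the containment semantics are unfamiliar, this plain-Ruby sketch approximates what `container @> query` means for already-parsed JSON. It is an illustration only: json_contains? is a hypothetical helper, it ignores postgres special cases, and it does none of the type casting that jsonb_contains applies to your query values first.

```ruby
# Rough emulation of `container @> query` for parsed JSON values.
def json_contains?(container, query)
  case query
  when Hash
    # Every key in the query must exist in the container with a contained value.
    container.is_a?(Hash) &&
      query.all? { |k, v| container.key?(k) && json_contains?(container[k], v) }
  when Array
    # Every query element must be contained by some container element.
    container.is_a?(Array) &&
      query.all? { |q| container.any? { |c| json_contains?(c, q) } }
  else
    # Scalars must match exactly.
    container == query
  end
end

stored = { "int_array" => [10, 20, 30], "my_string" => "foo" }

hit_one  = json_contains?(stored, { "int_array" => [10] })
hit_two  = json_contains?(stored, { "int_array" => [10, 20] })
miss     = json_contains?(stored, { "int_array" => [10, 1000] })
```

Extra keys and extra array elements in the stored document never prevent a match; a missing or different value always does.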
Nested models -- Structured/compound data
The AttrJson::Model
mix-in lets you make ActiveModel::Model objects that can be round-trip serialized to a json hash, and they can be used as types for your top-level AttrJson::Record.
AttrJson::Models can contain other AttrJson::Models, singly or as arrays, nested as many levels as you like.
That is, you can serialize complex object-oriented graphs of models into a single
jsonb column, and get them back as they went in.
AttrJson::Model
has an identical attr_json
api to
AttrJson::Record
, with the exception that container_attribute
is not supported.
class LangAndValue
include AttrJson::Model
attr_json :lang, :string, default: "en"
attr_json :value, :string
validates :lang, inclusion: { in: I18n.config.available_locales.collect(&:to_s) }
end
class MyModel < ActiveRecord::Base
include AttrJson::Record
include AttrJson::Record::QueryScopes
attr_json :lang_and_value, LangAndValue.to_type
attr_json :lang_and_value_array, LangAndValue.to_type, array: true
end
m = MyModel.new(lang_and_value: LangAndValue.new(lang: "fr", value: "S'il vous plaît"))
m.lang_and_value = LangAndValue.new(lang: "es", value: "hola")
m.lang_and_value
m.save!
m.json_attributes_before_type_cast
m = MyModel.new(lang_and_value: { lang: 'fr', value: "S'il vous plaît"})
m.lang_and_value = { lang: 'en', value: "Hey there" }
m.save!
m.json_attributes_before_type_cast
found = MyModel.find(m.id)
found.lang_and_value
m = MyModel.new(lang_and_value_array: [{ lang: 'fr', value: "S'il vous plaît"}, { lang: 'en', value: "Hey there" }])
m.lang_and_value_array
m.save!
m.json_attributes_before_type_cast
You can nest AttrJson::Model objects inside each other, as deeply as you like.
You can edit nested models "in place", they will be properly saved.
m.lang_and_value.lang = "de"
m.save! # no problem!
For use with Rails forms, you may want to use attr_json_accepts_nested_attributes_for
(like Rails accepts_nested_attributes_for); see the doc page on Use with Forms and Form Builders.
Model-type defaults
If you want to set a default for an AttrJson::Model type, you should use a proc argument for
the default, to avoid accidentally re-using a shared global default value, similar to issues
people have with ruby Hash default.
attr_json :lang_and_value, LangAndValue.to_type, default: -> { LangAndValue.new(lang: "en", value: "default") }
You can also use a Hash value that will be cast to your model, no need for proc argument
in this case.
attr_json :lang_and_value, LangAndValue.to_type, default: { lang: "en", value: "default" }
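The shared-default pitfall is easy to reproduce in plain Ruby. In this sketch a Struct stands in for a nested model, and resolve_default is a hypothetical helper showing the difference between handing out one shared object and calling a proc each time; it is not how attr_json resolves defaults internally.

```ruby
# Stand-in for a nested model; a Struct keeps the sketch dependency-free.
LangAndValueSketch = Struct.new(:lang, :value)

# Resolve a default: call it if it is a proc, otherwise hand back the object itself.
def resolve_default(default)
  default.respond_to?(:call) ? default.call : default
end

shared_default = LangAndValueSketch.new("en", "default")
proc_default   = -> { LangAndValueSketch.new("en", "default") }

a = resolve_default(shared_default)
b = resolve_default(shared_default)
a.lang = "de"   # b changes too, since a and b are the same object

c = resolve_default(proc_default)
d = resolve_default(proc_default)
c.lang = "de"   # d is untouched, since the proc built a fresh object
```

This is the same trap as `Hash.new([])`, where every missing key shares one array.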
Polymorphic model types
There is some support for "polymorphic" attributes that can heterogeneously contain instances of different AttrJson::Model classes, see comment docs at AttrJson::Type::PolymorphicModel.
class SomeLabels
include AttrJson::Model
attr_json :hello, LangAndValue.to_type, array: true
attr_json :goodbye, LangAndValue.to_type, array: true
end
class MyModel < ActiveRecord::Base
include AttrJson::Record
include AttrJson::Record::QueryScopes
attr_json :my_labels, SomeLabels.to_type
end
m = MyModel.new
m.my_labels = {}
m.my_labels
m.my_labels.hello = [{lang: 'en', value: 'hello'}, {lang: 'es', value: 'hola'}]
m.my_labels
m.my_labels.hello.find { |l| l.lang == "en" }.value = "Howdy"
m.save!
m.json_attributes
m.json_attributes_before_type_cast
GUESS WHAT? You can QUERY nested structures with jsonb_contains
,
using a dot-keypath notation, even through arrays as in this case. Your specific
defined attr_json
types determine the query and type-casting.
MyModel.jsonb_contains("my_labels.hello.lang" => "en").to_sql
MyModel.jsonb_contains("my_labels.hello.lang" => "en").first
MyModel.jsonb_contains("my_labels.hello" => LangAndValue.new(lang: 'en')).to_sql
MyModel.jsonb_contains("my_labels.hello" => {"lang" => "en"}).to_sql
Remember, we're using a postgres containment (@>
) operator, so queries
always mean 'contains' -- the previous query needs a my_labels.hello
which is a hash that includes the key/value, lang: en
, it can have
other key/values in it too. String values will need to match exactly.
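One way to picture the dot-keypath notation is as shorthand for a nested JSON document handed to the containment operator. The plain-Ruby sketch below expands a keypath into that nested structure. expand_keypath and its array_paths parameter are hypothetical: in attr_json the information about which keys hold arrays comes from your attr_json type definitions, not from an explicit list.

```ruby
require "json"

# Expand "a.b.c" => value into nested hashes. `array_paths` lists the
# (hypothetical) key paths whose values are arrays, so the part of the
# query below them gets wrapped in a one-element array.
def expand_keypath(keypath, value, array_paths: [])
  keys = keypath.split(".")
  (keys.length - 1).downto(0).reduce(value) do |inner, i|
    inner = [inner] if array_paths.include?(keys[0..i].join("."))
    { keys[i] => inner }
  end
end

query = expand_keypath("my_labels.hello.lang", "en", array_paths: ["my_labels.hello"])
query_json = JSON.generate(query)
# => {"my_labels":{"hello":[{"lang":"en"}]}}
```

Read this way, the query asks for rows whose my_labels.hello array contains at least one element that includes lang: en.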
Single AttrJson::Model serialized to an entire json column
The main use case of the gem is set up to let you combine multiple primitives and nested models
under different keys combined in a single json or jsonb column.
But you may also want a single AttrJson::Model class that maps, as
a hash, to an entire json column on its own.
AttrJson::Model
can supply a simple coder for the ActiveRecord serialization
feature to easily do that.
class MyModel
include AttrJson::Model
attr_json :some_string, :string
attr_json :some_int, :integer
end
class MyTable < ApplicationRecord
serialize :some_json_column, MyModel.to_serialization_coder
end
MyTable.create(some_json_column: MyModel.new(some_string: "string"))
MyTable.create(some_json_column: { some_int: 12 })
To avoid errors raised at inconvenient times, we recommend you set these settings to make 'bad'
data turn into nil
, consistent with most ActiveRecord types:
class MyModel
include AttrJson::Model
attr_json_config(bad_cast: :as_nil, unknown_key: :strip)
end
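For readers unfamiliar with serialization coders: ActiveRecord's serialize accepts any object responding to dump and load. The plain-Ruby sketch below shows that shape, with unknown keys stripped and unusable data turned into nil. SketchModel and SketchCoder are made-up names, and this is not what to_serialization_coder actually returns; it only illustrates the behavior the settings above are aiming for.

```ruby
# Stand-in for an AttrJson::Model with two known attributes.
SketchModel = Struct.new(:some_string, :some_int, keyword_init: true)

# A coder is any object responding to `dump` and `load`.
class SketchCoder
  KNOWN_KEYS = SketchModel.members.map(&:to_s).freeze

  # Model (or hash) -> plain string-keyed hash for the column.
  def dump(value)
    return nil if value.nil?
    value.to_h.transform_keys(&:to_s)
  end

  # Plain hash from the column -> model, dropping unknown keys.
  def load(data)
    return nil unless data.is_a?(Hash)   # 'bad' data becomes nil instead of raising
    known = data.transform_keys(&:to_s).slice(*KNOWN_KEYS)
    SketchModel.new(**known.transform_keys(&:to_sym))
  end
end

coder  = SketchCoder.new
loaded = coder.load({ "some_int" => 12, "bogus" => true })
dumped = coder.dump(SketchModel.new(some_string: "string"))
bad    = coder.load("not a hash")
```

The trade-off is that data problems are silently absorbed at load time, which is why a setter that casts early can be a useful complement.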
And/or define a setter method to cast, and raise early on data problems:
class MyTable < ApplicationRecord
serialize :some_json_column, MyModel.to_serialization_coder
def some_json_column=(val)
super(MyModel.to_type.cast(val)) # assumed intent: cast with the model's type before assigning
end
end
Serializing a model to an entire json column is a relatively recent feature, please let us know how it's working for you.
Storing Arbitrary JSON data
Arbitrary JSON data (hashes, arrays, primitives of any depth) can be stored within attributes by using the rails built in ActiveModel::Type::Value
as the attribute type. This is basically a "no-op" value type -- JSON alone will be used to serialize/deserialize whatever values you put there, because of the json type on the container field.
class MyModel < ActiveRecord::Base
include AttrJson::Record
attr_json :arbitrary_hash, ActiveModel::Type::Value.new
end
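Because only JSON handles the round trip for such an attribute, what comes back is limited to JSON-native values. This plain-Ruby sketch (no attr_json involved, sample data invented) shows the effect with the standard json library; presumably the same applies to what you read back from an untyped attribute after a reload.

```ruby
require "json"

# Symbols are not a JSON type, so symbol keys and symbol values
# come back as strings after a round trip.
original = { name: "widget", tags: [:a, :b], size: { w: 1.5, h: nil } }
restored = JSON.parse(JSON.generate(original))
```

If you need symbols, dates, or custom objects back, a typed attr_json attribute is likely the better fit.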
Forms and Form Builders
Use with Rails form builders is supported pretty painlessly, including with simple_form and cocoon (integration-tested in CI).
If you have nested AttrJson::Models you'd like to use in your forms much like Rails associated records: Where you would use Rails accepts_nested_attributes_for
, instead include AttrJson::NestedAttributes
and use attr_json_accepts_nested_attributes_for
. Multiple levels of nesting are supported.
For more info, see doc page on Use with Forms and Form Builders.
ActiveRecord Attributes and Dirty tracking
We endeavor to make record-level attr_json
attributes available as standard ActiveRecord attributes, supporting that full API.
Standard Rails dirty tracking should work properly with AttrJson::Record attributes! We have a test suite demonstrating.
We actually keep the "canonical" copy of data inside the "container attribute" hash in the ActiveRecord model. This is because this is what will actually get saved when you save. So we have two copies, that we do our best to keep in sync.
They get out of sync if you are doing unusual things like using the ActiveRecord attribute API directly (like calling write_attribute
with an attr_json attribute). Even if this happens, mostly you won't notice. But one thing it will affect is dirty tracking.
If you ever need to sync the ActiveRecord attribute values from the AttrJson "canonical" copies, you can call active_record_model.attr_json_sync_to_rails_attributes
. If you wanted to be 100% sure of dirty tracking, I suppose you could always call this method first. Sorry, this is the best we could do!
Note that ActiveRecord DirtyTracking will give you ruby objects, for instance for nested models, you might get:
record_obj.attribute_change_to_be_saved(:nested_model)
If you want to see JSON instead, you could call #as_json on the values. The Rails *_before_type_cast and *_in_database methods may also be useful.
Do you want this?
Why might you want this?
-
You have complicated data, which you want to access in object-oriented
fashion, but want to avoid very complicated normalized rdbms schema --
and are willing to trade the powerful complex querying support normalized rdbms
schema gives you.
-
Single-Table Inheritance, with sub-classes that have non-shared
data fields. You'd rather not make all those columns, some of which will then also appear
to inapplicable sub-classes. (Note you may have trouble with ActiveRecord #becomes in some versions of Rails due to a Rails bug. See https://github.com/jrochkind/attr_json/issues/189 and https://github.com/rails/rails/issues/47538.)
-
A "content management system" type project, where you need complex
structured data of various types, which may need to vary depending
on plugins or configuration, or for different article types -- but
doesn't need to be very queryable generally -- or you have means of querying
other than a normalized rdbms schema.
-
You want to version your models, which is tricky with associations between models.
Minimize associations by inlining the complex data into one table row.
-
Generally, we're turning postgres into a simple object-oriented
document store. That can be mixed with an rdbms. The very same
row in a table in your db can have document-oriented json data and foreign keys
and real rdbms associations to other rows. And it all just
feels like ActiveRecord, mostly.
Why might you not want this?
-
An rdbms and SQL is a wonderful thing. If you need sophisticated
querying and reporting with reasonable performance, complex data
in a single jsonb probably isn't gonna be the best.
-
This is pretty well-designed code that mostly only uses
fairly stable and public Rails API, but there is still some
risk of tying your boat to it, it's not Rails itself, and there is
some risk it won't keep up with Rails in the future.
Note on Optimistic Locking
When you save a record with any changes to any attr_json attributes, it will
overwrite the whole json structure in the relevant column for that row,
unlike ordinary AR attributes, where updates just touch changed attributes.
Because of this, you probably want to seriously consider using ActiveRecord
Optimistic Locking
to prevent overwriting updates from other processes.
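To see why whole-document overwrites make locking worth considering, here is a plain-Ruby simulation of two writers and a version check. The row hash, save_document, and StaleRowError are all hypothetical stand-ins; with real ActiveRecord you would add a lock_version column and rescue ActiveRecord::StaleObjectError instead.

```ruby
# Raised when a writer's copy is older than the stored row (hypothetical name).
class StaleRowError < StandardError; end

# In-memory stand-in for one table row with a lock_version column.
row = { json_attributes: { "a" => 1, "b" => 1 }, lock_version: 0 }

# Overwrite the whole JSON document, but only if nobody saved since `read_version`.
def save_document(row, document, read_version)
  raise StaleRowError, "row changed since it was read" if row[:lock_version] != read_version
  row[:json_attributes] = document
  row[:lock_version] += 1
end

# Two processes read the same row...
doc_one, version_one = row[:json_attributes].dup, row[:lock_version]
doc_two, version_two = row[:json_attributes].dup, row[:lock_version]

doc_one["a"] = 2
save_document(row, doc_one, version_one)    # first writer succeeds

doc_two["b"] = 2
conflict = begin
  save_document(row, doc_two, version_two)  # without the check this would undo a = 2
  false
rescue StaleRowError
  true
end
```

Without the version check, the second save would write back its stale copy of "a" along with its change to "b", silently discarding the first writer's update.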
State of Code, and To Be Done
This code is solid and stable and is being used in production. If you don't see a lot of activity, it might be because it's stable, rather than abandoned. Check to see if it's passing/supported on recent Rails? We test on "edge" unreleased rails to try to stay ahead of compatibility, and it has worked through multiple major Rails versions with few if any changes needed.
In order to keep the low-maintenance scenario sustainable, I am very cautious accepting new features, especially if they increase code complexity at all. Even if you have a working PR, I may be reluctant to accept it. I'm prioritizing sustainability and stability over new features, and so far this is working out well. However, discussion is always welcome! Especially when paired with code (failing tests for the bugfix or feature you want are super helpful on their own!).
We are committed to semantic versioning and will endeavor to release no backwards breaking changes without a major version. We are also serious about minimizing backwards incompat releases altogether (ie minimizing major version releases).
Feedback of any kind is very welcome, please feel free to use the issue tracker. It is hard to get a sense of how many people are actually using this, which is helpful both for my own sense of reward and for anyone to get a sense of the size of the userbase -- feel free to say hi and let us know how you are using it!
Except for the jsonb_contains stuff using postgres jsonb contains operator, I don't believe any postgres-specific features are used. It ought to work with MySQL, testing and feedback welcome. (Or a PR to test on MySQL?). My own interest is postgres.
This is still mostly a single-maintainer operation, so has all the sustainability risks of that. Although there are other people using and contributing to it, check out the Github Issues and Pull Request tabs yourself to get a sense.
Possible future features:
-
Make AttrJson::Model lean more heavily on ActiveModel::Attributes API that did not fully exist in first version of attr_json (perhaps not, see https://github.com/jrochkind/attr_json/issues/18)
-
partial updates for json hashes would be really nice: Using postgres jsonb merge operators to only overwrite what changed. In my initial attempts, AR doesn't make it easy to customize this. [update: this is hard, probably not coming soon. See https://github.com/jrochkind/attr_json/issues/143]
-
Should we give AttrJson::Model a before_serialize hook that you might
want to use similar to AR before_save? Should AttrJson::Models
raise on trying to serialize an invalid model? [update: eh, hasn't really come up]
-
There are limits to what you can do with just jsonb_contains
queries. We could support operations like >
, <
, <>
as jsonb_accessor,
even across keypaths. (At present, you could use a
before_save to denormalize/renormalize your data into
ordinary AR columns/associations for searching. Or perhaps a postgres ts_vector for text searching. Needs to be worked out.) [update: interested, but not necessarily prioritized. This one would be interesting for a third-party PR draft!]
-
We could/should probably support jsonb_order
clauses, even
across key paths, like jsonb_accessor. [update: interested but not necessarily prioritized]
-
Could we make these attributes work in ordinary AR where, same
as they do in jsonb_contains? Maybe. [update: probably not]
Development
While attr_json
depends only on active_record
, we run integration tests in the context of a full Rails app, in order to test working with simple_form and cocoon, among other things. (Via combustion, with app skeleton at ./spec/internal).
At present this does mean that all our automated tests are run in a full Rails environment, which is not great (any suggestions or PR's to fix this while still running integration tests under CI with a full Rails app are welcome).
Tests are in rspec, run tests simply with ./bin/rspec
.
We use appraisal to test with multiple rails versions, including on travis. Locally you can run bundle exec appraisal rspec
to run tests multiple times for each rails version, or eg bundle exec appraisal rails-5-1 rspec
. If the Gemfile
or Appraisal
file changes, you may need to re-run bundle exec appraisal install
and commit changes. (Try to put dev dependencies in gemspec instead of Gemfile, but sometimes it gets weird.)
- If you've been switching between rails versions and you get integration test failures, try
rm -rf spec/internal/tmp/cache
. Rails 6 does some things in there apparently not compatible with Rails 5, at least in our setup, and vice versa.
There is a ./bin/console
that will give you a console in the context of attr_json and all its dependencies, including the combustion rails app, and the models defined there.
Acknowledgements, Prior Art, alternatives
-
The excellent work sgrif did on ActiveModel::Type
really lays the groundwork and makes this possible. Plus many other Rails developers.
Rails has a reputation for being composed of messy or poorly designed code, but
it's some really nice design in Rails that allows us to do some pretty powerful
stuff here, in surprisingly few lines of code.
-
The existing jsonb_accessor was
an inspiration, and provided some good examples of how to do some things
with AR and ActiveModel::Types. I started out trying to figure out
how to fit in nested hashes to jsonb_accessor... but ended up pretty much rewriting it entirely,
to lean on object-oriented polymorphism and ActiveModel::Type a lot more heavily and have
the API and internals I wanted/imagined.
-
Took a look at existing active_model_attributes too.
-
Didn't actually notice existing json_attributes
until I was well on my way here. I think it's not updated for Rails5 or type-aware,
haven't looked at it too much.
-
store_model was created after attr_json
, and has some overlapping functionality.
-
store_attribute is also a more recent addition. While it's not specifically about JSON, it could be used with an underlying JSON coder to give you typed json attributes.