== Sequel: The Database Toolkit for Ruby
Sequel is a lightweight database access toolkit for Ruby.
- Sequel provides thread safety, connection pooling and a concise DSL
for constructing database queries and table schemas.
- Sequel also includes a lightweight but comprehensive ORM layer for
mapping records to Ruby objects and handling associated records.
- Sequel supports advanced database features such as prepared
statements, bound variables, stored procedures, master/slave
configurations, and database sharding.
- Sequel makes it easy to deal with multiple records without having
to break your teeth on SQL.
- Sequel currently has adapters for ADO, DataObjects, DB2, DBI,
Firebird, Informix, JDBC, MySQL, ODBC, OpenBase, Oracle, PostgreSQL
and SQLite3.
== Resources
To check out the source code:
git clone git://github.com/jeremyevans/sequel.git
=== Contact
If you have any comments or suggestions please post to the Google group.
== Installation
sudo gem install sequel
== A Short Example
require 'rubygems'
require 'sequel'
DB = Sequel.sqlite # memory database
DB.create_table :items do # Create a new table
primary_key :id
column :name, :text
column :price, :float
end
items = DB[:items] # Create a dataset
Populate the table
items << {:name => 'abc', :price => rand * 100}
items << {:name => 'def', :price => rand * 100}
items << {:name => 'ghi', :price => rand * 100}
Print out the number of records
puts "Item count: #{items.count}"
Print out the records in descending order by price
items.reverse_order(:price).print
Print out the average price
puts "The average price is: #{items.avg(:price)}"
== The Sequel Console
Sequel includes an IRB console for quick'n'dirty access to databases. You can use it like this:
sequel sqlite://test.db # test.db in current directory
You get an IRB session with the database object stored in DB.
== An Introduction
Sequel is designed to take the hassle away from connecting to databases and manipulating them. Sequel deals with all the boring stuff like maintaining connections, formatting SQL correctly and fetching records so you can concentrate on your application.
Sequel uses the concept of datasets to retrieve data. A Dataset object encapsulates an SQL query and supports chainability, letting you fetch data using a convenient Ruby DSL that is both concise and flexible.
For example, the following one-liner returns the average GDP for the five biggest countries in the middle east region:
DB[:countries].filter(:region => 'Middle East').reverse_order(:area).limit(5).avg(:GDP)
Which is equivalent to:
SELECT avg(GDP) FROM countries WHERE region = 'Middle East' ORDER BY area DESC LIMIT 5
Since datasets retrieve records only when needed, they can be stored and later reused. Records are fetched as hashes (they can also be fetched as custom model objects), and are accessed using an Enumerable interface:
middle_east = DB[:countries].filter(:region => 'Middle East')
middle_east.order(:name).each{|r| puts r[:name]}
Sequel also offers convenience methods for extracting data from Datasets, such as an extended map method:
middle_east.map(:name) #=> ['Egypt', 'Greece', 'Israel', ...]
Or getting results as a transposed hash, with one column as key and another as value:
middle_east.to_hash(:name, :area) #=> {'Israel' => 20000, 'Greece' => 120000, ...}
== Getting Started
=== Connecting to a database
To connect to a database you simply provide Sequel with a URL:
require 'sequel'
DB = Sequel.connect('sqlite://blog.db')
The connection URL can also include such stuff as the user name and password:
DB = Sequel.connect('postgres://cico:12345@localhost:5432/mydb')
You can also specify optional parameters, such as the connection pool size, or loggers for logging SQL queries:
DB = Sequel.connect("postgres://postgres:postgres@localhost/my_db",
:max_connections => 10, :loggers => [Logger.new('log/db.log']))
You can specify a block to connect, which will disconnect from the database after it completes:
Sequel.connect('postgres://cico:12345@localhost:5432/mydb'){|db| db[:posts].delete}
=== Arbitrary SQL queries
DB.execute_ddl("create table t (a text, b text)")
DB.execute_insert("insert into t values ('a', 'b')")
Or more succinctly:
DB << "create table t (a text, b text)"
DB << "insert into t values ('a', 'b')"
You can also create datasets based on raw SQL:
dataset = DB['select * from items']
dataset.count # will return the number of records in the result set
dataset.map(:id) # will return an array containing all values of the id column in the result set
You can also fetch records with raw SQL through the dataset:
DB['select * from items'].each do |row|
p row
end
You can use placeholders in your SQL string as well:
DB['select * from items where name = ?', name].each do |row|
p row
end
=== Getting Dataset Instances
Dataset is the primary means through which records are retrieved and manipulated. You can create an blank dataset by using the dataset method:
dataset = DB.dataset
Or by using the from methods:
posts = DB.from(:posts)
The recommended way is the equivalent shorthand:
posts = DB[:posts]
Datasets will only fetch records when you explicitly ask for them. Datasets can be manipulated to filter through records, change record order, join tables, etc..
=== Retrieving Records
You can retrieve records by using the all method:
posts.all
The all method returns an array of hashes, where each hash corresponds to a record.
You can also iterate through records one at a time:
posts.each{|row| p row}
Or perform more advanced stuff:
posts.map(:id)
posts.inject({}){|h, r| h[r[:id]] = r[:name]}
You can also retrieve the first record in a dataset:
posts.first
Or retrieve a single record with a specific value:
posts[:id => 1]
If the dataset is ordered, you can also ask for the last record:
posts.order(:stamp).last
=== Filtering Records
The simplest way to filter records is to provide a hash of values to match:
my_posts = posts.filter(:category => 'ruby', :author => 'david')
You can also specify ranges:
my_posts = posts.filter(:stamp => (Date.today - 14)..(Date.today - 7))
Or arrays of values:
my_posts = posts.filter(:category => ['ruby', 'postgres', 'linux'])
Sequel also accepts expressions:
my_posts = posts.filter{|o| o.stamp > Date.today << 1}
Some adapters (like postgresql) will also let you specify Regexps:
my_posts = posts.filter(:category => /ruby/i)
You can also use an inverse filter:
my_posts = posts.exclude(:category => /ruby/i)
my_posts = posts.filter(:category => /ruby/i).invert # same as above
You can also specify a custom WHERE clause using a string:
posts.filter('stamp IS NOT NULL')
You can use parameters in your string, as well (ActiveRecord style):
posts.filter('(stamp < ?) AND (author != ?)', Date.today - 3, author_name)
posts.filter{|o| (o.stamp < Date.today - 3) & ~{:author => author_name}} # same as above
Datasets can also be used as subqueries:
DB[:items].filter('price > ?', DB[:items].select('AVG(price) + 100'))
After filtering you can retrieve the matching records by using any of the retrieval methods:
my_posts.each{|row| p row}
See the doc/dataset_filtering.rdoc file for more details.
=== Summarizing Records
Counting records is easy:
posts.filter(:category => /ruby/i).count
And you can also query maximum/minimum values:
max_value = DB[:history].max(:value)
Or calculate a sum:
total = DB[:items].sum(:price)
=== Ordering Records
Ordering datasets is simple:
posts.order(:stamp) # ORDER BY stamp
posts.order(:stamp, :name) # ORDER BY stamp, name
Chaining order doesn't work the same as filter:
posts.order(:stamp).order(:name) # ORDER BY name
The order_more method chains this way, though:
posts.order(:stamp).order_more(:name) # ORDER BY stamp, name
You can also specify descending order
posts.order(:stamp.desc) # ORDER BY stamp DESC
=== Selecting Columns
Selecting specific columns to be returned is also simple:
posts.select(:stamp) # SELECT stamp FROM posts
posts.select(:stamp, :name) # SELECT stamp, name FROM posts
Chaining select works like order, not filter:
posts.select(:stamp).select(:name) # SELECT name FROM posts
As you might expect, there is an order_more equivalent for select:
posts.select(:stamp).select_more(:name) # SELECT stamp, name FROM posts
=== Deleting Records
Deleting records from the table is done with delete:
posts.filter('stamp < ?', Date.today - 3).delete
=== Inserting Records
Inserting records into the table is done with insert:
posts.insert(:category => 'ruby', :author => 'david')
posts << {:category => 'ruby', :author => 'david'} # same as above
=== Updating Records
Updating records in the table is done with update:
posts.filter('stamp < ?', Date.today - 7).update(:state => 'archived')
=== Joining Tables
Joining is very useful in a variety of scenarios, for example many-to-many relationships. With Sequel it's really easy:
order_items = DB[:items].join(:order_items, :item_id => :id).
filter(:order_items__order_id => 1234)
This is equivalent to the SQL:
SELECT * FROM items INNER JOIN order_items
ON order_items.item_id = items.id
WHERE order_items.order_id = 1234
You can then do anything you like with the dataset:
order_total = order_items.sum(:price)
Which is equivalent to the SQL:
SELECT sum(price) FROM items INNER JOIN order_items
ON order_items.item_id = items.id
WHERE order_items.order_id = 1234
=== Graphing Datasets
When retrieving records from joined datasets, you get the results in a single hash, which is subject to clobbering if you have columns with the same name in multiple tables:
DB[:items].join(:order_items, :item_id => :id).first
=> {:id=>(could be items.id or order_items.id), :item_id=>order_items.order_id}
Using graph, you can split the result hashes into subhashes, one per join:
DB[:items].graph(:order_items, :item_id => :id).first
=> {:items=>{:id=>items.id}, :order_items=>{:id=>order_items.id, :item_id=>order_items.item_id}}
== An aside: column references in Sequel
Sequel expects column names to be specified using symbols. In addition, returned hashes always use symbols as their keys. This allows you to freely mix literal values and column references. For example, the two following lines produce equivalent SQL:
items.filter(:x => 1) #=> "SELECT * FROM items WHERE (x = 1)"
items.filter(1 => :x) #=> "SELECT * FROM items WHERE (1 = x)"
=== Qualifying column names
Column references can be qualified by using the double underscore special notation :table__column:
items.literal(:items__price) #=> "items.price"
=== Column aliases
You can also alias columns by using the triple undersecore special notation :column___alias or :table__column___alias:
items.literal(:price___p) #=> "price AS p"
items.literal(:items__price___p) #=> "items.price AS p"
Another way to alias columns is to use the #AS method:
items.literal(:price.as(:p)) #=> "price AS p"
== Sequel Models
Models in Sequel are based on the Active Record pattern described by Martin Fowler (http://www.martinfowler.com/eaaCatalog/activeRecord.html). A model class corresponds to a table or a dataset, and an instance of that class wraps a single record in the model's underlying dataset.
Model classes are defined as regular Ruby classes:
DB = Sequel.connect('sqlite://blog.db')
class Post < Sequel::Model
end
Just like in DataMapper or ActiveRecord, Sequel model classes assume that the table name is a plural of the class name:
Post.table_name #=> :posts
You can, however, explicitly set the table name or even the dataset used:
class Post < Sequel::Model(:my_posts)
end
or:
Post.set_dataset :my_posts
or:
Post.set_dataset DB[:my_posts].where(:category => 'ruby')
=== Model instances
Model instance are identified by a primary key. By default, Sequel assumes the primary key column to be :id. The Model#[] method can be used to fetch records by their primary key:
post = Post[123]
The Model#pk method is used to retrieve the record's primary key value:
post.pk #=> 123
Sequel models allow you to use any column as a primary key, and even composite keys made from multiple columns:
class Post < Sequel::Model
set_primary_key [:category, :title]
end
post = Post['ruby', 'hello world']
post.pk #=> ['ruby', 'hello world']
You can also define a model class that does not have a primary key, but then you lose the ability to update records.
A model instance can also be fetched by specifying a condition:
post = Post[:title => 'hello world']
post = Post.find{|o| o.num_comments < 10}
=== Iterating over records
A model class lets you iterate over subsets of records by proxying many methods to the underlying dataset. This means that you can use most of the Dataset API to create customized queries that return model instances, e.g.:
Post.filter(:category => 'ruby').each{|post| p post}
You can also manipulate the records in the dataset:
Post.filter{|o| o.num_comments < 7}.delete
Post.filter(:title.like(/ruby/)).update(:category => 'ruby')
=== Accessing record values
A model instances stores its values as a hash:
post.values #=> {:id => 123, :category => 'ruby', :title => 'hello world'}
You can read the record values as object attributes (assuming the attribute names are valid columns in the model's dataset):
post.id #=> 123
post.title #=> 'hello world'
You can also change record values:
post.title = 'hey there'
post.save
Another way to change values by using the #update method:
post.update(:title => 'hey there')
=== Creating new records
New records can be created by calling Model.create:
post = Post.create(:title => 'hello world')
Another way is to construct a new instance and save it:
post = Post.new
post.title = 'hello world'
post.save
You can also supply a block to Model.new and Model.create:
post = Post.create{|p| p.title = 'hello world'}
post = Post.new do |p|
p.title = 'hello world'
p.save
end
=== Hooks
You can execute custom code when creating, updating, or deleting records by using hooks. The before_create and after_create hooks wrap record creation. The before_update and after_update wrap record updating. The before_save and after_save wrap record creation and updating. The before_destroy and after_destroy wrap destruction. The before_validation and after_validation hooks wrap validation.
Hooks are defined by supplying a block:
class Post < Sequel::Model
after_create do
author.increase_post_count
end
after_destroy do
author.decrease_post_count
end
end
=== Deleting records
You can delete individual records by calling #delete or #destroy. The only difference between the two methods is that #destroy invokes before_destroy and after_destroy hooks, while #delete does not:
post.delete #=> bypasses hooks
post.destroy #=> runs hooks
Records can also be deleted en-masse by invoking Model.delete and Model.destroy. As stated above, you can specify filters for the deleted records:
Post.filter(:category => 32).delete #=> bypasses hooks
Post.filter(:category => 32).destroy #=> runs hooks
Please note that if Model.destroy is called, each record is deleted
separately, but Model.delete deletes all relevant records with a single
SQL statement.
=== Associations
Associations are used in order to specify relationships between model classes that reflect relations between tables in the database using foreign keys.
class Post < Sequel::Model
many_to_one :author
one_to_many :comments
many_to_many :tags
end
many_to_one creates a getter and setter for each model object:
class Post < Sequel::Model
many_to_one :author
end
post = Post.create(:name => 'hi!')
post.author = Author[:name => 'Sharon']
post.author
one_to_many and many_to_many create a getter method, a method for adding an object to the association, a method for removing an object from the association, and a method for removing all associated objected from the association:
class Post < Sequel::Model
one_to_many :comments
many_to_many :tags
end
post = Post.create(:name => 'hi!')
post.comments
comment = Comment.create(:text=>'hi')
post.add_comment(comment)
post.remove_comment(comment)
post.remove_all_comments
tag = Tag.create(:tag=>'interesting')
post.add_tag(tag)
post.remove_tag(tag)
post.remove_all_tags
=== Eager Loading
Associations can be eagerly loaded via .eager and the :eager association option. Eager loading is used when loading a group of objects. It loads all associated objects for all of the current objects in one query, instead of using a separate query to get the associated objects for each current object. Eager loading requires that you retrieve all model objects at once via .all (instead of individually by .each). Eager loading can be cascaded, loading association's associated objects.
class Person < Sequel::Model
one_to_many :posts, :eager=>[:tags]
end
class Post < Sequel::Model
many_to_one :person
one_to_many :replies
many_to_many :tags
end
class Tag < Sequel::Model
many_to_many :posts
many_to_many :replies
end
class Reply < Sequel::Model
many_to_one :person
many_to_one :post
many_to_many :tags
end
Eager loading via .eager
Post.eager(:person).all
eager is a dataset method, so it works with filters/orders/limits/etc.
Post.filter{|o| o.topic > 'M'}.order(:date).limit(5).eager(:person).all
person = Person.first
Eager loading via :eager (will eagerly load the tags for this person's posts)
person.posts
These are equivalent
Post.eager(:person, :tags).all
Post.eager(:person).eager(:tags).all
Cascading via .eager
Tag.eager(:posts=>:replies).all
Will also grab all associated posts' tags (because of :eager)
Reply.eager(:person=>:posts).all
No depth limit (other than memory/stack), and will also grab posts' tags
Loads all people, their posts, their posts' tags, replies to those posts,
the person for each reply, the tag for each reply, and all posts and
replies that have that tag. Uses a total of 8 queries.
Person.eager(:posts=>{:replies=>[:person, {:tags=>{:posts, :replies}}]}).all
In addition to using eager, you can also use eager_graph, which will use a single query to get the object and all associated objects. This may be necessary if you want to filter the result set based on columns in associated tables. It works with cascading as well, the syntax is exactly the same. Note that using eager_graph to eagerly load multiple *_to_many associations will cause the result set to be a cartesian product, so you should be very careful with your filters when using it in that case.
=== Caching model instances with memcached
Sequel models can be cached using memcached based on their primary keys. The use of memcached can significantly reduce database load by keeping model instances in memory. The set_cache method is used to specify caching:
require 'memcache'
CACHE = MemCache.new 'localhost:11211', :namespace => 'blog'
class Author < Sequel::Model
set_cache CACHE, :ttl => 3600
end
Author[333] # database hit
Author[333] # cache hit
=== Extending the underlying dataset
The obvious way to add table-wide logic is to define class methods to the model class definition. That way you can define subsets of the underlying dataset, change the ordering, or perform actions on multiple records:
class Post < Sequel::Model
def self.posts_with_few_comments
filter{|o| o.num_comments < 30}
end
def self.clean_posts_with_few_comments
posts_with_few_comments.delete
end
end
You can also implement table-wide logic by defining methods on the dataset:
class Post < Sequel::Model
def_dataset_method(:posts_with_few_comments) do
filter{|o| o.num_comments < 30}
end
def_dataset_method(:clean_posts_with_few_comments) do
posts_with_few_comments.delete
end
end
This is the recommended way of implementing table-wide operations, and allows you to have access to your model API from filtered datasets as well:
Post.filter(:category => 'ruby').clean_posts_with_few_comments
Sequel models also provide a short hand notation for filters:
class Post < Sequel::Model
subset(:posts_with_few_comments){|o| o.num_comments < 30}
subset :invisible, ~:visible
end
=== Basic Model Validations
To assign default validations to a sequel model:
class MyModel < Sequel::Model
validates do
acceptance_of...
confirmation_of...
format_of...
format_of...
presence_of...
length_of...
not_string ...
numericality_of...
each...
end
end
You may also perform the usual 'longhand' way to assign default model validates directly within the model class itself:
class MyModel < Sequel::Model
validates_format_of...
validates_presence_of...
validates_acceptance_of...
validates_confirmation_of...
validates_length_of...
validates_numericality_of...
validates_format_of...
validates_each...
end