data.avl
Persistent sorted maps and sets with support for the full clojure.core
sorted collections API (in particular clojure.core/(r)?(sub)?seq
),
transients and additional logarithmic time operations: rank queries
(via clojure.core/nth
and clojure.data.avl/rank-of
), "nearest key"
lookups, splits by index or key and subsets/submaps.
Persistent AVL trees are used as the underlying data structure.
Synopsis
data.avl supports both Clojure and ClojureScript. It exports a single
namespace with nine public functions, four of which are constructor
functions which can be used as drop-in replacements for clojure.core
/ cljs.core
functions of the same names, while the remaining five
expose data.avl-specific functionality:
(require '[clojure.data.avl :as avl])
;; drop-in replacements for clojure.core counterparts
(doc avl/sorted-map)
(doc avl/sorted-map-by)
(doc avl/sorted-set)
(doc avl/sorted-set-by)
;; find rank of element as primitive long, -1 if not found
(doc avl/rank-of)
;; find element closest to the given key and </<=/>=/> according
;; to coll's comparator
(doc avl/nearest)
;; split the given collection at the given key returning
;; [left entry? right]
(doc avl/split-key)
;; split the given collection at the given index; similar to
;; clojure.core/split-at, but operates on and returns data.avl
;; collections
(doc avl/split-at)
;; return subset/submap of the given collection; accepts arguments
;; reminiscent of clojure.core/{subseq,rsubseq}
(doc avl/subrange)
All data.avl collection-returning public functions return first-class
collections (see below for a discussion).
Description
data.avl maps and sets behave like the core Clojure variants, with the
following differences:
-
They have transient counterparts:
(persistent! (assoc! (transient (avl/sorted-map)) 0 0))
;= {0 0}
and use transients during construction:
(apply avl/sorted-map (interleave (range 32) (range 32)))
;; ^- uses transients
-
They are typically noticeably faster during lookups and somewhat
slower during non-transient "updates" (assoc
, dissoc
) than the
built-in sorted collections. Note that batch "updates" using
transients typically perform better than batch "updates" on the
non-transient-enabled built-ins.
-
They add some memory overhead -- a reference and two int
s per
key. The additional node fields are used to support transients (one
reference field per key), rank queries (one int
) and the
rebalancing algorithm itself (the final int
).
Additionally, data.avl collections support several features that the
built-ins do not:
-
Logarithmic time rank queries via clojure.core/nth
and
clojure.data.avl/rank-of
:
(nth (avl/sorted-map 0 0 1 1 2 2) 1)
;= [1 1]
(nth (avl/sorted-set 0 1 2) 1)
;= 1
(avl/rank-of (avl/sorted-map-by > 0 0 1 1 2 2) 0)
2
(avl/rank-of (avl/sorted-set-by > 0 1 2) 0)
2
-
Logarithmic time lookups of "nearest entries" via
clojure.data.avl/nearest
:
(avl/nearest (avl/sorted-set 0 1 2) < 1)
;= 0
(avl/nearest (avl/sorted-set 0 1 2) <= 1) ; or >=
;= 1
(avl/nearest (avl/sorted-set 0 1 2) > 1)
;= 2
(avl/nearest (avl/sorted-set 0 1 2) > 2)
;= nil
-
Logarithmic time splitting by key:
(avl/split-key 3 (avl/sorted-set 0 1 2 3 4 5))
;= [#{0 1 2} 3 #{4 5 6}]
(avl/split-key 1 (avl/sorted-map 0 0 1 1 2 2))
;= [{0 0} [1 1] {2 2}]
(avl/split-key 2 (avl/sorted-set 0 1 3 4))
;= [#{0 1} nil #{3 4}]
The middle element of the returned vector is the entry at the given
key for maps, stored copy of the key for sets and nil
if the key
is absent from the collection.
The remaining two elements are the "left" and "right"
subcollections of the original collection argument when split with
the given key, comprising, respectively, the keys preceding and
succeeding the given key in the order determined by the input
collection's comparator.
-
Logarithmic time splitting by index:
(avl/split-at 2 (avl/sorted-set 0 1 2 3 4 5))
;= [#{0 1} #{2 3 4 5}]
-
Logarithmic time slicing:
(avl/subrange (avl/sorted-set 0 1 2 3 4 5) > 1)
;= #{2 3 4 5}
(avl/subrange (avl/sorted-set 0 1 2 3 4 5) <= 4)
;= #{0 1 2 3 4}
(avl/subrange (avl/sorted-set 0 1 2 3 4 5) >= 2 < 5)
;= #{2 3 4}
-
clojure.data.avl/split-key
, clojure.data.avl/split-at
and
clojure.data.avl/subrange
all return first-class data.avl
collections, completely independent of the originals. In
particular, they do not prevent the originals from being garbage
collected and they support insertion of arbitrary keys, including
outside original subrange
bounds.
Documentation
Releases and dependency information
data.avl requires Clojure >= 1.5.0. The ClojureScript version is
regularly tested against the most recent ClojureScript release.
data.avl releases are available from Maven Central. Development
snapshots are available from the Sonatype OSS repository.
Follow the first link above to discover the current release number.
CLI/deps.edn
dependency information:
org.clojure/data.avl {:mvn/version "${version}"}
Leiningen dependency information:
[org.clojure/data.avl "${version}"]
Maven dependency information:
<dependency>
<groupId>org.clojure</groupId>
<artifactId>data.avl</artifactId>
<version>${version}</version>
</dependency>
Gradle dependency information:
compile "org.clojure:data.avl:${version}"
Developer information
data.avl is being developed as a Clojure Contrib project, see the
What is Clojure Contrib
page for details. Patches will only be accepted from developers who
have signed the Clojure Contributor Agreement.
Clojure(Script) code reuse
data.avl sorted maps and sets support the same basic functionality
regular Clojure's sorted maps and sets do (with the additions listed
above). Some of the code supporting various Clojure(Script) interfaces
and protocols is adapted from the ClojureScript implementations of the
red-black-tree-based sorted collections, which themselves are ports of
Clojure's implementations written in Java. The Clojure(Script) source
files containing the relevant code carry the following copyright
notice:
Copyright (c) Rich Hickey. All rights reserved.
The use and distribution terms for this software are covered by the
Eclipse Public License 1.0 (https://opensource.org/licenses/eclipse-1.0.php)
which can be found in the file epl-v10.html at the root of this distribution.
By using this software in any fashion, you are agreeing to be bound by
the terms of this license.
You must not remove this notice, or any other, from this software.
Licence
Copyright © 2013-2023 Michał Marczyk, Rich Hickey and contributors
Distributed under the Eclipse Public License, the same as Clojure.