Security News
RubyGems.org Adds New Maintainer Role
RubyGems.org has added a new "maintainer" role that allows for publishing new versions of gems. This new permission type is aimed at improving security for gem owners and the service overall.
The d3-array package is a JavaScript library that provides powerful data manipulation and analysis functions. It is part of the D3 (Data-Driven Documents) suite of tools, which are used for handling and visualizing data on the web. d3-array includes methods for statistics, searching, sorting, transforming, and more.
Statistics
Calculate statistical measures such as mean, median, min, max, sum, variance, and standard deviation.
const d3 = require('d3-array');
const data = [1, 2, 3, 4, 5];
const mean = d3.mean(data);
Searching
Search for a value in a sorted array using binary search, such as bisectLeft and bisectRight.
const d3 = require('d3-array');
const data = [1, 2, 3, 4, 5];
const index = d3.bisectLeft(data, 3);
Sorting
Sort data using natural or custom comparators.
const d3 = require('d3-array');
const data = [{name: 'Alice', age: 40}, {name: 'Bob', age: 30}];
const sortedData = data.sort(d3.comparator((a, b) => a.age - b.age));
Transforming
Transform data using methods like rollup and group to aggregate and organize data.
const d3 = require('d3-array');
const data = [1, 2, 3, 4, 5];
const rolledUp = d3.rollup(data, v => v.length, d => d);
Histogram
Generate histograms to bin data into discrete intervals.
const d3 = require('d3-array');
const data = [1, 2, 3, 4, 5];
const histogram = d3.histogram().thresholds(5)(data);
Lodash is a general utility library that offers similar array manipulation methods, such as sorting, searching, and transforming collections. It is broader in scope but does not focus on statistical functions as much as d3-array.
Underscore.js is another utility library with functions for working with arrays, objects, and functions. It is similar to lodash and provides many of the same features, but it is not as modular as d3-array.
Simple-statistics is focused on statistical methods and provides a range of functions for statistical analysis. It is similar to the statistical aspects of d3-array but does not include data transformation and manipulation features.
Crossfilter is a library for exploring large multivariate datasets in the browser. It can perform similar data manipulation tasks but is optimized for coordinated views and fast filtering and grouping of data.
Utilities for array manipulation: ordering, summarizing, searching, etc.
When using D3—and with data in general—you tend to do a lot of array manipulation. Some common forms of array manipulation include taking a contiguous slice (subset) of an array, filtering an array using a predicate function, and mapping an array to a parallel set of values using a transform function. Before looking at the set of utilities that D3 provides for arrays, you should familiarize yourself with the powerful array methods built-in to JavaScript.
JavaScript includes mutation methods that modify the array:
There are also access methods that return some representation of the array:
And finally iteration methods that apply functions to elements in the array:
If you use NPM, npm install d3-array
. Otherwise, download the latest release.
# ascending(a, b)
Returns -1 if a is less than b, or 1 if a is greater than b, or 0. This is the comparator function for natural order, and can be used in conjunction with the built-in array sort method to arrange elements in ascending order. It is implemented as:
function ascending(a, b) {
return a < b ? -1 : a > b ? 1 : a >= b ? 0 : NaN;
}
Note that if no comparator function is specified to the built-in sort method, the default order is lexicographic (alphabetical), not natural! This can lead to surprising behavior when sorting an array of numbers.
# descending(a, b)
Returns -1 if a is greater than b, or 1 if a is less than b, or 0. This is the comparator function for reverse natural order, and can be used in conjunction with the built-in array sort method to arrange elements in descending order. It is implemented as:
function descending(a, b) {
return b < a ? -1 : b > a ? 1 : b >= a ? 0 : NaN;
}
Note that if no comparator function is specified to the built-in sort method, the default order is lexicographic (alphabetical), not natural! This can lead to surprising behavior when sorting an array of numbers.
# min(array[, accessor])
Returns the minimum value in the given array using natural order. If the array is empty, returns undefined. An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before computing the minimum value.
Unlike the built-in Math.min, this method ignores undefined, null and NaN values; this is useful for ignoring missing data. In addition, elements are compared using natural order rather than numeric order. For example, the minimum of ["20", "3"] is "20", while the minimum of [20, 3] is 3.
# max(array[, accessor])
Returns the maximum value in the given array using natural order. If the array is empty, returns undefined. An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before computing the maximum value.
Unlike the built-in Math.max, this method ignores undefined values; this is useful for ignoring missing data. In addition, elements are compared using natural order rather than numeric order. For example, the maximum of ["20", "3"] is "3", while the maximum of [20, 3] is 20.
# extent(array[, accessor])
Returns the minimum and maximum value in the given array using natural order.
# sum(array[, accessor])
Returns the sum of the given array. If the array is empty, returns 0. An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before computing the sum. This method ignores undefined and NaN values; this is useful for ignoring missing data.
# mean(array[, accessor])
Returns the mean of the given array. If the array is empty, returns undefined. An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before computing the mean. This method ignores undefined and NaN values; this is useful for ignoring missing data.
# median(array[, accessor])
Returns the median of the given array using the R-7 method. If the array is empty, returns undefined. An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before computing the median. This method ignores undefined and NaN values; this is useful for ignoring missing data.
# quantile(array, p[, accessor])
Returns the p-quantile of the given sorted array of elements, where p is a number in the range [0,1]. For example, the median can be computed using p = 0.5, the first quartile at p = 0.25, and the third quartile at p = 0.75. This particular implementation uses the R-7 method, which is the default for the R programming language and Excel. For example:
var a = [0, 10, 30];
quantile(a, 0); // 0
quantile(a, 0.5); // 10
quantile(a, 1); // 30
quantile(a, 0.25); // 5
quantile(a, 0.75); // 20
quantile(a, 0.1); // 2
An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before computing the quantile.
# variance(array[, accessor])
Returns an unbiased estimator of the population variance of the given array of numbers. If the array has fewer than two values, returns undefined. An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before computing the variance. This method ignores undefined and NaN values; this is useful for ignoring missing data.
# deviation(array[, accessor])
Returns the standard deviation, defined as the square root of the bias-corrected variance, of the given array of numbers. If the array has fewer than two values, returns undefined. An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before computing the standard deviation. This method ignores undefined and NaN values; this is useful for ignoring missing data.
# shuffle(array[, lo[, hi]])
Randomizes the order of the specified array using the Fisher–Yates shuffle.
# merge(arrays)
Merges the specified arrays into a single array. This method is similar to the built-in array concat method; the only difference is that it is more convenient when you have an array of arrays.
merge([[1], [2, 3]]); // returns [1, 2, 3]
# range([start, ]stop[, step])
Returns an array containing an arithmetic progression, similar to the Python built-in range. This method is often used to iterate over a sequence of regularly-spaced numeric values, such as the indexes of an array or the ticks of a linear scale.
If step is omitted, it defaults to 1. If start is omitted, it defaults to 0. The stop value is exclusive; it is not included in the result. If step is positive, the last element is the largest start + i * step less than stop; if step is negative, the last element is the smallest start + i * step greater than stop. If the returned array would contain an infinite number of values, an empty range is returned.
The arguments are not required to be integers; however, the results are more predictable if they are. The values in the returned array are defined as start + i * step, where i is an integer from zero to one minus the total number of elements in the returned array. For example:
range(0, 1, 0.2) // [0, 0.2, 0.4, 0.6000000000000001, 0.8]
This unexpected behavior is due to IEEE 754 double-precision floating point, which defines 0.2 * 3 = 0.6000000000000001. Use d3-format to format numbers for human consumption with appropriate rounding; see also linear.tickFormat in d3-scale.
Likewise, if the returned array should have a specific length, consider using array.map on an integer range. For example:
range(0, 1, 1 / 49); // BAD: returns 50 elements!
range(49).map(function(d) { return d / 49; }); // GOOD: returns 49 elements.
# permute(array, indexes)
Returns a permutation of the specified array using the specified array of indexes. The returned array contains the corresponding element in array for each index in indexes, in order. For example, permute(["a", "b", "c"], [1, 2, 0]) returns ["b", "c", "a"]. It is acceptable for the array of indexes to be a different length from the array of elements, and for indexes to be duplicated or omitted.
This method can also be used to extract the values from an object into an array with a stable order. Extracting keyed values in order can be useful for generating data arrays in nested selections. For example:
var object = {yield: 27, variety: "Manchuria", year: 1931, site: "University Farm"},
fields = ["site", "variety", "yield"];
permute(object, fields); // returns ["University Farm", "Manchuria", 27]
# zip(arrays…)
Returns an array of arrays, where the ith array contains the ith element from each of the argument arrays. The returned array is truncated in length to the shortest array in arrays. If arrays contains only a single array, the returned array contains one-element arrays. With no arguments, the returned array is empty.
zip([1, 2], [3, 4]); // returns [[1, 3], [2, 4]]
# transpose(matrix)
Uses the zip operator as a two-dimensional matrix transpose.
# pairs(array)
For each adjacent pair of elements in the specified array, returns a new array of tuples of element i and element i - 1. For example:
pairs([1, 2, 3, 4]); // returns [[1, 2], [2, 3], [3, 4]]
If the specified array has fewer than two elements, returns the empty array.
# scan(array[, comparator])
Performs a linear scan of the specified array, returning the index of the least element according to the specified comparator. If the given array contains no comparable elements (i.e., the comparator returns NaN when comparing each element to itself), returns undefined. If comparator is not specified, it defaults to ascending. For example:
var array = [{foo: 42}, {foo: 91}];
scan(array, function(a, b) { return a.foo - b.foo; }); // 0
scan(array, function(a, b) { return b.foo - a.foo; }); // 1
This function is similar to min, except it allows the use of a comparator rather than an accessor and it returns the index instead of the accessed value. See also bisect.
# bisectLeft(array, x[, lo[, hi]])
Returns the insertion point for x in array to maintain sorted order. The arguments lo and hi may be used to specify a subset of the array which should be considered; by default the entire array is used. If x is already present in array, the insertion point will be before (to the left of) any existing entries. The return value is suitable for use as the first argument to splice assuming that array is already sorted. The returned insertion point i partitions the array into two halves so that all v < x for v in array.slice(lo, i) for the left side and all v >= x for v in array.slice(i, hi) for the right side.
# bisect(array, x[, lo[, hi]])
# bisectRight(array, x[, lo[, hi]])
Similar to bisectLeft, but returns an insertion point which comes after (to the right of) any existing entries of x in array. The returned insertion point i partitions the array into two halves so that all v <= x for v in array.slice(lo, i) for the left side and all v > x for v in array.slice(i, hi) for the right side.
# bisector(accessor)
# bisector(comparator)
Returns a new bisector using the specified accessor or comparator function. This method can be used to bisect arrays of objects instead of being limited to simple arrays of primitives. For example, given the following array of objects:
var data = [
{date: new Date(2011, 1, 1), value: 0.5},
{date: new Date(2011, 2, 1), value: 0.6},
{date: new Date(2011, 3, 1), value: 0.7},
{date: new Date(2011, 4, 1), value: 0.8}
];
A suitable bisect function could be constructed as:
var bisectDate = bisector(function(d) { return d.date; }).right;
This is equivalent to specifying a comparator:
var bisectDate = bisector(function(d, x) { return d.date - x; }).right;
And then applied as bisect(data, new Date(2011, 1, 2))
, returning an index. Note that the comparator is always passed the search value x as the second argument. Use a comparator rather than an accessor if you want values to be sorted in an order different than natural order, such as in descending rather than ascending order.
# bisector.left(array, x[, lo[, hi]])
Equivalent to bisectLeft, but uses this bisector’s associated comparator.
# bisector.right(array, x[, lo[, hi]])
Equivalent to bisectRight, but uses this bisector’s associated comparator.
Another common data type in JavaScript is the associative array, or more simply the object, which has a set of named properties. Java refers to this as a map, and Python a dictionary. JavaScript provides a standard mechanism for iterating over the keys (or property names) in an associative array: the for…in loop. However, note that the iteration order is undefined. D3 provides several operators for converting associative arrays to standard arrays with numeric indexes.
A word of caution: it is tempting to use plain objects as maps, but this causes unexpected behavior when built-in property names are used as keys, such as object["__proto__"] = 42
and "hasOwnProperty" in object
. (ES6 introduces Map and Set collections which avoid this problem, but browser support is limited.) If you cannot guarantee that map keys and set values will be safe, you should use map and set instead of plain objects.
# keys(object)
Returns an array containing the property names of the specified object (an associative array). The order of the returned array is undefined.
# values(object)
Returns an array containing the property values of the specified object (an associative array). The order of the returned array is undefined.
# entries(object)
Returns an array containing the property keys and values of the specified object (an associative array). Each entry is an object with a key and value attribute, such as {key: "foo", value: 42}
. The order of the returned array is undefined.
entries({foo: 42, bar: true}); // [{key: "foo", value: 42}, {key: "bar", value: true}]
Like ES6 Maps, but with a few differences:
# map([object[, key]])
Constructs a new map. If object is specified, copies all enumerable properties from the specified object into this map. The specified object may also be an array or another map. An optional key function may be specified to compute the key for each value in the array. For example:
var m = map([{name: "foo"}, {name: "bar"}], function(d) { return d.name; });
m.get("foo"); // {"name": "foo"}
m.get("bar"); // {"name": "bar"}
m.get("baz"); // undefined
See also nest.
# map.has(key)
Returns true if and only if this map has an entry for the specified key string. Note: the value may be null
or undefined
.
# map.get(key)
Returns the value for the specified key string. If the map does not have an entry for the specified key, returns undefined
.
# map.set(key, value)
Sets the value for the specified key string. If the map previously had an entry for the same key string, the old entry is replaced with the new value. Returns the map, allowing chaining. For example:
var m = map()
.set("foo", 1)
.set("bar", 2)
.set("baz", 3);
m.get("foo"); // 1
# map.remove(key)
If the map has an entry for the specified key string, removes the entry and returns true. Otherwise, this method does nothing and returns false.
# map.clear()
Removes all entries from this map.
# map.keys()
Returns an array of string keys for every entry in this map. The order of the returned keys is arbitrary.
# map.values()
Returns an array of values for every entry in this map. The order of the returned values is arbitrary.
# map.entries()
Returns an array of key-value objects for each entry in this map. The order of the returned entries is arbitrary. Each entry’s key is a string, but the value has arbitrary type.
# map.each(function)
Calls the specified function for each entry in this map, passing the entry’s value and key as arguments, followed by the map itself. Returns undefined. The iteration order is arbitrary.
# map.empty()
Returns true if and only if this map has zero entries.
# map.size()
Returns the number of entries in this map.
Like ES6 Sets, but with a few differences:
# set([array[, accessor]])
Constructs a new set. If array is specified, adds the given array of string values to the returned set. The specified array may also be another set. An optional accessor function may be specified, which is equivalent to calling array.map(accessor) before constructing the set.
# set.has(value)
Returns true if and only if this set has an entry for the specified value string.
# set.add(value)
Adds the specified value string to this set. Returns the set, allowing chaining. For example:
var s = set()
.add("foo")
.add("bar")
.add("baz");
s.has("foo"); // true
# set.remove(value)
If the set contains the specified value string, removes it and returns true. Otherwise, this method does nothing and returns false.
# set.clear()
Removes all values from this set.
# set.values()
Returns an array of the string values in this set. The order of the returned values is arbitrary. Can be used as a convenient way of computing the unique values for a set of strings. For example:
set(["foo", "bar", "foo", "baz"]).values(); // "foo", "bar", "baz"
# set.each(function)
Calls the specified function for each value in this set, passing the value as the first two arguments (for symmetry with map.each), followed by the set itself. Returns undefined. The iteration order is arbitrary.
# set.empty()
Returns true if and only if this set has zero values.
# set.size()
Returns the number of values in this set.
Nesting allows elements in an array to be grouped into a hierarchical tree structure; think of it like the GROUP BY operator in SQL, except you can have multiple levels of grouping, and the resulting output is a tree rather than a flat table. The levels in the tree are specified by key functions. The leaf nodes of the tree can be sorted by value, while the internal nodes can be sorted by key. An optional rollup function will collapse the elements in each leaf node using a summary function. The nest operator (the object returned by nest) is reusable, and does not retain any references to the data that is nested.
For example, consider the following tabular data structure of Barley yields, from various sites in Minnesota during 1931-2:
var yields = [
{yield: 27.00, variety: "Manchuria", year: 1931, site: "University Farm"},
{yield: 48.87, variety: "Manchuria", year: 1931, site: "Waseca"},
{yield: 27.43, variety: "Manchuria", year: 1931, site: "Morris"},
...
];
To facilitate visualization, it may be useful to nest the elements first by year, and then by variety, as follows:
var entries = nest()
.key(function(d) { return d.year; })
.key(function(d) { return d.variety; })
.entries(yields);
This returns a nested array. Each element of the outer array is a key-values pair, listing the values for each distinct key:
[{key: "1931", values: [
{key: "Manchuria", values: [
{yield: 27.00, variety: "Manchuria", year: 1931, site: "University Farm"},
{yield: 48.87, variety: "Manchuria", year: 1931, site: "Waseca"},
{yield: 27.43, variety: "Manchuria", year: 1931, site: "Morris"}, ...]},
{key: "Glabron", values: [
{yield: 43.07, variety: "Glabron", year: 1931, site: "University Farm"},
{yield: 55.20, variety: "Glabron", year: 1931, site: "Waseca"}, ...]}, ...]},
{key: "1932", values: ...}]
The nested form allows easy iteration and generation of hierarchical structures in SVG or HTML.
For a longer introduction to nesting, see:
# nest()
Creates a new nest operator. The set of keys is initially empty. If the map or entries operator is invoked before any key functions are registered, the nest operator simply returns the input array.
# nest.key(function)
Registers a new key function. The key function will be invoked for each element in the input array, and must return a string identifier that is used to assign the element to its group. Most often, the function is implemented as a simple accessor, such as the year and variety accessors in the example above. The function is not passed the input array index. Each time a key is registered, it is pushed onto the end of an internal keys array, and the resulting map or entries will have an additional hierarchy level. There is not currently a facility to remove or query the registered keys. The most-recently registered key is referred to as the current key in subsequent methods.
# nest.sortKeys(comparator)
Sorts key values for the current key using the specified comparator, such as descending. If no comparator is specified for the current key, the order in which keys will be returned is undefined. Note that this only affects the result of the entries operator; the order of keys returned by the map operator is always undefined, regardless of comparator.
var entries = nest()
.key(function(d) { return d.year; })
.sortKeys(ascending)
.entries(yields);
# nest.sortValues(comparator)
Sorts leaf elements using the specified comparator, such as descending. This is roughly equivalent to sorting the input array before applying the nest operator; however it is typically more efficient as the size of each group is smaller. If no value comparator is specified, elements will be returned in the order they appeared in the input array. This applies to both the map and entries operators.
# nest.rollup(function)
Specifies a rollup function to be applied on each group of leaf elements. The return value of the rollup function will replace the array of leaf values in either the associative array returned by the map operator, or the values attribute of each entry returned by the entries operator.
# nest.map(array)
Applies the nest operator to the specified array, returning a nested map. Each entry in the returned map corresponds to a distinct key value returned by the first key function. The entry value depends on the number of registered key functions: if there is an additional key, the value is another map; otherwise, the value is the array of elements filtered from the input array that have the given key value.
# nest.object(array)
Applies the nest operator to the specified array, returning a nested object. Each entry in the returned associative array corresponds to a distinct key value returned by the first key function. The entry value depends on the number of registered key functions: if there is an additional key, the value is another associative array; otherwise, the value is the array of elements filtered from the input array that have the given key value.
Note: this method is unsafe if any of the keys conflict with built-in JavaScript properties, such as __proto__
. If you cannot guarantee that the keys will be safe, you should use nest.map instead.
# nest.entries(array)
Applies the nest operator to the specified array, returning an array of key-values entries. Conceptually, this is similar to applying entries to the associative array returned by map, but it applies to every level of the hierarchy rather than just the first (outermost) level. Each entry in the returned array corresponds to a distinct key value returned by the first key function. The entry value depends on the number of registered key functions: if there is an additional key, the value is another nested array of entries; otherwise, the value is the array of elements filtered from the input array that have the given key value.
Histograms bin many discrete samples into a smaller number of consecutive, non-overlapping intervals. They are often used to visualize the distribution of numerical data.
# histogram()
Constructs a new histogram generator with the default settings.
# histogram(data)
Computes the histogram for the given array of data samples. Returns an array of bins, where each bin is an array containing the associated elements from the input data. Thus, the length
of the bin is the number of elements in that bin. Each bin has two additional attributes:
x0
- the lower bound of the bin (inclusive).x1
- the upper bound of the bin (exclusive, except for the last bin).# histogram.value([value])
If value is specified, sets the value accessor to the specified function or number and returns this histogram generator. If value is not specified, returns the current value accessor, which defaults to the identity function.
When a histogram is generated, the value accessor will be invoked for each element in the input data array, being passed the element d
, the index i
, and the array data
as three arguments. The default value accessor assumes that the input data are numbers, or that they are coercible to numbers using valueOf. If your data are not simply numbers, then you should specify an accessor that returns the corresponding numeric value for a given datum.
This is similar to mapping your data to values before invoking the histogram generator, but has the benefit that the input data remains associated with the returned bins, thereby making it easier to access other fields of the data.
# histogram.domain([domain])
If domain is specified, sets the domain accessor to the specified function or array and returns this histogram generator. If domain is not specified, returns the current domain accessor, which defaults to extent. The histogram domain is defined as an array [min, max], where min is the minimum observable value and max is the maximum observable value; both values are inclusive. Any value outside of this domain will be ignored when the histogram is generated.
For example, if you are using the the histogram in conjunction with a linear scale x
, you might say:
var h = histogram()
.domain(x.domain())
.thresholds(x.ticks(20));
You can then compute the bins from an array of numbers like so:
var bins = h(numbers);
Note that the domain accessor is invoked on the materialized array of values, not on the input data array.
# histogram.thresholds([count])
# histogram.thresholds([thresholds])
If thresholds is specified, sets the threshold accessor to the specified function or array and returns this histogram generator. If thresholds is not specified, returns the current threshold accessor, which by default implements Sturges’ formula. Thresholds are defined as an array of numbers [x0, x1, …]. Any value less than x0 will be placed in the first bin; any value greater than or equal to x0 but less than x1 will be placed in the second bin; and so on. Thus, the generated histogram will have thresholds.length + 1 bins. See histogram thresholds for more information.
If a count is specified instead of an array of thresholds, then the domain will be uniformly divided into count + 1 bins.
These functions are typically not used directly; instead, pass them to histogram.thresholds. You may also implement your own threshold accessor function taking three arguments: the observable domain, represented as min and max, and the array of input values derived from the data. The accessor must then return the array of numeric thresholds.
# thresholdFreedmanDiaconis(min, max, values)
Returns the array of bin thresholds according to the Freedman–Diaconis rule.
# thresholdScott(min, max, values)
Returns the array of bin thresholds according to Scott’s normal reference rule.
# thresholdSturges(min, max, values)
Returns the array of bin thresholds according to Sturges’ formula.
this
context.FAQs
Array manipulation, ordering, searching, summarizing, etc.
The npm package d3-array receives a total of 4,154,263 weekly downloads. As such, d3-array popularity was classified as popular.
We found that d3-array demonstrated a not healthy version release cadence and project activity because the last version was released a year ago. It has 2 open source maintainers collaborating on the project.
Did you know?
Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.
Security News
RubyGems.org has added a new "maintainer" role that allows for publishing new versions of gems. This new permission type is aimed at improving security for gem owners and the service overall.
Security News
Node.js will be enforcing stricter semver-major PR policies a month before major releases to enhance stability and ensure reliable release candidates.
Security News
Research
Socket's threat research team has detected five malicious npm packages targeting Roblox developers, deploying malware to steal credentials and personal data.