Package pointer implements Andersen's analysis, an inclusion-based pointer analysis algorithm first described in (Andersen, 1994). A pointer analysis relates every pointer expression in a whole program to the set of memory locations to which it might point. This information can be used to construct a call graph of the program that precisely represents the destinations of dynamic function and method calls. It can also be used to determine, for example, which pairs of channel operations operate on the same channel.

The package allows the client to request a set of expressions of interest for which the points-to information will be returned once the analysis is complete. In addition, the client may request that a callgraph be constructed. The example program in example_test.go demonstrates both of these features. Clients should not request more information than they need, since it may increase the cost of the analysis significantly.

Our algorithm is INCLUSION-BASED: the points-to sets for x and y will be related by pts(y) ⊇ pts(x) if the program contains the statement y = x.

It is FLOW-INSENSITIVE: it ignores all control flow constructs and the order of statements in a program. It is therefore a "MAY ALIAS" analysis: its facts are of the form "P may/may not point to L", not "P must point to L".

It is FIELD-SENSITIVE: it builds separate points-to sets for distinct fields, such as x and y in struct { x, y *int }.

It is mostly CONTEXT-INSENSITIVE: most functions are analyzed once, so values can flow in at one call to the function and return out at another. Only some smaller functions are analyzed with consideration of their calling context.

It has a CONTEXT-SENSITIVE HEAP: objects are named by both allocation site and context, so the objects returned by two distinct calls to a function f are distinguished up to the limits of the calling context.

It is a WHOLE PROGRAM analysis: it requires SSA-form IR for the complete Go program and summaries for native code. See the (Hind, PASTE'01) survey paper for an explanation of these terms.

The analysis is fully sound when invoked on pure Go programs that do not use reflection or unsafe.Pointer conversions. In other words, if there is any possible execution of the program in which pointer P may point to object O, the analysis will report that fact.

By default, the "reflect" library is ignored by the analysis, as if all its functions were no-ops, but if the client enables the Reflection flag, the analysis will make a reasonable attempt to model the effects of calls into this library. However, this comes at a significant performance cost, and not all features of that library are yet implemented. In addition, some simplifying approximations must be made to ensure that the analysis terminates; for example, reflection can be used to construct an infinite set of types and values of those types, but the analysis arbitrarily bounds the depth of such types. Most but not all reflection operations are supported. In particular, addressable reflect.Values are not yet implemented, so operations such as (reflect.Value).Set have no analytic effect.

The pointer analysis makes no attempt to understand aliasing between the operand x and result y of an unsafe.Pointer conversion; it is as if the conversion allocated an entirely new object.

The analysis cannot model the aliasing effects of functions written in languages other than Go, such as runtime intrinsics in C or assembly, or code accessed via cgo. The result is as if such functions are no-ops.
However, various important intrinsics are understood by the analysis, along with built-ins such as append. The analysis currently provides no way for users to specify the aliasing effects of native code.

------------------------------------------------------------------------

The remaining documentation is intended for package maintainers and pointer analysis specialists. Maintainers should have a solid understanding of the referenced papers (especially those by H&L and PKH) before making significant changes.

The implementation is similar to that described in (Pearce et al, PASTE'04). Unlike many algorithms which interleave constraint generation and solving, constructing the callgraph as they go, this implementation for the most part observes a phase ordering (generation before solving), with only simple (copy) constraints being generated during solving. (The exception is reflection, which creates various constraints during solving as new types flow to reflect.Value operations.) This improves the traction of presolver optimisations, but imposes certain restrictions, e.g. potential context sensitivity is limited since all variants must be created a priori.

A type is said to be "pointer-like" if it is a reference to an object. Pointer-like types include pointers and also interfaces, maps, channels, functions and slices.

We occasionally use C's x->f notation to distinguish the case where x is a struct pointer from x.f, where x is a struct value.

Pointer analysis literature (and our comments) often uses the notation dst=*src+offset to mean something different than what it means in Go. It means: for each node index p in pts(src), the node index p+offset is in pts(dst). Similarly *dst+offset=src is used for store constraints and dst=src+offset for offset-address constraints.

Nodes are the key data structure of the analysis, and have a dual role: they represent both constraint variables (equivalence classes of pointers) and members of points-to sets (things that can be pointed at, i.e. "labels"). Nodes are naturally numbered. The numbering enables compact representations of sets of nodes such as bitvectors (or BDDs); and the ordering enables a very cheap way to group related nodes together. For example, passing n parameters consists of generating n parallel constraints from caller+i to callee+i for 0<=i<n.

The zero nodeid means "not a pointer". For simplicity, we generate flow constraints even for non-pointer types such as int. The pointer equivalence (PE) presolver optimization detects which variables cannot point to anything; this includes not only all variables of non-pointer types (such as int) but also variables of pointer-like types if they are always nil, or are parameters to a function that is never called.

Each node represents a scalar part of a value or object. Aggregate types (structs, tuples, arrays) are recursively flattened out into a sequential list of scalar component types, and all the elements of an array are represented by a single node. (The flattening of a basic type is a list containing a single node.)

Nodes are connected into a graph with various kinds of labelled edges: simple edges (or copy constraints) represent value flow. Complex edges (load, store, etc) trigger the creation of new simple edges during the solving phase.

Conceptually, an "object" is a contiguous sequence of nodes denoting an addressable location: something that a pointer can point to.
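For example, consider this small Go fragment (illustrative only, not part of the package). The struct type S flattens to three nodes: an identity node for S itself (see the discussion of structs below), then one node each for the fields x and y; keeping separate nodes per field is what makes the analysis field-sensitive:

	package example

	// S flattens to an identity node followed by one node per field.
	type S struct{ x, y *int }

	// Because x and y have distinct nodes, the analysis derives
	// pts(p) = {a} and pts(q) = {b}; the two fields are never conflated.
	func fieldSensitive() (p, q *int) {
		var s S
		a, b := new(int), new(int)
		s.x, s.y = a, b
		return s.x, s.y
	}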
The first node of an object has a non-nil obj field containing information about the allocation: its size, context, and ssa.Value.

Objects include functions and globals; variable allocations in the stack frame or heap; maps, channels and slices created by calls to make(); allocations to construct an interface; allocations caused by conversions (e.g. []byte(str)); and arrays allocated by calls to append().

Many objects have no Go types. For example, the func, map and chan type kinds in Go are all varieties of pointers, but their respective objects are actual functions (executable code), maps (hash tables), and channels (synchronized queues). Given the way we model interfaces, they too are pointers to "tagged" objects with no Go type. And an *ssa.Global denotes the address of a global variable, but the object for a Global is the actual data. So, the type of an ssa.Value that creates an object is "off by one indirection": a pointer to the object.

The individual nodes of an object are sometimes referred to as "labels". For uniformity, all objects have a non-zero number of fields, even those of the empty type struct{}. (All arrays are treated as if of length 1, so there are no empty arrays. The empty tuple is never address-taken, so is never an object.)

A tagged object consists of a node T (the tag) followed by the payload. The T node's typ field is the dynamic type of the "payload": the value v which follows, flattened out. The T node's obj has the otTagged flag. Tagged objects are needed when generalizing across types: interfaces, reflect.Values, reflect.Types. Each of these three types is modelled as a pointer that exclusively points to tagged objects. Tagged objects may be indirect (obj.flags ⊇ {otIndirect}), meaning that the value v is not of type T but *T; this is used only for reflect.Values that represent lvalues. (These are not implemented yet.)

Variables of the following "scalar" types may be represented by a single node: basic types, pointers, channels, maps, slices, 'func' pointers, interfaces.

Pointers: Nothing to say here, oddly.

Basic types (bool, string, numbers, unsafe.Pointer): Currently all fields in the flattening of a type, including non-pointer basic types such as int, are represented in objects and values. Though non-pointer nodes within values are uninteresting, non-pointer nodes in objects may be useful (if address-taken) because they permit the analysis to deduce, given declarations such as var s struct{ x, y int } and p := &s.x, that p points to s.x. If we ignored such object fields, we could only say that p points somewhere within s.

All other basic types are ignored. Expressions of these types have zero nodeid, and fields of these types within other aggregate types are omitted. unsafe.Pointers are not modelled as pointers, so a conversion of an unsafe.Pointer to *T is (unsoundly) treated equivalent to new(T).

Channels: An expression of type 'chan T' is a kind of pointer that points exclusively to channel objects, i.e. objects created by MakeChan (or reflection). 'chan T' is treated like *T. *ssa.MakeChan is treated as equivalent to new(T). *ssa.Send and receive (*ssa.UnOp(ARROW)) are equivalent to store and load, respectively.

Maps: An expression of type 'map[K]V' is a kind of pointer that points exclusively to map objects, i.e. objects created by MakeMap (or reflection). map[K]V is treated like *M where M = struct{k K; v V}. *ssa.MakeMap is equivalent to new(M). *ssa.MapUpdate is equivalent to *y=x where *y and x have type M. *ssa.Lookup is equivalent to y=x.v where x has type *M.

Slices: A slice []T, which dynamically resembles a struct{array *T, len, cap int}, is treated as if it were just a *T pointer; the len and cap fields are ignored. *ssa.MakeSlice is treated like new([1]T): an allocation of a singleton array. *ssa.Index on a slice is equivalent to a load.
*ssa.IndexAddr on a slice returns the address of the sole element of the slice, i.e. the same address. *ssa.Slice is treated as a simple copy.

Functions: An expression of type 'func...' is a kind of pointer that points exclusively to function objects. A function object comprises an identity node followed by blocks of nodes for the function's parameters and results. There may be multiple function objects for the same *ssa.Function due to context-sensitive treatment of some functions.

The first node is the function's identity node. Associated with every callsite is a special "targets" variable, whose pts() contains the identity node of each function to which the call may dispatch. Identity words are not otherwise used during the analysis, but we construct the call graph from the pts() solution for such nodes.

The following block of contiguous nodes represents the flattened-out types of the parameters ("P-block") and results ("R-block") of the function object.

The treatment of free variables of closures (*ssa.FreeVar) is like that of global variables; it is not context-sensitive. *ssa.MakeClosure instructions create copy edges to Captures.

A Go value of type 'func' (i.e. a pointer to one or more functions) is a pointer whose pts() contains function objects. The valueNode() for an *ssa.Function returns a singleton for that function.

Interfaces: An expression of type 'interface{...}' is a kind of pointer that points exclusively to tagged objects. All tagged objects pointed to by an interface are direct (the otIndirect flag is clear) and concrete (the tag type T is not itself an interface type). The associated ssa.Value for an interface's tagged objects may be an *ssa.MakeInterface instruction, or nil if the tagged object was created by an intrinsic (e.g. reflection).

Constructing an interface value causes generation of constraints for all of the concrete type's methods; we can't tell a priori which ones may be called.

TypeAssert y = x.(T) is implemented by a dynamic constraint triggered by each tagged object O added to pts(x): a typeFilter constraint if T is an interface type, or an untag constraint if T is a concrete type. A typeFilter tests whether O.typ implements T; if so, O is added to pts(y). An untagFilter tests whether O.typ is assignable to T, and if so, a copy edge O.v -> y is added.

ChangeInterface is a simple copy because the representation of tagged objects is independent of the interface type (in contrast to the "method tables" approach used by the gc runtime).

y := Invoke x.m(...) is implemented by allocating contiguous P/R blocks for the callsite and adding a dynamic rule triggered by each tagged object added to pts(x). The rule adds param/results copy edges to/from each discovered concrete method.

(Q. Why do we model an interface as a pointer to a pair of type and value, rather than as a pair of a pointer to type and a pointer to value? A. Control-flow joins would merge interfaces ({T1}, {V1}) and ({T2}, {V2}) to make ({T1,T2}, {V1,V2}), leading to the infeasible and type-unsafe combination (T1,V2). Treating the value and its concrete type as inseparable makes the analysis type-safe.)

Type parameters: Type parameters are not directly supported by the analysis. Calls to generic functions will be left as if they had empty bodies. Users of the package are expected to use the ssa.InstantiateGenerics builder mode when building code that uses or depends on code containing generics.

reflect.Value: A reflect.Value is modelled very similarly to an interface{}, i.e. as a pointer exclusively to tagged objects, but with two generalizations.
1. A reflect.Value that represents an lvalue points to an indirect (obj.flags ⊇ {otIndirect}) tagged object, which has a similar layout to a direct tagged object except that the value is a pointer to the dynamic type. Indirect tagged objects preserve the correct aliasing so that mutations made by (reflect.Value).Set can be observed. Indirect objects only arise when an lvalue is derived from an rvalue by indirection. Whether indirect or not, the concrete type of the tagged object corresponds to the user-visible dynamic type, and the existence of a pointer is an implementation detail. (NB: indirect tagged objects are not yet implemented.)

2. The dynamic type tag of a tagged object pointed to by a reflect.Value may be an interface type; it need not be concrete. This can arise when a reflect.Value is created for a value of interface type; for such a value eface, pts(eface) is a singleton containing an interface{}-tagged object. That tagged object's payload is an interface{} value, i.e. the pts of the payload contains only concrete-tagged objects, although in this example it's the zero interface{} value, so its pts is empty.

reflect.Type: Just as in the real "reflect" library, we represent a reflect.Type as an interface whose sole implementation is the concrete type, *reflect.rtype. (This choice is forced on us by go/types: clients cannot fabricate types with arbitrary method sets.)

rtype instances are canonical: there is at most one per dynamic type. (rtypes are in fact large structs but since identity is all that matters, we represent them by a single node.) The payload of each *rtype-tagged object is an *rtype pointer that points to exactly one such canonical rtype object. We exploit this by setting the node.typ of the payload to the dynamic type, not '*rtype'. This saves us an indirection in each resolution rule. As an optimisation, *rtype-tagged objects are canonicalized too.

Aggregate types: Aggregate types are treated as if all directly contained aggregates are recursively flattened out.

Structs: The nodes of a struct consist of a special 'identity' node (whose type is that of the struct itself), followed by the nodes for all the struct's fields, recursively flattened out. A pointer to the struct is a pointer to its identity node. That node allows us to distinguish a pointer to a struct from a pointer to its first field. Field offsets are logical field offsets (plus one for the identity node), so the sizes of the fields can be ignored by the analysis.

(The identity node is non-traditional but enables the distinction described above, which is valuable for code comprehension tools. Typical pointer analyses for C, whose purpose is compiler optimization, must soundly model unsafe.Pointer (void*) conversions, and this requires fidelity to the actual memory layout using physical field offsets.)

*ssa.Field y = x.f creates a simple edge to y from x's node at f's offset. *ssa.FieldAddr y = &x->f requires a dynamic closure rule to create simple edges for each struct discovered in pts(x).

Arrays: We model an array by an identity node (whose type is that of the array itself) followed by a node representing all the elements of the array; the analysis does not distinguish elements with different indices. Effectively, an array is treated like struct{elem T}, a load y=x[i] like y=x.elem, and a store x[i]=y like x.elem=y; the index i is ignored. A pointer to an array is a pointer to its identity node. (A slice is also a pointer to an array's identity node.)
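For example, in the following illustrative fragment (not from the package), p and q denote the same dynamic address, yet the analysis can keep them apart: p points to the array object's identity node while q points to its element node:

	package example

	func arrayVsElement() (p *[4]int, q *int) {
		a := new([4]int)
		return a, &a[0] // same address, distinct nodes
	}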
The identity node allows us to distinguish a pointer to an array from a pointer to one of its elements, but it is rather costly because it introduces more offset constraints into the system. Furthermore, sound treatment of unsafe.Pointer would require us to dispense with this node. Arrays may be allocated by Alloc, by make([]T), by calls to append, and via reflection.

Tuples (T, ...): Tuples are treated like structs with naturally numbered fields. *ssa.Extract is analogous to *ssa.Field. However, tuples have no identity field since by construction, they cannot be address-taken.

There are three kinds of function call: (1) static "call"-mode calls of functions; (2) dynamic "call"-mode calls of functions; and (3) dynamic "invoke"-mode calls of interface methods. Cases 1 and 2 apply equally to methods and standalone functions.

Static calls: A static call consists of three steps: finding the function object of the callee; creating copy edges from the actual parameter values to the P-block in the function object (this includes the receiver, if the callee is a method); and creating copy edges from the R-block in the function object to the value defined by the instruction. A static function call is little more than two struct value copies between the P/R blocks of caller and callee.

Context sensitivity: Static calls (alone) may be treated context sensitively, i.e. each callsite may cause a distinct re-analysis of the callee, improving precision. Our current context-sensitivity policy treats all intrinsics and getter/setter methods in this manner since such functions are small and seem like an obvious source of spurious confluences, though this has not yet been evaluated.

Dynamic function calls: Dynamic calls work in a similar manner except that the creation of copy edges occurs dynamically, in a similar fashion to a pair of struct copies in which the callee is indirect. (Recall that the function object's P- and R-blocks are contiguous.)

Interface method invocation: For invoke-mode calls, we create a params/results block for the callsite and attach a dynamic closure rule to the interface. For each new tagged object that flows to the interface, we look up the concrete method, find its function object, and connect its P/R blocks to the callsite's P/R blocks, adding copy edges to the graph during solving.

Recording call targets: The analysis notifies its clients of each callsite it encounters, passing a CallSite interface. Among other things, the CallSite contains a synthetic constraint variable ("targets") whose points-to solution includes the set of all function objects to which the call may dispatch. It is via this mechanism that the callgraph is made available. Clients may also elect to be notified of callgraph edges directly; internally this just iterates all "targets" variables' pts(·)s.

We implement Hash-Value Numbering (HVN), a pre-solver constraint optimization described in Hardekopf & Lin, SAS'07. This is documented in more detail in hvn.go. We intend to add its cousins HR and HU in future.

The solver is currently a naive Andersen-style implementation; it does not perform online cycle detection, though we plan to add solver optimisations such as Hybrid- and Lazy-Cycle Detection from (Hardekopf & Lin, PLDI'07). It uses difference propagation (Pearce et al, SQC'04) to avoid redundant re-triggering of closure rules for values already seen. Points-to sets are represented using sparse bit vectors (similar to those used in LLVM and gcc), which are more space- and time-efficient than sets based on Go's built-in map type or dense bit vectors.

Nodes are permuted prior to solving so that object nodes (which may appear in points-to sets) are lower numbered than non-object (var) nodes. This improves the density of the set over which the PTSs range, and thus the efficiency of the representation. Partly thanks to avoiding map iteration, the execution of the solver is 100% deterministic, a great help during debugging.

References:

Andersen, L. O. 1994. Program analysis and specialization for the C programming language. Ph.D. dissertation. DIKU, University of Copenhagen.

David J. Pearce, Paul H. J. Kelly, and Chris Hankin. 2004. Efficient field-sensitive pointer analysis for C. In Proceedings of the 5th ACM SIGPLAN-SIGSOFT workshop on Program analysis for software tools and engineering (PASTE '04). ACM, New York, NY, USA, 37-42. http://doi.acm.org/10.1145/996821.996835

David J. Pearce, Paul H. J. Kelly, and Chris Hankin. 2004. Online Cycle Detection and Difference Propagation: Applications to Pointer Analysis. Software Quality Control 12, 4 (December 2004), 311-337. http://dx.doi.org/10.1023/B:SQJO.0000039791.93071.a2

David Grove and Craig Chambers. 2001. A framework for call graph construction algorithms. ACM Trans. Program. Lang. Syst. 23, 6 (November 2001), 685-746. http://doi.acm.org/10.1145/506315.506316

Ben Hardekopf and Calvin Lin. 2007. The ant and the grasshopper: fast and accurate pointer analysis for millions of lines of code. In Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (PLDI '07). ACM, New York, NY, USA, 290-299. http://doi.acm.org/10.1145/1250734.1250767

Ben Hardekopf and Calvin Lin. 2007. Exploiting pointer and location equivalence to optimize pointer analysis. In Proceedings of the 14th international conference on Static Analysis (SAS'07), Hanne Riis Nielson and Gilberto Filé (Eds.). Springer-Verlag, Berlin, Heidelberg, 265-280.

Atanas Rountev and Satish Chandra. 2000. Off-line variable substitution for scaling points-to analysis. In Proceedings of the ACM SIGPLAN 2000 conference on Programming language design and implementation (PLDI '00). ACM, New York, NY, USA, 47-56. http://doi.acm.org/10.1145/349299.349310

This program demonstrates how to use the pointer analysis to obtain a conservative call graph of a Go program. It also shows how to compute the points-to set of a variable, in this case, (C).f's ch parameter.
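A minimal sketch of such a client follows, using the golang.org/x/tools APIs (go/loader, go/ssa, go/pointer); "myprog.go" is a placeholder file name and error handling is abbreviated:

	package main

	import (
		"fmt"

		"golang.org/x/tools/go/loader"
		"golang.org/x/tools/go/pointer"
		"golang.org/x/tools/go/ssa"
		"golang.org/x/tools/go/ssa/ssautil"
	)

	func main() {
		// Load and type-check the whole program, then build SSA form.
		var conf loader.Config
		conf.CreateFromFilenames("main", "myprog.go")
		iprog, err := conf.Load()
		if err != nil {
			panic(err)
		}
		prog := ssautil.CreateProgram(iprog, ssa.InstantiateGenerics)
		prog.Build()
		mainPkg := prog.Package(iprog.Created[0].Pkg)

		// Configure the analysis: analyze this main package and
		// request construction of the callgraph.
		config := &pointer.Config{
			Mains:          []*ssa.Package{mainPkg},
			BuildCallGraph: true,
		}
		// Points-to queries for expressions of interest may be added
		// before running the analysis, e.g. for some ssa.Value v:
		//   config.AddQuery(v)

		result, err := pointer.Analyze(config)
		if err != nil {
			panic(err) // internal error in pointer analysis
		}

		// The callgraph, derived from the pts(targets) solutions.
		fmt.Println(result.CallGraph.Nodes[mainPkg.Func("main")])

		// Points-to sets for each queried value.
		for v, ptr := range result.Queries {
			fmt.Printf("%s may point to:\n", v)
			for _, label := range ptr.PointsTo().Labels() {
				fmt.Printf("  %s\n", label)
			}
		}
	}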
Package grpcreplay supports the capture and replay of gRPC calls. Its main goal is to improve testing. Once you capture the calls of a test that runs against a real service, you have an "automatic mock" that can be replayed against the same test, yielding a unit test that is fast and flake-free.

To record a sequence of gRPC calls to a file, create a Recorder and pass its DialOptions to grpc.Dial (a sketch appears at the end of this section). It is essential to close the Recorder when the interaction is finished. There is also a NewRecorderWriter function for capturing to an arbitrary io.Writer.

To replay a captured file, create a Replayer and ask it for a (fake) connection. We don't actually have to dial a server. (Since we're reading the file and not writing it, we don't have to be as careful about the error returned from Close.)

A test might use random or time-sensitive values, for instance to create unique resources for isolation from other tests. The test therefore has initial values, such as the current time, or a random seed, that differ from run to run. You must record this initial state and re-establish it on replay. To record the initial state, serialize it into a []byte and pass it as the second argument to NewRecorder. On replay, get the bytes from Replayer.Initial.

Recorders and replayers have support for running callbacks before messages are written to or read from the replay file. A Recorder has a BeforeFunc that can modify a request or response before it is written to the replay file. The actual RPCs sent to the service during recording remain unaltered; only what is saved in the replay file can be changed. A Replayer has a BeforeFunc that can modify a request before it is sent for matching. Example uses for these callbacks include customized logging, or scrubbing data before RPCs are written to the replay file. If requests are modified by the callbacks during recording, it is important to perform the same modifications to the requests when replaying, or RPC matching on replay will fail. A common way to analyze and modify the various messages is to use a type switch.

A nondeterministic program may invoke RPCs in a different order each time it is run. The order in which RPCs are called during recording may differ from the order during replay. The replayer matches incoming to recorded requests by method name and request contents, so nondeterminism is only a concern for identical requests that result in different responses. A nondeterministic program whose behavior differs depending on the order of such RPCs probably has a race condition: since both the recorded sequence of RPCs and the sequence during replay are valid orderings, the program should behave the same under both.

The same is not true of streaming RPCs. The replayer matches streams only by method name, since it has no other information at the time the stream is opened. Two streams with the same method name that are started concurrently may replay in the wrong order.

Besides the differences in replay mentioned above, other differences may cause issues for some programs. We list them here.

The Replayer delivers a response to an RPC immediately, without waiting for other incoming RPCs. This can violate causality. For example, in a Pub/Sub program where one goroutine publishes and another subscribes, during replay the Subscribe call may finish before the Publish call begins.

For streaming RPCs, the Replayer delivers the result of Send and Recv calls in the order they were recorded. No attempt is made to match message contents.
At present, this package does not record or replay stream headers and trailers, or the result of the CloseSend method.
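The record/replay flow described above might look roughly like the following sketch. The import path and the exact signatures (notably NewRecorder taking the initial state as its second argument) are assumptions based on the description above; consult the package documentation for the precise API.

	package example

	import (
		"google.golang.org/grpc"

		"github.com/google/go-replayers/grpcreplay" // import path assumed
	)

	// record runs an interaction against the real service at addr,
	// capturing all gRPC calls to the file at path.
	func record(path, addr string, initial []byte) error {
		rec, err := grpcreplay.NewRecorder(path, initial)
		if err != nil {
			return err
		}
		conn, err := grpc.Dial(addr, rec.DialOptions()...)
		if err != nil {
			return err
		}
		// ... run the test against the real service via conn ...
		conn.Close()
		// Closing the Recorder (and checking its error) is essential:
		// this finalizes the replay file.
		return rec.Close()
	}

	// replay runs the same interaction against the captured file;
	// no server is needed.
	func replay(path string) error {
		rep, err := grpcreplay.NewReplayer(path)
		if err != nil {
			return err
		}
		defer rep.Close()

		_ = rep.Initial() // re-establish recorded initial state from these bytes

		conn, err := rep.Connection() // a fake connection; nothing is dialed
		if err != nil {
			return err
		}
		// ... run the same test against conn ...
		_ = conn
		return nil
	}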
Package analysis provides methods to work with a Swagger specification document from package go-openapi/spec. An analysed specification object (type Spec) provides methods to work with the swagger definition.

Flattening a specification bundles all remote $ref in the main spec document. Depending on flattening options, additional preprocessing may take place. Mixin several specifications merges all Swagger constructs, and warns about found conflicts.

Unmarshalling a specification with Go's standard json unmarshalling may lead to some unwanted results on present but empty fields.

Swagger schemas are analyzed to determine their complexity and qualify their content.
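As an illustration, a minimal sketch of loading, analyzing and flattening a spec with the go-openapi toolchain. The use of the loads helper package and the exact FlattenOpts fields shown are assumptions; treat this as a sketch rather than a definitive recipe.

	package main

	import (
		"log"

		"github.com/go-openapi/analysis"
		"github.com/go-openapi/loads"
	)

	func main() {
		// Load the spec document ("swagger.json" is a placeholder path).
		doc, err := loads.Spec("swagger.json")
		if err != nil {
			log.Fatal(err)
		}

		// Analyze it: the Spec object answers questions about the definition.
		an := analysis.New(doc.Spec())
		log.Printf("paths: %d", len(an.AllPaths()))

		// Flatten: bundle all remote $ref into the main document.
		if err := analysis.Flatten(analysis.FlattenOpts{
			Spec:     an,
			BasePath: doc.SpecFilePath(),
			Expand:   false, // bundle $ref rather than fully expanding them
		}); err != nil {
			log.Fatal(err)
		}
	}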
Package jump implements the "jump consistent hash" algorithm; see the paper and its reference C++ implementation [1].

Jump consistent hash works by computing when its output changes as the number of buckets increases. Let ch(key, num_buckets) be the consistent hash for the key when there are num_buckets buckets. Clearly, for any key, k, ch(k, 1) is 0, since there is only the one bucket. In order for the consistent hash function to be balanced, ch(k, 2) will have to stay at 0 for half the keys, k, while it will have to jump to 1 for the other half. In general, ch(k, n+1) has to stay the same as ch(k, n) for n/(n+1) of the keys, and jump to n for the other 1/(n+1) of the keys. (The paper tabulates example consistent hash values for three keys, k1, k2, and k3, as num_buckets goes up.)

A linear time algorithm can be defined by using the formula for the probability of ch(key, j) jumping when j increases. It essentially walks across a row of this table. Given a key and number of buckets, the algorithm considers each successive bucket, j, from 1 to num_buckets-1, and uses ch(key, j) to compute ch(key, j+1). At each bucket, j, it decides whether to keep ch(k, j+1) the same as ch(k, j), or to jump its value to j. In order to jump for the right fraction of keys, it uses a pseudorandom number generator with the key as its seed. To jump for 1/(j+1) of keys, it generates a uniform random number between 0.0 and 1.0, and jumps if the value is less than 1/(j+1). At the end of the loop, it has computed ch(k, num_buckets), which is the desired answer. (The code for both the linear-time and logarithmic-time versions appears at the end of this section.)

We can convert this to a logarithmic time algorithm by exploiting the fact that ch(key, j+1) is usually unchanged as j increases, only jumping occasionally. The algorithm will only compute the destinations of jumps, the j's for which ch(key, j+1) ≠ ch(key, j). Also notice that for these j's, ch(key, j+1) = j.

To develop the algorithm, we will treat ch(key, j) as a random variable, so that we can use the notation for random variables to analyze the fractions of keys for which various propositions are true. That will lead us to a closed form expression for a pseudorandom variable whose value gives the destination of the next jump.

Suppose that the algorithm is tracking the bucket numbers of the jumps for a particular key, k. And suppose that b was the destination of the last jump, that is, ch(k, b) ≠ ch(k, b+1), and ch(k, b+1) = b. Now, we want to find the next jump, the smallest j such that ch(k, j+1) ≠ ch(k, b+1), or equivalently, the largest j such that ch(k, j) = ch(k, b+1). We will make a pseudorandom variable whose value is that j. To get a probabilistic constraint on j, note that for any bucket number, i, we have j ≥ i if and only if the consistent hash hasn't changed by i, that is, if and only if ch(k, i) = ch(k, b+1). Hence, the distribution of j must satisfy

	P(j ≥ i) = P( ch(k, i) = ch(k, b+1) )

Fortunately, it is easy to compute that probability. Notice that since P( ch(k, 10) = ch(k, 11) ) is 10/11, and P( ch(k, 11) = ch(k, 12) ) is 11/12, then P( ch(k, 10) = ch(k, 12) ) is 10/11 * 11/12 = 10/12. In general, if n ≥ m, P( ch(k, n) = ch(k, m) ) = m / n. Thus for any i > b,

	P(j ≥ i) = P( ch(k, i) = ch(k, b+1) ) = (b+1) / i

Now, we generate a pseudorandom variable, r, (depending on k and j) that is uniformly distributed between 0 and 1. Since we want P(j ≥ i) = (b+1) / i, we set j ≥ i iff r ≤ (b+1) / i. Solving the inequality for i yields j ≥ i iff i ≤ (b+1) / r. Since i is a lower bound on j, j will equal the largest i for which j ≥ i, thus the largest i satisfying i ≤ (b+1) / r. Thus, by the definition of the floor function, j = floor((b+1) / r).
Using this formula, jump consistent hash finds ch(key, num_buckets) by choosing successive jump destinations until it finds a position at or past num_buckets. It then knows that the previous jump destination is the answer.

To turn this into the actual code of figure 1, we need to implement random. We want it to be fast, and also to have well distributed successive values. We use a 64-bit linear congruential generator; the particular multiplier we use produces random numbers that are especially well distributed in higher dimensions (i.e., when successive random values are used to form tuples). We use the key as the seed. (For keys that don't fit into 64 bits, a 64-bit hash of the key should be used.) The congruential generator updates the seed on each iteration, and the code derives a double from the current seed. Tests show that this generator has good speed and distribution.

It is worth noting that unlike the algorithm of Karger et al., jump consistent hash does not require the key to be hashed if it is already an integer. This is because jump consistent hash has an embedded pseudorandom number generator that essentially rehashes the key on every iteration. The hash is not especially good (i.e., linear congruential), but since it is applied repeatedly, additional hashing of the input key is not necessary.

[1] http://arxiv.org/pdf/1406.2294v1.pdf
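In Go, the two algorithms described above look roughly as follows. Hash is a direct transcription of the logarithmic-time reference implementation from the paper [1]; chLinear is an illustrative rendering of the linear-time loop, with math/rand standing in for the paper's generator.

	package jump

	import "math/rand"

	// chLinear computes ch(key, numBuckets) in O(numBuckets) time, jumping
	// to bucket j with probability 1/(j+1), using the key as the PRNG seed.
	func chLinear(key uint64, numBuckets int32) int32 {
		rnd := rand.New(rand.NewSource(int64(key)))
		var b int32
		for j := int32(1); j < numBuckets; j++ {
			if rnd.Float64() < 1.0/float64(j+1) {
				b = j
			}
		}
		return b
	}

	// Hash computes ch(key, numBuckets) in O(log numBuckets) expected time,
	// computing only the jump destinations, with the 64-bit linear
	// congruential generator described above.
	func Hash(key uint64, numBuckets int32) int32 {
		var b int64 = -1
		var j int64
		for j < int64(numBuckets) {
			b = j
			key = key*2862933555777941757 + 1
			// j = floor((b+1) / r), where r is derived from the seed.
			j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
		}
		return int32(b)
	}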
Package pbc provides structures for building pairing-based cryptosystems. It is a wrapper around the Pairing-Based Cryptography (PBC) Library authored by Ben Lynn (https://crypto.stanford.edu/pbc/).

This wrapper provides access to all PBC functions. It supports generation of various types of elliptic curves and pairings, element initialization, I/O, and arithmetic. These features can be used to quickly build pairing-based or conventional cryptosystems.

The PBC library is designed to be extremely fast. Internally, it uses GMP for arbitrary-precision arithmetic. It also includes a wide variety of optimizations that make pairing-based cryptography highly efficient. To improve performance, PBC does not perform type checking to ensure that operations actually make sense. The Go wrapper provides the ability to add compatibility checks to most operations, or to use unchecked elements to maximize performance.

Since this library provides low-level access to pairing primitives, it is very easy to accidentally construct insecure systems. This library is intended to be used by cryptographers or to implement well-analyzed cryptosystems.

Cryptographic pairings are defined over three mathematical groups: G1, G2, and GT, where each group is typically of the same order r. Additionally, a bilinear map e maps a pair of elements (one from G1 and another from G2) to an element in GT. This map e has the following additional property: for any g1 in G1, g2 in G2, and integers a and b,

	e(g1^a, g2^b) = e(g1, g2)^(a*b)

If G1 == G2, then a pairing is said to be symmetric. Otherwise, it is asymmetric. Pairings can be used to construct a variety of efficient cryptosystems.

The PBC library currently supports 5 different types of pairings, each with configurable parameters. These types are designated alphabetically, roughly in chronological order of introduction. Type A, D, E, F, and G pairings are implemented in the library. Each type has different time and space requirements. For more information about the types, see the documentation for the corresponding generator calls, or the PBC manual page at https://crypto.stanford.edu/pbc/manual/ch05s01.html.

This package must be compiled using cgo. It also requires the installation of GMP and PBC. During the build process, this package will attempt to include <gmp.h> and <pbc/pbc.h>, and then dynamically link to GMP and PBC.

Most systems include a package for GMP (e.g. libgmp-dev on Debian / Ubuntu, gmp-devel for RPM installations with YUM, or the gmp package with Fink (http://www.finkproject.org/) on Mac OS X). For more information or to compile from source, visit https://gmplib.org/

To install the PBC library, download the appropriate files for your system from https://crypto.stanford.edu/pbc/download.html. PBC has three dependencies: the gcc compiler, flex (http://flex.sourceforge.net/), and bison (https://www.gnu.org/software/bison/). See the respective sites for installation instructions. Most distributions include packages for these libraries; for example, Debian / Ubuntu provide the gcc, flex, and bison packages. The PBC source can be compiled and installed using the usual GNU Build System (configure, make, make install). After installing, you may need to rebuild the search path for libraries (ldconfig).

It is possible to install the package on Windows through the use of MinGW and MSYS. MSYS is required for installing PBC, while GMP can be installed through a package. Based on your MinGW installation, you may need to add "-I/usr/local/include" to CPPFLAGS and "-L/usr/local/lib" to LDFLAGS when building PBC. Likewise, you may need to add these options to CGO_CPPFLAGS and CGO_LDFLAGS when installing this package.
This package is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. For additional details, see the COPYING and COPYING.LESSER files.

The first example below generates a pairing and some random group elements, then applies the pairing operation. The second example computes and verifies a Boneh-Lynn-Shacham signature in a simulated conversation between Alice and Bob.
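A condensed sketch of the first example, assuming this wrapper's API (GenerateA, NewPairing, and Element methods such as Rand, PowZn, Mul, Pair and Equals); the import path is a placeholder:

	package main

	import (
		"fmt"

		"github.com/Nik-U/pbc" // import path assumed
	)

	func main() {
		// Generate a type A pairing (symmetric, so G1 == G2).
		pairing := pbc.GenerateA(160, 512).NewPairing()

		// Random group elements and exponents.
		g := pairing.NewG1().Rand()
		h := pairing.NewG2().Rand()
		a := pairing.NewZr().Rand()
		b := pairing.NewZr().Rand()

		// Bilinearity: e(g^a, h^b) == e(g, h)^(a*b).
		left := pairing.NewGT().Pair(pairing.NewG1().PowZn(g, a), pairing.NewG2().PowZn(h, b))
		right := pairing.NewGT().PowZn(pairing.NewGT().Pair(g, h), pairing.NewZr().Mul(a, b))
		fmt.Println("bilinear:", left.Equals(right))
	}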
Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include:

	Breiman and Cutler's Random Forest for Classification and Regression
	Adaptive Boosting (AdaBoost) Classification
	Gradient Boosting Tree Regression
	Entropy and Cost driven classification
	L1 regression
	Feature selection with artificial contrasts
	Proximity and model structure analysis
	Roughly balanced bagging for unbalanced classification

The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and can not yet be released.

Library documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/ryanbressler/CloudForest

Documentation of command line utilities and file formats can be found in README.md, which can be viewed formatted on github: http://github.com/ryanbressler/CloudForest

Pull requests and bug reports are welcome. CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute.

CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi-threaded and cluster environments in mind.

Go's support for function types is used to provide an interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method. This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects.

Decision trees are grown with the goal of reducing "impurity", which is usually defined as Gini impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface, which allows for alternative definitions of impurity. CloudForest includes several alternative targets. Additional targets can be stacked on top of these targets to add boosting functionality.

Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In Go, slices pointing at the same underlying arrays make this sort of optimization transparent, as the sketch below illustrates.
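The following hypothetical function (not CloudForest's actual API) illustrates the idiom: it partitions a slice of case indexes in place and returns left and right sub-slices that share the original backing array, so no allocation occurs.

	package example

	// splitCases partitions cases in place around goesLeft and returns
	// left/right sub-slices of the same backing array; no copy is made.
	func splitCases(cases []int, goesLeft func(c int) bool) (left, right []int) {
		i, j := 0, len(cases)
		for i < j {
			if goesLeft(cases[i]) {
				i++ // case i stays on the left
			} else {
				j--
				cases[i], cases[j] = cases[j], cases[i] // swap case to the right
			}
		}
		return cases[:i], cases[i:]
	}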
Such a function can return left and right slices that point to the same underlying array as the original slice of cases, but these slices should not have their values changed.

Functions used while searching for the best split also accept pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items, and its use can be seen in the BestSplit functions.

For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit, which relies on Go's sorting package.

Training a Random Forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as they are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread-safe data structures like RunningMean.

Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation, and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine, Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, distributed database etc.

By default CloudForest uses a fast heuristic for missing values. When proposing a split on a feature with missing data, the missing cases are removed and the impurity value is corrected to use three-way impurity, which reduces the bias towards features with lots of missing data. Missing values in the target variable are left out of impurity calculations. This provides generally good results at a fraction of the computational costs of imputing data.

Optionally, feature.ImputeMissing or featurematrix.ImputeMissing can be called before forest growth to impute missing values to the feature mean/mode, which Breiman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity-weighted imputation Breiman describes.

Experimental support is provided for 3-way splitting, which splits missing cases onto a third branch [2]. This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3].

[1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1
[2] https://code.google.com/p/rf-ace/
[3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record

In CloudForest, data is stored using the FeatureMatrix struct, which contains Features.
The Feature struct implements storage and methods for both categorical and numerical data, calculations of impurity etc., and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable. This allows for the implementation of alternative types of regression and classification.

Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow method that implements Breiman and Cutler's method (see the extract above) for growing a tree. A GrowForest method is also provided that implements the rest of the method, including sampling cases, but it may be faster to grow the forest to disk as in the growforest utility.

Prediction and voting are done using Tree.Vote and CatBallotBox and NumBallotBox, which implement the VoteTallyer interface.
Go is a tool for managing Go source code.

Usage:

	go <command> [arguments]

The commands are:

Use "go help <command>" for more information about a command.

Additional help topics:

Use "go help <topic>" for more information about that topic.

Usage:

Bug opens the default browser and starts a new bug report. The report includes useful system information.

Usage:

Build compiles the packages named by the import paths, along with their dependencies, but it does not install the results.

If the arguments to build are a list of .go files from a single directory, build treats them as a list of source files specifying a single package.

When compiling packages, build ignores files that end in '_test.go'.

When compiling a single main package, build writes the resulting executable to an output file named after the last non-major-version component of the package import path. The '.exe' suffix is added when writing a Windows executable. So 'go build example/sam' writes 'sam' or 'sam.exe'. 'go build example.com/foo/v2' writes 'foo' or 'foo.exe', not 'v2.exe'. When compiling a package from a list of .go files, the executable is named after the first source file. 'go build ed.go rx.go' writes 'ed' or 'ed.exe'.

When compiling multiple packages or a single non-main package, build compiles the packages but discards the resulting object, serving only as a check that the packages can be built.

The -o flag forces build to write the resulting executable or object to the named output file or directory, instead of the default behavior described in the last two paragraphs. If the named output is an existing directory or ends with a slash or backslash, then any resulting executables will be written to that directory.

The build flags are shared by the build, clean, get, install, list, run, and test commands:

The -asmflags, -gccgoflags, -gcflags, and -ldflags flags accept a space-separated list of arguments to pass to an underlying tool during the build. To embed spaces in an element in the list, surround it with either single or double quotes. The argument list may be preceded by a package pattern and an equal sign, which restricts the use of that argument list to the building of packages matching that pattern (see 'go help packages' for a description of package patterns). Without a pattern, the argument list applies only to the packages named on the command line. The flags may be repeated with different patterns in order to specify different arguments for different sets of packages. If a package matches patterns given in multiple flags, the latest match on the command line wins. For example, 'go build -gcflags=-S fmt' prints the disassembly only for package fmt, while 'go build -gcflags=all=-S fmt' prints the disassembly for fmt and all its dependencies.

For more about specifying packages, see 'go help packages'. For more about where packages and binaries are installed, run 'go help gopath'. For more about calling between Go and C/C++, run 'go help c'.

Note: Build adheres to certain conventions such as those described by 'go help gopath'. Not all projects can follow these conventions, however. Installations that have their own conventions or that use a separate software build system may choose to use lower-level invocations such as 'go tool compile' and 'go tool link' to avoid some of the overheads and design decisions of the build tool.

See also: go install, go get, go clean.

Usage:

Clean removes object files from package source directories.
The go command builds most objects in a temporary directory, so go clean is mainly concerned with object files left by other tools or by manual invocations of go build.

If a package argument is given or the -i or -r flag is set, clean removes the following files from each of the source directories corresponding to the import paths:

In the list, DIR represents the final path element of the directory, and MAINFILE is the base name of any Go source file in the directory that is not included when building the package.

The -i flag causes clean to remove the corresponding installed archive or binary (what 'go install' would create).

The -n flag causes clean to print the remove commands it would execute, but not run them.

The -r flag causes clean to be applied recursively to all the dependencies of the packages named by the import paths.

The -x flag causes clean to print remove commands as it executes them.

The -cache flag causes clean to remove the entire go build cache.

The -testcache flag causes clean to expire all test results in the go build cache.

The -modcache flag causes clean to remove the entire module download cache, including unpacked source code of versioned dependencies.

The -fuzzcache flag causes clean to remove files stored in the Go build cache for fuzz testing. The fuzzing engine caches files that expand code coverage, so removing them may make fuzzing less effective until new inputs are found that provide the same coverage. These files are distinct from those stored in the testdata directory; clean does not remove those files.

For more about build flags, see 'go help build'. For more about specifying packages, see 'go help packages'.

Usage:

Doc prints the documentation comments associated with the item identified by its arguments (a package, const, func, type, var, method, or struct field) followed by a one-line summary of each of the first-level items "under" that item (package-level declarations for a package, methods for a type, etc.).

Doc accepts zero, one, or two arguments.

Given no arguments, that is, when run as

	go doc

it prints the package documentation for the package in the current directory. If the package is a command (package main), the exported symbols of the package are elided from the presentation unless the -cmd flag is provided.

When run with one argument, the argument is treated as a Go-syntax-like representation of the item to be documented. What the argument selects depends on what is installed in GOROOT and GOPATH, as well as the form of the argument, which is schematically one of these:

The first item in this list matched by the argument is the one whose documentation is printed. (See the examples below.) However, if the argument starts with a capital letter it is assumed to identify a symbol or method in the current directory.

For packages, the order of scanning is determined lexically in breadth-first order. That is, the package presented is the one that matches the search and is nearest the root and lexically first at its level of the hierarchy. The GOROOT tree is always scanned in its entirety before GOPATH.

If there is no package specified or matched, the package in the current directory is selected, so "go doc Foo" shows the documentation for symbol Foo in the current package.

The package path must be either a qualified path or a proper suffix of a path. The go tool's usual package mechanism does not apply: package path elements like . and ... are not implemented by go doc.
When run with two arguments, the first is a package path (full path or suffix), and the second is a symbol, or symbol with method or struct field:

In all forms, when matching symbols, lower-case letters in the argument match either case but upper-case letters match exactly. This means that there may be multiple matches of a lower-case argument in a package if different symbols have different cases. If this occurs, documentation for all matches is printed.

Examples:

Flags:

Usage:

Env prints Go environment information.

By default env prints information as a shell script (on Windows, a batch file). If one or more variable names are given as arguments, env prints the value of each named variable on its own line.

The -json flag prints the environment in JSON format instead of as a shell script.

The -u flag requires one or more arguments and unsets the default setting for the named environment variables, if one has been set with 'go env -w'.

The -w flag requires one or more arguments of the form NAME=VALUE and changes the default settings of the named environment variables to the given values.

The -changed flag prints only those settings whose effective value differs from the default value that would be obtained in an empty environment with no prior uses of the -w flag.

For more about environment variables, see 'go help environment'.

Usage:

Fix runs the Go fix tool (cmd/fix) on the named packages and applies suggested fixes. It supports these flags:

The -fixtool=prog flag selects a different analysis tool with alternative or additional fixers; see the documentation for go vet's -vettool flag for details.

The default fix tool is 'go tool fix' or cmd/fix. For help on its fixers and their flags, run 'go tool fix help'. For details of a specific fixer such as 'hostport', see 'go tool fix help hostport'.

For more about specifying packages, see 'go help packages'.

The build flags supported by go fix are those that control package resolution and execution, such as -C, -n, -x, -v, -tags, and -toolexec. For more about these flags, see 'go help build'.

See also: go fmt, go vet.

Usage:

Fmt runs the command 'gofmt -l -w' on the packages named by the import paths. It prints the names of the files that are modified. For more about gofmt, see 'go doc cmd/gofmt'. For more about specifying packages, see 'go help packages'.

The -n flag prints commands that would be executed. The -x flag prints commands as they are executed.

The -mod flag's value sets which module download mode to use: readonly or vendor. See 'go help modules' for more.

To run gofmt with specific options, run gofmt itself.

See also: go fix, go vet.

Usage:

Generate runs commands described by directives within existing files. Those commands can run any process but the intent is to create or update Go source files.

Go generate is never run automatically by go build, go test, and so on. It must be run explicitly.

Go generate scans the file for directives, which are lines of the form

	//go:generate command argument...

(note: no leading spaces and no space in "//go") where command is the generator to be run, corresponding to an executable file that can be run locally. It must either be in the shell path (gofmt), a fully qualified path (/usr/you/bin/mytool), or a command alias, described below.

Note that go generate does not parse the file, so lines that look like directives in comments or multiline strings will be treated as directives.

The arguments to the directive are space-separated tokens or double-quoted strings passed to the generator as individual arguments when it is run.
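For instance, a directive invoking a (hypothetical, externally installed) stringer generator looks like:

	//go:generate stringer -type=Pill

When 'go generate' runs over the file, it executes the stringer binary with the single argument -type=Pill.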
Quoted strings use Go syntax and are evaluated before execution; a quoted string appears as a single argument to the generator.

To convey to humans and machine tools that code is generated, generated source should have a line that matches the following regular expression (in Go syntax):

	^// Code generated .* DO NOT EDIT\.$

This line must appear before the first non-comment, non-blank text in the file.

Go generate sets several variables when it runs the generator:

Other than variable substitution and quoted-string evaluation, no special processing such as "globbing" is performed on the command line.

As a last step before running the command, any invocations of any environment variables with alphanumeric names, such as $GOFILE or $HOME, are expanded throughout the command line. The syntax for variable expansion is $NAME on all operating systems. Due to the order of evaluation, variables are expanded even inside quoted strings. If the variable NAME is not set, $NAME expands to the empty string.

A directive of the form

	//go:generate -command xxx args...

specifies, for the remainder of this source file only, that the string xxx represents the command identified by the arguments. This can be used to create aliases or to handle multiword generators. For example,

	//go:generate -command foo go tool foo

specifies that the command "foo" represents the generator "go tool foo".

Generate processes packages in the order given on the command line, one at a time. If the command line lists .go files from a single directory, they are treated as a single package. Within a package, generate processes the source files in file name order, one at a time. Within a source file, generate runs generators in the order they appear in the file, one at a time.

The go generate tool also sets the build tag "generate" so that files may be examined by go generate but ignored during build.

For packages with invalid code, generate processes only source files with a valid package clause.

If any generator returns an error exit status, "go generate" skips all further processing for that package.

The generator is run in the package's source directory.

Go generate accepts two specific flags:

It also accepts the standard build flags including -v, -n, and -x. The -v flag prints the names of packages and files as they are processed. The -n flag prints commands that would be executed. The -x flag prints commands as they are executed.

For more about build flags, see 'go help build'. For more about specifying packages, see 'go help packages'.

Usage:

Get resolves its command-line arguments to packages at specific module versions, updates go.mod to require those versions, and downloads source code into the module cache.

To add a dependency for a package or upgrade it to its latest version:

	go get example.com/pkg

To upgrade or downgrade a package to a specific version:

	go get example.com/pkg@v1.2.3

To remove a dependency on a module and downgrade modules that require it:

	go get example.com/mod@none

To upgrade the minimum required Go version to the latest released Go version:

	go get go@latest

To upgrade the Go toolchain to the latest patch release of the current Go toolchain:

	go get toolchain@patch

See https://golang.org/ref/mod#go-get for details.

In earlier versions of Go, 'go get' was used to build and install packages. Now, 'go get' is dedicated to adjusting dependencies in go.mod. 'go install' may be used to build and install commands instead. When a version is specified, 'go install' runs in module-aware mode and ignores the go.mod file in the current directory. For example:

	go install example.com/pkg@v1.2.3
	go install example.com/pkg@latest

See 'go help install' or https://golang.org/ref/mod#go-install for details.

'go get' accepts the following flags.
The -t flag instructs get to consider modules needed to build tests of packages specified on the command line. The -u flag instructs get to update modules providing dependencies of packages named on the command line to use newer minor or patch releases when available. The -u=patch flag (not -u patch) also instructs get to update dependencies, but changes the default to select patch releases. When the -t and -u flags are used together, get will update test dependencies as well. The -tool flag instructs go to add a matching tool line to go.mod for each listed package. If -tool is used with @none, the line will be removed. The -x flag prints commands as they are executed. This is useful for debugging version control commands when a module is downloaded directly from a repository. For more about build flags, see 'go help build'. For more about modules, see https://golang.org/ref/mod. For more about using 'go get' to update the minimum Go version and suggested Go toolchain, see https://go.dev/doc/toolchain. For more about specifying packages, see 'go help packages'. See also: go build, go install, go clean, go mod. Usage: Install compiles and installs the packages named by the import paths. Executables are installed in the directory named by the GOBIN environment variable, which defaults to $GOPATH/bin or $HOME/go/bin if the GOPATH environment variable is not set. Executables in $GOROOT are installed in $GOROOT/bin or $GOTOOLDIR instead of $GOBIN. Cross compiled binaries are installed in $GOOS_$GOARCH subdirectories of the above. If the arguments have version suffixes (like @latest or @v1.0.0), "go install" builds packages in module-aware mode, ignoring the go.mod file in the current directory or any parent directory, if there is one. This is useful for installing executables without affecting the dependencies of the main module. To eliminate ambiguity about which module versions are used in the build, the arguments must satisfy the following constraints: - Arguments must be package paths or package patterns (with "..." wildcards). They must not be standard packages (like fmt), meta-patterns (std, cmd, all), or relative or absolute file paths. - All arguments must have the same version suffix. Different queries are not allowed, even if they refer to the same version. - All arguments must refer to packages in the same module at the same version. - Package path arguments must refer to main packages. Pattern arguments will only match main packages. - No module is considered the "main" module. If the module containing packages named on the command line has a go.mod file, it must not contain directives (replace and exclude) that would cause it to be interpreted differently than if it were the main module. The module must not require a higher version of itself. - Vendor directories are not used in any module. (Vendor directories are not included in the module zip files downloaded by 'go install'.) If the arguments don't have version suffixes, "go install" may run in module-aware mode or GOPATH mode, depending on the GO111MODULE environment variable and the presence of a go.mod file. See 'go help modules' for details. If module-aware mode is enabled, "go install" runs in the context of the main module. When module-aware mode is disabled, non-main packages are installed in the directory $GOPATH/pkg/$GOOS_$GOARCH. When module-aware mode is enabled, non-main packages are built and cached but not installed. Before Go 1.20, the standard library was installed to $GOROOT/pkg/$GOOS_$GOARCH. 
Starting in Go 1.20, the standard library is built and cached but not installed. Setting GODEBUG=installgoroot=all restores the use of $GOROOT/pkg/$GOOS_$GOARCH. For more about build flags, see 'go help build'. For more about specifying packages, see 'go help packages'. See also: go build, go get, go clean. Usage: List lists the named packages, one per line. The most commonly-used flags are -f and -json, which control the form of the output printed for each package. Other list flags, documented below, control more specific details. The default output shows the package import path: The -f flag specifies an alternate format for the list, using the syntax of package template. The default output is equivalent to -f '{{.ImportPath}}'. The struct being passed to the template is: Packages stored in vendor directories report an ImportPath that includes the path to the vendor directory (for example, "d/vendor/p" instead of "p"), so that the ImportPath uniquely identifies a given copy of a package. The Imports, Deps, TestImports, and XTestImports lists also contain these expanded import paths. See golang.org/s/go15vendor for more about vendoring. The error information, if any, is The module information is a Module struct, defined in the discussion of list -m below. The template function "join" calls strings.Join. The template function "context" returns the build context, defined as: For more information about the meaning of these fields see the documentation for the go/build package's Context type. The -json flag causes the package data to be printed in JSON format instead of using the template format. The JSON flag can optionally be provided with a set of comma-separated required field names to be output. If so, those required fields will always appear in JSON output, but others may be omitted to save work in computing the JSON struct. The -compiled flag causes list to set CompiledGoFiles to the Go source files presented to the compiler. Typically this means that it repeats the files listed in GoFiles and then also adds the Go code generated by processing CgoFiles and SwigFiles. The Imports list contains the union of all imports from both GoFiles and CompiledGoFiles. The -deps flag causes list to iterate over not just the named packages but also all their dependencies. It visits them in a depth-first post-order traversal, so that a package is listed only after all its dependencies. Packages not explicitly listed on the command line will have the DepOnly field set to true. The -e flag changes the handling of erroneous packages, those that cannot be found or are malformed. By default, the list command prints an error to standard error for each erroneous package and omits the packages from consideration during the usual printing. With the -e flag, the list command never prints errors to standard error and instead processes the erroneous packages with the usual printing. Erroneous packages will have a non-empty ImportPath and a non-nil Error field; other information may or may not be missing (zeroed). The -export flag causes list to set the Export field to the name of a file containing up-to-date export information for the given package, and the BuildID field to the build ID of the compiled package. The -find flag causes list to identify the named packages but not resolve their dependencies: the Imports and Deps lists will be empty. With the -find flag, the -deps, -test and -export commands cannot be used. 
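A brief, illustrative use of the flags above (package patterns arbitrary):

	go list -f '{{.ImportPath}}: {{join .Imports ", "}}' fmt
	go list -deps -json ./...

The first command prints fmt's direct imports using the "join" template function described above; the second emits JSON for the named packages and all of their dependencies in depth-first post-order.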
The -test flag causes list to report not only the named packages but also their test binaries (for packages with tests), to convey to source code analysis tools exactly how test binaries are constructed. The reported import path for a test binary is the import path of the package followed by a ".test" suffix, as in "math/rand.test". When building a test, it is sometimes necessary to rebuild certain dependencies specially for that test (most commonly the tested package itself). The reported import path of a package recompiled for a particular test binary is followed by a space and the name of the test binary in brackets, as in "math/rand math/rand.test" or "regexp [sort.test]". The ForTest field is also set to the name of the package being tested ("math/rand" or "sort" in the previous examples). The Dir, Target, Shlib, Root, ConflictDir, and Export file paths are all absolute paths. By default, the lists GoFiles, CgoFiles, and so on hold names of files in Dir (that is, paths relative to Dir, not absolute paths). The generated files added when using the -compiled and -test flags are absolute paths referring to cached copies of generated Go source files. Although they are Go source files, the paths may not end in ".go". The -m flag causes list to list modules instead of packages. When listing modules, the -f flag still specifies a format template applied to a Go struct, but now a Module struct: The file GoMod refers to may be outside the module directory if the module is in the module cache or if the -modfile flag is used. The default output is to print the module path and then information about the version and replacement if any. For example, 'go list -m all' might print: The Module struct has a String method that formats this line of output, so that the default format is equivalent to -f '{{.String}}'. Note that when a module has been replaced, its Replace field describes the replacement module, and its Dir field is set to the replacement's source code, if present. (That is, if Replace is non-nil, then Dir is set to Replace.Dir, with no access to the replaced source code.) The -u flag adds information about available upgrades. When the latest version of a given module is newer than the current one, list -u sets the Module's Update field to information about the newer module. list -u will also set the module's Retracted field if the current version is retracted. The Module's String method indicates an available upgrade by formatting the newer version in brackets after the current version. If a version is retracted, the string "(retracted)" will follow it. For example, 'go list -m -u all' might print: (For tools, 'go list -m -u -json all' may be more convenient to parse.) The -versions flag causes list to set the Module's Versions field to a list of all known versions of that module, ordered according to semantic versioning, earliest to latest. The flag also changes the default output format to display the module path followed by the space-separated version list. The -retracted flag causes list to report information about retracted module versions. When -retracted is used with -f or -json, the Retracted field explains why the version was retracted. The strings are taken from comments on the retract directive in the module's go.mod file. When -retracted is used with -versions, retracted versions are listed together with unretracted versions. The -retracted flag may be used with or without -m. The arguments to list -m are interpreted as a list of modules, not packages. 
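To make the module queries above concrete (the golang.org/x/text path is only an illustration):

	go list -m all                           # main module and all active dependencies
	go list -m -u all                        # include available upgrades and retractions
	go list -m -versions golang.org/x/text   # all known versions, earliest to latest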
The main module is the module containing the current directory. The active modules are the main module and its dependencies. With no arguments, list -m shows the main module. With arguments, list -m shows the modules specified by the arguments. Any of the active modules can be specified by its module path. The special pattern "all" specifies all the active modules, first the main module and then dependencies sorted by module path. A pattern containing "..." specifies the active modules whose module paths match the pattern. A query of the form path@version specifies the result of that query, which is not limited to active modules. See 'go help modules' for more about module queries.

The template function "module" takes a single string argument that must be a module path or query and returns the specified module as a Module struct. If an error occurs, the result will be a Module struct with a non-nil Error field.

When using -m, the -reuse=old.json flag accepts the name of a file containing the JSON output of a previous 'go list -m -json' invocation with the same set of modifier flags (such as -u, -retracted, and -versions). The go command may use this file to determine that a module is unchanged since the previous invocation and avoid redownloading information about it. Modules that are not redownloaded will be marked in the new output by setting the Reuse field to true. Normally the module cache provides this kind of reuse automatically; the -reuse flag can be useful on systems that do not preserve the module cache.

For more about build flags, see 'go help build'. For more about specifying packages, see 'go help packages'. For more about modules, see https://golang.org/ref/mod.

Go mod provides access to operations on modules.

Note that support for modules is built into all the go commands, not just 'go mod'. For example, day-to-day adding, removing, upgrading, and downgrading of dependencies should be done using 'go get'. See 'go help modules' for an overview of module functionality.

Usage:

The commands are:

Use "go help mod <command>" for more information about a command.

Usage:

Download downloads the named modules, which can be module patterns selecting dependencies of the main module or module queries of the form path@version.

With no arguments, download applies to the modules needed to build and test the packages in the main module: the modules explicitly required by the main module if it is at 'go 1.17' or higher, or all transitively-required modules if at 'go 1.16' or lower.

The go command will automatically download modules as needed during ordinary execution. The "go mod download" command is useful mainly for pre-filling the local cache or to compute the answers for a Go module proxy.

By default, download writes nothing to standard output. It may print progress messages and errors to standard error.

The -json flag causes download to print a sequence of JSON objects to standard output, describing each downloaded module (or failure), corresponding to this Go struct:

The -reuse flag accepts the name of a file containing the JSON output of a previous 'go mod download -json' invocation. The go command may use this file to determine that a module is unchanged since the previous invocation and avoid redownloading it. Modules that are not redownloaded will be marked in the new output by setting the Reuse field to true. Normally the module cache provides this kind of reuse automatically; the -reuse flag can be useful on systems that do not preserve the module cache.
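A sketch of a typical invocation (module path and version hypothetical):

	go mod download -json example.com/m@v1.2.3

This fetches the single module version into the cache and prints the JSON object described above.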
The -x flag causes download to print the commands download executes.

See https://golang.org/ref/mod#go-mod-download for more about 'go mod download'. See https://golang.org/ref/mod#version-queries for more about version queries.

Usage:

Edit provides a command-line interface for editing go.mod, for use primarily by tools or scripts. It reads only go.mod; it does not look up information about the modules involved. By default, edit reads and writes the go.mod file of the main module, but a different target file can be specified after the editing flags.

The editing flags specify a sequence of editing operations.

The -fmt flag reformats the go.mod file without making other changes. This reformatting is also implied by any other modifications that use or rewrite the go.mod file. The only time this flag is needed is if no other flags are specified, as in 'go mod edit -fmt'.

The -module flag changes the module's path (the go.mod file's module line).

The -godebug=key=value flag adds a godebug key=value line, replacing any existing godebug lines with the given key.

The -dropgodebug=key flag drops any existing godebug lines with the given key.

The -require=path@version and -droprequire=path flags add and drop a requirement on the given module path and version. Note that -require overrides any existing requirements on path. These flags are mainly for tools that understand the module graph. Users should prefer 'go get path@version' or 'go get path@none', which make other go.mod adjustments as needed to satisfy constraints imposed by other modules.

The -go=version flag sets the expected Go language version. This flag is mainly for tools that understand Go version dependencies. Users should prefer 'go get go@version'.

The -toolchain=version flag sets the Go toolchain to use. This flag is mainly for tools that understand Go version dependencies. Users should prefer 'go get toolchain@version'.

The -exclude=path@version and -dropexclude=path@version flags add and drop an exclusion for the given module path and version. Note that -exclude=path@version is a no-op if that exclusion already exists.

The -replace=old[@v]=new[@v] flag adds a replacement of the given module path and version pair. If the @v in old@v is omitted, a replacement without a version on the left side is added, which applies to all versions of the old module path. If the @v in new@v is omitted, the new path should be a local module root directory, not a module path. Note that -replace overrides any redundant replacements for old[@v], so omitting @v will drop existing replacements for specific versions.

The -dropreplace=old[@v] flag drops a replacement of the given module path and version pair. If the @v is omitted, a replacement without a version on the left side is dropped.

The -retract=version and -dropretract=version flags add and drop a retraction on the given version. The version may be a single version like "v1.2.3" or a closed interval like "[v1.1.0,v1.1.9]". Note that -retract=version is a no-op if that retraction already exists.

The -tool=path and -droptool=path flags add and drop a tool declaration for the given path.

The -ignore=path and -dropignore=path flags add and drop an ignore declaration for the given path.

The -godebug, -dropgodebug, -require, -droprequire, -exclude, -dropexclude, -replace, -dropreplace, -retract, -dropretract, -tool, -droptool, -ignore, and -dropignore editing flags may be repeated, and the changes are applied in the order given.
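Putting the editing flags together (module paths hypothetical):

	go mod edit -require=example.com/m@v1.2.3
	go mod edit -replace=example.com/m=../m      # redirect all versions to a local checkout
	go mod edit -dropreplace=example.com/m

The operations are applied in the order given, and the file is reformatted on the way out.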
The -print flag prints the final go.mod in its text format instead of writing it back to go.mod.

The -json flag prints the final go.mod file in JSON format instead of writing it back to go.mod. The JSON output corresponds to these Go types:

Retract entries representing a single version (not an interval) will have the "Low" and "High" fields set to the same value.

Note that this only describes the go.mod file itself, not other modules referred to indirectly. For the full set of modules available to a build, use 'go list -m -json all'.

Edit also provides the -C, -n, and -x build flags.

See https://golang.org/ref/mod#go-mod-edit for more about 'go mod edit'.

Usage:

Graph prints the module requirement graph (with replacements applied) in text form. Each line in the output has two space-separated fields: a module and one of its requirements. Each module is identified as a string of the form path@version, except for the main module, which has no @version suffix.

The -go flag causes graph to report the module graph as loaded by the given Go version, instead of the version indicated by the 'go' directive in the go.mod file.

The -x flag causes graph to print the commands graph executes.

See https://golang.org/ref/mod#go-mod-graph for more about 'go mod graph'.

Usage:

Init initializes and writes a new go.mod file in the current directory, in effect creating a new module rooted at the current directory. The go.mod file must not already exist.

Init accepts one optional argument, the module path for the new module. If the module path argument is omitted, init will attempt to infer the module path using import comments in .go files and the current directory (if in GOPATH).

See https://golang.org/ref/mod#go-mod-init for more about 'go mod init'.

Usage:

Tidy makes sure go.mod matches the source code in the module. It adds any missing modules necessary to build the current module's packages and dependencies, and it removes unused modules that don't provide any relevant packages. It also adds any missing entries to go.sum and removes any unnecessary ones.

The -v flag causes tidy to print information about removed modules to standard error.

The -e flag causes tidy to attempt to proceed despite errors encountered while loading packages.

The -diff flag causes tidy not to modify go.mod or go.sum but instead print the necessary changes as a unified diff. It exits with a non-zero code if the diff is not empty.

The -go flag causes tidy to update the 'go' directive in the go.mod file to the given version, which may change which module dependencies are retained as explicit requirements in the go.mod file. (Go versions 1.17 and higher retain more requirements in order to support lazy module loading.)

The -compat flag preserves any additional checksums needed for the 'go' command from the indicated major Go release to successfully load the module graph, and causes tidy to error out if that version of the 'go' command would load any imported package from a different module version. By default, tidy acts as if the -compat flag were set to the version prior to the one indicated by the 'go' directive in the go.mod file.

The -x flag causes tidy to print the commands tidy executes.

See https://golang.org/ref/mod#go-mod-tidy for more about 'go mod tidy'.

Usage:

Vendor resets the main module's vendor directory to include all packages needed to build and test all the main module's packages. It does not include test code for vendored packages.
The -v flag causes vendor to print the names of vendored modules and packages to standard error. The -e flag causes vendor to attempt to proceed despite errors encountered while loading packages. The -o flag causes vendor to create the vendor directory at the given path instead of "vendor". The go command can only use a vendor directory named "vendor" within the module root directory, so this flag is primarily useful for other tools. See https://golang.org/ref/mod#go-mod-vendor for more about 'go mod vendor'. Usage: Verify checks that the dependencies of the current module, which are stored in a local downloaded source cache, have not been modified since being downloaded. If all the modules are unmodified, verify prints "all modules verified." Otherwise it reports which modules have been changed and causes 'go mod' to exit with a non-zero status. See https://golang.org/ref/mod#go-mod-verify for more about 'go mod verify'. Usage: Why shows a shortest path in the import graph from the main module to each of the listed packages. If the -m flag is given, why treats the arguments as a list of modules and finds a path to any package in each of the modules. By default, why queries the graph of packages matched by "go list all", which includes tests for reachable packages. The -vendor flag causes why to exclude tests of dependencies. The output is a sequence of stanzas, one for each package or module name on the command line, separated by blank lines. Each stanza begins with a comment line "# package" or "# module" giving the target package or module. Subsequent lines give a path through the import graph, one package per line. If the package or module is not referenced from the main module, the stanza will display a single parenthesized note indicating that fact. For example: See https://golang.org/ref/mod#go-mod-why for more about 'go mod why'. Work provides access to operations on workspaces. Note that support for workspaces is built into many other commands, not just 'go work'. See 'go help modules' for information about Go's module system of which workspaces are a part. See https://go.dev/ref/mod#workspaces for an in-depth reference on workspaces. See https://go.dev/doc/tutorial/workspaces for an introductory tutorial on workspaces. A workspace is specified by a go.work file that specifies a set of module directories with the "use" directive. These modules are used as root modules by the go command for builds and related operations. A workspace that does not specify modules to be used cannot be used to do builds from local modules. go.work files are line-oriented. Each line holds a single directive, made up of a keyword followed by arguments. For example: The leading keyword can be factored out of adjacent lines to create a block, like in Go imports. The use directive specifies a module to be included in the workspace's set of main modules. The argument to the use directive is the directory containing the module's go.mod file. The go directive specifies the version of Go the file was written at. It is possible there may be future changes in the semantics of workspaces that could be controlled by this version, but for now the version specified has no effect. The replace directive has the same syntax as the replace directive in a go.mod file and takes precedence over replaces in go.mod files. It is primarily intended to override conflicting replaces in different workspace modules. To determine whether the go command is operating in workspace mode, use the "go env GOWORK" command. 
This will specify the workspace file being used.

Usage:

The commands are:

Use "go help work <command>" for more information about a command.

Usage:

Edit provides a command-line interface for editing go.work, for use primarily by tools or scripts. It only reads go.work; it does not look up information about the modules involved. If no file is specified, Edit looks for a go.work file in the current directory and its parent directories.

The editing flags specify a sequence of editing operations.

The -fmt flag reformats the go.work file without making other changes. This reformatting is also implied by any other modifications that use or rewrite the go.work file. The only time this flag is needed is if no other flags are specified, as in 'go work edit -fmt'.

The -godebug=key=value flag adds a godebug key=value line, replacing any existing godebug lines with the given key.

The -dropgodebug=key flag drops any existing godebug lines with the given key.

The -use=path and -dropuse=path flags add and drop a use directive from the go.work file's set of module directories.

The -replace=old[@v]=new[@v] flag adds a replacement of the given module path and version pair. If the @v in old@v is omitted, a replacement without a version on the left side is added, which applies to all versions of the old module path. If the @v in new@v is omitted, the new path should be a local module root directory, not a module path. Note that -replace overrides any redundant replacements for old[@v], so omitting @v will drop existing replacements for specific versions.

The -dropreplace=old[@v] flag drops a replacement of the given module path and version pair. If the @v is omitted, a replacement without a version on the left side is dropped.

The -use, -dropuse, -replace, and -dropreplace editing flags may be repeated, and the changes are applied in the order given.

The -go=version flag sets the expected Go language version.

The -toolchain=name flag sets the Go toolchain to use.

The -print flag prints the final go.work in its text format instead of writing it back to go.work.

The -json flag prints the final go.work file in JSON format instead of writing it back to go.work. The JSON output corresponds to these Go types:

See the workspaces reference at https://go.dev/ref/mod#workspaces for more information.

Usage:

Init initializes and writes a new go.work file in the current directory, in effect creating a new workspace at the current directory.

go work init optionally accepts paths to the workspace modules as arguments. If the argument is omitted, an empty workspace with no modules will be created.

Each argument path is added to a use directive in the go.work file. The current go version will also be listed in the go.work file.

See the workspaces reference at https://go.dev/ref/mod#workspaces for more information.

Usage:

Sync syncs the workspace's build list back to the workspace's modules.

The workspace's build list is the set of versions of all the (transitive) dependency modules used to do builds in the workspace. go work sync generates that build list using the Minimal Version Selection algorithm, and then syncs those versions back to each of the modules specified in the workspace (with use directives).

The syncing is done by sequentially upgrading each of the dependency modules specified in a workspace module to the version in the build list if the dependency module's version is not already the same as the build list's version.
Note that Minimal Version Selection guarantees that the build list's version of each module is always the same or higher than that in each workspace module.

See the workspaces reference at https://go.dev/ref/mod#workspaces for more information.

Usage:

Use provides a command-line interface for adding directories, optionally recursively, to a go.work file.

A use directive will be added to the go.work file for each argument directory listed on the command line, if that directory exists on disk, or removed from the go.work file if it does not exist on disk. Use fails if any remaining use directives refer to modules that do not exist.

Use updates the go line in go.work to specify a version at least as new as all the go lines in the used modules, both preexisting ones and newly added ones. With no arguments, this update is the only thing that go work use does.

The -r flag searches recursively for modules in the argument directories, and the use command operates as if each of the directories were specified as arguments.

See the workspaces reference at https://go.dev/ref/mod#workspaces for more information.

Usage:

Vendor resets the workspace's vendor directory to include all packages needed to build and test all the workspace's packages. It does not include test code for vendored packages.

The -v flag causes vendor to print the names of vendored modules and packages to standard error.

The -e flag causes vendor to attempt to proceed despite errors encountered while loading packages.

The -o flag causes vendor to create the vendor directory at the given path instead of "vendor". The go command can only use a vendor directory named "vendor" within the module root directory, so this flag is primarily useful for other tools.

Usage:

Run compiles and runs the named main Go package. Typically the package is specified as a list of .go source files from a single directory, but it may also be an import path, file system path, or pattern matching a single known package, as in 'go run .' or 'go run my/cmd'.

If the package argument has a version suffix (like @latest or @v1.0.0), "go run" builds the program in module-aware mode, ignoring the go.mod file in the current directory or any parent directory, if there is one. This is useful for running programs without affecting the dependencies of the main module.

If the package argument doesn't have a version suffix, "go run" may run in module-aware mode or GOPATH mode, depending on the GO111MODULE environment variable and the presence of a go.mod file. See 'go help modules' for details. If module-aware mode is enabled, "go run" runs in the context of the main module.

By default, 'go run' runs the compiled binary directly: 'a.out arguments...'. If the -exec flag is given, 'go run' invokes the binary using xprog:

	'xprog a.out arguments...'

If the -exec flag is not given, GOOS or GOARCH is different from the system default, and a program named go_$GOOS_$GOARCH_exec can be found on the current search path, 'go run' invokes the binary using that program, for example 'go_js_wasm_exec a.out arguments...'. This allows execution of cross-compiled programs when a simulator or other execution method is available.

By default, 'go run' compiles the binary without generating the information used by debuggers, to reduce build time. To include debugger information in the binary, use 'go build'.

The exit status of Run is not the exit status of the compiled binary.

For more about build flags, see 'go help build'. For more about specifying packages, see 'go help packages'.

See also: go build.
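As a sketch of the two forms of 'go run' described above (the module path on the second line is purely illustrative):

	go run .                             # run the main package in the current directory
	go run example.com/cmd/tool@latest   # module-aware mode, ignoring any local go.mod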
Usage:

Telemetry is used to manage Go telemetry data and settings.

Telemetry can be in one of three modes: off, local, or on.

When telemetry is in local mode, counter data is written to the local file system, but will not be uploaded to remote servers. When telemetry is off, local counter data is neither collected nor uploaded.

When telemetry is on, telemetry data is written to the local file system and periodically sent to https://telemetry.go.dev/. Uploaded data is used to help improve the Go toolchain and related tools, and it will be published as part of a public dataset. For more details, see https://telemetry.go.dev/privacy. This data is collected in accordance with the Google Privacy Policy (https://policies.google.com/privacy).

To view the current telemetry mode, run "go telemetry". To disable telemetry uploading, but keep local data collection, run "go telemetry local". To enable both collection and uploading, run "go telemetry on". To disable both collection and uploading, run "go telemetry off".

The current telemetry mode is also available as the value of the non-settable "GOTELEMETRY" go env variable. The directory in the local file system that telemetry data is written to is available as the value of the non-settable "GOTELEMETRYDIR" go env variable.

See https://go.dev/doc/telemetry for more information on telemetry.

Usage:

'Go test' automates testing the packages named by the import paths. It prints a summary of the test results in the format

	ok   archive/tar   0.011s
	FAIL archive/zip   0.022s

followed by detailed output for each failed package.

'Go test' recompiles each package along with any files with names matching the file pattern "*_test.go". These additional files can contain test functions, benchmark functions, fuzz tests and example functions. See 'go help testfunc' for more. Each listed package causes the execution of a separate test binary. Files whose names begin with "_" (including "_test.go") or "." are ignored.

Test files that declare a package with the suffix "_test" will be compiled as a separate package, and then linked and run with the main test binary.

The go tool will ignore a directory named "testdata", making it available to hold ancillary data needed by the tests.

As part of building a test binary, go test runs go vet on the package and its test source files to identify significant problems. If go vet finds any problems, go test reports those and does not run the test binary. Only a high-confidence subset of the default go vet checks is used. That subset is: atomic, bool, buildtags, directive, errorsas, ifaceassert, nilfunc, printf, stringintconv, and tests. You can see the documentation for these and other vet tests via "go doc cmd/vet". To disable the running of go vet, use the -vet=off flag. To run all checks, use the -vet=all flag.

All test output and summary lines are printed to the go command's standard output, even if the test printed them to its own standard error. (The go command's standard error is reserved for printing errors building the tests.)

The go command places $GOROOT/bin at the beginning of $PATH in the test's environment, so that tests that execute 'go' commands use the same 'go' as the parent 'go test' command.

Go test runs in two different modes:

The first, called local directory mode, occurs when go test is invoked with no package arguments (for example, 'go test' or 'go test -v'). In this mode, go test compiles the package sources and tests found in the current directory and then runs the resulting test binary. In this mode, caching (discussed below) is disabled.
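In practice the two modes look like this (the -count=1 idiom for bypassing the cache is explained below):

	go test                  # local directory mode
	go test ./...            # package list mode
	go test -count=1 ./...   # package list mode, test caching bypassed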
After the package test finishes, go test prints a summary line showing the test status ('ok' or 'FAIL'), package name, and elapsed time. The second, called package list mode, occurs when go test is invoked with explicit package arguments (for example 'go test math', 'go test ./...', and even 'go test .'). In this mode, go test compiles and tests each of the packages listed on the command line. If a package test passes, go test prints only the final 'ok' summary line. If a package test fails, go test prints the full test output. If invoked with the -bench or -v flag, go test prints the full output even for passing package tests, in order to display the requested benchmark results or verbose logging. After the package tests for all of the listed packages finish, and their output is printed, go test prints a final 'FAIL' status if any package test has failed. In package list mode only, go test caches successful package test results to avoid unnecessary repeated running of tests. When the result of a test can be recovered from the cache, go test will redisplay the previous output instead of running the test binary again. When this happens, go test prints '(cached)' in place of the elapsed time in the summary line. The rule for a match in the cache is that the run involves the same test binary and the flags on the command line come entirely from a restricted set of 'cacheable' test flags, defined as -benchtime, -coverprofile, -cpu, -failfast, -fullpath, -list, -outputdir, -parallel, -run, -short, -skip, -timeout and -v. If a run of go test has any test or non-test flags outside this set, the result is not cached. To disable test caching, use any test flag or argument other than the cacheable flags. The idiomatic way to disable test caching explicitly is to use -count=1. Tests that open files within the package's module or that consult environment variables only match future runs in which the files and environment variables are unchanged. A cached test result is treated as executing in no time at all, so a successful package test result will be cached and reused regardless of -timeout setting. In addition to the build flags, the flags handled by 'go test' itself are: The test binary also accepts flags that control execution of the test; these flags are also accessible by 'go test'. See 'go help testflag' for details. For more about build flags, see 'go help build'. For more about specifying packages, see 'go help packages'. See also: go build, go vet. Usage: Tool runs the go tool command identified by the arguments. Go ships with a number of builtin tools, and additional tools may be defined in the go.mod of the current module. With no arguments it prints the list of known tools. The -n flag causes tool to print the command that would be executed but not execute it. The -modfile=file.mod build flag causes tool to use an alternate file instead of the go.mod in the module root directory. Tool also provides the -C, -overlay, and -modcacherw build flags. For more about build flags, see 'go help build'. For more about each builtin tool command, see 'go doc cmd/<command>'. Usage: Version prints the build information for Go binary files. Go version reports the Go version used to build each of the named files. If no files are named on the command line, go version prints its own version information. If a directory is named, go version walks that directory, recursively, looking for recognized Go binaries and reporting their versions. 
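For example (the binary path is hypothetical; the -m flag is described below):

	go version                 # report the toolchain's own version
	go version -m ./bin/tool   # include embedded module information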
By default, go version does not report unrecognized files found during a directory scan. The -v flag causes it to report unrecognized files.

The -m flag causes go version to print each file's embedded module version information, when available. In the output, the module information consists of multiple lines following the version line, each indented by a leading tab character.

The -json flag is similar to -m but outputs the runtime/debug.BuildInfo in JSON format. If flag -json is specified without -m, go version reports an error.

See also: go doc runtime/debug.BuildInfo.

Usage:

Vet runs the Go vet tool (cmd/vet) on the named packages and reports diagnostics. It supports these flags:

The -vettool=prog flag selects a different analysis tool with alternative or additional checks. For example, the 'shadow' analyzer can be built and run using these commands:

	go install golang.org/x/tools/go/analysis/passes/shadow/cmd/shadow@latest
	go vet -vettool=$(which shadow)

Alternative vet tools should be built atop golang.org/x/tools/go/analysis/unitchecker, which handles the interaction with go vet.

The default vet tool is 'go tool vet' or cmd/vet. For help on its checkers and their flags, run 'go tool vet help'. For details of a specific checker such as 'printf', see 'go tool vet help printf'.

For more about specifying packages, see 'go help packages'.

The build flags supported by go vet are those that control package resolution and execution, such as -C, -n, -x, -v, -tags, and -toolexec. For more about these flags, see 'go help build'.

See also: go fmt, go fix.

A build constraint, also known as a build tag, is a condition under which a file should be included in the package. Build constraints are given by a line comment that begins

	//go:build

Build constraints can also be used to downgrade the language version used to compile a file.

Constraints may appear in any kind of source file (not just Go), but they must appear near the top of the file, preceded only by blank lines and other comments. These rules mean that in Go files a build constraint must appear before the package clause.

To distinguish build constraints from package documentation, a build constraint should be followed by a blank line.

A build constraint comment is evaluated as an expression containing build tags combined by ||, &&, and ! operators and parentheses. Operators have the same meaning as in Go.

For example, the following build constraint constrains a file to build when the "linux" and "386" constraints are satisfied, or when "darwin" is satisfied and "cgo" is not:

	//go:build (linux && 386) || (darwin && !cgo)

It is an error for a file to have more than one //go:build line.

During a particular build, the following build tags are satisfied:

There are no separate build tags for beta or minor releases.

If a file's name, after stripping the extension and a possible _test suffix, matches any of the following patterns:

	*_GOOS
	*_GOARCH
	*_GOOS_GOARCH

(example: source_windows_amd64.go) where GOOS and GOARCH represent any known operating system and architecture values respectively, then the file is considered to have an implicit build constraint requiring those terms (in addition to any explicit constraints in the file).

Using GOOS=android matches build tags and files as for GOOS=linux in addition to android tags and files. Using GOOS=illumos matches build tags and files as for GOOS=solaris in addition to illumos tags and files. Using GOOS=ios matches build tags and files as for GOOS=darwin in addition to ios tags and files.

The defined architecture feature build tags are:

For GOARCH=amd64, arm, ppc64, ppc64le, and riscv64, a particular feature level sets the feature build tags for all previous levels as well.
For example, GOAMD64=v2 sets the amd64.v1 and amd64.v2 feature flags. This ensures that code making use of v2 features continues to compile when, say, GOAMD64=v4 is introduced. Code handling the absence of a particular feature level should use a negation:

	//go:build !amd64.v2

To keep a file from being considered for any build:

	//go:build ignore

(Any other unsatisfied word will work as well, but "ignore" is conventional.)

To build a file only when using cgo, and only on Linux and OS X:

	//go:build cgo && (linux || darwin)

Such a file is usually paired with another file implementing the default functionality for other systems, which in this case would carry the constraint:

	//go:build !(cgo && (linux || darwin))

Naming a file dns_windows.go will cause it to be included only when building the package for Windows; similarly, math_386.s will be included only when building the package for 32-bit x86.

By convention, packages with assembly implementations may provide a go-only version under the "purego" build constraint. This does not limit the use of cgo (use the "cgo" build constraint) or unsafe. For example:

Go versions 1.16 and earlier used a different syntax for build constraints, with a "// +build" prefix. The gofmt command will add an equivalent //go:build constraint when encountering the older syntax.

In modules with a Go version of 1.21 or later, if a file's build constraint has a term for a Go major release, the language version used when compiling the file will be the minimum version implied by the build constraint.

The 'go build', 'go install', and 'go test' commands take a -json flag that reports build output and failures as structured JSON output on standard output. The JSON stream is a newline-separated sequence of BuildEvent objects corresponding to the Go struct:

The ImportPath field gives the package ID of the package being built. This matches the Package.ImportPath field of go list -json and the TestEvent.FailedBuild field of go test -json. Note that it does not match TestEvent.Package.

The Action field is one of the following:

The Output field is set for Action == "build-output" and is a portion of the build's output. The concatenation of the Output fields of all output events is the exact output of the build. A single event may contain one or more lines of output and there may be more than one output event for a given ImportPath. This matches the definition of the TestEvent.Output field produced by go test -json.

For go test -json, this struct is designed so that parsers can distinguish interleaved TestEvents and BuildEvents by inspecting the Action field. Furthermore, as with TestEvent, parsers can simply concatenate the Output fields of all events to reconstruct the text format output, as it would have appeared from go build without the -json flag.

Note that there may also be non-JSON error text on standard error, even with the -json flag. Typically, this indicates an early, serious error. Consumers should be robust to this.

The 'go build' and 'go install' commands take a -buildmode argument which indicates which kind of object file is to be built. Currently supported values are:

On AIX, when linking a C program that uses a Go archive built with -buildmode=c-archive, you must pass -Wl,-bnoobjreorder to the C compiler.

There are two different ways to call between Go and C/C++ code. The first is the cgo tool, which is part of the Go distribution. For information on how to use it see the cgo documentation (go doc cmd/cgo). The second is the SWIG program, which is a general tool for interfacing between languages. For information on SWIG see https://swig.org/.
When running go build, any file with a .swig extension will be passed to SWIG. Any file with a .swigcxx extension will be passed to SWIG with the -c++ option. A package can't be just a .swig or .swigcxx file; there must be at least one .go file, even if it has just a package clause. When either cgo or SWIG is used, go build will pass any .c, .m, .s, .S or .sx files to the C compiler, and any .cc, .cpp, .cxx files to the C++ compiler. The CC or CXX environment variables may be set to determine the C or C++ compiler, respectively, to use. The go command caches build outputs for reuse in future builds. The default location for cache data is a subdirectory named go-build in the standard user cache directory for the current operating system. The cache is safe for concurrent invocations of the go command. Setting the GOCACHE environment variable overrides this default, and running 'go env GOCACHE' prints the current cache directory. The go command periodically deletes cached data that has not been used recently. Running 'go clean -cache' deletes all cached data. The build cache correctly accounts for changes to Go source files, compilers, compiler options, and so on: cleaning the cache explicitly should not be necessary in typical use. However, the build cache does not detect changes to C libraries imported with cgo. If you have made changes to the C libraries on your system, you will need to clean the cache explicitly or else use the -a build flag (see 'go help build') to force rebuilding of packages that depend on the updated C libraries. The go command also caches successful package test results. See 'go help test' for details. Running 'go clean -testcache' removes all cached test results (but not cached build results). The go command also caches values used in fuzzing with 'go test -fuzz', specifically, values that expanded code coverage when passed to a fuzz function. These values are not used for regular building and testing, but they're stored in a subdirectory of the build cache. Running 'go clean -fuzzcache' removes all cached fuzzing values. This may make fuzzing less effective, temporarily. The GODEBUG environment variable can enable printing of debugging information about the state of the cache: GODEBUG=gocacheverify=1 causes the go command to bypass the use of any cache entries and instead rebuild everything and check that the results match existing cache entries. GODEBUG=gocachehash=1 causes the go command to print the inputs for all of the content hashes it uses to construct cache lookup keys. The output is voluminous but can be useful for debugging the cache. GODEBUG=gocachetest=1 causes the go command to print details of its decisions about whether to reuse a cached test result. The go command and the tools it invokes consult environment variables for configuration. If an environment variable is unset or empty, the go command uses a sensible default setting. To see the effective setting of the variable <NAME>, run 'go env <NAME>'. To change the default setting, run 'go env -w <NAME>=<VALUE>'. Defaults changed using 'go env -w' are recorded in a Go environment configuration file stored in the per-user configuration directory, as reported by os.UserConfigDir. The location of the configuration file can be changed by setting the environment variable GOENV, and 'go env GOENV' prints the effective location, but 'go env -w' cannot change the default location. See 'go help env' for details. 
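A short sketch of inspecting and adjusting this configuration (the GOBIN value is illustrative):

	go env GOCACHE             # print the effective build cache directory
	go env -w GOBIN=$HOME/bin  # record a default in the per-user configuration file
	go env -u GOBIN            # unset the recorded default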
General-purpose environment variables:

Environment variables for use with cgo:

Architecture-specific environment variables:

Environment variables for use with code coverage:

Special-purpose environment variables:

Additional information available from 'go env' but not read from the environment:

The go command examines the contents of a restricted set of files in each directory. It identifies which files to examine based on the extension of the file name. These extensions are:

Files of each of these types except .syso may contain build constraints, but the go command stops scanning for build constraints at the first item in the file that is not a blank line or //-style line comment. See the go/build package documentation for more details.

GOAUTH is a semicolon-separated list of authentication commands for go-import and HTTPS module mirror interactions. The default is netrc. The supported authentication commands are:

	off
	netrc
	git dir
	command

Before the first HTTPS fetch, the go command will invoke each GOAUTH command in the list with no additional arguments and no input. If the server responds with any 4xx code, the go command will invoke the GOAUTH commands again with the URL as an additional command-line argument and the HTTP response to the program's stdin. If the server responds with an error again, the fetch fails: a URL-specific GOAUTH will only be attempted once per fetch.

A module version is defined by a tree of source files, with a go.mod file in its root. When the go command is run, it looks in the current directory and then successive parent directories to find the go.mod marking the root of the main (current) module.

The go.mod file format is described in detail at https://golang.org/ref/mod#go-mod-file.

To create a new go.mod file, use 'go mod init'. For details see 'go help mod init' or https://golang.org/ref/mod#go-mod-init.

To add missing module requirements or remove unneeded requirements, use 'go mod tidy'. For details, see 'go help mod tidy' or https://golang.org/ref/mod#go-mod-tidy.

To add, upgrade, downgrade, or remove a specific module requirement, use 'go get'. For details, see 'go help module-get' or https://golang.org/ref/mod#go-get.

To make other changes or to parse go.mod as JSON for use by other tools, use 'go mod edit'. See 'go help mod edit' or https://golang.org/ref/mod#go-mod-edit.

The Go path is used to resolve import statements. It is implemented by and documented in the go/build package.

The GOPATH environment variable lists places to look for Go code. On Unix, the value is a colon-separated string. On Windows, the value is a semicolon-separated string. On Plan 9, the value is a list.

If the environment variable is unset, GOPATH defaults to a subdirectory named "go" in the user's home directory ($HOME/go on Unix, %USERPROFILE%\go on Windows), unless that directory holds a Go distribution. Run "go env GOPATH" to see the current GOPATH. See https://golang.org/wiki/SettingGOPATH to set a custom GOPATH.

Each directory listed in GOPATH must have a prescribed structure:

The src directory holds source code. The path below src determines the import path or executable name.

The pkg directory holds installed package objects. As in the Go tree, each target operating system and architecture pair has its own subdirectory of pkg (pkg/GOOS_GOARCH).

If DIR is a directory listed in the GOPATH, a package with source in DIR/src/foo/bar can be imported as "foo/bar" and has its compiled form installed to "DIR/pkg/GOOS_GOARCH/foo/bar.a".
The bin directory holds compiled commands. Each command is named for its source directory, but only the final element, not the entire path. That is, the command with source in DIR/src/foo/quux is installed into DIR/bin/quux, not DIR/bin/foo/quux. The "foo/" prefix is stripped so that you can add DIR/bin to your PATH to get at the installed commands. If the GOBIN environment variable is set, commands are installed to the directory it names instead of DIR/bin. GOBIN must be an absolute path. Here's an example directory layout: Go searches each directory listed in GOPATH to find source code, but new packages are always downloaded into the first directory in the list. See https://golang.org/doc/code.html for an example. When using modules, GOPATH is no longer used for resolving imports. However, it is still used to store downloaded source code (in GOPATH/pkg/mod) and compiled commands (in GOPATH/bin). Code in or below a directory named "internal" is importable only by code in the directory tree rooted at the parent of "internal". Here's an extended version of the directory layout above: The code in z.go is imported as "foo/internal/baz", but that import statement can only appear in source files in the subtree rooted at foo. The source files foo/f.go, foo/bar/x.go, and foo/quux/y.go can all import "foo/internal/baz", but the source file crash/bang/b.go cannot. See https://golang.org/s/go14internal for details. Go 1.6 includes support for using local copies of external dependencies to satisfy imports of those dependencies, often referred to as vendoring. Code below a directory named "vendor" is importable only by code in the directory tree rooted at the parent of "vendor", and only using an import path that omits the prefix up to and including the vendor element. Here's the example from the previous section, but with the "internal" directory renamed to "vendor" and a new foo/vendor/crash/bang directory added: The same visibility rules apply as for internal, but the code in z.go is imported as "baz", not as "foo/vendor/baz". Code in vendor directories deeper in the source tree shadows code in higher directories. Within the subtree rooted at foo, an import of "crash/bang" resolves to "foo/vendor/crash/bang", not the top-level "crash/bang". Code in vendor directories is not subject to import path checking (see 'go help importpath'). When 'go get' checks out or updates a git repository, it now also updates submodules. Vendor directories do not affect the placement of new repositories being checked out for the first time by 'go get': those are always placed in the main GOPATH, never in a vendor subtree. See https://golang.org/s/go15vendor for details. A Go module proxy is any web server that can respond to GET requests for URLs of a specified form. The requests have no query parameters, so even a site serving from a fixed file system (including a file:/// URL) can be a module proxy. For details on the GOPROXY protocol, see https://golang.org/ref/mod#goproxy-protocol. An import path (see 'go help packages') denotes a package stored in the local file system. In general, an import path denotes either a standard package (such as "unicode/utf8") or a package found in one of the work spaces (For more details see: 'go help gopath'). An import path beginning with ./ or ../ is called a relative path. The toolchain supports relative import paths as a shortcut in two ways. First, a relative path can be used as a shorthand on the command line. 
If you are working in the directory containing the code imported as "unicode" and want to run the tests for "unicode/utf8", you can type "go test ./utf8" instead of needing to specify the full path. Similarly, in the reverse situation, "go test .." will test "unicode" from the "unicode/utf8" directory. Relative patterns are also allowed, like "go test ./..." to test all subdirectories. See 'go help packages' for details on the pattern syntax. Second, if you are compiling a Go program not in a work space, you can use a relative path in an import statement in that program to refer to nearby code also not in a work space. This makes it easy to experiment with small multipackage programs outside of the usual work spaces, but such programs cannot be installed with "go install" (there is no work space in which to install them), so they are rebuilt from scratch each time they are built. To avoid ambiguity, Go programs cannot use relative import paths within a work space. Certain import paths also describe how to obtain the source code for the package using a revision control system. A few common code hosting sites have special syntax. For code hosted on other servers, import paths may either be qualified with the version control type, or the go tool can dynamically fetch the import path over https/http and discover where the code resides from a <meta> tag in the HTML. To declare the code location, an import path of the form

    repository.vcs/path

specifies the given repository, with or without the .vcs suffix, using the named version control system, and then the path inside that repository. The supported version control systems are:

    Bazaar      .bzr
    Fossil      .fossil
    Git         .git
    Mercurial   .hg
    Subversion  .svn

For example,

    example.org/user/foo.hg

denotes the root directory of the Mercurial repository at example.org/user/foo, and

    example.org/repo.git/foo/bar

denotes the foo/bar directory of the Git repository at example.org/repo. When a version control system supports multiple protocols, each is tried in turn when downloading. For example, a Git download tries https://, then git+ssh://. By default, downloads are restricted to known secure protocols (e.g. https, ssh). To override this setting for Git downloads, the GIT_ALLOW_PROTOCOL environment variable can be set (for more details see 'go help environment'). If the import path is not a known code hosting site and also lacks a version control qualifier, the go tool attempts to fetch the import over https/http and looks for a <meta> tag in the document's HTML <head>. The meta tag has the form:

    <meta name="go-import" content="import-prefix vcs repo-root">

Starting in Go 1.25, an optional subdirectory is also recognized by the go command:

    <meta name="go-import" content="import-prefix vcs repo-root subdir">

The import-prefix is the import path corresponding to the repository root. It must be a prefix or an exact match of the package being fetched with "go get". If it's not an exact match, another http request is made at the prefix to verify the <meta> tags match. The meta tag should appear as early in the file as possible. In particular, it should appear before any raw JavaScript or CSS, to avoid confusing the go command's restricted parser. The vcs is one of "bzr", "fossil", "git", "hg", "svn". The repo-root is the root of the version control system containing a scheme and not containing a .vcs qualifier. The subdir specifies the directory within the repo-root where the Go module's root (including its go.mod file) is located. It allows you to organize your repository with the Go module code in a subdirectory rather than directly at the repository's root. If set, all vcs tags must be prefixed with "subdir", e.g. "subdir/v1.2.3".
"subdir/v1.2.3" For example, will result in the following requests: If that page contains the meta tag the go tool will verify that https://example.org/?go-get=1 contains the same meta tag and then download the code from the Git repository at https://code.org/r/p/exproj If that page contains the meta tag the go tool will verify that https://example.org/?go-get=1 contains the same meta tag and then download the code from the "foo/subdir" subdirectory within the Git repository at https://code.org/r/p/exproj Downloaded packages are stored in the module cache. See https://golang.org/ref/mod#module-cache. When using modules, an additional variant of the go-import meta tag is recognized and is preferred over those listing version control systems. That variant uses "mod" as the vcs in the content value, as in: This tag means to fetch modules with paths beginning with example.org from the module proxy available at the URL https://code.org/moduleproxy. See https://golang.org/ref/mod#goproxy-protocol for details about the proxy protocol. When the custom import path feature described above redirects to a known code hosting site, each of the resulting packages has two possible import paths, using the custom domain or the known hosting site. A package statement is said to have an "import comment" if it is immediately followed (before the next newline) by a comment of one of these two forms: The go command will refuse to install a package with an import comment unless it is being referred to by that import path. In this way, import comments let package authors make sure the custom import path is used and not a direct path to the underlying code hosting site. Import path checking is disabled for code found within vendor trees. This makes it possible to copy code into alternate locations in vendor trees without needing to update import comments. Import path checking is also disabled when using modules. Import path comments are obsoleted by the go.mod file's module statement. See https://golang.org/s/go14customimport for details. Modules are how Go manages dependencies. A module is a collection of packages that are released, versioned, and distributed together. Modules may be downloaded directly from version control repositories or from module proxy servers. For a series of tutorials on modules, see https://golang.org/doc/tutorial/create-module. For a detailed reference on modules, see https://golang.org/ref/mod. By default, the go command may download modules from https://proxy.golang.org. It may authenticate modules using the checksum database at https://sum.golang.org. Both services are operated by the Go team at Google. The privacy policies for these services are available at https://proxy.golang.org/privacy and https://sum.golang.org/privacy, respectively. The go command's download behavior may be configured using GOPROXY, GOSUMDB, GOPRIVATE, and other environment variables. See 'go help environment' and https://golang.org/ref/mod#private-module-privacy for more information. When the go command downloads a module zip file or go.mod file into the module cache, it computes a cryptographic hash and compares it with a known value to verify the file hasn't changed since it was first downloaded. Known hashes are stored in a file in the module root directory named go.sum. Hashes may also be downloaded from the checksum database depending on the values of GOSUMDB, GOPRIVATE, and GONOSUMDB. For details, see https://golang.org/ref/mod#authenticating. 
Many commands apply to a set of packages:

    go action [packages]

Usually, [packages] is a list of import paths. An import path that is a rooted path or that begins with a . or .. element is interpreted as a file system path and denotes the package in that directory. Otherwise, the import path P denotes the package found in the directory DIR/src/P for some DIR listed in the GOPATH environment variable (for more details see 'go help gopath'). If no import paths are given, the action applies to the package in the current directory. There are five reserved names for paths that should not be used for packages to be built with the go tool:

- "main" denotes the top-level package in a stand-alone executable.
- "all" expands to all packages in the main module (or workspace modules) and their dependencies, including dependencies needed by tests of any of those. In GOPATH mode, "all" expands to all packages found in all the GOPATH trees.
- "std" is like all but expands to just the packages in the standard Go library.
- "cmd" expands to the Go repository's commands and their internal libraries.
- "tool" expands to the tools defined in the current module's go.mod file.

Package names match against fully-qualified import paths or patterns that match against any number of import paths. For instance, "fmt" refers to the standard library's package fmt, but "http" alone would not match the import path "net/http" from the standard library. Instead, the complete import path "net/http" must be used. Import paths beginning with "cmd/" only match source code in the Go repository. An import path is a pattern if it includes one or more "..." wildcards, each of which can match any string, including the empty string and strings containing slashes. Such a pattern expands to all package directories found in the GOPATH trees with names matching the pattern. To make common patterns more convenient, there are two special cases. First, /... at the end of the pattern can match an empty string, so that net/... matches both net and packages in its subdirectories, like net/http. Second, any slash-separated pattern element containing a wildcard never participates in a match of the "vendor" element in the path of a vendored package, so that ./... does not match packages in subdirectories of ./vendor or ./mycode/vendor, but ./vendor/... and ./mycode/vendor/... do. Note, however, that a directory named vendor that itself contains code is not a vendored package: cmd/vendor would be a command named vendor, and the pattern cmd/... matches it. See golang.org/s/go15vendor for more about vendoring. An import path can also name a package to be downloaded from a remote repository. Run 'go help importpath' for details. Every package in a program must have a unique import path. By convention, this is arranged by starting each path with a unique prefix that belongs to you. For example, paths used internally at Google all begin with 'google', and paths denoting remote repositories begin with the path to the code, such as 'github.com/user/repo'. Package patterns should include this prefix. For instance, a package called 'http' residing under 'github.com/user/repo' would be addressed with the fully-qualified pattern 'github.com/user/repo/http'. Packages in a program need not have unique package names, but there are two reserved package names with special meaning. The name main indicates a command, not a library. Commands are built into binaries and cannot be imported.
The name documentation indicates documentation for a non-Go program in the directory. Files in package documentation are ignored by the go command. As a special case, if the package list is a list of .go files from a single directory, the command is applied to a single synthesized package made up of exactly those files, ignoring any build constraints in those files and ignoring any other files in the directory. Directory and file names that begin with "." or "_" are ignored by the go tool, as are directories named "testdata". The go command defaults to downloading modules from the public Go module mirror at proxy.golang.org. It also defaults to validating downloaded modules, regardless of source, against the public Go checksum database at sum.golang.org. These defaults work well for publicly available source code. The GOPRIVATE environment variable controls which modules the go command considers to be private (not available publicly) and should therefore not use the proxy or checksum database. The variable is a comma-separated list of glob patterns (in the syntax of Go's path.Match) of module path prefixes. For example,

    GOPRIVATE=*.corp.example.com,rsc.io/private

causes the go command to treat as private any module with a path prefix matching either pattern, including git.corp.example.com/xyzzy, rsc.io/private, and rsc.io/private/quux. For fine-grained control over module download and validation, the GONOPROXY and GONOSUMDB environment variables accept the same kind of glob list and override GOPRIVATE for the specific decision of whether to use the proxy and checksum database, respectively. For example, if a company ran a module proxy serving private modules, users would configure go using:

    GOPRIVATE=*.corp.example.com
    GOPROXY=proxy.example.com
    GONOPROXY=none

The GOPRIVATE variable is also used to define the "public" and "private" patterns for the GOVCS variable; see 'go help vcs'. For that usage, GOPRIVATE applies even in GOPATH mode. In that case, it matches import paths instead of module paths. The 'go env -w' command (see 'go help env') can be used to set these variables for future go command invocations. For more details, see https://golang.org/ref/mod#private-modules. The 'go test' command takes both flags that apply to 'go test' itself and flags that apply to the resulting test binary. Several of the flags control profiling and write an execution profile suitable for "go tool pprof"; run "go tool pprof -h" for more information. The -sample_index=alloc_space, -sample_index=alloc_objects, and -show_bytes options of pprof control how the information is presented. The following flags are recognized by the 'go test' command and control the execution of any test: The following flags are also recognized by 'go test' and can be used to profile the tests during execution: Each of these flags is also recognized with an optional 'test.' prefix, as in -test.v. When invoking the generated test binary (the result of 'go test -c') directly, however, the prefix is mandatory. The 'go test' command rewrites or removes recognized flags, as appropriate, both before and after the optional package list, before invoking the test binary. For instance, the command

    go test -v -myflag testdata -cpuprofile=prof.out -x

will compile the test binary and then run it as

    pkg.test -test.v -myflag testdata -test.cpuprofile=prof.out

(The -x flag is removed because it applies only to the go command's execution, not to the test itself.) The test flags that generate profiles (other than for coverage) also leave the test binary in pkg.test for use when analyzing the profiles. When 'go test' runs a test binary, it does so from within the corresponding package's source code directory.
Depending on the test, it may be necessary to do the same when invoking a generated test binary directly. Because that directory may be located within the module cache, which may be read-only and is verified by checksums, the test must not write to it or any other directory within the module unless explicitly requested by the user (such as with the -fuzz flag, which writes failures to testdata/fuzz). The command-line package list, if present, must appear before any flag not known to the go test command. Continuing the example above, the package list would have to appear before -myflag, but could appear on either side of -v. When 'go test' runs in package list mode, 'go test' caches successful package test results to avoid unnecessary repeated running of tests. To disable test caching, use any test flag or argument other than the cacheable flags. The idiomatic way to disable test caching explicitly is to use -count=1. To keep an argument for a test binary from being interpreted as a known flag or a package name, use -args (see 'go help test'), which passes the remainder of the command line through to the test binary uninterpreted and unaltered. For instance, the command

    go test -v -args -x -v

will compile the test binary and then run it as

    pkg.test -test.v -x -v

Similarly,

    go test -args math

will compile the test binary and then run it as

    pkg.test math

In the first example, the -x and the second -v are passed through to the test binary unchanged and with no effect on the go command itself. In the second example, the argument math is passed through to the test binary, instead of being interpreted as the package list. The 'go test' command expects to find test, benchmark, and example functions in the "*_test.go" files corresponding to the package under test. A test function is one named TestXxx (where Xxx does not start with a lower case letter) and should have the signature,

    func TestXxx(t *testing.T)

A benchmark function is one named BenchmarkXxx and should have the signature,

    func BenchmarkXxx(b *testing.B)

A fuzz test is one named FuzzXxx and should have the signature,

    func FuzzXxx(f *testing.F)

An example function is similar to a test function but, instead of using *testing.T to report success or failure, prints output to os.Stdout. If the last comment in the function starts with "Output:" then the output is compared exactly against the comment (see examples below). If the last comment begins with "Unordered output:" then the output is compared to the comment, however the order of the lines is ignored. An example with no such comment is compiled but not executed. An example with no text after "Output:" is compiled, executed, and expected to produce no output. Godoc displays the body of ExampleXxx to demonstrate the use of the function, constant, or variable Xxx. An example of a method M with receiver type T or *T is named ExampleT_M. There may be multiple examples for a given function, constant, or variable, distinguished by a trailing _xxx, where xxx is a suffix not beginning with an upper case letter. Here is an example of an example:

    func ExamplePrintln() {
        Println("The output of\nthis example.")
        // Output: The output of
        // this example.
    }

Here is another example where the ordering of the output is ignored:

    func ExamplePerm() {
        for _, value := range Perm(5) {
            fmt.Println(value)
        }
        // Unordered output: 4
        // 2
        // 1
        // 3
        // 0
    }

The entire test file is presented as the example when it contains a single example function, at least one other function, type, variable, or constant declaration, and no tests, benchmarks, or fuzz tests. See the documentation of the testing package for more information. The 'go get' command can run version control commands like git to download imported code.
This functionality is critical to the decentralized Go package ecosystem, in which code can be imported from any server, but it is also a potential security problem, if a malicious server finds a way to cause the invoked version control command to run unintended code. To balance the functionality and security concerns, the 'go get' command by default will only use git and hg to download code from public servers. But it will use any known version control system (bzr, fossil, git, hg, svn) to download code from private servers, defined as those hosting packages matching the GOPRIVATE variable (see 'go help private'). The rationale behind allowing only Git and Mercurial is that these two systems have had the most attention to issues of being run as clients of untrusted servers. In contrast, Bazaar, Fossil, and Subversion have primarily been used in trusted, authenticated environments and are not as well scrutinized as attack surfaces. The version control command restrictions only apply when using direct version control access to download code. When downloading modules from a proxy, 'go get' uses the proxy protocol instead, which is always permitted. By default, the 'go get' command uses the Go module mirror (proxy.golang.org) for public packages and only falls back to version control for private packages or when the mirror refuses to serve a public package (typically for legal reasons). Therefore, clients can still access public code served from Bazaar, Fossil, or Subversion repositories by default, because those downloads use the Go module mirror, which takes on the security risk of running the version control commands using a custom sandbox. The GOVCS variable can be used to change the allowed version control systems for specific packages (identified by a module or import path). The GOVCS variable applies when building packages in both module-aware mode and GOPATH mode. When using modules, the patterns match against the module path. When using GOPATH, the patterns match against the import path corresponding to the root of the version control repository. The general form of the GOVCS setting is a comma-separated list of pattern:vcslist rules. The pattern is a glob pattern that must match one or more leading elements of the module or import path. The vcslist is a pipe-separated list of allowed version control commands, or "all" to allow use of any known command, or "off" to disallow all commands. Note that if a module matches a pattern with vcslist "off", it may still be downloaded if the origin server uses the "mod" scheme, which instructs the go command to download the module using the GOPROXY protocol. The earliest matching pattern in the list applies, even if later patterns might also match. For example, consider:

    GOVCS=github.com:git,evil.com:off,*:git|hg

With this setting, code with a module or import path beginning with github.com/ can only use git; paths on evil.com cannot use any version control command, and all other paths (* matches everything) can use only git or hg. The special patterns "public" and "private" match public and private module or import paths. A path is private if it matches the GOPRIVATE variable; otherwise it is public. If no rules in the GOVCS variable match a particular module or import path, the 'go get' command applies its default rule, which can now be summarized in GOVCS notation as 'public:git|hg,private:all'.
To allow unfettered use of any version control system for any package, use:

    GOVCS=*:all

To disable all use of version control, use:

    GOVCS=*:off

The 'go env -w' command (see 'go help env') can be used to set the GOVCS variable for future go command invocations.
Package dspy is a Go implementation of the DSPy framework for using language models to solve complex tasks through composable steps and prompting techniques. DSPy-Go provides a collection of modules, optimizers, and tools for building reliable LLM-powered applications. It focuses on making it easy to compose, optimize, and extend LLM workflows.

Key Components:

Core: Fundamental abstractions like Module, Signature, LLM and Program for defining and executing LLM-based workflows.

Modules: Building blocks for composing LLM workflows:

- Predict: Basic prediction module for simple LLM interactions
- ChainOfThought: Implements step-by-step reasoning with rationale tracking
- ReAct: Implements Reasoning and Acting with tool integration
- Refine: Quality improvement through multiple attempts with reward functions and temperature variation
- Parallel: Concurrent execution wrapper for batch processing with any module
- MultiChainComparison: Compares multiple reasoning attempts and synthesizes a holistic evaluation

Optimizers: Tools for improving prompt effectiveness:

- BootstrapFewShot: Automatically selects high-quality examples for few-shot learning
- MIPRO: Multi-step interactive prompt optimization
- Copro: Collaborative prompt optimization
- SIMBA: Stochastic Introspective Mini-Batch Ascent with self-analysis
- GEPA: Generative Evolutionary Prompt Adaptation with multi-objective Pareto optimization, LLM-based self-reflection, semantic diversity metrics, and elite archive management
- TPE: Tree-structured Parzen Estimator for Bayesian optimization

Agents: Advanced patterns for building sophisticated AI systems:

- Memory: Different memory implementations for tracking conversation history
- Tools: Integration with external tools and APIs, including: Smart Tool Registry (intelligent tool selection using Bayesian inference), Performance Tracking (real-time metrics and reliability scoring), Auto-Discovery (dynamic tool registration from MCP servers), MCP (Model Context Protocol) support for seamless integrations, Tool Chaining (sequential execution of tools in pipelines with data transformation), Tool Composition (combining multiple tools into reusable composite units), Parallel Execution (advanced parallel tool execution with intelligent scheduling), and Dependency Resolution (automatic execution planning based on tool dependencies)
- Workflows: Chain (sequential execution of steps), Parallel (concurrent execution of multiple workflow steps), Router (dynamic routing based on classification), and Advanced Patterns (ForEach, While, Until loops with conditional execution)
- Orchestrator: Flexible task decomposition and execution

Integration with multiple LLM providers:

- Anthropic Claude
- Google Gemini (with multimodal support)
- OpenAI (with flexible configuration for compatible APIs)
- Ollama
- LlamaCPP

Multimodal Capabilities:

- Image Analysis: Analyze and describe images with natural language
- Vision Question Answering: Ask specific questions about visual content
- Multimodal Chat: Interactive conversations with images
- Streaming Multimodal: Real-time processing of multimodal content
- Multiple Image Analysis: Compare and analyze multiple images simultaneously
- Content Block System: Flexible handling of text, image, and future audio content

Simple Example (a sketch follows below): Prefer helper functions such as core.SetDefaultLLM, core.SetTeacherLLM, and core.GetConcurrencyLevel over mutating core.GlobalConfig directly.
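A minimal sketch of a simple program follows. The constructor and helper names used here (llms.EnsureFactory, core.ConfigureDefaultLLM, core.NewSignature, modules.NewChainOfThought with its Process method, and the core.ModelAnthropicSonnet constant) are assumptions based on the components listed above; check the repository README before relying on them.

    package main

    import (
        "context"
        "fmt"
        "log"

        "github.com/XiaoConstantine/dspy-go/pkg/core"
        "github.com/XiaoConstantine/dspy-go/pkg/llms"
        "github.com/XiaoConstantine/dspy-go/pkg/modules"
    )

    func main() {
        // Register the built-in LLM providers, then configure the default LLM
        // through a helper rather than mutating core.GlobalConfig.
        // These names are assumptions; verify them against the repository.
        llms.EnsureFactory()
        if err := core.ConfigureDefaultLLM("api-key", core.ModelAnthropicSonnet); err != nil {
            log.Fatal(err)
        }

        // A signature declares the module's named inputs and outputs.
        sig := core.NewSignature(
            []core.InputField{{Field: core.Field{Name: "question"}}},
            []core.OutputField{{Field: core.Field{Name: "answer"}}},
        )

        // ChainOfThought wraps the prediction with step-by-step reasoning.
        cot := modules.NewChainOfThought(sig)
        out, err := cot.Process(context.Background(),
            map[string]interface{}{"question": "What is the capital of France?"})
        if err != nil {
            log.Fatal(err)
        }
        fmt.Println(out["answer"])
    }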
OpenAI-Compatible API Example:

Multimodal Example:

GEPA Optimizer Example:

Advanced Features:

- Tracing and Logging: Detailed tracing and structured logging for debugging and optimization. Execution context is tracked and passed through the pipeline for debugging and analysis.
- Error Handling: Comprehensive error management with custom error types and centralized handling
- Metric-Based Optimization: Improve module performance based on custom evaluation metrics
- Smart Tool Management: Intelligent tool selection, performance tracking, auto-discovery, chaining, and composition for building complex tool workflows
- Custom Tool Integration: Extend ReAct modules with domain-specific tools
- Workflow Retry Logic: Resilient execution with configurable retry mechanisms and backoff strategies
- Streaming Support: Process LLM outputs incrementally as they're generated, with text streaming for regular LLM interactions and multimodal streaming for image analysis and vision tasks
- Data Storage: Integration with various storage backends for persistence of examples and results
- Dataset Management: Built-in support for downloading and managing datasets like GSM8K and HotPotQA
- Arrow Support: Integration with Apache Arrow for efficient data handling and processing

Working with Smart Tool Registry:

Working with Tool Chaining and Composition:

Working with Multimodal Streaming:

Working with Workflows:

For more examples and detailed documentation, visit: https://github.com/XiaoConstantine/dspy-go

DSPy-Go is released under the MIT License.
Package deps analyzes and recursively installs Go package dependencies. It is library functionality similar to `go get`.
Package grpcreplay supports the capture and replay of gRPC calls. Its main goal is to improve testing. Once you capture the calls of a test that runs against a real service, you have an "automatic mock" that can be replayed against the same test, yielding a unit test that is fast and flake-free. To record a sequence of gRPC calls to a file, create a Recorder and pass its DialOptions to grpc.Dial: It is essential to close the Recorder when the interaction is finished. There is also a NewRecorderWriter function for capturing to an arbitrary io.Writer. To replay a captured file, create a Replayer and ask it for a (fake) connection. We don't actually have to dial a server. (Since we're reading the file and not writing it, we don't have to be as careful about the error returned from Close). A test might use random or time-sensitive values, for instance to create unique resources for isolation from other tests. The test therefore has initial values, such as the current time, or a random seed, that differ from run to run. You must record this initial state and re-establish it on replay. To record the initial state, serialize it into a []byte and pass it as the second argument to NewRecorder: On replay, get the bytes from Replayer.Initial: Recorders and replayers have support for running callbacks before messages are written to or read from the replay file. A Recorder has a BeforeFunc that can modify a request or response before it is written to the replay file. The actual RPCs sent to the service during recording remain unaltered; only what is saved in the replay file can be changed. A Replayer has a BeforeFunc that can modify a request before it is sent for matching. Example uses for these callbacks include customized logging, or scrubbing data before RPCs are written to the replay file. If requests are modified by the callbacks during recording, it is important to perform the same modifications to the requests when replaying, or RPC matching on replay will fail. A common way to analyze and modify the various messages is to use a type switch. A nondeterministic program may invoke RPCs in a different order each time it is run. The order in which RPCs are called during recording may differ from the order during replay. The replayer matches incoming to recorded requests by method name and request contents, so nondeterminism is only a concern for identical requests that result in different responses. A nondeterministic program whose behavior differs depending on the order of such RPCs probably has a race condition: since both the recorded sequence of RPCs and the sequence during replay are valid orderings, the program should behave the same under both. The same is not true of streaming RPCs. The replayer matches streams only by method name, since it has no other information at the time the stream is opened. Two streams with the same method name that are started concurrently may replay in the wrong order. Besides the differences in replay mentioned above, other differences may cause issues for some programs. We list them here. The Replayer delivers a response to an RPC immediately, without waiting for other incoming RPCs. This can violate causality. For example, in a Pub/Sub program where one goroutine publishes and another subscribes, during replay the Subscribe call may finish before the Publish call begins. For streaming RPCs, the Replayer delivers the result of Send and Recv calls in the order they were recorded. No attempt is made to match message contents. 
At present, this package does not record or replay stream headers and trailers, or the result of the CloseSend method.
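To make the record/replay flow concrete, here is a sketch built on the API described above (NewRecorder, DialOptions, NewReplayer, Connection). The module import path, the nil options argument, and the Replayer's Close method are assumptions; the gRPC credential option is only there to make the dial succeed in a plain test setup.

    package main

    import (
        "log"

        "github.com/google/go-replayers/grpcreplay" // import path is an assumption
        "google.golang.org/grpc"
        "google.golang.org/grpc/credentials/insecure"
    )

    func main() {
        // Recording: dial the real server with the Recorder's DialOptions
        // so every call is captured to service.replay.
        rec, err := grpcreplay.NewRecorder("service.replay", nil) // second argument could carry initial state
        if err != nil {
            log.Fatal(err)
        }
        opts := append(rec.DialOptions(), grpc.WithTransportCredentials(insecure.NewCredentials()))
        conn, err := grpc.Dial("localhost:50051", opts...)
        if err != nil {
            log.Fatal(err)
        }
        _ = conn // exercise the service through conn here
        // Closing the Recorder is essential: it flushes the captured calls.
        if err := rec.Close(); err != nil {
            log.Fatal(err)
        }

        // Replaying: no server is dialed; responses come from the file.
        rep, err := grpcreplay.NewReplayer("service.replay")
        if err != nil {
            log.Fatal(err)
        }
        defer rep.Close()
        replayConn, err := rep.Connection()
        if err != nil {
            log.Fatal(err)
        }
        _ = replayConn // run the same test against the fake connection
    }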
Package jump implements the "jump consistent hash" algorithm. A reference C++ implementation is given in [1]. Jump consistent hash works by computing when its output changes as the number of buckets increases. Let ch(key, num_buckets) be the consistent hash for the key when there are num_buckets buckets. Clearly, for any key, k, ch(k, 1) is 0, since there is only the one bucket. In order for the consistent hash function to be balanced, ch(k, 2) will have to stay at 0 for half the keys, k, while it will have to jump to 1 for the other half. In general, ch(k, n+1) has to stay the same as ch(k, n) for n/(n+1) of the keys, and jump to n for the other 1/(n+1) of the keys. Here are examples of the consistent hash values for three keys, k1, k2, and k3, as num_buckets goes up: A linear time algorithm can be defined by using the formula for the probability of ch(key, j) jumping when j increases. It essentially walks across a row of this table. Given a key and number of buckets, the algorithm considers each successive bucket, j, from 1 to num_buckets-1, and uses ch(key, j) to compute ch(key, j+1). At each bucket, j, it decides whether to keep ch(k, j+1) the same as ch(k, j), or to jump its value to j. In order to jump for the right fraction of keys, it uses a pseudorandom number generator with the key as its seed. To jump for 1/(j+1) of keys, it generates a uniform random number between 0.0 and 1.0, and jumps if the value is less than 1/(j+1). At the end of the loop, it has computed ch(k, num_buckets), which is the desired answer. In code (see the first sketch below): We can convert this to a logarithmic time algorithm by exploiting that ch(key, j+1) is usually unchanged as j increases, only jumping occasionally. The algorithm will only compute the destinations of jumps, that is, the j's for which ch(key, j+1) ≠ ch(key, j). Also notice that for these j's, ch(key, j+1) = j. To develop the algorithm, we will treat ch(key, j) as a random variable, so that we can use the notation for random variables to analyze the fractions of keys for which various propositions are true. That will lead us to a closed form expression for a pseudorandom variable whose value gives the destination of the next jump. Suppose that the algorithm is tracking the bucket numbers of the jumps for a particular key, k. And suppose that b was the destination of the last jump, that is, ch(k, b) ≠ ch(k, b+1), and ch(k, b+1) = b. Now, we want to find the next jump, the smallest j such that ch(k, j+1) ≠ ch(k, b+1), or equivalently, the largest j such that ch(k, j) = ch(k, b+1). We will make a pseudorandom variable whose value is that j. To get a probabilistic constraint on j, note that for any bucket number, i, we have j ≥ i if and only if the consistent hash hasn't changed by i, that is, if and only if ch(k, i) = ch(k, b+1). Hence, the distribution of j must satisfy P(j ≥ i) = P( ch(k, i) = ch(k, b+1) ). Fortunately, it is easy to compute that probability. Notice that since P( ch(k, 10) = ch(k, 11) ) is 10/11, and P( ch(k, 11) = ch(k, 12) ) is 11/12, then P( ch(k, 10) = ch(k, 12) ) is 10/11 * 11/12 = 10/12. In general, if n ≥ m, P( ch(k, n) = ch(k, m) ) = m / n. Thus for any i > b, P(j ≥ i) = P( ch(k, i) = ch(k, b+1) ) = (b+1) / i. Now, we generate a pseudorandom variable, r, (depending on k and j) that is uniformly distributed between 0 and 1. Since we want P(j ≥ i) = (b+1) / i, we say that j ≥ i iff r ≤ (b+1) / i. Solving the inequality for i yields: j ≥ i iff i ≤ (b+1) / r. Since i is a lower bound on j, j will equal the largest i for which j ≥ i, thus the largest i satisfying i ≤ (b+1) / r. Thus, by the definition of the floor function, j = floor((b+1) / r).
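Here is a sketch in Go of the linear-time algorithm described above ("In code:"); the paper presents it in pseudocode, so the particular PRNG here (math/rand seeded with the key) is illustrative only.

    package jump

    import "math/rand"

    // chLinear walks buckets 1..numBuckets-1, jumping to bucket j with
    // probability 1/(j+1); b tracks ch(key, j+1) throughout the loop.
    func chLinear(key uint64, numBuckets int) int {
        rng := rand.New(rand.NewSource(int64(key)))
        b := 0
        for j := 1; j < numBuckets; j++ {
            if rng.Float64() < 1.0/float64(j+1) {
                b = j
            }
        }
        return b
    }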
Using this formula, jump consistent hash finds ch(key, num_buckets) by choosing successive jump destinations until it finds a position at or past num_buckets. It then knows that the previous jump destination is the answer. To turn this into the actual code of figure 1 in [1], we need to implement random. We want it to be fast, and yet also to have well distributed successive values. We use a 64-bit linear congruential generator; the particular multiplier we use produces random numbers that are especially well distributed in higher dimensions (i.e., when successive random values are used to form tuples). We use the key as the seed. (For keys that don't fit into 64 bits, a 64-bit hash of the key should be used.) The congruential generator updates the seed on each iteration, and the code derives a double from the current seed. Tests show that this generator has good speed and distribution. It is worth noting that unlike the algorithm of Karger et al., jump consistent hash does not require the key to be hashed if it is already an integer. This is because jump consistent hash has an embedded pseudorandom number generator that essentially rehashes the key on every iteration. The hash is not especially good (i.e., linear congruential), but since it is applied repeatedly, additional hashing of the input key is not necessary. [1] http://arxiv.org/pdf/1406.2294v1.pdf
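Putting the pieces together, the reference implementation [1] translates to Go roughly as follows; the exported name Hash is illustrative rather than mandated by this package.

    package jump

    // Hash computes ch(key, numBuckets) in O(log numBuckets) expected time.
    // The multiplier is the 64-bit linear congruential constant discussed above.
    func Hash(key uint64, numBuckets int32) int32 {
        var b, j int64 = -1, 0
        for j < int64(numBuckets) {
            b = j
            key = key*2862933555777941757 + 1
            // (key>>33)+1 lies in [1, 2^31]; dividing 2^31 by it gives
            // (b+1)/r for a uniform r in (0, 1], so j = floor((b+1)/r).
            j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
        }
        return int32(b)
    }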
Demlo is a dynamic and extensible music library organizer. It can encode, fix case, change folder hierarchy according to tags or file properties, tag from an online database, copy covers while ignoring duplicates or those below a quality threshold, and much more. It makes it possible to manage your libraries uniformly and dynamically. You can write your own rules to fit your needs best. Demlo aims at being as lightweight and portable as possible. Its major runtime dependency is the transcoder FFmpeg. The scripts are written in Lua for portability and speed while allowing virtually unlimited extensibility. For usage options, see 'demlo -h'. First, Demlo creates a list of all input files. When a folder is specified, all files matching the extensions from the 'extensions' variable will be appended to the list. Identical files are appended only once. Next, all files get analyzed:

- The audio file details (tags, stream properties, format properties, etc.) are stored in the 'input' variable. The 'output' variable gets its default values from 'input', or from an index file if one is specified on the command line. If no index has been specified and an attached cuesheet is found, all cuesheet details are appended accordingly. Cuesheet tags override stream tags, which override format tags. Finally, still without an index, tags can be retrieved from the Internet if the command-line option is set.
- If a prescript has been specified, it gets executed. It makes it possible to adjust the input values and global variables before running the other scripts.
- The scripts, if any, get executed in the lexicographic order of their basename. The 'output' variable is transformed accordingly. Scripts may contain rules such as defining a new file name, new tags, new encoding properties, etc. You can use conditions on input values to set the output properties, which makes it possible to process a full music library in a single run.
- If a postscript has been specified, it gets executed. It makes it possible to adjust the output of the scripts for the current run only.
- Demlo does some last-minute tweaking if need be: it adjusts the bitrate, the path, the encoding parameters, and so on.
- A preview of changes is displayed.
- When applying changes, the covers get copied if required and the audio file gets processed: tags are modified as specified, the file is re-encoded if required, and the output is written to the appropriate folder. When the destination already exists, the 'exist' action is executed.

The program's default behaviour can be changed from the user configuration file. (See the 'Files' section for a template.) The default values of most command-line flags can be changed. The configuration file is loaded on startup, before parsing the command-line options. Review the default values of the CLI flags with 'demlo -h'. If you wish to use no configuration file, set the environment variable DEMLORC to ".". Scripts can contain any safe Lua code. Some functions like 'os.execute' are not available for security reasons. It is not possible to print to the standard output/error unless running in debug mode and using the 'debug' function. See the 'sandbox.go' file for a list of allowed functions and variables. Lua patterns are replaced by Go regexps. See https://github.com/google/re2/wiki/Syntax. Scripts have no requirements at all. However, to be useful, they should set values of the 'output' table detailed in the 'Variables' section. You can use the full power of Lua to set the variables dynamically.
For instance: 'input' and 'output' are both accessible from any script. All default functions and variables (excluding 'output') are reset on every script call to enforce consistency. Local variables are lost from one script call to another. Global variables are preserved. Use this feature to pass data like options or new functions. The 'output' structure consistency is guaranteed at the start of every script. Demlo will only extract the fields with the right type as described in the 'Variables' section. Warning: do not abuse global variables, especially when processing non-fixed size data (e.g. tables). Data could grow big and slow down the program. By default, when the destination exists, Demlo will append a suffix to the output destination. This behaviour can be changed from the 'exist' action specified by the user. Demlo comes with a few default actions. The 'exist' action works just like scripts, with the following differences:

- Any change to 'output.path' will be skipped.
- An additional variable is accessible from the action: 'existinfo' holds the file details of the existing files in the same fashion as 'input'. This allows for comparing the input file and the existing destination.

The writing rules can be tweaked the following way: Word of caution: overwriting breaks Demlo's rule of not altering existing files. It can lead to undesired results if the overwritten file is also part of the (yet to be processed) input. The overwrite capability can be useful when syncing music libraries, however. The user scripts should be generic. Therefore they may not properly handle some uncommon input values. Tweak the input with temporary overrides from the command line. The prescript and postscript defined on the command line let you run arbitrary code before and after all other scripts, respectively. Use global variables to transfer data and parameters along. If the prescript and postscript end up being too long, consider writing a demlo script. You can also define shell aliases or use wrapper scripts as a convenience. The 'input' table describes the file. Bitrate is in bits per second (bps); that is, for 320 kbps you would specify 320000. The 'time' is the modification time of the file. It holds the sec seconds and nsec nanoseconds since January 1, 1970 UTC. The entries 'streams' and 'format' are as returned by ffprobe. It gives access to most metadata that FFmpeg can return. For instance, to get the duration of the track in seconds, query the variable 'input.format.duration'. Since there may be more than one stream (covers, other data), the first audio stream is assumed to be the music stream. For convenience, the index of the music stream is stored in 'audioindex'. The tags returned by FFmpeg are found in streams, format and in the cuesheet. To make tag queries easier, all tags are stored in the 'tags' table, with the following precedence: cuesheet tags override stream tags, which override format tags. You can remove a tag by setting it to 'nil' or the empty string. This is equivalent, except that 'nil' saves some memory during the process. The 'output' table describes the transformation to apply to the file. The 'parameters' array holds the CLI parameters passed to FFmpeg. It can be anything supported by FFmpeg, although this variable is supposed to hold encoding information. See the 'Examples' section. The 'embeddedcovers', 'externalcovers' and 'onlinecover' variables are detailed in the 'Covers' section. The 'write' variable is covered in the 'Existing destination' section.
The 'rmsrc' variable is a boolean: when true, Demlo removes the source file after processing. This can speed up the process when not re-encoding. This option is ignored for multi-track files. For convenience, the following shortcuts are provided: Demlo provides some non-standard Lua functions to ease scripting, among them: a 'debug' function that displays a message on stderr if debug mode is on; a normalization function that returns the lowercase string without non-alphanumeric characters nor leading zeros; and a relation function that returns the relation coefficient of the two input strings, a float in 0.0...1.0 where 0.0 means no relation at all and 1.0 means identical strings. A format is a container in FFmpeg's terminology. 'output.parameters' contains CLI flags passed to FFmpeg. They are meant to set the stream codec, the bitrate, etc. If 'output.parameters' is {'-c:a', 'copy'} and the format is identical, then taglib will be used instead of FFmpeg. Use this rule from a (post)script to disable encoding by setting the same format and the copy parameters. This speeds up the process. The official scripts are usually very smart at guessing the right values. They might make mistakes, however. If you are unsure, you can (and you are advised to) preview the results before proceeding. The 'diff' preview is printed to stderr. A JSON preview of the changes is printed to stdout if stdout is redirected. The initial values of the 'output' table can be completed with tags fetched from the MusicBrainz database. Audio files are fingerprinted for the queries, so even with initially wrong file names and tags, the right values should still be retrieved. The front album cover can also be retrieved. Proxy parameters will be fetched automatically from the 'http_proxy' and 'https_proxy' environment variables. As this process requires network access it can be quite slow. Nevertheless, Demlo is specifically optimized for albums, so that network queries are used for only one track per album, when possible. Some tracks can be released on different albums: Demlo tries to guess which one from the tags, but if the tags are wrong there is no way to know which one it is. There is a case where the selection can be controlled: let's assume we have tracks A, B and C from the same album Z. A and B were also released in album Y, whereas C was released in Z only. Tags for A will be checked online; let's assume it gets tagged to album Y. B will use A's details, so album Y too. Then C matches neither A's nor B's album, so another online query will be made and it will be tagged to album Z. This is slow and does not yield the expected result. Now let's process C first: tags for C will be queried online, and C will be tagged to Z. Then both A and B will match album Z, so they will be tagged using C's details, which is the desired result. Conclusion: when using online tagging, the first argument should be the lesser-known track of the album. Demlo can set the output variables according to the values set in a text file before calling the script. The input values are ignored, as is online tagging, but it is still possible to access the input table from scripts. This 'index' file is formatted in JSON. It corresponds to what Demlo outputs when printing the JSON preview. This is valid JSON except for the missing beginning and the missing end. It makes it possible to concatenate and to append to existing index files. Demlo will automatically complete the missing parts so that it becomes valid JSON.
The index file is useful when you want to edit tags manually: you can redirect the output to a file, edit the content manually with your favorite text editor, then run Demlo again with the index as argument. See the 'Examples' section. This feature can also be used to interface Demlo with other programs. Demlo can manage embedded covers as well as external covers. External covers are queried from files matching known extensions in the file's folder. Embedded covers are queried from static video streams in the file. Covers are accessed from the 'embeddedcovers', 'externalcovers' and 'onlinecover' variables. The embedded covers are indexed numerically by order of appearance in the streams. The first cover will be at index 1 and so on. This is not necessarily the index of the stream. 'inputcover' has the following structure: 'format' is the picture format. FFmpeg makes a distinction between format and codec, but it is not useful for covers. The name of the format is specified by Demlo, not by FFmpeg. Hence the 'jpeg' name, instead of 'mjpeg' as FFmpeg puts it. 'width' and 'height' hold the size in pixels. 'checksum' can be used to identify files uniquely. For performance reasons, only a partial checksum is performed. This variable is typically used for skipping duplicates. Cover transformations are specified in 'outputcover', which has the following structure: The format is specified by FFmpeg this time. See the comments on 'format' for 'inputcover'. 'parameters' is used in the same fashion as 'output.parameters'. User configuration: this must be a Lua file. See the 'demlorc' file provided with this package for an exhaustive list of options. Folder containing the official scripts: User script folder: create this folder and add your own scripts inside. This folder takes precedence over the system folder, so scripts with the same name will be found in the user folder first. The following examples will not proceed unless the '-p' command-line option is true. Important: you _must_ use single quotes for the runtime Lua command to prevent expansion. Inside the Lua code, use double quotes for strings and escape single quotes. Show default options: Preview changes made by the default scripts: Use the 'alternate' script if found in the user or system script folder (user folder first): Add the Lua file to the list of scripts. This feature is convenient if you want to write scripts that are too complex to fit on the command line, but not generic enough to fit the user or system script folders. Remove all scripts from the list, then add the '30-case' and '60-path' scripts. Note that '30-case' will be run before '60-path'. Do not use any script but '60-path'. The file content is unchanged and the file is renamed to a dynamically computed destination. Demlo performs an instant rename if the destination is on the same device; otherwise it copies the file and removes the source. Use the default scripts (if set in the configuration file), but do not re-encode: Set 'artist' to the value of 'composer', and 'title' to be preceded by the new value of 'artist', then apply the default script. Do not re-encode. Order in the runtime script matters. Mind the double quotes. Set the track number to the first number in the input file name: Use the default scripts but keep the original value for the 'artist' tag: 1) Preview the default scripts' transformation and save it to an index. 2) Edit the file to fix any potential mistakes. 3) Run Demlo over the same files using the index information only. Same as above, but generate the output filename according to the custom '61-rename' script.
The numeric prefix is important: it ensures that '61-rename' will be run after all the default tag-related scripts and after '60-path'. Otherwise, if a change in tags occurred later on, it would not affect the renaming script. Retrieve tags from the Internet: Same as above, but for a whole album, and saving the result to an index: Only download the cover for the album corresponding to the track. Use 'rmsrc' to avoid duplicating the audio file. Change tags in place with entries from MusicBrainz: Set tags to titlecase while casing AC-DC correctly: To easily switch between formats from the command line, create one script per format (see 50-encoding.lua), e.g. ogg.lua and flac.lua, then invoke the one you need for each run. Add support for non-default formats from the CLI: Overwrite the existing destination if the input is newer: See also: ffmpeg(1), ffprobe(1), http://www.lua.org/pil/contents.html
Copyright (c) 2020 Michael Saigachenko
Package analysis provides methods to work with a Swagger specification document from package circl-dev/spec. An analysed specification object (type Spec) provides methods to work with the swagger definition. Flattening a specification bundles all remote $ref in the main spec document. Depending on flattening options, additional preprocessing may take place. Mixing in several specifications merges all Swagger constructs and warns about any conflicts found. Unmarshalling a specification with Go's standard json unmarshalling may lead to unwanted results on fields that are present but empty. Swagger schemas are analyzed to determine their complexity and qualify their content.
Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include:

- Breiman and Cutler's Random Forest for Classification and Regression
- Adaptive Boosting (AdaBoost) Classification
- Gradient Boosting Tree Regression
- Entropy and Cost driven classification
- L1 regression
- Feature selection with artificial contrasts
- Proximity and model structure analysis
- Roughly balanced bagging for unbalanced classification

The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and cannot yet be released. Library documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/IlyaLab/CloudForest Documentation of command line utilities and file formats can be found in README.md, which can be viewed formatted on github: http://github.com/IlyaLab/CloudForest Pull requests and bug reports are welcome. CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi-threaded and cluster environments in mind. Go's support for function types is used to provide an interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method. This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects. Decision trees are grown with the goal of reducing "Impurity", which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface, which allows for alternative definitions of impurity. CloudForest includes several alternative targets. Additional targets can be stacked on top of these targets to add boosting functionality. Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning, and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In Go, slices pointing at the same underlying array make this sort of optimization transparent.
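As an illustration of that slice technique (a minimal sketch, not CloudForest's actual code), a split can reorder a shared slice of case indexes in place and hand back two windows onto the same backing array:

    // partitionCases reorders case indexes in place so that cases routed
    // left by the split come first, then returns left and right sub-slices
    // that share the original backing array; no allocation occurs.
    func partitionCases(cases []int, goesLeft func(c int) bool) (left, right []int) {
        i := 0
        for j := range cases {
            if goesLeft(cases[j]) {
                cases[i], cases[j] = cases[j], cases[i]
                i++
            }
        }
        return cases[:i], cases[i:]
    }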
For example, a function like the partition sketch above can return left and right slices that point to the same underlying array as the original slice of cases, but these slices should not have their values changed. Functions used while searching for the best split also accept pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items and its use can be seen in functions like BestSplit. For categorical predictors, BestSplit will also attempt to intelligently choose between four different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit, which relies on Go's sorting package. Training a random forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as they are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread-safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation, and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine, Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, a distributed database, etc. By default, CloudForest uses a fast heuristic for missing values. When proposing a split on a feature with missing data, the missing cases are removed and the impurity value is corrected to use three-way impurity, which reduces the bias towards features with lots of missing data. Missing values in the target variable are left out of impurity calculations. This provides generally good results at a fraction of the computational cost of imputing data. Optionally, feature.ImputeMissing or featurematrix.ImputeMissing can be called before forest growth to impute missing values to the feature mean/mode, which Breiman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity weighted imputation Breiman describes. Experimental support is provided for three-way splitting, which splits missing cases onto a third branch [2]. This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3].

[1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1
[2] https://code.google.com/p/rf-ace/
[3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record

In CloudForest data is stored using the FeatureMatrix struct, which contains Features.
The Feature struct implements storage and methods for both categorical and numerical data, including the calculation of impurity and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable; this allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow method that implements Breiman and Cutler's method (described above) for growing a tree. A GrowForest method is also provided that implements the rest of the method, including sampling cases, but it may be faster to grow the forest to disk as in the growforest utility. Prediction and voting are done using Tree.Vote with CatBallotBox and NumBallotBox, which implement the VoteTallyer interface.
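The parallel growth pattern that growforest is described as using might look roughly like the minimal sketch below, where the Tree type and the growTree and writeTree helpers are hypothetical stand-ins, not CloudForest's actual API:

	package main

	import "sync"

	// Tree stands in for a grown decision tree.
	type Tree struct{}

	func growTree() *Tree   { return &Tree{} } // grow one tree on a bootstrap sample
	func writeTree(t *Tree) {}                 // append the tree to a .sf file on disk

	func main() {
		const nWorkers, nTrees = 4, 100
		trees := make(chan *Tree)
		var wg sync.WaitGroup
		for w := 0; w < nWorkers; w++ {
			wg.Add(1)
			go func() {
				defer wg.Done()
				for i := 0; i < nTrees/nWorkers; i++ {
					trees <- growTree()
				}
			}()
		}
		// Close the channel once all workers are done so the writer loop ends.
		go func() { wg.Wait(); close(trees) }()
		// A single writer serializes disk access while workers run in parallel.
		for t := range trees {
			writeTree(t)
		}
	}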
Reference C++ implementation: [1]

Jump consistent hash works by computing when its output changes as the number of buckets increases. Let ch(key, num_buckets) be the consistent hash for the key when there are num_buckets buckets. Clearly, for any key, k, ch(k, 1) is 0, since there is only the one bucket. In order for the consistent hash function to be balanced, ch(k, 2) will have to stay at 0 for half the keys, k, while it will have to jump to 1 for the other half. In general, ch(k, n+1) has to stay the same as ch(k, n) for n/(n+1) of the keys, and jump to n for the other 1/(n+1) of the keys. Imagine a table of the consistent hash values for three keys, k1, k2, and k3, as num_buckets goes up.

A linear time algorithm can be defined by using the formula for the probability of ch(key, j) jumping when j increases. It essentially walks across a row of such a table. Given a key and number of buckets, the algorithm considers each successive bucket, j, from 1 to num_buckets - 1, and uses ch(key, j) to compute ch(key, j+1). At each bucket, j, it decides whether to keep ch(k, j+1) the same as ch(k, j), or to jump its value to j. In order to jump for the right fraction of keys, it uses a pseudorandom number generator with the key as its seed. To jump for 1/(j+1) of keys, it generates a uniform random number between 0.0 and 1.0, and jumps if the value is less than 1/(j+1). At the end of the loop, it has computed ch(k, num_buckets), which is the desired answer. In code, this is a single loop, sketched just after this derivation.

We can convert this to a logarithmic time algorithm by exploiting the fact that ch(key, j+1) is usually unchanged as j increases, only jumping occasionally. The algorithm will only compute the destinations of jumps, i.e. the j's for which ch(key, j+1) ≠ ch(key, j). Also notice that for these j's, ch(key, j+1) = j. To develop the algorithm, we will treat ch(key, j) as a random variable, so that we can use the notation for random variables to analyze the fractions of keys for which various propositions are true. That will lead us to a closed form expression for a pseudorandom variable whose value gives the destination of the next jump.

Suppose that the algorithm is tracking the bucket numbers of the jumps for a particular key, k. And suppose that b was the destination of the last jump, that is, ch(k, b) ≠ ch(k, b+1), and ch(k, b+1) = b. Now, we want to find the next jump, the smallest j such that ch(k, j+1) ≠ ch(k, b+1), or equivalently, the largest j such that ch(k, j) = ch(k, b+1). We will make a pseudorandom variable whose value is that j. To get a probabilistic constraint on j, note that for any bucket number, i, we have j ≥ i if and only if the consistent hash hasn't changed by i, that is, if and only if ch(k, i) = ch(k, b+1). Hence, the distribution of j must satisfy

	P(j ≥ i) = P( ch(k, i) = ch(k, b+1) )

Fortunately, it is easy to compute that probability. Notice that since P( ch(k, 10) = ch(k, 11) ) is 10/11, and P( ch(k, 11) = ch(k, 12) ) is 11/12, then P( ch(k, 10) = ch(k, 12) ) is 10/11 * 11/12 = 10/12. In general, if n ≥ m, P( ch(k, n) = ch(k, m) ) = m / n. Thus for any i > b,

	P(j ≥ i) = (b+1) / i

Now, we generate a pseudorandom variable, r, (depending on k and j) that is uniformly distributed between 0 and 1. Since we want P(j ≥ i) = (b+1) / i, we declare j ≥ i iff r ≤ (b+1) / i. Solving the inequality for i yields j ≥ i iff i ≤ (b+1) / r. Since i is a lower bound on j, j will equal the largest i for which j ≥ i, thus the largest i satisfying i ≤ (b+1) / r. Thus, by the definition of the floor function, j = floor((b+1) / r).
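For reference, the linear-time loop described above might be sketched in Go as follows, with math/rand standing in for the keyed pseudorandom generator (the paper's real implementation differs; see the transcription at the end of this section):

	package jump

	import "math/rand"

	// chLinear computes ch(key, numBuckets) in linear time by walking
	// across a row of the jump table: at each new bucket count j+1 it
	// jumps to bucket j with probability 1/(j+1).
	func chLinear(key uint64, numBuckets int32) int32 {
		rnd := rand.New(rand.NewSource(int64(key))) // seeded with the key
		var b int32                                 // tracks ch(key, j+1); ch(key, 1) = 0
		for j := int32(1); j < numBuckets; j++ {
			if rnd.Float64() < 1.0/float64(j+1) {
				b = j
			}
		}
		return b
	}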
Using the formula j = floor((b+1) / r), jump consistent hash finds ch(key, num_buckets) by choosing successive jump destinations until it finds a position at or past num_buckets. It then knows that the previous jump destination is the answer. To turn this into the actual code of figure 1, we need to implement random. We want it to be fast, yet also to have well distributed successive values. We use a 64-bit linear congruential generator; the particular multiplier we use produces random numbers that are especially well distributed in higher dimensions (i.e., when successive random values are used to form tuples). We use the key as the seed. (For keys that don't fit into 64 bits, a 64-bit hash of the key should be used.) The congruential generator updates the seed on each iteration, and the code derives a double from the current seed. Tests show that this generator has good speed and distribution.

It is worth noting that, unlike the algorithm of Karger et al., jump consistent hash does not require the key to be hashed if it is already an integer. This is because jump consistent hash has an embedded pseudorandom number generator that essentially rehashes the key on every iteration. The hash is not especially good (i.e., linear congruential), but since it is applied repeatedly, additional hashing of the input key is not necessary.

[1] http://arxiv.org/pdf/1406.2294v1.pdf
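Putting the pieces together, the logarithmic-time algorithm of figure 1 in [1] can be transcribed to Go roughly as follows (the multiplier is the paper's LCG constant; the function name is arbitrary):

	package jump

	// Hash is a Go transcription of the reference C++ implementation
	// in [1]. (key>>33)+1 lies in [1, 2^31], so the division yields the
	// uniform r in (0, 1] from the derivation, and j = floor((b+1)/r).
	func Hash(key uint64, numBuckets int32) int32 {
		var b int64 = -1
		var j int64
		for j < int64(numBuckets) {
			b = j
			key = key*2862933555777941757 + 1
			j = int64(float64(b+1) * (float64(1<<31) / float64((key>>33)+1)))
		}
		return int32(b)
	}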
Package pbc provides structures for building pairing-based cryptosystems. It is a wrapper around the Pairing-Based Cryptography (PBC) Library authored by Ben Lynn (https://crypto.stanford.edu/pbc/). This wrapper provides access to all PBC functions. It supports generation of various types of elliptic curves and pairings, element initialization, I/O, and arithmetic. These features can be used to quickly build pairing-based or conventional cryptosystems.

The PBC library is designed to be extremely fast. Internally, it uses GMP for arbitrary-precision arithmetic. It also includes a wide variety of optimizations that make pairing-based cryptography highly efficient. To improve performance, PBC does not perform type checking to ensure that operations actually make sense. The Go wrapper provides the ability to add compatibility checks to most operations, or to use unchecked elements to maximize performance.

Since this library provides low-level access to pairing primitives, it is very easy to accidentally construct insecure systems. This library is intended to be used by cryptographers or to implement well-analyzed cryptosystems.

Cryptographic pairings are defined over three mathematical groups: G1, G2, and GT, where each group is typically of the same order r. Additionally, a bilinear map e maps a pair of elements — one from G1 and another from G2 — to an element in GT. This map e has the following additional property: for g1 in G1, g2 in G2, and any integers x and y,

	e(g1^x, g2^y) = e(g1, g2)^(xy)

If G1 == G2, then a pairing is said to be symmetric. Otherwise, it is asymmetric. Pairings can be used to construct a variety of efficient cryptosystems.

The PBC library currently supports 5 different types of pairings, each with configurable parameters. These types are designated alphabetically, roughly in chronological order of introduction. Type A, D, E, F, and G pairings are implemented in the library. Each type has different time and space requirements. For more information about the types, see the documentation for the corresponding generator calls, or the PBC manual page at https://crypto.stanford.edu/pbc/manual/ch05s01.html.

This package must be compiled using cgo. It also requires the installation of GMP and PBC. During the build process, this package will attempt to include <gmp.h> and <pbc/pbc.h>, and then dynamically link to GMP and PBC. Most systems include a package for GMP; it can be installed with the system package manager in Debian / Ubuntu, with YUM on RPM-based distributions, or with Fink (http://www.finkproject.org/) on Mac OS X. For more information or to compile from source, visit https://gmplib.org/

To install the PBC library, download the appropriate files for your system from https://crypto.stanford.edu/pbc/download.html. PBC has three dependencies: the gcc compiler, flex (http://flex.sourceforge.net/), and bison (https://www.gnu.org/software/bison/). See the respective sites for installation instructions; most distributions include packages for these tools. The PBC source can then be compiled and installed using the usual GNU Build System (configure, make, make install). After installing, you may need to rebuild the search path for libraries (e.g. with ldconfig).

It is possible to install the package on Windows through the use of MinGW and MSYS. MSYS is required for installing PBC, while GMP can be installed through a package. Based on your MinGW installation, you may need to add "-I/usr/local/include" to CPPFLAGS and "-L/usr/local/lib" to LDFLAGS when building PBC. Likewise, you may need to add these options to CGO_CPPFLAGS and CGO_LDFLAGS when installing this package.
This package is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. For additional details, see the COPYING and COPYING.LESSER files.

Two examples accompany the package: the first generates a pairing and some random group elements, then applies the pairing operation; the second computes and verifies a Boneh-Lynn-Shacham signature in a simulated conversation between Alice and Bob.
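A condensed sketch of the second example, assuming this wrapper's element-style API. The import path and the method names used below (GenerateA, NewPairing, the NewG1/NewG2/NewZr/NewGT constructors, Rand, PowZn, SetFromStringHash, Pair, Equals) are assumptions to be checked against the package's godoc:

	package main

	import (
		"crypto/sha256"
		"fmt"

		"github.com/Nik-U/pbc" // assumed import path
	)

	func main() {
		// Type A pairings are symmetric; parameter sizes are illustrative.
		pairing := pbc.GenerateA(160, 512).NewPairing()

		// BLS key generation: secret x in Zr, public key g^x.
		g := pairing.NewG2().Rand()
		x := pairing.NewZr().Rand()
		pub := pairing.NewG2().PowZn(g, x)

		// Sign: hash the message into G1 and raise it to the secret key.
		h := pairing.NewG1().SetFromStringHash("message", sha256.New())
		sig := pairing.NewG1().PowZn(h, x)

		// Verify: e(h, g^x) == e(h^x, g) by bilinearity.
		left := pairing.NewGT().Pair(h, pub)
		right := pairing.NewGT().Pair(sig, g)
		fmt.Println("signature valid:", left.Equals(right))
	}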
Demlo is a dynamic and extensible music library organizer. It can encode, fix case, change folder hierarchy according to tags or file properties, tag from an online database, copy covers while ignoring duplicates or those below a quality threshold, and much more. It makes it possible to manage your libraries uniformly and dynamically. You can write your own rules to fit your needs best.

Demlo aims at being as lightweight and portable as possible. Its major runtime dependency is the transcoder FFmpeg. The scripts are written in Lua for portability and speed while allowing virtually unlimited extensibility. For usage options, see 'demlo -h'.

First, Demlo creates a list of all input files. When a folder is specified, all files matching the extensions from the 'extensions' variable will be appended to the list. Identical files are appended only once. Next, all files get analyzed:

- The audio file details (tags, stream properties, format properties, etc.) are stored into the 'input' variable. The 'output' variable gets its default values from 'input', or from an index file if specified from the command line. If no index has been specified and an attached cuesheet is found, all cuesheet details are appended accordingly. Cuesheet tags override stream tags, which override format tags. Finally, still without an index, tags can be retrieved from the Internet if the command-line option is set.

- If a prescript has been specified, it gets executed. It makes it possible to adjust the input values and global variables before running the other scripts.

- The scripts, if any, get executed in the lexicographic order of their basenames. The 'output' variable is transformed accordingly. Scripts may contain rules such as defining a new file name, new tags, new encoding properties, etc. You can use conditions on input values to set the output properties, which makes it possible to process a full music library in one single run.

- If a postscript has been specified, it gets executed. It makes it possible to adjust the output of the scripts for the current run only.

- Demlo makes some last-minute tweaking if need be: it adjusts the bitrate, the path, the encoding parameters, and so on.

- A preview of the changes is displayed.

- When applying changes, the covers get copied if required and the audio file gets processed: tags are modified as specified, the file is re-encoded if required, and the output is written to the appropriate folder. When the destination already exists, the 'exist' action is executed.

The program's default behaviour can be changed from the user configuration file. (See the 'Files' section for a template.) The default values of most command-line flags can be changed. The configuration file is loaded on startup, before parsing the command-line options. Review the default values of the CLI flags with 'demlo -h'. If you wish to use no configuration file, set the environment variable DEMLORC to ".".

Scripts can contain any safe Lua code. Some functions like 'os.execute' are not available for security reasons. It is not possible to print to the standard output/error unless running in debug mode and using the 'debug' function. See the 'sandbox.go' file for a list of allowed functions and variables. Lua patterns are replaced by Go regexps; see https://github.com/google/re2/wiki/Syntax. Scripts have no requirements at all. However, to be useful, they should set values of the 'output' table detailed in the 'Variables' section. You can use the full power of Lua to set the variables dynamically.
'input' and 'output' are both accessible from any script. All default functions and variables (excluding 'output') are reset on every script call to enforce consistency. Local variables are lost from one script call to another; global variables are preserved. Use this feature to pass data like options or new functions along. The consistency of the 'output' structure is guaranteed at the start of every script: Demlo will only extract the fields with the right type, as described in the 'Variables' section. Warning: do not abuse global variables, especially when processing non-fixed size data (e.g. tables); the data could grow big and slow down the program.

By default, when the destination exists, Demlo will append a suffix to the output destination. This behaviour can be changed from the 'exist' action specified by the user. Demlo comes with a few default actions. The 'exist' action works just like scripts, with the following differences:

- Any change to 'output.path' will be skipped.

- An additional variable is accessible from the action: 'existinfo' holds the file details of the existing files in the same fashion as 'input'. This allows for comparing the input file and the existing destination.

The writing rules can be tweaked from the 'write' variable (see the 'Existing destination' section). A word of caution: overwriting breaks Demlo's rule of not altering existing files. It can lead to undesired results if the overwritten file is also part of the (yet to be processed) input. The overwrite capability can nevertheless be useful when syncing music libraries.

The user scripts should be generic, so they may not properly handle some uncommon input values. Tweak the input with temporary overrides from the command line. The prescript and postscript defined on the command line let you run arbitrary code before and after all other scripts, respectively. Use global variables to transfer data and parameters along. If the prescript and postscript end up being too long, consider writing a demlo script. You can also define shell aliases or use wrapper scripts as a convenience.

The 'input' table describes the file. Bitrate is in bits per second (bps); for 320 kbps you would specify 320000. The 'time' entry is the modification time of the file: it holds the sec seconds and nsec nanoseconds since January 1, 1970 UTC. The 'streams' and 'format' entries are as returned by ffprobe, and give access to most metadata that FFmpeg can return. For instance, to get the duration of the track in seconds, query the variable 'input.format.duration'. Since there may be more than one stream (covers, other data), the first audio stream is assumed to be the music stream. For convenience, the index of the music stream is stored in 'audioindex'. The tags returned by FFmpeg are found in the streams, the format and the cuesheet. To make tag queries easier, all tags are stored in the 'tags' table, with the precedence described earlier: cuesheet tags over stream tags over format tags. You can remove a tag by setting it to 'nil' or the empty string; the two are equivalent, except that 'nil' saves some memory during the process.

The 'output' table describes the transformation to apply to the file. The 'parameters' array holds the CLI parameters passed to FFmpeg. It can be anything supported by FFmpeg, although this variable is supposed to hold encoding information. See the 'Examples' section. The 'embeddedcovers', 'externalcovers' and 'onlinecover' variables are detailed in the 'Covers' section. The 'write' variable is covered in the 'Existing destination' section.
The 'rmsrc' variable is a boolean: when true, Demlo removes the source file after processing. This can speed up the process when not re-encoding. This option is ignored for multi-track files. For convenience, some shortcuts are provided.

Demlo provides some non-standard Lua functions to ease scripting:

- Display a message on stderr if debug mode is on (the 'debug' function).

- Return a lowercase version of a string without non-alphanumeric characters nor leading zeros.

- Return the relation coefficient of two input strings. The result is a float in 0.0...1.0; 0.0 means no relation at all, 1.0 means identical strings.

A format is a container in FFmpeg's terminology. 'output.parameters' contains the CLI flags passed to FFmpeg; they are meant to set the stream codec, the bitrate, etc. If 'output.parameters' is {'-c:a', 'copy'} and the format is identical, then TagLib will be used instead of FFmpeg. Use this rule from a (post)script to disable encoding by setting the same format and the copy parameters. This speeds up the process.

The official scripts are usually very smart at guessing the right values, but they might make mistakes. If you are unsure, you can (and you are advised to) preview the results before proceeding. The 'diff' preview is printed to stderr. A JSON preview of the changes is printed to stdout if stdout is redirected.

The initial values of the 'output' table can be completed with tags fetched from the MusicBrainz database. Audio files are fingerprinted for the queries, so even with initially wrong file names and tags, the right values should still be retrieved. The front album cover can also be retrieved. Proxy parameters will be fetched automatically from the 'http_proxy' and 'https_proxy' environment variables. As this process requires network access, it can be quite slow. Nevertheless, Demlo is specifically optimized for albums, so that network queries are used for only one track per album when possible.

Some tracks may have been released on several albums: Demlo tries to guess the right one from the tags, but if the tags are wrong there is no way to know which one it is. There is a case where the selection can be controlled. Let's assume we have tracks A, B and C from the same album Z. A and B were also released in album Y, whereas C was released in Z only. If A is processed first, its tags will be checked online; let's assume it gets tagged to album Y. B will use A's details, so album Y too. Then C matches neither A's nor B's album, so another online query will be made and it will be tagged to album Z. This is slow and does not yield the expected result. Now let's process C first: tags for C will be queried online, and C will be tagged to Z. Then both A and B will match album Z, so they will be tagged using C's details, which is the desired result. Conclusion: when using online tagging, the first argument should be the lesser known track of the album.

Demlo can set the output variables according to the values set in a text file before calling the scripts. The input values are ignored, as is online tagging, but it is still possible to access the input table from scripts. This 'index' file is formatted in JSON and corresponds to what Demlo outputs when printing the JSON preview. The preview is valid JSON except for the missing beginning and the missing end, which makes it possible to concatenate index files and to append to existing ones; Demlo will automatically complete the missing parts so that the result becomes valid JSON.
The index file is useful when you want to edit tags manually: you can redirect the output to a file, edit the content with your favorite text editor, then run Demlo again with the index as argument. See the 'Examples' section. This feature can also be used to interface Demlo with other programs.

Demlo can manage embedded covers as well as external covers. External covers are queried from files matching known extensions in the file's folder. Embedded covers are queried from static video streams in the file. Covers are accessed from the 'embeddedcovers', 'externalcovers' and 'onlinecover' variables. The embedded covers are indexed numerically by order of appearance in the streams; the first cover is at index 1 and so on. This is not necessarily the index of the stream.

'inputcover' is a structure with the following fields. 'format' is the picture format. FFmpeg makes a distinction between format and codec, but it is not useful for covers; the name of the format is specified by Demlo, not by FFmpeg, hence the name 'jpeg' instead of 'mjpeg' as FFmpeg puts it. 'width' and 'height' hold the size in pixels. 'checksum' can be used to identify files uniquely; for performance reasons, only a partial checksum is performed. This variable is typically used for skipping duplicates.

Cover transformations are specified in 'outputcover', which has a similar structure. The format is specified by FFmpeg this time; see the comments on 'format' for 'inputcover'. 'parameters' is used in the same fashion as 'output.parameters'.

Files: the user configuration must be a Lua file; see the 'demlorc' file provided with this package for an exhaustive list of options. The official scripts are installed in a system folder. User scripts belong in the user script folder: create this folder and add your own scripts inside. The user folder takes precedence over the system folder, so scripts with the same name will be found in the user folder first.

Examples: the following examples will not proceed unless the '-p' command-line option is true. Important: you _must_ use single quotes for the runtime Lua command to prevent shell expansion. Inside the Lua code, use double quotes for strings and escape single quotes.

- Show the default options.

- Preview the changes made by the default scripts.

- Use the 'alternate' script if found in the user or system script folder (user folder first).

- Add a Lua file to the list of scripts. This feature is convenient if you want to write scripts that are too complex to fit on the command line, but not generic enough to fit the user or system script folders.

- Remove all scripts from the list, then add the '30-case' and '60-path' scripts. Note that '30-case' will be run before '60-path'.

- Do not use any script but '60-path'. The file content is unchanged and the file is renamed to a dynamically computed destination. Demlo performs an instant rename if the destination is on the same device; otherwise it copies the file and removes the source.

- Use the default scripts (if set in the configuration file), but do not re-encode.

- Set 'artist' to the value of 'composer', and 'title' to be preceded by the new value of 'artist', then apply the default scripts. Do not re-encode. Order in the runtime script matters. Mind the double quotes.

- Set the track number to the first number in the input file name.

- Use the default scripts but keep the original value of the 'artist' tag.

- 1) Preview the default scripts' transformation and save it to an index. 2) Edit the file to fix any potential mistake. 3) Run Demlo over the same files using the index information only.

- Same as above, but generate the output filename according to a custom '61-rename' script.
The numeric prefix is important: it ensures that '61-rename' will be run after all the default tag-related scripts and after '60-path'. Otherwise, if a change in tags occurred later on, it would not affect the renaming script.

More examples:

- Retrieve tags from the Internet.

- Same as above, but for a whole album, saving the result to an index.

- Only download the cover for the album corresponding to the track. Use 'rmsrc' to avoid duplicating the audio file.

- Change tags in place with entries from MusicBrainz.

- Set tags to titlecase while casing AC-DC correctly.

- To easily switch between formats from the command line, create one script per format (see 50-encoding.lua), e.g. ogg.lua and flac.lua, then add support for non-default formats from the CLI.

- Overwrite the existing destination if the input is newer.

See also: ffmpeg(1), ffprobe(1), http://www.lua.org/pil/contents.html
Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang). It includes implementations of Breiman and Cutler's Random Forest for classification and regression on heterogeneous numerical/categorical data with missing values, several related algorithms including entropy and cost driven classification, L1 regression and feature selection with artificial contrasts, and hooks for modifying the algorithms for your needs. Command line utilities to grow, apply and analyze forests are provided in subdirectories.

CloudForest is being developed in the Shmulevich Lab at the Institute for Systems Biology for use on genomic/biomedical data, with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. Documentation has been generated with godoc and can be viewed live at: http://godoc.org/github.com/ryanbressler/CloudForest Pull requests and bug reports are welcome; the code repository and issue tracker can be found at: https://github.com/ryanbressler/CloudForest

CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. It is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi-threaded and cluster environments in mind.

Go's support for function types is used to provide an interface to run code as data is percolated through a tree. This method is flexible enough that it can even extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method. This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc.) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects.

Decision trees are grown with the goal of reducing "Impurity", which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface, which allows for alternative definitions of impurity, and includes several alternative targets.

Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning, and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In Go, slices pointing at the same underlying arrays make this sort of optimization transparent. For example, a function like the SplitCases sketch shown earlier can return left and right slices that point to the same underlying array as the original slice of cases, but these slices should not have their values changed. Functions used while searching for the best split also accept pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum.
BestSplitAllocs contains pointers to these items, and its use can be seen in functions like BestSplit. For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories, implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit, which relies on Go's sorting package.

Training a random forest is an inherently parallel process, and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate goroutines. The growforest utility provides an example of this: it uses goroutines and channels to grow trees in parallel and write trees to disk as they are finished by the "worker" goroutines. The few summary statistics, like mean impurity decrease per feature (importance), can be calculated using thread-safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation, and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine, Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, a distributed database, etc.

By default CloudForest uses a fast heuristic for missing values. When proposing a split on a feature with missing data, the missing cases are removed and the impurity value is corrected to use three-way impurity, which reduces the bias towards features with lots of missing data. Missing values in the target variable are left out of impurity calculations. This provides generally good results at a fraction of the computational cost of imputing data. Optionally, Feature.ImputeMissing or FeatureMatrix.ImputeMissing can be called before forest growth to impute missing values to the feature mean/mode, which Breiman [2] suggests as a fast method for imputing values. The forest could then also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity-weighted imputation Breiman describes. Experimental support is provided for three-way splitting, which splits missing cases onto a third branch [2]. This has so far yielded mixed results in testing. At some point in the future, support may be added for local imputing of missing values during tree growth as described in [3].

[1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1
[2] https://code.google.com/p/rf-ace/
[3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record

Variable importance in CloudForest is calculated as the mean decrease in impurity over all of the splits made using a feature. To provide a baseline for evaluating importance, artificial contrast features can be used by including shuffled copies of existing features.

In CloudForest data is stored using the FeatureMatrix struct, which contains Features. The Feature struct implements storage and methods for both categorical and numerical data, including the calculation of impurity and the search for the best split.
By default CloudForest uses a fast heuristic for missing values. When proposing a split on a feature with missing data, the missing cases are removed and the impurity value is corrected to use three-way impurity, which reduces the bias towards features with lots of missing data (conceptually, the left, right and missing partitions each contribute to the split's impurity in proportion to their share of the cases). Missing values in the target variable are left out of impurity calculations. This provides generally good results at a fraction of the computational cost of imputing data.

Optionally, Feature.ImputeMissing or FeatureMatrix.ImputeMissing can be called before forest growth to impute missing values to the feature mean/mode, which Breiman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity-weighted imputation Breiman describes. Experimental support is provided for three-way splitting, which splits missing cases onto a third branch [2]; this has so far yielded mixed results in testing. At some point in the future, support may be added for local imputing of missing values during tree growth as described in [3].

[1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1
[2] https://code.google.com/p/rf-ace/
[3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record

Variable importance in CloudForest is calculated as the mean decrease in impurity over all of the splits made using a feature. To provide a baseline for evaluating importance, artificial contrast features can be used by including shuffled copies of existing features.

In CloudForest data is stored using the FeatureMatrix struct, which contains Features. The Feature struct implements storage and methods for both categorical and numerical data, calculations of impurity etc, and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable; this allows for the implementation of alternative types of regression and classification.

Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow method that implements Breiman and Cutler's method (described above) for growing a tree. A GrowForest method is also provided that implements the rest of the method, including sampling cases, but it may be faster to grow the forest to disk as in the growforest utility. Prediction and voting is done using Tree.Vote and the CatBallotBox and NumBallotBox structs, which implement the VoteTallyer interface.

When compiled with go1.1, CloudForest achieves running times similar to implementations in other languages. Using gccgo (4.8.0 at least) results in longer running times and is not recommended until full go1.1 support is implemented in gcc 4.8.1.

Development of CloudForest is being driven by our needs as we analyze large biomedical data sets. As such, new and modified analysis will be added as needed. The basic functionality has stabilized, but we have discussed several possible changes that may require additional abstraction and/or changes in the API. These include:

Allowing additional types of candidate features. Some multidimensional data types may not be best served by decomposition into categorical and numerical features. It would be possible to allow arbitrary feature types by adding CandidateFeature (which should expose BestSplit), CodedSplitter and Splitter abstractions.

Allowing data to reside anywhere. This would involve abstracting FeatureMatrix to allow database etc driven implementations.

growforest trains a forest using a number of parameters, which can be listed with -h.

applyforest applies a forest to the specified feature matrix and outputs predictions as a two column (caselabel predictedvalue) tsv.

errorrate calculates the error of a forest vs a testing data set and reports it to standard out.

leafcount outputs counts of case co-occurrence on leaf nodes (Breiman's proximity) and counts of the number of times a feature is used to split a node containing each case (a measure of relative/local importance).

CloudForest borrows the annotated feature matrix (.afm) and stochastic forest (.sf) file formats from Timo Erkkila's rf-ace, which can be found at https://code.google.com/p/rf-ace/

An annotated feature matrix (.afm) file is a tab-delimited file with column and row headers. Columns represent cases and rows represent features. A row header/feature id includes a prefix to specify the feature type. Categorical and boolean features use strings for their category labels. Missing values are represented by "?", "nan", "na", or "null" (case insensitive). A short example is sketched below.
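A hypothetical .afm fragment, consistent with the description above (the N:/C:/B: prefixes marking numerical, categorical and boolean features follow rf-ace's convention; the feature and case names are invented):

	featureid	case1	case2	case3
	N:weight	11.3	12.8	?
	C:color	red	blue	red
	B:mutated	true	false	na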
A stochastic forest (.sf) file contains a forest of decision trees. The main advantage of this format, as opposed to an established format like JSON, is that an .sf file can be written iteratively, tree by tree, and multiple .sf files can be combined with minimal logic, allowing for massively parallel growth of forests with low memory use. An .sf file consists of lines, each of which is a comma separated list of key value pairs. Lines can designate either a FOREST, TREE, or NODE. Each tree belongs to the preceding forest and each node to the preceding tree. Nodes must be written in order of increasing depth.

CloudForest generates fewer fields than rf-ace but requires the following; other fields will be ignored. Forest requires a forest type (only RF currently), target and ntrees. Tree requires only an int, and the value is ignored, though the line is needed to designate a new tree. Node requires a path encoded so that the root node is specified by "*" and each split left or right as "L" or "R". Leaf nodes should also define PRED, such as "PRED=1.5" or "PRED=red". Splitter nodes should define SPLITTER with a feature id inside of double quotes, SPLITTERTYPE=[CATEGORICAL|NUMERICAL] and an LVALUE term, which can be either a float inside of double quotes representing the highest value sent left or a ":" separated list of categorical values sent left. An example .sf file is sketched below.
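A hypothetical .sf file for a forest of two small trees, consistent with the field descriptions above (the feature names are invented):

	FOREST=RF,TARGET="N:weight",NTREES=2
	TREE=0
	NODE=*,SPLITTER="C:color",SPLITTERTYPE=CATEGORICAL,LVALUE="red:green"
	NODE=*L,PRED=12.1
	NODE=*R,PRED=9.7
	TREE=1
	NODE=*,SPLITTER="N:height",SPLITTERTYPE=NUMERICAL,LVALUE="1.75"
	NODE=*L,PRED=11.4
	NODE=*R,PRED=10.2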
CloudForest can parse and apply .sf files generated by at least some versions of rf-ace.

The idea for (and trademark of the term) Random Forests originated with Leo Breiman and Adele Cutler. Their code and papers can be found at: http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

All code in CloudForest is original, but some ideas for methods and optimizations were inspired by Timo Erkkila's rf-ace and by Andy Liaw and Matthew Wiener's randomForest R package based on Breiman and Cutler's code: https://code.google.com/p/rf-ace/ http://cran.r-project.org/web/packages/randomForest/index.html

The idea for artificial contrasts was found in Eugene Tuv, Alexander Borisov, George Runger and Kari Torkkola's paper "Feature Selection with Ensembles, Artificial Variables, and Redundancy Elimination": http://www.researchgate.net/publication/220320233_Feature_Selection_with_Ensembles_Artificial_Variables_and_Redundancy_Elimination/file/d912f5058a153a8b35.pdf

The idea for growing trees to minimize categorical entropy comes from Ross Quinlan's ID3: http://en.wikipedia.org/wiki/ID3_algorithm

"The Elements of Statistical Learning" 2nd edition by Trevor Hastie, Robert Tibshirani and Jerome Friedman was also consulted during development.

Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include:

	Breiman and Cutler's Random Forest for Classification and Regression
	Adaptive Boosting (AdaBoost) Classification
	Gradient Boosting Tree Regression
	Entropy and Cost driven classification
	L1 regression
	Feature selection with artificial contrasts
	Proximity and model structure analysis
	Roughly balanced bagging for unbalanced classification

The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and cannot yet be released.

Library documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/ryanbressler/CloudForest

Documentation of command line utilities and file formats can be found in README.md, which can be viewed formatted on GitHub: http://github.com/ryanbressler/CloudForest

Pull requests and bug reports are welcome.

CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data, with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute.

CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multithreaded and cluster environments in mind.

Go's support for function types is used to provide an interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method. This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects.

Decision trees are grown with the goal of reducing "Impurity", which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface, which allows for alternative definitions of impurity, and includes several alternative targets. Additional targets can be stacked on top of these targets to add boosting functionality.

Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning, and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In Go, slices pointing at the same underlying array make this sort of optimization transparent.
For example, a function like the splitCases sketch shown earlier can return left and right slices that point to the same underlying array as the original slice of cases, but these slices should not have their values changed. Functions used while searching for the best split also accept pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items, and its use can be seen in functions like BestSplit. For categorical predictors, BestSplit will attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories, implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit, which relies on Go's sorting package.

Training a Random Forest is an inherently parallel process, and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and reading data structures directly to/from disk. Trees can be grown in separate goroutines; the growforest utility provides an example of this that uses goroutines and channels to grow trees in parallel and write trees to disk as they are finished by the "worker" goroutines. The few summary statistics, like mean impurity decrease per feature (importance), can be calculated using thread-safe data structures like RunningMean. Trees can also be grown on separate machines: the .sf stochastic forest format allows several small forests to be combined by concatenation, and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine, Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, a distributed database etc.

By default CloudForest uses a fast heuristic for missing values. When proposing a split on a feature with missing data, the missing cases are removed and the impurity value is corrected to use three-way impurity, which reduces the bias towards features with lots of missing data. Missing values in the target variable are left out of impurity calculations. This provides generally good results at a fraction of the computational cost of imputing data. Optionally, Feature.ImputeMissing or FeatureMatrix.ImputeMissing can be called before forest growth to impute missing values to the feature mean/mode, which Breiman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity-weighted imputation Breiman describes. Experimental support is provided for three-way splitting, which splits missing cases onto a third branch [2]; this has so far yielded mixed results in testing. At some point in the future, support may be added for local imputing of missing values during tree growth as described in [3].

[1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1
[2] https://code.google.com/p/rf-ace/
[3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record

In CloudForest data is stored using the FeatureMatrix struct, which contains Features.
The Feature struct implements storage and methods for both categorical and numerical data, calculations of impurity etc, and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable; this allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow method that implements Breiman and Cutler's method (described above) for growing a tree. A GrowForest method is also provided that implements the rest of the method, including sampling cases, but it may be faster to grow the forest to disk as in the growforest utility. Prediction and voting is done using Tree.Vote and the CatBallotBox and NumBallotBox structs, which implement the VoteTallyer interface.
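The voting flow can be illustrated with a self-contained toy (the interface and types below are stand-ins, not CloudForest's actual API): each tree casts one vote per case into a ballot box, and tallying the votes yields the ensemble's prediction.

	package main

	import "fmt"

	// VoteTallyer is a toy stand-in for CloudForest's interface of the
	// same name: trees cast votes per case, and Tally reduces them.
	type VoteTallyer interface {
		Vote(caseIdx int, pred float64)
		Tally(caseIdx int) float64
	}

	// numBallotBox tallies numerical votes by averaging them.
	type numBallotBox struct {
		sums   []float64
		counts []int
	}

	var _ VoteTallyer = (*numBallotBox)(nil)

	func newNumBallotBox(nCases int) *numBallotBox {
		return &numBallotBox{sums: make([]float64, nCases), counts: make([]int, nCases)}
	}

	func (b *numBallotBox) Vote(i int, pred float64) { b.sums[i] += pred; b.counts[i]++ }
	func (b *numBallotBox) Tally(i int) float64      { return b.sums[i] / float64(b.counts[i]) }

	func main() {
		const nCases = 3
		bb := newNumBallotBox(nCases)
		// Two "trees" vote for every case; a real tree would route each
		// case to a leaf and vote that leaf's prediction.
		for _, preds := range [][]float64{{1, 2, 3}, {2, 2, 4}} {
			for i, p := range preds {
				bb.Vote(i, p)
			}
		}
		for i := 0; i < nCases; i++ {
			fmt.Printf("case %d predicted %.2f\n", i, bb.Tally(i))
		}
	}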
Package pbc provides structures for building pairing-based cryptosystems. It is a wrapper around the Pairing-Based Cryptography (PBC) Library authored by Ben Lynn (https://crypto.stanford.edu/pbc/). This wrapper provides access to all PBC functions. It supports generation of various types of elliptic curves and pairings, element initialization, I/O, and arithmetic. These features can be used to quickly build pairing-based or conventional cryptosystems.

The PBC library is designed to be extremely fast. Internally, it uses GMP for arbitrary-precision arithmetic. It also includes a wide variety of optimizations that make pairing-based cryptography highly efficient. To improve performance, PBC does not perform type checking to ensure that operations actually make sense. The Go wrapper provides the ability to add compatibility checks to most operations, or to use unchecked elements to maximize performance.

Since this library provides low-level access to pairing primitives, it is very easy to accidentally construct insecure systems. This library is intended to be used by cryptographers, or to implement well-analyzed cryptosystems.

Cryptographic pairings are defined over three mathematical groups: G1, G2, and GT, where each group is typically of the same order r. Additionally, a bilinear map e maps a pair of elements — one from G1 and another from G2 — to an element in GT. This map e has the following additional property (bilinearity): for any a and b,

	e(g1^a, g2^b) = e(g1, g2)^(a*b)

If G1 == G2, then a pairing is said to be symmetric; otherwise, it is asymmetric. Pairings can be used to construct a variety of efficient cryptosystems.

The PBC library currently supports 5 different types of pairings, each with configurable parameters. These types are designated alphabetically, roughly in chronological order of introduction. Type A, D, E, F, and G pairings are implemented in the library. Each type has different time and space requirements. For more information about the types, see the documentation for the corresponding generator calls, or the PBC manual page at https://crypto.stanford.edu/pbc/manual/ch05s01.html.

This package must be compiled using cgo. It also requires the installation of GMP and PBC. During the build process, this package will attempt to include <gmp.h> and <pbc/pbc.h>, and then dynamically link to GMP and PBC.

Most systems include a package for GMP. To install GMP in Debian / Ubuntu:

	sudo apt-get install libgmp-dev

For an RPM installation with YUM:

	sudo yum install gmp-devel

GMP can also be installed with Fink (http://www.finkproject.org/) on Mac OS X. For more information or to compile from source, visit https://gmplib.org/

To install the PBC library, download the appropriate files for your system from https://crypto.stanford.edu/pbc/download.html. PBC has three dependencies: the gcc compiler, flex (http://flex.sourceforge.net/), and bison (https://www.gnu.org/software/bison/). See the respective sites for installation instructions. Most distributions include packages for these libraries. For example, in Debian / Ubuntu:

	sudo apt-get install build-essential flex bison

The PBC source can be compiled and installed using the usual GNU Build System:

	./configure
	make
	sudo make install

After installing, you may need to rebuild the search path for libraries:

	sudo ldconfig

It is possible to install the package on Windows through the use of MinGW and MSYS. MSYS is required for installing PBC, while GMP can be installed through a package. Based on your MinGW installation, you may need to add "-I/usr/local/include" to CPPFLAGS and "-L/usr/local/lib" to LDFLAGS when building PBC. Likewise, you may need to add these options to CGO_CPPFLAGS and CGO_LDFLAGS when installing this package.
This package is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. For additional details, see the COPYING and COPYING.LESSER files.

Two examples accompany the package: the first generates a pairing and some random group elements, then applies the pairing operation; the second computes and verifies a Boneh-Lynn-Shacham signature in a simulated conversation between Alice and Bob.
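The first of these examples might look like the following sketch (the import path and the exact generator/element method names are assumptions based on the wrapper's documented style; check the package godoc for exact signatures):

	package main

	import (
		"fmt"

		"github.com/Nik-U/pbc" // import path assumed for this sketch
	)

	func main() {
		// Generate a symmetric (Type A) pairing over a 160-bit group
		// with a 512-bit base field, then pair two random elements.
		pairing := pbc.GenerateA(160, 512).NewPairing()

		g := pairing.NewG1().Rand()
		h := pairing.NewG2().Rand()
		gt := pairing.NewGT().Pair(g, h)

		fmt.Println("e(g, h) =", gt)
	}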
A reference C++ implementation is given in [1].

Jump consistent hash works by computing when its output changes as the number of buckets increases. Let ch(key, num_buckets) be the consistent hash for the key when there are num_buckets buckets. Clearly, for any key, k, ch(k, 1) is 0, since there is only the one bucket. In order for the consistent hash function to be balanced, ch(k, 2) will have to stay at 0 for half the keys, k, while it will have to jump to 1 for the other half. In general, ch(k, n+1) has to stay the same as ch(k, n) for n/(n+1) of the keys, and jump to n for the other 1/(n+1) of the keys. Here are examples of consistent hash values for three keys, k1, k2, and k3, as num_buckets goes up (the values are illustrative; a valid assignment only ever jumps to the newly added bucket):

	num_buckets:  1  2  3  4  5  6  7  8
	ch(k1, ...):  0  0  2  2  4  4  4  7
	ch(k2, ...):  0  1  1  1  1  5  5  5
	ch(k3, ...):  0  0  0  3  3  3  3  3

A linear time algorithm can be defined by using the formula for the probability of ch(key, j) jumping when j increases. It essentially walks across a row of this table. Given a key and number of buckets, the algorithm considers each successive bucket, j, from 1 to num_buckets-1, and uses ch(key, j) to compute ch(key, j+1). At each bucket, j, it decides whether to keep ch(k, j+1) the same as ch(k, j), or to jump its value to j. In order to jump for the right fraction of keys, it uses a pseudorandom number generator with the key as its seed. To jump for 1/(j+1) of keys, it generates a uniform random number between 0.0 and 1.0, and jumps if the value is less than 1/(j+1). At the end of the loop, it has computed ch(k, num_buckets), which is the desired answer. In code:
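A direct Go transcription of that linear-time algorithm (using math/rand as the key-seeded generator, since the description leaves the generator unspecified):

	package main

	import (
		"fmt"
		"math/rand"
	)

	// chLinear computes ch(key, numBuckets) in linear time: at each
	// successive bucket count j+1 it jumps to bucket j with probability
	// 1/(j+1), using a generator seeded by the key.
	func chLinear(key uint64, numBuckets int) int {
		rnd := rand.New(rand.NewSource(int64(key)))
		b := 0
		for j := 1; j < numBuckets; j++ {
			if rnd.Float64() < 1.0/float64(j+1) {
				b = j
			}
		}
		return b
	}

	func main() {
		fmt.Println(chLinear(42, 10))
	}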
We can convert this to a logarithmic time algorithm by exploiting the fact that ch(key, j+1) is usually unchanged as j increases, only jumping occasionally. The algorithm will only compute the destinations of jumps, i.e. the j's for which ch(key, j+1) ≠ ch(key, j). Also notice that for these j's, ch(key, j+1) = j. To develop the algorithm, we will treat ch(key, j) as a random variable, so that we can use the notation for random variables to analyze the fractions of keys for which various propositions are true. That will lead us to a closed form expression for a pseudorandom variable whose value gives the destination of the next jump.

Suppose that the algorithm is tracking the bucket numbers of the jumps for a particular key, k. And suppose that b was the destination of the last jump, that is, ch(k, b) ≠ ch(k, b+1), and ch(k, b+1) = b. Now, we want to find the next jump, the smallest j such that ch(k, j+1) ≠ ch(k, b+1), or equivalently, the largest j such that ch(k, j) = ch(k, b+1). We will make a pseudorandom variable whose value is that j. To get a probabilistic constraint on j, note that for any bucket number, i, we have j ≥ i if and only if the consistent hash hasn't changed by i, that is, if and only if ch(k, i) = ch(k, b+1). Hence, the distribution of j must satisfy

	P(j ≥ i) = P( ch(k, i) = ch(k, b+1) )

Fortunately, it is easy to compute that probability. Notice that since P( ch(k, 10) = ch(k, 11) ) is 10/11, and P( ch(k, 11) = ch(k, 12) ) is 11/12, then P( ch(k, 10) = ch(k, 12) ) is 10/11 * 11/12 = 10/12. In general, if n ≥ m, P( ch(k, n) = ch(k, m) ) = m / n. Thus for any i > b,

	P(j ≥ i) = P( ch(k, i) = ch(k, b+1) ) = (b+1) / i

Now, we generate a pseudorandom variable, r (depending on k and j), that is uniformly distributed between 0 and 1. Since we want P(j ≥ i) = (b+1) / i, we declare that j ≥ i iff r ≤ (b+1) / i. Solving the inequality for i yields: j ≥ i iff i ≤ (b+1) / r. Since i is a lower bound on j, j will equal the largest i for which j ≥ i, thus the largest i satisfying i ≤ (b+1) / r. Thus, by the definition of the floor function, j = floor((b+1) / r).

Using this formula, jump consistent hash finds ch(key, num_buckets) by choosing successive jump destinations until it finds a position at or past num_buckets. It then knows that the previous jump destination is the answer.

To turn this into the actual code of figure 1, we need to implement random. We want it to be fast, and also to have well distributed successive values. We use a 64-bit linear congruential generator; the particular multiplier we use produces random numbers that are especially well distributed in higher dimensions (i.e., when successive random values are used to form tuples). We use the key as the seed. (For keys that don't fit into 64 bits, a 64-bit hash of the key should be used.) The congruential generator updates the seed on each iteration, and the code derives a double from the current seed. Tests show that this generator has good speed and distribution.

It is worth noting that unlike the algorithm of Karger et al., jump consistent hash does not require the key to be hashed if it is already an integer. This is because jump consistent hash has an embedded pseudorandom number generator that essentially rehashes the key on every iteration. The hash is not especially good (i.e., linear congruential), but since it is applied repeatedly, additional hashing of the input key is not necessary.

[1] http://arxiv.org/pdf/1406.2294v1.pdf
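The complete logarithmic-time algorithm (figure 1 of [1]) can be transcribed to Go as follows; the multiplier 2862933555777941757 is the 64-bit LCG constant from the paper:

	package main

	import "fmt"

	// JumpConsistentHash returns the bucket, in [0, numBuckets), for the
	// given key. Each iteration computes the next jump destination
	// j = floor((b+1)/r) using an LCG reseeded from the key.
	func JumpConsistentHash(key uint64, numBuckets int32) int32 {
		var b, j int64 = -1, 0
		for j < int64(numBuckets) {
			b = j
			key = key*2862933555777941757 + 1
			j = int64(float64(b+1) * (float64(int64(1)<<31) / float64((key>>33)+1)))
		}
		return int32(b)
	}

	func main() {
		fmt.Println(JumpConsistentHash(42, 10))
	}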
Package grpcreplay supports the capture and replay of gRPC calls. Its main goal is to improve testing. Once you capture the calls of a test that runs against a real service, you have an "automatic mock" that can be replayed against the same test, yielding a unit test that is fast and flake-free.

To record a sequence of gRPC calls to a file, create a Recorder and pass its DialOptions to grpc.Dial. It is essential to close the Recorder when the interaction is finished. There is also a NewRecorderWriter function for capturing to an arbitrary io.Writer.

To replay a captured file, create a Replayer and ask it for a (fake) connection. We don't actually have to dial a server. (Since we're reading the file and not writing it, we don't have to be as careful about the error returned from Close.)

A test might use random or time-sensitive values, for instance to create unique resources for isolation from other tests. The test therefore has initial values, such as the current time or a random seed, that differ from run to run. You must record this initial state and re-establish it on replay. To record the initial state, serialize it into a []byte and pass it as the second argument to NewRecorder. On replay, get the bytes from Replayer.Initial.
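A minimal sketch of that flow (the calls shown follow the description above, but the import path is an assumption and the exact signatures should be checked against the package's godoc; the file name and address are placeholders):

	package main

	import (
		"log"

		"github.com/google/go-replayers/grpcreplay" // import path assumed
		"google.golang.org/grpc"
	)

	func record(addr string, initialState []byte) {
		// Per the text above, the initial state is the second argument.
		rec, err := grpcreplay.NewRecorder("service.replay", initialState)
		if err != nil {
			log.Fatal(err)
		}
		// Closing the Recorder flushes the captured calls, so the error matters.
		defer func() {
			if err := rec.Close(); err != nil {
				log.Fatal(err)
			}
		}()

		conn, err := grpc.Dial(addr, rec.DialOptions()...)
		if err != nil {
			log.Fatal(err)
		}
		_ = conn // run the test against the real service over conn
	}

	func replay() {
		rep, err := grpcreplay.NewReplayer("service.replay")
		if err != nil {
			log.Fatal(err)
		}
		defer rep.Close()

		_ = rep.Initial() // re-establish the recorded initial state from these bytes

		conn, err := rep.Connection() // a fake connection; no server is dialed
		if err != nil {
			log.Fatal(err)
		}
		_ = conn // run the same test against conn
	}

	func main() {}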
Recorders and replayers have support for running callbacks before messages are written to or read from the replay file. A Recorder has a BeforeFunc that can modify a request or response before it is written to the replay file. The actual RPCs sent to the service during recording remain unaltered; only what is saved in the replay file can be changed. A Replayer has a BeforeFunc that can modify a request before it is sent for matching. Example uses for these callbacks include customized logging, or scrubbing data before RPCs are written to the replay file. If requests are modified by the callbacks during recording, it is important to perform the same modifications to the requests when replaying, or RPC matching on replay will fail. A common way to analyze and modify the various messages is to use a type switch.

A nondeterministic program may invoke RPCs in a different order each time it is run. The order in which RPCs are called during recording may differ from the order during replay. The replayer matches incoming to recorded requests by method name and request contents, so nondeterminism is only a concern for identical requests that result in different responses. A nondeterministic program whose behavior differs depending on the order of such RPCs probably has a race condition: since both the recorded sequence of RPCs and the sequence during replay are valid orderings, the program should behave the same under both.

The same is not true of streaming RPCs. The replayer matches streams only by method name, since it has no other information at the time the stream is opened. Two streams with the same method name that are started concurrently may replay in the wrong order.

Besides the differences in replay mentioned above, other differences may cause issues for some programs. We list them here. The Replayer delivers a response to an RPC immediately, without waiting for other incoming RPCs. This can violate causality. For example, in a Pub/Sub program where one goroutine publishes and another subscribes, during replay the Subscribe call may finish before the Publish call begins. For streaming RPCs, the Replayer delivers the result of Send and Recv calls in the order they were recorded; no attempt is made to match message contents. At present, this package does not record or replay stream headers and trailers, or the result of the CloseSend method.