Benchstat computes and compares statistics about benchmarks. This package has moved. Please use https://golang.org/x/perf/cmd/benchstat
Probability distributions and statistics.
`ipfixcat` is a utility to parse and print an IPFIX stream, as defined by RFC 5101. It's also the minimal demo of how to use the github.com/calmh/ipfix package. Grab a binary release from https://github.com/calmh/ipfixcat/releases. You can also build from source. Make sure you have Go 1.1 installed. See http://golang.org/doc/install.

The output format is JSON with one object per line. Each object has fields `exportTime` (UNIX epoch seconds), `templateId` and `elements`. The latter is an array containing the information elements in the same order as received by the exporter. Each information element has the fields `name`, `enterprise`, `field`, `value` and `rawvalue`. For vendor fields that are not described by a user dictionary, `name` and `value` will be empty and `rawvalue` contains a byte array. For fully understood fields, `value` contains the parsed value and `rawvalue` is empty. There are some statistics that can be enabled as well; see `ipfixcat -help` for more information.

Parse a UDP IPFIX stream, using a custom dictionary to interpret vendor fields. Note that it might take a while to start displaying datasets, because the periodically sent template sets must be received first before datasets can be parsed. Don't attempt to use netcat (`nc`) for reading UDP streams. Almost all distributed versions are broken and truncate UDP packets at 1024 bytes.

The MIT License.
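Illustratively, one output line (wrapped here for readability; the element names, numbers and values are invented for the example, and the exact encoding of empty fields is the tool's choice) might look like:

    {"exportTime": 1396600200, "templateId": 256, "elements": [
      {"name": "sourceIPv4Address", "enterprise": 0, "field": 8,
       "value": "10.0.0.1", "rawvalue": null},
      {"name": "", "enterprise": 30902, "field": 4,
       "value": null, "rawvalue": "AAEC"}
    ]}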
Package lunk provides a set of tools for structured logging in the style of Google's Dapper or Twitter's Zipkin.

When we consider a complex event in a distributed system, we're actually considering a partially-ordered tree of events from various services, libraries, and modules. Consider a user-initiated web request. Their browser sends an HTTP request to an edge server, which extracts the credentials (e.g., OAuth token) and authenticates the request by communicating with an internal authentication service, which returns a signed set of internal credentials (e.g., signed user ID). The edge web server then proxies the request to a cluster of web servers, each running a PHP application. The PHP application loads some data from several databases, places the user in a number of treatment groups for running A/B experiments, writes some data to a Dynamo-style distributed database, and returns an HTML response. The edge server receives this response and proxies it to the user's browser.

In this scenario we have a number of infrastructure-specific events: This scenario also involves a number of events which have little to do with the infrastructure, but are still critical information for the business the system supports:

There are a number of different teams all trying to monitor and improve aspects of this system. Operational staff need to know if a particular host or service is experiencing a latency spike or drop in throughput. Development staff need to know if their application's response times have gone down as a result of a recent deploy. Customer support staff need to know if the system is operating nominally as a whole, and for customers in particular. Product designers and managers need to know the effect of an A/B test on user behavior. But the fact that these teams will be consuming the data in different ways for different purposes does not mean that they need to be served by different systems.

In order to instrument the various components of the system, we need a common data model. We adopt Dapper's notion of a tree to mean a partially-ordered tree of events from a distributed system. A tree in Lunk is identified by its root ID, which is the unique ID of its root event. All events in a common tree share a root ID. In our photo example, we would assign a unique root ID as soon as the edge server received the request.

Events inside a tree are causally ordered: each event has a unique ID, and an optional parent ID. By passing the IDs across systems, we establish causal ordering between events. In our photo example, the two database queries from the app would share the same parent ID: the ID of the event corresponding to the app handling the request which caused those queries.

Each event has a schema of properties, which allow us to record specific pieces of information about each event. For HTTP requests, we can record the method, the request URI, the elapsed time to handle the request, etc.

Lunk is agnostic in terms of aggregation technologies, but two use cases seem clear: real-time process monitoring and offline causational analysis.

For real-time process monitoring, events can be streamed to an aggregation service like Riemann (http://riemann.io) or Storm (http://storm.incubator.apache.org), which can calculate process statistics (e.g., the 95th percentile latency for the edge server responses) in real time. This allows for adaptive monitoring of all services, with the option of including example root IDs in the alerts (e.g., 95th percentile latency is over 300ms, mostly as a result of requests like those in tree XXXXX).

For offline causational analysis, events can be written in batches to batch processing systems like Hadoop or OLAP databases like Vertica. These aggregates can be queried to answer questions traditionally reserved for A/B testing systems. "Did users who were shown the new navbar view more photos?" "Did the new image optimization algorithm we enabled for 1% of views run faster? Did it produce smaller images? Did it have any effect on user engagement?" "Did any services have increased exception rates after any recent deploys?" Etc., etc.

By capturing the root ID of a particular web request, we can assemble a partially-ordered tree of events which were involved in the handling of that request. All events with a common root ID are in a common tree, which allows for O(M) retrieval for a tree of M events.

To send a request with a root ID and a parent ID, use the Event-ID HTTP header: The header value is simply the root ID and event ID, hex-encoded and separated with a slash. If the event has a parent ID, that may be included as an optional third parameter. A server that receives a request with this header can use this to properly parent its own events.

Each event has a set of named properties, the keys and values of which are strings. This allows aggregation layers to take advantage of simplifying assumptions and either store events in normalized form (with event data separate from property data) or in denormalized form (essentially pre-materializing an outer join of the normalized relations). Durations are always recorded as fractional milliseconds.

Lunk currently provides two formats for log entries: text and JSON. Text-based logs encode each entry as a single line of text, using key="value" formatting for all properties. Event property keys are scoped to avoid collisions. JSON logs encode each entry as a single JSON object.
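As a concrete illustration of that header format, here is a minimal sketch (not lunk's own API; the helper name and the uint64 ID type are invented for the example) that attaches an Event-ID value to an outgoing request:

    package main

    import (
        "fmt"
        "net/http"
    )

    // setEventID attaches the Event-ID header described above: root ID
    // and event ID, hex-encoded and separated by a slash. A parent ID
    // would be appended as an optional third slash-separated component.
    func setEventID(req *http.Request, root, id uint64) {
        req.Header.Set("Event-ID", fmt.Sprintf("%x/%x", root, id))
    }

    func main() {
        req, _ := http.NewRequest("GET", "http://example.com/", nil)
        setEventID(req, 0xdeadbeef, 0x42)
        fmt.Println(req.Header.Get("Event-ID")) // deadbeef/42
    }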
Package gomempool implements a simple memory pool for byte slices.

The Go garbage collector runs more often when more bytes are allocated. (For full details, see the runtime package documentation on the GOGC variable.) Avoiding allocations can help the stop-the-world GC run much less often.

To determine if you should use this, first deploy your code into as realistic an environment as possible. Extract a runtime.MemStats structure from your running code. Examine the various Pause fields in that structure to determine if you have a GC problem, preferably in conjunction with some other monitoring of real performance. (Remember the numbers are in nanoseconds.) If the numbers you have are OK for your use case, STOP HERE. Do not proceed. If you are generating a lot of garbage collection pauses, the next question is why. Profile the heap. If the answer is anything other than []byte slices, STOP HERE. Do not proceed. Finally, if you are indeed seeing a lot of allocations of []bytes, you may wish to proceed with using this library. gomempool is a power tool; it can save your program, but it can blow your feet off, too. (I've done both.)

That said, despite the narrow use case of this library, it can have an effect in certain situations, which I happened to encounter. I suspect the biggest use case is a network application that often allocates large messages, which is what I have, causing an otherwise relatively memory-svelte program to allocate dozens of megabytes per second of []byte buffers to process messages.

To use the pool, there are three basic steps: The following prose documentation will cover each step at length. Expand the Example below for concise usage examples and things you can copy & paste.

First, create a pool. A *Pool is obtained by calling gomempool.New. All methods on the *Pool are threadsafe. The pool can be configured via the New call. A nil pointer of type *gomempool.Pool is also a valid pool. This will use normal make() to create byte slices and simply discard the slice when asked to .Return() it. This is convenient for testing whether you've got a memory error in your code, because you can swap in a nil Pool pointer without changing any other code. If an error goes away when you do that, you have a memory error. (Probably .Return()ing something too soon.)

[]bytes in the pool come with an "Allocator" that is responsible for returning them correctly. To obtain an allocator, you call GetNewAllocator() on the pool. This returns an allocator that is not yet used. You may then call .Allocate(uint64) on it to assign a []byte from the pool. This []byte is then associated with that Allocator until you call .Return() on the allocator, after which the []byte goes back into the pool. Allocations never fail (barring a complete out-of-memory situation, of course). If the pool does not have a correctly-sized []byte on hand, it will create one. If you ask for more bytes than the pool is configured to store, the Allocator will create a transient []byte, which it will not manage. You can check whether you are invoking this case by calling .MaxSize on the pool.

You MUST NOT call .Return() until you are entirely done with the []byte. This includes shared slices you may have created; this is by far the easiest way to get in trouble with a []byte pool, as it is easy to accidentally introduce sharing without realizing it. You must also make sure not to do anything with your []byte that might cause another []byte to be created instead; for instance, using your pooled []byte in an "append" call is dangerous, because the runtime might decide to give you back a []byte that backs to an entirely different array.
In that case your []byte and your Allocator will cease to be related. If the Allocator is correctly managed, your code will not fail, but you won't be getting any benefit, either.

You may retrieve the []byte slice corresponding to an Allocator at any time by calling .Bytes(). This means that if you do need to pass the []byte and Allocator around, it suffices to pass just the Allocator. (Though, be aware that the Allocator's []byte will be of the original size you asked for. This cannot be changed, as you cannot change the original slice itself.) Allocators can be reused freely, as long as they are used correctly. However, an individual Allocator is not threadsafe. Its interaction with the Pool is, but its internal values are not; do not use the same Allocator from more than one goroutine at a time.

Once allocated, an allocator will PANIC with ErrBytesAlreadyAllocated if you try to allocate again. Once .Return() has been called, an allocator will PANIC if you try to .Return() the []byte again. If no []byte is currently allocated, .Bytes() will PANIC if called. This is fully on purpose. All of these situations represent profound errors in your code. This sort of error is just as dangerous in Go as it is in any other language. Go may not segfault, but memory management issues can still dish out the pain; better to find out earlier rather than later.

You can combine obtaining an Allocator and a particular sized []byte by calling .Allocate() on the pool.

The Allocators returned by the nil *Pool use make() to create new slices every time, and simply discard the []byte when done, but they enforce the exact same rules as the "real" Allocators, and panic in all the same places. This is so there is as little difference as possible between the two types of pools.

Optionally, return the byte slices. Calling .Return() on an allocator is optional. If a []byte is not returned to the pool, you do not get any further benefit from the pool for that []byte, but the garbage collector will still clean it up normally. This means using a pool is still feasible even if some of your code paths may need to retain a []byte for a long or complicated period of time.

If you have a byte slice that is getting passed through some goroutines, I recommend creating a structure that holds all the relevant data about the object bound together with the allocator (see the sketch below), which makes it easy to keep the two bound together and pass them around, with only the last user finally performing the deallocation. This library does not provide this structure, since all I could give you is basically the above struct, with an interface{} in it.

You can query the pool for its cache statistics by calling Stats(), which will return a structure describing how the individual buckets are performing.

Quality: At the moment I would call this alpha code. Go lint clean, go vet clean, 100% coverage in the tests. You and I both know that doesn't prove this is bug-free, but at least it shows I care.
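A minimal usage sketch, based on the method names described above (the import path, the exact signatures, Allocator being an interface type, and Allocate returning the []byte are all assumptions; the nil *Pool is used so the sketch does not depend on New's configuration parameters):

    package main

    import "github.com/thejerf/gomempool" // import path is an assumption

    // pooledMessage is the kind of wrapper struct suggested above: the
    // payload and its Allocator travel together, and only the final
    // consumer calls Return().
    type pooledMessage struct {
        data  []byte             // the pooled slice, from Allocate/Bytes
        alloc gomempool.Allocator // assumed type; Return() when fully done
    }

    func main() {
        var pool *gomempool.Pool // nil pool: uses make(), discards on Return

        alloc := pool.GetNewAllocator()
        buf := alloc.Allocate(4096) // assumed to return the new []byte

        msg := pooledMessage{data: buf, alloc: alloc}
        // ... hand msg through your pipeline; the last user calls:
        msg.alloc.Return()
    }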
Package rudd defines a concrete type for Binary Decision Diagrams (BDD), a data structure used to efficiently represent Boolean functions over a fixed set of variables or, equivalently, sets of Boolean vectors with a fixed size.

Each BDD has a fixed number of variables, Varnum, declared when it is initialized (using the method New), and each variable is represented by an (integer) index in the interval [0..Varnum), called a level. Our library supports the creation of multiple BDDs, possibly with different numbers of variables.

Most operations over a BDD return a Node; that is, a pointer to a "vertex" in the BDD that includes a variable level and the addresses of the low and high branches for this node. We use integers to represent the addresses of Nodes, with the convention that 1 (respectively 0) is the address of the constant function True (respectively False).

For the most part, the data structures and algorithms implemented in this library are a direct adaptation of those found in the C library BuDDy, developed by Jorn Lind-Nielsen; we even implemented the same examples as in the BuDDy distribution for benchmarks and regression testing.

We provide two possible implementations for BDD that can be selected using build tags. Our default implementation (without build tag) uses a standard Go runtime hashmap to encode a "unicity table". When building your executable with the build tag `buddy`, the API will switch to an implementation that is very close to the one of the BuDDy library, based on a specialized data structure that mixes a dynamic array with a hash table. To get access to better statistics about caches and garbage collection, as well as to unlock logging of some operations, you can also compile your executable with the build tag `debug`.

The library is written in pure Go, without the need for CGo or any other dependencies. Like MuDDy, an ML interface to BuDDy, we piggyback on the garbage collection mechanism offered by our host language (in our case Go). We take care of BDD resizing and memory management directly in the library, but "external" references to BDD nodes made by user code are automatically managed by the Go runtime. Unlike MuDDy, we do not provide an interface, but a genuine reimplementation of BDD in Go. As a consequence, we do not suffer from FFI overheads when calling from Go into C.

The following is an example of a callback handler, used in a call to Allnodes, that counts the number of active nodes in the whole BDD. The following is an example of a callback handler, used in a call to Allsat, that counts the number of possible assignments (such that we do not count don't cares twice). This example shows the basic usage of the package: create a BDD, compute some expressions, and output the result.
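A hedged sketch of that basic usage, following the prose above (the import path, New's exact signature, and the option-free call are assumptions; see the package's own examples for the authoritative version):

    package main

    import (
        "fmt"

        "github.com/dalzilio/rudd" // import path is an assumption
    )

    func main() {
        // A BDD over 4 variables, with levels 0..3.
        bdd, err := rudd.New(4)
        if err != nil {
            panic(err)
        }
        // Build the function x0 AND (NOT x2) from variable nodes.
        n := bdd.And(bdd.Ithvar(0), bdd.Not(bdd.Ithvar(2)))
        // Count satisfying assignments: 4 of the 16 vectors here.
        fmt.Println(bdd.Satcount(n))
    }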
Package dumpcap provides an interface to Wireshark's dumpcap tool. You can use dumpcap to find out about available network devices and their capabilities, receive live statistics about the number of packets seen on each device, and capture traffic using various options. On most BSD/Linux distributions dumpcap is suid'd, so one can avoid needing root credentials while processing captured traffic, possibly using packages like gopacket.
Package toolbox is a middleware that provides health check, pprof, profile and statistics services for Macaron.
Package couch implements a client for a CouchDB database. Version 0.1 focuses on basic operations, proper conflict management, error handling and replication. Not part of this version are attachment handling, general statistics and optimizations, change detection and creating views. Most of the features are accessible using the generic Do() function, though.

Getting started: Every document in CouchDB has to be identifiable by a document id and a revision id. Two types already implement this interface, called Identifiable: Doc and DynamicDoc. Doc can be used as an anonymous field in your own struct. DynamicDoc is a type alias for map[string]interface{}; use it when your documents have no implicit schema at all. To make the code examples easier to follow, there will be no explicit error handling in these examples, even though it's fully supported throughout the API.

Insert() will create a new document if it doesn't have an id yet: After the operation the final id and revision id will be written back to p. That's why you can now just edit p and call Insert() again, which will save the same document under a new revision. After this edit, p will contain the latest revision id. Note that it is possible that this second edit fails because someone else edited and saved the same document in the meantime. You will be notified of this in the form of an error, and you should then first retrieve the latest document revision to see the changes of this lost update: CouchDB doesn't edit documents in-place but adds a complete revision for each edit. That's why you will be correctly informed of any lost update.

Because CouchDB supports multi-master replication of databases, it is possible that conflicts like the one described above can't be avoided. CouchDB is not going to interrupt replication because of a lost update. Let's say you have two instances running, maybe a central one and a mobile one, and both are kept in sync by replication. Now let's assume you edit a document on your mobile DB and someone else edits the same document on the central DB. After you've come online again, you use bi-directional replication to sync the databases. CouchDB will now create a branch structure for your document, similar to version control systems. Your document has two conflicting revisions, and in this case they can't necessarily be resolved automatically. This client helps you with a number of methods to resolve such an issue quickly. Read more about the conflict model at http://docs.couchdb.org/en/latest/replication/conflicts.html

Continuing with the above example, replicate the database: Now, on the other database, edit the document (note that it has the same id there): Now edit the document on the first database. Retrieve it first to make sure it has the correct revision id: Now replicate anotherDB back to our first database: Now we have two conflicting versions of a document. Only you as the editor can decide whether "LatestAnna" or "AnotherAnna" is correct. To detect this conflict there are a number of methods. First, you can just ask a document: You probably want to have a look at the revisions in your preferred format; use Revisions() to unmarshal the revision data into a slice of a custom data type: Pick one of the revisions or create a new document to solve the conflict: That's it. You can detect conflicts like these throughout your database using:

Errors returned by CouchDB will be converted into a Go error. Its regular Error() method will then return a combination of the shortform (e.g. bad_request) as well as the longer and more specific description. To be able to identify a specific error within your application, use ErrorType() to get the shortform only.
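A hedged sketch of the insert-and-edit flow described above (the package's own inline examples were elided here, so this only mirrors the prose; the *couch.Database handle, its Insert method receiver, and the field names are assumptions, and errors are ignored as in the package's examples):

    package main

    import "github.com/patrickjuchli/couch" // import path is an assumption

    // Person embeds couch.Doc, which supplies the document id and
    // revision id that make it Identifiable.
    type Person struct {
        couch.Doc
        Name string `json:"name"`
    }

    // insertTwice mirrors the flow above: the first Insert creates the
    // document and writes the id and revision id back into p; the second
    // Insert saves a new revision, and fails on a lost update.
    func insertTwice(db *couch.Database) {
        p := &Person{Name: "Anna"}
        db.Insert(p)
        p.Name = "LatestAnna"
        db.Insert(p)
    }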
Package stats provides statistic utilities for Plan 9.
Package check provides helpers to complement Go's testing package. This package is like testify/assert on steroids. :)

Just wrap each (including subtests) *testing.T using check.T() and write tests as usual with the testing package. Call the new methods provided by this package to get cleaner, more concise test code and cool dump/diff.

To get optional statistics about executed checkers add: When using the goconvey tool, to get a nice diff in the web UI add:

★ How to check for errors: ★ Each check returns bool, so you can easily skip problematic code: ★ You can turn any check into an assertion to stop the test immediately: ★ You can turn all checks into assertions to stop the test immediately: ★ You can provide an extra description for each check: ★ There are short synonyms for checks implementing the usual ==, !=, etc.: ★ If you need a custom check which isn't available out-of-the-box, see the Should checker; it'll let you plug in your own checker with ease. ★ It will panic when called with an arg of the wrong type, because this means a bug in your test. ★ If you don't see colors in `go test` output it may happen for one of two reasons: either your $TERM doesn't contain the substring "color", or you're running `go test path/to/your/package`. To force colored output in the last case, just set this environment variable:

There are a few special functions (assertion, custom checkers, etc.). Everything else is just trivial (mostly) checkers which work in an obvious way and accept values of any types which make sense (and panic on everything else).
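A minimal sketch of the wrapping pattern described above (the import path and the checker method names Equal and Nil are assumptions based on the "usual ==, !=" synonyms mentioned in the prose):

    package mypkg

    import (
        "testing"

        "github.com/powerman/check" // import path is an assumption
    )

    func TestAnswer(tt *testing.T) {
        t := check.T(tt) // wrap the *testing.T, as described above

        got, err := 40+2, error(nil)
        t.Nil(err)       // check: err == nil
        t.Equal(got, 42) // check: got == 42
    }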
Package aetools helps writing, testing and analysing Google App Engine applications. The aetools package implements a simple API to export entity data from the Datastore as a JSON stream, as well as load a JSON stream back into the Datastore. This can be used as a simple way to express state in a unit test, to back up a development environment state that can be shared with team members, or to make quick batch changes to data offline, like setting up configuration entities via the Remote API. The goal is to provide both an API and a set of executable tools that use that API, allowing for maximum flexibility.

The functions Load, LoadJSON, Dump and DumpJSON operate using JSON data that represents datastore entities. Each entity is mapped to a JSON Object, where each entity property name is an Object attribute, and each property value is the corresponding Object attribute value. The property value is encoded using a JSON primitive, when possible. When the primitives are not sufficient to represent the property value, a JSON Object with the attributes "type" and "value" is used. The "type" attribute is a Datastore type, and "value" is a JSON-primitive serialization of that value. For instance, Blobs are encoded as a base64 JSON string, and time.Time values are encoded using the time.RFC3339 layout, also as strings. Datastore Keys are always encoded as a JSON Array that represents the Key Path, including ancestors, but without the application ID. This is done to make the entity key more readable and application independent. Currently, keys don't support namespaces. Multiple properties are represented as a JSON Array of the values described above. Unindexed properties are always JSON objects with the "indexed" attribute set to false. This format is intended to make use of the JSON types as much as possible, so an entity can be easily represented as a text file, suitable for reading or SCM check-in.

The exported data format can also be used as an alternative way to export from the Datastore and then load the results right into another service, such as Google BigQuery or MongoDB. The package aetools/bundle contains a sample webapp to help you manage and stream datastore entities into BigQuery. The bundle uses the aetools/bigquerysync functions to infer a useful schema from datastore statistics, and sync your entity data into BigQuery.

The command aetools/remote_api is a Remote API client that exposes the Load and Dump functions to make backup and restore of development environment state quick and easy. This tool can also help with setting up Q.A. or Production apps, but should be used with care.
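An illustrative entity encoding following the rules above (the kind, property names and values are invented; only the attributes quoted in the prose — "type", "value", "indexed", and the Key Path array — come from the text, while the key attribute name and the type strings are guesses):

    {
      "__key__": ["Profile", 123],
      "name": "Anna",
      "joined": {"type": "time.Time", "value": "2014-01-02T15:04:05Z"},
      "avatar": {"type": "blob", "value": "aGVsbG8="},
      "tags": ["go", "appengine"],
      "notes": {"indexed": false, "value": "not indexed"}
    }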
panopticon collects statistics posted to it, and records them in a sqlite3 or mysql database.
Gonum is a set of packages designed to make writing numerical and scientific algorithms productive, performant, and scalable. Gonum contains libraries for matrices and linear algebra; statistics, probability distributions, and sampling; tools for function differentiation, integration, and optimization; network creation and analysis; and more.
Package zerolog is a package that prints messages in a format the 0-Core log monitor can read and use for logging (errors), statistics, self-healing and other features.

Usage: Output:

Accepted message types may vary depending on the provided log level: a string message (e.g. LevelStdout, LevelStderr) takes strings, string aliases, types that implement fmt.Stringer, or types that implement encoding.TextMarshaler. A statistics message (e.g. LevelStatistics) takes a MsgStatistics, to have fields and validation for the data required by the 0-Core statistics monitor: https://github.com/zero-os/0-core/blob/master/docs/monitoring/stats.md

The MsgStatistics Operation field takes an AggregationType, which defines the data aggregation strategy for the 0-Core. The MsgStatistics Tags field takes a MetricTags type, which is a map with a string as key and an interface as value. When logging, this map is formatted to a flat string: if the value is a string it will simply be added to the formatted string; if not, the value is checked for the fmt.Stringer or encoding.TextMarshaler interfaces; as a last resort the value will be turned into a string using fmt.Sprint. JSON messages (e.g. LevelJSON) take any type that can be marshalled to JSON.

Package docs: https://github.com/zero-os/0-log/blob/master/README.md Information about the 0-Core monitoring: https://github.com/zero-os/0-core/blob/master/docs/monitoring/logging.md
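A hedged usage sketch based on the identifiers named above (Log's signature, the MsgStatistics field set, and the aggregation constant name are assumptions; consult the linked README for the authoritative API):

    package main

    import zerolog "github.com/zero-os/0-log" // import alias for clarity

    func main() {
        // String message on the stdout level.
        zerolog.Log(zerolog.LevelStdout, "hello world")

        // Statistics message, carrying the fields the 0-Core
        // statistics monitor requires.
        zerolog.Log(zerolog.LevelStatistics, zerolog.MsgStatistics{
            Key:       "disk.iops.read",
            Value:     42.0,
            Operation: zerolog.AggregationAverages, // assumed constant name
            Tags:      zerolog.MetricTags{"unit": "iops"},
        })
    }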
Package wifi provides access to IEEE 802.11 WiFi device actions and statistics.
Package statter is a statistics collector with multiple reporters.
Netspot is a simple IDS with statistical learning. It works either directly on a device (network interface or .pcap file) or through a server exposing a basic REST API. Basically, netspot monitors network statistics and detects abnormal events. Its core mainly relies on the SPOT algorithm (https://asiffer.github.io/libspot/), which flags extreme events on high-throughput streaming data. The following example runs netspot on the localhost interface. In particular, it monitors the ratio of SYN packets every 2 seconds. Raw data are printed to the console.
Ecosum prepares a summary of Go ecosystem pipeline results. Usage: The Go ecosystem pipeline runs analysis programs, such as new vet analyzers, on the latest versions of public Go packages. (For security reasons, it is currently only available inside Google.) Ecosum takes the raw results from an analysis and prepares a nicely formatted version that can be published to inform design discussions. Ecosum prints a report with statistics and then a random sample of 100 diagnostics. The number of diagnostics can be changed with the -n flag. A negative maximum sets no limit. By default ecosum considers all diagnostic errors in the report. The -g (grep) flag only considers diagnostics with messages matching regexp. The output is formatted as Markdown that can be pasted into a GitHub issue but is also mostly human-readable for direct use.
statshub is a repository for incrementally calculated statistics stored using a dimensional model. To run a local server for testing: Example stats updates using curl against the local server: Example stats get (for the country dimension): Example stats get (for all dimensions): See the README at https://github.com/getlantern/statshub for more information.
Command goat provides an implementation of a BitTorrent tracker, written in Go.

goat can be built using Go 1.1+. It can be downloaded, built, and installed simply by running: In addition, goat depends on a MySQL server for data storage. After creating a database and user for goat, its database schema may be imported from the SQL files located in 'res/'. goat will not run unless MySQL is installed, and a database and user are properly configured for its use.

Optionally, goat can be built to use ql (https://github.com/cznic/ql) as its storage backend. This is done by supplying the 'ql' tag in the go get command: A blank ql database file is located under 'res/ql/goat.db', and will be copied to '~/.config/goat/goat.db' on UNIX systems. goat is then able to use ql as its storage backend, for those who do not wish to use an external MySQL backend.

goat is capable of listening for torrent traffic in three modes: HTTP, HTTPS, and UDP. HTTP/HTTPS are the recommended methods, and are required in order for goat to serve its API and to allow use of private tracker passkeys.

HTTP is considered the standard mode of operation for goat. HTTP allows gathering a great number of metrics, use of passkeys, use of a client whitelist, and access to goat's RESTful API, when configured. For most trackers, this will be the only listener which is necessary in order for goat to function properly.

The HTTPS listener provides a method to encrypt traffic to the tracker, but must be used with caution. Unless the SSL certificate in use is signed by a proper certificate authority, it will distress most clients, and they may outright refuse to announce to it. If you are in possession of a certificate signed by a certificate authority, this mode may be more ideal, as it provides added security for your clients.

The UDP listener is the most unusual method of the three, and should only be used for public trackers. The BitTorrent UDP tracker protocol specifies a very specific packet format, meaning that additional information or parameters cannot be packed into a UDP datagram in a standard way. The UDP tracker may be the fastest and least bandwidth-intensive, but as stated, should only be used for public trackers.

A new feature added to goat in order to allow better interoperability with many languages is a RESTful API, which is served using the HTTP or HTTPS listeners. This API enables easy retrieval of tracker statistics, while allowing goat to run as a completely independent process. It should be noted that the API is only enabled when configured, and when an HTTP or HTTPS listener is enabled. Without a transport mechanism, the API will be inaccessible.

Currently, the API is read-only, and only allows use of the HTTP GET method. This may change in the future, but as of now, it doesn't make any sense to modify tracker parameters without doing a proper announce or scrape via a BitTorrent client. The API will feature several modes of authentication, including HTTP Basic and HMAC-SHA1. For the time being, only HTTP Basic is implemented. This method makes use of a username/password pair consisting of the user's username and an API key as the password.

This list contains all API calls currently recognized by goat. Each call must be authenticated using the aforementioned methods. Retrieve a list of all files tracked by goat. Some extended attributes are not added, to reduce strain on the database and to provide a more general overview. Retrieve extended attributes about a specific file with matching ID. This provides counts for the number of completions, seeders, and leechers, and a list of fileUser relationships associated with a given file. Retrieve a variety of metrics about the current status of goat, including its PID, hostname, memory usage, number of HTTP/UDP hits, etc.

goat is configured using a JSON file, which will be created under '~/.config/goat/config.json' on UNIX systems. Here is an example configuration, describing the settings available to the user.
Generating random text: a Markov chain algorithm

Based on the program presented in the "Design and Implementation" chapter of The Practice of Programming (Kernighan and Pike, Addison-Wesley 1999). See also Computer Recreations, Scientific American 260, 122 - 125 (1989).

A Markov chain algorithm generates text by creating a statistical model of potential textual suffixes for a given prefix. Consider this text: Our Markov chain algorithm would arrange this text into this set of prefixes and suffixes, or "chain": (This table assumes a prefix length of two words.) To generate text using this table we select an initial prefix ("I am", for example), choose one of the suffixes associated with that prefix at random with probability determined by the input statistics ("a"), and then create a new prefix by removing the first word from the prefix and appending the suffix (making the new prefix "am a"). Repeat this process until we can't find any suffixes for the current prefix or we exceed the word limit. (The word limit is necessary as the chain table may contain cycles.)

Our version of this program reads text from standard input, parsing it into a Markov chain, and writes generated text to standard output. The prefix and output lengths can be specified using the -prefix and -words flags on the command line.
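A minimal sketch of the chain-building step just described (not the program's actual code; a fixed prefix length of two words is assumed):

    package main

    import (
        "fmt"
        "strings"
    )

    // Chain maps a two-word prefix to the suffixes seen after it. Keeping
    // duplicate suffixes preserves the input statistics: picking one
    // uniformly at random then matches the observed frequencies.
    type Chain map[string][]string

    func build(text string, c Chain) {
        words := strings.Fields(text)
        for i := 0; i+2 < len(words); i++ {
            prefix := words[i] + " " + words[i+1]
            c[prefix] = append(c[prefix], words[i+2])
        }
    }

    func main() {
        c := Chain{}
        build("I am not a number! I am a free man!", c)
        fmt.Println(c["I am"]) // [not a]
    }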