Search API

github.com/weatherapicom/go

* Weather API * * # Introduction WeatherAPI.com provides access to weather and geo data via a JSON/XML restful API. It allows developers to create desktop, web and mobile applications using this data very easy. We provide following data through our API: - Real-time weather - 14 day weather forecast - Astronomy - Time zone - Location data - Search or Autocomplete API - NEW: Historical weather - NEW: Future Weather (Upto 300 days ahead) - Weather Alerts - Air Quality Data # Getting Started You need to [signup](https://www.weatherapi.com/signup.aspx) and then you can find your API key under [your account](https://www.weatherapi.com/login.aspx), and start using API right away! We have [code libraries](https://www.weatherapi.com/docs/code-libraries.aspx) for different programming languages like PHP, .net, JAVA, etc. If you find any features missing or have any suggestions, please [contact us](https://www.weatherapi.com/contact.aspx). # Authentication API access to the data is protected by an API key. If at anytime, you find the API key has become vulnerable, please regenerate the key using Regenerate button next to the API key. Authentication to the WeatherAPI.com API is provided by passing your API key as request parameter through an API . ## key parameter key=<YOUR API KEY> * * API version: 1.0.0-oas3 * Generated by: Swagger Codegen (https://github.com/swagger-api/swagger-codegen.git) * Weather API * * # Introduction WeatherAPI.com provides access to weather and geo data via a JSON/XML restful API. It allows developers to create desktop, web and mobile applications using this data very easy. We provide following data through our API: - Real-time weather - 14 day weather forecast - Astronomy - Time zone - Location data - Search or Autocomplete API - NEW: Historical weather - NEW: Future Weather (Upto 300 days ahead) - Weather Alerts - Air Quality Data # Getting Started You need to [signup](https://www.weatherapi.com/signup.aspx) and then you can find your API key under [your account](https://www.weatherapi.com/login.aspx), and start using API right away! We have [code libraries](https://www.weatherapi.com/docs/code-libraries.aspx) for different programming languages like PHP, .net, JAVA, etc. If you find any features missing or have any suggestions, please [contact us](https://www.weatherapi.com/contact.aspx). # Authentication API access to the data is protected by an API key. If at anytime, you find the API key has become vulnerable, please regenerate the key using Regenerate button next to the API key. Authentication to the WeatherAPI.com API is provided by passing your API key as request parameter through an API . ## key parameter key=<YOUR API KEY> * * API version: 1.0.0-oas3 * Generated by: Swagger Codegen (https://github.com/swagger-api/swagger-codegen.git) * Weather API * * # Introduction WeatherAPI.com provides access to weather and geo data via a JSON/XML restful API. It allows developers to create desktop, web and mobile applications using this data very easy. We provide following data through our API: - Real-time weather - 14 day weather forecast - Astronomy - Time zone - Location data - Search or Autocomplete API - NEW: Historical weather - NEW: Future Weather (Upto 300 days ahead) - Weather Alerts - Air Quality Data # Getting Started You need to [signup](https://www.weatherapi.com/signup.aspx) and then you can find your API key under [your account](https://www.weatherapi.com/login.aspx), and start using API right away! We have [code libraries](https://www.weatherapi.com/docs/code-libraries.aspx) for different programming languages like PHP, .net, JAVA, etc. If you find any features missing or have any suggestions, please [contact us](https://www.weatherapi.com/contact.aspx). # Authentication API access to the data is protected by an API key. If at anytime, you find the API key has become vulnerable, please regenerate the key using Regenerate button next to the API key. Authentication to the WeatherAPI.com API is provided by passing your API key as request parameter through an API . ## key parameter key=<YOUR API KEY> * * API version: 1.0.0-oas3 * Generated by: Swagger Codegen (https://github.com/swagger-api/swagger-codegen.git) * Weather API * * # Introduction WeatherAPI.com provides access to weather and geo data via a JSON/XML restful API. It allows developers to create desktop, web and mobile applications using this data very easy. We provide following data through our API: - Real-time weather - 14 day weather forecast - Astronomy - Time zone - Location data - Search or Autocomplete API - NEW: Historical weather - NEW: Future Weather (Upto 300 days ahead) - Weather Alerts - Air Quality Data # Getting Started You need to [signup](https://www.weatherapi.com/signup.aspx) and then you can find your API key under [your account](https://www.weatherapi.com/login.aspx), and start using API right away! We have [code libraries](https://www.weatherapi.com/docs/code-libraries.aspx) for different programming languages like PHP, .net, JAVA, etc. If you find any features missing or have any suggestions, please [contact us](https://www.weatherapi.com/contact.aspx). # Authentication API access to the data is protected by an API key. If at anytime, you find the API key has become vulnerable, please regenerate the key using Regenerate button next to the API key. Authentication to the WeatherAPI.com API is provided by passing your API key as request parameter through an API . ## key parameter key=<YOUR API KEY> * * API version: 1.0.0-oas3 * Generated by: Swagger Codegen (https://github.com/swagger-api/swagger-codegen.git) * Weather API * * # Introduction WeatherAPI.com provides access to weather and geo data via a JSON/XML restful API. It allows developers to create desktop, web and mobile applications using this data very easy. We provide following data through our API: - Real-time weather - 14 day weather forecast - Astronomy - Time zone - Location data - Search or Autocomplete API - NEW: Historical weather - NEW: Future Weather (Upto 300 days ahead) - Weather Alerts - Air Quality Data # Getting Started You need to [signup](https://www.weatherapi.com/signup.aspx) and then you can find your API key under [your account](https://www.weatherapi.com/login.aspx), and start using API right away! We have [code libraries](https://www.weatherapi.com/docs/code-libraries.aspx) for different programming languages like PHP, .net, JAVA, etc. If you find any features missing or have any suggestions, please [contact us](https://www.weatherapi.com/contact.aspx). # Authentication API access to the data is protected by an API key. If at anytime, you find the API key has become vulnerable, please regenerate the key using Regenerate button next to the API key. Authentication to the WeatherAPI.com API is provided by passing your API key as request parameter through an API . ## key parameter key=<YOUR API KEY> * * API version: 1.0.0-oas3 * Generated by: Swagger Codegen (https://github.com/swagger-api/swagger-codegen.git)

v0.0.0-20231108085335-60dfd94eba67 • 9 months ago

github.com/CandisIO/enmime

Package enmime implements a MIME encoding and decoding library. It's built on top of Go's included mime/multipart support where possible, but is geared towards parsing MIME encoded emails. The enmime API has two conceptual layers. The lower layer is a tree of Part structs, representing each component of a decoded MIME message. The upper layer, called an Envelope provides an intuitive way to interact with a MIME message. Calling ReadParts causes enmime to parse the body of a MIME message into a tree of Part objects, each of which is aware of its content type, filename and headers. The content of a Part is available as a slice of bytes via the Content field. If the part was encoded in quoted-printable or base64, it is decoded prior to being placed in Content. If the Part contains text in a character set other than utf-8, enmime will attempt to convert it to utf-8. To locate a particular Part, pass a custom PartMatcher function into the BreadthMatchFirst() or DepthMatchFirst() methods to search the Part tree. BreadthMatchAll() and DepthMatchAll() will collect all Parts matching your criteria. ReadEnvelope returns an Envelope struct. Behind the scenes a Part tree is constructed, and then sorted into the correct fields of the Envelope. The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be down-converted using the html2text library1. The root of the Part tree, as well as slices of the inline and attachment Parts are also available. Every MIME Part has its own headers, accessible via the Part.Header field. The raw headers for an Envelope are available in Root.Header. Envelope also provides helper methods to fetch headers: GetHeader(key) will return the RFC 2047 decoded value of the specified header. AddressList(key) will convert the specified address header into a slice of net/mail.Address values. enmime attempts to be tolerant of poorly encoded MIME messages. In situations where parsing is not possible, the ReadEnvelope and ReadParts functions will return a hard error. If enmime is able to continue parsing the message, it will add an entry to the Errors slice on the relevant Part. After parsing is complete, all Part errors will be appended to the Envelope Errors slice. The Error* constants can be used to identify a specific class of error. Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments. enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/enmime

v0.8.4 • 4 years ago

gitlab.com/MatteoCampinoti94/faapi-go

Package faapi uses GET requests and parsing to get metadata from the main sections of FurAffinity. These include submissions, user-pages, galleries and searches. The package loads cookies from a JSON formatted file and will return structs with the requested metadata. The main type in the package is faapi.FAAPI. The Cookies and Request fields are assigned using faapi.LoadCookies() and faapi.MakeRequest() methods. The thus assigned cookies and request will be used by the rest of the package to perform the GET requests. The Interval field specifies how long to wait between requests (i.e. throttle). The other two types are used to return parsed metadata from the requests. The field names describe their content. For Submission.ID an int type has been used instead of string because the a submission ID on FA can only be a number, the other fields use simple strings (or slices of them) for easier handling. The Submission type contains the metadata that the package can parse from the submission page. The date is the only modified field: the package converts from the "Mmm D, YYYY" format to "YYYY-MM-DD" for easier sorting and reformatting. Tags are stored in a non-sorted slice. The Description field contains the HTML source of the submission description, directly extracted from the response. The only edit the package performs is to add "http:" to src attributes that lack it (i.e. resources internal to FA). The FileURL field stores the link to the submission file (with "http:" added). The File field stores the binary data of the submission file. The User type contains the metadata extracted from a user's main page. User.Userid and User.Username store the same value but in two different formats. Userid is the url-compatible version of the username, so it lacks capital letters and the underscore character is removed. Username instead stores the user-formatted version of the name (i.e. with capital letters, underscore). The Title is the small "description" chosen by the user (e.g. Writer, Artist, etc...) and extracted as is. The Status can be active, suspended or deceased at the moment and it depends from the first character that FA automatically prefixes to the user's name (e.g. "~UserName" for an active username). Date is the date the user joined, formatted to "YYYY-MM-DD". The Profile is the HTML source of the main description written by the user. The FAAPI type supports two types of methods, one to initialize the api, the other to get data off the forum. LoadCookies() and MakeRequest() are used to initialize the type. The first takes a string as filename and opens the corresponding file. If the file is properly formatted (in JSON) then the cookies contained in it will be loaded into the api as a slice of http.Cookie. The cookies file needs to contain elements with at least name and value keys, others are not necessary. MakeRequest() simply takes the cookies slice returned and saved by LoadCookies() and uses it to create a http.Request object which is then saved in the FAAPI struct. The Get* methods return single structs or slices of what their name suggests. GetSubmission() and GetUser() return a single struct of type Submission and User respectively. These are the calls that provide the most information for both user and submission. GetGallery(), GetScraps(), GetFavorites() and GetSearch() all return a slice of Submission. Gallery, scraps and favorites take a userid as argument together with a page value. The slices returned contain normal Submission structs but with less information: the Date, Tags, Description and FileURL fields are empty. This is because these methods parse the links contained withing the user folders (or search pages) which do not contain those values. The second item the methods return is the next page in the specific section. If no next page page can be found (i.e. there is no next page button) then `next` will have value -1. For gallery, scraps and searches the next page item is just the passed page argument increased by 1, for favorites it is the id of the next page. GetFavorites() page works differently than the others. FA uses an id for favorites pages. 0 can be used to get the first page and 1 for the last, but those in between will need the specific page id. GetSearch() takes a map argument with both keys and values of type string. These are then encoded using the URL. Values method AddValues and are then encoded and added to the URL.RawQuery field of the http.Request. The Submission type supports a single method. GetFile() downloads the file pointed at by FileURL and saves its binary data into the File field.

v1.0.1 • 5 years ago

github.com/improbable-io/go-github

Package github provides a client for using the GitHub API. Usage: Construct a new GitHub client, then use the various services on the client to access different parts of the GitHub API. For example: Some API methods have optional parameters that can be passed. For example: The services of a client divide the API into logical chunks and correspond to the structure of the GitHub API documentation at https://developer.github.com/v3/. NOTE: Using the https://godoc.org/context package, one can easily pass cancelation signals and deadlines to various services of the client for handling a request. In case there is no context available, then context.Background() can be used as a starting point. For more sample code snippets, head over to the https://github.com/google/go-github/tree/master/example directory. The go-github library does not directly handle authentication. Instead, when creating a new client, pass an http.Client that can handle authentication for you. The easiest and recommended way to do this is using the golang.org/x/oauth2 library, but you can always use any other library that provides an http.Client. If you have an OAuth2 access token (for example, a personal API token), you can use it with the oauth2 library using: Note that when using an authenticated Client, all calls made by the client will include the specified OAuth token. Therefore, authenticated clients should almost never be shared between different users. See the oauth2 docs for complete instructions on using that library. For API methods that require HTTP Basic Authentication, use the BasicAuthTransport. GitHub Apps authentication can be provided by the https://github.com/bradleyfalzon/ghinstallation package. GitHub imposes a rate limit on all API clients. Unauthenticated clients are limited to 60 requests per hour, while authenticated clients can make up to 5,000 requests per hour. The Search API has a custom rate limit. Unauthenticated clients are limited to 10 requests per minute, while authenticated clients can make up to 30 requests per minute. To receive the higher rate limit when making calls that are not issued on behalf of a user, use UnauthenticatedRateLimitedTransport. The returned Response.Rate value contains the rate limit information from the most recent API call. If a recent enough response isn't available, you can use RateLimits to fetch the most up-to-date rate limit data for the client. To detect an API rate limit error, you can check if its type is *github.RateLimitError: Learn more about GitHub rate limiting at https://developer.github.com/v3/#rate-limiting. Some endpoints may return a 202 Accepted status code, meaning that the information required is not yet ready and was scheduled to be gathered on the GitHub side. Methods known to behave like this are documented specifying this behavior. To detect this condition of error, you can check if its type is *github.AcceptedError: The GitHub API has good support for conditional requests which will help prevent you from burning through your rate limit, as well as help speed up your application. go-github does not handle conditional requests directly, but is instead designed to work with a caching http.Transport. We recommend using https://github.com/gregjones/httpcache for that. Learn more about GitHub conditional requests at https://developer.github.com/v3/#conditional-requests. All structs for GitHub resources use pointer values for all non-repeated fields. This allows distinguishing between unset fields and those set to a zero-value. Helper functions have been provided to easily create these pointers for string, bool, and int values. For example: Users who have worked with protocol buffers should find this pattern familiar. All requests for resource collections (repos, pull requests, issues, etc.) support pagination. Pagination options are described in the github.ListOptions struct and passed to the list methods directly or as an embedded type of a more specific list options struct (for example github.PullRequestListOptions). Pages information is available via the github.Response struct. Go on App Engine Classic (which as of this writing uses Go 1.6) can not use the "context" import and still relies on "golang.org/x/net/context". As a result, if you wish to continue to use "go-github" on App Engine Classic, you will need to rewrite all the "context" imports using the following command: See "with_appengine.go" for more details.

v15.0.1+incompatible • 6 years ago

github.com/congop/execstub

Package execstub provides stubbing for a usage of a command line api which is based on an executable discovered using the environment <PATH> or some sort of <HOME>+<Bin> configuration. What is a usage of a command line api? Its when you spawn a sub-process using a programming language API. Think go exec.Cmd.Run(..), java Runtime.exec(..), python subprocess.run(..). The sub-process execution will yield an outcome consisting of: side effects (e.g. a file created when executing touch). bytes or string in std-out and/or std-err. an exit code. The executed binary is typically specified using a path like argument. It can be: a absolute path a relative path the name of the binary Discovering the executable How the binary is discovered depends on the art the specified path like argument. The goal of the discovery process is to transform the path like specification into an absolute path, so that if the specification is done with: absolute path nothing is left to be done relative path it will be appended to the current directory to form an absolute path name of the binary a file with that name will be search in the directories specified by the PATH environment variable. The first found will be used. On some platforms known and specified file extensions (e.g. .exe) will be appended to the name during the search. Even when using absolute paths, you may decide to put a customization strategy using some process environment variable. We call this the home-bin-dir strategy. E.g. Java executed base on JAVA_HOME or JRE_HOME with the java executable located at ${JAVA_HOME}/bin/java Stubbing mechanism The aim os stubbing is to replace at execution time the actual executable with a fake one which will yield some kind of pre-defined known outcome. This can be done if there is step in the binary discovery which can be tweaked so that the stud-executable is discovered instead of the actual one. It is the case if binary is specified by name and discovered using PATH or when absolute path is used but with a customization layer in place based on defined environment home variable. The mechanism will just have to mutate the PATH or set the home variable accordingly. Stubbing become complicated if the specification is based on absolute (without customization layer) or relative path because it will require e.g.: a file overley to hide the original executable a change of the current directory an in the flight replacement of the original executable and a rollback. These are either not desirable, or too heavy especially in a unit-test-ish context. The execstub module will therefore only provide stubbing for PATH and Home-Bin-Dir based discovery. Design elements Modeling the command line usage We have basically 3 aspects to model: Starting the process It is modeled using StubRequest which hold the the command name and the execution argument so that the fake execution process can be function of them. Execution The outcome can be a static sum of known in advance stderr, stdout and exit code. This is referred to as static mode. It may also be desirale to have some side effects, e.g.: related to the execution itself like creating a file or related to the unit-test as counting the number of calls In other constellations the outocome may requires some computation to determine stderr/stdout/exicode as function of the request. We refer to these cases as dynamic mode. The execution is therefore modelled as function namely StubFunc. Thus, if required the StubFunc corresponding to the current stubbing will be looked-up and run. Outcome The stderr/stdout/exit-code part of the outcome is modeled as ExecOutcome. The execution the the stubfunc may result however in an error. Such error is not an actual sub-process execution erroneous outcome. It therefore must be modeled separately as opposed to be added to stderr and setting a non-zero exit code. Such and error can be communicated by using the ExecOutcome field InternalErrTxt. Stub-Executable It is the fake executable used to effectuate the overall outcome. It may release the outcome by itself. It may also cooperate with a test helper to achieve the outcome. The following command line is used: /tmp/go-build720053430/b001/xyz.test -test.run=TestHelperProcess -- arg1 arg2 arg3 It is not possible to use the test helper directly because we will need to inject the args test.run and -- at the actual execution call site. Two kinds of stub-executable are provided: bash based, which uses a bash script exec base, which used an go based executable Obviously the go based one is bound to support more platform than the bash based one. Both executable are genric and need context information about the stubbing for which they will be used. This context data is modeled as CmdConfig and saved as file alongside the executable. Inter process communication (IPC) to support dynamic outcomes The invocation of a StubFunc to produce a dynamic outcome is done in the unit test process. There is no means to access the stub function directly from the fake process execution. IPC is used here to allow the fake process to issue a StubRequest and receive an ExecOutcome. The Serialization/Deserialization mechanism needed for IPC must be understood by both parties (mind bash) and satisfy a domain specific requirement of been able to cope with multiline sterr and stdout. We choose a combination of : base64 to encode the discrete data (e.g. stderr, stdout) and comma separated value (easily encoded and decoded in bash) for envelope encoding Named pipes are used for the actual data transport, because they are file based and easily handle in different setups (e.g. in bash). Stubbing Setup An ExecStubber is provided to manage the stubbing setup. Its key feature is to setup an invokation of a StubFunc to replace the execution of a process, See ExecStubber.WhenExecDoStubFunc(...). It allows settings beyond the basic requirement of specifying the executable to be stubbed and a StubFunc replacement. This is modeled using Settings, which provides the following configuration options: Selecting and customizing the discovery mode (PATH vs. Home-Bin-Dir) Selecting the stub executable (Bash vs. Exec(go based)) Selecting the stubbing mode (static vs. dynamic) Specifying the test process helper method name specifying a timeout for a stub sub-process execution A stubbing setup is identified by a key. The key can be use to: unwind the setup(Unregister(..)) access the stubbing data basis of static requests (FindAllPersistedStubRequests, DeleteAllPersistedStubRequests ). Concurrency and parallelism The mutation of the process environment is the key enabling mechanism of Execstub. The process environment must therefore be guarded against issues of concurrency and parallelism. This also mean that is not possible to have a parallel stubbing setup for the same executable within a test process. The outcome will otherwise be non-deterministic because the setting will likely override each other. ExecStubber provides a locking mechanism to realize the serialization of the mutation of environment. For this to work correctly however there must only be one ExecStubber per unit test process. Note that the code under test can still execute its sub-processes concurrently or in parallel. The correctness of the outcome here dependents on the implementation of the StubFunc function being used. Static outcomes without side effect are of course always deterministic.

v0.0.0-20210519151710-e93c592abc71 • 3 years ago

github.com/dcormier/enmime

Package enmime implements a MIME encoding and decoding library. It's built on top of Go's included mime/multipart support where possible, but is geared towards parsing MIME encoded emails. The enmime API has two conceptual layers. The lower layer is a tree of Part structs, representing each component of a decoded MIME message. The upper layer, called an Envelope provides an intuitive way to interact with a MIME message. Calling ReadParts causes enmime to parse the body of a MIME message into a tree of Part objects, each of which is aware of its content type, filename and headers. The content of a Part is available as a slice of bytes via the Content field. If the part was encoded in quoted-printable or base64, it is decoded prior to being placed in Content. If the Part contains text in a character set other than utf-8, enmime will attempt to convert it to utf-8. To locate a particular Part, pass a custom PartMatcher function into the BreadthMatchFirst() or DepthMatchFirst() methods to search the Part tree. BreadthMatchAll() and DepthMatchAll() will collect all Parts matching your criteria. ReadEnvelope returns an Envelope struct. Behind the scenes a Part tree is constructed, and then sorted into the correct fields of the Envelope. The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be down-converted using the html2text library1. The root of the Part tree, as well as slices of the inline and attachment Parts are also available. Every MIME Part has its own headers, accessible via the Part.Header field. The raw headers for an Envelope are available in Root.Header. Envelope also provides helper methods to fetch headers: GetHeader(key) will return the RFC 2047 decoded value of the specified header. AddressList(key) will convert the specified address header into a slice of net/mail.Address values. enmime attempts to be tolerant of poorly encoded MIME messages. In situations where parsing is not possible, the ReadEnvelope and ReadParts functions will return a hard error. If enmime is able to continue parsing the message, it will add an entry to the Errors slice on the relevant Part. After parsing is complete, all Part errors will be appended to the Envelope Errors slice. The Error* constants can be used to identify a specific class of error. Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments. enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/enmime

v0.3.0 • 6 years ago

github.com/cgroschupp/enmime

Package enmime implements a MIME encoding and decoding library. It's built on top of Go's included mime/multipart support where possible, but is geared towards parsing MIME encoded emails. The enmime API has two conceptual layers. The lower layer is a tree of Part structs, representing each component of a decoded MIME message. The upper layer, called an Envelope provides an intuitive way to interact with a MIME message. Calling ReadParts causes enmime to parse the body of a MIME message into a tree of Part objects, each of which is aware of its content type, filename and headers. The content of a Part is available as a slice of bytes via the Content field. If the part was encoded in quoted-printable or base64, it is decoded prior to being placed in Content. If the Part contains text in a character set other than utf-8, enmime will attempt to convert it to utf-8. To locate a particular Part, pass a custom PartMatcher function into the BreadthMatchFirst() or DepthMatchFirst() methods to search the Part tree. BreadthMatchAll() and DepthMatchAll() will collect all Parts matching your criteria. ReadEnvelope returns an Envelope struct. Behind the scenes a Part tree is constructed, and then sorted into the correct fields of the Envelope. The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be down-converted using the html2text library1. The root of the Part tree, as well as slices of the inline and attachment Parts are also available. Every MIME Part has its own headers, accessible via the Part.Header field. The raw headers for an Envelope are available in Root.Header. Envelope also provides helper methods to fetch headers: GetHeader(key) will return the RFC 2047 decoded value of the specified header. AddressList(key) will convert the specified address header into a slice of net/mail.Address values. enmime attempts to be tolerant of poorly encoded MIME messages. In situations where parsing is not possible, the ReadEnvelope and ReadParts functions will return a hard error. If enmime is able to continue parsing the message, it will add an entry to the Errors slice on the relevant Part. After parsing is complete, all Part errors will be appended to the Envelope Errors slice. The Error* constants can be used to identify a specific class of error. Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments. enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/enmime

v0.8.3 • 4 years ago

github.com/pritoj/enmime

Package enmime implements a MIME encoding and decoding library. It's built on top of Go's included mime/multipart support where possible, but is geared towards parsing MIME encoded emails. The enmime API has two conceptual layers. The lower layer is a tree of Part structs, representing each component of a decoded MIME message. The upper layer, called an Envelope provides an intuitive way to interact with a MIME message. Calling ReadParts causes enmime to parse the body of a MIME message into a tree of Part objects, each of which is aware of its content type, filename and headers. The content of a Part is available as a slice of bytes via the Content field. If the part was encoded in quoted-printable or base64, it is decoded prior to being placed in Content. If the Part contains text in a character set other than utf-8, enmime will attempt to convert it to utf-8. To locate a particular Part, pass a custom PartMatcher function into the BreadthMatchFirst() or DepthMatchFirst() methods to search the Part tree. BreadthMatchAll() and DepthMatchAll() will collect all Parts matching your criteria. ReadEnvelope returns an Envelope struct. Behind the scenes a Part tree is constructed, and then sorted into the correct fields of the Envelope. The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be down-converted using the html2text library1. The root of the Part tree, as well as slices of the inline and attachment Parts are also available. Every MIME Part has its own headers, accessible via the Part.Header field. The raw headers for an Envelope are available in Root.Header. Envelope also provides helper methods to fetch headers: GetHeader(key) will return the RFC 2047 decoded value of the specified header. AddressList(key) will convert the specified address header into a slice of net/mail.Address values. enmime attempts to be tolerant of poorly encoded MIME messages. In situations where parsing is not possible, the ReadEnvelope and ReadParts functions will return a hard error. If enmime is able to continue parsing the message, it will add an entry to the Errors slice on the relevant Part. After parsing is complete, all Part errors will be appended to the Envelope Errors slice. The Error* constants can be used to identify a specific class of error. Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments. enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/enmime

v0.8.4 • 4 years ago

github.com/danilodvaz/lingua-go/v2

Package lingua accurately detects the natural language of written text, be it long or short. Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing applications. In cases where you don't need the full-fledged functionality of those systems or don't want to learn the ropes of those, a small flexible library comes in handy. So far, the only other comprehensive open source library in the Go ecosystem for this task is Whatlanggo (https://github.com/abadojack/whatlanggo). Unfortunately, it has two major drawbacks: 1. Detection only works with quite lengthy text fragments. For very short text snippets such as Twitter messages, it does not provide adequate results. 2. The more languages take part in the decision process, the less accurate are the detection results. Lingua aims at eliminating these problems. It nearly does not need any configuration and yields pretty accurate results on both long and short text, even on single words and phrases. It draws on both rule-based and statistical methods but does not use any dictionaries of words. It does not need a connection to any external API or service either. Once the library has been downloaded, it can be used completely offline. Compared to other language detection libraries, Lingua's focus is on quality over quantity, that is, getting detection right for a small set of languages first before adding new ones. Currently, 75 languages are supported. They are listed as variants of type Language. Lingua is able to report accuracy statistics for some bundled test data available for each supported language. The test data for each language is split into three parts: 1. a list of single words with a minimum length of 5 characters 2. a list of word pairs with a minimum length of 10 characters 3. a list of complete grammatical sentences of various lengths Both the language models and the test data have been created from separate documents of the Wortschatz corpora (https://wortschatz.uni-leipzig.de) offered by Leipzig University, Germany. Data crawled from various news websites have been used for training, each corpus comprising one million sentences. For testing, corpora made of arbitrarily chosen websites have been used, each comprising ten thousand sentences. From each test corpus, a random unsorted subset of 1000 single words, 1000 word pairs and 1000 sentences has been extracted, respectively. Given the generated test data, I have compared the detection results of Lingua, and Whatlanggo running over the data of Lingua's supported 75 languages. Additionally, I have added Google's CLD3 (https://github.com/google/cld3/) to the comparison with the help of the gocld3 bindings (https://github.com/jmhodges/gocld3). Languages that are not supported by CLD3 or Whatlanggo are simply ignored during the detection process. The bar and box plots (https://github.com/pemistahl/lingua-go/blob/main/ACCURACY_PLOTS.md) show the measured accuracy values for all three performed tasks: Single word detection, word pair detection and sentence detection. Lingua clearly outperforms its contenders. Detailed statistics including mean, median and standard deviation values for each language and classifier are available in tabular form (https://github.com/pemistahl/lingua-go/blob/main/ACCURACY_TABLE.md) as well. Every language detector uses a probabilistic n-gram (https://en.wikipedia.org/wiki/N-gram) model trained on the character distribution in some training corpus. Most libraries only use n-grams of size 3 (trigrams) which is satisfactory for detecting the language of longer text fragments consisting of multiple sentences. For short phrases or single words, however, trigrams are not enough. The shorter the input text is, the less n-grams are available. The probabilities estimated from such few n-grams are not reliable. This is why Lingua makes use of n-grams of sizes 1 up to 5 which results in much more accurate prediction of the correct language. A second important difference is that Lingua does not only use such a statistical model, but also a rule-based engine. This engine first determines the alphabet of the input text and searches for characters which are unique in one or more languages. If exactly one language can be reliably chosen this way, the statistical model is not necessary anymore. In any case, the rule-based engine filters out languages that do not satisfy the conditions of the input text. Only then, in a second step, the probabilistic n-gram model is taken into consideration. This makes sense because loading less language models means less memory consumption and better runtime performance. In general, it is always a good idea to restrict the set of languages to be considered in the classification process using the respective api methods. If you know beforehand that certain languages are never to occur in an input text, do not let those take part in the classifcation process. The filtering mechanism of the rule-based engine is quite good, however, filtering based on your own knowledge of the input text is always preferable. There might be classification tasks where you know beforehand that your language data is definitely not written in Latin, for instance. The detection accuracy can become better in such cases if you exclude certain languages from the decision process or just explicitly include relevant languages. Knowing about the most likely language is nice but how reliable is the computed likelihood? And how less likely are the other examined languages in comparison to the most likely one? In the example below, a slice of ConfidenceValue is returned containing all possible languages sorted by their confidence value in descending order. The values that this method computes are part of a relative confidence metric, not of an absolute one. Each value is a number between 0.0 and 1.0. The most likely language is always returned with value 1.0. All other languages get values assigned which are lower than 1.0, denoting how less likely those languages are in comparison to the most likely language. The slice returned by this method does not necessarily contain all languages which the calling instance of LanguageDetector was built from. If the rule-based engine decides that a specific language is truly impossible, then it will not be part of the returned slice. Likewise, if no ngram probabilities can be found within the detector's languages for the given input text, the returned slice will be empty. The confidence value for each language not being part of the returned slice is assumed to be 0.0. By default, Lingua uses lazy-loading to load only those language models on demand which are considered relevant by the rule-based filter engine. For web services, for instance, it is rather beneficial to preload all language models into memory to avoid unexpected latency while waiting for the service response. If you want to enable the eager-loading mode, you can do it as seen below. Multiple instances of LanguageDetector share the same language models in memory which are accessed asynchronously by the instances. By default, Lingua returns the most likely language for a given input text. However, there are certain words that are spelled the same in more than one language. The word `prologue`, for instance, is both a valid English and French word. Lingua would output either English or French which might be wrong in the given context. For cases like that, it is possible to specify a minimum relative distance that the logarithmized and summed up probabilities for each possible language have to satisfy. It can be stated as seen below. Be aware that the distance between the language probabilities is dependent on the length of the input text. The longer the input text, the larger the distance between the languages. So if you want to classify very short text phrases, do not set the minimum relative distance too high. Otherwise Unknown will be returned most of the time as in the example below. This is the return value for cases where language detection is not reliably possible.

v2.0.5 • 2 years ago

github.com/ScaleChamp/rod

Package rod is a high-level driver directly based on DevTools Protocol. This example opens https://github.com/, searches for "git", and then gets the header element which gives the description for Git. Rod use https://golang.org/pkg/context to handle cancelations for IO blocking operations, most times it's timeout. Context will be recursively passed to all sub-methods. For example, methods like Page.Context(ctx) will return a clone of the page with the ctx, all the methods of the returned page will use the ctx if they have IO blocking operations. Page.Timeout or Page.WithCancel is just a shortcut for Page.Context. Of course, Browser or Element works the same way. Shows how we can further customize the browser with the launcher library. Usually you use launcher lib to set the browser's command line flags (switches). Doc for flags: https://peter.sh/experiments/chromium-command-line-switches Shows how to change the retry/polling options that is used to query elements. This is useful when you want to customize the element query retry logic. When rod doesn't have a feature that you need. You can easily call the cdp to achieve it. List of cdp API: https://github.com/ScaleChamp/rod/tree/master/lib/proto Shows how to disable headless mode and debug. Rod provides a lot of debug options, you can set them with setter methods or use environment variables. Doc for environment variables: https://pkg.go.dev/github.com/ScaleChamp/rod/lib/defaults We use "Must" prefixed functions to write example code. But in production you may want to use the no-prefix version of them. About why we use "Must" as the prefix, it's similar to https://golang.org/pkg/regexp/#MustCompile Shows how to share a remote object reference between two Eval Shows how to listen for events. Shows how to intercept requests and modify both the request and the response. The entire process of hijacking one request: The --req-> and --res-> are the parts that can be modified. Show how to handle multiple results of an action. Such as when you login a page, the result can be success or wrong password. Example_search shows how to use Search to get element inside nested iframes or shadow DOMs. It works the same as https://developers.google.com/web/tools/chrome-devtools/dom#search Shows how to update the state of the current page. In this example we enable the network domain. Rod uses mouse cursor to simulate clicks, so if a button is moving because of animation, the click may not work as expected. We usually use WaitStable to make sure the target isn't changing anymore. When you want to wait for an ajax request to complete, this example will be useful.

v0.0.0-20220827210002-67bb87e037ae • 2 years ago

github.com/golearn/cloudforest

Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang). It includes implementations of Breiman and Cutler's Random Forest for classification and regression on heterogeneous numerical/categorical data with missing values and several related algorithms including entropy and cost driven classification, L1 regression and feature selection with artificial contrasts and hooks for modifying the algorithms for your needs. Command line utilities to grow, apply and analyze forests are provided in sub directories. CloudForest is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. Documentation has been generated with godoc and can be viewed live at: http://godoc.org/github.com/ryanbressler/CloudForest Pull requests and bug reports are welcome; Code Repo and Issue tracker can be found at: https://github.com/ryanbressler/CloudForest CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi threaded and cluster environments in mind. Go's support for function types is used to provide a interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method: This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects. Decision tree's are grown with the goal of reducing "Impurity" which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface which allows for alternative definitions of impurity. CloudForest includes several alternative targets: Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In go, slices pointing at the same underlying arrays make this sort of optimization transparent. For example a function like: can return left and right slices that point to the same underlying array as the origional slice of cases but these slices should not have their values changed. Functions used while searching for the best split also accepts pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items and its use can be seen in functions like: For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit which relies on go's sorting package. Training a Random forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as the are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, distributed database etc. By default cloud forest uses a fast heuristic for missing values. When proposing a split on a feature with missing data the missing cases are removed and the impurity value is corrected to use three way impurity which reduces the bias towards features with lots of missing data: Missing values in the target variable are left out of impurity calculations. This provided generally good results at a fraction of the computational costs of imputing data. Optionally, feature.ImputeMissing or featurematrixImputeMissing can be called before forest growth to impute missing values to the feature mean/mode which Brieman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity weighted imputation Brieman describes. Experimental support is provided for 3 way splitting which splits missing cases onto a third branch. [2] This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3] [1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1 [2] https://code.google.com/p/rf-ace/ [3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record Variable Importance in CloudForest is calculated as the mean decrease in impurity over all of the splits made using a feature. To provide a baseline for evaluating importance, artificial contrast features can be used by including shuffled copies of existing features. In CloudForest data is stored using the FeatureMatrix struct which contains Features. The Feature struct implements storage and methods for both categorical and numerical data and calculations of impurity etc and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable. This allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow implements Brieman and Cutler's method (see extract above) for growing a tree. A GrowForest method is also provided that implements the rest of the method including sampling cases but it may be faster to grow the forest to disk as in the growforest utility. Prediction and Voting is done using Tree.Vote and CatBallotBox and NumBallotBox which implement the VoteTallyer interface. When compiled with go1.1 CloudForest achieves running times similar to implementations in other languages. Using gccgo (4.8.0 at least) results in longer running times and is not recommended until full go1.1 support is implemented in gcc 4.8.1. Development of CloudForest is being driven by our needs as we analyze large biomedical data sets. As such new and modified analysis will be added as needed. The basic functionality has stabilized but we have discussed several possible changes that may require additional abstraction and/or changes in the api. These include: Allow additional types of candidate features. Some multidimensional data types may not be best served by decomposition into categorical and numerical features. It would be possible to allow arbitrary feature types by adding CanidateFeature (which should expose BestSplit), CodedSplitter and Splitter abstraction. Allowing data to reside anywhere. This would involve abstracting FeatureMatrix to allow database etc driven implementations. "growforest" trains a forest using the following parameters which can be listed with -h "applyforest" applies a forest to the specified feature matrix and outputs predictions as a two column (caselabel predictedvalue) tsv. errorrate calculates the error of a forest vs a testing data set and reports it to standard out leafcount outputs counts of case case co-occurrence on leaf nodes (Brieman's proximity) and counts of the number of times a feature is used to split a node containing each case (a measure of relative/local importance). CloudForest borrows the annotated feature matrix (.afm) and stoicastic forest (.sf) file formats from Timo Erkkila's rf-ace which can be found at https://code.google.com/p/rf-ace/ An annotated feature matrix (.afm) file is a tab delineated file with column and row headers. Columns represent cases and rows represent features. A row header/feature id includes a prefix to specify the feature type Categorical and boolean features use strings for their category labels. Missing values are represented by "?","nan","na", or "null" (case insensitive). A short example: A stochastic forest (.sf) file contains a forest of decision trees. The main advantage of this format as opposed to an established format like json is that an sf file can be written iteratively tree by tree and multiple .sf files can be combined with minimal logic required allowing for massively parallel growth of forests with low memory use. An .sf file consists of lines each of which is a comma separated list of key value pairs. Lines can designate either a FOREST, TREE, or NODE. Each tree belongs to the preceding forest and each node to the preceding tree. Nodes must be written in order of increasing depth. CloudForest generates fewer fields then rf-ace but requires the following. Other fields will be ignored Forest requires forest type (only RF currently), target and ntrees: Tree requires only an int and the value is ignored though the line is needed to designate a new tree: Node requires a path encoded so that the root node is specified by "*" and each split left or right as "L" or "R". Leaf nodes should also define PRED such as "PRED=1.5" or "PRED=red". Splitter nodes should define SPLITTER with a feature id inside of double quotes, SPLITTERTYPE=[CATEGORICAL|NUMERICAL] and a LVALUE term which can be either a float inside of double quotes representing the highest value sent left or a ":" separated list of categorical values sent left. An example .sf file: Cloud forest can parse and apply .sf files generated by at least some versions of rf-ace. The idea for (and trademark of the term) Random Forests originated with Leo Brieman and Adele Cuttler. Their code and paper's can be found at: http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm All code in CloudForest is original but some ideas for methods and optimizations were inspired by Timo Erkilla's rf-ace and Andy Liaw and Matthew Wiener randomForest R package based on Brieman and Cuttler's code: https://code.google.com/p/rf-ace/ http://cran.r-project.org/web/packages/randomForest/index.html The idea for Artificial Contrasts was found in: Eugene Tuv, Alexander Borisov, George Runger and Kari Torkkola's paper "Feature Selection with Ensembles, Artificial Variables, and Redundancy Elimination" http://www.researchgate.net/publication/220320233_Feature_Selection_with_Ensembles_Artificial_Variables_and_Redundancy_Elimination/file/d912f5058a153a8b35.pdf The idea for growing trees to minimize categorical entropy comes from Ross Quinlan's ID3: http://en.wikipedia.org/wiki/ID3_algorithm "The Elements of Statistical Learning" 2nd edition by Trevor Hastie, Robert Tibshirani and Jerome Friedman was also consulted during development.

v0.0.0-20131023235215-805a285a97ce • 11 years ago

github.com/tanaikech/ggsrun

Package main (doc.go) : This is a CLI tool to execute Google Apps Script (GAS) on a terminal. Will you want to develop GAS on your local PC? Generally, when we develop GAS, we have to login to Google using own browser and develop it on the Script Editor. Recently, I have wanted to have more convenient local-environment for developing GAS. So I created this "ggsrun". The main work is to execute GAS on local terminal and retrieve the results from Google. 1. Develops GAS using your terminal and text editor which got accustomed to using. 2. Executes GAS by giving values to your script. 3. Executes GAS made of CoffeeScript. 4. Downloads spreadsheet, document and presentation, while executes GAS, simultaneously. 5. Downloads files from Google Drive and Uploads files to Google Drive. 6. Downloads standalone script and bound script. 7. Downloads all files and folders in a specific folder. 8. Upload script files and create project as standalone script and container-bound script. 9. Update project. 10. Retrieve revision files of Google Docs and retrieve versions of projects. 11. Rearranges scripts in project. 12. Modifies Manifests in project. 13. Seach files in Google Drive using search query and regex. 14. Manage Permissions of files. 15. Get Drive Information. 16. ggsrun got to be able to be used by not only OAuth2, but also Service Account from v1.7.0. You can see the release page https://github.com/tanaikech/ggsrun/releases ggsrun uses Execution API, Web Apps and Drive API on Google. About how to install ggsrun, please check my github repository. https://github.com/tanaikech/ggsrun/ You can read the detail information there. --------------------------------------------------------------- # How to Execute Google Apps Script Using ggsrun When you have the configure file `ggsrun.cfg`, you can execute GAS. If you cannot find it, please download `client_secret.json` and run $ ggsrun auth In the case of using Execution API, $ ggsrun e1 -s sample.gs If you want to execute a function except for `main()` of default, you can use an option like `-f foo`. This command `exe1` can be used to execute a function on project. $ ggsrun e1 -f foo $ ggsrun e2 -s sample.gs At `e2`, you cannot select the executing function except for `main()` of default. `e1`, `e2` and `-s` mean using Execution API and GAS script file name, respectively. Sample codes which are shown here will be used Execution API. At this time, the executing function is `main()`, which is a default, in the script. In the case of using Web Apps, $ ggsrun w -s sample.gs -p password -u [ WebApps URL ] `w` and `-p` mean using Web Apps and password you set at the server side, respectively. Using `-u` it imports Web Apps URL like `-u https://script.google.com/macros/s/#####/exec`. --------------------------------------------------------------- Package main (ggsrun.go) : This file is included all commands and options. Package main (handler.go) : Handler for ggsrun Package main (init.go) : These methods are for reading and writing configuration file (ggsrun.cfg). Package main (materials.go) : Materials for ggsrun. Package main (oauth.go) : Get accesstoken using refreshtoken, and confirm condition of accesstoken. Package main (projectupdater.go) : These methods are for updating project. Package main (scriptrearrange.go) : These methods are for rearranging scripts in a project. Package main (sender.go) : These methods are for sending GAS scripts to Google Drive.

v1.7.4 • 5 years ago

github.com/gopher-johns/enmime

Package enmime implements a MIME encoding and decoding library. It's built on top of Go's included mime/multipart support where possible, but is geared towards parsing MIME encoded emails. The enmime API has two conceptual layers. The lower layer is a tree of Part structs, representing each component of a decoded MIME message. The upper layer, called an Envelope provides an intuitive way to interact with a MIME message. Calling ReadParts causes enmime to parse the body of a MIME message into a tree of Part objects, each of which is aware of its content type, filename and headers. The content of a Part is available as a slice of bytes via the Content field. If the part was encoded in quoted-printable or base64, it is decoded prior to being placed in Content. If the Part contains text in a character set other than utf-8, enmime will attempt to convert it to utf-8. To locate a particular Part, pass a custom PartMatcher function into the BreadthMatchFirst() or DepthMatchFirst() methods to search the Part tree. BreadthMatchAll() and DepthMatchAll() will collect all Parts matching your criteria. ReadEnvelope returns an Envelope struct. Behind the scenes a Part tree is constructed, and then sorted into the correct fields of the Envelope. The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be down-converted using the html2text library1. The root of the Part tree, as well as slices of the inline and attachment Parts are also available. Every MIME Part has its own headers, accessible via the Part.Header field. The raw headers for an Envelope are available in Root.Header. Envelope also provides helper methods to fetch headers: GetHeader(key) will return the RFC 2047 decoded value of the specified header. AddressList(key) will convert the specified address header into a slice of net/mail.Address values. enmime attempts to be tolerant of poorly encoded MIME messages. In situations where parsing is not possible, the ReadEnvelope and ReadParts functions will return a hard error. If enmime is able to continue parsing the message, it will add an entry to the Errors slice on the relevant Part. After parsing is complete, all Part errors will be appended to the Envelope Errors slice. The Error* constants can be used to identify a specific class of error. Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments. enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/enmime

v0.9.0 • 3 years ago

github.com/candisio/enmime

Package enmime implements a MIME encoding and decoding library. It's built on top of Go's included mime/multipart support where possible, but is geared towards parsing MIME encoded emails. The enmime API has two conceptual layers. The lower layer is a tree of Part structs, representing each component of a decoded MIME message. The upper layer, called an Envelope provides an intuitive way to interact with a MIME message. Calling ReadParts causes enmime to parse the body of a MIME message into a tree of Part objects, each of which is aware of its content type, filename and headers. The content of a Part is available as a slice of bytes via the Content field. If the part was encoded in quoted-printable or base64, it is decoded prior to being placed in Content. If the Part contains text in a character set other than utf-8, enmime will attempt to convert it to utf-8. To locate a particular Part, pass a custom PartMatcher function into the BreadthMatchFirst() or DepthMatchFirst() methods to search the Part tree. BreadthMatchAll() and DepthMatchAll() will collect all Parts matching your criteria. ReadEnvelope returns an Envelope struct. Behind the scenes a Part tree is constructed, and then sorted into the correct fields of the Envelope. The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be down-converted using the html2text library1. The root of the Part tree, as well as slices of the inline and attachment Parts are also available. Every MIME Part has its own headers, accessible via the Part.Header field. The raw headers for an Envelope are available in Root.Header. Envelope also provides helper methods to fetch headers: GetHeader(key) will return the RFC 2047 decoded value of the specified header. AddressList(key) will convert the specified address header into a slice of net/mail.Address values. enmime attempts to be tolerant of poorly encoded MIME messages. In situations where parsing is not possible, the ReadEnvelope and ReadParts functions will return a hard error. If enmime is able to continue parsing the message, it will add an entry to the Errors slice on the relevant Part. After parsing is complete, all Part errors will be appended to the Envelope Errors slice. The Error* constants can be used to identify a specific class of error. Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments. enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/enmime

v0.8.4 • 4 years ago

github.com/allenluce/enmime

Package enmime implements a MIME encoding and decoding library. It's built on top of Go's included mime/multipart support where possible, but is geared towards parsing MIME encoded emails. The enmime API has two conceptual layers. The lower layer is a tree of Part structs, representing each component of a decoded MIME message. The upper layer, called an Envelope provides an intuitive way to interact with a MIME message. Calling ReadParts causes enmime to parse the body of a MIME message into a tree of Part objects, each of which is aware of its content type, filename and headers. The content of a Part is available as a slice of bytes via the Content field. If the part was encoded in quoted-printable or base64, it is decoded prior to being placed in Content. If the Part contains text in a character set other than utf-8, enmime will attempt to convert it to utf-8. To locate a particular Part, pass a custom PartMatcher function into the BreadthMatchFirst() or DepthMatchFirst() methods to search the Part tree. BreadthMatchAll() and DepthMatchAll() will collect all Parts matching your criteria. ReadEnvelope returns an Envelope struct. Behind the scenes a Part tree is constructed, and then sorted into the correct fields of the Envelope. The Envelope contains both the plain text and HTML portions of the email. If there was no plain text Part available, the HTML Part will be down-converted using the html2text library1. The root of the Part tree, as well as slices of the inline and attachment Parts are also available. Every MIME Part has its own headers, accessible via the Part.Header field. The raw headers for an Envelope are available in Root.Header. Envelope also provides helper methods to fetch headers: GetHeader(key) will return the RFC 2047 decoded value of the specified header. AddressList(key) will convert the specified address header into a slice of net/mail.Address values. enmime attempts to be tolerant of poorly encoded MIME messages. In situations where parsing is not possible, the ReadEnvelope and ReadParts functions will return a hard error. If enmime is able to continue parsing the message, it will add an entry to the Errors slice on the relevant Part. After parsing is complete, all Part errors will be appended to the Envelope Errors slice. The Error* constants can be used to identify a specific class of error. Please note that enmime parses messages into memory, so it is not likely to perform well with multi-gigabyte attachments. enmime is open source software released under the MIT License. The latest version can be found at https://github.com/jhillyerd/enmime

v0.2.0 • 6 years ago

github.com/evorts/rod

This example opens https://github.com/, searches for "git", and then gets the header element which gives the description for Git. Rod use https://golang.org/pkg/context to handle cancelations for IO blocking operations, most times it's timeout. Context will be recursively passed to all sub-methods. For example, methods like Page.Context(ctx) will return a clone of the page with the ctx, all the methods of the returned page will use the ctx if they have IO blocking operations. Page.Timeout or Page.WithCancel is just a shortcut for Page.Context. Of course, Browser or Element works the same way. Shows how we can further customize the browser with the launcher library. Usually you use launcher lib to set the browser's command line flags (switches). Doc for flags: https://peter.sh/experiments/chromium-command-line-switches Shows how to change the retry/polling options that is used to query elements. This is useful when you want to customize the element query retry logic. When rod doesn't have a feature that you need. You can easily call the cdp to achieve it. List of cdp API: https://chromedevtools.github.io/devtools-protocol Shows how to disable headless mode and debug. Rod provides a lot of debug options, you can set them with setter methods or use environment variables. Doc for environment variables: https://pkg.go.dev/github.com/go-rod/rod/lib/defaults We use "Must" prefixed functions to write example code. But in production you may want to use the no-prefix version of them. About why we use "Must" as the prefix, it's similar to https://golang.org/pkg/regexp/#MustCompile Shows how to share a remote object reference between two Eval Shows how to listen for events. Shows how to intercept requests and modify both the request and the response. The entire process of hijacking one request: The --req-> and --res-> are the parts that can be modified. Show how to handle multiple results of an action. Such as when you login a page, the result can be success or wrong password. Example_search shows how to use Search to get element inside nested iframes or shadow DOMs. It works the same as https://developers.google.com/web/tools/chrome-devtools/dom#search Shows how to update the state of the current page. In this example we enable the network domain. Rod uses mouse cursor to simulate clicks, so if a button is moving because of animation, the click may not work as expected. We usually use WaitStable to make sure the target isn't changing anymore. When you want to wait for an ajax request to complete, this example will be useful.

v0.78.5 • 4 years ago

github.com/chirayu/lingua-go

Package lingua accurately detects the natural language of written text, be it long or short. Its task is simple: It tells you which language some provided textual data is written in. This is very useful as a preprocessing step for linguistic data in natural language processing applications such as text classification and spell checking. Other use cases, for instance, might include routing e-mails to the right geographically located customer service department, based on the e-mails' languages. Language detection is often done as part of large machine learning frameworks or natural language processing applications. In cases where you don't need the full-fledged functionality of those systems or don't want to learn the ropes of those, a small flexible library comes in handy. So far, the only other comprehensive open source library in the Go ecosystem for this task is Whatlanggo (https://github.com/abadojack/whatlanggo). Unfortunately, it has two major drawbacks: 1. Detection only works with quite lengthy text fragments. For very short text snippets such as Twitter messages, it does not provide adequate results. 2. The more languages take part in the decision process, the less accurate are the detection results. Lingua aims at eliminating these problems. It nearly does not need any configuration and yields pretty accurate results on both long and short text, even on single words and phrases. It draws on both rule-based and statistical methods but does not use any dictionaries of words. It does not need a connection to any external API or service either. Once the library has been downloaded, it can be used completely offline. Compared to other language detection libraries, Lingua's focus is on quality over quantity, that is, getting detection right for a small set of languages first before adding new ones. Currently, 75 languages are supported. They are listed as variants of type Language. Lingua is able to report accuracy statistics for some bundled test data available for each supported language. The test data for each language is split into three parts: 1. a list of single words with a minimum length of 5 characters 2. a list of word pairs with a minimum length of 10 characters 3. a list of complete grammatical sentences of various lengths Both the language models and the test data have been created from separate documents of the Wortschatz corpora (https://wortschatz.uni-leipzig.de) offered by Leipzig University, Germany. Data crawled from various news websites have been used for training, each corpus comprising one million sentences. For testing, corpora made of arbitrarily chosen websites have been used, each comprising ten thousand sentences. From each test corpus, a random unsorted subset of 1000 single words, 1000 word pairs and 1000 sentences has been extracted, respectively. Given the generated test data, I have compared the detection results of Lingua, and Whatlanggo running over the data of Lingua's supported 75 languages. Additionally, I have added Google's CLD3 (https://github.com/google/cld3/) to the comparison with the help of the gocld3 bindings (https://github.com/jmhodges/gocld3). Languages that are not supported by CLD3 or Whatlanggo are simply ignored during the detection process. The bar and box plots (https://github.com/pemistahl/lingua-go/blob/main/ACCURACY_PLOTS.md) show the measured accuracy values for all three performed tasks: Single word detection, word pair detection and sentence detection. Lingua clearly outperforms its contenders. Detailed statistics including mean, median and standard deviation values for each language and classifier are available in tabular form (https://github.com/pemistahl/lingua-go/blob/main/ACCURACY_TABLE.md) as well. Every language detector uses a probabilistic n-gram (https://en.wikipedia.org/wiki/N-gram) model trained on the character distribution in some training corpus. Most libraries only use n-grams of size 3 (trigrams) which is satisfactory for detecting the language of longer text fragments consisting of multiple sentences. For short phrases or single words, however, trigrams are not enough. The shorter the input text is, the less n-grams are available. The probabilities estimated from such few n-grams are not reliable. This is why Lingua makes use of n-grams of sizes 1 up to 5 which results in much more accurate prediction of the correct language. A second important difference is that Lingua does not only use such a statistical model, but also a rule-based engine. This engine first determines the alphabet of the input text and searches for characters which are unique in one or more languages. If exactly one language can be reliably chosen this way, the statistical model is not necessary anymore. In any case, the rule-based engine filters out languages that do not satisfy the conditions of the input text. Only then, in a second step, the probabilistic n-gram model is taken into consideration. This makes sense because loading less language models means less memory consumption and better runtime performance. In general, it is always a good idea to restrict the set of languages to be considered in the classification process using the respective api methods. If you know beforehand that certain languages are never to occur in an input text, do not let those take part in the classifcation process. The filtering mechanism of the rule-based engine is quite good, however, filtering based on your own knowledge of the input text is always preferable. There might be classification tasks where you know beforehand that your language data is definitely not written in Latin, for instance. The detection accuracy can become better in such cases if you exclude certain languages from the decision process or just explicitly include relevant languages. Knowing about the most likely language is nice but how reliable is the computed likelihood? And how less likely are the other examined languages in comparison to the most likely one? In the example below, a slice of ConfidenceValue is returned containing all possible languages sorted by their confidence value in descending order. The values that this method computes are part of a relative confidence metric, not of an absolute one. Each value is a number between 0.0 and 1.0. The most likely language is always returned with value 1.0. All other languages get values assigned which are lower than 1.0, denoting how less likely those languages are in comparison to the most likely language. The slice returned by this method does not necessarily contain all languages which the calling instance of LanguageDetector was built from. If the rule-based engine decides that a specific language is truly impossible, then it will not be part of the returned slice. Likewise, if no ngram probabilities can be found within the detector's languages for the given input text, the returned slice will be empty. The confidence value for each language not being part of the returned slice is assumed to be 0.0. By default, Lingua uses lazy-loading to load only those language models on demand which are considered relevant by the rule-based filter engine. For web services, for instance, it is rather beneficial to preload all language models into memory to avoid unexpected latency while waiting for the service response. If you want to enable the eager-loading mode, you can do it as seen below. Multiple instances of LanguageDetector share the same language models in memory which are accessed asynchronously by the instances. By default, Lingua returns the most likely language for a given input text. However, there are certain words that are spelled the same in more than one language. The word `prologue`, for instance, is both a valid English and French word. Lingua would output either English or French which might be wrong in the given context. For cases like that, it is possible to specify a minimum relative distance that the logarithmized and summed up probabilities for each possible language have to satisfy. It can be stated as seen below. Be aware that the distance between the language probabilities is dependent on the length of the input text. The longer the input text, the larger the distance between the languages. So if you want to classify very short text phrases, do not set the minimum relative distance too high. Otherwise Unknown will be returned most of the time as in the example below. This is the return value for cases where language detection is not reliably possible.

v1.0.0 • 3 years ago

github.com/jasdel/aws-sdk-go-v2/service/resourcegroupstaggingapi

Package resourcegroupstaggingapi provides the client and types for making API requests to AWS Resource Groups Tagging API. This guide describes the API operations for the resource groups tagging. A tag is a label that you assign to an AWS resource. A tag consists of a key and a value, both of which you define. For example, if you have two Amazon EC2 instances, you might assign both a tag key of "Stack." But the value of "Stack" might be "Testing" for one and "Production" for the other. Tagging can help you organize your resources and enables you to simplify resource management, access management and cost allocation. For more information about tagging, see Working with Tag Editor (http://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/tag-editor.html) and Working with Resource Groups (http://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/resource-groups.html). For more information about permissions you need to use the resource groups tagging APIs, see Obtaining Permissions for Resource Groups (http://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/obtaining-permissions-for-resource-groups.html) and Obtaining Permissions for Tagging (http://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/obtaining-permissions-for-tagging.html). You can use the resource groups tagging APIs to complete the following tasks: Tag and untag supported resources located in the specified region for the AWS account Use tag-based filters to search for resources located in the specified region for the AWS account List all existing tag keys in the specified region for the AWS account List all existing values for the specified key in the specified region for the AWS account Not all resources can have tags. For a lists of resources that you can tag, see Supported Resources (http://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/supported-resources.html) in the AWS Resource Groups and Tag Editor User Guide. To make full use of the resource groups tagging APIs, you might need additional IAM permissions, including permission to access the resources of individual services as well as permission to view and apply tags to those resources. For more information, see Obtaining Permissions for Tagging (http://docs.aws.amazon.com/awsconsolehelpdocs/latest/gsg/obtaining-permissions-for-tagging.html) in the AWS Resource Groups and Tag Editor User Guide. See https://docs.aws.amazon.com/goto/WebAPI/resourcegroupstaggingapi-2017-01-26 for more information on this service. See resourcegroupstaggingapi package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/resourcegroupstaggingapi/ To AWS Resource Groups Tagging API with the SDK use the New function to create a new service client. With that client you can make API requests to the service. These clients are safe to use concurrently. See the SDK's documentation for more information on how to use the SDK. https://docs.aws.amazon.com/sdk-for-go/api/ See aws.Config documentation for more information on configuring SDK clients. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config See the AWS Resource Groups Tagging API client ResourceGroupsTaggingAPI for more information on creating client for this service. https://docs.aws.amazon.com/sdk-for-go/api/service/resourcegroupstaggingapi/#New

v2.0.0-preview.1+incompatible • 2 years ago

github.com/ranjib/goiardi

Goiardi is an implementation of the Chef server (http://www.opscode.com) written in Go. It can either run entirely in memory with the option to save and load the in-memory data and search indexes to and from disk, drawing inspiration from chef-zero, or it can use MySQL or PostgreSQL as its storage backend. Like all software, it is a work in progress. Goiardi now, though, should have all the functionality of the open source Chef Server, plus some extras like reporting and event logging. It does not support other Enterprise Chef type features like organizations or pushy at this time. When used, knife works, and chef-client runs complete successfully. Almost all chef-pendant tests successfully successfully run, with a few disagreements about error messages that don't impact the clients. It does pretty well against the official chef-pedant, but because goiardi handles some authentication matters a little differently than the official chef-server, there is also a fork of chef-pedant located at https://github.com/ctdk/chef-pedant that's more custom tailored to goiardi. Many go tests are present as well in different goiardi subdirectories. Goiardi currently has eight dependencies: go-flags, go-cache, go-trie, toml, the mysql driver from go-sql-driver, the postgres driver, logger, and go-uuid. To install them, run: from your $GOROOT, or just use the -t flag when you go get goiardi. If you would like to modify the search grammar, you'll need the 'peg' package. To install that, run In the 'search/' directory, run 'peg -switch -inline search-parse.peg' to generate the new grammar. If you don't plan on editing the search grammar, though, you won't need that. 1. Install go. (http://golang.org/doc/install.html) You may need to upgrade to go 1.2 to compile all the dependencies. Go 1.3 is also confirmed to work. 2. Make sure your $GOROOT and PATH are set up correctly per the Go installation instructions. 3. Download goairdi and its dependencies. 4. Run tests, if desired. Several goiardi subdirectories have go tests, and chef-pedant can and should be used for testing goiardi as well. 5. Install the goiardi binaries. 6. Run goiardi. Or, you can look at the goiardi releases page on github at https://github.com/ctdk/goiardi/releases and see if there are precompiled binaries available for your platform. You can get a list of command-line options with the '-h' flag. Goiardi can also take a config file, run like goiardi -c /path/to/conf-file. See etc/goiardi.conf-sample for an example documented configuration file. Options in the configuration file share the same name as the long command line arguments (so, for example, --ipaddress=127.0.0.1 on the command line would be ipaddress = "127.0.0.1" in the config file. Currently available command line and config file options: For more documentation on Chef, see http://docs.opscode.com. If goiardi is not running in use-auth mode, it does not actually care about .pem files at all. You still need to have one to keep knife and chef-client happy. It's like chef-zero in that regard. If goiardi is running in use-auth mode, then proper keys are needed. When goiardi is started, if the chef-webui and chef-validator clients, and the admin user, are not present, it will create new keys in the --conf-root directory. Use them as you would normally for validating clients, performing tasks with the admin user, or using chef-webui if webui will run in front of goiardi. In auth mode, goiardi supports versions 1.0, 1.1, and 1.2 of the Chef authentication protocol. *Note:* The admin user, when created on startup, does not have a password. This prevents logging in to the webui with the admin user, so a password will have to be set for admin before doing so. By default, goiardi logs to standard output. A log file may be specified with the `-L/--log-file` flag, or goiardi can log to syslog with the `-s/--syslog` flag on platforms that support syslog. Attempting to use syslog on one of these platforms (currently Windows and plan9 (although plan9 doesn't build for other reasons)) will result in an error. Log levels can be set in goiardi with either the `log-level` option in the configuration file, or with one to four -V flags on the command line. Log level options are "debug", "info", "warning", "error", and "critical". More -V on the command line means more spewing into the log. Goiardi can now use MySQL to store its data, instead of keeping all its data in memory (and optionally freezing its data to disk for persistence). If you want to use MySQL, you (unsurprisingly) need a MySQL installation that goiardi can access. This document assumes that you are able to install, configure, and run MySQL. Once the MySQL server is set up to your satisfaction, you'll need to install sqitch to deploy the schema, and any changes to the database schema that may come along later. It can be installed out of CPAN or homebrew; see "Installation" on http://sqitch.org for details. The sqitch MySQL tutorial at https://metacpan.org/pod/sqitchtutorial-mysql explains how to deploy, verify, and revert changes to the database with sqitch, but the basic steps to deploy the schema are: * Create goiardi's database: `mysql -u root --execute 'CREATE DATABASE goiardi'` * Optionally, create a separate mysql user for goiardi and give it permissions on that database. * In sql-files/mysql-bundle, deploy the bundle: `sqitch deploy db:mysql://root[:<password>]@/goiardi` To update an existing database deployed by sqitch, run the `sqitch deploy` command above again. If you really really don't want to install sqitch, apply each SQL patch in sql-files/mysql-bundle by hand in the same order they're listed in the sqitch.plan file. The above values are for illustration, of course; nothing requires goiardi's database to be named "goiardi". Just make sure the right database is specified in the config file. Set `use-mysql = true` in the configuration file, or specify `--use-mysql` on the command line. It is an error to specify both the `-D`/`--data-file` flag and `--use-mysql` at the same time. At this time, the mysql connection options have to be defined in the config file. An example configuration is available in `etc/goiardi.conf-sample`, and is given below: Goiardi can also use Postgres as a backend for storing its data, instead of using MySQL or the in-memory data store. The overall procedure is pretty similar to setting up goiardi to use MySQL. Specifically for Postgres, you may want to create a database especially for goiardi, but it's not mandatory. If you do, you may also want to create a user for it. If you decide to do that: * Create the user: `$ createuser goiardi <additional options>` * Create the database, if you decided to: `$ createdb goiardi_db <additional options>`. If you created a user, make it the owner of the goiardi db with `-O goiardi`. After you've done that, or decided to use an existing database and user, deploy the sqitch bundle in sql-files/postgres-bundle. If you're using the default Postgres user on the local machine, `sqitch deploy db:pg:<dbname>` will be sufficient. Otherwise, the deploy command will be something like `sqitch deploy db:pg://user:password@localhost/goairdi_db`. The Postgres sqitch tutorial at https://metacpan.org/pod/sqitchtutorial explains more about how to use sqitch and Postgres. Set `use-postgresql` in the configuration file, or specify `--use-postgresql` on the command line. It's also an error to specify both `-D`/`--data-file` flag and `--use-postgresql` at the same time like it is in MySQL mode. MySQL and Postgres cannot be used at the same time, either. Like MySQL, the Postgres connection options must be specified in the config file at this time. There is also an example Postgres configuration in the config file, and can be seen below: Goiardi has optional event logging. When enabled with the `--log-events` command line option, or with the `"log-events"` option in the config file, changes to clients, users, cookbooks, data bags, environments, nodes, and roles will be tracked. The event log can be viewed through the /events API endpoint. If the `-K`/`--log-event-keep` option is set, then once a minute the event log will be automatically purged, leaving that many events in the log. This is particularly recommended when using the event log in in-memory mode. The event API endpoints work as follows: A user or client must be an administrator account to use the `/events` endpoint. The data returned by an event should look something like this: The easiest way to use the event log is with the knife-goiardi-event-log knife plugin. It's available on rubygems, or at github at https://github.com/ctdk/knife-goiardi-event-log. Goiardi now supports, on an experimental basis, Chef's reporting facilities. Nothing needs to be enabled in goiardi to use this, but changes are required with the client. See http://docs.opscode.com/reporting.html for details on how to enable reporting and how to use it. There is a goiardi extension to reporting: a "status" query parameter may be passed in a GET request that lists reports to limit the reports returned to ones that match the status, so you can read only reports of chef runs that were successful, failed, or started but haven't completed yet. Valid values for the "status" parameter are "started", "success", and "failure". To use reporting, you'll either need the Chef knife-reporting plugin, or use the knife-goiardi-reporting plugin that supports querying runs by status. It's available on rubygems, or on github at https://github.com/ctdk/knife-goiardi-reporting. As this is an experimental feature, it may not work entirely correctly. Bug reports are appreciated. Goiardi can now import and export its data in a JSON file. This can help both when upgrading, when the on-disk data format changes between releases, and to convert your goiardi installation from in-memory to MySQL (or vice versa). The JSON file has a version number set (currently 1.0), so that in the future if there is some sort of incompatible change to the JSON file format the importer will be able to handle it. Before importing data, you should back up any existing data and index files (and take a snapshot of the SQL db, if applicable) if there's any reason you might want it around later. After exporting, you may wish to hold on to the old installation data until you're satisfied that the import went well. Remember that the JSON export file contains the client and user public keys (which for the purposes of goiardi and chef are private) and the user hashed passwords and password salts. The export file should be guarded closely. The `-x/--export` and `-m/--import` flags control importing and exporting data. To export data, stop goiardi, then run it again with the same options as before but adding `-x <filename>` to the command. This will export all the data to the given filename, and goiardi will exit. Importing is ever so slightly trickier. You should remove any existing data store and index files, and if using an SQL database use sqitch to revert and deploy all of the SQL files to set up a completely clean schema for goiardi. Then run goiardi with the new options like you normally would, but add `-m <filename>`. Goiardi will run, import the new data, and exit. Assuming it went well, the data will be all imported. The export dump does not contain the user and client .pem files, so those will need to be saved and moved as needed. Theoretically a properly crafted export file could be used to do bulk loading of data into goiardi, thus goiardi does not wipe out the existing data on its own but rather leaves that task to the administrator. This functionality is merely theoretical and completely untested. If you try it, you should back your data up first. Goiardi has been built and run with the native 6g compiler on Mac OS X (10.7, 10.8, and 10.9), Debian squeeze and wheezy, a fairly recent Arch Linux, FreeBSD 9.2, and Solaris. Using Go's cross compiling capabilities, goiardi builds for all of Go's supported platforms except Dragonfly BSD and plan9 (because of issues with the postgres client library). Windows support has not been tested extensively, but a cross compiled binary has been tested successfully on Windows. Goiardi has also been built and run with gccgo (using the "-compiler gccgo" option with the "go" command) on Arch Linux. Building it with gccgo without the go command probably works, but it hasn't happened yet. This is a priority, though, so goiardi can be built on platforms the native compiler doesn't support yet. As mentioned above, goiardi can now freeze its in-memory data store and index to disk if specified. It will save before quitting if the program receives a SIGTERM or SIGINT signal, along with saving every "freeze-interval" seconds automatically. Saving automatically helps guard against the case where the server receives a signal that it can't handle and forces it to quit. In addition, goiardi will not replace the old save files until the new one is all finished writing. However, it's still not anywhere near a real database with transaction protection, etc., so while it should work fine in the general case, possibilities for data loss and corruption do exist. The appropriate caution is warranted. In addition to the aforementioned Chef documentation at http://docs.opscode.com, more documentation specific to goiardi can be viewed with godoc. See http://godoc.org/code.google.com/p/go.tools/cmd/godoc for an explanation of how godoc works. See the TODO file for an up-to-date list of what needs to be done. There's a lot. There's going to be a lot of these for a while, so we'll just keep those in a BUGS file, won't we? This started as a project to learn Go, and because I thought that an in memory chef server would be handy. Then I found out about chef-zero, but I still wanted a project to learn Go, so I kept it up. Chef 11 Server also only runs under Linux at this time, while Goiardi is developed under Mac OS X and ought to run under any platform Go supports (only partially at this time though). If you feel like contributing, great! Just fork the repo, make your improvements, and submit a pull request. Tests would, of course, be appreciated. Adding tests where there are no tests currently would be even more appreciated. At least, though, try and not break anything worse than it is. Test coverage has improved, but is still an ongoing concern. Goiardi is authored and copyright (c) Jeremy Bingham, 2013. Like many Chef ecosystem programs, goairdi is licensed under the Apache 2.0 License. See the LICENSE file for details. Chef is copyright (c) 2008-2013 Opscode, Inc. and its various contributors. Thanks go out to the fine folks of Opscode and the Chef community for all their hard work. Also, if you were wondering, Ettore Boiardi was the man behind Chef Boyardee. Wakka wakka.

v0.6.0 • 10 years ago

github.com/ryanbressler/cloudforest

Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include: Breiman and Cutler's Random Forest for Classification and Regression Adaptive Boosting (AdaBoost) Classification Gradiant Boosting Tree Regression Entropy and Cost driven classification L1 regression Feature selection with artificial contrasts Proximity and model structure analysis Roughly balanced bagging for unbalanced classification The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and can not yet be released. Library Documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/ryanbressler/CloudForest Documentation of command line utilities and file formats can be found in README.md, which can be viewed fromated on github: http://github.com/ryanbressler/CloudForest Pull requests and bug reports are welcome. CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi threaded and cluster environments in mind. Go's support for function types is used to provide a interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method: This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects. Decision tree's are grown with the goal of reducing "Impurity" which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface which allows for alternative definitions of impurity. CloudForest includes several alternative targets: Additional targets can be stacked on top of these target to add boosting functionality: Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In go, slices pointing at the same underlying arrays make this sort of optimization transparent. For example a function like: can return left and right slices that point to the same underlying array as the original slice of cases but these slices should not have their values changed. Functions used while searching for the best split also accepts pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items and its use can be seen in functions like: For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit which relies on go's sorting package. Training a Random forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as the are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, distributed database etc. By default cloud forest uses a fast heuristic for missing values. When proposing a split on a feature with missing data the missing cases are removed and the impurity value is corrected to use three way impurity which reduces the bias towards features with lots of missing data: Missing values in the target variable are left out of impurity calculations. This provided generally good results at a fraction of the computational costs of imputing data. Optionally, feature.ImputeMissing or featurematrixImputeMissing can be called before forest growth to impute missing values to the feature mean/mode which Brieman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity weighted imputation Brieman describes. Experimental support is provided for 3 way splitting which splits missing cases onto a third branch. [2] This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3] [1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1 [2] https://code.google.com/p/rf-ace/ [3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record In CloudForest data is stored using the FeatureMatrix struct which contains Features. The Feature struct implements storage and methods for both categorical and numerical data and calculations of impurity etc and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable. This allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow implements Brieman and Cutler's method (see extract above) for growing a tree. A GrowForest method is also provided that implements the rest of the method including sampling cases but it may be faster to grow the forest to disk as in the growforest utility. Prediction and Voting is done using Tree.Vote and CatBallotBox and NumBallotBox which implement the VoteTallyer interface.

v0.0.0-20220205065429-8f151e494fd2 • 2 years ago

xn--rhta.xyz/pingcap/errors

github.com/jmuders/freegeoip

github.com/weatherapicom/go

github.com/CandisIO/enmime

gitlab.com/MatteoCampinoti94/faapi-go

github.com/kilicoglutuncay/elasticlient

github.com/anticycle/anticycle

github.com/improbable-io/go-github

github.com/congop/execstub

xgit.xiaoniangao.cn/xngo/service/xng-search-gateway-api

github.com/jdbranham/esquery

github.com/dcormier/enmime

github.com/thiagozs/freegeoip

github.com/cgroschupp/enmime

github.com/wouafff/nationbuilder

github.com/DexterLB/osdb

github.com/gmccue/go-epa-echo

github.com/pritoj/enmime

github.com/bakanis/freegeoip

github.com/danilodvaz/lingua-go/v2

github.com/ScaleChamp/rod

github.com/circlingthesun/freegeoip

github.com/golearn/cloudforest

github.com/tanaikech/ggsrun

github.com/gopher-johns/enmime

github.com/candisio/enmime

github.com/allenluce/enmime

github.com/insticator/freegeoip

github.com/derktam/enmime

github.com/mrz1836/go-pipl

github.com/xpcmdshell/pipl

github.com/evorts/rod

github.com/overvenus/errors

github.com/devincarr/goarxiv

github.com/chirayu/lingua-go

github.com/jasdel/aws-sdk-go-v2/service/resourcegroupstaggingapi

github.com/msbarry/freegeoip

bmstu.codes/developers34/SBWeb

github.com/beeketing/scrape

github.com/ranjib/goiardi

github.com/xiongjiwei/errors

github.com/carrafa/foursquarego

github.com/cseeger-epages/i-doit-go-api

github.com/ryanbressler/cloudforest

github.com/jdeng/enmime

github.com/h2object/scrape

github.com/deface90/sqlittle

github.com/hackborn/sqlittle

github.com/jasperlabs/go-timezones

github.com/DiscoRiver/dohyo