Package validator implements value validations for structs and individual fields based on tags. It can also handle Cross-Field and Cross-Struct validation for nested structs and has the ability to dive into arrays and maps of any type. see more examples https://github.com/go-playground/validator/tree/master/_examples Validator is designed to be thread-safe and used as a singleton instance. It caches information about your struct and validations, in essence only parsing your validation tags once per struct type. Using multiple instances neglects the benefit of caching. The not thread-safe functions are explicitly marked as such in the documentation. Doing things this way is actually the way the standard library does, see the file.Open method here: The authors return type "error" to avoid the issue discussed in the following, where err is always != nil: Validator only InvalidValidationError for bad validation input, nil or ValidationErrors as type error; so, in your code all you need to do is check if the error returned is not nil, and if it's not check if error is InvalidValidationError ( if necessary, most of the time it isn't ) type cast it to type ValidationErrors like so err.(validator.ValidationErrors). Custom Validation functions can be added. Example: Cross-Field Validation can be done via the following tags: If, however, some custom cross-field validation is required, it can be done using a custom validation. Why not just have cross-fields validation tags (i.e. only eqcsfield and not eqfield)? The reason is efficiency. If you want to check a field within the same struct "eqfield" only has to find the field on the same struct (1 level). But, if we used "eqcsfield" it could be multiple levels down. Example: Multiple validators on a field will process in the order defined. Example: Bad Validator definitions are not handled by the library. Example: Baked In Cross-Field validation only compares fields on the same struct. If Cross-Field + Cross-Struct validation is needed you should implement your own custom validator. Comma (",") is the default separator of validation tags. If you wish to have a comma included within the parameter (i.e. excludesall=,) you will need to use the UTF-8 hex representation 0x2C, which is replaced in the code as a comma, so the above will become excludesall=0x2C. Pipe ("|") is the 'or' validation tags deparator. If you wish to have a pipe included within the parameter i.e. excludesall=| you will need to use the UTF-8 hex representation 0x7C, which is replaced in the code as a pipe, so the above will become excludesall=0x7C Here is a list of the current built in validators: Tells the validation to skip this struct field; this is particularly handy in ignoring embedded structs from being validated. (Usage: -) This is the 'or' operator allowing multiple validators to be used and accepted. (Usage: rgb|rgba) <-- this would allow either rgb or rgba colors to be accepted. This can also be combined with 'and' for example ( Usage: omitempty,rgb|rgba) When a field that is a nested struct is encountered, and contains this flag any validation on the nested struct will be run, but none of the nested struct fields will be validated. This is useful if inside of your program you know the struct will be valid, but need to verify it has been assigned. NOTE: only "required" and "omitempty" can be used on a struct itself. Same as structonly tag except that any struct level validations will not run. Allows conditional validation, for example if a field is not set with a value (Determined by the "required" validator) then other validation such as min or max won't run, but if a value is set validation will run. Allows to skip the validation if the value is nil (same as omitempty, but only for the nil-values). This tells the validator to dive into a slice, array or map and validate that level of the slice, array or map with the validation tags that follow. Multidimensional nesting is also supported, each level you wish to dive will require another dive tag. dive has some sub-tags, 'keys' & 'endkeys', please see the Keys & EndKeys section just below. Example #1 Example #2 Keys & EndKeys These are to be used together directly after the dive tag and tells the validator that anything between 'keys' and 'endkeys' applies to the keys of a map and not the values; think of it like the 'dive' tag, but for map keys instead of values. Multidimensional nesting is also supported, each level you wish to validate will require another 'keys' and 'endkeys' tag. These tags are only valid for maps. Example #1 Example #2 This validates that the value is not the data types default zero value. For numbers ensures value is not zero. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value when using WithRequiredStructEnabled. The field under validation must be present and not empty only if all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value. Examples: The field under validation must be present and not empty unless all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value. Examples: The field under validation must be present and not empty only if any of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value. Examples: The field under validation must be present and not empty only if all of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value. Example: The field under validation must be present and not empty only when any of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value. Examples: The field under validation must be present and not empty only when all of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value. Example: The field under validation must not be present or not empty only if all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value. Examples: The field under validation must not be present or empty unless all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. For structs ensures value is not the zero value. Examples: This validates that the value is the default value and is almost the opposite of required. For numbers, length will ensure that the value is equal to the parameter given. For strings, it checks that the string length is exactly that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, len will ensure that the value is equal to the duration given in the parameter. For numbers, max will ensure that the value is less than or equal to the parameter given. For strings, it checks that the string length is at most that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, max will ensure that the value is less than or equal to the duration given in the parameter. For numbers, min will ensure that the value is greater or equal to the parameter given. For strings, it checks that the string length is at least that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, min will ensure that the value is greater than or equal to the duration given in the parameter. For strings & numbers, eq will ensure that the value is equal to the parameter given. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, eq will ensure that the value is equal to the duration given in the parameter. For strings & numbers, ne will ensure that the value is not equal to the parameter given. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, ne will ensure that the value is not equal to the duration given in the parameter. For strings, ints, and uints, oneof will ensure that the value is one of the values in the parameter. The parameter should be a list of values separated by whitespace. Values may be strings or numbers. To match strings with spaces in them, include the target string between single quotes. For numbers, this will ensure that the value is greater than the parameter given. For strings, it checks that the string length is greater than that number of characters. For slices, arrays and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than time.Now.UTC(). Example #3 (time.Duration) For time.Duration, gt will ensure that the value is greater than the duration given in the parameter. Same as 'min' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than or equal to time.Now.UTC(). Example #3 (time.Duration) For time.Duration, gte will ensure that the value is greater than or equal to the duration given in the parameter. For numbers, this will ensure that the value is less than the parameter given. For strings, it checks that the string length is less than that number of characters. For slices, arrays, and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than time.Now.UTC(). Example #3 (time.Duration) For time.Duration, lt will ensure that the value is less than the duration given in the parameter. Same as 'max' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than or equal to time.Now.UTC(). Example #3 (time.Duration) For time.Duration, lte will ensure that the value is less than or equal to the duration given in the parameter. This will validate the field value against another fields value either within a struct or passed in field. Example #1: Example #2: Field Equals Another Field (relative) This does the same as eqfield except that it validates the field provided relative to the top level struct. This will validate the field value against another fields value either within a struct or passed in field. Examples: Field Does Not Equal Another Field (relative) This does the same as nefield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtfield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtefield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltfield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltefield except that it validates the field provided relative to the top level struct. This does the same as contains except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. This does the same as excludes except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. For arrays & slices, unique will ensure that there are no duplicates. For maps, unique will ensure that there are no duplicate values. For slices of struct, unique will ensure that there are no duplicate values in a field of the struct specified via a parameter. This validates that a string value contains ASCII alpha characters only This validates that a string value contains ASCII alphanumeric characters only This validates that a string value contains unicode alpha characters only This validates that a string value contains unicode alphanumeric characters only This validates that a string value can successfully be parsed into a boolean with strconv.ParseBool This validates that a string value contains number values only. For integers or float it returns true. This validates that a string value contains a basic numeric value. basic excludes exponents etc... for integers or float it returns true. This validates that a string value contains a valid hexadecimal. This validates that a string value contains a valid hex color including hashtag (#) This validates that a string value contains only lowercase characters. An empty string is not a valid lowercase string. This validates that a string value contains only uppercase characters. An empty string is not a valid uppercase string. This validates that a string value contains a valid rgb color This validates that a string value contains a valid rgba color This validates that a string value contains a valid hsl color This validates that a string value contains a valid hsla color This validates that a string value contains a valid E.164 Phone number https://en.wikipedia.org/wiki/E.164 (ex. +1123456789) This validates that a string value contains a valid email This may not conform to all possibilities of any rfc standard, but neither does any email provider accept all possibilities. This validates that a string value is valid JSON This validates that a string value is a valid JWT This validates that a string value contains a valid file path and that the file exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid file path and that the file exists on the machine and is an image. This is done using os.Stat and github.com/gabriel-vasile/mimetype This validates that a string value contains a valid file path but does not validate the existence of that file. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid url This will accept any url the golang request uri accepts but must contain a schema for example http:// or rtmp:// This validates that a string value contains a valid uri This will accept any uri the golang request uri accepts This validataes that a string value contains a valid URN according to the RFC 2141 spec. This validates that a string value contains a valid base64 value. Although an empty string is valid base64 this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value according the RFC4648 spec. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value, but without = padding, according the RFC4648 spec, section 3.2. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid bitcoin address. The format of the string is checked to ensure it matches one of the three formats P2PKH, P2SH and performs checksum validation. Bitcoin Bech32 Address (segwit) This validates that a string value contains a valid bitcoin Bech32 address as defined by bip-0173 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki) Special thanks to Pieter Wuille for providng reference implementations. This validates that a string value contains a valid ethereum address. The format of the string is checked to ensure it matches the standard Ethereum address format. This validates that a string value contains the substring value. This validates that a string value contains any Unicode code points in the substring value. This validates that a string value contains the supplied rune value. This validates that a string value does not contain the substring value. This validates that a string value does not contain any Unicode code points in the substring value. This validates that a string value does not contain the supplied rune value. This validates that a string value starts with the supplied string value This validates that a string value ends with the supplied string value This validates that a string value does not start with the supplied string value This validates that a string value does not end with the supplied string value This validates that a string value contains a valid isbn10 or isbn13 value. This validates that a string value contains a valid isbn10 value. This validates that a string value contains a valid isbn13 value. This validates that a string value contains a valid UUID. Uppercase UUID values will not pass - use `uuid_rfc4122` instead. This validates that a string value contains a valid version 3 UUID. Uppercase UUID values will not pass - use `uuid3_rfc4122` instead. This validates that a string value contains a valid version 4 UUID. Uppercase UUID values will not pass - use `uuid4_rfc4122` instead. This validates that a string value contains a valid version 5 UUID. Uppercase UUID values will not pass - use `uuid5_rfc4122` instead. This validates that a string value contains a valid ULID value. This validates that a string value contains only ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains only printable ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains one or more multibyte characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains a valid DataURI. NOTE: this will also validate that the data portion is valid base64 This validates that a string value contains a valid latitude. This validates that a string value contains a valid longitude. This validates that a string value contains a valid U.S. Social Security Number. This validates that a string value contains a valid IP Address. This validates that a string value contains a valid v4 IP Address. This validates that a string value contains a valid v6 IP Address. This validates that a string value contains a valid CIDR Address. This validates that a string value contains a valid v4 CIDR Address. This validates that a string value contains a valid v6 CIDR Address. This validates that a string value contains a valid resolvable TCP Address. This validates that a string value contains a valid resolvable v4 TCP Address. This validates that a string value contains a valid resolvable v6 TCP Address. This validates that a string value contains a valid resolvable UDP Address. This validates that a string value contains a valid resolvable v4 UDP Address. This validates that a string value contains a valid resolvable v6 UDP Address. This validates that a string value contains a valid resolvable IP Address. This validates that a string value contains a valid resolvable v4 IP Address. This validates that a string value contains a valid resolvable v6 IP Address. This validates that a string value contains a valid Unix Address. This validates that a string value contains a valid MAC Address. Note: See Go's ParseMAC for accepted formats and types: This validates that a string value is a valid Hostname according to RFC 952 https://tools.ietf.org/html/rfc952 This validates that a string value is a valid Hostname according to RFC 1123 https://tools.ietf.org/html/rfc1123 Full Qualified Domain Name (FQDN) This validates that a string value contains a valid FQDN. This validates that a string value appears to be an HTML element tag including those described at https://developer.mozilla.org/en-US/docs/Web/HTML/Element This validates that a string value is a proper character reference in decimal or hexadecimal format This validates that a string value is percent-encoded (URL encoded) according to https://tools.ietf.org/html/rfc3986#section-2.1 This validates that a string value contains a valid directory and that it exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid directory but does not validate the existence of that directory. This is done using os.Stat, which is a platform independent function. It is safest to suffix the string with os.PathSeparator if the directory may not exist at the time of validation. This validates that a string value contains a valid DNS hostname and port that can be used to valiate fields typically passed to sockets and connections. This validates that a string value is a valid datetime based on the supplied datetime format. Supplied format must match the official Go time format layout as documented in https://golang.org/pkg/time/ This validates that a string value is a valid country code based on iso3166-1 alpha-2 standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid country code based on iso3166-1 alpha-3 standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid country code based on iso3166-1 alpha-numeric standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid BCP 47 language tag, as parsed by language.Parse. More information on https://pkg.go.dev/golang.org/x/text/language BIC (SWIFT code) This validates that a string value is a valid Business Identifier Code (SWIFT code), defined in ISO 9362. More information on https://www.iso.org/standard/60390.html This validates that a string value is a valid dns RFC 1035 label, defined in RFC 1035. More information on https://datatracker.ietf.org/doc/html/rfc1035 This validates that a string value is a valid time zone based on the time zone database present on the system. Although empty value and Local value are allowed by time.LoadLocation golang function, they are not allowed by this validator. More information on https://golang.org/pkg/time/#LoadLocation This validates that a string value is a valid semver version, defined in Semantic Versioning 2.0.0. More information on https://semver.org/ This validates that a string value is a valid cve id, defined in cve mitre. More information on https://cve.mitre.org/ This validates that a string value contains a valid credit card number using Luhn algorithm. This validates that a string or (u)int value contains a valid checksum using the Luhn algorithm. This validates that a string is a valid 24 character hexadecimal string. This validates that a string value contains a valid cron expression. This validates that a string is valid for use with SpiceDb for the indicated purpose. If no purpose is given, a purpose of 'id' is assumed. Alias Validators and Tags NOTE: When returning an error, the tag returned in "FieldError" will be the alias tag unless the dive tag is part of the alias. Everything after the dive tag is not reported as the alias tag. Also, the "ActualTag" in the before case will be the actual tag within the alias that failed. Here is a list of the current built in alias tags: Validator notes: A collection of validation rules that are frequently needed but are more complex than the ones found in the baked in validators. A non standard validator must be registered manually like you would with your own custom validation functions. Example of registration and use: Here is a list of the current non standard validators: This package panics when bad input is provided, this is by design, bad code like that should not make it to production.
Package validator implements value validations for structs and individual fields based on tags. It can also handle Cross-Field and Cross-Struct validation for nested structs and has the ability to dive into arrays and maps of any type. see more examples https://github.com/go-playground/validator/tree/v9/_examples Doing things this way is actually the way the standard library does, see the file.Open method here: The authors return type "error" to avoid the issue discussed in the following, where err is always != nil: Validator only InvalidValidationError for bad validation input, nil or ValidationErrors as type error; so, in your code all you need to do is check if the error returned is not nil, and if it's not check if error is InvalidValidationError ( if necessary, most of the time it isn't ) type cast it to type ValidationErrors like so err.(validator.ValidationErrors). Custom Validation functions can be added. Example: Cross-Field Validation can be done via the following tags: If, however, some custom cross-field validation is required, it can be done using a custom validation. Why not just have cross-fields validation tags (i.e. only eqcsfield and not eqfield)? The reason is efficiency. If you want to check a field within the same struct "eqfield" only has to find the field on the same struct (1 level). But, if we used "eqcsfield" it could be multiple levels down. Example: Multiple validators on a field will process in the order defined. Example: Bad Validator definitions are not handled by the library. Example: Baked In Cross-Field validation only compares fields on the same struct. If Cross-Field + Cross-Struct validation is needed you should implement your own custom validator. Comma (",") is the default separator of validation tags. If you wish to have a comma included within the parameter (i.e. excludesall=,) you will need to use the UTF-8 hex representation 0x2C, which is replaced in the code as a comma, so the above will become excludesall=0x2C. Pipe ("|") is the 'or' validation tags deparator. If you wish to have a pipe included within the parameter i.e. excludesall=| you will need to use the UTF-8 hex representation 0x7C, which is replaced in the code as a pipe, so the above will become excludesall=0x7C Here is a list of the current built in validators: Tells the validation to skip this struct field; this is particularly handy in ignoring embedded structs from being validated. (Usage: -) This is the 'or' operator allowing multiple validators to be used and accepted. (Usage: rbg|rgba) <-- this would allow either rgb or rgba colors to be accepted. This can also be combined with 'and' for example ( Usage: omitempty,rgb|rgba) When a field that is a nested struct is encountered, and contains this flag any validation on the nested struct will be run, but none of the nested struct fields will be validated. This is useful if inside of your program you know the struct will be valid, but need to verify it has been assigned. NOTE: only "required" and "omitempty" can be used on a struct itself. Same as structonly tag except that any struct level validations will not run. Allows conditional validation, for example if a field is not set with a value (Determined by the "required" validator) then other validation such as min or max won't run, but if a value is set validation will run. This tells the validator to dive into a slice, array or map and validate that level of the slice, array or map with the validation tags that follow. Multidimensional nesting is also supported, each level you wish to dive will require another dive tag. dive has some sub-tags, 'keys' & 'endkeys', please see the Keys & EndKeys section just below. Example #1 Example #2 Keys & EndKeys These are to be used together directly after the dive tag and tells the validator that anything between 'keys' and 'endkeys' applies to the keys of a map and not the values; think of it like the 'dive' tag, but for map keys instead of values. Multidimensional nesting is also supported, each level you wish to validate will require another 'keys' and 'endkeys' tag. These tags are only valid for maps. Example #1 Example #2 This validates that the value is not the data types default zero value. For numbers ensures value is not zero. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. The field under validation must be present and not empty only if any of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if all of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must be present and not empty only when any of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only when all of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: This validates that the value is the default value and is almost the opposite of required. For numbers, length will ensure that the value is equal to the parameter given. For strings, it checks that the string length is exactly that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, max will ensure that the value is less than or equal to the parameter given. For strings, it checks that the string length is at most that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, min will ensure that the value is greater or equal to the parameter given. For strings, it checks that the string length is at least that number of characters. For slices, arrays, and maps, validates the number of items. For strings & numbers, eq will ensure that the value is equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings & numbers, ne will ensure that the value is not equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings, ints, and uints, oneof will ensure that the value is one of the values in the parameter. The parameter should be a list of values separated by whitespace. Values may be strings or numbers. For numbers, this will ensure that the value is greater than the parameter given. For strings, it checks that the string length is greater than that number of characters. For slices, arrays and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than time.Now.UTC(). Same as 'min' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than or equal to time.Now.UTC(). For numbers, this will ensure that the value is less than the parameter given. For strings, it checks that the string length is less than that number of characters. For slices, arrays, and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than time.Now.UTC(). Same as 'max' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than or equal to time.Now.UTC(). This will validate the field value against another fields value either within a struct or passed in field. Example #1: Example #2: Field Equals Another Field (relative) This does the same as eqfield except that it validates the field provided relative to the top level struct. This will validate the field value against another fields value either within a struct or passed in field. Examples: Field Does Not Equal Another Field (relative) This does the same as nefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltefield except that it validates the field provided relative to the top level struct. This does the same as contains except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. This does the same as excludes except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. For arrays & slices, unique will ensure that there are no duplicates. For maps, unique will ensure that there are no duplicate values. For slices of struct, unique will ensure that there are no duplicate values in a field of the struct specified via a parameter. This validates that a string value contains ASCII alpha characters only This validates that a string value contains ASCII alphanumeric characters only This validates that a string value contains unicode alpha characters only This validates that a string value contains unicode alphanumeric characters only This validates that a string value contains a basic numeric value. basic excludes exponents etc... for integers or float it returns true. This validates that a string value contains a valid hexadecimal. This validates that a string value contains a valid hex color including hashtag (#) This validates that a string value contains a valid rgb color This validates that a string value contains a valid rgba color This validates that a string value contains a valid hsl color This validates that a string value contains a valid hsla color This validates that a string value contains a valid email This may not conform to all possibilities of any rfc standard, but neither does any email provider accept all possibilities. This validates that a string value contains a valid file path and that the file exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid url This will accept any url the golang request uri accepts but must contain a schema for example http:// or rtmp:// This validates that a string value contains a valid uri This will accept any uri the golang request uri accepts This validataes that a string value contains a valid URN according to the RFC 2141 spec. This validates that a string value contains a valid base64 value. Although an empty string is valid base64 this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value according the the RFC4648 spec. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid bitcoin address. The format of the string is checked to ensure it matches one of the three formats P2PKH, P2SH and performs checksum validation. Bitcoin Bech32 Address (segwit) This validates that a string value contains a valid bitcoin Bech32 address as defined by bip-0173 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki) Special thanks to Pieter Wuille for providng reference implementations. This validates that a string value contains a valid ethereum address. The format of the string is checked to ensure it matches the standard Ethereum address format Full validation is blocked by https://github.com/golang/crypto/pull/28 This validates that a string value contains the substring value. This validates that a string value contains any Unicode code points in the substring value. This validates that a string value contains the supplied rune value. This validates that a string value does not contain the substring value. This validates that a string value does not contain any Unicode code points in the substring value. This validates that a string value does not contain the supplied rune value. This validates that a string value starts with the supplied string value This validates that a string value ends with the supplied string value This validates that a string value contains a valid isbn10 or isbn13 value. This validates that a string value contains a valid isbn10 value. This validates that a string value contains a valid isbn13 value. This validates that a string value contains a valid UUID. Uppercase UUID values will not pass - use `uuid_rfc4122` instead. This validates that a string value contains a valid version 3 UUID. Uppercase UUID values will not pass - use `uuid3_rfc4122` instead. This validates that a string value contains a valid version 4 UUID. Uppercase UUID values will not pass - use `uuid4_rfc4122` instead. This validates that a string value contains a valid version 5 UUID. Uppercase UUID values will not pass - use `uuid5_rfc4122` instead. This validates that a string value contains only ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains only printable ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains one or more multibyte characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains a valid DataURI. NOTE: this will also validate that the data portion is valid base64 This validates that a string value contains a valid latitude. This validates that a string value contains a valid longitude. This validates that a string value contains a valid U.S. Social Security Number. This validates that a string value contains a valid IP Address. This validates that a string value contains a valid v4 IP Address. This validates that a string value contains a valid v6 IP Address. This validates that a string value contains a valid CIDR Address. This validates that a string value contains a valid v4 CIDR Address. This validates that a string value contains a valid v6 CIDR Address. This validates that a string value contains a valid resolvable TCP Address. This validates that a string value contains a valid resolvable v4 TCP Address. This validates that a string value contains a valid resolvable v6 TCP Address. This validates that a string value contains a valid resolvable UDP Address. This validates that a string value contains a valid resolvable v4 UDP Address. This validates that a string value contains a valid resolvable v6 UDP Address. This validates that a string value contains a valid resolvable IP Address. This validates that a string value contains a valid resolvable v4 IP Address. This validates that a string value contains a valid resolvable v6 IP Address. This validates that a string value contains a valid Unix Address. This validates that a string value contains a valid MAC Address. Note: See Go's ParseMAC for accepted formats and types: This validates that a string value is a valid Hostname according to RFC 952 https://tools.ietf.org/html/rfc952 This validates that a string value is a valid Hostname according to RFC 1123 https://tools.ietf.org/html/rfc1123 Full Qualified Domain Name (FQDN) This validates that a string value contains a valid FQDN. This validates that a string value appears to be an HTML element tag including those described at https://developer.mozilla.org/en-US/docs/Web/HTML/Element This validates that a string value is a proper character reference in decimal or hexadecimal format This validates that a string value is percent-encoded (URL encoded) according to https://tools.ietf.org/html/rfc3986#section-2.1 This validates that a string value contains a valid directory and that it exists on the machine. This is done using os.Stat, which is a platform independent function. NOTE: When returning an error, the tag returned in "FieldError" will be the alias tag unless the dive tag is part of the alias. Everything after the dive tag is not reported as the alias tag. Also, the "ActualTag" in the before case will be the actual tag within the alias that failed. Here is a list of the current built in alias tags: Validator notes: A collection of validation rules that are frequently needed but are more complex than the ones found in the baked in validators. A non standard validator must be registered manually like you would with your own custom validation functions. Example of registration and use: Here is a list of the current non standard validators: This package panics when bad input is provided, this is by design, bad code like that should not make it to production.
Package validator implements value validations for structs and individual fields based on tags. It can also handle Cross-Field and Cross-Struct validation for nested structs and has the ability to dive into arrays and maps of any type. see more examples https://github.com/go-playground/validator/tree/master/_examples Validator is designed to be thread-safe and used as a singleton instance. It caches information about your struct and validations, in essence only parsing your validation tags once per struct type. Using multiple instances neglects the benefit of caching. The not thread-safe functions are explicitly marked as such in the documentation. Doing things this way is actually the way the standard library does, see the file.Open method here: The authors return type "error" to avoid the issue discussed in the following, where err is always != nil: Validator only InvalidValidationError for bad validation input, nil or ValidationErrors as type error; so, in your code all you need to do is check if the error returned is not nil, and if it's not check if error is InvalidValidationError ( if necessary, most of the time it isn't ) type cast it to type ValidationErrors like so err.(validator.ValidationErrors). Custom Validation functions can be added. Example: Cross-Field Validation can be done via the following tags: If, however, some custom cross-field validation is required, it can be done using a custom validation. Why not just have cross-fields validation tags (i.e. only eqcsfield and not eqfield)? The reason is efficiency. If you want to check a field within the same struct "eqfield" only has to find the field on the same struct (1 level). But, if we used "eqcsfield" it could be multiple levels down. Example: Multiple validators on a field will process in the order defined. Example: Bad Validator definitions are not handled by the library. Example: Baked In Cross-Field validation only compares fields on the same struct. If Cross-Field + Cross-Struct validation is needed you should implement your own custom validator. Comma (",") is the default separator of validation tags. If you wish to have a comma included within the parameter (i.e. excludesall=,) you will need to use the UTF-8 hex representation 0x2C, which is replaced in the code as a comma, so the above will become excludesall=0x2C. Pipe ("|") is the 'or' validation tags deparator. If you wish to have a pipe included within the parameter i.e. excludesall=| you will need to use the UTF-8 hex representation 0x7C, which is replaced in the code as a pipe, so the above will become excludesall=0x7C Here is a list of the current built in validators: Tells the validation to skip this struct field; this is particularly handy in ignoring embedded structs from being validated. (Usage: -) This is the 'or' operator allowing multiple validators to be used and accepted. (Usage: rgb|rgba) <-- this would allow either rgb or rgba colors to be accepted. This can also be combined with 'and' for example ( Usage: omitempty,rgb|rgba) When a field that is a nested struct is encountered, and contains this flag any validation on the nested struct will be run, but none of the nested struct fields will be validated. This is useful if inside of your program you know the struct will be valid, but need to verify it has been assigned. NOTE: only "required" and "omitempty" can be used on a struct itself. Same as structonly tag except that any struct level validations will not run. Allows conditional validation, for example if a field is not set with a value (Determined by the "required" validator) then other validation such as min or max won't run, but if a value is set validation will run. This tells the validator to dive into a slice, array or map and validate that level of the slice, array or map with the validation tags that follow. Multidimensional nesting is also supported, each level you wish to dive will require another dive tag. dive has some sub-tags, 'keys' & 'endkeys', please see the Keys & EndKeys section just below. Example #1 Example #2 Keys & EndKeys These are to be used together directly after the dive tag and tells the validator that anything between 'keys' and 'endkeys' applies to the keys of a map and not the values; think of it like the 'dive' tag, but for map keys instead of values. Multidimensional nesting is also supported, each level you wish to validate will require another 'keys' and 'endkeys' tag. These tags are only valid for maps. Example #1 Example #2 This validates that the value is not the data types default zero value. For numbers ensures value is not zero. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. The field under validation must be present and not empty only if all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty unless all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if any of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if all of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must be present and not empty only when any of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only when all of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must not be present or not empty only if all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must not be present or empty unless all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: This validates that the value is the default value and is almost the opposite of required. For numbers, length will ensure that the value is equal to the parameter given. For strings, it checks that the string length is exactly that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, len will ensure that the value is equal to the duration given in the parameter. For numbers, max will ensure that the value is less than or equal to the parameter given. For strings, it checks that the string length is at most that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, max will ensure that the value is less than or equal to the duration given in the parameter. For numbers, min will ensure that the value is greater or equal to the parameter given. For strings, it checks that the string length is at least that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, min will ensure that the value is greater than or equal to the duration given in the parameter. For strings & numbers, eq will ensure that the value is equal to the parameter given. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, eq will ensure that the value is equal to the duration given in the parameter. For strings & numbers, ne will ensure that the value is not equal to the parameter given. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, ne will ensure that the value is not equal to the duration given in the parameter. For strings, ints, and uints, oneof will ensure that the value is one of the values in the parameter. The parameter should be a list of values separated by whitespace. Values may be strings or numbers. To match strings with spaces in them, include the target string between single quotes. For numbers, this will ensure that the value is greater than the parameter given. For strings, it checks that the string length is greater than that number of characters. For slices, arrays and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than time.Now.UTC(). Example #3 (time.Duration) For time.Duration, gt will ensure that the value is greater than the duration given in the parameter. Same as 'min' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than or equal to time.Now.UTC(). Example #3 (time.Duration) For time.Duration, gte will ensure that the value is greater than or equal to the duration given in the parameter. For numbers, this will ensure that the value is less than the parameter given. For strings, it checks that the string length is less than that number of characters. For slices, arrays, and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than time.Now.UTC(). Example #3 (time.Duration) For time.Duration, lt will ensure that the value is less than the duration given in the parameter. Same as 'max' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than or equal to time.Now.UTC(). Example #3 (time.Duration) For time.Duration, lte will ensure that the value is less than or equal to the duration given in the parameter. This will validate the field value against another fields value either within a struct or passed in field. Example #1: Example #2: Field Equals Another Field (relative) This does the same as eqfield except that it validates the field provided relative to the top level struct. This will validate the field value against another fields value either within a struct or passed in field. Examples: Field Does Not Equal Another Field (relative) This does the same as nefield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtfield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtefield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltfield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltefield except that it validates the field provided relative to the top level struct. This does the same as contains except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. This does the same as excludes except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. For arrays & slices, unique will ensure that there are no duplicates. For maps, unique will ensure that there are no duplicate values. For slices of struct, unique will ensure that there are no duplicate values in a field of the struct specified via a parameter. This validates that a string value contains ASCII alpha characters only This validates that a string value contains ASCII alphanumeric characters only This validates that a string value contains unicode alpha characters only This validates that a string value contains unicode alphanumeric characters only This validates that a string value can successfully be parsed into a boolean with strconv.ParseBool This validates that a string value contains number values only. For integers or float it returns true. This validates that a string value contains a basic numeric value. basic excludes exponents etc... for integers or float it returns true. This validates that a string value contains a valid hexadecimal. This validates that a string value contains a valid hex color including hashtag (#) This validates that a string value contains only lowercase characters. An empty string is not a valid lowercase string. This validates that a string value contains only uppercase characters. An empty string is not a valid uppercase string. This validates that a string value contains a valid rgb color This validates that a string value contains a valid rgba color This validates that a string value contains a valid hsl color This validates that a string value contains a valid hsla color This validates that a string value contains a valid E.164 Phone number https://en.wikipedia.org/wiki/E.164 (ex. +1123456789) This validates that a string value contains a valid email This may not conform to all possibilities of any rfc standard, but neither does any email provider accept all possibilities. This validates that a string value is valid JSON This validates that a string value is a valid JWT This validates that a string value contains a valid file path and that the file exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid file path but does not validate the existence of that file. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid url This will accept any url the golang request uri accepts but must contain a schema for example http:// or rtmp:// This validates that a string value contains a valid uri This will accept any uri the golang request uri accepts This validataes that a string value contains a valid URN according to the RFC 2141 spec. This validates that a string value contains a valid base64 value. Although an empty string is valid base64 this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value according the the RFC4648 spec. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value, but without = padding, according the the RFC4648 spec, section 3.2. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid bitcoin address. The format of the string is checked to ensure it matches one of the three formats P2PKH, P2SH and performs checksum validation. Bitcoin Bech32 Address (segwit) This validates that a string value contains a valid bitcoin Bech32 address as defined by bip-0173 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki) Special thanks to Pieter Wuille for providng reference implementations. This validates that a string value contains a valid ethereum address. The format of the string is checked to ensure it matches the standard Ethereum address format. This validates that a string value contains the substring value. This validates that a string value contains any Unicode code points in the substring value. This validates that a string value contains the supplied rune value. This validates that a string value does not contain the substring value. This validates that a string value does not contain any Unicode code points in the substring value. This validates that a string value does not contain the supplied rune value. This validates that a string value starts with the supplied string value This validates that a string value ends with the supplied string value This validates that a string value does not start with the supplied string value This validates that a string value does not end with the supplied string value This validates that a string value contains a valid isbn10 or isbn13 value. This validates that a string value contains a valid isbn10 value. This validates that a string value contains a valid isbn13 value. This validates that a string value contains a valid UUID. Uppercase UUID values will not pass - use `uuid_rfc4122` instead. This validates that a string value contains a valid version 3 UUID. Uppercase UUID values will not pass - use `uuid3_rfc4122` instead. This validates that a string value contains a valid version 4 UUID. Uppercase UUID values will not pass - use `uuid4_rfc4122` instead. This validates that a string value contains a valid version 5 UUID. Uppercase UUID values will not pass - use `uuid5_rfc4122` instead. This validates that a string value contains a valid ULID value. This validates that a string value contains only ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains only printable ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains one or more multibyte characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains a valid DataURI. NOTE: this will also validate that the data portion is valid base64 This validates that a string value contains a valid latitude. This validates that a string value contains a valid longitude. This validates that a string value contains a valid U.S. Social Security Number. This validates that a string value contains a valid IP Address. This validates that a string value contains a valid v4 IP Address. This validates that a string value contains a valid v6 IP Address. This validates that a string value contains a valid CIDR Address. This validates that a string value contains a valid v4 CIDR Address. This validates that a string value contains a valid v6 CIDR Address. This validates that a string value contains a valid resolvable TCP Address. This validates that a string value contains a valid resolvable v4 TCP Address. This validates that a string value contains a valid resolvable v6 TCP Address. This validates that a string value contains a valid resolvable UDP Address. This validates that a string value contains a valid resolvable v4 UDP Address. This validates that a string value contains a valid resolvable v6 UDP Address. This validates that a string value contains a valid resolvable IP Address. This validates that a string value contains a valid resolvable v4 IP Address. This validates that a string value contains a valid resolvable v6 IP Address. This validates that a string value contains a valid Unix Address. This validates that a string value contains a valid MAC Address. Note: See Go's ParseMAC for accepted formats and types: This validates that a string value is a valid Hostname according to RFC 952 https://tools.ietf.org/html/rfc952 This validates that a string value is a valid Hostname according to RFC 1123 https://tools.ietf.org/html/rfc1123 Full Qualified Domain Name (FQDN) This validates that a string value contains a valid FQDN. This validates that a string value appears to be an HTML element tag including those described at https://developer.mozilla.org/en-US/docs/Web/HTML/Element This validates that a string value is a proper character reference in decimal or hexadecimal format This validates that a string value is percent-encoded (URL encoded) according to https://tools.ietf.org/html/rfc3986#section-2.1 This validates that a string value contains a valid directory and that it exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid directory but does not validate the existence of that directory. This is done using os.Stat, which is a platform independent function. It is safest to suffix the string with os.PathSeparator if the directory may not exist at the time of validation. This validates that a string value contains a valid DNS hostname and port that can be used to valiate fields typically passed to sockets and connections. This validates that a string value is a valid datetime based on the supplied datetime format. Supplied format must match the official Go time format layout as documented in https://golang.org/pkg/time/ This validates that a string value is a valid country code based on iso3166-1 alpha-2 standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid country code based on iso3166-1 alpha-3 standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid country code based on iso3166-1 alpha-numeric standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid BCP 47 language tag, as parsed by language.Parse. More information on https://pkg.go.dev/golang.org/x/text/language BIC (SWIFT code) This validates that a string value is a valid Business Identifier Code (SWIFT code), defined in ISO 9362. More information on https://www.iso.org/standard/60390.html This validates that a string value is a valid dns RFC 1035 label, defined in RFC 1035. More information on https://datatracker.ietf.org/doc/html/rfc1035 This validates that a string value is a valid time zone based on the time zone database present on the system. Although empty value and Local value are allowed by time.LoadLocation golang function, they are not allowed by this validator. More information on https://golang.org/pkg/time/#LoadLocation This validates that a string value is a valid semver version, defined in Semantic Versioning 2.0.0. More information on https://semver.org/ This validates that a string value is a valid cve id, defined in cve mitre. More information on https://cve.mitre.org/ This validates that a string value contains a valid credit card number using Luhn algoritm. This validates that a string or (u)int value contains a valid checksum using the Luhn algorithm. #MongoDb ObjectID This validates that a string is a valid 24 character hexadecimal string. This validates that a string value contains a valid cron expression. Alias Validators and Tags NOTE: When returning an error, the tag returned in "FieldError" will be the alias tag unless the dive tag is part of the alias. Everything after the dive tag is not reported as the alias tag. Also, the "ActualTag" in the before case will be the actual tag within the alias that failed. Here is a list of the current built in alias tags: Validator notes: A collection of validation rules that are frequently needed but are more complex than the ones found in the baked in validators. A non standard validator must be registered manually like you would with your own custom validation functions. Example of registration and use: Here is a list of the current non standard validators: This package panics when bad input is provided, this is by design, bad code like that should not make it to production.
Copying All nodes (since they implement the Node interface) also implement the NodeCopier interface which provides the ShallowCopy() function. A shallow copy returns a new node with all the same properties, but no children. On the other hand there is a DeepCopy function which returns a new node with all recursive children also copied. This ensures that the new returned node can be manipulated without affecting the original node or any of its children. Dates in GEDCOM files can be very complex as they can cater for many scenarios: 1. Incomplete, like "Dec 1943" 2. Anchored, like "Aft. 3 Sep 2003" or "Before 1923" 3. Ranges, like "Bet. 4 Apr 1823 and 8 Apr 1823" 4. Phrases, like "(Foo Bar)" This package provides a very rich API for dealing with all kind of dates in a meaningful and sensible way. Some notable features include: 1. All dates, even though that specify an specific day have a minimum and maximum value that are their true bounds. This is especially important for larger date ranges like the whole month of "Jun 1945". 2. Upper and lower bounds of dates can be converted to the native Go time.Time object. 3. There is a Years function that provides a convenient way to normalise a date range into a number for easier distance and comparison measurements. 4. Algorithms for calculating the similarity of dates on a configurable parabola. Decoding a GEDCOM stream: If you are reading from a file you can use NewDocumentFromGEDCOMFile: Package gedcom contains functionality for encoding, decoding, traversing, manipulating and comparing of GEDCOM data. You can download the latest binaries for macOS, Windows and Linux on the Releases page: https://github.com/elliotchance/gedcom/releases This will not require you to install Go or any other dependencies. If you wish to build it from source you must install the dependencies with: On top of the raw document is a powerful API that takes care of the complex traversing of the Document. Here is a simple example: Some of the nodes in a GEDCOM file have been replaced with more function rich types, such as names, dates, families and more. Encoding a Document If you need the GEDCOM data as a string you can simply using fmt.Stringer: The Filter function recursively removes or manipulates nodes with a FilterFunction: Some examples of Filter functions include BlacklistTagFilter, OfficialTagFilter, SimpleNameFilter and WhitelistTagFilter. There are several functions available that handle different kinds of merging: - MergeNodes(left, right Node) Node: returns a new node that merges children from both nodes. - MergeNodeSlices(left, right Nodes, mergeFn MergeFunction) Nodes: merges two slices based on the mergeFn. This allows more advanced merging when dealing with slices of nodes. - MergeDocuments(left, right *Document, mergeFn MergeFunction) *Document: creates a new document with their respective nodes merged. You can use IndividualBySurroundingSimilarityMergeFunction with this to merge individuals, rather than just appending them all. The MergeFunction is a type that can be received in some of the merging functions. The closure determines if two nodes should be merged and what the result would be. Alternatively it can also describe when two nodes should not be merged. You may certainly create your own MergeFunction, but there are some that are already included: - IndividualBySurroundingSimilarityMergeFunction creates a MergeFunction that will merge individuals if their surrounding similarity is at least minimumSimilarity. - EqualityMergeFunction is a MergeFunction that will return a merged node if the node are considered equal (with Equals). Node.Equals performs a shallow comparison between two nodes. The implementation is different depending on the types of nodes being compared. You should see the specific documentation for the Node. Equality is not to be confused with the Is function seen on some of the nodes, such as Date.Is. The Is function is used to compare exact raw values in nodes. DeepEqual tests if left and right are recursively equal. CompareNodes recursively compares two nodes. For example: Produces a *NodeDiff than can be rendered with the String method:
A filesystem hierarchy synchronizer Rename files in TARGET so that identical files found in SOURCE and TARGET have the same relative path. The main goal of the program is to make folders synchronization faster by sparing big file transfers when a simple rename suffices. It complements other synchronization programs that lack this capability. See http://ambrevar.bitbucket.io/hsync and 'hsync -h' for more details. Usage: For usage options, see: We store the file entries in the following structure: This 'entries' map indexes the possible file matches by content ('partialHash'). A file match references the paths that will be used to rename the file in TARGET from 'oldpath' to 'newpath'. Note that 'newpath' is given by 'sourceID.path', and 'oldpath' by 'targetID.path'. The algorithm is centered around one main optimization: rolling-checksums. We assume that two files match if they have the same partial hash. The initial partial hash is just the size with an empty hash. This speeds up the process since this saves an open/close of the file. We just need a 'stat'. Files will not be read unless a rolling-checksum is required. As a consequence, unreadable files with a unique size will be stored in 'entries', while unreadable conflicting files will be discarded. Note that the system allows to rename files that cannot be read. One checksum roll increments 'pos' and updates the hash by hashing the next BLOCKSIZE bytes of the file. BLOCKSIZE is set to a value that is commonly believed to be optimal in most cases. The optimal value would be the device blocksize where the file resides. It would be more complex and memory consuming to query this value for each file. We choose md5 (128 bits) as the checksum algorithm. Adler32, CRC-32 and CRC-64 are only a tiny little faster while suffering from more clashes. This choice should be backed up with a proper benchmark. A conflict arises when two files in either SOURCE or TARGET have the same partial hash. We solve the conflict by updating the partial hashes until they differ. If the partial hashes cannot be updated any further (i.e. we reached end-of-file), it means that the files are duplicates. Notes: - Partial hashes of conflicting files will be complete at the same roll since they have the same size. - When a partial hash is complete, we have the following relation: - There is only one possible conflicting file at a time. A file match may be erroneous if the partial hash is not complete. The most obvious case is when two different files are the only ones of size N in SOURCE and TARGET. This down-side is a consequence of the design choice, i.e. focus on speed. Erroneous matches can be corrected in the preview file. If we wanted no ambiguity, we would have to compute the full hashes and this would take approximately as much time as copying files from SOURCE to TARGET, like a regular synchronization tool would do. We store the digest 'hash.Hash' together with the file path for when we update a partial hash. Process: 1. We walk SOURCE completely. Only regular files are processed. The 'sourceID' are stored. If two entries conflict (they have the same partial hash), we compute update the partial hashes until they do not conflict anymore. If the conflict is not resolvable, i.e. the partial hash is complete and files are identical, we drop both files from 'entries'. Future files can have the same partial hash that led to a former conflict. To distinguish the content from former conflicts when adding a new file, we must compute the partial hash up to the 'pos' of the last conflict (the number of checksum rolls). To keep track of this 'pos' when there is a conflict, we mark all computed partial hash as dummy values. When the next entry will be added, we will have to compute the partial hash until it does not match a dummy value in 'entries'. Duplicates are not processed but display a warning. Usually the user does not want duplicates, so she is better off fixing them before processing with the renames. It would add a lot of complexity to handle duplicates properly. 2. We walk TARGET completely. We skip all dummies as source the SOURCE walk. We need to analyze SOURCE completely before we can check for matches. - If there are only dummy entries, there was an unsolvable conflict in SOURCE. We drop the file. - If we end on a non-empty entry with an 'unsolvable' targetID, it means that an unsolvable conflict with target files happened with this partial hash. This is only possible at end-of-file. We drop the file. - If we end on an empty entry, there is no match with SOURCE and we drop the file. - If we end on a non-empty entry without previous matches, we store the match. - Else we end on a non-empty entry with one match already present. This is a conflict. We solve the conflict as for the SOURCE walk except that we need to update the partial hashes of three files: the SOURCE file, the first TARGET match and the new TARGET match. 3. We generate the 'renameOps' and 'reverseOps' maps. They map 'oldpath' to 'newpath' and 'newpath' to 'oldpath' respectively. We drop entries where 'oldpath==newpath' to spare a lot of noise. Note that file names are not used to compute a match since they could be identical while the content would be different. 4. We proceed with the renames. Chains and cycles may occur. - Example of a chain of renames: a->b, b->c, c->d. - Example of a cycle of renames: a->b, b->c, c->a. TARGET must be fully analyzed before proceeding with the renames so that we can detect chains. We always traverse chains until we reach the end, then rename the elements while going backward till the beginning. The beginning can be before the entry point. 'reverseOps' is used for going backward. When a cycle is detected, we break it down to a chain. We rename one file to a temporary name. Then we add this new file to the other end of the chain so that it gets renamed to its original new name once all files have been processed.
Package XGB provides the X Go Binding, which is a low-level API to communicate with the core X protocol and many of the X extensions. It is *very* closely modeled on XCB, so that experience with XCB (or xpyb) is easily translatable to XGB. That is, it uses the same cookie/reply model and is thread safe. There are otherwise no major differences (in the API). Most uses of XGB typically fall under the realm of window manager and GUI kit development, but other applications (like pagers, panels, tilers, etc.) may also require XGB. Moreover, it is a near certainty that if you need to work with X, xgbutil will be of great use to you as well: https://github.com/BurntSushi/xgbutil This is an extremely terse example that demonstrates how to connect to X, create a window, listen to StructureNotify events and Key{Press,Release} events, map the window, and print out all events received. An example with accompanying documentation can be found in examples/create-window. This is another small example that shows how to query Xinerama for geometry information of each active head. Accompanying documentation for this example can be found in examples/xinerama. XGB can benefit greatly from parallelism due to its concurrent design. For evidence of this claim, please see the benchmarks in xproto/xproto_test.go. xproto/xproto_test.go contains a number of contrived tests that stress particular corners of XGB that I presume could be problem areas. Namely: requests with no replies, requests with replies, checked errors, unchecked errors, sequence number wrapping, cookie buffer flushing (i.e., forcing a round trip every N requests made that don't have a reply), getting/setting properties and creating a window and listening to StructureNotify events. Both XCB and xpyb use the same Python module (xcbgen) for a code generator. XGB (before this fork) used the same code generator as well, but in my attempt to add support for more extensions, I found the code generator extremely difficult to work with. Therefore, I re-wrote the code generator in Go. It can be found in its own sub-package, xgbgen, of xgb. My design of xgbgen includes a rough consideration that it could be used for other languages. I am reasonably confident that the core X protocol is in full working form. I've also tested the Xinerama and RandR extensions sparingly. Many of the other existing extensions have Go source generated (and are compilable) and are included in this package, but I am currently unsure of their status. They *should* work. XKB is the only extension that intentionally does not work, although I suspect that GLX also does not work (however, there is Go source code for GLX that compiles, unlike XKB). I don't currently have any intention of getting XKB working, due to its complexity and my current mental incapacity to test it.
Package p9p implements a compliant 9P2000 client and server library for use in modern, production Go services. This package differentiates itself in that is has departed from the plan 9 implementation primitives and better follows idiomatic Go style. The package revolves around the session type, which is an enumeration of raw 9p message calls. A few calls, such as flush and version, have been elided, defering their usage to the server implementation. Sessions can be trivially proxied through clients and servers. The best place to get started is with Serve. Serve can be provided a connection and a handler. A typical implementation will call Serve as part of a listen/accept loop. As each network connection is created, Serve can be called with a handler for the specific connection. The handler can be implemented with a Session via the Dispatch function or can generate sessions for dispatch in response to client messages. (See cmd/9ps for an example) On the client side, NewSession provides a 9p session from a connection. After a version negotiation, methods can be called on the session, in parallel, and calls will be sent over the connection. Call timeouts can be controlled via the context provided to each method call. This package has the beginning of a nice client-server framework for working with 9p. Some of the abstractions aren't entirely fleshed out, but most of this can center around the Handler. Missing from this are a number of tools for implementing 9p servers. The most glaring are directory read and walk helpers. Other, more complex additions might be a system to manage in memory filesystem trees that expose multi-user sessions. The largest difference between this package and other 9p packages is simplification of the types needed to implement a server. To avoid confusing bugs and odd behavior, the components are separated by each level of the protocol. One example is that requests and responses are separated and they no longer hold mutable state. This means that framing, transport management, encoding, and dispatching are componentized. Little work will be required to swap out encodings, transports or connection implementations. This package has been wired from top to bottom to support context-based resource management. Everything from startup to shutdown can have timeouts using contexts. Not all close methods are fully in place, but we are very close to having controlled, predictable cleanup for both servers and clients. Timeouts can be very granular or very course, depending on the context of the timeout. For example, it is very easy to set a short timeout for a stat call but a long timeout for reading data. Currently, there is not multiversion support. The hooks and functionality are in place to add multi-version support. Generally, the correct space to do this is in the codec. Types, such as Dir, simply need to be extended to support the possibility of extra fields. The real question to ask here is what is the role of the version number in the 9p protocol. It really comes down to the level of support required. Do we just need it at the protocol level, or do handlers and sessions need to be have differently based on negotiated versions? This package has a number of TODOs to make it easier to use. Most of the existing code provides a solid base to work from. Don't be discouraged by the sawdust. In addition, the testing is embarassingly lacking. With time, we can get full testing going and ensure we have confidence in the implementation.
Package validator implements value validations for structs and individual fields based on tags. It can also handle Cross-Field and Cross-Struct validation for nested structs and has the ability to dive into arrays and maps of any type. see more examples https://github.com/fairyhunter13/validator/tree/master/_examples Validator is designed to be thread-safe and used as a singleton instance. It caches information about your struct and validations, in essence only parsing your validation tags once per struct type. Using multiple instances neglects the benefit of caching. The not thread-safe functions are explicitly marked as such in the documentation. Doing things this way is actually the way the standard library does, see the file.Open method here: The authors return type "error" to avoid the issue discussed in the following, where err is always != nil: Validator only InvalidValidationError for bad validation input, nil or ValidationErrors as type error; so, in your code all you need to do is check if the error returned is not nil, and if it's not check if error is InvalidValidationError ( if necessary, most of the time it isn't ) type cast it to type ValidationErrors like so err.(validator.ValidationErrors). Custom Validation functions can be added. Example: Cross-Field Validation can be done via the following tags: If, however, some custom cross-field validation is required, it can be done using a custom validation. Why not just have cross-fields validation tags (i.e. only eqcsfield and not eqfield)? The reason is efficiency. If you want to check a field within the same struct "eqfield" only has to find the field on the same struct (1 level). But, if we used "eqcsfield" it could be multiple levels down. Example: Multiple validators on a field will process in the order defined. Example: Bad Validator definitions are not handled by the library. Example: Baked In Cross-Field validation only compares fields on the same struct. If Cross-Field + Cross-Struct validation is needed you should implement your own custom validator. Comma (",") is the default separator of validation tags. If you wish to have a comma included within the parameter (i.e. excludesall=,) you will need to use the UTF-8 hex representation 0x2C, which is replaced in the code as a comma, so the above will become excludesall=0x2C. Pipe ("|") is the 'or' validation tags deparator. If you wish to have a pipe included within the parameter i.e. excludesall=| you will need to use the UTF-8 hex representation 0x7C, which is replaced in the code as a pipe, so the above will become excludesall=0x7C Here is a list of the current built in validators: Tells the validation to skip this struct field; this is particularly handy in ignoring embedded structs from being validated. (Usage: -) This is the 'or' operator allowing multiple validators to be used and accepted. (Usage: rgb|rgba) <-- this would allow either rgb or rgba colors to be accepted. This can also be combined with 'and' for example ( Usage: omitempty,rgb|rgba) When a field that is a nested struct is encountered, and contains this flag any validation on the nested struct will be run, but none of the nested struct fields will be validated. This is useful if inside of your program you know the struct will be valid, but need to verify it has been assigned. NOTE: only "required" and "omitempty" can be used on a struct itself. Same as structonly tag except that any struct level validations will not run. Allows conditional validation, for example if a field is not set with a value (Determined by the "required" validator) then other validation such as min or max won't run, but if a value is set validation will run. This tells the validator to dive into a slice, array or map and validate that level of the slice, array or map with the validation tags that follow. Multidimensional nesting is also supported, each level you wish to dive will require another dive tag. dive has some sub-tags, 'keys' & 'endkeys', please see the Keys & EndKeys section just below. Example #1 Example #2 Keys & EndKeys These are to be used together directly after the dive tag and tells the validator that anything between 'keys' and 'endkeys' applies to the keys of a map and not the values; think of it like the 'dive' tag, but for map keys instead of values. Multidimensional nesting is also supported, each level you wish to validate will require another 'keys' and 'endkeys' tag. These tags are only valid for maps. Example #1 Example #2 This validates that the value is not the data types default zero value. For numbers ensures value is not zero. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. The field under validation must be present and not empty only if all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty unless all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if any of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if all of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must be present and not empty only when any of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only when all of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must not be present or not empty only if all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must not be present or empty unless all the other specified fields are equal to the value following the specified field. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: This validates that the value is the default value and is almost the opposite of required. For numbers, length will ensure that the value is equal to the parameter given. For strings, it checks that the string length is exactly that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, len will ensure that the value is equal to the duration given in the parameter. For numbers, max will ensure that the value is less than or equal to the parameter given. For strings, it checks that the string length is at most that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, max will ensure that the value is less than or equal to the duration given in the parameter. For numbers, min will ensure that the value is greater or equal to the parameter given. For strings, it checks that the string length is at least that number of characters. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, min will ensure that the value is greater than or equal to the duration given in the parameter. For strings & numbers, eq will ensure that the value is equal to the parameter given. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, eq will ensure that the value is equal to the duration given in the parameter. For strings & numbers, ne will ensure that the value is not equal to the parameter given. For slices, arrays, and maps, validates the number of items. Example #1 Example #2 (time.Duration) For time.Duration, ne will ensure that the value is not equal to the duration given in the parameter. For strings, ints, and uints, oneof will ensure that the value is one of the values in the parameter. The parameter should be a list of values separated by whitespace. Values may be strings or numbers. To match strings with spaces in them, include the target string between single quotes. For numbers, this will ensure that the value is greater than the parameter given. For strings, it checks that the string length is greater than that number of characters. For slices, arrays and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than time.Now.UTC(). Example #3 (time.Duration) For time.Duration, gt will ensure that the value is greater than the duration given in the parameter. Same as 'min' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than or equal to time.Now.UTC(). Example #3 (time.Duration) For time.Duration, gte will ensure that the value is greater than or equal to the duration given in the parameter. For numbers, this will ensure that the value is less than the parameter given. For strings, it checks that the string length is less than that number of characters. For slices, arrays, and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than time.Now.UTC(). Example #3 (time.Duration) For time.Duration, lt will ensure that the value is less than the duration given in the parameter. Same as 'max' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than or equal to time.Now.UTC(). Example #3 (time.Duration) For time.Duration, lte will ensure that the value is less than or equal to the duration given in the parameter. This will validate the field value against another fields value either within a struct or passed in field. Example #1: Example #2: Field Equals Another Field (relative) This does the same as eqfield except that it validates the field provided relative to the top level struct. This will validate the field value against another fields value either within a struct or passed in field. Examples: Field Does Not Equal Another Field (relative) This does the same as nefield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtfield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtefield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltfield except that it validates the field provided relative to the top level struct. Only valid for Numbers, time.Duration and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltefield except that it validates the field provided relative to the top level struct. This does the same as contains except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. This does the same as excludes except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. For arrays & slices, unique will ensure that there are no duplicates. For maps, unique will ensure that there are no duplicate values. For slices of struct, unique will ensure that there are no duplicate values in a field of the struct specified via a parameter. This validates that a string value contains ASCII alpha characters only This validates that a string value contains ASCII alphanumeric characters only This validates that a string value contains unicode alpha characters only This validates that a string value contains unicode alphanumeric characters only This validates that a string value can successfully be parsed into a boolean with strconv.ParseBool This validates that a string value contains number values only. For integers or float it returns true. This validates that a string value contains a basic numeric value. basic excludes exponents etc... for integers or float it returns true. This validates that a string value contains a valid hexadecimal. This validates that a string value contains a valid hex color including hashtag (#) This validates that a string value contains only lowercase characters. An empty string is not a valid lowercase string. This validates that a string value contains only uppercase characters. An empty string is not a valid uppercase string. This validates that a string value contains a valid rgb color This validates that a string value contains a valid rgba color This validates that a string value contains a valid hsl color This validates that a string value contains a valid hsla color This validates that a string value contains a valid E.164 Phone number https://en.wikipedia.org/wiki/E.164 (ex. +1123456789) This validates that a string value contains a valid email This may not conform to all possibilities of any rfc standard, but neither does any email provider accept all possibilities. This validates that a string value is valid JSON This validates that a string value is a valid JWT This validates that a string value contains a valid file path and that the file exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid url This will accept any url the golang request uri accepts but must contain a schema for example http:// or rtmp:// This validates that a string value contains a valid uri This will accept any uri the golang request uri accepts This validataes that a string value contains a valid URN according to the RFC 2141 spec. This validates that a string value contains a valid base64 value. Although an empty string is valid base64 this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value according the the RFC4648 spec. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid bitcoin address. The format of the string is checked to ensure it matches one of the three formats P2PKH, P2SH and performs checksum validation. Bitcoin Bech32 Address (segwit) This validates that a string value contains a valid bitcoin Bech32 address as defined by bip-0173 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki) Special thanks to Pieter Wuille for providng reference implementations. This validates that a string value contains a valid ethereum address. The format of the string is checked to ensure it matches the standard Ethereum address format. This validates that a string value contains the substring value. This validates that a string value contains any Unicode code points in the substring value. This validates that a string value contains the supplied rune value. This validates that a string value does not contain the substring value. This validates that a string value does not contain any Unicode code points in the substring value. This validates that a string value does not contain the supplied rune value. This validates that a string value starts with the supplied string value This validates that a string value ends with the supplied string value This validates that a string value does not start with the supplied string value This validates that a string value does not end with the supplied string value This validates that a string value contains a valid isbn10 or isbn13 value. This validates that a string value contains a valid isbn10 value. This validates that a string value contains a valid isbn13 value. This validates that a string value contains a valid UUID. Uppercase UUID values will not pass - use `uuid_rfc4122` instead. This validates that a string value contains a valid version 3 UUID. Uppercase UUID values will not pass - use `uuid3_rfc4122` instead. This validates that a string value contains a valid version 4 UUID. Uppercase UUID values will not pass - use `uuid4_rfc4122` instead. This validates that a string value contains a valid version 5 UUID. Uppercase UUID values will not pass - use `uuid5_rfc4122` instead. This validates that a string value contains a valid ULID value. This validates that a string value contains only ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains only printable ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains one or more multibyte characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains a valid DataURI. NOTE: this will also validate that the data portion is valid base64 This validates that a string value contains a valid latitude. This validates that a string value contains a valid longitude. This validates that a string value contains a valid U.S. Social Security Number. This validates that a string value contains a valid IP Address. This validates that a string value contains a valid v4 IP Address. This validates that a string value contains a valid v6 IP Address. This validates that a string value contains a valid CIDR Address. This validates that a string value contains a valid v4 CIDR Address. This validates that a string value contains a valid v6 CIDR Address. This validates that a string value contains a valid resolvable TCP Address. This validates that a string value contains a valid resolvable v4 TCP Address. This validates that a string value contains a valid resolvable v6 TCP Address. This validates that a string value contains a valid resolvable UDP Address. This validates that a string value contains a valid resolvable v4 UDP Address. This validates that a string value contains a valid resolvable v6 UDP Address. This validates that a string value contains a valid resolvable IP Address. This validates that a string value contains a valid resolvable v4 IP Address. This validates that a string value contains a valid resolvable v6 IP Address. This validates that a string value contains a valid Unix Address. This validates that a string value contains a valid MAC Address. Note: See Go's ParseMAC for accepted formats and types: This validates that a string value is a valid Hostname according to RFC 952 https://tools.ietf.org/html/rfc952 This validates that a string value is a valid Hostname according to RFC 1123 https://tools.ietf.org/html/rfc1123 Full Qualified Domain Name (FQDN) This validates that a string value contains a valid FQDN. This validates that a string value appears to be an HTML element tag including those described at https://developer.mozilla.org/en-US/docs/Web/HTML/Element This validates that a string value is a proper character reference in decimal or hexadecimal format This validates that a string value is percent-encoded (URL encoded) according to https://tools.ietf.org/html/rfc3986#section-2.1 This validates that a string value contains a valid directory and that it exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid DNS hostname and port that can be used to valiate fields typically passed to sockets and connections. This validates that a string value is a valid datetime based on the supplied datetime format. Supplied format must match the official Go time format layout as documented in https://golang.org/pkg/time/ This validates that a string value is a valid country code based on iso3166-1 alpha-2 standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid country code based on iso3166-1 alpha-3 standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid country code based on iso3166-1 alpha-numeric standard. see: https://www.iso.org/iso-3166-country-codes.html This validates that a string value is a valid BCP 47 language tag, as parsed by language.Parse. More information on https://pkg.go.dev/golang.org/x/text/language BIC (SWIFT code) This validates that a string value is a valid Business Identifier Code (SWIFT code), defined in ISO 9362. More information on https://www.iso.org/standard/60390.html This validates that a string value is a valid dns RFC 1035 label, defined in RFC 1035. More information on https://datatracker.ietf.org/doc/html/rfc1035 This validates that a string value is a valid time zone based on the time zone database present on the system. Although empty value and Local value are allowed by time.LoadLocation golang function, they are not allowed by this validator. More information on https://golang.org/pkg/time/#LoadLocation This validates that a string value is a valid semver version, defined in Semantic Versioning 2.0.0. More information on https://semver.org/ This validates that a string value contains a valid credit card number using Luhn algoritm. NOTE: When returning an error, the tag returned in "FieldError" will be the alias tag unless the dive tag is part of the alias. Everything after the dive tag is not reported as the alias tag. Also, the "ActualTag" in the before case will be the actual tag within the alias that failed. Here is a list of the current built in alias tags: Validator notes: A collection of validation rules that are frequently needed but are more complex than the ones found in the baked in validators. A non standard validator must be registered manually like you would with your own custom validation functions. Example of registration and use: Here is a list of the current non standard validators: This package panics when bad input is provided, this is by design, bad code like that should not make it to production.
Package validator implements value validations for structs and individual fields based on tags. It can also handle Cross-Field and Cross-Struct validation for nested structs and has the ability to dive into arrays and maps of any type. see more examples https://github.com/go-playground/validator/tree/v9/_examples Doing things this way is actually the way the standard library does, see the file.Open method here: The authors return type "error" to avoid the issue discussed in the following, where err is always != nil: Validator only InvalidValidationError for bad validation input, nil or ValidationErrors as type error; so, in your code all you need to do is check if the error returned is not nil, and if it's not check if error is InvalidValidationError ( if necessary, most of the time it isn't ) type cast it to type ValidationErrors like so err.(validator.ValidationErrors). Custom Validation functions can be added. Example: Cross-Field Validation can be done via the following tags: If, however, some custom cross-field validation is required, it can be done using a custom validation. Why not just have cross-fields validation tags (i.e. only eqcsfield and not eqfield)? The reason is efficiency. If you want to check a field within the same struct "eqfield" only has to find the field on the same struct (1 level). But, if we used "eqcsfield" it could be multiple levels down. Example: Multiple validators on a field will process in the order defined. Example: Bad Validator definitions are not handled by the library. Example: Baked In Cross-Field validation only compares fields on the same struct. If Cross-Field + Cross-Struct validation is needed you should implement your own custom validator. Comma (",") is the default separator of validation tags. If you wish to have a comma included within the parameter (i.e. excludesall=,) you will need to use the UTF-8 hex representation 0x2C, which is replaced in the code as a comma, so the above will become excludesall=0x2C. Pipe ("|") is the 'or' validation tags deparator. If you wish to have a pipe included within the parameter i.e. excludesall=| you will need to use the UTF-8 hex representation 0x7C, which is replaced in the code as a pipe, so the above will become excludesall=0x7C Here is a list of the current built in validators: Tells the validation to skip this struct field; this is particularly handy in ignoring embedded structs from being validated. (Usage: -) This is the 'or' operator allowing multiple validators to be used and accepted. (Usage: rbg|rgba) <-- this would allow either rgb or rgba colors to be accepted. This can also be combined with 'and' for example ( Usage: omitempty,rgb|rgba) When a field that is a nested struct is encountered, and contains this flag any validation on the nested struct will be run, but none of the nested struct fields will be validated. This is useful if inside of your program you know the struct will be valid, but need to verify it has been assigned. NOTE: only "required" and "omitempty" can be used on a struct itself. Same as structonly tag except that any struct level validations will not run. Allows conditional validation, for example if a field is not set with a value (Determined by the "required" validator) then other validation such as min or max won't run, but if a value is set validation will run. This tells the validator to dive into a slice, array or map and validate that level of the slice, array or map with the validation tags that follow. Multidimensional nesting is also supported, each level you wish to dive will require another dive tag. dive has some sub-tags, 'keys' & 'endkeys', please see the Keys & EndKeys section just below. Example #1 Example #2 Keys & EndKeys These are to be used together directly after the dive tag and tells the validator that anything between 'keys' and 'endkeys' applies to the keys of a map and not the values; think of it like the 'dive' tag, but for map keys instead of values. Multidimensional nesting is also supported, each level you wish to validate will require another 'keys' and 'endkeys' tag. These tags are only valid for maps. Example #1 Example #2 This validates that the value is not the data types default zero value. For numbers ensures value is not zero. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. The field under validation must be present and not empty only if any of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if all of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must be present and not empty only when any of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only when all of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: This validates that the value is the default value and is almost the opposite of required. For numbers, length will ensure that the value is equal to the parameter given. For strings, it checks that the string length is exactly that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, max will ensure that the value is less than or equal to the parameter given. For strings, it checks that the string length is at most that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, min will ensure that the value is greater or equal to the parameter given. For strings, it checks that the string length is at least that number of characters. For slices, arrays, and maps, validates the number of items. For strings & numbers, eq will ensure that the value is equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings & numbers, ne will ensure that the value is not equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings, ints, and uints, oneof will ensure that the value is one of the values in the parameter. The parameter should be a list of values separated by whitespace. Values may be strings or numbers. For numbers, this will ensure that the value is greater than the parameter given. For strings, it checks that the string length is greater than that number of characters. For slices, arrays and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than time.Now.UTC(). Same as 'min' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than or equal to time.Now.UTC(). For numbers, this will ensure that the value is less than the parameter given. For strings, it checks that the string length is less than that number of characters. For slices, arrays, and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than time.Now.UTC(). Same as 'max' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than or equal to time.Now.UTC(). This will validate the field value against another fields value either within a struct or passed in field. Example #1: Example #2: Field Equals Another Field (relative) This does the same as eqfield except that it validates the field provided relative to the top level struct. This will validate the field value against another fields value either within a struct or passed in field. Examples: Field Does Not Equal Another Field (relative) This does the same as nefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltefield except that it validates the field provided relative to the top level struct. This does the same as contains except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. This does the same as excludes except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. For arrays & slices, unique will ensure that there are no duplicates. For maps, unique will ensure that there are no duplicate values. For slices of struct, unique will ensure that there are no duplicate values in a field of the struct specified via a parameter. This validates that a string value contains ASCII alpha characters only This validates that a string value contains ASCII alphanumeric characters only This validates that a string value contains unicode alpha characters only This validates that a string value contains unicode alphanumeric characters only This validates that a string value contains a basic numeric value. basic excludes exponents etc... for integers or float it returns true. This validates that a string value contains a valid hexadecimal. This validates that a string value contains a valid hex color including hashtag (#) This validates that a string value contains a valid rgb color This validates that a string value contains a valid rgba color This validates that a string value contains a valid hsl color This validates that a string value contains a valid hsla color This validates that a string value contains a valid email This may not conform to all possibilities of any rfc standard, but neither does any email provider accept all possibilities. This validates that a string value contains a valid file path and that the file exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid url This will accept any url the golang request uri accepts but must contain a schema for example http:// or rtmp:// This validates that a string value contains a valid uri This will accept any uri the golang request uri accepts This validataes that a string value contains a valid URN according to the RFC 2141 spec. This validates that a string value contains a valid base64 value. Although an empty string is valid base64 this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value according the the RFC4648 spec. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid bitcoin address. The format of the string is checked to ensure it matches one of the three formats P2PKH, P2SH and performs checksum validation. Bitcoin Bech32 Address (segwit) This validates that a string value contains a valid bitcoin Bech32 address as defined by bip-0173 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki) Special thanks to Pieter Wuille for providng reference implementations. This validates that a string value contains a valid ethereum address. The format of the string is checked to ensure it matches the standard Ethereum address format Full validation is blocked by https://github.com/golang/crypto/pull/28 This validates that a string value contains the substring value. This validates that a string value contains any Unicode code points in the substring value. This validates that a string value contains the supplied rune value. This validates that a string value does not contain the substring value. This validates that a string value does not contain any Unicode code points in the substring value. This validates that a string value does not contain the supplied rune value. This validates that a string value starts with the supplied string value This validates that a string value ends with the supplied string value This validates that a string value contains a valid isbn10 or isbn13 value. This validates that a string value contains a valid isbn10 value. This validates that a string value contains a valid isbn13 value. This validates that a string value contains a valid UUID. Uppercase UUID values will not pass - use `uuid_rfc4122` instead. This validates that a string value contains a valid version 3 UUID. Uppercase UUID values will not pass - use `uuid3_rfc4122` instead. This validates that a string value contains a valid version 4 UUID. Uppercase UUID values will not pass - use `uuid4_rfc4122` instead. This validates that a string value contains a valid version 5 UUID. Uppercase UUID values will not pass - use `uuid5_rfc4122` instead. This validates that a string value contains only ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains only printable ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains one or more multibyte characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains a valid DataURI. NOTE: this will also validate that the data portion is valid base64 This validates that a string value contains a valid latitude. This validates that a string value contains a valid longitude. This validates that a string value contains a valid U.S. Social Security Number. This validates that a string value contains a valid IP Address. This validates that a string value contains a valid v4 IP Address. This validates that a string value contains a valid v6 IP Address. This validates that a string value contains a valid CIDR Address. This validates that a string value contains a valid v4 CIDR Address. This validates that a string value contains a valid v6 CIDR Address. This validates that a string value contains a valid resolvable TCP Address. This validates that a string value contains a valid resolvable v4 TCP Address. This validates that a string value contains a valid resolvable v6 TCP Address. This validates that a string value contains a valid resolvable UDP Address. This validates that a string value contains a valid resolvable v4 UDP Address. This validates that a string value contains a valid resolvable v6 UDP Address. This validates that a string value contains a valid resolvable IP Address. This validates that a string value contains a valid resolvable v4 IP Address. This validates that a string value contains a valid resolvable v6 IP Address. This validates that a string value contains a valid Unix Address. This validates that a string value contains a valid MAC Address. Note: See Go's ParseMAC for accepted formats and types: This validates that a string value is a valid Hostname according to RFC 952 https://tools.ietf.org/html/rfc952 This validates that a string value is a valid Hostname according to RFC 1123 https://tools.ietf.org/html/rfc1123 Full Qualified Domain Name (FQDN) This validates that a string value contains a valid FQDN. This validates that a string value appears to be an HTML element tag including those described at https://developer.mozilla.org/en-US/docs/Web/HTML/Element This validates that a string value is a proper character reference in decimal or hexadecimal format This validates that a string value is percent-encoded (URL encoded) according to https://tools.ietf.org/html/rfc3986#section-2.1 This validates that a string value contains a valid directory and that it exists on the machine. This is done using os.Stat, which is a platform independent function. NOTE: When returning an error, the tag returned in "FieldError" will be the alias tag unless the dive tag is part of the alias. Everything after the dive tag is not reported as the alias tag. Also, the "ActualTag" in the before case will be the actual tag within the alias that failed. Here is a list of the current built in alias tags: Validator notes: A collection of validation rules that are frequently needed but are more complex than the ones found in the baked in validators. A non standard validator must be registered manually like you would with your own custom validation functions. Example of registration and use: Here is a list of the current non standard validators: This package panics when bad input is provided, this is by design, bad code like that should not make it to production.
The pipeline API allows us to create pipelines/workflows/templates were we can link different units of works (we will call them steps) between them to produce a graph of work. A step is a contract that allows us to represent or run a unit of work. A step requires an input and may return an output or an error depending on whether it failed or not. A step is declared as Steps are considered the backbone of the API. The API already provides a set of steps that should suffice to create any type of pipeline, but there may be specific scenarios were the given API gets too verbose or its not enough. In these type of scenarios we can create our own custom steps to match our needs. The steps provided by the API are: The most simple and atomic step. This step lets us run a single unit of work. A sequential step allows us to "link" two steps together sequentially. A concurrent step allows us to "link" multiple steps concurrently and once they're done reduce them to a single output. A conditional step allows us to evaluate a condition and depending on its result branch to specific step. This step allows us to branch the graph in two different branches. An optional step is similar to a conditional one, although it only has a single branch. It either runs the given Step or it skips it (returning the initial input), depending on the result of the statement evaluation. It also supports altering the output, but when doing so you need to provide how to default to it when the step is skipped Steps need to comply to an extremely simple interface. Hence, we can create our own custom steps by simply creating a struct that matches the given contract. There are no restrictions besides these two so it's highly flexible when wanting to create custom behaviors or logics. For example, a step that always succeeds and doesn't mutate the result might be: Running a pipeline is as simple as running the final step. You will need a context of your own (steps are context aware) and an initial input so the graph can be traversed with it and mutate it to yield a final output. You can render a graph by simply creating a graph and drawing the steps on it Eg. for rendering an UML you should do Example basic showcases a simple graph that uses the basic API steps to produce a simple result based on a given input. The input will be mutated across different steps (incrementing or doubling it) and finally, print if it's a 3 digit number or not For showing purposes, all steps and pipeline building are in the same function and use basic parameter types and logics (we don't showcase a real life usecase with infrastructure / http calls / etc), just note that it's quite similar. In the examples directory you can find more elaborate samples on how to do this better. Example complex showcases a complex graph that uses most of the API steps to produce a simple result based on a given input. The input will be mutated across different steps (incrementing or doubling it) and finally, print if it's a 3 digit number or not For showing purposes, all steps and pipeline building are in the same function and use basic parameter types and logics (we don't showcase a real life usecase with infrastructure / http calls / etc), just note that it's quite similar. In the examples directory you can find more elaborate samples on how to do this better.
Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include: Breiman and Cutler's Random Forest for Classification and Regression Adaptive Boosting (AdaBoost) Classification Gradiant Boosting Tree Regression Entropy and Cost driven classification L1 regression Feature selection with artificial contrasts Proximity and model structure analysis Roughly balanced bagging for unbalanced classification The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and can not yet be released. Library Documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/IlyaLab/CloudForest Documentation of command line utilities and file formats can be found in README.md, which can be viewed fromated on github: http://github.com/IlyaLab/CloudForest Pull requests and bug reports are welcome. CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi threaded and cluster environments in mind. Go's support for function types is used to provide a interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method: This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects. Decision tree's are grown with the goal of reducing "Impurity" which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface which allows for alternative definitions of impurity. CloudForest includes several alternative targets: Additional targets can be stacked on top of these target to add boosting functionality: Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In go, slices pointing at the same underlying arrays make this sort of optimization transparent. For example a function like: can return left and right slices that point to the same underlying array as the original slice of cases but these slices should not have their values changed. Functions used while searching for the best split also accepts pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items and its use can be seen in functions like: For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit which relies on go's sorting package. Training a Random forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as the are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, distributed database etc. By default cloud forest uses a fast heuristic for missing values. When proposing a split on a feature with missing data the missing cases are removed and the impurity value is corrected to use three way impurity which reduces the bias towards features with lots of missing data: Missing values in the target variable are left out of impurity calculations. This provided generally good results at a fraction of the computational costs of imputing data. Optionally, feature.ImputeMissing or featurematrixImputeMissing can be called before forest growth to impute missing values to the feature mean/mode which Brieman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity weighted imputation Brieman describes. Experimental support is provided for 3 way splitting which splits missing cases onto a third branch. [2] This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3] [1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1 [2] https://code.google.com/p/rf-ace/ [3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record In CloudForest data is stored using the FeatureMatrix struct which contains Features. The Feature struct implements storage and methods for both categorical and numerical data and calculations of impurity etc and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable. This allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow implements Brieman and Cutler's method (see extract above) for growing a tree. A GrowForest method is also provided that implements the rest of the method including sampling cases but it may be faster to grow the forest to disk as in the growforest utility. Prediction and Voting is done using Tree.Vote and CatBallotBox and NumBallotBox which implement the VoteTallyer interface.
Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include: Breiman and Cutler's Random Forest for Classification and Regression Adaptive Boosting (AdaBoost) Classification Gradiant Boosting Tree Regression Entropy and Cost driven classification L1 regression Feature selection with artificial contrasts Proximity and model structure analysis Roughly balanced bagging for unbalanced classification The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and can not yet be released. Library Documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/IlyaLab/CloudForest Documentation of command line utilities and file formats can be found in README.md, which can be viewed fromated on github: http://github.com/IlyaLab/CloudForest Pull requests and bug reports are welcome. CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi threaded and cluster environments in mind. Go's support for function types is used to provide a interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method: This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects. Decision tree's are grown with the goal of reducing "Impurity" which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface which allows for alternative definitions of impurity. CloudForest includes several alternative targets: Additional targets can be stacked on top of these target to add boosting functionality: Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In go, slices pointing at the same underlying arrays make this sort of optimization transparent. For example a function like: can return left and right slices that point to the same underlying array as the original slice of cases but these slices should not have their values changed. Functions used while searching for the best split also accepts pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items and its use can be seen in functions like: For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit which relies on go's sorting package. Training a Random forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as the are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, distributed database etc. By default cloud forest uses a fast heuristic for missing values. When proposing a split on a feature with missing data the missing cases are removed and the impurity value is corrected to use three way impurity which reduces the bias towards features with lots of missing data: Missing values in the target variable are left out of impurity calculations. This provided generally good results at a fraction of the computational costs of imputing data. Optionally, feature.ImputeMissing or featurematrixImputeMissing can be called before forest growth to impute missing values to the feature mean/mode which Brieman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity weighted imputation Brieman describes. Experimental support is provided for 3 way splitting which splits missing cases onto a third branch. [2] This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3] [1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1 [2] https://code.google.com/p/rf-ace/ [3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record In CloudForest data is stored using the FeatureMatrix struct which contains Features. The Feature struct implements storage and methods for both categorical and numerical data and calculations of impurity etc and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable. This allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow implements Brieman and Cutler's method (see extract above) for growing a tree. A GrowForest method is also provided that implements the rest of the method including sampling cases but it may be faster to grow the forest to disk as in the growforest utility. Prediction and Voting is done using Tree.Vote and CatBallotBox and NumBallotBox which implement the VoteTallyer interface.
Package validator implements value validations for structs and individual fields based on tags. It can also handle Cross-Field and Cross-Struct validation for nested structs and has the ability to dive into arrays and maps of any type. see more examples https://github.com/go-playground/validator/tree/v9/_examples Doing things this way is actually the way the standard library does, see the file.Open method here: The authors return type "error" to avoid the issue discussed in the following, where err is always != nil: Validator only InvalidValidationError for bad validation input, nil or ValidationErrors as type error; so, in your code all you need to do is check if the error returned is not nil, and if it's not check if error is InvalidValidationError ( if necessary, most of the time it isn't ) type cast it to type ValidationErrors like so err.(validator.ValidationErrors). Custom Validation functions can be added. Example: Cross-Field Validation can be done via the following tags: If, however, some custom cross-field validation is required, it can be done using a custom validation. Why not just have cross-fields validation tags (i.e. only eqcsfield and not eqfield)? The reason is efficiency. If you want to check a field within the same struct "eqfield" only has to find the field on the same struct (1 level). But, if we used "eqcsfield" it could be multiple levels down. Example: Multiple validators on a field will process in the order defined. Example: Bad Validator definitions are not handled by the library. Example: Baked In Cross-Field validation only compares fields on the same struct. If Cross-Field + Cross-Struct validation is needed you should implement your own custom validator. Comma (",") is the default separator of validation tags. If you wish to have a comma included within the parameter (i.e. excludesall=,) you will need to use the UTF-8 hex representation 0x2C, which is replaced in the code as a comma, so the above will become excludesall=0x2C. Pipe ("|") is the 'or' validation tags deparator. If you wish to have a pipe included within the parameter i.e. excludesall=| you will need to use the UTF-8 hex representation 0x7C, which is replaced in the code as a pipe, so the above will become excludesall=0x7C Here is a list of the current built in validators: Tells the validation to skip this struct field; this is particularly handy in ignoring embedded structs from being validated. (Usage: -) This is the 'or' operator allowing multiple validators to be used and accepted. (Usage: rbg|rgba) <-- this would allow either rgb or rgba colors to be accepted. This can also be combined with 'and' for example ( Usage: omitempty,rgb|rgba) When a field that is a nested struct is encountered, and contains this flag any validation on the nested struct will be run, but none of the nested struct fields will be validated. This is useful if inside of your program you know the struct will be valid, but need to verify it has been assigned. NOTE: only "required" and "omitempty" can be used on a struct itself. Same as structonly tag except that any struct level validations will not run. Allows conditional validation, for example if a field is not set with a value (Determined by the "required" validator) then other validation such as min or max won't run, but if a value is set validation will run. This tells the validator to dive into a slice, array or map and validate that level of the slice, array or map with the validation tags that follow. Multidimensional nesting is also supported, each level you wish to dive will require another dive tag. dive has some sub-tags, 'keys' & 'endkeys', please see the Keys & EndKeys section just below. Example #1 Example #2 Keys & EndKeys These are to be used together directly after the dive tag and tells the validator that anything between 'keys' and 'endkeys' applies to the keys of a map and not the values; think of it like the 'dive' tag, but for map keys instead of values. Multidimensional nesting is also supported, each level you wish to validate will require another 'keys' and 'endkeys' tag. These tags are only valid for maps. Example #1 Example #2 This validates that the value is not the data types default zero value. For numbers ensures value is not zero. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. The field under validation must be present and not empty only if any of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if all of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must be present and not empty only when any of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only when all of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: This validates that the value is the default value and is almost the opposite of required. For numbers, length will ensure that the value is equal to the parameter given. For strings, it checks that the string length is exactly that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, max will ensure that the value is less than or equal to the parameter given. For strings, it checks that the string length is at most that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, min will ensure that the value is greater or equal to the parameter given. For strings, it checks that the string length is at least that number of characters. For slices, arrays, and maps, validates the number of items. For strings & numbers, eq will ensure that the value is equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings & numbers, ne will ensure that the value is not equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings, ints, and uints, oneof will ensure that the value is one of the values in the parameter. The parameter should be a list of values separated by whitespace. Values may be strings or numbers. For numbers, this will ensure that the value is greater than the parameter given. For strings, it checks that the string length is greater than that number of characters. For slices, arrays and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than time.Now.UTC(). Same as 'min' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than or equal to time.Now.UTC(). For numbers, this will ensure that the value is less than the parameter given. For strings, it checks that the string length is less than that number of characters. For slices, arrays, and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than time.Now.UTC(). Same as 'max' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than or equal to time.Now.UTC(). This will validate the field value against another fields value either within a struct or passed in field. Example #1: Example #2: Field Equals Another Field (relative) This does the same as eqfield except that it validates the field provided relative to the top level struct. This will validate the field value against another fields value either within a struct or passed in field. Examples: Field Does Not Equal Another Field (relative) This does the same as nefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltefield except that it validates the field provided relative to the top level struct. This does the same as contains except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. This does the same as excludes except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. For arrays & slices, unique will ensure that there are no duplicates. For maps, unique will ensure that there are no duplicate values. For slices of struct, unique will ensure that there are no duplicate values in a field of the struct specified via a parameter. This validates that a string value contains ASCII alpha characters only This validates that a string value contains ASCII alphanumeric characters only This validates that a string value contains unicode alpha characters only This validates that a string value contains unicode alphanumeric characters only This validates that a string value contains a basic numeric value. basic excludes exponents etc... for integers or float it returns true. This validates that a string value contains a valid hexadecimal. This validates that a string value contains a valid hex color including hashtag (#) This validates that a string value contains a valid rgb color This validates that a string value contains a valid rgba color This validates that a string value contains a valid hsl color This validates that a string value contains a valid hsla color This validates that a string value contains a valid email This may not conform to all possibilities of any rfc standard, but neither does any email provider accept all possibilities. This validates that a string value contains a valid file path and that the file exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid url This will accept any url the golang request uri accepts but must contain a schema for example http:// or rtmp:// This validates that a string value contains a valid uri This will accept any uri the golang request uri accepts This validataes that a string value contains a valid URN according to the RFC 2141 spec. This validates that a string value contains a valid base64 value. Although an empty string is valid base64 this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value according the the RFC4648 spec. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid bitcoin address. The format of the string is checked to ensure it matches one of the three formats P2PKH, P2SH and performs checksum validation. Bitcoin Bech32 Address (segwit) This validates that a string value contains a valid bitcoin Bech32 address as defined by bip-0173 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki) Special thanks to Pieter Wuille for providng reference implementations. This validates that a string value contains a valid ethereum address. The format of the string is checked to ensure it matches the standard Ethereum address format Full validation is blocked by https://github.com/golang/crypto/pull/28 This validates that a string value contains the substring value. This validates that a string value contains any Unicode code points in the substring value. This validates that a string value contains the supplied rune value. This validates that a string value does not contain the substring value. This validates that a string value does not contain any Unicode code points in the substring value. This validates that a string value does not contain the supplied rune value. This validates that a string value starts with the supplied string value This validates that a string value ends with the supplied string value This validates that a string value contains a valid isbn10 or isbn13 value. This validates that a string value contains a valid isbn10 value. This validates that a string value contains a valid isbn13 value. This validates that a string value contains a valid UUID. Uppercase UUID values will not pass - use `uuid_rfc4122` instead. This validates that a string value contains a valid version 3 UUID. Uppercase UUID values will not pass - use `uuid3_rfc4122` instead. This validates that a string value contains a valid version 4 UUID. Uppercase UUID values will not pass - use `uuid4_rfc4122` instead. This validates that a string value contains a valid version 5 UUID. Uppercase UUID values will not pass - use `uuid5_rfc4122` instead. This validates that a string value contains only ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains only printable ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains one or more multibyte characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains a valid DataURI. NOTE: this will also validate that the data portion is valid base64 This validates that a string value contains a valid latitude. This validates that a string value contains a valid longitude. This validates that a string value contains a valid U.S. Social Security Number. This validates that a string value contains a valid IP Address. This validates that a string value contains a valid v4 IP Address. This validates that a string value contains a valid v6 IP Address. This validates that a string value contains a valid CIDR Address. This validates that a string value contains a valid v4 CIDR Address. This validates that a string value contains a valid v6 CIDR Address. This validates that a string value contains a valid resolvable TCP Address. This validates that a string value contains a valid resolvable v4 TCP Address. This validates that a string value contains a valid resolvable v6 TCP Address. This validates that a string value contains a valid resolvable UDP Address. This validates that a string value contains a valid resolvable v4 UDP Address. This validates that a string value contains a valid resolvable v6 UDP Address. This validates that a string value contains a valid resolvable IP Address. This validates that a string value contains a valid resolvable v4 IP Address. This validates that a string value contains a valid resolvable v6 IP Address. This validates that a string value contains a valid Unix Address. This validates that a string value contains a valid MAC Address. Note: See Go's ParseMAC for accepted formats and types: This validates that a string value is a valid Hostname according to RFC 952 https://tools.ietf.org/html/rfc952 This validates that a string value is a valid Hostname according to RFC 1123 https://tools.ietf.org/html/rfc1123 Full Qualified Domain Name (FQDN) This validates that a string value contains a valid FQDN. This validates that a string value appears to be an HTML element tag including those described at https://developer.mozilla.org/en-US/docs/Web/HTML/Element This validates that a string value is a proper character reference in decimal or hexadecimal format This validates that a string value is percent-encoded (URL encoded) according to https://tools.ietf.org/html/rfc3986#section-2.1 This validates that a string value contains a valid directory and that it exists on the machine. This is done using os.Stat, which is a platform independent function. NOTE: When returning an error, the tag returned in "FieldError" will be the alias tag unless the dive tag is part of the alias. Everything after the dive tag is not reported as the alias tag. Also, the "ActualTag" in the before case will be the actual tag within the alias that failed. Here is a list of the current built in alias tags: Validator notes: A collection of validation rules that are frequently needed but are more complex than the ones found in the baked in validators. A non standard validator must be registered manually like you would with your own custom validation functions. Example of registration and use: Here is a list of the current non standard validators: This package panics when bad input is provided, this is by design, bad code like that should not make it to production.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package fetchbot provides a simple and flexible web crawler that follows the robots.txt policies and crawl delays. It is very much a rewrite of gocrawl (https://github.com/PuerkitoBio/gocrawl) with a simpler API, less features built-in, but at the same time more flexibility. As for Go itself, sometimes less is more! To install, simply run in a terminal: The package has a single external dependency, robotstxt (https://github.com/temoto/robotstxt-go). It also integrates code from the iq package (https://github.com/kylelemons/iq). The API documentation is available on godoc.org (http://godoc.org/github.com/PuerkitoBio/fetchbot). The following example (taken from /example/short/main.go) shows how to create and start a Fetcher, one way to send commands, and how to stop the fetcher once all commands have been handled. A more complex and complete example can be found in the repository, at /example/full/. Basically, a Fetcher is an instance of a web crawler, independent of other Fetchers. It receives Commands via the Queue, executes the requests, and calls a Handler to process the responses. A Command is an interface that tells the Fetcher which URL to fetch, and which HTTP method to use (i.e. "GET", "HEAD", ...). A call to Fetcher.Start() returns the Queue associated with this Fetcher. This is the thread-safe object that can be used to send commands, or to stop the crawler. Both the Command and the Handler are interfaces, and may be implemented in various ways. They are defined like so: A Context is a struct that holds the Command and the Queue, so that the Handler always knows which Command initiated this call, and has a handle to the Queue. A Handler is similar to the net/http Handler, and middleware-style combinations can be built on top of it. A HandlerFunc type is provided so that simple functions with the right signature can be used as Handlers (like net/http.HandlerFunc), and there is also a multiplexer Mux that can be used to dispatch calls to different Handlers based on some criteria. The Fetcher recognizes a number of interfaces that the Command may implement, for more advanced needs. * BasicAuthProvider: Implement this interface to specify the basic authentication credentials to set on the request. * CookiesProvider: If the Command implements this interface, the provided Cookies will be set on the request. * HeaderProvider: Implement this interface to specify the headers to set on the request. * ReaderProvider: Implement this interface to set the body of the request, via an io.Reader. * ValuesProvider: Implement this interface to set the body of the request, as form-encoded values. If the Content-Type is not specifically set via a HeaderProvider, it is set to "application/x-www-form-urlencoded". ReaderProvider and ValuesProvider should be mutually exclusive as they both set the body of the request. If both are implemented, the ReaderProvider interface is used. * Handler: Implement this interface if the Command's response should be handled by a specific callback function. By default, the response is handled by the Fetcher's Handler, but if the Command implements this, this handler function takes precedence and the Fetcher's Handler is ignored. Since the Command is an interface, it can be a custom struct that holds additional information, such as an ID for the URL (e.g. from a database), or a depth counter so that the crawling stops at a certain depth, etc. For basic commands that don't require additional information, the package provides the Cmd struct that implements the Command interface. This is the Command implementation used when using the various Queue.SendString\* methods. There is also a convenience HandlerCmd struct for the commands that should be handled by a specific callback function. It is a Command with a Handler interface implementation. The Fetcher has a number of fields that provide further customization: * HttpClient : By default, the Fetcher uses the net/http default Client to make requests. A different client can be set on the Fetcher.HttpClient field. * CrawlDelay : That value is used only if there is no delay specified by the robots.txt of a given host. * UserAgent : Sets the user agent string to use for the requests and to validate against the robots.txt entries. * WorkerIdleTTL : Sets the duration that a worker goroutine can wait without receiving new commands to fetch. If the idle time-to-live is reached, the worker goroutine is stopped and its resources are released. This can be especially useful for long-running crawlers. * AutoClose : If true, closes the queue automatically once the number of active hosts reach 0. * DisablePoliteness : If true, ignores the robots.txt policies of the hosts. What fetchbot doesn't do - especially compared to gocrawl - is that it doesn't keep track of already visited URLs, and it doesn't normalize the URLs. This is outside the scope of this package - all commands sent on the Queue will be fetched. Normalization can easily be done (e.g. using https://github.com/PuerkitoBio/purell) before sending the Command to the Fetcher. How to keep track of visited URLs depends on the use-case of the specific crawler, but for an example, see /example/full/main.go. The BSD 3-Clause license (http://opensource.org/licenses/BSD-3-Clause), the same as the Go language. The iq_slice.go file is under the CDDL-1.0 license (details in the source file).
SQL Schema migration tool for Go. Key features: To install the library and command line program, use the following: The main command is called sql-migrate. Each command requires a configuration file (which defaults to dbconfig.yml, but can be specified with the -config flag). This config file should specify one or more environments: The `table` setting is optional and will default to `gorp_migrations`. The environment that will be used can be specified with the -env flag (defaults to development). Use the --help flag in combination with any of the commands to get an overview of its usage: The up command applies all available migrations. By contrast, down will only apply one migration by default. This behavior can be changed for both by using the -limit parameter. The redo command will unapply the last migration and reapply it. This is useful during development, when you're writing migrations. Use the status command to see the state of the applied migrations: Import sql-migrate into your application: Set up a source of migrations, this can be from memory, from a set of files or from bindata (more on that later): Then use the Exec function to upgrade your database: Note that n can be greater than 0 even if there is an error: any migration that succeeded will remain applied even if a later one fails. The full set of capabilities can be found in the API docs below. Migrations are defined in SQL files, which contain a set of SQL statements. Special comments are used to distinguish up and down migrations. You can put multiple statements in each block, as long as you end them with a semicolon (;). If you have complex statements which contain semicolons, use StatementBegin and StatementEnd to indicate boundaries: The order in which migrations are applied is defined through the filename: sql-migrate will sort migrations based on their name. It's recommended to use an increasing version number or a timestamp as the first part of the filename. Normally each migration is run within a transaction in order to guarantee that it is fully atomic. However some SQL commands (for example creating an index concurrently in PostgreSQL) cannot be executed inside a transaction. In order to execute such a command in a migration, the migration can be run using the notransaction option: If you like your Go applications self-contained (that is: a single binary): use bindata (https://github.com/jteeuwen/go-bindata) to embed the migration files. Just write your migration files as usual, as a set of SQL files in a folder. Then use bindata to generate a .go file with the migrations embedded: The resulting bindata.go file will contain your migrations. Remember to regenerate your bindata.go file whenever you add/modify a migration (go generate will help here, once it arrives). Use the AssetMigrationSource in your application to find the migrations: Both Asset and AssetDir are functions provided by bindata. Then proceed as usual. Adding a new migration source means implementing MigrationSource. The resulting slice of migrations will be executed in the given order, so it should usually be sorted by the Id field.
A filesystem hierarchy synchronizer Rename files in TARGET so that identical files found in SOURCE and TARGET have the same relative path. The main goal of the program is to make folders synchronization faster by sparing big file transfers when a simple rename suffices. It complements other synchronization programs that lack this capability. See https://ambrevar.xyz/hsync and 'hsync -h' for more details. Usage: For usage options, see: We store the file entries in the following structure: This 'entries' map indexes the possible file matches by content ('partialHash'). A file match references the paths that will be used to rename the file in TARGET from 'oldpath' to 'newpath'. Note that 'newpath' is given by 'sourceID.path', and 'oldpath' by 'targetID.path'. The algorithm is centered around one main optimization: rolling-checksums. We assume that two files match if they have the same partial hash. The initial partial hash is just the size with an empty hash. This speeds up the process since this saves an open/close of the file. We just need a 'stat'. Files will not be read unless a rolling-checksum is required. As a consequence, unreadable files with a unique size will be stored in 'entries', while unreadable conflicting files will be discarded. Note that the system allows to rename files that cannot be read. One checksum roll increments 'pos' and updates the hash by hashing the next BLOCKSIZE bytes of the file. BLOCKSIZE is set to a value that is commonly believed to be optimal in most cases. The optimal value would be the device blocksize where the file resides. It would be more complex and memory consuming to query this value for each file. We choose md5 (128 bits) as the checksum algorithm. Adler32, CRC-32 and CRC-64 are only a tiny little faster while suffering from more clashes. This choice should be backed up with a proper benchmark. A conflict arises when two files in either SOURCE or TARGET have the same partial hash. We solve the conflict by updating the partial hashes until they differ. If the partial hashes cannot be updated any further (i.e. we reached end-of-file), it means that the files are duplicates. Notes: - Partial hashes of conflicting files will be complete at the same roll since they have the same size. - When a partial hash is complete, we have the following relation: - There is only one possible conflicting file at a time. A file match may be erroneous if the partial hash is not complete. The most obvious case is when two different files are the only ones of size N in SOURCE and TARGET. This down-side is a consequence of the design choice, i.e. focus on speed. Erroneous matches can be corrected in the preview file. If we wanted no ambiguity, we would have to compute the full hashes and this would take approximately as much time as copying files from SOURCE to TARGET, like a regular synchronization tool would do. We store the digest 'hash.Hash' together with the file path for when we update a partial hash. Process: 1. We walk SOURCE completely. Only regular files are processed. The 'sourceID' are stored. If two entries conflict (they have the same partial hash), we compute update the partial hashes until they do not conflict anymore. If the conflict is not resolvable, i.e. the partial hash is complete and files are identical, we drop both files from 'entries'. Future files can have the same partial hash that led to a former conflict. To distinguish the content from former conflicts when adding a new file, we must compute the partial hash up to the 'pos' of the last conflict (the number of checksum rolls). To keep track of this 'pos' when there is a conflict, we mark all computed partial hash as dummy values. When the next entry will be added, we will have to compute the partial hash until it does not match a dummy value in 'entries'. Duplicates are not processed but display a warning. Usually the user does not want duplicates, so she is better off fixing them before processing with the renames. It would add a lot of complexity to handle duplicates properly. 2. We walk TARGET completely. We skip all dummies as source the SOURCE walk. We need to analyze SOURCE completely before we can check for matches. - If there are only dummy entries, there was an unsolvable conflict in SOURCE. We drop the file. - If we end on a non-empty entry with an 'unsolvable' targetID, it means that an unsolvable conflict with target files happened with this partial hash. This is only possible at end-of-file. We drop the file. - If we end on an empty entry, there is no match with SOURCE and we drop the file. - If we end on a non-empty entry without previous matches, we store the match. - Else we end on a non-empty entry with one match already present. This is a conflict. We solve the conflict as for the SOURCE walk except that we need to update the partial hashes of three files: the SOURCE file, the first TARGET match and the new TARGET match. 3. We generate the 'renameOps' and 'reverseOps' maps. They map 'oldpath' to 'newpath' and 'newpath' to 'oldpath' respectively. We drop entries where 'oldpath==newpath' to spare a lot of noise. Note that file names are not used to compute a match since they could be identical while the content would be different. 4. We proceed with the renames. Chains and cycles may occur. - Example of a chain of renames: a->b, b->c, c->d. - Example of a cycle of renames: a->b, b->c, c->a. TARGET must be fully analyzed before proceeding with the renames so that we can detect chains. We always traverse chains until we reach the end, then rename the elements while going backward till the beginning. The beginning can be before the entry point. 'reverseOps' is used for going backward. When a cycle is detected, we break it down to a chain. We rename one file to a temporary name. Then we add this new file to the other end of the chain so that it gets renamed to its original new name once all files have been processed.
Package htree implements the in-memory hash tree. Hash-Tree is a key-value multi-tree with fast indexing performance and high space utilization. Take 10 consecutive prime numbers: And they can distinguish all uint32 numbers: Key insertion steps: Example htree: Child nodes are stored in an array orderly, and checked by binary-search but not indexed by remainders, array indexing will result redundancy entries, this is for less memory usage, with a bit performance loss. A node is created only when a new key is inserted. So the worst time complexity is O(Sum(lg2~lg29)), is constant level, and the entries utilization is 100%. Although hashtable is very fast with O(1) time complexity, but there is always about ~25% table entries are unused, because the hash-table load factor is usually .75. And this htree is suitable for memory-bounded cases. HTree is better for local locks if you want a safe container. Map may need to rehash and resize on insertions. No. Lock granularity depends on the use case.
Package skipper provides an HTTP routing library with flexible configuration as well as a runtime update of the routing rules. Skipper works as an HTTP reverse proxy that is responsible for mapping incoming requests to multiple HTTP backend services, based on routes that are selected by the request attributes. At the same time, both the requests and the responses can be augmented by a filter chain that is specifically defined for each route. Skipper can load and update the route definitions from multiple data sources without being restarted. It provides a default executable command with a few built-in filters, however, its primary use case is to be extended with custom filters, predicates or data sources. For futher information read 'Extending Skipper'. Skipper took the core design and inspiration from Vulcand: https://github.com/mailgun/vulcand. Skipper is 'go get' compatible. If needed, create a 'go workspace' first: Get the Skipper packages: Create a file with a route: Optionally, verify the syntax of the file: Start Skipper and make an HTTP request: The core of Skipper's request processing is implemented by a reverse proxy in the 'proxy' package. The proxy receives the incoming request, forwards it to the routing engine in order to receive the most specific matching route. When a route matches, the request is forwarded to all filters defined by it. The filters can modify the request or execute any kind of program logic. Once the request has been processed by all the filters, it is forwarded to the backend endpoint of the route. The response from the backend goes once again through all the filters in reverse order. Finally, it is mapped as the response of the original incoming request. Besides the default proxying mechanism, it is possible to define routes without a real network backend endpoint. One of these cases is called a 'shunt' backend, in which case one of the filters needs to handle the request providing its own response (e.g. the 'static' filter). Actually, filters themselves can instruct the request flow to shunt by calling the Serve(*http.Response) method of the filter context. Another case of a route without a network backend is the 'loopback'. A loopback route can be used to match a request, modified by filters, against the lookup tree with different conditions and then execute a different route. One example scenario can be to use a single route as an entry point to execute some calculation to get an A/B testing decision and then matching the updated request metadata for the actual destination route. This way the calculation can be executed for only those requests that don't contain information about a previously calculated decision. For further details, see the 'proxy' and 'filters' package documentation. Finding a request's route happens by matching the request attributes to the conditions in the route's definitions. Such definitions may have the following conditions: - method - path (optionally with wildcards) - path regular expressions - host regular expressions - headers - header regular expressions It is also possible to create custom predicates with any other matching criteria. The relation between the conditions in a route definition is 'and', meaning, that a request must fulfill each condition to match a route. For further details, see the 'routing' package documentation. Filters are applied in order of definition to the request and in reverse order to the response. They are used to modify request and response attributes, such as headers, or execute background tasks, like logging. Some filters may handle the requests without proxying them to service backends. Filters, depending on their implementation, may accept/require parameters, that are set specifically to the route. For further details, see the 'filters' package documentation. Each route has one of the following backends: HTTP endpoint, shunt or loopback. Backend endpoints can be any HTTP service. They are specified by their network address, including the protocol scheme, the domain name or the IP address, and optionally the port number: e.g. "https://www.example.org:4242". (The path and query are sent from the original request, or set by filters.) A shunt route means that Skipper handles the request alone and doesn't make requests to a backend service. In this case, it is the responsibility of one of the filters to generate the response. A loopback route executes the routing mechanism on current state of the request from the start, including the route lookup. This way it serves as a form of an internal redirect. Route definitions consist of the following: - request matching conditions (predicates) - filter chain (optional) - backend (either an HTTP endpoint or a shunt) The eskip package implements the in-memory and text representations of route definitions, including a parser. (Note to contributors: in order to stay compatible with 'go get', the generated part of the parser is stored in the repository. When changing the grammar, 'go generate' needs to be executed explicitly to update the parser.) For further details, see the 'eskip' package documentation Skipper's route definitions of Skipper are loaded from one or more data sources. It can receive incremental updates from those data sources at runtime. It provides three different data clients: - Innkeeper: the Innkeeper service implements a storage for large sets of Skipper routes, with an HTTP+JSON API, OAuth2 authentication and role management. See the 'innkeeper' package and https://github.com/zalando/innkeeper. - etcd: Skipper can load routes and receive updates from etcd clusters (https://github.com/coreos/etcd). See the 'etcd' package. - static file: package eskipfile implements a simple data client, which can load route definitions from a static file in eskip format. Currently, it loads the routes on startup. It doesn't support runtime updates. Skipper can use additional data sources, provided by extensions. Sources must implement the DataClient interface in the routing package. Skipper can be started with the default executable command 'skipper', or as a library built into an application. The easiest way to start Skipper as a library is to execute the 'Run' function of the current, root package. Each option accepted by the 'Run' function is wired in the default executable as well, as a command line flag. E.g. EtcdUrls becomes -etcd-urls as a comma separated list. For command line help, enter: An additional utility, eskip, can be used to verify, print, update and delete routes from/to files or etcd (Innkeeper on the roadmap). See the cmd/eskip command package, and/or enter in the command line: Skipper doesn't use dynamically loaded plugins, however, it can be used as a library, and it can be extended with custom predicates, filters and/or custom data sources. To create a custom predicate, one needs to implement the PredicateSpec interface in the routing package. Instances of the PredicateSpec are used internally by the routing package to create the actual Predicate objects as referenced in eskip routes, with concrete arguments. Example, randompredicate.go: In the above example, a custom predicate is created, that can be referenced in eskip definitions with the name 'Random': To create a custom filter we need to implement the Spec interface of the filters package. 'Spec' is the specification of a filter, and it is used to create concrete filter instances, while the raw route definitions are processed. Example, hellofilter.go: The above example creates a filter specification, and in the routes where they are included, the filter instances will set the 'X-Hello' header for each and every response. The name of the filter is 'hello', and in a route definition it is referenced as: The easiest way to create a custom Skipper variant is to implement the required filters (as in the example above) by importing the Skipper package, and starting it with the 'Run' command. Example, hello.go: A file containing the routes, routes.eskip: Start the custom router: The 'Run' function in the root Skipper package starts its own listener but it doesn't provide the best composability. The proxy package, however, provides a standard http.Handler, so it is possible to use it in a more complex solution as a building block for routing. Skipper provides detailed logging of failures, and access logs in Apache log format. Skipper also collects detailed performance metrics, and exposes them on a separate listener endpoint for pulling snapshots. For details, see the 'logging' and 'metrics' packages documentation. The router's performance depends on the environment and on the used filters. Under ideal circumstances, and without filters, the biggest time factor is the route lookup. Skipper is able to scale to thousands of routes with logarithmic performance degradation. However, this comes at the cost of increased memory consumption, due to storing the whole lookup tree in a single structure. Benchmarks for the tree lookup can be run by: In case more aggressive scale is needed, it is possible to setup Skipper in a cascade model, with multiple Skipper instances for specific route segments.
Package fetchbot provides a simple and flexible web crawler that follows the robots.txt policies and crawl delays. It is very much a rewrite of gocrawl (https://github.com/PuerkitoBio/gocrawl) with a simpler API, less features built-in, but at the same time more flexibility. As for Go itself, sometimes less is more! To install, simply run in a terminal: The package has a single external dependency, robotstxt (https://github.com/temoto/robotstxt-go). It also integrates code from the iq package (https://github.com/kylelemons/iq). The API documentation is available on godoc.org (http://godoc.org/github.com/PuerkitoBio/fetchbot). The following example (taken from /example/short/main.go) shows how to create and start a Fetcher, one way to send commands, and how to stop the fetcher once all commands have been handled. A more complex and complete example can be found in the repository, at /example/full/. Basically, a Fetcher is an instance of a web crawler, independent of other Fetchers. It receives Commands via the Queue, executes the requests, and calls a Handler to process the responses. A Command is an interface that tells the Fetcher which URL to fetch, and which HTTP method to use (i.e. "GET", "HEAD", ...). A call to Fetcher.Start() returns the Queue associated with this Fetcher. This is the thread-safe object that can be used to send commands, or to stop the crawler. Both the Command and the Handler are interfaces, and may be implemented in various ways. They are defined like so: A Context is a struct that holds the Command and the Queue, so that the Handler always knows which Command initiated this call, and has a handle to the Queue. A Handler is similar to the net/http Handler, and middleware-style combinations can be built on top of it. A HandlerFunc type is provided so that simple functions with the right signature can be used as Handlers (like net/http.HandlerFunc), and there is also a multiplexer Mux that can be used to dispatch calls to different Handlers based on some criteria. The Fetcher recognizes a number of interfaces that the Command may implement, for more advanced needs. * BasicAuthProvider: Implement this interface to specify the basic authentication credentials to set on the request. * CookiesProvider: If the Command implements this interface, the provided Cookies will be set on the request. * HeaderProvider: Implement this interface to specify the headers to set on the request. * ReaderProvider: Implement this interface to set the body of the request, via an io.Reader. * ValuesProvider: Implement this interface to set the body of the request, as form-encoded values. If the Content-Type is not specifically set via a HeaderProvider, it is set to "application/x-www-form-urlencoded". ReaderProvider and ValuesProvider should be mutually exclusive as they both set the body of the request. If both are implemented, the ReaderProvider interface is used. * Handler: Implement this interface if the Command's response should be handled by a specific callback function. By default, the response is handled by the Fetcher's Handler, but if the Command implements this, this handler function takes precedence and the Fetcher's Handler is ignored. Since the Command is an interface, it can be a custom struct that holds additional information, such as an ID for the URL (e.g. from a database), or a depth counter so that the crawling stops at a certain depth, etc. For basic commands that don't require additional information, the package provides the Cmd struct that implements the Command interface. This is the Command implementation used when using the various Queue.SendString\* methods. There is also a convenience HandlerCmd struct for the commands that should be handled by a specific callback function. It is a Command with a Handler interface implementation. The Fetcher has a number of fields that provide further customization: * HttpClient : By default, the Fetcher uses the net/http default Client to make requests. A different client can be set on the Fetcher.HttpClient field. * CrawlDelay : That value is used only if there is no delay specified by the robots.txt of a given host. * UserAgent : Sets the user agent string to use for the requests and to validate against the robots.txt entries. * WorkerIdleTTL : Sets the duration that a worker goroutine can wait without receiving new commands to fetch. If the idle time-to-live is reached, the worker goroutine is stopped and its resources are released. This can be especially useful for long-running crawlers. * AutoClose : If true, closes the queue automatically once the number of active hosts reach 0. * DisablePoliteness : If true, ignores the robots.txt policies of the hosts. What fetchbot doesn't do - especially compared to gocrawl - is that it doesn't keep track of already visited URLs, and it doesn't normalize the URLs. This is outside the scope of this package - all commands sent on the Queue will be fetched. Normalization can easily be done (e.g. using https://github.com/PuerkitoBio/purell) before sending the Command to the Fetcher. How to keep track of visited URLs depends on the use-case of the specific crawler, but for an example, see /example/full/main.go. The BSD 3-Clause license (http://opensource.org/licenses/BSD-3-Clause), the same as the Go language. The iq_slice.go file is under the CDDL-1.0 license (details in the source file).
Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include: Breiman and Cutler's Random Forest for Classification and Regression Adaptive Boosting (AdaBoost) Classification Gradiant Boosting Tree Regression Entropy and Cost driven classification L1 regression Feature selection with artificial contrasts Proximity and model structure analysis Roughly balanced bagging for unbalanced classification The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and can not yet be released. Library Documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/ryanbressler/CloudForest Documentation of command line utilities and file formats can be found in README.md, which can be viewed fromated on github: http://github.com/ryanbressler/CloudForest Pull requests and bug reports are welcome. CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi threaded and cluster environments in mind. Go's support for function types is used to provide a interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method: This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects. Decision tree's are grown with the goal of reducing "Impurity" which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface which allows for alternative definitions of impurity. CloudForest includes several alternative targets: Additional targets can be stacked on top of these target to add boosting functionality: Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In go, slices pointing at the same underlying arrays make this sort of optimization transparent. For example a function like: can return left and right slices that point to the same underlying array as the original slice of cases but these slices should not have their values changed. Functions used while searching for the best split also accepts pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items and its use can be seen in functions like: For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit which relies on go's sorting package. Training a Random forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as the are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, distributed database etc. By default cloud forest uses a fast heuristic for missing values. When proposing a split on a feature with missing data the missing cases are removed and the impurity value is corrected to use three way impurity which reduces the bias towards features with lots of missing data: Missing values in the target variable are left out of impurity calculations. This provided generally good results at a fraction of the computational costs of imputing data. Optionally, feature.ImputeMissing or featurematrixImputeMissing can be called before forest growth to impute missing values to the feature mean/mode which Brieman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity weighted imputation Brieman describes. Experimental support is provided for 3 way splitting which splits missing cases onto a third branch. [2] This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3] [1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1 [2] https://code.google.com/p/rf-ace/ [3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record In CloudForest data is stored using the FeatureMatrix struct which contains Features. The Feature struct implements storage and methods for both categorical and numerical data and calculations of impurity etc and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable. This allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow implements Brieman and Cutler's method (see extract above) for growing a tree. A GrowForest method is also provided that implements the rest of the method including sampling cases but it may be faster to grow the forest to disk as in the growforest utility. Prediction and Voting is done using Tree.Vote and CatBallotBox and NumBallotBox which implement the VoteTallyer interface.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
bíogo is a bioinformatics library for the Go language. It is a work in progress. bíogo stems from the need to address the size and structure of modern genomic and metagenomic data sets. These properties enforce requirements on the libraries and languages used for analysis: In addition to the computational burden of massive data set sizes in modern genomics there is an increasing need for complex pipelines to resolve questions in tightening problem space and also a developing need to be able to develop new algorithms to allow novel approaches to interesting questions. These issues suggest the need for a simplicity in syntax to facilitate: Related to the second issue is the reluctance of some researchers to release code because of quality concerns http://www.nature.com/news/2010/101013/full/467753a.html The issue of code release is the first of the principles formalised in the Science Code Manifesto http://sciencecodemanifesto.org/ A language with a simple, yet expressive, syntax should facilitate development of higher quality code and thus help reduce this barrier to research code release. It seems that nearly every language has it own bioinformatics library, some of which are very mature, for example BioPerl and BioPython. Why add another one? The different libraries excel in different fields, acting as scripting glue for applications in a pipeline (much of [1-3]) and interacting with external hosts [1, 2, 4, 5], wrapping lower level high performance languages with more user friendly syntax [1-4] or providing bioinformatics functions for high performance languages [5, 6]. The intended niche for bíogo lies somewhere between the scripting libraries and high performance language libraries in being easy to use for both small and large projects while having reasonable performance with computationally intensive tasks. The intent is to reduce the level of investment required to develop new research software for computationally intensive tasks. The bíogo library structure is influenced both by the structure of BioPerl and the Go core libraries. The coding style should be aligned with normal Go idioms as represented in the Go core libraries. Position numbering in the bíogo library conforms to the zero-based indexing of Go and range indexing conforms to Go's half-open zero-based slice indexing. This is at odds with the 'normal' inclusive indexing used by molecular biologists. This choice was made to avoid inconsistent indexing spaces being used — one-based inclusive for bíogo functions and methods and zero-based for native Go slices and arrays — and so avoid errors that this would otherwise facilitate. Note that the GFF package does allow, and defaults to, one-based inclusive indexing in its input and output of GFF files. Quality scores are supported for all sequence types, including protein. Phred and Solexa scoring systems are able to be read from files, however internal representation of quality scores is with Phred, so there will be precision loss in conversion. A Solexa quality score type is provided for use where this will be a problem. Copyright ©2011-2012 The bíogo Authors except where otherwise noted. All rights reserved. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file.
Package skipper provides an HTTP routing library with flexible configuration as well as a runtime update of the routing rules. Skipper works as an HTTP reverse proxy that is responsible for mapping incoming requests to multiple HTTP backend services, based on routes that are selected by the request attributes. At the same time, both the requests and the responses can be augmented by a filter chain that is specifically defined for each route. Skipper can load and update the route definitions from multiple data sources without being restarted. It provides a default executable command with a few built-in filters, however, its primary use case is to be extended with custom filters, predicates or data sources. For futher information read 'Extending Skipper'. Skipper took the core design and inspiration from Vulcand: https://github.com/mailgun/vulcand. Skipper is 'go get' compatible. If needed, create a 'go workspace' first: Get the Skipper packages: Create a file with a route: Optionally, verify the syntax of the file: Start Skipper and make an HTTP request: The core of Skipper's request processing is implemented by a reverse proxy in the 'proxy' package. The proxy receives the incoming request, forwards it to the routing engine in order to receive the most specific matching route. When a route matches, the request is forwarded to all filters defined by it. The filters can modify the request or execute any kind of program logic. Once the request has been processed by all the filters, it is forwarded to the backend endpoint of the route. The response from the backend goes once again through all the filters in reverse order. Finally, it is mapped as the response of the original incoming request. Besides the default proxying mechanism, it is possible to define routes without a real network backend endpoint. One of these cases is called a 'shunt' backend, in which case one of the filters needs to handle the request providing its own response (e.g. the 'static' filter). Actually, filters themselves can instruct the request flow to shunt by calling the Serve(*http.Response) method of the filter context. Another case of a route without a network backend is the 'loopback'. A loopback route can be used to match a request, modified by filters, against the lookup tree with different conditions and then execute a different route. One example scenario can be to use a single route as an entry point to execute some calculation to get an A/B testing decision and then matching the updated request metadata for the actual destination route. This way the calculation can be executed for only those requests that don't contain information about a previously calculated decision. For further details, see the 'proxy' and 'filters' package documentation. Finding a request's route happens by matching the request attributes to the conditions in the route's definitions. Such definitions may have the following conditions: - method - path (optionally with wildcards) - path regular expressions - host regular expressions - headers - header regular expressions It is also possible to create custom predicates with any other matching criteria. The relation between the conditions in a route definition is 'and', meaning, that a request must fulfill each condition to match a route. For further details, see the 'routing' package documentation. Filters are applied in order of definition to the request and in reverse order to the response. They are used to modify request and response attributes, such as headers, or execute background tasks, like logging. Some filters may handle the requests without proxying them to service backends. Filters, depending on their implementation, may accept/require parameters, that are set specifically to the route. For further details, see the 'filters' package documentation. Each route has one of the following backends: HTTP endpoint, shunt or loopback. Backend endpoints can be any HTTP service. They are specified by their network address, including the protocol scheme, the domain name or the IP address, and optionally the port number: e.g. "https://www.example.org:4242". (The path and query are sent from the original request, or set by filters.) A shunt route means that Skipper handles the request alone and doesn't make requests to a backend service. In this case, it is the responsibility of one of the filters to generate the response. A loopback route executes the routing mechanism on current state of the request from the start, including the route lookup. This way it serves as a form of an internal redirect. Route definitions consist of the following: - request matching conditions (predicates) - filter chain (optional) - backend (either an HTTP endpoint or a shunt) The eskip package implements the in-memory and text representations of route definitions, including a parser. (Note to contributors: in order to stay compatible with 'go get', the generated part of the parser is stored in the repository. When changing the grammar, 'go generate' needs to be executed explicitly to update the parser.) For further details, see the 'eskip' package documentation Skipper's route definitions of Skipper are loaded from one or more data sources. It can receive incremental updates from those data sources at runtime. It provides three different data clients: - Innkeeper: the Innkeeper service implements a storage for large sets of Skipper routes, with an HTTP+JSON API, OAuth2 authentication and role management. See the 'innkeeper' package and https://github.com/zalando/innkeeper. - etcd: Skipper can load routes and receive updates from etcd clusters (https://github.com/coreos/etcd). See the 'etcd' package. - static file: package eskipfile implements a simple data client, which can load route definitions from a static file in eskip format. Currently, it loads the routes on startup. It doesn't support runtime updates. Skipper can use additional data sources, provided by extensions. Sources must implement the DataClient interface in the routing package. Skipper can be started with the default executable command 'skipper', or as a library built into an application. The easiest way to start Skipper as a library is to execute the 'Run' function of the current, root package. Each option accepted by the 'Run' function is wired in the default executable as well, as a command line flag. E.g. EtcdUrls becomes -etcd-urls as a comma separated list. For command line help, enter: An additional utility, eskip, can be used to verify, print, update and delete routes from/to files or etcd (Innkeeper on the roadmap). See the cmd/eskip command package, and/or enter in the command line: Skipper doesn't use dynamically loaded plugins, however, it can be used as a library, and it can be extended with custom predicates, filters and/or custom data sources. To create a custom predicate, one needs to implement the PredicateSpec interface in the routing package. Instances of the PredicateSpec are used internally by the routing package to create the actual Predicate objects as referenced in eskip routes, with concrete arguments. Example, randompredicate.go: In the above example, a custom predicate is created, that can be referenced in eskip definitions with the name 'Random': To create a custom filter we need to implement the Spec interface of the filters package. 'Spec' is the specification of a filter, and it is used to create concrete filter instances, while the raw route definitions are processed. Example, hellofilter.go: The above example creates a filter specification, and in the routes where they are included, the filter instances will set the 'X-Hello' header for each and every response. The name of the filter is 'hello', and in a route definition it is referenced as: The easiest way to create a custom Skipper variant is to implement the required filters (as in the example above) by importing the Skipper package, and starting it with the 'Run' command. Example, hello.go: A file containing the routes, routes.eskip: Start the custom router: The 'Run' function in the root Skipper package starts its own listener but it doesn't provide the best composability. The proxy package, however, provides a standard http.Handler, so it is possible to use it in a more complex solution as a building block for routing. Skipper provides detailed logging of failures, and access logs in Apache log format. Skipper also collects detailed performance metrics, and exposes them on a separate listener endpoint for pulling snapshots. For details, see the 'logging' and 'metrics' packages documentation. The router's performance depends on the environment and on the used filters. Under ideal circumstances, and without filters, the biggest time factor is the route lookup. Skipper is able to scale to thousands of routes with logarithmic performance degradation. However, this comes at the cost of increased memory consumption, due to storing the whole lookup tree in a single structure. Benchmarks for the tree lookup can be run by: In case more aggressive scale is needed, it is possible to setup Skipper in a cascade model, with multiple Skipper instances for specific route segments.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package p9p implements a compliant 9P2000 client and server library for use in modern, production Go services. This package differentiates itself in that is has departed from the plan 9 implementation primitives and better follows idiomatic Go style. The package revolves around the session type, which is an enumeration of raw 9p message calls. A few calls, such as flush and version, have been elided, defering their usage to the server implementation. Sessions can be trivially proxied through clients and servers. The best place to get started is with Serve. Serve can be provided a connection and a handler. A typical implementation will call Serve as part of a listen/accept loop. As each network connection is created, Serve can be called with a handler for the specific connection. The handler can be implemented with a Session via the Dispatch function or can generate sessions for dispatch in response to client messages. (See cmd/9ps for an example) On the client side, NewSession provides a 9p session from a connection. After a version negotiation, methods can be called on the session, in parallel, and calls will be sent over the connection. Call timeouts can be controlled via the context provided to each method call. This package has the beginning of a nice client-server framework for working with 9p. Some of the abstractions aren't entirely fleshed out, but most of this can center around the Handler. Missing from this are a number of tools for implementing 9p servers. The most glaring are directory read and walk helpers. Other, more complex additions might be a system to manage in memory filesystem trees that expose multi-user sessions. The largest difference between this package and other 9p packages is simplification of the types needed to implement a server. To avoid confusing bugs and odd behavior, the components are separated by each level of the protocol. One example is that requests and responses are separated and they no longer hold mutable state. This means that framing, transport management, encoding, and dispatching are componentized. Little work will be required to swap out encodings, transports or connection implementations. This package has been wired from top to bottom to support context-based resource management. Everything from startup to shutdown can have timeouts using contexts. Not all close methods are fully in place, but we are very close to having controlled, predictable cleanup for both servers and clients. Timeouts can be very granular or very course, depending on the context of the timeout. For example, it is very easy to set a short timeout for a stat call but a long timeout for reading data. Currently, there is not multiversion support. The hooks and functionality are in place to add multi-version support. Generally, the correct space to do this is in the codec. Types, such as Dir, simply need to be extended to support the possibility of extra fields. The real question to ask here is what is the role of the version number in the 9p protocol. It really comes down to the level of support required. Do we just need it at the protocol level, or do handlers and sessions need to be have differently based on negotiated versions? This package has a number of TODOs to make it easier to use. Most of the existing code provides a solid base to work from. Don't be discouraged by the sawdust. In addition, the testing is embarassingly lacking. With time, we can get full testing going and ensure we have confidence in the implementation.
A dynamic and extensible music library organizer Demlo is a music library organizer. It can encode, fix case, change folder hierarchy according to tags or file properties, tag from an online database, copy covers while ignoring duplicates or those below a quality threshold, and much more. It makes it possible to manage your libraries uniformly and dynamically. You can write your own rules to fit your needs best. Demlo aims at being as lightweight and portable as possible. Its major runtime dependency is the transcoder FFmpeg. The scripts are written in Lua for portability and speed while allowing virtually unlimited extensibility. Usage: For usage options, see: First Demlo creates a list of all input files. When a folder is specified, all files matching the extensions from the 'extensions' variable will be appended to the list. Identical files are appended only once. Next all files get analyzed: - The audio file details (tags, stream properties, format properties, etc.) are stored into the 'input' variable. The 'output' variable gets its default values from 'input', or from an index file if specified from command-line. If no index has been specified and if an attached cuesheet is found, all cuesheet details are appended accordingly. Cuesheet tags override stream tags, which override format tags. Finally, still without index, tags can be retrieved from Internet if the command-line option is set. - If a prescript has been specified, it gets executed. It makes it possible to adjust the input values and global variables before running the other scripts. - The scripts, if any, get executed in the lexicographic order of their basename. The 'output' variable is transformed accordingly. Scripts may contain rules such as defining a new file name, new tags, new encoding properties, etc. You can use conditions on input values to set the output properties, which makes it virtually possible to process a full music library in one single run. - If a postscript has been specified, it gets executed. It makes it possible to adjust the output of the script for the current run only. - Demlo makes some last-minute tweaking if need be: it adjusts the bitrate, the path, the encoding parameters, and so on. - A preview of changes is displayed. - When applying changes, the covers get copied if required and the audio file gets processed: tags are modified as specified, the file is re-encoded if required, and the output is written to the appropriate folder. When destination already exists, the 'exist' action is executed. The program's default behaviour can be changed from the user configuration file. (See the 'Files' section for a template.) Most command-line flags default value can be changed. The configuration file is loaded on startup, before parsing the command-line options. Review the default value of the CLI flags with 'demlo -h'. If you wish to use no configuration file, set the environment variable DEMLORC to ".". Scripts can contain any safe Lua code. Some functions like 'os.execute' are not available for security reasons. It is not possible to print to the standard output/error unless running in debug mode and using the 'debug' function. See the 'sandbox.go' file for a list of allowed functions and variables. Lua patterns are replaced by Go regexps. See https://github.com/google/re2/wiki/Syntax. Scripts have no requirements at all. However, to be useful, they should set values of the 'output' table detailed in the 'Variables' section. You can use the full power of the Lua to set the variables dynamically. For instance: 'input' and 'output' are both accessible from any script. All default functions and variables (excluding 'output') are reset on every script call to enforce consistency. Local variables are lost from one script call to another. Global variables are preserved. Use this feature to pass data like options or new functions. 'output' structure consistency is guaranteed at the start of every script. Demlo will only extract the fields with the right type as described in the 'Variables' section. Warning: Do not abuse of global variables, especially when processing non-fixed size data (e.g. tables). Data could grow big and slow down the program. By default, when the destination exists, Demlo will append a suffix to the output destination. This behaviour can be changed from the 'exist' action specified by the user. Demlo comes with a few default actions. The 'exist' action works just like scripts with the following differences: - Any change to 'output.path' will be skipped. - An additional variable is accessible from the action: 'existinfo' holds the file details of the existing files in the same fashion as 'input'. This allows for comparing the input file and the existing destination. The writing rules can be tweaked the following way: Word of caution: overwriting breaks Demlo's rule of not altering existing files. It can lead to undesired results if the overwritten file is also part of the (yet to be processed) input. The overwrite capability can be useful when syncing music libraries however. The user scripts should be generic. Therefore they may not properly handle some uncommon input values. Tweak the input with temporary overrides from command-line. The prescript and postscript defined on command-line will let you run arbitrary code that is run before and after all other scripts, respectively. Use global variables to transfer data and parameters along. If the prescript and postscript end up being too long, consider writing a demlo script. You can also define shell aliases or use wrapper scripts as convenience. The 'input' table describes the file: Bitrate is in bits per seconds (bps). That is, for 320 kbps you would specify The 'time' is the modification time of the file. It holds the sec seconds and nsec nanoseconds since January 1, 1970 UTC. The entry 'streams' and 'format' are as returned by It gives access to most metadata that FFmpeg can return. For instance, to get the duration of the track in seconds, query the variable 'input.format.duration'. Since there may be more than one stream (covers, other data), the first audio stream is assumed to be the music stream. For convenience, the index of the music stream is stored in 'audioindex'. The tags returned by FFmpeg are found in streams, format and in the cuesheet. To make tag queries easier, all tags are stored in the 'tags' table, with the following precedence: You can remove a tag by setting it to 'nil' or the empty string. This is equivalent, except that 'nil' saves some memory during the process. The 'output' table describes the transformation to apply to the file: The 'parameters' array holds the CLI parameters passed to FFmpeg. It can be anything supported by FFmpeg, although this variable is supposed to hold encoding information. See the 'Examples' section. The 'embeddedcovers', 'externalcovers' and 'onlinecover' variables are detailed in the 'Covers' section. The 'write' variable is covered in the 'Existing destination' section. The 'rmsrc' variable is a boolean: when true, Demlo removes the source file after processing. This can speed up the process when not re-encoding. This option is ignored for multi-track files. For convenience, the following shortcuts are provided: Demlo provides some non-standard Lua functions to ease scripting. Display a message on stderr if debug mode is on. Return lowercase string without non-alphanumeric characters nor leading zeros. Return the relation coefficient of the two input strings. The result is a float in 0.0...1.0, 0.0 means no relation at all, 1.0 means identical strings. A format is a container in FFmpeg's terminology. 'output.parameters' contains CLI flags passed to FFmpeg. They are meant to set the stream codec, the bitrate, etc. If 'output.parameters' is {'-c:a', 'copy'} and the format is identical, then taglib will be used instead of FFmpeg. Use this rule from a (post)script to disable encoding by setting the same format and the copy parameters. This speeds up the process. The official scripts are usually very smart at guessing the right values. They might make mistakes however. If you are unsure, you can (and you are advised to) preview the results before proceeding. The 'diff' preview is printed to stderr. A JSON preview of the changes is printed to stdout if stdout is redirected. The initial values of the 'output' table can be completed with tags fetched from the MusicBrainz database. Audio files are fingerprinted for the queries, so even with initially wrong file names and tags, the right values should still be retrieved. The front album cover can also be retrieved. Proxy parameters will be fetched automatically from the 'http_proxy' and 'https_proxy' environment variables. As this process requires network access it can be quite slow. Nevertheless, Demlo is specifically optimized for albums, so that network queries are used for only one track per album, when possible. Some tracks can be released on different albums: Demlo tries to guess it from the tags, but if the tags are wrong there is no way to know which one it is. There is a case where the selection can be controlled: let's assume we have tracks A, B and C from the same album Z. A and B were also released in album Y, whereas C was release in Z only. Tags for A will be checked online; let's assume it gets tagged to album Y. B will use A details, so album Y too. Then C does not match neither A's nor B's album, so another online query will be made and it will be tagged to album Z. This is slow and does not yield the expected result. Now let's call Tags for C will be queried online, and C will be tagged to Z. Then both A and B will match album Z so they will be tagged using C details, which is the desired result. Conclusion: when using online tagging, the first argument should be the lesser known track of the album. Demlo can set the output variables according to the values set in a text file before calling the script. The input values are ignored as well as online tagging, but it is still possible to access the input table from scripts. This 'index' file is formatted in JSON. It corresponds to what Demlo outputs when printing the JSON preview. This is valid JSON except for the missing beginning and the missing end. It makes it possible to concatenate and to append to existing index files. Demlo will automatically complete the missing parts so that it becomes valid JSON. The index file is useful when you want to edit tags manually: You can redirect the output to a file, edit the content manually with your favorite text editor, then run Demlo again with the index as argument. See the 'Examples' section. This feature can also be used to interface Demlo with other programs. Demlo can manage embedded covers as well as external covers. External covers are queried from files matching known extensions in the file's folder. Embedded covers are queried from static video streams in the file. Covers are accessed from The embedded covers are indexed numerically by order of appearance in the streams. The first cover will be at index 1 and so on. This is not necessarily the index of the stream. 'inputcover' is the following structure: 'format' is the picture format. FFmpeg makes a distinction between format and codec, but it is not useful for covers. The name of the format is specified by Demlo, not by FFmpeg. Hence the 'jpeg' name, instead of 'mjpeg' as FFmpeg puts it. 'width' and 'height' hold the size in pixels. 'checksum' can be used to identify files uniquely. For performance reasons, only a partial checksum is performed. This variable is typically used for skipping duplicates. Cover transformations are specified in 'outputcover' has the following structure: The format is specified by FFmpeg this time. See the comments on 'format' for 'inputcover'. 'parameters' is used in the same fashion as 'output.parameters'. User configuration: This must be a Lua file. See the 'demlorc' file provided with this package for an exhaustive list of options. Folder containing the official scripts: User script folder: Create this folder and add your own scripts inside. This folder takes precedence over the system folder, so scripts with the same name will be found in the user folder first. The following examples will not proceed unless the '-p' command-line option is true. Important: you _must_ use single quotes for the runtime Lua command to prevent expansion. Inside the Lua code, use double quotes for strings and escape single quotes. Show default options: Preview changes made by the default scripts: Use 'alternate' script if found in user or system script folder (user folder first): Add the Lua file to the list of scripts. This feature is convenient if you want to write scripts that are too complex to fit on the command-line, but not generic enough to fit the user or system script folders. Remove all script from the list, then add '30-case' and '60-path' scripts. Note that '30-case' will be run before '60-path'. Do not use any script but '60-path'. The file content is unchanged and the file is renamed to a dynamically computed destination. Demlo performs an instant rename if destination is on the same device. Otherwise it copies the file and removes the source. Use the default scripts (if set in configuration file), but do not re-encode: Set 'artist' to the value of 'composer', and 'title' to be preceded by the new value of 'artist', then apply the default script. Do not re-encode. Order in runtime script matters. Mind the double quotes. Set track number to first number in input file name: Use the default scripts but keep original value for the 'artist' tag: 1) Preview default scripts transformation and save it to an index. 2) Edit file to fix any potential mistake. 3) Run Demlo over the same files using the index information only. Same as above but generate output filename according to the custom '61-rename' script. The numeric prefix is important: it ensures that '61-rename' will be run after all the default tag related scripts and after '60-path'. Otherwise, if a change in tags would occur later on, it would not affect the renaming script. Retrieve tags from Internet: Same as above but for a whole album, and saving the result to an index: Only download the cover for the album corresponding to the track. Use 'rmsrc' to avoid duplicating the audio file. Change tags inplace with entries from MusicBrainz: Set tags to titlecase while casing AC-DC correctly: To easily switch between formats from command-line, create one script per format (see 50-encoding.lua), e.g. ogg.lua and flac.lua. Then Add support for non-default formats from CLI: Overwrite existing destination if input is newer: ffmpeg(1), ffprobe(1), http://www.lua.org/pil/contents.html
SQL Schema migration tool for Go. Key features: To install the library and command line program, use the following: The main command is called sql-migrate. Each command requires a configuration file (which defaults to dbconfig.yml, but can be specified with the -config flag). This config file should specify one or more environments: The `table` setting is optional and will default to `gorp_migrations`. The environment that will be used can be specified with the -env flag (defaults to development). Use the --help flag in combination with any of the commands to get an overview of its usage: The up command applies all available migrations. By contrast, down will only apply one migration by default. This behavior can be changed for both by using the -limit parameter. The redo command will unapply the last migration and reapply it. This is useful during development, when you're writing migrations. Use the status command to see the state of the applied migrations: Import sql-migrate into your application: Set up a source of migrations, this can be from memory, from a set of files or from bindata (more on that later): Then use the Exec function to upgrade your database: Note that n can be greater than 0 even if there is an error: any migration that succeeded will remain applied even if a later one fails. The full set of capabilities can be found in the API docs below. Migrations are defined in SQL files, which contain a set of SQL statements. Special comments are used to distinguish up and down migrations. You can put multiple statements in each block, as long as you end them with a semicolon (;). If you have complex statements which contain semicolons, use StatementBegin and StatementEnd to indicate boundaries: The order in which migrations are applied is defined through the filename: sql-migrate will sort migrations based on their name. It's recommended to use an increasing version number or a timestamp as the first part of the filename. Normally each migration is run within a transaction in order to guarantee that it is fully atomic. However some SQL commands (for example creating an index concurrently in PostgreSQL) cannot be executed inside a transaction. In order to execute such a command in a migration, the migration can be run using the notransaction option: If you like your Go applications self-contained (that is: a single binary): use bindata (https://github.com/jteeuwen/go-bindata) to embed the migration files. Just write your migration files as usual, as a set of SQL files in a folder. Then use bindata to generate a .go file with the migrations embedded: The resulting bindata.go file will contain your migrations. Remember to regenerate your bindata.go file whenever you add/modify a migration (go generate will help here, once it arrives). Use the AssetMigrationSource in your application to find the migrations: Both Asset and AssetDir are functions provided by bindata. Then proceed as usual. Adding a new migration source means implementing MigrationSource. The resulting slice of migrations will be executed in the given order, so it should usually be sorted by the Id field.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package XGB provides the X Go Binding, which is a low-level API to communicate with the core X protocol and many of the X extensions. It is *very* closely modeled on XCB, so that experience with XCB (or xpyb) is easily translatable to XGB. That is, it uses the same cookie/reply model and is thread safe. There are otherwise no major differences (in the API). Most uses of XGB typically fall under the realm of window manager and GUI kit development, but other applications (like pagers, panels, tilers, etc.) may also require XGB. Moreover, it is a near certainty that if you need to work with X, xgbutil will be of great use to you as well: https://github.com/JonathanLogan/xgbutil This is an extremely terse example that demonstrates how to connect to X, create a window, listen to StructureNotify events and Key{Press,Release} events, map the window, and print out all events received. An example with accompanying documentation can be found in examples/create-window. This is another small example that shows how to query Xinerama for geometry information of each active head. Accompanying documentation for this example can be found in examples/xinerama. XGB can benefit greatly from parallelism due to its concurrent design. For evidence of this claim, please see the benchmarks in xproto/xproto_test.go. xproto/xproto_test.go contains a number of contrived tests that stress particular corners of XGB that I presume could be problem areas. Namely: requests with no replies, requests with replies, checked errors, unchecked errors, sequence number wrapping, cookie buffer flushing (i.e., forcing a round trip every N requests made that don't have a reply), getting/setting properties and creating a window and listening to StructureNotify events. Both XCB and xpyb use the same Python module (xcbgen) for a code generator. XGB (before this fork) used the same code generator as well, but in my attempt to add support for more extensions, I found the code generator extremely difficult to work with. Therefore, I re-wrote the code generator in Go. It can be found in its own sub-package, xgbgen, of xgb. My design of xgbgen includes a rough consideration that it could be used for other languages. I am reasonably confident that the core X protocol is in full working form. I've also tested the Xinerama and RandR extensions sparingly. Many of the other existing extensions have Go source generated (and are compilable) and are included in this package, but I am currently unsure of their status. They *should* work. XKB is the only extension that intentionally does not work, although I suspect that GLX also does not work (however, there is Go source code for GLX that compiles, unlike XKB). I don't currently have any intention of getting XKB working, due to its complexity and my current mental incapacity to test it.
Package chromedp is a high level Chrome DevTools Protocol client that simplifies driving browsers for scraping, unit testing, or profiling web pages using the CDP. chromedp requires no third-party dependencies, implementing the async Chrome DevTools Protocol entirely in Go. This package includes a number of simple examples. Additionally, https://github.com/chromedp/examples contains more complex examples.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package chromedp is a high level Chrome DevTools Protocol client that simplifies driving browsers for scraping, unit testing, or profiling web pages using the CDP. chromedp requires no third-party dependencies, implementing the async Chrome DevTools Protocol entirely in Go. This package includes a number of simple examples. Additionally, https://github.com/chromedp/examples contains more complex examples.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package p9p implements a compliant 9P2000 client and server library for use in modern, production Go services. This package differentiates itself in that is has departed from the plan 9 implementation primitives and better follows idiomatic Go style. The package revolves around the session type, which is an enumeration of raw 9p message calls. A few calls, such as flush and version, have been elided, deferring their usage to the server implementation. Sessions can be trivially proxied through clients and servers. The best place to get started is with Serve. Serve can be provided a connection and a handler. A typical implementation will call Serve as part of a listen/accept loop. As each network connection is created, Serve can be called with a handler for the specific connection. The handler can be implemented with a Session via the Dispatch function or can generate sessions for dispatch in response to client messages. (See cmd/9ps for an example) On the client side, NewSession provides a 9p session from a connection. After a version negotiation, methods can be called on the session, in parallel, and calls will be sent over the connection. Call timeouts can be controlled via the context provided to each method call. This package has the beginning of a nice client-server framework for working with 9p. Some of the abstractions aren't entirely fleshed out, but most of this can center around the Handler. Missing from this are a number of tools for implementing 9p servers. The most glaring are directory read and walk helpers. Other, more complex additions might be a system to manage in memory filesystem trees that expose multi-user sessions. The largest difference between this package and other 9p packages is simplification of the types needed to implement a server. To avoid confusing bugs and odd behavior, the components are separated by each level of the protocol. One example is that requests and responses are separated and they no longer hold mutable state. This means that framing, transport management, encoding, and dispatching are componentized. Little work will be required to swap out encodings, transports or connection implementations. This package has been wired from top to bottom to support context-based resource management. Everything from startup to shutdown can have timeouts using contexts. Not all close methods are fully in place, but we are very close to having controlled, predictable cleanup for both servers and clients. Timeouts can be very granular or very course, depending on the context of the timeout. For example, it is very easy to set a short timeout for a stat call but a long timeout for reading data. Currently, there is not multiversion support. The hooks and functionality are in place to add multi-version support. Generally, the correct space to do this is in the codec. Types, such as Dir, simply need to be extended to support the possibility of extra fields. The real question to ask here is what is the role of the version number in the 9p protocol. It really comes down to the level of support required. Do we just need it at the protocol level, or do handlers and sessions need to be have differently based on negotiated versions? This package has a number of TODOs to make it easier to use. Most of the existing code provides a solid base to work from. Don't be discouraged by the sawdust. In addition, the testing is embarassingly lacking. With time, we can get full testing going and ensure we have confidence in the implementation.
Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include: Breiman and Cutler's Random Forest for Classification and Regression Adaptive Boosting (AdaBoost) Classification Gradiant Boosting Tree Regression Entropy and Cost driven classification L1 regression Feature selection with artificial contrasts Proximity and model structure analysis Roughly balanced bagging for unbalanced classification The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and can not yet be released. Library Documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/ryanbressler/CloudForest Documentation of command line utilities and file formats can be found in README.md, which can be viewed fromated on github: http://github.com/ryanbressler/CloudForest Pull requests and bug reports are welcome. CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi threaded and cluster environments in mind. Go's support for function types is used to provide a interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method: This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects. Decision tree's are grown with the goal of reducing "Impurity" which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface which allows for alternative definitions of impurity. CloudForest includes several alternative targets: Additional targets can be stacked on top of these target to add boosting functionality: Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In go, slices pointing at the same underlying arrays make this sort of optimization transparent. For example a function like: can return left and right slices that point to the same underlying array as the original slice of cases but these slices should not have their values changed. Functions used while searching for the best split also accepts pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items and its use can be seen in functions like: For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit which relies on go's sorting package. Training a Random forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as the are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, distributed database etc. By default cloud forest uses a fast heuristic for missing values. When proposing a split on a feature with missing data the missing cases are removed and the impurity value is corrected to use three way impurity which reduces the bias towards features with lots of missing data: Missing values in the target variable are left out of impurity calculations. This provided generally good results at a fraction of the computational costs of imputing data. Optionally, feature.ImputeMissing or featurematrixImputeMissing can be called before forest growth to impute missing values to the feature mean/mode which Brieman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity weighted imputation Brieman describes. Experimental support is provided for 3 way splitting which splits missing cases onto a third branch. [2] This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3] [1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1 [2] https://code.google.com/p/rf-ace/ [3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record In CloudForest data is stored using the FeatureMatrix struct which contains Features. The Feature struct implements storage and methods for both categorical and numerical data and calculations of impurity etc and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable. This allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow implements Brieman and Cutler's method (see extract above) for growing a tree. A GrowForest method is also provided that implements the rest of the method including sampling cases but it may be faster to grow the forest to disk as in the growforest utility. Prediction and Voting is done using Tree.Vote and CatBallotBox and NumBallotBox which implement the VoteTallyer interface.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package validator implements value validations for structs and individual fields based on tags. It can also handle Cross-Field and Cross-Struct validation for nested structs and has the ability to dive into arrays and maps of any type. see more examples https://github.com/go-playground/validator/tree/v9/_examples Doing things this way is actually the way the standard library does, see the file.Open method here: The authors return type "error" to avoid the issue discussed in the following, where err is always != nil: Validator only InvalidValidationError for bad validation input, nil or ValidationErrors as type error; so, in your code all you need to do is check if the error returned is not nil, and if it's not check if error is InvalidValidationError ( if necessary, most of the time it isn't ) type cast it to type ValidationErrors like so err.(validator.ValidationErrors). Custom Validation functions can be added. Example: Cross-Field Validation can be done via the following tags: If, however, some custom cross-field validation is required, it can be done using a custom validation. Why not just have cross-fields validation tags (i.e. only eqcsfield and not eqfield)? The reason is efficiency. If you want to check a field within the same struct "eqfield" only has to find the field on the same struct (1 level). But, if we used "eqcsfield" it could be multiple levels down. Example: Multiple validators on a field will process in the order defined. Example: Bad Validator definitions are not handled by the library. Example: Baked In Cross-Field validation only compares fields on the same struct. If Cross-Field + Cross-Struct validation is needed you should implement your own custom validator. Comma (",") is the default separator of validation tags. If you wish to have a comma included within the parameter (i.e. excludesall=,) you will need to use the UTF-8 hex representation 0x2C, which is replaced in the code as a comma, so the above will become excludesall=0x2C. Pipe ("|") is the 'or' validation tags deparator. If you wish to have a pipe included within the parameter i.e. excludesall=| you will need to use the UTF-8 hex representation 0x7C, which is replaced in the code as a pipe, so the above will become excludesall=0x7C Here is a list of the current built in validators: Tells the validation to skip this struct field; this is particularly handy in ignoring embedded structs from being validated. (Usage: -) This is the 'or' operator allowing multiple validators to be used and accepted. (Usage: rbg|rgba) <-- this would allow either rgb or rgba colors to be accepted. This can also be combined with 'and' for example ( Usage: omitempty,rgb|rgba) When a field that is a nested struct is encountered, and contains this flag any validation on the nested struct will be run, but none of the nested struct fields will be validated. This is useful if inside of your program you know the struct will be valid, but need to verify it has been assigned. NOTE: only "required" and "omitempty" can be used on a struct itself. Same as structonly tag except that any struct level validations will not run. Allows conditional validation, for example if a field is not set with a value (Determined by the "required" validator) then other validation such as min or max won't run, but if a value is set validation will run. This tells the validator to dive into a slice, array or map and validate that level of the slice, array or map with the validation tags that follow. Multidimensional nesting is also supported, each level you wish to dive will require another dive tag. dive has some sub-tags, 'keys' & 'endkeys', please see the Keys & EndKeys section just below. Example #1 Example #2 Keys & EndKeys These are to be used together directly after the dive tag and tells the validator that anything between 'keys' and 'endkeys' applies to the keys of a map and not the values; think of it like the 'dive' tag, but for map keys instead of values. Multidimensional nesting is also supported, each level you wish to validate will require another 'keys' and 'endkeys' tag. These tags are only valid for maps. Example #1 Example #2 This validates that the value is not the data types default zero value. For numbers ensures value is not zero. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. The field under validation must be present and not empty only if any of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if all of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must be present and not empty only when any of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only when all of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: This validates that the value is the default value and is almost the opposite of required. For numbers, length will ensure that the value is equal to the parameter given. For strings, it checks that the string length is exactly that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, max will ensure that the value is less than or equal to the parameter given. For strings, it checks that the string length is at most that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, min will ensure that the value is greater or equal to the parameter given. For strings, it checks that the string length is at least that number of characters. For slices, arrays, and maps, validates the number of items. For strings & numbers, eq will ensure that the value is equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings & numbers, ne will ensure that the value is not equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings, ints, and uints, oneof will ensure that the value is one of the values in the parameter. The parameter should be a list of values separated by whitespace. Values may be strings or numbers. For numbers, this will ensure that the value is greater than the parameter given. For strings, it checks that the string length is greater than that number of characters. For slices, arrays and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than time.Now.UTC(). Same as 'min' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than or equal to time.Now.UTC(). For numbers, this will ensure that the value is less than the parameter given. For strings, it checks that the string length is less than that number of characters. For slices, arrays, and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than time.Now.UTC(). Same as 'max' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than or equal to time.Now.UTC(). This will validate the field value against another fields value either within a struct or passed in field. Example #1: Example #2: Field Equals Another Field (relative) This does the same as eqfield except that it validates the field provided relative to the top level struct. This will validate the field value against another fields value either within a struct or passed in field. Examples: Field Does Not Equal Another Field (relative) This does the same as nefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltefield except that it validates the field provided relative to the top level struct. This does the same as contains except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. This does the same as excludes except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. For arrays & slices, unique will ensure that there are no duplicates. For maps, unique will ensure that there are no duplicate values. For slices of struct, unique will ensure that there are no duplicate values in a field of the struct specified via a parameter. This validates that a string value contains ASCII alpha characters only This validates that a string value contains ASCII alphanumeric characters only This validates that a string value contains unicode alpha characters only This validates that a string value contains unicode alphanumeric characters only This validates that a string value contains a basic numeric value. basic excludes exponents etc... for integers or float it returns true. This validates that a string value contains a valid hexadecimal. This validates that a string value contains a valid hex color including hashtag (#) This validates that a string value contains a valid rgb color This validates that a string value contains a valid rgba color This validates that a string value contains a valid hsl color This validates that a string value contains a valid hsla color This validates that a string value contains a valid email This may not conform to all possibilities of any rfc standard, but neither does any email provider accept all possibilities. This validates that a string value contains a valid file path and that the file exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid url This will accept any url the golang request uri accepts but must contain a schema for example http:// or rtmp:// This validates that a string value contains a valid uri This will accept any uri the golang request uri accepts This validataes that a string value contains a valid URN according to the RFC 2141 spec. This validates that a string value contains a valid base64 value. Although an empty string is valid base64 this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value according the the RFC4648 spec. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid bitcoin address. The format of the string is checked to ensure it matches one of the three formats P2PKH, P2SH and performs checksum validation. Bitcoin Bech32 Address (segwit) This validates that a string value contains a valid bitcoin Bech32 address as defined by bip-0173 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki) Special thanks to Pieter Wuille for providng reference implementations. This validates that a string value contains a valid ethereum address. The format of the string is checked to ensure it matches the standard Ethereum address format Full validation is blocked by https://github.com/golang/crypto/pull/28 This validates that a string value contains the substring value. This validates that a string value contains any Unicode code points in the substring value. This validates that a string value contains the supplied rune value. This validates that a string value does not contain the substring value. This validates that a string value does not contain any Unicode code points in the substring value. This validates that a string value does not contain the supplied rune value. This validates that a string value starts with the supplied string value This validates that a string value ends with the supplied string value This validates that a string value contains a valid isbn10 or isbn13 value. This validates that a string value contains a valid isbn10 value. This validates that a string value contains a valid isbn13 value. This validates that a string value contains a valid UUID. Uppercase UUID values will not pass - use `uuid_rfc4122` instead. This validates that a string value contains a valid version 3 UUID. Uppercase UUID values will not pass - use `uuid3_rfc4122` instead. This validates that a string value contains a valid version 4 UUID. Uppercase UUID values will not pass - use `uuid4_rfc4122` instead. This validates that a string value contains a valid version 5 UUID. Uppercase UUID values will not pass - use `uuid5_rfc4122` instead. This validates that a string value contains only ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains only printable ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains one or more multibyte characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains a valid DataURI. NOTE: this will also validate that the data portion is valid base64 This validates that a string value contains a valid latitude. This validates that a string value contains a valid longitude. This validates that a string value contains a valid U.S. Social Security Number. This validates that a string value contains a valid IP Address. This validates that a string value contains a valid v4 IP Address. This validates that a string value contains a valid v6 IP Address. This validates that a string value contains a valid CIDR Address. This validates that a string value contains a valid v4 CIDR Address. This validates that a string value contains a valid v6 CIDR Address. This validates that a string value contains a valid resolvable TCP Address. This validates that a string value contains a valid resolvable v4 TCP Address. This validates that a string value contains a valid resolvable v6 TCP Address. This validates that a string value contains a valid resolvable UDP Address. This validates that a string value contains a valid resolvable v4 UDP Address. This validates that a string value contains a valid resolvable v6 UDP Address. This validates that a string value contains a valid resolvable IP Address. This validates that a string value contains a valid resolvable v4 IP Address. This validates that a string value contains a valid resolvable v6 IP Address. This validates that a string value contains a valid Unix Address. This validates that a string value contains a valid MAC Address. Note: See Go's ParseMAC for accepted formats and types: This validates that a string value is a valid Hostname according to RFC 952 https://tools.ietf.org/html/rfc952 This validates that a string value is a valid Hostname according to RFC 1123 https://tools.ietf.org/html/rfc1123 Full Qualified Domain Name (FQDN) This validates that a string value contains a valid FQDN. This validates that a string value appears to be an HTML element tag including those described at https://developer.mozilla.org/en-US/docs/Web/HTML/Element This validates that a string value is a proper character reference in decimal or hexadecimal format This validates that a string value is percent-encoded (URL encoded) according to https://tools.ietf.org/html/rfc3986#section-2.1 This validates that a string value contains a valid directory and that it exists on the machine. This is done using os.Stat, which is a platform independent function. NOTE: When returning an error, the tag returned in "FieldError" will be the alias tag unless the dive tag is part of the alias. Everything after the dive tag is not reported as the alias tag. Also, the "ActualTag" in the before case will be the actual tag within the alias that failed. Here is a list of the current built in alias tags: Validator notes: A collection of validation rules that are frequently needed but are more complex than the ones found in the baked in validators. A non standard validator must be registered manually like you would with your own custom validation functions. Example of registration and use: Here is a list of the current non standard validators: This package panics when bad input is provided, this is by design, bad code like that should not make it to production.
Package fetchbot provides a simple and flexible web crawler that follows the robots.txt policies and crawl delays. It is very much a rewrite of gocrawl (https://github.com/PuerkitoBio/gocrawl) with a simpler API, less features built-in, but at the same time more flexibility. As for Go itself, sometimes less is more! To install, simply run in a terminal: The package has a single external dependency, robotstxt (https://github.com/temoto/robotstxt-go). It also integrates code from the iq package (https://github.com/kylelemons/iq). The API documentation is available on godoc.org (http://godoc.org/github.com/PuerkitoBio/fetchbot). The following example (taken from /example/short/main.go) shows how to create and start a Fetcher, one way to send commands, and how to stop the fetcher once all commands have been handled. A more complex and complete example can be found in the repository, at /example/full/. Basically, a Fetcher is an instance of a web crawler, independent of other Fetchers. It receives Commands via the Queue, executes the requests, and calls a Handler to process the responses. A Command is an interface that tells the Fetcher which URL to fetch, and which HTTP method to use (i.e. "GET", "HEAD", ...). A call to Fetcher.Start() returns the Queue associated with this Fetcher. This is the thread-safe object that can be used to send commands, or to stop the crawler. Both the Command and the Handler are interfaces, and may be implemented in various ways. They are defined like so: A Context is a struct that holds the Command and the Queue, so that the Handler always knows which Command initiated this call, and has a handle to the Queue. A Handler is similar to the net/http Handler, and middleware-style combinations can be built on top of it. A HandlerFunc type is provided so that simple functions with the right signature can be used as Handlers (like net/http.HandlerFunc), and there is also a multiplexer Mux that can be used to dispatch calls to different Handlers based on some criteria. The Fetcher recognizes a number of interfaces that the Command may implement, for more advanced needs. If the Command implements the BasicAuthProvider interface, a Basic Authentication header will be put in place with the given credentials to fetch the URL. Similarly, the CookiesProvider and HeaderProvider interfaces offer the expected features (setting cookies and header values on the request). The ReaderProvider and ValuesProvider interfaces are also supported, although they should be mutually exclusive as they both set the body of the request. If both are supported, the ReaderProvider interface is used. It sets the body of the request (e.g. for a "POST") using the given io.Reader instance. The ValuesProvider does the same, but using the given url.Values instance, and sets the Content-Type of the body to "application/x-www-form-urlencoded" (unless it is explicitly set by a HeaderProvider). Since the Command is an interface, it can be a custom struct that holds additional information, such as an ID for the URL (e.g. from a database), or a depth counter so that the crawling stops at a certain depth, etc. For basic commands that don't require additional information, the package provides the Cmd struct that implements the Command interface. This is the Command implementation used when using the various Queue.SendString* methods. The Fetcher has a number of fields that provide further customization: - HttpClient : By default, the Fetcher uses the net/http default Client to make requests. A different client can be set on the Fetcher.HttpClient field. - CrawlDelay : That value is used only if there is no delay specified by the robots.txt of a given host. - UserAgent : Sets the user agent string to use for the requests and to validate against the robots.txt entries. - WorkerIdleTTL : Sets the duration that a worker goroutine can wait without receiving new commands to fetch. If the idle time-to-live is reached, the worker goroutine is stopped and its resources are released. This can be especially useful for long-running crawlers. - DisablePoliteness : If true, disables fetching of robots.txt, effectively forcing the use of the CrawlDelay value between calls to a host. What fetchbot doesn't do - especially compared to gocrawl - is that it doesn't keep track of already visited URLs, and it doesn't normalize the URLs. This is outside the scope of this package - all commands sent on the Queue will be fetched. Normalization can easily be done (e.g. using https://github.com/PuerkitoBio/purell) before sending the Command to the Fetcher. How to keep track of visited URLs depends on the use-case of the specific crawler, but for an example, see /example/full/main.go. The BSD 3-Clause license (http://opensource.org/licenses/BSD-3-Clause), the same as the Go language. The iq_slice.go file is under the CDDL-1.0 license (details in the source file).
Package skipper provides an HTTP routing library with flexible configuration as well as a runtime update of the routing rules. Skipper works as an HTTP reverse proxy that is responsible for mapping incoming requests to multiple HTTP backend services, based on routes that are selected by the request attributes. At the same time, both the requests and the responses can be augmented by a filter chain that is specifically defined for each route. Optionally, it can provide circuit breaker mechanism individually for each backend host. Skipper can load and update the route definitions from multiple data sources without being restarted. It provides a default executable command with a few built-in filters, however, its primary use case is to be extended with custom filters, predicates or data sources. For further information read 'Extending Skipper'. Skipper took the core design and inspiration from Vulcand: https://github.com/mailgun/vulcand. Skipper is 'go get' compatible. If needed, create a 'go workspace' first: Get the Skipper packages: Create a file with a route: Optionally, verify the syntax of the file: Start Skipper and make an HTTP request: The core of Skipper's request processing is implemented by a reverse proxy in the 'proxy' package. The proxy receives the incoming request, forwards it to the routing engine in order to receive the most specific matching route. When a route matches, the request is forwarded to all filters defined by it. The filters can modify the request or execute any kind of program logic. Once the request has been processed by all the filters, it is forwarded to the backend endpoint of the route. The response from the backend goes once again through all the filters in reverse order. Finally, it is mapped as the response of the original incoming request. Besides the default proxying mechanism, it is possible to define routes without a real network backend endpoint. One of these cases is called a 'shunt' backend, in which case one of the filters needs to handle the request providing its own response (e.g. the 'static' filter). Actually, filters themselves can instruct the request flow to shunt by calling the Serve(*http.Response) method of the filter context. Another case of a route without a network backend is the 'loopback'. A loopback route can be used to match a request, modified by filters, against the lookup tree with different conditions and then execute a different route. One example scenario can be to use a single route as an entry point to execute some calculation to get an A/B testing decision and then matching the updated request metadata for the actual destination route. This way the calculation can be executed for only those requests that don't contain information about a previously calculated decision. For further details, see the 'proxy' and 'filters' package documentation. Finding a request's route happens by matching the request attributes to the conditions in the route's definitions. Such definitions may have the following conditions: - method - path (optionally with wildcards) - path regular expressions - host regular expressions - headers - header regular expressions It is also possible to create custom predicates with any other matching criteria. The relation between the conditions in a route definition is 'and', meaning, that a request must fulfill each condition to match a route. For further details, see the 'routing' package documentation. Filters are applied in order of definition to the request and in reverse order to the response. They are used to modify request and response attributes, such as headers, or execute background tasks, like logging. Some filters may handle the requests without proxying them to service backends. Filters, depending on their implementation, may accept/require parameters, that are set specifically to the route. For further details, see the 'filters' package documentation. Each route has one of the following backends: HTTP endpoint, shunt, loopback or dynamic. Backend endpoints can be any HTTP service. They are specified by their network address, including the protocol scheme, the domain name or the IP address, and optionally the port number: e.g. "https://www.example.org:4242". (The path and query are sent from the original request, or set by filters.) A shunt route means that Skipper handles the request alone and doesn't make requests to a backend service. In this case, it is the responsibility of one of the filters to generate the response. A loopback route executes the routing mechanism on current state of the request from the start, including the route lookup. This way it serves as a form of an internal redirect. A dynamic route means that the final target will be defined in a filter. One of the filters in the chain must set the target backend url explicitly. Route definitions consist of the following: - request matching conditions (predicates) - filter chain (optional) - backend The eskip package implements the in-memory and text representations of route definitions, including a parser. (Note to contributors: in order to stay compatible with 'go get', the generated part of the parser is stored in the repository. When changing the grammar, 'go generate' needs to be executed explicitly to update the parser.) For further details, see the 'eskip' package documentation Skipper has filter implementations of basic auth and OAuth2. It can be integrated with tokeninfo based OAuth2 providers. For details, see: https://godoc.org/github.com/zalando/skipper/filters/auth. Skipper's route definitions of Skipper are loaded from one or more data sources. It can receive incremental updates from those data sources at runtime. It provides three different data clients: - Kubernetes: Skipper can be used as part of a Kubernetes Ingress Controller implementation together with https://github.com/zalando-incubator/kube-ingress-aws-controller . In this scenario, Skipper uses the Kubernetes API's Ingress extensions as a source for routing. For a complete deployment example, see more details in: https://github.com/zalando-incubator/kubernetes-on-aws/ . - Innkeeper: the Innkeeper service implements a storage for large sets of Skipper routes, with an HTTP+JSON API, OAuth2 authentication and role management. See the 'innkeeper' package and https://github.com/zalando/innkeeper. - etcd: Skipper can load routes and receive updates from etcd clusters (https://github.com/coreos/etcd). See the 'etcd' package. - static file: package eskipfile implements a simple data client, which can load route definitions from a static file in eskip format. Currently, it loads the routes on startup. It doesn't support runtime updates. Skipper can use additional data sources, provided by extensions. Sources must implement the DataClient interface in the routing package. Skipper provides circuit breakers, configured either globally, based on backend hosts or based on individual routes. It supports two types of circuit breaker behavior: open on N consecutive failures, or open on N failures out of M requests. For details, see: https://godoc.org/github.com/zalando/skipper/circuit. Skipper can be started with the default executable command 'skipper', or as a library built into an application. The easiest way to start Skipper as a library is to execute the 'Run' function of the current, root package. Each option accepted by the 'Run' function is wired in the default executable as well, as a command line flag. E.g. EtcdUrls becomes -etcd-urls as a comma separated list. For command line help, enter: An additional utility, eskip, can be used to verify, print, update and delete routes from/to files or etcd (Innkeeper on the roadmap). See the cmd/eskip command package, and/or enter in the command line: Skipper doesn't use dynamically loaded plugins, however, it can be used as a library, and it can be extended with custom predicates, filters and/or custom data sources. To create a custom predicate, one needs to implement the PredicateSpec interface in the routing package. Instances of the PredicateSpec are used internally by the routing package to create the actual Predicate objects as referenced in eskip routes, with concrete arguments. Example, randompredicate.go: In the above example, a custom predicate is created, that can be referenced in eskip definitions with the name 'Random': To create a custom filter we need to implement the Spec interface of the filters package. 'Spec' is the specification of a filter, and it is used to create concrete filter instances, while the raw route definitions are processed. Example, hellofilter.go: The above example creates a filter specification, and in the routes where they are included, the filter instances will set the 'X-Hello' header for each and every response. The name of the filter is 'hello', and in a route definition it is referenced as: The easiest way to create a custom Skipper variant is to implement the required filters (as in the example above) by importing the Skipper package, and starting it with the 'Run' command. Example, hello.go: A file containing the routes, routes.eskip: Start the custom router: The 'Run' function in the root Skipper package starts its own listener but it doesn't provide the best composability. The proxy package, however, provides a standard http.Handler, so it is possible to use it in a more complex solution as a building block for routing. Skipper provides detailed logging of failures, and access logs in Apache log format. Skipper also collects detailed performance metrics, and exposes them on a separate listener endpoint for pulling snapshots. For details, see the 'logging' and 'metrics' packages documentation. The router's performance depends on the environment and on the used filters. Under ideal circumstances, and without filters, the biggest time factor is the route lookup. Skipper is able to scale to thousands of routes with logarithmic performance degradation. However, this comes at the cost of increased memory consumption, due to storing the whole lookup tree in a single structure. Benchmarks for the tree lookup can be run by: In case more aggressive scale is needed, it is possible to setup Skipper in a cascade model, with multiple Skipper instances for specific route segments.
Package sdk is the official AWS SDK for the Go programming language. The AWS SDK for Go provides APIs and utilities that developers can use to build Go applications that use AWS services, such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3). The SDK removes the complexity of coding directly against a web service interface. It hides a lot of the lower-level plumbing, such as authentication, request retries, and error handling. The SDK also includes helpful utilities on top of the AWS APIs that add additional capabilities and functionality. For example, the Amazon S3 Download and Upload Manager will automatically split up large objects into multiple parts and transfer them concurrently. See the s3manager package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/s3/s3manager/ Checkout the Getting Started Guide and API Reference Docs detailed the SDK's components and details on each AWS client the SDK supports. The Getting Started Guide provides examples and detailed description of how to get setup with the SDK. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/welcome.html The API Reference Docs include a detailed breakdown of the SDK's components such as utilities and AWS clients. Use this as a reference of the Go types included with the SDK, such as AWS clients, API operations, and API parameters. https://docs.aws.amazon.com/sdk-for-go/api/ The SDK is composed of two main components, SDK core, and service clients. The SDK core packages are all available under the aws package at the root of the SDK. Each client for a supported AWS service is available within its own package under the service folder at the root of the SDK. aws - SDK core, provides common shared types such as Config, Logger, and utilities to make working with API parameters easier. awserr - Provides the error interface that the SDK will use for all errors that occur in the SDK's processing. This includes service API response errors as well. The Error type is made up of a code and message. Cast the SDK's returned error type to awserr.Error and call the Code method to compare returned error to specific error codes. See the package's documentation for additional values that can be extracted such as RequestId. credentials - Provides the types and built in credentials providers the SDK will use to retrieve AWS credentials to make API requests with. Nested under this folder are also additional credentials providers such as stscreds for assuming IAM roles, and ec2rolecreds for EC2 Instance roles. endpoints - Provides the AWS Regions and Endpoints metadata for the SDK. Use this to lookup AWS service endpoint information such as which services are in a region, and what regions a service is in. Constants are also provided for all region identifiers, e.g UsWest2RegionID for "us-west-2". session - Provides initial default configuration, and load configuration from external sources such as environment and shared credentials file. request - Provides the API request sending, and retry logic for the SDK. This package also includes utilities for defining your own request retryer, and configuring how the SDK processes the request. service - Clients for AWS services. All services supported by the SDK are available under this folder. The SDK includes the Go types and utilities you can use to make requests to AWS service APIs. Within the service folder at the root of the SDK you'll find a package for each AWS service the SDK supports. All service clients follows a common pattern of creation and usage. When creating a client for an AWS service you'll first need to have a Session value constructed. The Session provides shared configuration that can be shared between your service clients. When service clients are created you can pass in additional configuration via the aws.Config type to override configuration provided by in the Session to create service client instances with custom configuration. Once the service's client is created you can use it to make API requests the AWS service. These clients are safe to use concurrently. In the AWS SDK for Go, you can configure settings for service clients, such as the log level and maximum number of retries. Most settings are optional; however, for each service client, you must specify a region and your credentials. The SDK uses these values to send requests to the correct AWS region and sign requests with the correct credentials. You can specify these values as part of a session or as environment variables. See the SDK's configuration guide for more information. https://docs.aws.amazon.com/sdk-for-go/v1/developer-guide/configuring-sdk.html See the session package documentation for more information on how to use Session with the SDK. https://docs.aws.amazon.com/sdk-for-go/api/aws/session/ See the Config type in the aws package for more information on configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config When using the SDK you'll generally need your AWS credentials to authenticate with AWS services. The SDK supports multiple methods of supporting these credentials. By default the SDK will source credentials automatically from its default credential chain. See the session package for more information on this chain, and how to configure it. The common items in the credential chain are the following: Environment Credentials - Set of environment variables that are useful when sub processes are created for specific roles. Shared Credentials file (~/.aws/credentials) - This file stores your credentials based on a profile name and is useful for local development. EC2 Instance Role Credentials - Use EC2 Instance Role to assign credentials to application running on an EC2 instance. This removes the need to manage credential files in production. Credentials can be configured in code as well by setting the Config's Credentials value to a custom provider or using one of the providers included with the SDK to bypass the default credential chain and use a custom one. This is helpful when you want to instruct the SDK to only use a specific set of credentials or providers. This example creates a credential provider for assuming an IAM role, "myRoleARN" and configures the S3 service client to use that role for API requests. See the credentials package documentation for more information on credential providers included with the SDK, and how to customize the SDK's usage of credentials. https://docs.aws.amazon.com/sdk-for-go/api/aws/credentials The SDK has support for the shared configuration file (~/.aws/config). This support can be enabled by setting the environment variable, "AWS_SDK_LOAD_CONFIG=1", or enabling the feature in code when creating a Session via the Option's SharedConfigState parameter. In addition to the credentials you'll need to specify the region the SDK will use to make AWS API requests to. In the SDK you can specify the region either with an environment variable, or directly in code when a Session or service client is created. The last value specified in code wins if the region is specified multiple ways. To set the region via the environment variable set the "AWS_REGION" to the region you want to the SDK to use. Using this method to set the region will allow you to run your application in multiple regions without needing additional code in the application to select the region. The endpoints package includes constants for all regions the SDK knows. The values are all suffixed with RegionID. These values are helpful, because they reduce the need to type the region string manually. To set the region on a Session use the aws package's Config struct parameter Region to the AWS region you want the service clients created from the session to use. This is helpful when you want to create multiple service clients, and all of the clients make API requests to the same region. See the endpoints package for the AWS Regions and Endpoints metadata. https://docs.aws.amazon.com/sdk-for-go/api/aws/endpoints/ In addition to setting the region when creating a Session you can also set the region on a per service client bases. This overrides the region of a Session. This is helpful when you want to create service clients in specific regions different from the Session's region. See the Config type in the aws package for more information and additional options such as setting the Endpoint, and other service client configuration options. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config Once the client is created you can make an API request to the service. Each API method takes a input parameter, and returns the service response and an error. The SDK provides methods for making the API call in multiple ways. In this list we'll use the S3 ListObjects API as an example for the different ways of making API requests. ListObjects - Base API operation that will make the API request to the service. ListObjectsRequest - API methods suffixed with Request will construct the API request, but not send it. This is also helpful when you want to get a presigned URL for a request, and share the presigned URL instead of your application making the request directly. ListObjectsPages - Same as the base API operation, but uses a callback to automatically handle pagination of the API's response. ListObjectsWithContext - Same as base API operation, but adds support for the Context pattern. This is helpful for controlling the canceling of in flight requests. See the Go standard library context package for more information. This method also takes request package's Option functional options as the variadic argument for modifying how the request will be made, or extracting information from the raw HTTP response. ListObjectsPagesWithContext - same as ListObjectsPages, but adds support for the Context pattern. Similar to ListObjectsWithContext this method also takes the request package's Option function option types as the variadic argument. In addition to the API operations the SDK also includes several higher level methods that abstract checking for and waiting for an AWS resource to be in a desired state. In this list we'll use WaitUntilBucketExists to demonstrate the different forms of waiters. WaitUntilBucketExists. - Method to make API request to query an AWS service for a resource's state. Will return successfully when that state is accomplished. WaitUntilBucketExistsWithContext - Same as WaitUntilBucketExists, but adds support for the Context pattern. In addition these methods take request package's WaiterOptions to configure the waiter, and how underlying request will be made by the SDK. The API method will document which error codes the service might return for the operation. These errors will also be available as const strings prefixed with "ErrCode" in the service client's package. If there are no errors listed in the API's SDK documentation you'll need to consult the AWS service's API documentation for the errors that could be returned. Pagination helper methods are suffixed with "Pages", and provide the functionality needed to round trip API page requests. Pagination methods take a callback function that will be called for each page of the API's response. Waiter helper methods provide the functionality to wait for an AWS resource state. These methods abstract the logic needed to to check the state of an AWS resource, and wait until that resource is in a desired state. The waiter will block until the resource is in the state that is desired, an error occurs, or the waiter times out. If a resource times out the error code returned will be request.WaiterResourceNotReadyErrorCode. This example shows a complete working Go file which will upload a file to S3 and use the Context pattern to implement timeout logic that will cancel the request if it takes too long. This example highlights how to use sessions, create a service client, make a request, handle the error, and process the response.
Package validator implements value validations for structs and individual fields based on tags. It can also handle Cross-Field and Cross-Struct validation for nested structs and has the ability to dive into arrays and maps of any type. see more examples https://github.com/go-playground/validator/tree/v9/_examples Doing things this way is actually the way the standard library does, see the file.Open method here: The authors return type "error" to avoid the issue discussed in the following, where err is always != nil: Validator only InvalidValidationError for bad validation input, nil or ValidationErrors as type error; so, in your code all you need to do is check if the error returned is not nil, and if it's not check if error is InvalidValidationError ( if necessary, most of the time it isn't ) type cast it to type ValidationErrors like so err.(validator.ValidationErrors). Custom Validation functions can be added. Example: Cross-Field Validation can be done via the following tags: If, however, some custom cross-field validation is required, it can be done using a custom validation. Why not just have cross-fields validation tags (i.e. only eqcsfield and not eqfield)? The reason is efficiency. If you want to check a field within the same struct "eqfield" only has to find the field on the same struct (1 level). But, if we used "eqcsfield" it could be multiple levels down. Example: Multiple validators on a field will process in the order defined. Example: Bad Validator definitions are not handled by the library. Example: Baked In Cross-Field validation only compares fields on the same struct. If Cross-Field + Cross-Struct validation is needed you should implement your own custom validator. Comma (",") is the default separator of validation tags. If you wish to have a comma included within the parameter (i.e. excludesall=,) you will need to use the UTF-8 hex representation 0x2C, which is replaced in the code as a comma, so the above will become excludesall=0x2C. Pipe ("|") is the 'or' validation tags deparator. If you wish to have a pipe included within the parameter i.e. excludesall=| you will need to use the UTF-8 hex representation 0x7C, which is replaced in the code as a pipe, so the above will become excludesall=0x7C Here is a list of the current built in validators: Tells the validation to skip this struct field; this is particularly handy in ignoring embedded structs from being validated. (Usage: -) This is the 'or' operator allowing multiple validators to be used and accepted. (Usage: rbg|rgba) <-- this would allow either rgb or rgba colors to be accepted. This can also be combined with 'and' for example ( Usage: omitempty,rgb|rgba) When a field that is a nested struct is encountered, and contains this flag any validation on the nested struct will be run, but none of the nested struct fields will be validated. This is useful if inside of your program you know the struct will be valid, but need to verify it has been assigned. NOTE: only "required" and "omitempty" can be used on a struct itself. Same as structonly tag except that any struct level validations will not run. Allows conditional validation, for example if a field is not set with a value (Determined by the "required" validator) then other validation such as min or max won't run, but if a value is set validation will run. This tells the validator to dive into a slice, array or map and validate that level of the slice, array or map with the validation tags that follow. Multidimensional nesting is also supported, each level you wish to dive will require another dive tag. dive has some sub-tags, 'keys' & 'endkeys', please see the Keys & EndKeys section just below. Example #1 Example #2 Keys & EndKeys These are to be used together directly after the dive tag and tells the validator that anything between 'keys' and 'endkeys' applies to the keys of a map and not the values; think of it like the 'dive' tag, but for map keys instead of values. Multidimensional nesting is also supported, each level you wish to validate will require another 'keys' and 'endkeys' tag. These tags are only valid for maps. Example #1 Example #2 This validates that the value is not the data types default zero value. For numbers ensures value is not zero. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. The field under validation must be present and not empty only if any of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only if all of the other specified fields are present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: The field under validation must be present and not empty only when any of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Examples: The field under validation must be present and not empty only when all of the other specified fields are not present. For strings ensures value is not "". For slices, maps, pointers, interfaces, channels and functions ensures the value is not nil. Example: This validates that the value is the default value and is almost the opposite of required. For numbers, length will ensure that the value is equal to the parameter given. For strings, it checks that the string length is exactly that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, max will ensure that the value is less than or equal to the parameter given. For strings, it checks that the string length is at most that number of characters. For slices, arrays, and maps, validates the number of items. For numbers, min will ensure that the value is greater or equal to the parameter given. For strings, it checks that the string length is at least that number of characters. For slices, arrays, and maps, validates the number of items. For strings & numbers, eq will ensure that the value is equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings & numbers, ne will ensure that the value is not equal to the parameter given. For slices, arrays, and maps, validates the number of items. For strings, ints, and uints, oneof will ensure that the value is one of the values in the parameter. The parameter should be a list of values separated by whitespace. Values may be strings or numbers. For numbers, this will ensure that the value is greater than the parameter given. For strings, it checks that the string length is greater than that number of characters. For slices, arrays and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than time.Now.UTC(). Same as 'min' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is greater than or equal to time.Now.UTC(). For numbers, this will ensure that the value is less than the parameter given. For strings, it checks that the string length is less than that number of characters. For slices, arrays, and maps it validates the number of items. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than time.Now.UTC(). Same as 'max' above. Kept both to make terminology with 'len' easier. Example #1 Example #2 (time.Time) For time.Time ensures the time value is less than or equal to time.Now.UTC(). This will validate the field value against another fields value either within a struct or passed in field. Example #1: Example #2: Field Equals Another Field (relative) This does the same as eqfield except that it validates the field provided relative to the top level struct. This will validate the field value against another fields value either within a struct or passed in field. Examples: Field Does Not Equal Another Field (relative) This does the same as nefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as gtefield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltfield except that it validates the field provided relative to the top level struct. Only valid for Numbers and time.Time types, this will validate the field value against another fields value either within a struct or passed in field. usage examples are for validation of a Start and End date: Example #1: Example #2: This does the same as ltefield except that it validates the field provided relative to the top level struct. This does the same as contains except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. This does the same as excludes except for struct fields. It should only be used with string types. See the behavior of reflect.Value.String() for behavior on other types. For arrays & slices, unique will ensure that there are no duplicates. For maps, unique will ensure that there are no duplicate values. For slices of struct, unique will ensure that there are no duplicate values in a field of the struct specified via a parameter. This validates that a string value contains ASCII alpha characters only This validates that a string value contains ASCII alphanumeric characters only This validates that a string value contains unicode alpha characters only This validates that a string value contains unicode alphanumeric characters only This validates that a string value contains a basic numeric value. basic excludes exponents etc... for integers or float it returns true. This validates that a string value contains a valid hexadecimal. This validates that a string value contains a valid hex color including hashtag (#) This validates that a string value contains a valid rgb color This validates that a string value contains a valid rgba color This validates that a string value contains a valid hsl color This validates that a string value contains a valid hsla color This validates that a string value contains a valid email This may not conform to all possibilities of any rfc standard, but neither does any email provider accept all possibilities. This validates that a string value contains a valid file path and that the file exists on the machine. This is done using os.Stat, which is a platform independent function. This validates that a string value contains a valid url This will accept any url the golang request uri accepts but must contain a schema for example http:// or rtmp:// This validates that a string value contains a valid uri This will accept any uri the golang request uri accepts This validataes that a string value contains a valid URN according to the RFC 2141 spec. This validates that a string value contains a valid base64 value. Although an empty string is valid base64 this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid base64 URL safe value according the the RFC4648 spec. Although an empty string is a valid base64 URL safe value, this will report an empty string as an error, if you wish to accept an empty string as valid you can use this with the omitempty tag. This validates that a string value contains a valid bitcoin address. The format of the string is checked to ensure it matches one of the three formats P2PKH, P2SH and performs checksum validation. Bitcoin Bech32 Address (segwit) This validates that a string value contains a valid bitcoin Bech32 address as defined by bip-0173 (https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki) Special thanks to Pieter Wuille for providng reference implementations. This validates that a string value contains a valid ethereum address. The format of the string is checked to ensure it matches the standard Ethereum address format Full validation is blocked by https://github.com/golang/crypto/pull/28 This validates that a string value contains the substring value. This validates that a string value contains any Unicode code points in the substring value. This validates that a string value contains the supplied rune value. This validates that a string value does not contain the substring value. This validates that a string value does not contain any Unicode code points in the substring value. This validates that a string value does not contain the supplied rune value. This validates that a string value starts with the supplied string value This validates that a string value ends with the supplied string value This validates that a string value contains a valid isbn10 or isbn13 value. This validates that a string value contains a valid isbn10 value. This validates that a string value contains a valid isbn13 value. This validates that a string value contains a valid UUID. Uppercase UUID values will not pass - use `uuid_rfc4122` instead. This validates that a string value contains a valid version 3 UUID. Uppercase UUID values will not pass - use `uuid3_rfc4122` instead. This validates that a string value contains a valid version 4 UUID. Uppercase UUID values will not pass - use `uuid4_rfc4122` instead. This validates that a string value contains a valid version 5 UUID. Uppercase UUID values will not pass - use `uuid5_rfc4122` instead. This validates that a string value contains only ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains only printable ASCII characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains one or more multibyte characters. NOTE: if the string is blank, this validates as true. This validates that a string value contains a valid DataURI. NOTE: this will also validate that the data portion is valid base64 This validates that a string value contains a valid latitude. This validates that a string value contains a valid longitude. This validates that a string value contains a valid U.S. Social Security Number. This validates that a string value contains a valid IP Address. This validates that a string value contains a valid v4 IP Address. This validates that a string value contains a valid v6 IP Address. This validates that a string value contains a valid CIDR Address. This validates that a string value contains a valid v4 CIDR Address. This validates that a string value contains a valid v6 CIDR Address. This validates that a string value contains a valid resolvable TCP Address. This validates that a string value contains a valid resolvable v4 TCP Address. This validates that a string value contains a valid resolvable v6 TCP Address. This validates that a string value contains a valid resolvable UDP Address. This validates that a string value contains a valid resolvable v4 UDP Address. This validates that a string value contains a valid resolvable v6 UDP Address. This validates that a string value contains a valid resolvable IP Address. This validates that a string value contains a valid resolvable v4 IP Address. This validates that a string value contains a valid resolvable v6 IP Address. This validates that a string value contains a valid Unix Address. This validates that a string value contains a valid MAC Address. Note: See Go's ParseMAC for accepted formats and types: This validates that a string value is a valid Hostname according to RFC 952 https://tools.ietf.org/html/rfc952 This validates that a string value is a valid Hostname according to RFC 1123 https://tools.ietf.org/html/rfc1123 Full Qualified Domain Name (FQDN) This validates that a string value contains a valid FQDN. This validates that a string value appears to be an HTML element tag including those described at https://developer.mozilla.org/en-US/docs/Web/HTML/Element This validates that a string value is a proper character reference in decimal or hexadecimal format This validates that a string value is percent-encoded (URL encoded) according to https://tools.ietf.org/html/rfc3986#section-2.1 This validates that a string value contains a valid directory and that it exists on the machine. This is done using os.Stat, which is a platform independent function. NOTE: When returning an error, the tag returned in "FieldError" will be the alias tag unless the dive tag is part of the alias. Everything after the dive tag is not reported as the alias tag. Also, the "ActualTag" in the before case will be the actual tag within the alias that failed. Here is a list of the current built in alias tags: Validator notes: A collection of validation rules that are frequently needed but are more complex than the ones found in the baked in validators. A non standard validator must be registered manually like you would with your own custom validation functions. Example of registration and use: Here is a list of the current non standard validators: This package panics when bad input is provided, this is by design, bad code like that should not make it to production.
SQL Schema migration tool for Go. Key features: To install the library and command line program, use the following: The main command is called sql-migrate. Each command requires a configuration file (which defaults to dbconfig.yml, but can be specified with the -config flag). This config file should specify one or more environments: The `table` setting is optional and will default to `gorp_migrations`. The environment that will be used can be specified with the -env flag (defaults to development). Use the --help flag in combination with any of the commands to get an overview of its usage: The up command applies all available migrations. By contrast, down will only apply one migration by default. This behavior can be changed for both by using the -limit parameter. The redo command will unapply the last migration and reapply it. This is useful during development, when you're writing migrations. Use the status command to see the state of the applied migrations: If you are using MySQL, you must append ?parseTime=true to the datasource configuration. For example: See https://github.com/go-sql-driver/mysql#parsetime for more information. Import sql-migrate into your application: Set up a source of migrations, this can be from memory, from a set of files or from bindata (more on that later): Then use the Exec function to upgrade your database: Note that n can be greater than 0 even if there is an error: any migration that succeeded will remain applied even if a later one fails. The full set of capabilities can be found in the API docs below. Migrations are defined in SQL files, which contain a set of SQL statements. Special comments are used to distinguish up and down migrations. You can put multiple statements in each block, as long as you end them with a semicolon (;). If you have complex statements which contain semicolons, use StatementBegin and StatementEnd to indicate boundaries: The order in which migrations are applied is defined through the filename: sql-migrate will sort migrations based on their name. It's recommended to use an increasing version number or a timestamp as the first part of the filename. Normally each migration is run within a transaction in order to guarantee that it is fully atomic. However some SQL commands (for example creating an index concurrently in PostgreSQL) cannot be executed inside a transaction. In order to execute such a command in a migration, the migration can be run using the notransaction option: If you like your Go applications self-contained (that is: a single binary): use packr (https://github.com/gobuffalo/packr) to embed the migration files. Just write your migration files as usual, as a set of SQL files in a folder. Use the PackrMigrationSource in your application to find the migrations: If you already have a box and would like to use a subdirectory: As an alternative, but slightly less maintained, you can use bindata (https://github.com/shuLhan/go-bindata) to embed the migration files. Just write your migration files as usual, as a set of SQL files in a folder. Then use bindata to generate a .go file with the migrations embedded: The resulting bindata.go file will contain your migrations. Remember to regenerate your bindata.go file whenever you add/modify a migration (go generate will help here, once it arrives). Use the AssetMigrationSource in your application to find the migrations: Both Asset and AssetDir are functions provided by bindata. Then proceed as usual. Adding a new migration source means implementing MigrationSource. The resulting slice of migrations will be executed in the given order, so it should usually be sorted by the Id field.
bíogo is a bioinformatics library for the Go language. It is a work in progress. bíogo stems from the need to address the size and structure of modern genomic and metagenomic data sets. These properties enforce requirements on the libraries and languages used for analysis: In addition to the computational burden of massive data set sizes in modern genomics there is an increasing need for complex pipelines to resolve questions in tightening problem space and also a developing need to be able to develop new algorithms to allow novel approaches to interesting questions. These issues suggest the need for a simplicity in syntax to facilitate: Related to the second issue is the reluctance of some researchers to release code because of quality concerns http://www.nature.com/news/2010/101013/full/467753a.html The issue of code release is the first of the principles formalised in the Science Code Manifesto http://sciencecodemanifesto.org/ A language with a simple, yet expressive, syntax should facilitate development of higher quality code and thus help reduce this barrier to research code release. It seems that nearly every language has it own bioinformatics library, some of which are very mature, for example BioPerl and BioPython. Why add another one? The different libraries excel in different fields, acting as scripting glue for applications in a pipeline (much of [1-3]) and interacting with external hosts [1, 2, 4, 5], wrapping lower level high performance languages with more user friendly syntax [1-4] or providing bioinformatics functions for high performance languages [5, 6]. The intended niche for bíogo lies somewhere between the scripting libraries and high performance language libraries in being easy to use for both small and large projects while having reasonable performance with computationally intensive tasks. The intent is to reduce the level of investment required to develop new research software for computationally intensive tasks. The bíogo library structure is influenced both by the structure of BioPerl and the Go core libraries. The coding style should be aligned with normal Go idioms as represented in the Go core libraries. Position numbering in the bíogo library conforms to the zero-based indexing of Go and range indexing conforms to Go's half-open zero-based slice indexing. This is at odds with the 'normal' inclusive indexing used by molecular biologists. This choice was made to avoid inconsistent indexing spaces being used — one-based inclusive for bíogo functions and methods and zero-based for native Go slices and arrays — and so avoid errors that this would otherwise facilitate. Note that the GFF package does allow, and defaults to, one-based inclusive indexing in its input and output of GFF files. Quality scores are supported for all sequence types, including protein. Phred and Solexa scoring systems are able to be read from files, however internal representation of quality scores is with Phred, so there will be precision loss in conversion. A Solexa quality score type is provided for use where this will be a problem. Copyright ©2011-2012 The bíogo Authors except where otherwise noted. All rights reserved. Use of this source code is governed by a BSD-style license that can be found in the LICENSE file.
Package CloudForest implements ensembles of decision trees for machine learning in pure Go (golang to search engines). It allows for a number of related algorithms for classification, regression, feature selection and structure analysis on heterogeneous numerical/categorical data with missing values. These include: Breiman and Cutler's Random Forest for Classification and Regression Adaptive Boosting (AdaBoost) Classification Gradiant Boosting Tree Regression Entropy and Cost driven classification L1 regression Feature selection with artificial contrasts Proximity and model structure analysis Roughly balanced bagging for unbalanced classification The API hasn't stabilized yet and may change rapidly. Tests and benchmarks have been performed only on embargoed data sets and can not yet be released. Library Documentation is in code and can be viewed with godoc or live at: http://godoc.org/github.com/ryanbressler/CloudForest Documentation of command line utilities and file formats can be found in README.md, which can be viewed fromated on github: http://github.com/ryanbressler/CloudForest Pull requests and bug reports are welcome. CloudForest was created by Ryan Bressler and is being developed in the Shumelivich Lab at the Institute for Systems Biology for use on genomic/biomedical data with partial support from The Cancer Genome Atlas and the Inova Translational Medicine Institute. CloudForest is intended to provide fast, comprehensible building blocks that can be used to implement ensembles of decision trees. CloudForest is written in Go to allow a data scientist to develop and scale new models and analysis quickly instead of having to modify complex legacy code. Data structures and file formats are chosen with use in multi threaded and cluster environments in mind. Go's support for function types is used to provide a interface to run code as data is percolated through a tree. This method is flexible enough that it can extend the tree being analyzed. Growing a decision tree using Breiman and Cutler's method can be done in an anonymous function/closure passed to a tree's root node's Recurse method: This allows a researcher to include whatever additional analysis they need (importance scores, proximity etc) in tree growth. The same Recurse method can also be used to analyze existing forests to tabulate scores or extract structure. Utilities like leafcount and errorrate use this method to tabulate data about the tree in collection objects. Decision tree's are grown with the goal of reducing "Impurity" which is usually defined as Gini Impurity for categorical targets or mean squared error for numerical targets. CloudForest grows trees against the Target interface which allows for alternative definitions of impurity. CloudForest includes several alternative targets: Additional targets can be stacked on top of these target to add boosting functionality: Repeatedly splitting the data and searching for the best split at each node of a decision tree are the most computationally intensive parts of decision tree learning and CloudForest includes optimized code to perform these tasks. Go's slices are used extensively in CloudForest to make it simple to interact with optimized code. Many previous implementations of Random Forest have avoided reallocation by reordering data in place and keeping track of start and end indexes. In go, slices pointing at the same underlying arrays make this sort of optimization transparent. For example a function like: can return left and right slices that point to the same underlying array as the original slice of cases but these slices should not have their values changed. Functions used while searching for the best split also accepts pointers to reusable slices and structs to maximize speed by keeping memory allocations to a minimum. BestSplitAllocs contains pointers to these items and its use can be seen in functions like: For categorical predictors, BestSplit will also attempt to intelligently choose between 4 different implementations depending on user input and the number of categories. These include exhaustive, random, and iterative searches for the best combination of categories implemented with bitwise operations against int and big.Int. See BestCatSplit, BestCatSplitIter, BestCatSplitBig and BestCatSplitIterBig. All numerical predictors are handled by BestNumSplit which relies on go's sorting package. Training a Random forest is an inherently parallel process and CloudForest is designed to allow parallel implementations that can tackle large problems while keeping memory usage low by writing and using data structures directly to/from disk. Trees can be grown in separate go routines. The growforest utility provides an example of this that uses go routines and channels to grow trees in parallel and write trees to disk as the are finished by the "worker" go routines. The few summary statistics like mean impurity decrease per feature (importance) can be calculated using thread safe data structures like RunningMean. Trees can also be grown on separate machines. The .sf stochastic forest format allows several small forests to be combined by concatenation and the ForestReader and ForestWriter structs allow these forests to be accessed tree by tree (or even node by node) from disk. For data sets that are too big to fit in memory on a single machine Tree.Grow and FeatureMatrix.BestSplitter can be reimplemented to load candidate features from disk, distributed database etc. By default cloud forest uses a fast heuristic for missing values. When proposing a split on a feature with missing data the missing cases are removed and the impurity value is corrected to use three way impurity which reduces the bias towards features with lots of missing data: Missing values in the target variable are left out of impurity calculations. This provided generally good results at a fraction of the computational costs of imputing data. Optionally, feature.ImputeMissing or featurematrixImputeMissing can be called before forest growth to impute missing values to the feature mean/mode which Brieman [2] suggests as a fast method for imputing values. This forest could also be analyzed for proximity (using leafcount or tree.GetLeaves) to do the more accurate proximity weighted imputation Brieman describes. Experimental support is provided for 3 way splitting which splits missing cases onto a third branch. [2] This has so far yielded mixed results in testing. At some point in the future support may be added for local imputing of missing values during tree growth as described in [3] [1] http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm#missing1 [2] https://code.google.com/p/rf-ace/ [3] http://projecteuclid.org/DPubS?verb=Display&version=1.0&service=UI&handle=euclid.aoas/1223908043&page=record In CloudForest data is stored using the FeatureMatrix struct which contains Features. The Feature struct implements storage and methods for both categorical and numerical data and calculations of impurity etc and the search for the best split. The Target interface abstracts the methods of Feature that are needed for a feature to be predictable. This allows for the implementation of alternative types of regression and classification. Trees are built from Nodes and Splitters and stored within a Forest. Tree has a Grow implements Brieman and Cutler's method (see extract above) for growing a tree. A GrowForest method is also provided that implements the rest of the method including sampling cases but it may be faster to grow the forest to disk as in the growforest utility. Prediction and Voting is done using Tree.Vote and CatBallotBox and NumBallotBox which implement the VoteTallyer interface.