Package applicationdiscoveryservice provides the client and types for making API requests to AWS Application Discovery Service. AWS Application Discovery Service helps you plan application migration projects by automatically identifying servers, virtual machines (VMs), software, and software dependencies running in your on-premises data centers. Application Discovery Service also collects application performance data, which can help you assess the outcome of your migration. The data collected by Application Discovery Service is securely retained in an Amazon-hosted and managed database in the cloud. You can export the data as a CSV or XML file into your preferred visualization tool or cloud-migration solution to plan your migration. For more information, see the Application Discovery Service FAQ (http://aws.amazon.com/application-discovery/faqs/). Application Discovery Service offers two modes of operation. Agentless discovery mode is recommended for environments that use VMware vCenter Server. This mode doesn't require you to install an agent on each host. Agentless discovery gathers server information regardless of the operating systems, which minimizes the time required for initial on-premises infrastructure assessment. Agentless discovery doesn't collect information about software and software dependencies. It also doesn't work in non-VMware environments. We recommend that you use agent-based discovery for non-VMware environments and if you want to collect information about software and software dependencies. You can also run agent-based and agentless discovery simultaneously. Use agentless discovery to quickly complete the initial infrastructure assessment and then install agents on select hosts to gather information about software and software dependencies. Agent-based discovery mode collects a richer set of data than agentless discovery by using Amazon software, the AWS Application Discovery Agent, which you install on one or more hosts in your data center. 
The agent captures infrastructure and application information, including an inventory of installed software applications, system and process performance, resource utilization, and network dependencies between workloads. The information collected by agents is secured at rest and in transit to the Application Discovery Service database in the cloud. Application Discovery Service integrates with application discovery solutions from AWS Partner Network (APN) partners. Third-party application discovery tools can query Application Discovery Service and write to the Application Discovery Service database using a public API. You can then import the data into either a visualization tool or cloud-migration solution. Application Discovery Service doesn't gather sensitive information. All data is handled according to the AWS Privacy Policy (http://aws.amazon.com/privacy/). You can operate Application Discovery Service using offline mode to inspect collected data before it is shared with the service. Your AWS account must be granted access to Application Discovery Service, a process called whitelisting. This is true for AWS partners and customers alike. To request access, sign up for AWS Application Discovery Service here (http://aws.amazon.com/application-discovery/preview/). We send you information about how to get started. This API reference provides descriptions, syntax, and usage examples for each of the actions and data types for Application Discovery Service. The topic for each action shows the API request parameters and the response. Alternatively, you can use one of the AWS SDKs to access an API that is tailored to the programming language or platform that you're using. For more information, see AWS SDKs (http://aws.amazon.com/tools/#SDKs). This guide is intended for use with the AWS Application Discovery Service User Guide (http://docs.aws.amazon.com/application-discovery/latest/userguide/). 
See https://docs.aws.amazon.com/goto/WebAPI/discovery-2015-11-01 for more information on this service. See applicationdiscoveryservice package documentation for more information. https://docs.aws.amazon.com/sdk-for-go/api/service/applicationdiscoveryservice/ To contact AWS Application Discovery Service with the SDK, use the New function to create a new service client. With that client you can make API requests to the service. These clients are safe to use concurrently. See the SDK's documentation for more information on how to use the SDK. https://docs.aws.amazon.com/sdk-for-go/api/ See aws.Config documentation for more information on configuring SDK clients. https://docs.aws.amazon.com/sdk-for-go/api/aws/#Config See the AWS Application Discovery Service client ApplicationDiscoveryService for more information on creating a client for this service. https://docs.aws.amazon.com/sdk-for-go/api/service/applicationdiscoveryservice/#New
Package fetchbot provides a simple and flexible web crawler that follows the robots.txt policies and crawl delays.

It is very much a rewrite of gocrawl (https://github.com/PuerkitoBio/gocrawl) with a simpler API and fewer built-in features, but at the same time more flexibility. As for Go itself, sometimes less is more!

To install, run go get github.com/PuerkitoBio/fetchbot in a terminal. The package has a single external dependency, robotstxt (https://github.com/temoto/robotstxt-go). It also integrates code from the iq package (https://github.com/kylelemons/iq).

The API documentation is available on godoc.org (http://godoc.org/github.com/PuerkitoBio/fetchbot).

The example in /example/short/main.go shows how to create and start a Fetcher, one way to send commands, and how to stop the fetcher once all commands have been handled. A more complex and complete example can be found in the repository, at /example/full/.

A Fetcher is an instance of a web crawler, independent of other Fetchers. It receives Commands via the Queue, executes the requests, and calls a Handler to process the responses. A Command is an interface that tells the Fetcher which URL to fetch, and which HTTP method to use (e.g. "GET", "HEAD", ...).

A call to Fetcher.Start() returns the Queue associated with this Fetcher. This is the thread-safe object that can be used to send commands, or to stop the crawler.

Both the Command and the Handler are interfaces, and may be implemented in various ways. A Context is a struct that holds the Command and the Queue, so that the Handler always knows which Command initiated this call, and has a handle to the Queue.

A Handler is similar to the net/http Handler, and middleware-style combinations can be built on top of it.
A HandlerFunc type is provided so that simple functions with the right signature can be used as Handlers (like net/http.HandlerFunc), and there is also a multiplexer Mux that can be used to dispatch calls to different Handlers based on some criteria.

The Fetcher recognizes a number of interfaces that the Command may implement, for more advanced needs. If the Command implements the BasicAuthProvider interface, a Basic Authentication header will be put in place with the given credentials to fetch the URL. Similarly, the CookiesProvider and HeaderProvider interfaces offer the expected features (setting cookies and header values on the request).

The ReaderProvider and ValuesProvider interfaces are also supported, although they should be mutually exclusive as they both set the body of the request. If both are supported, the ReaderProvider interface is used. It sets the body of the request (e.g. for a "POST") using the given io.Reader instance. The ValuesProvider does the same, but using the given url.Values instance, and sets the Content-Type of the body to "application/x-www-form-urlencoded" (unless it is explicitly set by a HeaderProvider).

Since the Command is an interface, it can be a custom struct that holds additional information, such as an ID for the URL (e.g. from a database), or a depth counter so that the crawling stops at a certain depth, etc. For basic commands that don't require additional information, the package provides the Cmd struct that implements the Command interface. This is the Command implementation used when using the various Queue.SendString* methods.

The Fetcher has a number of fields that provide further customization:

- HttpClient : By default, the Fetcher uses the net/http default Client to make requests. A different client can be set on the Fetcher.HttpClient field.
- CrawlDelay : That value is used only if there is no delay specified by the robots.txt of a given host.
- UserAgent : Sets the user agent string to use for the requests and to validate against the robots.txt entries.
- WorkerIdleTTL : Sets the duration that a worker goroutine can wait without receiving new commands to fetch. If the idle time-to-live is reached, the worker goroutine is stopped and its resources are released. This can be especially useful for long-running crawlers.

What fetchbot doesn't do - especially compared to gocrawl - is that it doesn't keep track of already visited URLs, and it doesn't normalize the URLs. This is outside the scope of this package - all commands sent on the Queue will be fetched. Normalization can easily be done (e.g. using https://github.com/PuerkitoBio/purell) before sending the Command to the Fetcher. How to keep track of visited URLs depends on the use-case of the specific crawler, but for an example, see /example/full/main.go.

The BSD 3-Clause license (http://opensource.org/licenses/BSD-3-Clause), the same as the Go language. The iq_slice.go file is under the CDDL-1.0 license (details in the source file).
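Since fetchbot leaves visited-URL tracking and normalization to the caller, one possible approach is sketched below using only the standard library. The visitedSet type and the normalize and firstVisit functions are hypothetical names introduced for illustration; they are not part of fetchbot, and the normalization shown is deliberately minimal (a dedicated library such as purell does far more).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"sync"
)

// visitedSet is a hypothetical helper (not part of fetchbot) that records
// which URLs have already been seen. It is safe for concurrent use.
type visitedSet struct {
	mu   sync.Mutex
	seen map[string]struct{}
}

func newVisitedSet() *visitedSet {
	return &visitedSet{seen: make(map[string]struct{})}
}

// normalize applies a deliberately small normalization: lowercase the scheme
// and host, drop the fragment, and treat an empty path as "/".
func normalize(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.Scheme = strings.ToLower(u.Scheme)
	u.Host = strings.ToLower(u.Host)
	u.Fragment = ""
	if u.Path == "" {
		u.Path = "/"
	}
	return u.String(), nil
}

// firstVisit reports whether raw has not been seen before and marks it seen.
func (v *visitedSet) firstVisit(raw string) (bool, error) {
	key, err := normalize(raw)
	if err != nil {
		return false, err
	}
	v.mu.Lock()
	defer v.mu.Unlock()
	if _, ok := v.seen[key]; ok {
		return false, nil
	}
	v.seen[key] = struct{}{}
	return true, nil
}

func main() {
	v := newVisitedSet()
	for _, raw := range []string{
		"http://Example.com",
		"http://example.com/#top",
		"http://example.com/about",
	} {
		ok, err := v.firstVisit(raw)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		// Only URLs reported as a first visit would be sent on the Queue.
		fmt.Println(raw, "enqueue:", ok)
	}
}
```

In a crawler, a check like firstVisit would typically run just before a command is sent on the Queue, so that duplicates never reach the Fetcher.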
Package fetchbot provides a simple and flexible web crawler that follows the robots.txt policies and crawl delays.

It is very much a rewrite of gocrawl (https://github.com/PuerkitoBio/gocrawl) with a simpler API and fewer built-in features, but at the same time more flexibility. As for Go itself, sometimes less is more!

To install, run go get github.com/PuerkitoBio/fetchbot in a terminal. The package has a single external dependency, robotstxt (https://github.com/temoto/robotstxt). It also integrates code from the iq package (https://github.com/kylelemons/iq).

The API documentation is available on godoc.org (http://godoc.org/github.com/PuerkitoBio/fetchbot).

The example in /example/short/main.go shows how to create and start a Fetcher, one way to send commands, and how to stop the fetcher once all commands have been handled. A more complex and complete example can be found in the repository, at /example/full/.

A Fetcher is an instance of a web crawler, independent of other Fetchers. It receives Commands via the Queue, executes the requests, and calls a Handler to process the responses. A Command is an interface that tells the Fetcher which URL to fetch, and which HTTP method to use (e.g. "GET", "HEAD", ...).

A call to Fetcher.Start() returns the Queue associated with this Fetcher. This is the thread-safe object that can be used to send commands, or to stop the crawler.

Both the Command and the Handler are interfaces, and may be implemented in various ways. A Context is a struct that holds the Command and the Queue, so that the Handler always knows which Command initiated this call, and has a handle to the Queue.

A Handler is similar to the net/http Handler, and middleware-style combinations can be built on top of it.
A HandlerFunc type is provided so that simple functions with the right signature can be used as Handlers (like net/http.HandlerFunc), and there is also a multiplexer Mux that can be used to dispatch calls to different Handlers based on some criteria.

The Fetcher recognizes a number of interfaces that the Command may implement, for more advanced needs.

* BasicAuthProvider: Implement this interface to specify the basic authentication credentials to set on the request.
* CookiesProvider: If the Command implements this interface, the provided Cookies will be set on the request.
* HeaderProvider: Implement this interface to specify the headers to set on the request.
* ReaderProvider: Implement this interface to set the body of the request, via an io.Reader.
* ValuesProvider: Implement this interface to set the body of the request, as form-encoded values. If the Content-Type is not specifically set via a HeaderProvider, it is set to "application/x-www-form-urlencoded". ReaderProvider and ValuesProvider should be mutually exclusive as they both set the body of the request. If both are implemented, the ReaderProvider interface is used.
* Handler: Implement this interface if the Command's response should be handled by a specific callback function. By default, the response is handled by the Fetcher's Handler, but if the Command implements this, this handler function takes precedence and the Fetcher's Handler is ignored.

Since the Command is an interface, it can be a custom struct that holds additional information, such as an ID for the URL (e.g. from a database), or a depth counter so that the crawling stops at a certain depth, etc. For basic commands that don't require additional information, the package provides the Cmd struct that implements the Command interface. This is the Command implementation used when using the various Queue.SendString* methods.

There is also a convenience HandlerCmd struct for the commands that should be handled by a specific callback function. It is a Command with a Handler interface implementation.

The Fetcher has a number of fields that provide further customization:

* HttpClient : By default, the Fetcher uses the net/http default Client to make requests. A different client can be set on the Fetcher.HttpClient field.
* CrawlDelay : That value is used only if there is no delay specified by the robots.txt of a given host.
* UserAgent : Sets the user agent string to use for the requests and to validate against the robots.txt entries.
* WorkerIdleTTL : Sets the duration that a worker goroutine can wait without receiving new commands to fetch. If the idle time-to-live is reached, the worker goroutine is stopped and its resources are released. This can be especially useful for long-running crawlers.
* AutoClose : If true, closes the queue automatically once the number of active hosts reaches 0.
* DisablePoliteness : If true, ignores the robots.txt policies of the hosts.

What fetchbot doesn't do - especially compared to gocrawl - is that it doesn't keep track of already visited URLs, and it doesn't normalize the URLs. This is outside the scope of this package - all commands sent on the Queue will be fetched. Normalization can easily be done (e.g. using https://github.com/PuerkitoBio/purell) before sending the Command to the Fetcher. How to keep track of visited URLs depends on the use-case of the specific crawler, but for an example, see /example/full/main.go.

The BSD 3-Clause license (http://opensource.org/licenses/BSD-3-Clause), the same as the Go language. The iq_slice.go file is under the CDDL-1.0 license (details in the source file).
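The idea of a custom Command that carries a depth counter can be sketched as follows. This is a standalone illustration, not fetchbot code: the local command interface and its URL and Method methods are an assumed shape based on the description above (a Command names the URL to fetch and the HTTP method), and depthCmd, newRoot, canDescend and child are hypothetical names. Check the package documentation for the real interface before adapting it.

```go
package main

import (
	"fmt"
	"net/url"
)

// command is an assumed stand-in for the role of a Command as described in
// the text: it names the URL to fetch and the HTTP method to use.
type command interface {
	URL() *url.URL
	Method() string
}

// depthCmd is a hypothetical custom command that also records how many
// links were followed to reach its URL.
type depthCmd struct {
	u      *url.URL
	method string
	depth  int
}

// Compile-time check that depthCmd satisfies the assumed interface.
var _ command = (*depthCmd)(nil)

func (c *depthCmd) URL() *url.URL  { return c.u }
func (c *depthCmd) Method() string { return c.method }

// newRoot builds the depth-0 command for a seed URL.
func newRoot(raw string) (*depthCmd, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	return &depthCmd{u: u, method: "GET", depth: 0}, nil
}

// canDescend reports whether links found on this page may still be followed.
func (c *depthCmd) canDescend(maxDepth int) bool {
	return c.depth < maxDepth
}

// child resolves href against the parent URL and returns a command that is
// one level deeper.
func (c *depthCmd) child(href string) (*depthCmd, error) {
	ref, err := url.Parse(href)
	if err != nil {
		return nil, err
	}
	return &depthCmd{u: c.u.ResolveReference(ref), method: "GET", depth: c.depth + 1}, nil
}

func main() {
	const maxDepth = 1
	cur, err := newRoot("http://example.com/a/")
	if err != nil {
		fmt.Println("bad seed:", err)
		return
	}
	for _, href := range []string{"b/c", "../d"} {
		if !cur.canDescend(maxDepth) {
			fmt.Println("depth limit reached at", cur.URL())
			break
		}
		next, err := cur.child(href)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(next.Method(), next.URL(), "depth", next.depth)
		cur = next
	}
}
```

A handler would typically call canDescend on the command that produced the response before enqueuing any links found in it.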
Package robotstxt implements the Robots Exclusion Protocol (https://en.wikipedia.org/wiki/Robots_exclusion_standard) with a simple API.

A large portion of how this package handles the specification comes from https://developers.google.com/search/reference/robots_txt. In fact, this package tests against all of the examples listed at https://developers.google.com/search/reference/robots_txt#url-matching-based-on-path-values plus many more.

1. User agents are case insensitive, so "googlebot" and "Googlebot" are the same thing.
2. "Allow" and "Disallow" directive values are case sensitive, so "/pricing" and "/Pricing" are not the same thing.
3. The entire file must be valid UTF-8; this package returns an error if it is not.
4. The most specific user agent wins.
5. Among matching allow and disallow directives, the most specific one wins; in the event of a tie, the allow directive wins.
6. Directives listed in the robots.txt file apply only to a host, protocol, and port number (https://developers.google.com/search/reference/robots_txt#file-location--range-of-validity). This package validates the host, protocol, and port number every time it is asked if a robot "CanCrawl" a path and the path contains the host, protocol, and port.
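Rules 2 and 5 above can be illustrated with a small standalone sketch. It is not this package's implementation: the rule type and the allowed function are hypothetical, and the sketch assumes plain literal prefix matching (no "*" or "$" wildcards) and takes "most specific" to mean the longest matching path value.

```go
package main

import (
	"fmt"
	"strings"
)

// rule is a hypothetical, simplified directive: an Allow or Disallow with a
// literal path prefix (wildcards are not handled in this sketch).
type rule struct {
	allow bool
	path  string
}

// allowed reports whether path may be crawled under rules. The longest
// matching rule wins; on equal length, allow wins. With no matching rule the
// path is allowed. Matching is case sensitive.
func allowed(rules []rule, path string) bool {
	best := -1
	verdict := true
	for _, r := range rules {
		// Skip empty values and rules whose prefix does not match.
		if r.path == "" || !strings.HasPrefix(path, r.path) {
			continue
		}
		n := len(r.path)
		if n > best || (n == best && r.allow) {
			best = n
			verdict = r.allow
		}
	}
	return verdict
}

func main() {
	rules := []rule{
		{allow: false, path: "/folder"},
		{allow: true, path: "/folder/page"},
		{allow: false, path: "/pricing"},
	}
	for _, p := range []string{"/folder/page.html", "/folder/other", "/Pricing", "/pricing"} {
		fmt.Println(p, "allowed:", allowed(rules, p))
	}
}
```

Choosing the longest match first and letting allow win ties keeps the outcome independent of the order in which directives appear in the file.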
Package glesys is the official Go client for interacting with the GleSYS API. Please note that only a subset of the features available in the GleSYS API has been implemented. We greatly appreciate contributions. To get started, you need to sign up for a GleSYS Cloud account and create an API key. Signup is available at https://glesys.com/signup and API keys can be created at https://cloud.glesys.com. In the examples, CL12345 is the key of the Project you want to work with. To be able to monitor usage and help track down issues, we encourage you to provide a user agent string identifying your application or library. Recommended syntax is "my-library/version" or "www.example.com". The different modules of the GleSYS API are available on the client. More examples are provided below.
Package fetchbot provides a simple and flexible web crawler that follows the robots.txt policies and crawl delays. It is very much a rewrite of gocrawl (https://github.com/PuerkitoBio/gocrawl) with a simpler API, fewer built-in features, but at the same time more flexibility. As with Go itself, sometimes less is more! To install, simply run in a terminal: The package has a single external dependency, robotstxt (https://github.com/temoto/robotstxt-go). It also integrates code from the iq package (https://github.com/kylelemons/iq). The API documentation is available on godoc.org (http://godoc.org/github.com/PuerkitoBio/fetchbot). The following example (taken from /example/short/main.go) shows how to create and start a Fetcher, one way to send commands, and how to stop the fetcher once all commands have been handled. A more complex and complete example can be found in the repository, at /example/full/. Basically, a Fetcher is an instance of a web crawler, independent of other Fetchers. It receives Commands via the Queue, executes the requests, and calls a Handler to process the responses. A Command is an interface that tells the Fetcher which URL to fetch, and which HTTP method to use (e.g. "GET", "HEAD", ...). A call to Fetcher.Start() returns the Queue associated with this Fetcher. This is the thread-safe object that can be used to send commands, or to stop the crawler. Both the Command and the Handler are interfaces, and may be implemented in various ways. They are defined like so: A Context is a struct that holds the Command and the Queue, so that the Handler always knows which Command initiated this call, and has a handle to the Queue. A Handler is similar to the net/http Handler, and middleware-style combinations can be built on top of it. 
A HandlerFunc type is provided so that simple functions with the right signature can be used as Handlers (like net/http.HandlerFunc), and there is also a multiplexer Mux that can be used to dispatch calls to different Handlers based on some criteria. The Fetcher recognizes a number of interfaces that the Command may implement, for more advanced needs. If the Command implements the BasicAuthProvider interface, a Basic Authentication header will be put in place with the given credentials to fetch the URL. Similarly, the CookiesProvider and HeaderProvider interfaces offer the expected features (setting cookies and header values on the request). The ReaderProvider and ValuesProvider interfaces are also supported, although they should be mutually exclusive as they both set the body of the request. If both are supported, the ReaderProvider interface is used. It sets the body of the request (e.g. for a "POST") using the given io.Reader instance. The ValuesProvider does the same, but using the given url.Values instance, and sets the Content-Type of the body to "application/x-www-form-urlencoded" (unless it is explicitly set by a HeaderProvider). Since the Command is an interface, it can be a custom struct that holds additional information, such as an ID for the URL (e.g. from a database), or a depth counter so that the crawling stops at a certain depth, etc. For basic commands that don't require additional information, the package provides the Cmd struct that implements the Command interface. This is the Command implementation used when using the various Queue.SendString* methods. The Fetcher has a number of fields that provide further customization: - HttpClient : By default, the Fetcher uses the net/http default Client to make requests. A different client can be set on the Fetcher.HttpClient field. - CrawlDelay : That value is used only if there is no delay specified by the robots.txt of a given host. 
- UserAgent : Sets the user agent string to use for the requests and to validate against the robots.txt entries. - WorkerIdleTTL : Sets the duration that a worker goroutine can wait without receiving new commands to fetch. If the idle time-to-live is reached, the worker goroutine is stopped and its resources are released. This can be especially useful for long-running crawlers. What fetchbot doesn't do - especially compared to gocrawl - is that it doesn't keep track of already visited URLs, and it doesn't normalize the URLs. This is outside the scope of this package - all commands sent on the Queue will be fetched. Normalization can easily be done (e.g. using https://github.com/PuerkitoBio/purell) before sending the Command to the Fetcher. How to keep track of visited URLs depends on the use-case of the specific crawler, but for an example, see /example/full/main.go. The BSD 3-Clause license (http://opensource.org/licenses/BSD-3-Clause), the same as the Go language. The iq_slice.go file is under the CDDL-1.0 license (details in the source file).
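Since tracking visited URLs and normalizing them is left to the caller, a crawler needs some check of its own before sending a command. The sketch below shows one possible approach with the standard library only. The normalize rules here (lowercased host, dropped fragment, empty path treated as "/") are a deliberately small assumption for illustration and are not what purell does; the visited type and its firstVisit method are hypothetical names.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
	"sync"
)

// normalize applies a few simple, illustrative rules so that trivially
// different spellings of the same URL map to the same key.
func normalize(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	u.Scheme = strings.ToLower(u.Scheme)
	u.Host = strings.ToLower(u.Host)
	u.Fragment = ""
	u.RawFragment = ""
	if u.Path == "" {
		u.Path = "/"
	}
	return u.String(), nil
}

// visited is a concurrency-safe set of normalized URLs.
type visited struct {
	mu   sync.Mutex
	seen map[string]bool
}

func newVisited() *visited {
	return &visited{seen: make(map[string]bool)}
}

// firstVisit records raw and reports whether it had not been seen before.
func (v *visited) firstVisit(raw string) (bool, error) {
	key, err := normalize(raw)
	if err != nil {
		return false, err
	}
	v.mu.Lock()
	defer v.mu.Unlock()
	if v.seen[key] {
		return false, nil
	}
	v.seen[key] = true
	return true, nil
}

func main() {
	v := newVisited()
	candidates := []string{
		"http://Example.com",
		"http://example.com/#top",
		"http://example.com/about",
	}
	for _, raw := range candidates {
		isNew, err := v.firstVisit(raw)
		if err != nil {
			fmt.Println("skipping invalid URL:", err)
			continue
		}
		if isNew {
			// This is the point where the URL would be sent on the Queue.
			fmt.Println("enqueue", raw)
		}
	}
}
```

The mutex matters because handlers may run concurrently, so the check-and-record step has to be atomic.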
Package glesys is the official Go client for interacting with the GleSYS API. Please note that only a subset of the features available in the GleSYS API has been implemented. We greatly appreciate contributions. To get started, you need to sign up for a GleSYS Cloud account and create an API key. Signup is available at https://glesys.com/signup and API keys can be created at https://customer.glesys.com. CL12345 is the key of the Project you want to work with. To be able to monitor usage and help track down issues, we encourage you to provide a user agent string identifying your application or library. Recommended syntax is "my-library/version" or "www.example.com". The different modules of the GleSYS API are available on the client. For example: More examples are provided below.
Package model provides utilities to transform from the OpenTelemetry OTLP data model to the Datadog Agent data model. This module is used in the Datadog Agent and the OpenTelemetry Collector Datadog exporter. End-user behavior is stable, but there are no stability guarantees on its public Go API. Nonetheless, when editing, try to avoid breaking API changes and double-check the API usage in all module dependents. The 'attributes' packages provide utilities for semantic conventions mapping, while the translator model translates telemetry signals (currently only metrics are translated).
Package qconnect provides the API client, operations, and parameter types for Amazon Q Connect. Because Amazon Q in Connect is built on Amazon Bedrock, users can take full advantage of the controls implemented in Amazon Bedrock to enforce safety, security, and the responsible use of artificial intelligence (AI). Amazon Q in Connect is a generative AI customer service assistant. It is an LLM-enhanced evolution of Amazon Connect Wisdom that delivers real-time recommendations to help contact center agents resolve customer issues quickly and accurately. Amazon Q in Connect automatically detects customer intent during calls and chats using conversational analytics and natural language understanding (NLU). It then provides agents with immediate, real-time generative responses and suggested actions, and links to relevant documents and articles. Agents can also query Amazon Q in Connect directly using natural language or keywords to answer customer requests. Use the Amazon Q in Connect APIs to create an assistant and a knowledge base, for example, or manage content by uploading custom files. For more information, see Use Amazon Q in Connect for generative AI powered agent assistance in real-time in the Amazon Connect Administrator Guide.
Package dpo provides functionality for interacting with DPO Group's payment gateway from Go applications. Currently the module only supports performing payments through DPO's verify token workflow. We recommend setting the client's user agent to a string that identifies your application. The dpo package exposes the errors returned by the DPO API.
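As a general illustration of the user agent recommendation above, here is a minimal sketch that sets a fixed User-Agent on every outgoing request with plain net/http. It does not use the dpo package's own API; userAgentTransport, newClient, echoUserAgent, and the "my-app/1.0" string are hypothetical names chosen for this example, and the local test server exists only to show the header arriving.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// userAgentTransport wraps another RoundTripper and stamps a fixed
// User-Agent header on every request that passes through it.
type userAgentTransport struct {
	agent string
	next  http.RoundTripper
}

func (t userAgentTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	// A RoundTripper should not modify the caller's request, so clone it.
	clone := req.Clone(req.Context())
	clone.Header.Set("User-Agent", t.agent)
	return t.next.RoundTrip(clone)
}

// newClient returns an HTTP client that identifies itself as agent.
func newClient(agent string) *http.Client {
	return &http.Client{
		Transport: userAgentTransport{agent: agent, next: http.DefaultTransport},
	}
}

// echoUserAgent starts a throwaway local server that replies with the
// User-Agent it received, and returns that reply.
func echoUserAgent(agent string) (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, r.UserAgent())
	}))
	defer srv.Close()

	resp, err := newClient(agent).Get(srv.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	got, err := echoUserAgent("my-app/1.0")
	if err != nil {
		panic(err)
	}
	fmt.Println("server saw User-Agent:", got)
}
```

Wrapping the transport keeps the identification in one place, so individual call sites cannot forget to set the header.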
Package discogs is a Go client library for the Discogs API. The discogs package provides a client for accessing the Discogs API. First, import the library and initialize a client variable. According to the Discogs API documentation, you must provide your user agent. Some requests require authentication (as any user). According to Discogs, to send requests with Discogs Auth, you have two options: sending your credentials in the query string with key and secret parameters, or sending a token parameter. The following example uses the token option:
Package fetchbot provides a simple and flexible web crawler that follows the robots.txt policies and crawl delays. It is very much a rewrite of gocrawl (https://github.com/PuerkitoBio/gocrawl) with a simpler API, fewer built-in features, but at the same time more flexibility. As with Go itself, sometimes less is more! To install, simply run in a terminal: The package has a single external dependency, robotstxt (https://github.com/temoto/robotstxt-go). It also integrates code from the iq package (https://github.com/kylelemons/iq). The API documentation is available on godoc.org (http://godoc.org/github.com/PuerkitoBio/fetchbot). The following example (taken from /example/short/main.go) shows how to create and start a Fetcher, one way to send commands, and how to stop the fetcher once all commands have been handled. A more complex and complete example can be found in the repository, at /example/full/. Basically, a Fetcher is an instance of a web crawler, independent of other Fetchers. It receives Commands via the Queue, executes the requests, and calls a Handler to process the responses. A Command is an interface that tells the Fetcher which URL to fetch, and which HTTP method to use (e.g. "GET", "HEAD", ...). A call to Fetcher.Start() returns the Queue associated with this Fetcher. This is the thread-safe object that can be used to send commands, or to stop the crawler. Both the Command and the Handler are interfaces, and may be implemented in various ways. They are defined like so: A Context is a struct that holds the Command and the Queue, so that the Handler always knows which Command initiated this call, and has a handle to the Queue. A Handler is similar to the net/http Handler, and middleware-style combinations can be built on top of it. 
A HandlerFunc type is provided so that simple functions with the right signature can be used as Handlers (like net/http.HandlerFunc), and there is also a multiplexer Mux that can be used to dispatch calls to different Handlers based on some criteria. The Fetcher recognizes a number of interfaces that the Command may implement, for more advanced needs. * BasicAuthProvider: Implement this interface to specify the basic authentication credentials to set on the request. * CookiesProvider: If the Command implements this interface, the provided Cookies will be set on the request. * HeaderProvider: Implement this interface to specify the headers to set on the request. * ReaderProvider: Implement this interface to set the body of the request, via an io.Reader. * ValuesProvider: Implement this interface to set the body of the request, as form-encoded values. If the Content-Type is not specifically set via a HeaderProvider, it is set to "application/x-www-form-urlencoded". ReaderProvider and ValuesProvider should be mutually exclusive as they both set the body of the request. If both are implemented, the ReaderProvider interface is used. * Handler: Implement this interface if the Command's response should be handled by a specific callback function. By default, the response is handled by the Fetcher's Handler, but if the Command implements this, this handler function takes precedence and the Fetcher's Handler is ignored. Since the Command is an interface, it can be a custom struct that holds additional information, such as an ID for the URL (e.g. from a database), or a depth counter so that the crawling stops at a certain depth, etc. For basic commands that don't require additional information, the package provides the Cmd struct that implements the Command interface. This is the Command implementation used when using the various Queue.SendString* methods. There is also a convenience HandlerCmd struct for the commands that should be handled by a specific callback function. 
It is a Command with a Handler interface implementation. The Fetcher has a number of fields that provide further customization: * HttpClient : By default, the Fetcher uses the net/http default Client to make requests. A different client can be set on the Fetcher.HttpClient field. * CrawlDelay : That value is used only if there is no delay specified by the robots.txt of a given host. * UserAgent : Sets the user agent string to use for the requests and to validate against the robots.txt entries. * WorkerIdleTTL : Sets the duration that a worker goroutine can wait without receiving new commands to fetch. If the idle time-to-live is reached, the worker goroutine is stopped and its resources are released. This can be especially useful for long-running crawlers. * AutoClose : If true, closes the queue automatically once the number of active hosts reaches 0. * DisablePoliteness : If true, ignores the robots.txt policies of the hosts. What fetchbot doesn't do - especially compared to gocrawl - is that it doesn't keep track of already visited URLs, and it doesn't normalize the URLs. This is outside the scope of this package - all commands sent on the Queue will be fetched. Normalization can easily be done (e.g. using https://github.com/PuerkitoBio/purell) before sending the Command to the Fetcher. How to keep track of visited URLs depends on the use-case of the specific crawler, but for an example, see /example/full/main.go. The BSD 3-Clause license (http://opensource.org/licenses/BSD-3-Clause), the same as the Go language. The iq_slice.go file is under the CDDL-1.0 license (details in the source file).
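The HandlerFunc adapter and the rule that a Command's own Handler takes precedence over the Fetcher's Handler can be sketched with local types. This is a simplified model, not fetchbot's code: the Handle signature here takes a command and a body string as an assumption for brevity, and Command, simpleCmd, handlerCmd, dispatch, and run are hypothetical names.

```go
package main

import "fmt"

// Command is a simplified stand-in that only exposes a URL.
type Command interface{ URL() string }

// Handler processes the result of a command.
type Handler interface {
	Handle(cmd Command, body string)
}

// HandlerFunc lets an ordinary function satisfy Handler, in the same
// spirit as net/http.HandlerFunc.
type HandlerFunc func(cmd Command, body string)

func (f HandlerFunc) Handle(cmd Command, body string) { f(cmd, body) }

// simpleCmd is a basic command with no extra behavior.
type simpleCmd struct{ url string }

func (c simpleCmd) URL() string { return c.url }

// handlerCmd is a command that carries its own handler; the embedded
// HandlerFunc promotes a Handle method onto the struct.
type handlerCmd struct {
	simpleCmd
	HandlerFunc
}

// dispatch prefers the command's own handler and falls back to def.
func dispatch(def Handler, cmd Command, body string) {
	if h, ok := cmd.(Handler); ok {
		h.Handle(cmd, body)
		return
	}
	def.Handle(cmd, body)
}

// run dispatches one plain command and one command with its own handler,
// and returns a record of which handler saw which URL.
func run() []string {
	var calls []string
	def := HandlerFunc(func(cmd Command, body string) {
		calls = append(calls, "default:"+cmd.URL())
	})
	own := handlerCmd{
		simpleCmd:   simpleCmd{url: "http://example.com/b"},
		HandlerFunc: func(cmd Command, body string) { calls = append(calls, "own:"+cmd.URL()) },
	}
	dispatch(def, simpleCmd{url: "http://example.com/a"}, "")
	dispatch(def, own, "")
	return calls
}

func main() {
	for _, c := range run() {
		fmt.Println(c)
	}
}
```

The same type-assertion check is what makes the behavior opt-in: commands that do not implement Handler are unaffected and always reach the default handler.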
Package fetchbot provides a simple and flexible web crawler that follows the robots.txt policies and crawl delays. It is very much a rewrite of gocrawl (https://github.com/PuerkitoBio/gocrawl) with a simpler API, fewer built-in features, but at the same time more flexibility. As with Go itself, sometimes less is more! To install, simply run in a terminal: The package has a single external dependency, robotstxt (https://github.com/temoto/robotstxt). It also integrates code from the iq package (https://github.com/kylelemons/iq). The API documentation is available on godoc.org (http://godoc.org/github.com/PuerkitoBio/fetchbot). The following example (taken from /example/short/main.go) shows how to create and start a Fetcher, one way to send commands, and how to stop the fetcher once all commands have been handled. A more complex and complete example can be found in the repository, at /example/full/. Basically, a Fetcher is an instance of a web crawler, independent of other Fetchers. It receives Commands via the Queue, executes the requests, and calls a Handler to process the responses. A Command is an interface that tells the Fetcher which URL to fetch, and which HTTP method to use (e.g. "GET", "HEAD", ...). A call to Fetcher.Start() returns the Queue associated with this Fetcher. This is the thread-safe object that can be used to send commands, or to stop the crawler. Both the Command and the Handler are interfaces, and may be implemented in various ways. They are defined like so: A Context is a struct that holds the Command and the Queue, so that the Handler always knows which Command initiated this call, and has a handle to the Queue. A Handler is similar to the net/http Handler, and middleware-style combinations can be built on top of it. 
A HandlerFunc type is provided so that simple functions with the right signature can be used as Handlers (like net/http.HandlerFunc), and there is also a multiplexer Mux that can be used to dispatch calls to different Handlers based on some criteria. The Fetcher recognizes a number of interfaces that the Command may implement, for more advanced needs. * BasicAuthProvider: Implement this interface to specify the basic authentication credentials to set on the request. * CookiesProvider: If the Command implements this interface, the provided Cookies will be set on the request. * HeaderProvider: Implement this interface to specify the headers to set on the request. * ReaderProvider: Implement this interface to set the body of the request, via an io.Reader. * ValuesProvider: Implement this interface to set the body of the request, as form-encoded values. If the Content-Type is not specifically set via a HeaderProvider, it is set to "application/x-www-form-urlencoded". ReaderProvider and ValuesProvider should be mutually exclusive as they both set the body of the request. If both are implemented, the ReaderProvider interface is used. * Handler: Implement this interface if the Command's response should be handled by a specific callback function. By default, the response is handled by the Fetcher's Handler, but if the Command implements this, this handler function takes precedence and the Fetcher's Handler is ignored. Since the Command is an interface, it can be a custom struct that holds additional information, such as an ID for the URL (e.g. from a database), or a depth counter so that the crawling stops at a certain depth, etc. For basic commands that don't require additional information, the package provides the Cmd struct that implements the Command interface. This is the Command implementation used when using the various Queue.SendString* methods. There is also a convenience HandlerCmd struct for the commands that should be handled by a specific callback function. 
It is a Command with a Handler interface implementation. The Fetcher has a number of fields that provide further customization: * HttpClient : By default, the Fetcher uses the net/http default Client to make requests. A different client can be set on the Fetcher.HttpClient field. * CrawlDelay : That value is used only if there is no delay specified by the robots.txt of a given host. * UserAgent : Sets the user agent string to use for the requests and to validate against the robots.txt entries. * WorkerIdleTTL : Sets the duration that a worker goroutine can wait without receiving new commands to fetch. If the idle time-to-live is reached, the worker goroutine is stopped and its resources are released. This can be especially useful for long-running crawlers. * AutoClose : If true, closes the queue automatically once the number of active hosts reaches 0. * DisablePoliteness : If true, ignores the robots.txt policies of the hosts. What fetchbot doesn't do - especially compared to gocrawl - is that it doesn't keep track of already visited URLs, and it doesn't normalize the URLs. This is outside the scope of this package - all commands sent on the Queue will be fetched. Normalization can easily be done (e.g. using https://github.com/PuerkitoBio/purell) before sending the Command to the Fetcher. How to keep track of visited URLs depends on the use-case of the specific crawler, but for an example, see /example/full/main.go. The BSD 3-Clause license (http://opensource.org/licenses/BSD-3-Clause), the same as the Go language. The iq_slice.go file is under the CDDL-1.0 license (details in the source file).
Package cliniko provides the means to interact with the Cliniko API from Go. To get started, create a new ClinikoClient: Where token is the token copied directly from the Cliniko API. The shard is deduced from the token. The vendor name and email will be passed in the User-Agent field with each outgoing request. Use any *WithResponse function to execute a query and get a parsed response: One special case exists for creating an attachment, as this is a multi-step process: