Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package markdown implements markdown parser and HTML renderer. It parses markdown into AST format which can be serialized to HTML (using html.Renderer) or possibly other formats (using alternate renderers). The simplest way to convert markdown document to HTML You can customize parser and HTML renderer: For a cmd-line tool see https://github.com/gomarkdown/mdtohtml
Package libxml2 is an interface to libxml2 library, providing XML and HTML parsers with DOM interface. The inspiration is Perl5's XML::LibXML module. This library is still in very early stages of development. API may still change without notice. For the time being, the API is being written so that thye are as close as we can get to DOM Layer 3, but some methods will, for the time being, be punted and aliases for simpler methods that don't necessarily check for the DOM's correctness will be used. Also, the return values are still shaky -- I'm still debating how to handle error cases gracefully.
Package microformats provides a microformats parser, supporting both v1 and v2 syntax. Usage: Retrieve the HTML contents of a page, and call Parse or ParseNode, depending on what input you have (an io.Reader or an html.Node). To parse only a section of an HTML document, use a package like goquery to select the root node to parse from. For example, see cmd/gomf/main.go. See also: http://microformats.org/wiki/microformats2
Package pagser is a simple, easy, extensible, configurable HTML parser to struct based on goquery and struct tags, It's parser library from scrago. The project source code: https://github.com/foolin/pagser * Simple - Use golang struct tag syntax. * Easy - Easy use for your spider/crawler/colly application. * Extensible - Support for extension functions. * Struct tag grammar - Grammar is simple, like \`pagser:"a->attr(href)"\`. * Nested Structure - Support Nested Structure for node. * Configurable - Support configuration. * GoQuery/Colly - Support all goquery project, such as go-colly. More info: https://github.com/foolin/pagser
Package bbcode implements a parser and HTML generator for BBCode.
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package robots implements robots.txt parsing and matching based on Google's specification. For a robots.txt primer, please read the full specification at: https://developers.google.com/search/reference/robots_txt. Clients of this package have one obligation: when testing whether a URL can be crawled, use the correct robots.txt file. The specification uses scheme, port, and punycode variations to define which URLs are in scope. To get the right robots.txt file, use Locate. Locate takes as its only argument the URL you want to access. It returns the URL of the robots.txt file that governs access. Locate will always return a single unique robots.txt URL for all input URLs sharing a scope. In practice, a client pattern for testing whether a URL is accessible would be: a) Locate the robots.txt file for the URL; b) check whether you have fetched data for that robots.txt file; c) if yes, use the data to Test the URL against your user agent; d) if no, fetch the robots.txt data and try again. For details, see "File location & range of validity" in the specification: https://developers.google.com/search/reference/robots_txt#file-location--range-of-validity. A generous parser is specified. A valid line is accepted, and an invalid line is silently discarded. This is true even if the content parsed is in an unexpected format, like HTML. For details, see "File format" in the specification: https://developers.google.com/search/reference/robots_txt#file-format The specification states that a crawler will assume all URLs are accessible, even if there is no robots.txt file, or the body of the robots.txt file is empty. So a robots.txt file with a 404 status code will result in all URLs being crawlable. The exception to this is a 5xx status code. This is treated as a temporary "full disallow" of crawling. For details, see "Handling HTTP result codes" in the specification: https://developers.google.com/search/reference/robots_txt#handling-http-result-codes
Package microformats provides a microformats parser, supporting both v1 and v2 syntax. Usage: Retrive the HTML contents of a page, and call Parse or ParseNode, depending on what input you have (an io.Reader or an html.Node). To parse only a section of an HTML document, use a package like goquery to select the root node to parse from. For example, see cmd/gomf/main.go. See also: http://microformats.org/wiki/microformats2
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/fresh8/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package goquery implements features similar to jQuery, including the chainable syntax, to manipulate and query an HTML document. It brings a syntax and a set of features similar to jQuery to the Go language. It is based on Go's net/html package and the CSS Selector library cascadia. Since the net/html parser returns nodes, and not a full-featured DOM tree, jQuery's stateful manipulation functions (like height(), css(), detach()) have been left off. Also, because the net/html parser requires UTF-8 encoding, so does goquery: it is the caller's responsibility to ensure that the source document provides UTF-8 encoded HTML. See the repository's wiki for various options on how to do this. Syntax-wise, it is as close as possible to jQuery, with the same method names when possible, and that warm and fuzzy chainable interface. jQuery being the ultra-popular library that it is, writing a similar HTML-manipulating library was better to follow its API than to start anew (in the same spirit as Go's fmt package), even though some of its methods are less than intuitive (looking at you, index()...). It is hosted on GitHub, along with additional documentation in the README.md file: https://github.com/puerkitobio/goquery Please note that because of the net/html dependency, goquery requires Go1.1+. The various methods are split into files based on the category of behavior. The three dots (...) indicate that various "overloads" are available. * array.go : array-like positional manipulation of the selection. * expand.go : methods that expand or augment the selection's set. * filter.go : filtering methods, that reduce the selection's set. * iteration.go : methods to loop over the selection's nodes. * manipulation.go : methods for modifying the document * property.go : methods that inspect and get the node's properties values. * query.go : methods that query, or reflect, a node's identity. * traversal.go : methods to traverse the HTML document tree. * type.go : definition of the types exposed by goquery. * utilities.go : definition of helper functions (and not methods on a *Selection) that are not part of jQuery, but are useful to goquery. This example scrapes the reviews shown on the home page of metalsucks.net.
Package ghaml is a go parser for a haml-like language. It's goals are to provide a HAML-like syntax for creating views with a focus on being fast and efficient. Ghaml achieves this by compiling ghaml templates into go code, which can then be compiled into your application directly. To ease this burden, the ghaml tool can call 'go build' for you, making a near seemless replacement. What is HAML? Haml stands for HTML Abstraction Markup Language, a templating system originally developed for the Ruby language. Ghaml is very similar in intent, differing where it makes sense to accomodate the differences between Go and Ruby. HAML's core principles are: 1) Markup Should be Beautiful 2) Markup Should be DRY 3) Markup Should be Well-Indented 4) HTML Structure Should be Clear Learn more about Haml here: http://haml.info/ A look at a Template The ghaml template above illustrates many features of Ghaml templates: Strongly-typed data type: the @data_type directive specifies that the template accepts a data parameter of type string. This will typically be a struct of view content. Outputting tag content via the = operator is syntactically equal to the variadic parameter definition of the fmt.Print() function. Therefore variables and strings can be concatenated by seperating them with commas The - operator lets the developer execute arbitrary Go code. The default workflow is to use the ghaml command in your working directory. This will compile all ghaml template, and then run the 'go build' command. Command line options are: