html-metadata

Scrapes metadata of several different standards

0.1.1

Version published: 10 years ago

Weekly downloads: 673; increased by 89.04%

Maintainers: 1

Created: 10 years ago

html-metadata

Metadata HTML scraper and parser for Node.js

The aim of this library is to be a comprehensive source for extracting all HTML-embedded metadata. It currently supports Schema.org microdata through third-party libraries, native Dublin Core and Open Graph implementations, and some general metadata that doesn't belong to a particular standard (for instance, the content of the title tag or of meta description tags).

Support is planned for RDFa, Twitter, AGLS, eprints, highwire, BEPress, and other yet unheard-of metadata types. Contributions and requests for other metadata types are welcome!
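The standards named above are commonly told apart by how their meta tags are keyed: Open Graph properties conventionally start with "og:" and Dublin Core names with "DC." (in either case). The following is a minimal sketch of that grouping step only, not the library's own code. The function name groupByStandard, the { key, content } input shape, and the output shape are all assumptions made for illustration.

```javascript
// Sketch: group already-extracted <meta> key/content pairs by standard.
// Assumes Open Graph keys start with "og:" and Dublin Core keys with "dc."
// (compared case-insensitively). All names here are hypothetical.
function groupByStandard(tags) {
  var grouped = { openGraph: {}, dublinCore: {}, other: {} };

  tags.forEach(function (tag) {
    var lower = tag.key.toLowerCase();
    if (lower.indexOf('og:') === 0) {
      // Strip the three-character "og:" prefix
      grouped.openGraph[tag.key.slice(3)] = tag.content;
    } else if (lower.indexOf('dc.') === 0) {
      // Strip the three-character "dc." prefix
      grouped.dublinCore[tag.key.slice(3)] = tag.content;
    } else {
      grouped.other[tag.key] = tag.content;
    }
  });

  return grouped;
}

console.log(groupByStandard([
  { key: 'og:title', content: 'Example title' },
  { key: 'DC.creator', content: 'Example author' },
  { key: 'description', content: 'Example summary' }
]));
```

Keeping the unmatched keys under "other" rather than dropping them leaves room for standards that are not handled yet.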

Install

npm install git://github.com/mvolz/html-metadata.git

Usage

var scrape = require('html-metadata');

var url = "http://blog.woorank.com/2013/04/dublin-core-metadata-for-seo-and-usability/";

scrape(url, function(err, meta){
	if (err) {
		return console.error(err);
	}
	console.log(meta);
});
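Because scrape follows the Node.js (err, result) callback convention, it can be adapted to return a promise with Node's built-in util.promisify. The sketch below assumes only that convention. The helper name toPromise is hypothetical, and fakeScrape is a stand-in with made-up output so the example runs without network access.

```javascript
var promisify = require('util').promisify;

// Hypothetical helper: wrap any function that takes (url, callback)
// and calls callback(err, result) so that it returns a promise instead.
function toPromise(scrape) {
  return promisify(scrape);
}

// Stand-in with the same calling convention as scrape.
// Its result is invented for illustration, not real scraper output.
function fakeScrape(url, callback) {
  process.nextTick(function () {
    callback(null, { requestedUrl: url });
  });
}

var scrapeAsync = toPromise(fakeScrape);

scrapeAsync('http://example.com/')
  .then(function (meta) { console.log(meta); })
  .catch(function (err) { console.error(err); });
```

In real use, the value returned by require('html-metadata') would be passed to toPromise in place of fakeScrape.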

The scrape method used here invokes the parseAll() method, which runs every parser registered in metadataFunctions(). Each of those parsers is also available for use separately, for example:

var cheerio = require('cheerio');
var request = require('request');
var dublinCore = require('html-metadata').parseDublinCore;

var url = "http://blog.woorank.com/2013/04/dublin-core-metadata-for-seo-and-usability/";

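request(url, function(error, response, html){
	if (error) {
		return console.error(error);
	}
	// Load the page into cheerio and pass the resulting object to the parser
	var $ = cheerio.load(html);
	dublinCore($, function(err, results){
		console.log(results);
	});
});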
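The relationship described above, one entry point that runs every registered parser, can be sketched as a small registry of callback-style functions. This is an illustration under stated assumptions, not the library's parseAll(): the names parsers and runAll, the input shape, the result shape, and the choice to stop at the first error are all hypothetical.

```javascript
// Hypothetical registry: each parser takes an input and a Node-style callback.
// Names and return values are illustrative only.
var parsers = {
  general: function (input, callback) {
    callback(null, { title: input.title });
  },
  openGraph: function (input, callback) {
    callback(null, { type: input.ogType });
  }
};

// Run every registered parser and collect each result under the parser's key.
// Assumption: the first error aborts the run and is passed to the callback.
function runAll(registry, input, callback) {
  var names = Object.keys(registry);
  var results = {};
  var pending = names.length;
  var failed = false;

  if (pending === 0) {
    return callback(null, results);
  }

  names.forEach(function (name) {
    registry[name](input, function (err, value) {
      if (failed) { return; }
      if (err) {
        failed = true;
        return callback(err);
      }
      results[name] = value;
      pending -= 1;
      if (pending === 0) {
        callback(null, results);
      }
    });
  });
}

runAll(parsers, { title: 'Example', ogType: 'article' }, function (err, meta) {
  console.log(meta);
});
```

Keying results by parser name keeps each standard's output separate, so two standards that both define a field such as a title do not overwrite each other.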

The method parseGeneral obtains the following general metadata:

<meta name="author" content="">
<link rel="author" href="">
<link rel="canonical" href="">
<meta name="description" content="">
<link rel="publisher" href="">
<meta name="robots" content="">
<link rel="shortlink" href="">
<title></title>
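To make the list above concrete, here is a dependency-free sketch that pulls those same tags out of an HTML string. It is not how parseGeneral is implemented (the library works on a cheerio object, as in the earlier example). The names readAttributes and extractGeneral, the regular expressions, and the result shape are assumptions, and the sketch only handles double-quoted attribute values that contain no ">" character.

```javascript
// Read double-quoted attributes from the source text of a single tag.
function readAttributes(tagSource) {
  var attrs = {};
  var pattern = /([\w-]+)\s*=\s*"([^"]*)"/g;
  var match;
  while ((match = pattern.exec(tagSource)) !== null) {
    attrs[match[1].toLowerCase()] = match[2];
  }
  return attrs;
}

// The meta names and link rels listed in the text above.
var META_NAMES = ['author', 'description', 'robots'];
var LINK_RELS = ['author', 'canonical', 'publisher', 'shortlink'];

// Hypothetical extractor; the result shape is an assumption for this sketch.
function extractGeneral(html) {
  var result = { meta: {}, link: {}, title: null };
  var tags = html.match(/<(?:meta|link)\b[^>]*>/gi) || [];

  tags.forEach(function (tagSource) {
    var attrs = readAttributes(tagSource);
    var isMeta = /^<meta/i.test(tagSource);
    if (isMeta && META_NAMES.indexOf(attrs.name) !== -1) {
      result.meta[attrs.name] = attrs.content;
    } else if (!isMeta && LINK_RELS.indexOf(attrs.rel) !== -1) {
      result.link[attrs.rel] = attrs.href;
    }
  });

  var title = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  if (title) {
    result.title = title[1].trim();
  }
  return result;
}

var sample = '<html><head><title> Example page </title>' +
  '<meta name="description" content="A short summary">' +
  '<link rel="canonical" href="http://example.com/page">' +
  '</head></html>';

console.log(extractGeneral(sample));
```

Meta and link values are stored separately because "author" appears in both forms in the list above, and the two would otherwise collide.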

Package last updated on 22 Mar 2015
