floodesh

Floodesh is a distributed web spider/crawler written in Node.js.

  • 0.7.16
The v0.4.x API differs from earlier versions. If your project depends on an earlier version, do not upgrade it.

Floodesh

Floodesh is a middleware-based web spider written in Node.js. "Floodesh" is a combination of two words, flood and mesh.

Requirements

  • Gearman Server
  • MongoDB

Gearman Server Installation

Make sure g++, make, libboost-all-dev, gperf, libevent-dev and uuid-dev have been installed.

wget -O - https://launchpad.net/gearmand/1.2/1.1.12/+download/gearmand-1.1.12.tar.gz | tar xzf -
cd gearmand-1.1.12
./configure
make
make install

Install

$ npm install -g floodesh-cli

Usage

Generate a new app from templates with a single command.

$ mkdir floodesh_demo
$ cd floodesh_demo
$ floodesh-cli init # all necessary files will be generated in your directory
$ npm install

Context

A context instance is a kind of finite-state machine implemented with generators, an ECMAScript 6 feature. Through the context we can access almost all fields of the response and the request, for example:

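// Response middleware: runs for each fetched response.
worker.responsemw.use((ctx, next) => {
	ctx.content = ctx.body.toString(); // ctx.body is a Buffer; keep a string copy of it
	return next(); // hand control to the next middleware
});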

Request

ctx.querystring

  • String

Get the query string.

ctx.idempotent

  • Boolean

Check if the request is idempotent.

ctx.search

  • String

Get the search string. Unlike querystring, it includes the leading "?".

ctx.method

  • String

Get request method.

ctx.query

  • Object

Get the parsed query string.

ctx.path

  • String

Get the request pathname.

ctx.url

  • String

Return the request URL, the same as ctx.href.

ctx.origin

  • String

Get the origin of URL, for instance, "https://www.google.com".

ctx.protocol

  • String

Return the protocol string "http" or "https".

ctx.host

  • String, hostname:port

Parse the host from the "Host" header field. X-Forwarded-Host is supported when a proxy is enabled.

ctx.hostname

  • String

Parse the hostname from the "Host" header field. X-Forwarded-Host is supported when a proxy is enabled.

ctx.secure

  • Boolean

Check if protocol is https.
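
To see how the request fields above relate to one another, the following sketch derives analogous values from a single URL. It uses Node's built-in URL class rather than floodesh itself, so it only illustrates the naming; the example URL and the requestFields object are assumptions made for this demo, and in floodesh ctx.host and ctx.hostname come from the "Host" header rather than from the URL.

```javascript
// Illustration only: built with Node's URL class, not with a floodesh context.
const url = new URL('https://www.google.com:8443/search?q=floodesh&page=2');

// Hypothetical object whose keys mirror the ctx request fields documented above.
const requestFields = {
  href: url.href,                              // full URL (ctx.url / ctx.href)
  origin: url.origin,                          // scheme + host
  protocol: url.protocol.replace(/:$/, ''),    // "https" without the trailing colon
  host: url.host,                              // hostname:port
  hostname: url.hostname,                      // hostname only
  path: url.pathname,                          // pathname without the query
  search: url.search,                          // includes the leading "?"
  querystring: url.search.slice(1),            // same, without the "?"
  query: Object.fromEntries(url.searchParams), // parsed key/value pairs
  secure: url.protocol === 'https:',           // true for https
};

console.log(requestFields);
```

Note that search and querystring differ only by the leading "?", and that query holds every value as a string.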

Response

ctx.status

  • Number

Get status code from response.

ctx.message

  • String

Get status message from response.

ctx.body

  • Buffer

Get the response body as a Buffer.

ctx.length

  • Number

Get the length of the response body.

ctx.type

  • String

Get the response MIME type, for instance, "text/html".

ctx.lastModifieds

  • Date

Get the Last-Modified date in Date form, if it exists.

ctx.etag

  • String

Get the ETag of a response.

ctx.header

  • Object

Return the response header.

ctx.contentType

  • String

ctx.get(key)

  • key String
  • Return: String

Get the value of a response header by key.

ctx.is(types)

  • types String|Array
  • Return: String|false|null

Check whether the incoming response has a "Content-Type" header field and whether it contains any of the given MIME types. If there is no response body, null is returned. If there is no content type, false is returned. Otherwise, the first type that matches is returned.
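
The three possible return values of ctx.is(types) are easy to mix up, so here is a minimal sketch of the same decision order. matchContentType is a hypothetical helper written for this example and is not part of floodesh; it only does exact, case-insensitive matching, and returning false when nothing matches is an assumption, since the description above does not cover that case.

```javascript
// Hypothetical helper that mirrors the documented return values of ctx.is(types).
// Not floodesh code: exact, case-insensitive MIME matching only.
function matchContentType(contentType, hasBody, types) {
  if (!hasBody) return null;       // no response body
  if (!contentType) return false;  // no Content-Type header
  // Drop parameters such as "; charset=utf-8" before comparing.
  const actual = contentType.split(';')[0].trim().toLowerCase();
  const wanted = Array.isArray(types) ? types : [types];
  const hit = wanted.find((t) => t.toLowerCase() === actual);
  return hit === undefined ? false : hit; // assumption: false when nothing matches
}

console.log(matchContentType('text/html; charset=utf-8', true, ['application/json', 'text/html']));
console.log(matchContentType('text/html', false, 'text/html'));
console.log(matchContentType(undefined, true, 'text/html'));
```

Because null and false are both falsy, a caller that needs to tell "no body" apart from "no content type" has to compare with === rather than rely on truthiness.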

Other

tasks

* Array

Array of pending crawling tasks. A task is an object consisting of Options and next, where next is the name of the function in your spider to call for the next task. Supported formats:

* ```{
	opt:[Options](https://github.com/request/request#requestoptions-callback),
	next:String
}```

* ```{
	opt:String
}```

* `[Options](https://github.com/request/request#requestoptions-callback)`
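
As a rough sketch of the three formats listed above, the snippet below queues one task of each shape. The ctx object is a stand-in created for this demo, reaching the array as ctx.tasks is an assumption, and the URLs and the function name parseList are hypothetical; uri, method and headers are options of the request library that the Options link points to.

```javascript
// Stand-in for a floodesh context; only the tasks array is modelled (assumption).
const ctx = { tasks: [] };

// Format 1: request options plus the name of the spider function to call next.
ctx.tasks.push({
  opt: { uri: 'https://example.com/list?page=2', method: 'GET' },
  next: 'parseList', // hypothetical function name in your spider
});

// Format 2: the URL given as a plain string under opt, with no next.
ctx.tasks.push({ opt: 'https://example.com/item/1' });

// Format 3: a bare request options object.
ctx.tasks.push({
  uri: 'https://example.com/item/2',
  headers: { 'User-Agent': 'floodesh-demo' },
});

console.log(ctx.tasks.length);
```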

dataSet

* Map

dataSet is a Map that stores results, which floodesh then parses and saves.
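
A minimal sketch of filling dataSet from a response middleware might look like the following. The ctx object is a stand-in built for this demo, collectTitle and the "title" key are hypothetical, and which keys and value types floodesh expects in dataSet is not specified here, so treat the stored value as an assumption.

```javascript
// Stand-in for the context floodesh passes to middleware (assumption for this demo).
const ctx = {
  body: Buffer.from('<html><head><title>Example Domain</title></head></html>'),
  dataSet: new Map(),
};

// Hypothetical response middleware: pull the page title out of the body
// and record it in dataSet under an illustrative key.
function collectTitle(ctx, next) {
  const html = ctx.body.toString();
  const match = /<title>([^<]*)<\/title>/i.exec(html);
  ctx.dataSet.set('title', match ? match[1].trim() : null);
  return next();
}

// Run the middleware once with a no-op next().
collectTitle(ctx, () => {});
console.log(ctx.dataSet.get('title'));
```

In a real app the function would be registered the same way as the middleware in the Context section, with worker.responsemw.use.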

Middlewares

Package last updated on 25 Oct 2016
