
@harvestapi/scraper
HarvestAPI provides LinkedIn data scraping tools for real-time, high-performance scraping at a low cost. The API lets you search LinkedIn `jobs`, `companies`, `profiles`, and `posts` using a wide range of filters.
npm install @harvestapi/scraper
To search for specific items, such as job listings, you can use the searchJobs method. Below is an example of how to search for job listings and retrieve details for a specific job:
import { createLinkedinScraper } from '@harvestapi/scraper';

// Initialize the scraper with your API key
const scraper = createLinkedinScraper({
  apiKey: 'your-api-key', // Replace with your HarvestAPI key. Obtain it at https://harvest-api.com/admin/api-keys
});

(async () => {
  const jobs = await scraper.searchJobs({
    search: 'software engineer', // Job title to search for
    location: 'California', // Location filter
    page: 1, // Page number to retrieve
  });
  console.log(`jobs`, JSON.stringify(jobs, null, 2));

  const jobDetails = await scraper.getJob({
    jobId: jobs.elements[0].id, // Use the job ID from the search results
  });
  console.log(`jobDetails`, JSON.stringify(jobDetails, null, 2));
})();
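The example above reads `jobs.elements[0].id` directly, which throws if the search returns no results. A minimal sketch of a guard follows, assuming only the `elements` array and `id` field documented in the API reference below. The `ListLike` type and `firstId` helper are hypothetical and not part of the package.

```typescript
// Hypothetical local type: mirrors only the fields this helper touches.
interface ListLike<T extends { id: string }> {
  elements: T[];
}

// Returns the id of the first element, or undefined when the list is empty.
function firstId<T extends { id: string }>(list: ListLike<T>): string | undefined {
  return list.elements.length > 0 ? list.elements[0].id : undefined;
}
```

With this helper, the `getJob` call can be skipped when `firstId(jobs)` is `undefined`.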
The scrape methods let you scrape all pages of search results and save the data either to a SQLite database or a JSON file. They handle pagination automatically and scrape all available pages based on the totalPages metadata.
After fetching a page, the scraper also makes a separate request for each item to fetch its details (default behavior).
await scraper.scrapeProfiles({
  query: {
    search: 'Mark',
    companyId: '1441', // Google company id.
    location: 'US',
  },
  outputType: 'sqlite',
});
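To illustrate what pagination driven by totalPages involves, here is a minimal sketch of the same idea written by hand. It is not the package's internal implementation. `Page`, `FetchPage`, and `collectAllPages` are hypothetical names, and the types mirror only the `pagination.totalPages` and `elements` fields from the API reference.

```typescript
// Hypothetical local types mirroring the documented list-response fields used here.
interface Page<T> {
  pagination: { totalPages: number } | null;
  elements: T[];
}

type FetchPage<T> = (page: number) => Promise<Page<T>>;

// Fetches page 1, reads totalPages, then walks the remaining pages in order.
// A null pagination object is treated as a single-page result (an assumption).
async function collectAllPages<T>(fetchPage: FetchPage<T>): Promise<T[]> {
  const first = await fetchPage(1);
  const items: T[] = [...first.elements];
  const totalPages = first.pagination?.totalPages ?? 1;
  for (let page = 2; page <= totalPages; page++) {
    const next = await fetchPage(page);
    items.push(...next.elements);
  }
  return items;
}
```

Assuming the response shape matches, it could be driven by a search method, for example `collectAllPages((page) => scraper.searchJobs({ search: 'software engineer', page }))`.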
If you want to fetch only the search pages, without fetching item details, pass the scrapeDetails: false option to the scrape method. For example, scrapeJobs will not fetch job descriptions in this case, but you will still get the job title, links, and some other basic info (see JobShort below).
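As a rough sketch, a listing-only configuration might look like the following. The `ScrapeJobsArgs` type is a hypothetical local mirror of a few documented options so that the snippet stands alone. In real code the types would come from '@harvestapi/scraper'.

```typescript
// Hypothetical local mirror of a few documented options; not the package's own type.
interface ScrapeJobsArgs {
  query: { search?: string; location?: string };
  outputType?: "json" | "sqlite" | "callback";
  scrapeDetails?: boolean;
}

// Search pages only: no per-item detail requests.
const listingOnly: ScrapeJobsArgs = {
  query: { search: "software engineer", location: "California" },
  outputType: "json",
  scrapeDetails: false,
};

// With a scraper instance this would be passed as: await scraper.scrapeJobs(listingOnly);
```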
After the scraping process is complete, you can view the data using any SQLite database browser. The data will be saved in a file located at ./output/{timestamp}_profiles_{id}.sqlite.
For more detailed information on the available methods and their parameters, check the API reference below.
createLinkedinScraper(options): LinkedinScraper
getProfile(params): Promise<ApiItemResponse<Profile>>
getProfileId(params): Promise<ApiItemResponse<{ id: string }>>
searchProfiles(params): Promise<ApiListResponse<ProfileShort>>
searchProfilesV2(params: SearchLinkedInProfilesParamsV2): Promise<ApiListResponse<ProfileShort>>
getCompany(params): Promise<ApiItemResponse<Company>>
searchCompanies(params): Promise<ApiListResponse<CompanyShort>>
getJob(params): Promise<ApiItemResponse<Job>>
searchJobs(params): Promise<ApiListResponse<JobShort>>
searchPosts(params): Promise<ApiListResponse<PostShort>>
getPostReactions(params: GetLinkedinPostReactionsParams): Promise<ApiListResponse<PostReaction>>
getPostComments(params): Promise<ApiListResponse<PostComment>>
searchCompanyAssociatedProfiles(params: SearchLinkedInCompanyAssociatedProfilesParams): Promise<ApiListResponse<ProfileShort>>
scrapeJobs(__namedParameters): Promise<undefined | { pages: number; pagesSuccess: number; items: number; itemsSuccess: number; requests: number; requestsStartTime: Date }>
scrapeCompanies(__namedParameters): Promise<undefined | { pages: number; pagesSuccess: number; items: number; itemsSuccess: number; requests: number; requestsStartTime: Date }>
scrapeProfiles(__namedParameters): Promise<undefined | { pages: number; pagesSuccess: number; items: number; itemsSuccess: number; requests: number; requestsStartTime: Date }>
scrapeProfilesV2(__namedParameters): Promise<undefined | { pages: number; pagesSuccess: number; items: number; itemsSuccess: number; requests: number; requestsStartTime: Date }>
scrapePosts(__namedParameters): Promise<undefined | { pages: number; pagesSuccess: number; items: number; itemsSuccess: number; requests: number; requestsStartTime: Date }>
scrapePostReactions(__namedParameters: ScrapeLinkedinPostReactionsParams): Promise<undefined | { pages: number; pagesSuccess: number; items: number; itemsSuccess: number; requests: number; requestsStartTime: Date }>
scrapePostComments(__namedParameters: ScrapeLinkedinPostCommentsParams): Promise<undefined | { pages: number; pagesSuccess: number; items: number; itemsSuccess: number; requests: number; requestsStartTime: Date }>
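Every scrape method resolves to either `undefined` or a stats object with the same six fields. The sketch below summarizes that result. It assumes, from the field names alone, that `pagesSuccess` and `itemsSuccess` count successful pages and items. `ScrapeStats` and `summarizeScrape` are hypothetical names, not part of the package.

```typescript
// Hypothetical local type: the fields of the object the scrape methods resolve to.
interface ScrapeStats {
  pages: number;
  pagesSuccess: number;
  items: number;
  itemsSuccess: number;
  requests: number;
  requestsStartTime: Date;
}

// Formats a one-line summary and handles the documented `undefined` result.
function summarizeScrape(stats: ScrapeStats | undefined): string {
  if (!stats) return "no stats returned";
  const pct = (ok: number, total: number): string =>
    total === 0 ? "n/a" : `${Math.round((ok / total) * 100)}%`;
  return (
    `pages ${stats.pagesSuccess}/${stats.pages} (${pct(stats.pagesSuccess, stats.pages)}), ` +
    `items ${stats.itemsSuccess}/${stats.items} (${pct(stats.itemsSuccess, stats.items)}), ` +
    `requests ${stats.requests}`
  );
}
```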
optional url: string
optional publicIdentifier: string
optional profileId: string
optional query: string
optional tryFindEmail: boolean
optional company: string | string[]
optional companyId: string | string[]
optional companyUniversalName: string | string[]
optional school: string | string[]
optional schoolId: string | string[]
optional schoolUniversalName: string | string[]
optional geoId: string | string[]
optional location: string | string[]
optional search: string
optional page: number
optional currentCompanies: string | string[]
optional pastCompanies: string | string[]
optional school: string | string[]
optional location: string | string[]
optional search: string
optional page: number
optional company: string | string[]
optional companyId: string | string[]
optional companyUniversalName: string | string[]
optional search: string
optional page: number
optional universalName: string
optional url: string
optional companyId: string
optional search: string
optional query: string
optional geoId: string
optional location: string
optional search: string
optional page: number
optional companySize: LinkedinCompanySize | LinkedinCompanySize[]
optional jobId: string
optional url: string
optional withCompany: boolean
optional search: string
optional company: string | string[]
optional companyId: string | string[]
optional companyUniversalName: string | string[]
optional location: string
optional geoId: string
optional sortBy: "date" | "relevance"
optional workplaceType: LinkedinWorkplaceType | LinkedinWorkplaceType[]
optional employmentType: LinkedinJobType | LinkedinJobType[]
optional experienceLevel: ExperienceLevel | ExperienceLevel[]
optional under10Applicants: boolean
optional easyApply: boolean
optional postedLimit: "24h" | "week" | "month"
optional page: number
optional salary: LinkedinSalaryRange | LinkedinSalaryRange[]
optional search: string
optional page: number
optional sortBy: "date" | "relevance"
optional postedLimit: "24h" | "week" | "month"
optional targetUrl: string | string[]
optional profile: string | string[]
optional companyId: string | string[]
optional profileId: string | string[]
optional company: string | string[]
optional companyUniversalName: string | string[]
optional profilePublicIdentifier: string | string[]
optional authorsCompany: string | string[]
optional authorsCompanyUniversalName: string | string[]
optional authorsCompanyId: string | string[]
post: string | number
optional page: number
post: string | number
optional page: number
optional paginationToken: null | string
optional sortBy: "date" | "relevance"
entityId: null | string
status: number
error: any
query: Record<string, any>
optional user: object
subscriptionPlan: string
requestsThisCycle: number
requestsLeftThisCycle: number
requestsUsedThisCycle: number
requestsConcurrency: number
ListingScraperConfig<TItemShot, TItemDetails>: object
• TItemShot
• TItemDetails
optional outputType: "json" | "sqlite" | "callback"
optional outputDir: string
optional filename: string
optional tableName: string
Table name for SQLite output.
optional scrapeDetails: boolean
Whether to make an additional request for each item's details. Default: true
optional onItemScraped: (args) => any
TItemShot | TItemDetails
Required<ScraperOptions>["logger"]
any
optional overrideConcurrency: number
optional maxItems: number
optional disableLog: boolean
optional disableErrorLog: boolean
optional optionsOverride: Partial<ListingScraperOptions<TItemShot, TItemDetails>>
ScraperOptions: object
apiKey: string
optional baseUrl: string
optional addHeaders: Record<string, string>
optional logger: object
log: (...args: any[]) => void
error: (...args: any[]) => void
LinkedinCompanySize: "1-10" | "11-50" | "51-200" | "201-500" | "501-1000" | "1001-5000" | "5001-10000" | "10001+"
LinkedinSalaryRange: "40k+" | "60k+" | "80k+" | "100k+" | "120k+" | "140k+" | "160k+" | "180k+" | "200k+"
LinkedinJobType: "full-time" | "part-time" | "contract" | "internship"
LinkedinWorkplaceType: "office" | "hybrid" | "remote"
ExperienceLevel: "internship" | "entry" | "associate" | "mid-senior" | "director" | "executive"
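These filter types are string-literal unions, so values arriving from user input or config files need a runtime check before they can be passed as typed filters. The sketch below shows one way to do that for the salary ranges. The `SALARY_RANGES` constant and `isSalaryRange` guard are hypothetical local helpers, and the union is re-declared locally so the snippet stands alone.

```typescript
// The documented LinkedinSalaryRange values, listed once as a readonly tuple.
const SALARY_RANGES = [
  "40k+", "60k+", "80k+", "100k+", "120k+", "140k+", "160k+", "180k+", "200k+",
] as const;

// Local re-declaration of the union, derived from the tuple above.
type SalaryRange = (typeof SALARY_RANGES)[number];

// Type guard: narrows an arbitrary string to the union when it is a listed value.
function isSalaryRange(value: string): value is SalaryRange {
  return (SALARY_RANGES as readonly string[]).includes(value);
}
```

The same pattern applies to the other unions (company size, job type, workplace type, experience level).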
Profile: object
id: string
publicIdentifier: string
lastName: string
firstName: string
headline: string
about: string
linkedinUrl: string
photo: string
emails: string[]
websites: string[]
registeredAt: string
topSkills: string
connectionsCount: number
followerCount: number
openToWork: boolean
hiring: boolean
location: object
linkedinText: string
countryCode: string
parsed: object
text: string
countryCode: string
regionCode: string | null
country: string
countryFull: string
state: string
city: string
currentPosition: object[]
experience: object[]
education: object[]
certifications: object[]
receivedRecommendations: object[]
skills: object[]
languages: object[]
projects: object[]
publications: object[]
featured: object
images: string[]
link: string
title: string
subtitle: string
verified: boolean
ProfileShort: object
id: string
publicIdentifier: string
optional name: string
optional position: string
optional location: object
optional linkedinText: string
optional linkedinUrl: string
optional photo: string
optional hidden: boolean
Company: object
id: string
universalName: string
optional name: string
optional tagline: string
optional website: string
optional linkedinUrl: string
optional logo: string
optional foundedOn: object
optional month: string | null
optional year: number
optional day: string | null
optional employeeCount: number
optional employeeCountRange: object
optional start: number
optional end: number
optional followerCount: number
optional description: string
optional headquarter: object
optional geographicArea: string
optional city: string
optional country: string
optional postalCode: string
optional line2: string | null
optional line1: string
optional description: string
optional parsed: object
optional text: string
optional countryCode: string
optional regionCode: string | null
optional country: string
optional countryFull: string
optional state: string
optional city: string
optional locations: object[]
optional specialities: string[]
optional industries: string[]
optional logos: object[]
optional backgroundCovers: object[]
optional active: boolean
optional jobSearchUrl: string
optional phone: string | null
optional crunchbaseFundingData: object
optional numberOfFundingRounds: number
optional lastFundingRound: object
optional localizedFundingType: string
optional leadInvestors: Record<string, never>[]
optional moneyRaised: object
optional amount: string
optional currencyCode: string
optional fundingRoundUrl: string
optional announcedOn: object
optional month: number
optional year: number
optional day: number
optional numberOfOtherInvestors: number
optional investorsUrl: string
optional organizationUrl: string
optional updatedAt: number
optional fundingRoundsUrl: string
optional pageVerified: boolean
CompanyShort: object
id: string
universalName: string
linkedinUrl: string
optional name: string
optional industries: string
optional location: object
optional linkedinText: string
optional followers: string
optional summary: string
optional logo: string
Job: object
id: string
optional title: string
optional url: string
optional jobState: string
optional postedDate: string
optional descriptionText: string
optional descriptionHtml: string
optional location: object
optional linkedinText: string
optional postalAddress: string | null
optional parsed: object
optional text: string
optional countryCode: string
optional regionCode: string | null
optional country: string
optional countryFull: string
optional state: string
optional city: string
optional employmentType: "full_time" | "part_time" | "contract" | "internship"
optional workplaceType: "on_site" | "hybrid" | "remote"
optional workRemoteAllowed: boolean
optional easyApplyUrl: string
optional applicants: number
company: Company
salary: { text: string; min: number; max: number; currency: string; payPeriod: string; compensationType: string; compensationSource: string; providedByEmployer: boolean } | null
optional views: number
optional expireAt: string
optional new: boolean
optional jobApplicationLimitReached: boolean
optional applicantTrackingSystem: string
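Because `salary` on a Job can be `null`, code that displays it needs a null check first. A minimal sketch follows. The `Salary` type is a local mirror of the documented field, `formatSalary` is a hypothetical helper, and the sample values in the usage note are made up because the source does not list actual `currency` or `payPeriod` values.

```typescript
// Local mirror of the documented `salary` field of Job.
type Salary = {
  text: string;
  min: number;
  max: number;
  currency: string;
  payPeriod: string;
  compensationType: string;
  compensationSource: string;
  providedByEmployer: boolean;
} | null;

// Renders a compact range string, or a fallback when no salary is present.
function formatSalary(salary: Salary): string {
  if (salary === null) return "not specified";
  const range = salary.min === salary.max ? `${salary.min}` : `${salary.min}-${salary.max}`;
  return `${range} ${salary.currency} (${salary.payPeriod})`;
}
```

For a made-up salary with `min: 100000`, `max: 150000`, `currency: "USD"`, and `payPeriod: "YEARLY"`, this returns `100000-150000 USD (YEARLY)`.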
JobShort: object
id: string
optional url: string
optional title: string
optional postedDate: string
optional company: CompanyShort
optional location: object
optional linkedinText: string
optional easyApply: boolean
PostShort: object
id: string
optional content: string
author: object
optional universalName: string | null
optional publicIdentifier: string | null
optional type: "company" | "profile"
optional name: string
optional linkedinUrl: string
optional info: string
optional website: string | null
optional websiteLabel: string | null
optional avatar: object
url: string
width: number
height: number
expiresAt: number
article: { title: string | null; subtitle: string | null; link: string | null; linkLabel: string | null; description: string | null; image: string | null } | null
optional postedAgo: string
optional postImages: object[]
optional repostId: string | null
optional repost: PostShort
optional repostedBy: object
name: string
publicIdentifier: string
linkedinUrl: string
optional newsletterUrl: string
optional newsletterTitle: string
optional socialContent: object
hideCommentsCount: boolean
hideReactionsCount: boolean
hideSocialActivityCounts: boolean
hideShareAction: boolean
hideSendAction: boolean
hideRepostsCount: boolean
hideViewsCount: boolean
hideReactAction: boolean
hideCommentAction: boolean
shareUrl: string
showContributionExperience: boolean
showSocialDetail: boolean
optional engagement: object
likes: number
comments: number
shares: number
reactions: object[]
PostReaction: object
id: string
reactionType: string
actor: object
id: string
name: string
linkedinUrl: string
position: string
image: object
url: string
width: number
height: number
expiresAt: number
PostComment: object
id: string
linkedinUrl: string
commentary: string
createdAt: string
postId: string
actor: object
id: string
name: string
linkedinUrl: string
position: string
pictureUrl: string
picture: object
url: string
width: number
height: number
expiresAt: number
createdAtTimestamp: number
optional pinned: boolean | null
optional contributed: boolean | null
optional edited: boolean | null
ScrapeLinkedinJobsParams: object & ListingScraperConfig<JobShort, Job>
query: SearchLinkedinJobsParams
ScrapeLinkedinCompaniesParams: object & ListingScraperConfig<CompanyShort, Company>
ScrapeLinkedinProfilesParams: object & ListingScraperConfig<ProfileShort, Profile>
query: SearchLinkedInProfilesParams
optional tryFindEmail: boolean
ScrapeLinkedinPostsParams: object & ListingScraperConfig<PostShort, PostShort>
query: SearchLinkedinPostsParams
ScrapeLinkedinPostReactionsParams: object & ListingScraperConfig<PostReaction, PostReaction>
ScrapeLinkedinPostCommentsParams: object & ListingScraperConfig<PostComment, PostComment>
ErrorResponse: object
error: string
message: string
status: number
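The source does not say when the library returns or throws an ErrorResponse, so the following is only a sketch of how a caller might recognize that shape at runtime. The interface is re-declared locally, and `isErrorResponse` is a hypothetical helper.

```typescript
// Local mirror of the documented ErrorResponse shape.
interface ErrorResponse {
  error: string;
  message: string;
  status: number;
}

// Type guard: true only when all three documented fields have the documented types.
function isErrorResponse(value: unknown): value is ErrorResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.error === "string" &&
    typeof v.message === "string" &&
    typeof v.status === "number"
  );
}
```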
ApiItemResponse<TItem>: BaseApiResponse & object
element: TItem
• TItem
ApiListResponse<TItem>: BaseApiResponse & object
pagination: { totalPages: number; totalElements: number; pageNumber: number; previousElements: number; pageSize: number; paginationToken: string | null } | null
elements: TItem[]
• TItem
FAQs
The npm package @harvestapi/scraper receives a total of 252 weekly downloads. As such, @harvestapi/scraper popularity was classified as not popular.
We found that @harvestapi/scraper demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.