proxycurl-py - The official Python client for Proxycurl API to scrape and enrich LinkedIn profiles

Proxycurl is an enrichment API to fetch fresh data on people and businesses. We are a fully-managed API that sits between your application and raw data so that you can focus on building the application instead of worrying about building a web-scraping team and processing data at scale.

With Proxycurl, you can programmatically enrich and look up people and company profiles. Visit Proxycurl's website for more details.
You should understand that proxycurl-py was designed with concurrency as a first-class citizen from the ground up. To install proxycurl-py, you have to pick a concurrency model. We support the following concurrency models: asyncio, gevent, and twisted.
The right way to use the Proxycurl API is to make API calls concurrently. In fact, making API requests concurrently is the only way to achieve a high rate of throughput. On the default rate limit, you can enrich up to 432,000 profiles per day; averaged over 24 hours, that works out to roughly 300 requests per minute, or about 5 per second, a rate you can only sustain by keeping multiple requests in flight at once. See this blog post for context.
proxycurl-py is available on PyPI. You can install it into your project with one of the following commands:
# install proxycurl-py with asyncio
$ pip install 'proxycurl-py[asyncio]'
# install proxycurl-py with gevent
$ pip install 'proxycurl-py[gevent]'
# install proxycurl-py with twisted
$ pip install 'proxycurl-py[twisted]'
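Whichever extra you install, the client class is then imported from the matching submodule. The asyncio import below is the one used throughout the examples in this README; the gevent and twisted paths are assumptions inferred from the install extras and the examples directory, so verify them against the installed package. A minimal sketch:
# Pick the single import that matches the extra you installed
from proxycurl.asyncio import Proxycurl    # pip install 'proxycurl-py[asyncio]'
# from proxycurl.gevent import Proxycurl   # pip install 'proxycurl-py[gevent]'  (path assumed)
# from proxycurl.twisted import Proxycurl  # pip install 'proxycurl-py[twisted]' (path assumed)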
proxycurl-py is tested on Python 3.7, 3.8, and 3.9.
proxycurl-py with an API Key

You can get an API key by registering an account with Proxycurl. The API key can be retrieved from the dashboard.

To use Proxycurl with the API key, run your script with the PROXYCURL_API_KEY environment variable set; see proxycurl/config.py for an example.
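A minimal sketch of that setup, using an in-process environment assignment for illustration (exporting PROXYCURL_API_KEY in your shell before running the script works the same way; the key value below is a placeholder):
import os

# The library reads PROXYCURL_API_KEY from the environment (see proxycurl/config.py),
# so set it before the client is imported and instantiated.
os.environ['PROXYCURL_API_KEY'] = 'your-api-key-here'  # placeholder value

from proxycurl.asyncio import Proxycurl

proxycurl = Proxycurl()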
I will be using proxycurl-py with the asyncio concurrency model to illustrate what you can do with Proxycurl and how the code looks with this library.
For examples with other concurrency models, see examples/lib-gevent.py and examples/lib-twisted.

Given a LinkedIn Member Profile URL, you can get the entire profile back in structured data with Proxycurl's Person Profile API Endpoint.
from proxycurl.asyncio import Proxycurl, do_bulk
import asyncio
import csv
proxycurl = Proxycurl()
person = asyncio.run(proxycurl.linkedin.person.get(
url='https://www.linkedin.com/in/williamhgates/'
))
print('Person Result:', person)
Given a LinkedIn Company Profile URL, enrich it with its full profile using Proxycurl's Company Profile API Endpoint.
company = asyncio.run(proxycurl.linkedin.company.get(
url='https://www.linkedin.com/company/tesla-motors'
))
print('Company Result:', company)
Given a first name and a company name or domain, look up a person with Proxycurl's Person Lookup API Endpoint.
lookup_results = asyncio.run(proxycurl.linkedin.person.resolve(first_name="bill", last_name="gates", company_domain="microsoft"))
print('Person Lookup Result:', lookup_results)
Given a company name or a domain, look up a company with Proxycurl's Company Lookup API Endpoint.
company_lookup_results = asyncio.run(proxycurl.linkedin.company.resolve(company_name="microsoft", company_domain="microsoft.com"))
print('Company Lookup Result:', company_lookup_results)
Given a work email address, look up a LinkedIn Profile URL with Proxycurl's Reverse Work Email Lookup Endpoint.
lookup_results = asyncio.run(proxycurl.linkedin.person.resolve_by_email(work_email="anthony.tan@grab.com"))
print('Reverse Work Email Lookup Result:', lookup_results)
Given a CSV file with a list of LinkedIn member profile URLs, you can enrich the list in the following manner:
# PROCESS BULK WITH CSV
bulk_linkedin_person_data = []
with open('sample.csv', 'r') as file:
    reader = csv.reader(file)
    next(reader, None)  # skip the header row
    for row in reader:
        bulk_linkedin_person_data.append(
            (proxycurl.linkedin.person.get, {'url': row[0]})
        )
results = asyncio.run(do_bulk(bulk_linkedin_person_data))
print('Bulk:', results)
More asyncio examples can be found at examples/lib-asyncio.py.
There is no need for you to handle rate limits (the 429 HTTP status code); the library handles rate limits automatically with exponential backoff. However, you do need to handle other error codes. Errors are returned in the form of a ProxycurlException. The list of possible errors is documented in our API documentation.
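A minimal sketch of such handling, assuming ProxycurlException can be imported from the same submodule as the client (adjust the import path to match the package layout):
import asyncio

# The exception's import path is an assumption; check the package for the exact location.
from proxycurl.asyncio import Proxycurl, ProxycurlException

proxycurl = Proxycurl()

try:
    person = asyncio.run(proxycurl.linkedin.person.get(
        url='https://www.linkedin.com/in/williamhgates/'
    ))
    print('Person Result:', person)
except ProxycurlException as exc:
    # Rate limits (429) are retried automatically by the library;
    # other API errors surface here as a ProxycurlException.
    print('Proxycurl returned an error:', exc)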
Here we list the possible API endpoints and their corresponding library functions. Do refer to each endpoint's relevant API documentation to find out the required arguments that need to be fed into the function.