.. image:: https://badge.fury.io/py/search-google.svg
    :target: https://badge.fury.io/py/search-google
.. image:: https://travis-ci.org/rrwen/search_google.svg?branch=master
    :target: https://travis-ci.org/rrwen/search_google
.. image:: https://coveralls.io/repos/github/rrwen/search_google/badge.svg?branch=master
    :target: https://coveralls.io/github/rrwen/search_google?branch=master
.. image:: https://img.shields.io/github/issues/rrwen/search_google.svg
    :target: https://github.com/rrwen/search_google/issues
.. image:: https://img.shields.io/badge/license-MIT-blue.svg
    :target: https://raw.githubusercontent.com/rrwen/search_google/master/LICENSE
.. image:: https://img.shields.io/github/stars/rrwen/search_google.svg
    :target: https://github.com/rrwen/search_google/stargazers
.. image:: https://img.shields.io/twitter/url/https/github.com/rrwen/search_google.svg?style=social
    :target: https://twitter.com/intent/tweet?text=%23python%20%23dataextraction%20tool%20for%20%23googlesearch%20results%20and%20%23googleimages:%20https://github.com/rrwen/search_google


Install
-------

- Install `Python <https://www.python.org/downloads/>`_
- Install `search_google <https://pypi.python.org/pypi/search-google>`_ via ``pip``

::

  pip install search_google

For the latest developer version, see `Developer Install`_.


Usage
-----

For help in the console::

  search_google -h

Ensure that a `CSE ID <https://support.google.com/customsearch/answer/2649143?hl=en>`_ and a `Google API developer key <https://developers.google.com/api-client-library/python/auth/api-keys>`_ are set::

  search_google -s cx="your_cse_id"
  search_google -s build_developerKey="your_dev_key"

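
The two settings above are ``key=value`` pairs, and the developer key carries a ``build_`` prefix. As a rough sketch only, assuming the prefix marks arguments meant for the API builder while the rest go to the search call (an assumption, not something stated here), a hypothetical helper named ``split_settings`` could separate such pairs like this:

```python
def split_settings(pairs, prefix="build_"):
    """Split 'key=value' strings into (buildargs, cseargs) dictionaries.

    Hypothetical helper for illustration; it is not part of search_google,
    and the meaning of the 'build_' prefix is an assumption.
    """
    buildargs, cseargs = {}, {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError("expected key=value, got: " + pair)
        # Drop surrounding double quotes such as cx="your_cse_id"
        value = value.strip('"')
        if key.startswith(prefix):
            buildargs[key[len(prefix):]] = value
        else:
            cseargs[key] = value
    return buildargs, cseargs


buildargs, cseargs = split_settings(
    ['cx="your_cse_id"', 'build_developerKey="your_dev_key"']
)
```

The resulting dictionaries have the same shape as the ``buildargs`` and ``cseargs`` used in the Python module example.
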
Search the web for keyword "cat"::

  search_google "cat"
  search_google "cat" --save_links=cat.txt
  search_google "cat" --save_downloads=downloads

Search for "cat" images::

  search_google cat --searchType=image
  search_google "cat" --searchType=image --save_links=cat_images.txt
  search_google "cat" --searchType=image --save_downloads=downloads

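
The file format written by ``--save_links`` is not described here. As a minimal sketch, assuming one URL per line, a hypothetical ``save_links`` function (the links below are made up) might be:

```python
import os
import tempfile


def save_links(links, path):
    """Write each link on its own line (hypothetical helper, illustration only)."""
    with open(path, "w", encoding="utf-8") as handle:
        for link in links:
            handle.write(link + "\n")


# Made-up example links standing in for real search results
example_links = ["https://example.com/cat1.jpg", "https://example.com/cat2.jpg"]
out_path = os.path.join(tempfile.mkdtemp(), "cat_images.txt")
save_links(example_links, out_path)

# Read the file back to confirm the layout
with open(out_path, encoding="utf-8") as handle:
    saved = handle.read().splitlines()
```
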
Use as a Python module:

.. code-block:: python

  # Import the api module for the results class
  import search_google.api

  # Define buildargs for cse api
  buildargs = {
    'serviceName': 'customsearch',
    'version': 'v1',
    'developerKey': 'your_api_key'
  }

  # Define cseargs for search
  cseargs = {
    'q': 'keyword query',
    'cx': 'your_cse_id',
    'num': 3
  }

  # Create a results object
  results = search_google.api.results(buildargs, cseargs)

  # Download the search results to a directory
  results.download_links('downloads')

For more usage details, see the `Documentation <https://rrwen.github.io/search_google>`_.

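
To avoid hard-coding credentials in scripts, the two dictionaries could be assembled from environment variables. The sketch below is only one possible approach: ``make_args`` and the variable names ``GOOGLE_DEV_KEY`` and ``GOOGLE_CSE_ID`` are hypothetical, and ``searchType`` mirrors the ``--searchType=image`` option shown for the console.

```python
import os


def make_args(query, num=3, image=False, env=os.environ):
    """Build (buildargs, cseargs) in the shape used by the module example.

    GOOGLE_DEV_KEY and GOOGLE_CSE_ID are hypothetical variable names.
    """
    buildargs = {
        "serviceName": "customsearch",
        "version": "v1",
        "developerKey": env.get("GOOGLE_DEV_KEY", "your_api_key"),
    }
    cseargs = {
        "q": query,
        "cx": env.get("GOOGLE_CSE_ID", "your_cse_id"),
        "num": num,
    }
    if image:
        # Same option as --searchType=image on the command line
        cseargs["searchType"] = "image"
    return buildargs, cseargs


# A plain dict stands in for os.environ so the sketch runs anywhere
buildargs, cseargs = make_args("cat", image=True, env={"GOOGLE_CSE_ID": "abc123"})
```

The returned dictionaries could then be passed to ``search_google.api.results(buildargs, cseargs)``.
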

Contributions
-------------

Report Contributions
********************

Reports for issues and suggestions can be made using the `issue submission <https://github.com/rrwen/search_google/issues>`_ interface.

When possible, ensure that your submission is:

- **Descriptive**: has informative title, explanations, and screenshots
- **Specific**: has details of environment (such as operating system and hardware) and software used
- **Reproducible**: has steps, code, and examples to reproduce the issue


Code Contributions
******************

Code contributions are submitted via `pull requests <https://help.github.com/articles/about-pull-requests>`_:

- Ensure that you pass the `Tests`_
- Create a new `pull request <https://github.com/rrwen/search_google/pulls>`_
- Provide an explanation of the changes

A template of the code contribution explanation is provided below:

::

  ## Purpose

  The purpose can mention goals that include fixes to bugs, addition of features, and other improvements.

  ## Description

  The description is a short summary of the changes made such as improved speeds, implementation

  ## Changes

  The changes are a list of general edits made to the files and their respective components.
  * `file_path1`:
      * `function_module_etc`: changed loop to map
      * `function_module_etc`: changed variable value
  * `file_path2`:
      * `function_module_etc`: changed loop to map
      * `function_module_etc`: changed variable value

  ## Notes

  The notes provide any additional text that does not fit into the above sections.

For more information, see `Developer Install`_ and `Implementation`_.


Developer Notes
---------------

Developer Install
*****************

Install the latest developer version with ``pip`` from GitHub::

  pip install git+https://github.com/rrwen/search_google

Install from ``git`` cloned source:

- Ensure `git <https://git-scm.com/>`_ is installed
- Clone into current path
- Install via ``pip``

::

  git clone https://github.com/rrwen/search_google
  cd search_google
  pip install . -I


Tests
*****

- Clone into current path ``git clone https://github.com/rrwen/search_google``
- Enter into folder ``cd search_google``
- Ensure `unittest <https://docs.python.org/2.7/library/unittest.html>`_ is available
- Set your `CSE ID <https://support.google.com/customsearch/answer/2649143?hl=en>`_ and `Google API developer key <https://developers.google.com/api-client-library/python/auth/api-keys>`_
- Run tests
- Reset config file to defaults
- Please note that this will use up 7 requests from your quota

::

  pip install . -I
  python -m search_google -s cx="your_cse_id"
  python -m search_google -s build_developerKey="your_dev_key"
  python -m unittest
  python -m search_google -d


Documentation Maintenance
*************************

- Ensure `sphinx <https://github.com/sphinx-doc/sphinx/>`_ is installed ``pip install -U sphinx``
- Update the documentation in ``docs/``

::

  pip install . -I
  sphinx-build -b html docs/source docs


Upload to GitHub
****************

- Ensure `git <https://git-scm.com/>`_ is installed
- Add all files and commit changes
- Push to GitHub

::

  git add .
  git commit -a -m "Generic update"
  git push


Upload to PyPI
**************

- Ensure `twine <https://pypi.python.org/pypi/twine>`_ is installed ``pip install twine``
- Ensure `sphinx <https://github.com/sphinx-doc/sphinx/>`_ is installed ``pip install -U sphinx``
- Run tests and check for OK status
- Delete ``dist`` directory
- Update the version in ``search_google/__init__.py``
- Update the documentation in ``docs/``
- Create source distribution
- Upload to `PyPI <https://pypi.python.org/pypi>`_

::

  pip install . -I
  python -m search_google -s cx="your_cse_id"
  python -m search_google -s build_developerKey="your_dev_key"
  python -m unittest
  python -m search_google -d
  sphinx-build -b html docs/source docs
  python setup.py sdist
  twine upload dist/*


Implementation
--------------

This command line tool uses the `Google Custom Search Engine (CSE) <https://developers.google.com/api-client-library/python/apis/customsearch/v1>`_ to perform web and image searches. It relies on `googleapiclient.build <https://google.github.io/google-api-python-client/docs/epy/googleapiclient.discovery-module.html#build>`_ and `cse.list <https://developers.google.com/resources/api-libraries/documentation/customsearch/v1/python/latest/customsearch_v1.cse.html>`_, where ``build`` creates a Google API object and ``cse`` performs the searches.

The `search_google.api <https://rrwen.github.io/search_google/#module-api>`_ module passes a dictionary of arguments into ``build`` and ``cse``, then exposes the returned results through properties and methods. `search_google.cli <https://rrwen.github.io/search_google/#module-cli>`_ provides a command line interface for `search_google.api <https://rrwen.github.io/search_google/#module-api>`_.

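
The dictionary-passing pattern described above can be sketched roughly as follows. This is not the actual ``search_google.api`` code: the ``Results`` class, the ``build_func`` parameter, and the ``Fake*`` stand-ins are hypothetical, and the stand-ins exist only so the sketch runs offline without an API key.

```python
class Results:
    """Illustrative wrapper; the real search_google.api class may differ."""

    def __init__(self, buildargs, cseargs, build_func):
        # build_func stands in for googleapiclient.discovery.build
        self.buildargs = buildargs
        self.cseargs = cseargs
        service = build_func(**buildargs)
        # Unpack the search arguments into cse().list() and run the request
        self.metadata = service.cse().list(**cseargs).execute()


class FakeRequest:
    """Offline stand-in for the request object returned by cse().list()."""

    def __init__(self, kwargs):
        self.kwargs = kwargs

    def execute(self):
        # Made-up response that echoes the query
        return {"items": [{"link": "https://example.com/" + self.kwargs["q"]}]}


class FakeCse:
    def list(self, **kwargs):
        return FakeRequest(kwargs)


class FakeService:
    def cse(self):
        return FakeCse()


def fake_build(serviceName, version, developerKey):
    """Offline stand-in that accepts the same keywords as the buildargs keys."""
    return FakeService()


results = Results(
    {"serviceName": "customsearch", "version": "v1", "developerKey": "your_api_key"},
    {"q": "cat", "cx": "your_cse_id", "num": 3},
    build_func=fake_build,
)
```

Injecting the builder as a parameter is a design choice made for this sketch: it lets the wrapper be exercised without spending API quota.
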
In order to use ``build`` and ``cse``, a `Google Developer API Key <https://developers.google.com/api-client-library/python/auth/api-keys>`_ and a `Google CSE ID <https://cse.google.com/all>`_ need to be created for API access (see `search_google Setup <https://rrwen.github.io/search_google/#setup>`_). Creating these keys also requires a `Gmail <https://www.google.com/gmail>`_ account for login access.

::

  googleapiclient.build  <-- Google API
           |
        cse.list         <-- Google CSE
           |
   search_google.api     <-- search results
           |
   search_google.cli     <-- command line

A rough example is provided below, based on the `customsearch example <https://github.com/google/google-api-python-client/blob/master/samples/customsearch/main.py>`_ from Google:

.. code-block:: python

  from googleapiclient.discovery import build

  # Set developer key and CSE ID
  dev_key = 'a_developer_key'
  cse_id = 'a_cse_id'

  # Obtain search results from Google CSE
  service = build("customsearch", "v1", developerKey=dev_key)
  results = service.cse().list(q='cat', cx=cse_id).execute()

  # Manipulate search results after ...

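
As one illustration of manipulating the results, the response returned by ``execute()`` is a dictionary whose ``items`` entries each carry a ``link``. The sketch below uses a hypothetical ``extract_links`` helper and a made-up response trimmed to the fields it reads. It is not taken from ``search_google`` itself.

```python
def extract_links(response):
    """Return the 'link' value of every item in a CSE response dictionary."""
    # A response with no matches may omit the 'items' key entirely
    return [item["link"] for item in response.get("items", []) if "link" in item]


# Made-up response trimmed to the fields used here
sample_response = {
    "items": [
        {"title": "Cat", "link": "https://example.com/cat"},
        {"title": "Kitten", "link": "https://example.com/kitten"},
    ]
}
links = extract_links(sample_response)
```
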