Elasticsearch-to-GCS-Connector


0.0
PyPI
Maintainers
1

Python library for connecting to Elasticsearch, extracting data, and uploading it to Google Cloud Storage (GCS) in CSV format.

This Python library extracts data from Elasticsearch and uploads it directly to Google Cloud Storage as a CSV file. It manages the Elasticsearch connection, the conversion to CSV, and the GCS upload, so moving data from Elasticsearch to Google Cloud Storage takes a single call.

Features

  • Connect to an Elasticsearch instance and fetch data.
  • Convert data to a pandas DataFrame and then to a CSV format.
  • Upload the CSV file directly to a specified Google Cloud Storage bucket.
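The three features above amount to a fetch, convert, upload pipeline. The following is a minimal sketch of that flow, not the package's actual implementation. The names `hits_to_csv` and `export_index` are hypothetical, the client objects are supplied by the caller, and the standard-library `csv` module stands in for the pandas DataFrame step so the sketch needs no third-party imports. It assumes `es_client` behaves like `elasticsearch.Elasticsearch` (a `search` method whose response holds documents under `hits.hits[*]._source`) and `bucket` behaves like a `google.cloud.storage` bucket (`blob(name).upload_from_string(...)`).

```python
import csv
import io


def hits_to_csv(hits):
    """Turn Elasticsearch hits into CSV text, one row per document.

    Stand-in for the pandas step (DataFrame, then to_csv). Columns are the
    union of the `_source` keys, in first-seen order.
    """
    rows = [hit.get("_source", {}) for hit in hits]
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def export_index(es_client, bucket, index_name, blob_name, size=10000):
    """Fetch up to `size` documents, convert them to CSV, and upload the text."""
    response = es_client.search(index=index_name, size=size)
    csv_text = hits_to_csv(response["hits"]["hits"])
    bucket.blob(blob_name).upload_from_string(csv_text, content_type="text/csv")
    return csv_text
```

A single search returns at most `size` hits, and Elasticsearch caps that at 10000 by default (`index.max_result_window`), so an index larger than that would need the scroll or `search_after` APIs rather than one query.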

Installation

Install the package via pip:

pip install Elasticsearch_to_GCS_Connector

Dependencies

  • elasticsearch: To connect and interact with Elasticsearch.
  • google-cloud-storage: To handle operations related to Google Cloud Storage.
  • pandas: To manage data in DataFrame format.

If they are not already installed, add them with:

pip install elasticsearch google-cloud-storage pandas
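To confirm the three dependencies are importable in the current environment, a small check like the one below can help. It is only a sketch: `missing_modules` is a hypothetical helper, not part of the package, and it takes import names (`google.cloud.storage`), which differ from pip names (`google-cloud-storage`).

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from `module_names` that cannot be located here."""
    missing = []
    for name in module_names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised for dotted names when a parent package is absent or broken.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Import names of the three dependencies listed above.
    print(missing_modules(["elasticsearch", "google.cloud.storage", "pandas"]))
```

An empty list means all three modules were found.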

Example Usage:


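# Replace your_library with the module name the installed package exposes.
from your_library import Elasticsearch_to_GCS_Connector

Elasticsearch_to_GCS_Connector(
    es_index_name='your_index',          # index to export
    es_host='localhost',
    es_port=9200,                        # Elasticsearch's default HTTP port
    es_scheme='http',
    es_http_auth=('user', 'password'),   # basic-auth username and password
    es_size=10000,                       # records per query (the default)
    gcs_file_name='data.csv',
    gcs_bucket_name='your_bucket_name',
    gcs_bucket_name_prefix='your_prefix'
)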

Parameters:

  • es_index_name (str): The name of the Elasticsearch index to query.
  • es_host (str): The hostname of the Elasticsearch server.
  • es_port (int): The port number on which the Elasticsearch server is listening.
  • es_scheme (str): The protocol scheme (e.g., 'http' or 'https').
  • es_http_auth (tuple): A tuple containing the username and password for basic authentication.
  • es_size (int, optional): The number of records to fetch in one query (default is 10000).
  • gcs_file_name (str): The name of the file to be saved on GCS.
  • gcs_bucket_name (str): The name of the GCS bucket where the file will be uploaded.
  • gcs_bucket_name_prefix (str, optional): Prefix for the file name in the bucket, useful for organizing files in folders.
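The parameters above fall into two groups: connection details for Elasticsearch and a destination in GCS. The sketch below shows one plausible way they could be combined. Both helpers are hypothetical, and joining the prefix and file name with "/" is an assumption about the package's behavior rather than something the README states.

```python
def build_es_url(es_scheme, es_host, es_port):
    """Combine scheme, host, and port into a single node URL."""
    if es_scheme not in ("http", "https"):
        raise ValueError(f"unsupported scheme: {es_scheme!r}")
    return f"{es_scheme}://{es_host}:{int(es_port)}"


def build_blob_name(gcs_file_name, gcs_bucket_name_prefix=None):
    """Join an optional prefix and the file name into a GCS object name."""
    if not gcs_bucket_name_prefix:
        return gcs_file_name
    # GCS has no real folders; a "/" in an object name only looks like one.
    return f"{gcs_bucket_name_prefix.strip('/')}/{gcs_file_name}"
```

With the values from the usage example, `build_es_url('http', 'localhost', 9200)` gives `http://localhost:9200` and `build_blob_name('data.csv', 'your_prefix')` gives `your_prefix/data.csv`.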

Additional Notes:

Ensure you have configured credentials for both Elasticsearch and Google Cloud:

  • For Elasticsearch, provide the host, port, scheme, and authentication details.

  • For Google Cloud Storage, ensure your environment is set up with the appropriate credentials (using Google Cloud SDK or setting the GOOGLE_APPLICATION_CREDENTIALS environment variable to your service account key file).
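Before running an export, it can be useful to check how the Google Cloud credentials are configured. The helper below is a hypothetical sketch, not part of the package. It inspects only the `GOOGLE_APPLICATION_CREDENTIALS` variable, so an "unset" result does not rule out credentials supplied another way, such as through the Google Cloud SDK.

```python
import os


def gcs_credentials_status(environ=None):
    """Describe how GOOGLE_APPLICATION_CREDENTIALS is set in `environ`."""
    environ = os.environ if environ is None else environ
    key_path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not key_path:
        # Not an error by itself: the Cloud SDK may still supply credentials.
        return "unset"
    if not os.path.isfile(key_path):
        # The variable points at a key file that does not exist.
        return "missing-file"
    return "ok"
```

Calling `gcs_credentials_status()` with no arguments reads the real process environment; passing a dict makes the check easy to try out in isolation.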
