EcommerceTools

EcommerceTools is a data science toolkit for those working in technical ecommerce, marketing science, and technical SEO, and includes a wide range of features to aid analysis and model building. The package is written in Python, is designed to be used with Pandas, and works within a Jupyter notebook environment or in standalone Python projects.

Installation

You can install EcommerceTools and its dependencies from PyPI by entering pip3 install ecommercetools in your terminal, or !pip3 install ecommercetools within a Jupyter notebook cell.
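For example, in a terminal:

pip3 install ecommercetools

Or within a Jupyter notebook cell:

!pip3 install ecommercetools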


Modules

  • Transactions
  • Products
  • Customers
  • Advertising
  • Operations
  • Marketing
  • NLP
  • SEO
  • Reports

Transactions

  1. Load sample transaction items data

If you want to get started with the transactions, products, and customers features, you can use the load_sample_data() function to load a set of real-world data. This imports the transaction items from the widely used Online Retail dataset and reformats them ready for use by EcommerceTools.

from ecommercetools import utilities

transaction_items = utilities.load_sample_data()
transaction_items.head()
   order_id     sku                          description  quantity           order_date  unit_price  customer_id         country  line_price
0    536365  85123A   WHITE HANGING HEART T-LIGHT HOLDER         6  2010-12-01 08:26:00        2.55      17850.0  United Kingdom       15.30
1    536365   71053                  WHITE METAL LANTERN         6  2010-12-01 08:26:00        3.39      17850.0  United Kingdom       20.34
2    536365  84406B       CREAM CUPID HEARTS COAT HANGER         8  2010-12-01 08:26:00        2.75      17850.0  United Kingdom       22.00
3    536365  84029G  KNITTED UNION FLAG HOT WATER BOTTLE         6  2010-12-01 08:26:00        3.39      17850.0  United Kingdom       20.34
4    536365  84029E       RED WOOLLY HOTTIE WHITE HEART.         6  2010-12-01 08:26:00        3.39      17850.0  United Kingdom       20.34
  2. Create a transaction items dataframe

The utilities module includes a range of tools that allow you to format data so it can be used within other EcommerceTools functions. The load_transaction_items() function is used to create a Pandas dataframe of formatted transactional item data. When loading your transaction items data, all you need to do is define the column mappings, and the function will reformat the dataframe accordingly.

import pandas as pd
from ecommercetools import utilities

transaction_items = utilities.load_transaction_items('transaction_items_non_standard_names.csv',
                                 date_column='InvoiceDate',
                                 order_id_column='InvoiceNo',
                                 customer_id_column='CustomerID',
                                 sku_column='StockCode',
                                 quantity_column='Quantity',
                                 unit_price_column='UnitPrice'
                                 )
transaction_items.to_csv('transaction_items.csv', index=False)
print(transaction_items.head())
   order_id     sku                          description  quantity           order_date  unit_price  customer_id         country  line_price
0    536365  85123A   WHITE HANGING HEART T-LIGHT HOLDER         6  2010-12-01 08:26:00        2.55      17850.0  United Kingdom       15.30
1    536365   71053                  WHITE METAL LANTERN         6  2010-12-01 08:26:00        3.39      17850.0  United Kingdom       20.34
2    536365  84406B       CREAM CUPID HEARTS COAT HANGER         8  2010-12-01 08:26:00        2.75      17850.0  United Kingdom       22.00
3    536365  84029G  KNITTED UNION FLAG HOT WATER BOTTLE         6  2010-12-01 08:26:00        3.39      17850.0  United Kingdom       20.34
4    536365  84029E       RED WOOLLY HOTTIE WHITE HEART.         6  2010-12-01 08:26:00        3.39      17850.0  United Kingdom       20.34
  3. Create a transactions dataframe

The get_transactions() function takes the formatted Pandas dataframe of transaction items and returns a Pandas dataframe of aggregated transaction data, which includes features such as the order_number (each customer's nth order).

import pandas as pd
from ecommercetools import transactions

transaction_items = pd.read_csv('transaction_items.csv')
transactions_df = transactions.get_transactions(transaction_items)
transactions_df.to_csv('transactions.csv', index=False)
print(transactions_df.head())
   order_id           order_date  customer_id  skus  items  revenue  replacement  order_number
0    536365  2010-12-01 08:26:00      17850.0     7     40   139.12            0             1
1    536366  2010-12-01 08:28:00      17850.0     2     12    22.20            0             2
2    536367  2010-12-01 08:34:00      13047.0    12     83   278.73            0             1
3    536368  2010-12-01 08:34:00      13047.0     4     15    70.05            0             2
4    536369  2010-12-01 08:35:00      13047.0     1      3    17.85            0             3

Products

1. Get product data from transaction items
from ecommercetools import products

products_df = products.get_products(transaction_items)
products_df.head()
      sku     first_order_date      last_order_date  customers  orders  items  revenue  avg_unit_price  avg_quantity  avg_revenue  avg_orders  product_tenure  product_recency
0   10002  2010-12-01 08:45:00  2011-04-28 15:05:00         40      73   1037   759.89        1.056849     14.205479    10.409452        1.82            3749             3600
1   10080  2011-02-27 13:47:00  2011-11-21 17:04:00         19      24    495   119.09        0.376667     20.625000     4.962083        1.26            3660             3393
2   10120  2010-12-03 11:19:00  2011-12-04 13:15:00         25      29    193    40.53        0.210000      6.433333     1.351000        1.16            3746             3380
3  10123C  2010-12-03 11:19:00  2011-07-15 15:05:00          3       4    -13     3.25        0.487500     -3.250000     0.812500        1.33            3746             3522
4  10123G  2011-04-08 11:13:00  2011-04-08 11:13:00          0       1    -38     0.00        0.000000    -38.000000     0.000000         inf            3620             3620
2. Calculate product consumption and repurchase rate
repurchase_rates = products.get_repurchase_rates(transaction_items)
repurchase_rates.head(3).T
                                                            0                              1                         2
sku                                                     10002                          10080                     10120
revenue                                                759.89                         119.09                     40.53
items                                                    1037                            495                       193
orders                                                     73                             24                        29
customers                                                  40                             19                        25
avg_unit_price                                        1.05685                       0.376667                      0.21
avg_line_price                                        10.4095                        4.96208                     1.351
avg_items_per_order                                   14.2055                         20.625                   6.65517
avg_items_per_customer                                 25.925                        26.0526                      7.72
purchased_individually                                      0                              0                         9
purchased_once                                             34                             17                        22
bulk_purchases                                             73                             24                        20
bulk_purchase_rate                                          1                              1                  0.689655
repurchases                                                39                              7                         7
repurchase_rate                                      0.534247                       0.291667                  0.241379
repurchase_rate_label                     Moderate repurchase                 Low repurchase            Low repurchase
bulk_purchase_rate_label                       Very high bulk                 Very high bulk                 High bulk
bulk_and_repurchase_label  Moderate repurchase_Very high bulk  Low repurchase_Very high bulk  Low repurchase_High bulk

Customers

1. Create a customers dataset
from ecommercetools import customers

customers_df = customers.get_customers(transaction_items)
customers_df.head()
   customer_id  revenue  orders  skus  items     first_order_date      last_order_date  avg_items  avg_order_value  tenure  recency  cohort
0      12346.0     0.00       2     1      0  2011-01-18 10:01:00  2011-01-18 10:17:00       0.00             0.00    3701     3700   20111
1      12347.0  4310.00       7     7   2458  2010-12-07 14:57:00  2011-12-07 15:52:00     351.14           615.71    3742     3377   20104
2      12348.0  1797.24       4     4   2341  2010-12-16 19:09:00  2011-09-25 13:13:00     585.25           449.31    3733     3450   20104
3      12349.0  1757.55       1     1    631  2011-11-21 09:51:00  2011-11-21 09:51:00     631.00          1757.55    3394     3394   20114
4      12350.0   334.40       1     1    197  2011-02-02 16:01:00  2011-02-02 16:01:00     197.00           334.40    3685     3685   20111
2. Create a customer cohort analysis dataset
from ecommercetools import customers

cohorts_df = customers.get_cohorts(transaction_items, period='M')
cohorts_df.head()
    customer_id  order_id           order_date acquisition_cohort order_cohort
0       17850.0    536365  2010-12-01 08:26:00            2010-12      2010-12
7       17850.0    536366  2010-12-01 08:28:00            2010-12      2010-12
9       13047.0    536367  2010-12-01 08:34:00            2010-12      2010-12
21      13047.0    536368  2010-12-01 08:34:00            2010-12      2010-12
25      13047.0    536369  2010-12-01 08:35:00            2010-12      2010-12
3. Create a customer cohort analysis matrix
from ecommercetools import customers

cohort_matrix_df = customers.get_cohort_matrix(transaction_items, period='M', percentage=True)
cohort_matrix_df.head()
periods               0         1         2         3         4         5         6         7         8         9         10        11        12
acquisition_cohort
2010-12             1.0  0.381857  0.334388  0.387131  0.359705  0.396624  0.379747  0.354430  0.354430  0.394515  0.373418  0.500000  0.274262
2011-01             1.0  0.239905  0.282660  0.242280  0.327791  0.299287  0.261283  0.256532  0.311164  0.346793  0.368171  0.149644       NaN
2011-02             1.0  0.247368  0.192105  0.278947  0.268421  0.247368  0.255263  0.281579  0.257895  0.313158  0.092105       NaN       NaN
2011-03             1.0  0.190909  0.254545  0.218182  0.231818  0.177273  0.263636  0.238636  0.288636  0.088636       NaN       NaN       NaN
2011-04             1.0  0.227425  0.220736  0.210702  0.207358  0.237458  0.230769  0.260870  0.083612       NaN       NaN       NaN       NaN
from ecommercetools import customers

cohort_matrix_df = customers.get_cohort_matrix(transaction_items, period='M', percentage=False)
cohort_matrix_df.head()
periods                 0      1      2      3      4      5      6      7      8      9     10     11     12
acquisition_cohort
2010-12             948.0  362.0  317.0  367.0  341.0  376.0  360.0  336.0  336.0  374.0  354.0  474.0  260.0
2011-01             421.0  101.0  119.0  102.0  138.0  126.0  110.0  108.0  131.0  146.0  155.0   63.0    NaN
2011-02             380.0   94.0   73.0  106.0  102.0   94.0   97.0  107.0   98.0  119.0   35.0    NaN    NaN
2011-03             440.0   84.0  112.0   96.0  102.0   78.0  116.0  105.0  127.0   39.0    NaN    NaN    NaN
2011-04             299.0   68.0   66.0   63.0   62.0   71.0   69.0   78.0   25.0    NaN    NaN    NaN    NaN
4. Create a customer "retention" dataset
from ecommercetools import customers

retention_df = customers.get_retention(transactions_df)
retention_df.head()
  acquisition_cohort order_cohort  customers  periods
0            2010-12      2010-12        948        0
1            2010-12      2011-01        362        1
2            2010-12      2011-02        317        2
3            2010-12      2011-03        367        3
4            2010-12      2011-04        341        4
5. Create an RFM (H) dataset

This is an extension of the regular Recency, Frequency, Monetary value (RFM) model that includes an additional parameter, "H", for heterogeneity, which shows the number of unique SKUs purchased by each customer. While heterogeneity is not typically used for targeting, it can be very useful for identifying customers who should probably be buying a broader mix of products than they currently are, as well as for spotting those who may have stopped buying certain items.

from ecommercetools import customers

rfm_df = customers.get_rfm_segments(customers_df)
rfm_df.head()
   customer_id     acquisition_date         recency_date  recency  frequency  monetary  heterogeneity  tenure  r  f  m  h  rfm  rfm_score rfm_segment_name
0      12346.0  2011-01-18 10:01:00  2011-01-18 10:17:00     3700          2      0.00              1    3701  1  1  1  1  111          3            Risky
1      12350.0  2011-02-02 16:01:00  2011-02-02 16:01:00     3685          1    334.40              1    3685  1  1  1  1  111          3            Risky
2      12365.0  2011-02-21 13:51:00  2011-02-21 14:04:00     3666          3    320.69              2    3666  1  1  1  1  111          3            Risky
3      12373.0  2011-02-01 13:10:00  2011-02-01 13:10:00     3686          1    364.60              1    3686  1  1  1  1  111          3            Risky
4      12377.0  2010-12-20 09:37:00  2011-01-28 15:45:00     3690          2   1628.12              2    3730  1  1  1  1  111          3            Risky
6. Create a purchase latency dataset
from ecommercetools import customers 

latency_df = customers.get_latency(transactions_df)
latency_df.head()
   customer_id  frequency         recency_date  recency  avg_latency  min_latency  max_latency  std_latency        cv  days_to_next_order          label
0      12680.0          4  2011-12-09 12:50:00     3388           28           16           73    30.859898  1.102139             -3329.0  Order overdue
1      13113.0         24  2011-12-09 12:49:00     3388           15            0           52    12.060126  0.804008             -3361.0  Order overdue
2      15804.0         13  2011-12-09 12:31:00     3388           15            1           39    11.008261  0.733884             -3362.0  Order overdue
3      13777.0         33  2011-12-09 12:25:00     3388           11            0           48    12.055274  1.095934             -3365.0  Order overdue
4      17581.0         25  2011-12-09 12:21:00     3388           14            0           67    21.974293  1.569592             -3352.0  Order overdue
7. Customer ABC segmentation
from ecommercetools import customers

abc_df = customers.get_abc_segments(customers_df, months=12, abc_class_name='abc_class_12m', abc_rank_name='abc_rank_12m')
abc_df.head()
   customer_id abc_class_12m  abc_rank_12m
0      12346.0             D           1.0
1      12347.0             D           1.0
2      12348.0             D           1.0
3      12349.0             D           1.0
4      12350.0             D           1.0
8. Predict customer AOV, CLV, and orders

EcommerceTools allows you to predict the AOV, Customer Lifetime Value (CLV) and expected number of orders via the Gamma-Gamma and BG/NBD models from the excellent Lifetimes package. By passing the dataframe of transactions from get_transactions() to the get_customer_predictions() function, EcommerceTools will fit the BG/NBD and Gamma-Gamma models and predict the AOV, order quantity, and CLV for each customer in the defined number of future days after the end of the observation period.

customer_predictions = customers.get_customer_predictions(transactions_df, 
                                                          observation_period_end='2011-12-09', 
                                                          days=90)
customer_predictions.head(10)
   customer_id  predicted_purchases         aov         clv
0      12346.0             0.188830         NaN         NaN
1      12347.0             1.408736  569.978836  836.846896
2      12348.0             0.805907  333.784235  308.247354
3      12349.0             0.855607         NaN         NaN
4      12350.0             0.196304         NaN         NaN
5      12352.0             1.682277  376.175359  647.826169
6      12353.0             0.272541         NaN         NaN
7      12354.0             0.247183         NaN         NaN
8      12355.0             0.262909         NaN         NaN
9      12356.0             0.645368  324.039419  256.855226

Advertising

1. Create paid search keywords
from ecommercetools import advertising

product_names = ['fly rods', 'fly reels']
keywords_prepend = ['buy', 'best', 'cheap', 'reduced']
keywords_append = ['for sale', 'price', 'promotion', 'promo', 'coupon', 'voucher', 'shop', 'suppliers']
campaign_name = 'fly_fishing'

keywords = advertising.generate_ad_keywords(product_names, keywords_prepend, keywords_append, campaign_name)
keywords.head()
    product            keywords match_type campaign_name
0  fly rods          [fly rods]      Exact   fly_fishing
1  fly rods      [buy fly rods]      Exact   fly_fishing
2  fly rods     [best fly rods]      Exact   fly_fishing
3  fly rods    [cheap fly rods]      Exact   fly_fishing
4  fly rods  [reduced fly rods]      Exact   fly_fishing
2. Create paid search ad copy using Spintax
from ecommercetools import advertising

text = "Fly Reels from {Orvis|Loop|Sage|Airflo|Nautilus} for {trout|salmon|grayling|pike}"
spin = advertising.generate_spintax(text, single=False)

spin
['Fly Reels from Orvis for trout',
 'Fly Reels from Orvis for salmon',
 'Fly Reels from Orvis for grayling',
 'Fly Reels from Orvis for pike',
 'Fly Reels from Loop for trout',
 'Fly Reels from Loop for salmon',
 'Fly Reels from Loop for grayling',
 'Fly Reels from Loop for pike',
 'Fly Reels from Sage for trout',
 'Fly Reels from Sage for salmon',
 'Fly Reels from Sage for grayling',
 'Fly Reels from Sage for pike',
 'Fly Reels from Airflo for trout',
 'Fly Reels from Airflo for salmon',
 'Fly Reels from Airflo for grayling',
 'Fly Reels from Airflo for pike',
 'Fly Reels from Nautilus for trout',
 'Fly Reels from Nautilus for salmon',
 'Fly Reels from Nautilus for grayling',
 'Fly Reels from Nautilus for pike']

Operations

1. Create an ABC inventory classification
from ecommercetools import operations

inventory_classification = operations.get_inventory_classification(transaction_items)
inventory_classification.head()
      sku abc_class  abc_rank
0   10002         A         1
1   10080         A         2
2   10120         A         3
3  10123C         A         4
4  10123G         A         4

Marketing

1. Get ecommerce trading calendar
from ecommercetools import marketing

trading_calendar_df = marketing.get_trading_calendar('2021-01-01', days=365)
trading_calendar_df.head()
         date         event
0  2021-01-01  January sale
1  2021-01-02
2  2021-01-03
3  2021-01-04
4  2021-01-05
2. Get ecommerce trading events
from ecommercetools import marketing

trading_events_df = marketing.get_trading_events('2021-01-01', days=365)
trading_events_df.head()
         date                              event
0  2021-01-01                       January sale
1  2021-01-29                    January Pay Day
2  2021-02-11  Valentine's Day [last order date]
3  2021-02-14                    Valentine's Day
4  2021-02-26                   February Pay Day

NLP

1. Generate text summaries

The get_summaries() function of the nlp module takes a Pandas dataframe containing text and returns a machine-generated summary of the content using a Huggingface Transformers pipeline via PyTorch. To use this feature, first load your Pandas dataframe and import the nlp module from ecommercetools.

import pandas as pd
from ecommercetools import nlp 

pd.set_option('display.max_colwidth', 1000)
df = pd.read_csv('text.csv')
df.head()

Specify the name of the Pandas dataframe, the column containing the text you wish to summarise (e.g. product_description), and a column name in which to store the machine-generated summary. The min_length and max_length arguments control the length of the generated summary, while the do_sample argument controls whether the summary is generated deterministically (do_sample=False) or sampled, which produces more varied output (do_sample=True).

df = nlp.get_summaries(df, 'product_description', 'sampled_summary', min_length=50, max_length=100, do_sample=True)
df = nlp.get_summaries(df, 'product_description', 'unsampled_summary', min_length=50, max_length=100, do_sample=False)
df = nlp.get_summaries(df, 'product_description', 'unsampled_summary_20_to_30', min_length=20, max_length=30, do_sample=False)

Since the model used for text summarisation is very large (1.2 GB plus), this function will take some time to complete. Once loaded, summaries are generated within a second or two per piece of text, so it is advisable to try smaller volumes of data initially.
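Given that, it can be worth trialling the function on a handful of rows before processing a whole dataset, for example (the output column name here is arbitrary):

df_test = nlp.get_summaries(df.head(5), 'product_description', 'summary', min_length=50, max_length=100, do_sample=False)
df_test.head()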

SEO

1. Discover XML sitemap locations

The get_sitemaps() function takes the location of a robots.txt file (always stored at the root of a domain), and returns the URLs of any XML sitemaps listed within.

from ecommercetools import seo

sitemaps = seo.get_sitemaps("http://www.flyandlure.org/robots.txt")
print(sitemaps)

2. Get an XML sitemap

The get_sitemap() function allows you to download the URLs in an XML sitemap to a Pandas dataframe. If the sitemap contains child sitemaps, each of these will be retrieved. You can save the Pandas dataframe to CSV in the usual way, as shown below the example output.

from ecommercetools import seo

df = seo.get_sitemap("http://flyandlure.org/sitemap.xml")
print(df.head())
                               loc changefreq priority          domain                           sitemap_name
0           http://flyandlure.org/     hourly      1.0  flyandlure.org  http://www.flyandlure.org/sitemap.xml
1      http://flyandlure.org/about    monthly      1.0  flyandlure.org  http://www.flyandlure.org/sitemap.xml
2      http://flyandlure.org/terms    monthly      1.0  flyandlure.org  http://www.flyandlure.org/sitemap.xml
3    http://flyandlure.org/privacy    monthly      1.0  flyandlure.org  http://www.flyandlure.org/sitemap.xml
4  http://flyandlure.org/copyright    monthly      1.0  flyandlure.org  http://www.flyandlure.org/sitemap.xml
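For example, to write the sitemap URLs to a CSV file (the filename here is just an example):

df.to_csv('sitemap_urls.csv', index=False)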
3. Get Core Web Vitals from PageSpeed Insights

The get_core_web_vitals() function retrieves the Core Web Vitals metrics for a list of sites from the Google PageSpeed Insights API and returns the results in a Pandas dataframe. The function requires a Google PageSpeed Insights API key.

from ecommercetools import seo

pagespeed_insights_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
urls = ['https://www.bbc.co.uk', 'https://www.bbc.co.uk/iplayer']
df = seo.get_core_web_vitals(pagespeed_insights_key, urls)
print(df.head())
4. Get Google Knowledge Graph data

The get_knowledge_graph() function returns the Google Knowledge Graph data for a given search term. This requires the use of a Google Knowledge Graph API key. By default, the function returns output in a Pandas dataframe, but you can pass the output="json" argument if you wish to receive the JSON data back.

from ecommercetools import seo

knowledge_graph_key = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
knowledge_graph = seo.get_knowledge_graph(knowledge_graph_key, "tesla", output="dataframe")
print(knowledge_graph)
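To receive the raw JSON response instead of a dataframe, pass the output="json" argument:

knowledge_graph_json = seo.get_knowledge_graph(knowledge_graph_key, "tesla", output="json")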
5. Get Google Search Console API data

The query_google_search_console() function runs a search query on the Google Search Console API and returns data in a Pandas dataframe. This function requires a JSON client secrets key with access to the Google Search Console API.

from ecommercetools import seo

key = "google-search-console.json"
site_url = "http://flyandlure.org"
payload = {
    'startDate': "2019-01-01",
    'endDate': "2019-12-31",
    'dimensions': ["page", "device", "query"],
    'rowLimit': 100,
    'startRow': 0
}

df = seo.query_google_search_console(key, site_url, payload)
print(df.head())

                                                page   device                          query  clicks  impressions    ctr  position
0  http://flyandlure.org/articles/fly_fishing_gea...   MOBILE  simms freestone waders review      56          217  25.81      3.12
1                             http://flyandlure.org/   MOBILE                   fly and lure      37          159  23.27      3.81
2  http://flyandlure.org/articles/fly_fishing_gea...  DESKTOP  orvis encounter waders review      35          134  26.12      4.04
3  http://flyandlure.org/articles/fly_fishing_gea...  DESKTOP  simms freestone waders review      35          200  17.50      3.50
4                             http://flyandlure.org/  DESKTOP                   fly and lure      32          170  18.82      3.09
Fetching all results from Google Search Console

To fetch all results, set fetch_all to True. This will automatically paginate through your Google Search Console data and return all results. Be aware that if you do this you may hit Google's quota limit if you run a query over an extended period, or have a busy site with lots of page or query dimensions.

from ecommercetools import seo

key = "google-search-console.json"
site_url = "http://flyandlure.org"
payload = {
    'startDate': "2019-01-01",
    'endDate': "2019-12-31",
    'dimensions': ["page", "device", "query"],
    'rowLimit': 25000,
    'startRow': 0
}

df = seo.query_google_search_console(key, site_url, payload, fetch_all=True)
print(df.head())

Comparing two time periods in Google Search Console
payload_before = {
    'startDate': "2021-07-21",
    'endDate': "2021-08-10",
    'dimensions': ["page", "query"],
}

payload_after = {
    'startDate': "2021-08-11",
    'endDate': "2021-08-31",
    'dimensions': ["page", "query"],
}

df = seo.query_google_search_console_compare(key, site_url, payload_before, payload_after, fetch_all=False)
df.sort_values(by='clicks_change', ascending=False).head()
6. Get the number of "indexed" pages

The get_indexed_pages() function uses the "site:" prefix to search Google for the number of pages "indexed". This is very approximate and may not be a perfect representation, but it's usually a good guide to a site's "size" in the absence of other data.

from ecommercetools import seo

urls = ['https://www.bbc.co.uk', 'https://www.bbc.co.uk/iplayer', 'http://flyandlure.org']
df = seo.get_indexed_pages(urls)
print(df.head())
                             url  indexed_pages
2          http://flyandlure.org           2090
1  https://www.bbc.co.uk/iplayer         215000
0          https://www.bbc.co.uk       12700000
7. Get keyword suggestions from Google Autocomplete

The google_autocomplete() function returns a set of keyword suggestions from Google Autocomplete. The include_expanded=True argument allows you to expand the number of suggestions shown by appending prefixes and suffixes to the search terms.

from ecommercetools import seo

suggestions = seo.google_autocomplete("data science", include_expanded=False)
print(suggestions)

suggestions = seo.google_autocomplete("data science", include_expanded=True)
print(suggestions)
                           term  relevance
0             data science jobs        650
1     data science jobs chester        601
2           data science course        600
3          data science masters        554
4           data science salary        553
5       data science internship        552
6      data science jobs london        551
7  data science graduate scheme        550
8. Retrieve robots.txt content

The get_robots() function returns the contents of a robots.txt file in a Pandas dataframe so it can be parsed and analysed.

from ecommercetools import seo

robots = seo.get_robots("http://www.flyandlure.org/robots.txt")
print(robots)
     directive         parameter
0   User-agent                 *
1     Disallow           /signin
2     Disallow           /signup
3     Disallow            /users
4     Disallow          /contact
5     Disallow         /activate
6     Disallow           /*/page
7     Disallow  /articles/search
8     Disallow       /search.php
9     Disallow              *q=*
10    Disallow  *category_slug=*
11    Disallow   *country_slug=*
12    Disallow    *county_slug=*
13    Disallow       *features=*
9. Get Google SERPs

The get_serps() function returns a Pandas dataframe containing the Google search engine results for a given search term. Note that this function is not suitable for large-scale scraping and currently includes no features to prevent it from being blocked.

from ecommercetools import seo

serps = seo.get_serps("data science blog")
print(serps)
                                               title                                               link                                               text
0  10 of the best data science blogs to follow - ...  https://www.tableau.com/learn/articles/data-sc...  10 of the best data science blogs to follow. T...
1  Best Data Science Blogs to Follow in 2020 | by...  https://towardsdatascience.com/best-data-scien...  14 Jul 2020 — 1. Towards Data Science · Joined...
2  Top 20 Data Science Blogs And Websites For Dat...  https://medium.com/@exastax/top-20-data-scienc...  Top 20 Data Science Blogs And Websites For Dat...
3                      Data Science Blog – Dataquest                     https://www.dataquest.io/blog/  Browse our data science blog to get helpful ti...
4  51 Awesome Data Science Blogs You Need To Chec...  https://365datascience.com/trending/51-data-sc...  Blog name: DataKind · datakind data science bl...
5  Blogs on AI, Analytics, Data Science, Machine ...     https://www.kdnuggets.com/websites/blogs.html  Individual/small group blogs · Ai4 blog, featu...
6           Data Science Blog – Applied Data Science                      https://data-science-blog.com/  ... an Bedeutung – DevOps for Data Science. De...
7  Top 10 Data Science and AI Blogs in 2020 - Liv...  https://livecodestream.dev/post/top-data-scien...  Some of the best data science and AI blogs for...
8  Data Science Blogs: 17 Must-Read Blogs for Dat...  https://www.thinkful.com/blog/data-science-blogs/  Data scientists could be considered the magici...
9  rushter/data-science-blogs: A curated list of ...     https://github.com/rushter/data-science-blogs  A curated list of data science blogs. Contribu...

To set the Google domain and host language, use the domain and host_language parameters. The example below searches for "bmw" on the German Google domain and returns the results in German.

df = seo.get_serps("bmw", pages=1, domain="google.de", host_language="de")
10. Create an ABCD classification of Google Search Console data

The classify_pages() function returns an ABCD classification of Google Search Console data. It calculates the cumulative sum of clicks and then categorises pages using the ABC algorithm: the pages generating the first 80% of clicks are classed A, the next 10% are classed B, and the final 10% are classed C, with zero-click pages classed D.
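For illustration, the underlying classification logic can be sketched in a few lines of Pandas. This is a minimal sketch assuming a dataframe with a clicks column; the abcd_classify helper below is hypothetical, and classify_pages handles all of this (plus querying the API) for you.

import pandas as pd

def abcd_classify(df, clicks_col='clicks'):
    # Sort pages by clicks, then compute each page's cumulative share of total clicks
    df = df.sort_values(clicks_col, ascending=False).copy()
    running_pc = df[clicks_col].cumsum() / df[clicks_col].sum() * 100

    # First 80% of clicks = A, next 10% = B, final 10% = C
    df['class'] = pd.cut(running_pc, bins=[0, 80, 90, 100], labels=['A', 'B', 'C']).astype(str)

    # Pages with zero clicks are classed D
    df.loc[df[clicks_col] == 0, 'class'] = 'D'
    return df

The classify_pages() function itself is used as follows.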

from ecommercetools import seo

key = "client_secrets.json"
site_url = "example-domain.co.uk"
start_date = '2022-10-01'
end_date = '2022-10-31'

df_classes = seo.classify_pages(key, site_url, start_date, end_date, output='classes')
print(df_classes.head())

df_summary = seo.classify_pages(key, site_url, start_date, end_date, output='summary')
print(df_summary)

                                                page  clicks  impressions    ctr  position  clicks_cumsum  clicks_running_pc  pc_share class  class_rank
0  https://practicaldatascience.co.uk/machine-lea...    3890        36577  10.64     12.64           3890           8.382898  8.382898     A           1
1  https://practicaldatascience.co.uk/data-scienc...    2414        16618  14.53     14.30           6304          13.585036  5.202138     A           2
2  https://practicaldatascience.co.uk/data-scienc...    2378        71496   3.33     16.39           8682          18.709594  5.124558     A           3
3  https://practicaldatascience.co.uk/data-scienc...    1942        14274  13.61     15.02          10624          22.894578  4.184984     A           4
4  https://practicaldatascience.co.uk/data-scienc...    1738        23979   7.25     11.80          12362          26.639945  3.745367     A           5


class  pages  impressions  clicks   avg_ctr  avg_position  share_of_clicks  share_of_impressions
0     A     63       747643   36980  5.126349     22.706825             79.7                  43.7
1     B     46       639329    4726  3.228043     31.897826             10.2                  37.4
2     C    190       323385    4698  2.393632     38.259368             10.1                  18.9
3     D     36         1327       0  0.000000     25.804722              0.0                   0.1

Reports

The Reports module creates weekly, monthly, quarterly, or yearly reports for customers and orders and calculates a range of common ecommerce metrics to show business performance.

1. Customers report

The customers_report() function takes a formatted dataframe of transaction items (see above) and a desired frequency (D for daily, W for weekly, M for monthly, Q for quarterly) and calculates aggregate metrics for each period.

The function returns the number of orders, the number of customers, the number of new customers, the number of returning customers, and the acquisition rate (or proportion of new customers). For monthly reporting, I would recommend a 13-month period so you can compare the last month with the same month the previous year.

from ecommercetools import reports

df_customers_report = reports.customers_report(transaction_items, frequency='M')
print(df_customers_report.head(13))
2. Transactions report

The transactions_report() function takes a formatted dataframe of transaction items (see above) and a desired frequency (D for daily, W for weekly, M for monthly, Q for quarterly) and calculates aggregate metrics for each period.

The metrics returned are: customers, orders, revenue, SKUs, units, average order value, average SKUs per order, average units per order, and average revenue per customer.

from ecommercetools import reports

df_orders_report = reports.transactions_report(transaction_items, frequency='M')
print(df_orders_report.head(13))
