AnimatedWordCloud

Animated version of classic word cloud for time-series text data

Version 1.0.9 on PyPI. License: MIT.

A classic word cloud does not capture how text data changes over time. An animated word cloud improves on this by displaying text datasets collected over multiple periods in a single MP4 file. The core framework for animating word frequencies was developed by Michael Cane in the WordsSwarm project. AnimatedWordCloud makes that code work efficiently on a variety of text datasets in Latin-alphabet languages.

Installation

It requires Python 3.8, along with Box2D, beautifulsoup4, pygame, and PyQt6 for visualization, and Arabica and ftfy for text preprocessing.

To install using pip, use:

pip install AnimatedWordCloud

AnimatedWordCloud has been tested with PyCharm Community Edition. It is recommended to use this IDE and to run .py files instead of .ipynb notebooks.
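
After installing, it can help to confirm that the interpreter and the listed dependencies are visible to Python. The following is a minimal sketch, not part of AnimatedWordCloud itself; the import names in REQUIRED_MODULES (for example bs4 for beautifulsoup4) are assumptions about how each package is imported, and missing_modules is a hypothetical helper.

```python
import importlib.util
import sys

# Assumed import names for the dependencies listed above
# (pip package name -> module name, e.g. beautifulsoup4 -> bs4).
REQUIRED_MODULES = ["Box2D", "bs4", "pygame", "PyQt6", "arabica", "ftfy"]


def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def python_is_supported(minimum=(3, 8)):
    """Check that the running interpreter is at least the given version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    print("Python version OK:", python_is_supported())
    print("Missing modules:", missing_modules(REQUIRED_MODULES))
```

An empty list of missing modules suggests the environment is ready; anything listed can be installed with pip.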

Usage

  • Import the library:
from AnimatedWordCloud import animated_word_cloud
  • Generate frames:

animated_word_cloud generates 90 PNG word cloud images per period. It scales word frequencies so that word clouds can be displayed for text datasets of different sizes. Frames are stored in a newly created .post_processing/frames folder in the working directory. It currently provides unigram frequencies (bigram frequencies will be added later). It reads date and datetime formats in:

  • US-style: MM/DD/YYYY (2013-12-31, Feb-09-2009, 2013-12-31 11:46:17, etc.)
  • European-style: DD/MM/YYYY (2013-31-12, 09-Feb-2009, 2013-31-12 11:46:17, etc.)
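
To illustrate the difference between the two date conventions, here is a minimal sketch using only the standard library. It is not the library's own parser: parse_date and the two format lists are hypothetical, and they cover only the example patterns shown in the list above.

```python
from datetime import datetime

# Month-before-day patterns (assumed to correspond to date_format='us').
US_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%b-%d-%Y", "%Y-%m-%d %H:%M:%S"]
# Day-before-month patterns (assumed to correspond to date_format='eur').
EUR_FORMATS = ["%d/%m/%Y", "%Y-%d-%m", "%d-%b-%Y", "%Y-%d-%m %H:%M:%S"]


def parse_date(value, date_format):
    """Try each known pattern for the chosen convention and return a datetime."""
    if date_format == "us":
        formats = US_FORMATS
    elif date_format == "eur":
        formats = EUR_FORMATS
    else:
        raise ValueError("date_format must be 'us' or 'eur'")
    for fmt in formats:
        try:
            return datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # Pattern did not match; try the next one.
    raise ValueError(f"Unrecognized date: {value!r}")
```

With this sketch, parse_date("2013-12-31", "us") and parse_date("2013-31-12", "eur") both resolve to 31 December 2013, which is why choosing the right date_format matters for ambiguous dates.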

On input, it automatically removes punctuation and numbers from the data. It can also remove the standard stop word list(s) for languages in the NLTK stopwords corpus.
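
The cleaning step can be pictured roughly as follows. This is a sketch under stated assumptions, not the package's implementation: clean_tokens is a hypothetical helper, SAMPLE_STOPWORDS is a tiny stand-in for an NLTK stop word list, and lowercasing the text is an assumption made here for matching.

```python
import re

# Tiny stand-in for a real stop word list (the package uses NLTK's lists).
SAMPLE_STOPWORDS = {"the", "a", "and", "of"}


def clean_tokens(text, stopwords=SAMPLE_STOPWORDS, skip=()):
    """Strip punctuation and digits, then drop stop words and extra skip words."""
    lowered = text.lower()  # Assumption: case-insensitive matching.
    # Replace punctuation, digits, and underscores with spaces.
    letters_only = re.sub(r"[^\w\s]|[\d_]", " ", lowered)
    blocked = set(stopwords) | set(skip)
    return [token for token in letters_only.split() if token not in blocked]
```

For example, clean_tokens("The 2 cats, and the dog!") keeps only "cats" and "dog", and passing skip=["good"] drops that word in addition to the stop words.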

def animated_word_cloud(text: str,         # Text
                        time: str,         # Time
                        date_format: str,  # Date format: 'eur' - European, 'us' - American
                        ngram: int,        # N-gram order, 1 = unigram
                        freq: str,         # Aggregation period: 'Y'/'M'
                        stopwords: list,   # Languages for stop words
                        skip: list         # Remove additional stop words
)

To apply the method, use:

import pandas as pd
from AnimatedWordCloud import animated_word_cloud

data = pd.read_csv("data.csv")
animated_word_cloud(text = data['text'],                         # Read text column
                    time = data['date'],                         # Read date column
                    date_format = 'us',                          # Specify date format
                    ngram = 1,                                   # Show individual word frequencies
                    freq = 'Y',                                  # Yearly frequency
                    stopwords = ['english', 'german', 'french'], # Clean from English, German and French stop words
                    skip = ['good', 'bad', 'yellow'])            # Remove 'good', 'bad', and 'yellow' as additional stop words
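
To make the ngram and freq parameters more concrete, the sketch below counts unigrams per period and converts them to relative shares. How the package actually scales frequencies is not documented here, so dividing by the period total is an assumption, and period_key and unigram_shares are hypothetical names.

```python
from collections import Counter, defaultdict
from datetime import datetime


def period_key(date, freq):
    """Map a datetime to its aggregation period: 'Y' for year, 'M' for month."""
    if freq == "Y":
        return date.strftime("%Y")
    if freq == "M":
        return date.strftime("%Y-%m")
    raise ValueError("freq must be 'Y' or 'M'")


def unigram_shares(texts, dates, freq="Y"):
    """Count single words per period and scale them to shares of the period total."""
    counts = defaultdict(Counter)
    for text, date in zip(texts, dates):
        counts[period_key(date, freq)].update(text.lower().split())
    shares = {}
    for period, counter in counts.items():
        total = sum(counter.values())
        shares[period] = {word: n / total for word, n in counter.items()}
    return shares
```

Scaling to shares is one way to keep a period with many documents from dominating a period with few, which is the kind of problem the frequency scaling mentioned above addresses.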

  • Create video from frames:

Download the ffmpeg folder and the frames2video.bat file from here and place them into the postprocessing folder. Then run frames2video.bat to generate wordSwarmOut.mp4, the final output.
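
For readers who are not on Windows or prefer to call ffmpeg directly, the same step might be sketched in Python as below. This is an alternative under stated assumptions, not the contents of frames2video.bat: the frame file pattern (%05d.png), the frame rate of 30, and the helper names are all hypothetical, and ffmpeg must already be on the PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(frames_dir, output_file, fps=30, pattern="%05d.png"):
    """Build an ffmpeg command that encodes numbered PNG frames into an MP4."""
    return [
        "ffmpeg",
        "-y",                                  # Overwrite the output if it exists.
        "-framerate", str(fps),                # Input frame rate (assumed value).
        "-i", str(Path(frames_dir) / pattern), # Numbered frame files (assumed naming).
        "-c:v", "libx264",                     # Encode with H.264.
        "-pix_fmt", "yuv420p",                 # Pixel format most players accept.
        str(output_file),
    ]


def render_video(frames_dir, output_file, fps=30):
    """Run ffmpeg; raises CalledProcessError if encoding fails."""
    subprocess.run(build_ffmpeg_command(frames_dir, output_file, fps), check=True)
```

With 90 frames per period, a frame rate of 30 would show each period for about three seconds; adjust fps and pattern to match the actual frame files.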

Documentation, examples and tutorials

Data Storytelling with Animated Word Clouds

  • For more examples of coding, read these tutorials: TBA

Here are examples of animated word clouds:

  • Research trends in Economics (YouTube)
  • European Central Bankers' speeches (YouTube)


Please visit here for any questions, issues, bugs, and suggestions.
