MTSB
MTSB (Movie Tweet Sentiment Boxoffice) is a Python module that collects tweets about movies, runs sentiment analysis on them, and correlates the results with each movie's box office takings over the 7 days after its release.
Features
- Collects tweets about movies
- Creates hashtags for each movie
- Performs sentiment analysis on those tweets using Google's API or TextBlob and returns the average score and the average magnitude
- Gets box office data from Box Office Mojo
- Correlates the sentiment analysis results with the box office data
Requirements
- Python >= 3.5 (older versions might work but have not been tested)
- The package has only been tested on Linux, with the following Docker Compose environment: https://gitlab.com/aletundo/data-management-lab
- All Python dependencies are installed along with the package, but you also need the following services installed (tested on Linux):
- JupyterLab
- MongoDB
- Apache NiFi
- Apache Kafka
Installation
To install MTSB, run:
pip install mtsb
Docs
- tweet_collector()
Collects tweets about movies. It lets you choose between movies released in 2019 and movies releasing in 2020, then builds a list of hashtags from each movie's name and top actors and uses that list to collect tweets from Twitter.
import mtsb
mtsb.tweet_collector()
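The hashtag step described above can be sketched as follows. This is only an illustration of the idea, not MTSB's actual code: the helper name make_hashtags, the extra "Movie" suffix variant, and the ASCII-only cleanup are all assumptions.

```python
import re


def make_hashtags(title, actors):
    """Build a de-duplicated list of hashtags from a title and actor names.

    Hypothetical helper: MTSB's real hashtag rules are not documented here.
    """
    def to_tag(text):
        # Drop apostrophes, then keep runs of ASCII letters and digits only.
        words = re.findall(r"[0-9A-Za-z]+", text.replace("'", ""))
        body = "".join(word[:1].upper() + word[1:] for word in words)
        return "#" + body if body else None

    # Assumed variants: the title, the title plus "Movie", and each actor.
    candidates = [to_tag(title), to_tag(title + " Movie")]
    candidates += [to_tag(name) for name in actors]

    seen, tags = set(), []
    for tag in candidates:
        # Compare case-insensitively so "#Joker" and "#joker" count once.
        if tag and tag.lower() not in seen:
            seen.add(tag.lower())
            tags.append(tag)
    return tags


print(make_hashtags("Knives Out", ["Daniel Craig", "Ana de Armas"]))
```

Keeping the order stable and de-duplicating case-insensitively means the same movie always yields the same search terms.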
- sentiment()
Performs sentiment analysis on the collected tweets using Google's API or TextBlob and returns the average score, the average magnitude, their standard deviations, and the percentage of positive tweets.
import mtsb
mtsb.sentiment()
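To make the returned statistics concrete, here is a minimal sketch of the aggregation step, assuming every tweet has already been scored as a (score, magnitude) pair. The function name summarize_sentiment, the dictionary keys, and the use of the population standard deviation are assumptions, not MTSB's documented behaviour.

```python
from statistics import mean, pstdev


def summarize_sentiment(results):
    """Aggregate per-tweet (score, magnitude) pairs into summary statistics.

    Hypothetical helper; key names and the population std are assumptions.
    """
    if not results:
        raise ValueError("no tweets to summarize")
    scores = [score for score, _ in results]
    magnitudes = [magnitude for _, magnitude in results]
    # A tweet with score == 0 is counted as positive.
    positive = sum(1 for score in scores if score >= 0)
    return {
        "score_mean": mean(scores),
        "score_std": pstdev(scores),
        "magnitude_mean": mean(magnitudes),
        "magnitude_std": pstdev(magnitudes),
        "positive_pct": 100.0 * positive / len(scores),
    }


summary = summarize_sentiment([(0.5, 1.0), (-0.5, 1.0), (0.0, 2.0), (1.0, 0.0)])
print(summary)
```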
- sentiment_boxoffice_all()
Creates a dataframe with the following info for each movie:
* Movie title and genres
* Mean and standard deviation of the tweets' scores and magnitudes
* Percentage of tweets labelled positive and negative (a tweet with score == 0 is labelled positive)
* Total box office takings over the 7 days after the movie's release
import mtsb
df = mtsb.sentiment_boxoffice_all()
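The 7-day box office total can be sketched as below, assuming the daily grosses are available as a mapping from date to gross. The helper name first_week_gross is hypothetical, and the sketch assumes the window starts on the release day itself, which may differ from how MTSB counts the days.

```python
from datetime import date, timedelta


def first_week_gross(daily_gross, release_date, days=7):
    """Sum daily grosses for `days` consecutive days starting at release_date.

    Hypothetical helper; assumes the release day is day 1 of the window.
    """
    window = [release_date + timedelta(days=offset) for offset in range(days)]
    # Days missing from the data contribute nothing to the total.
    return sum(daily_gross.get(day, 0) for day in window)


# Made-up figures, for illustration only.
grosses = {
    date(2019, 11, 27): 100,
    date(2019, 11, 28): 200,
    date(2019, 12, 3): 50,
    date(2019, 12, 4): 999,  # eighth day, outside the window
}
print(first_week_gross(grosses, date(2019, 11, 27)))
```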
- spearman_corr(df)
Computes the Spearman correlation between the sentiment statistics and the box office totals, using the DataFrame returned by sentiment_boxoffice_all().
mtsb.spearman_corr(df)
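For readers unfamiliar with the statistic, Spearman's correlation is the Pearson correlation computed on the ranks of the values rather than on the values themselves. The pure-Python sketch below shows that idea; it is not how spearman_corr() is implemented, and the function names are illustrative. In practice, scipy.stats.spearmanr or pandas' DataFrame.corr(method="spearman") would normally be used instead.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for position in range(start, end + 1):
            ranks[order[position]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


print(spearman([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

Because only ranks matter, the result is unaffected by the scale of the box office figures and is less sensitive to a single blockbuster outlier than a Pearson correlation on the raw values.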
Links
Acknowledgements
Useful Python libraries used:
Licence
MIT licensed. See the bundled LICENSE file for more details.