tm-eval

Topic Modeling Evaluation

  • 0.0.2
  • Source
  • PyPI

A toolkit to quickly evaluate topic model goodness over a range of topic counts.
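
Since tm-eval is published on PyPI, it should be installable with pip:

pip install tm-eval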

Metrics

The following coherence measures are supported:

  • 'u_mass' is the fastest method; 'c_uci' is also known as 'c_pmi'.

  • For 'u_mass', a corpus should be provided; if texts are provided instead, they are converted to a corpus using the dictionary.

  • For 'c_v', 'c_uci' and 'c_npmi', texts should be provided (a corpus isn't needed). See the sketch after this list for how each measure consumes its input.
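
The sketch below is not part of tm-eval's API; it assumes the metrics are computed with gensim's CoherenceModel, which uses these same measure names:

from gensim.corpora import Dictionary
from gensim.models import LdaModel
from gensim.models.coherencemodel import CoherenceModel

# toy tokenized documents
texts = [["fever", "cough", "fatigue"],
         ["cough", "headache", "fever"],
         ["fatigue", "headache", "cough"]]
dictionary = Dictionary(texts)
corpus = [dictionary.doc2bow(doc) for doc in texts]
lda = LdaModel(corpus=corpus, id2word=dictionary, num_topics=2, random_state=1)

# 'u_mass' works from the bag-of-words corpus alone
u_mass = CoherenceModel(model=lda, corpus=corpus, dictionary=dictionary,
                        coherence="u_mass").get_coherence()

# 'c_v', 'c_uci' and 'c_npmi' need the tokenized texts
c_v = CoherenceModel(model=lda, texts=texts, dictionary=dictionary,
                     coherence="c_v").get_coherence()
print(u_mass, c_v)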

Examples

Example 1: Estimate metrics for one topic model with a specific number of topics

from tm_eval import *

# input: a pickled dict mapping each document ID to its comma-separated term list
input_file = "datasets/covid19_symptoms.pickle"
output_folder = "outputs"
model_name = "symptom"
num_topics = 10

# evaluate all coherence metrics for a single LDA model
results = evaluate_all_metrics_from_lda_model(input_file=input_file,
                                              output_folder=output_folder,
                                              model_name=model_name,
                                              num_topics=num_topics)
print(results)
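
The comments above suggest the input pickle holds a dict that maps each document ID to a comma-separated term string. A minimal sketch of preparing such a file (this format is inferred from the comments, not confirmed by the tm-eval docs):

import pickle

# hypothetical input: {document_id: "term1,term2,..."}
docs = {
    "doc1": "fever,cough,fatigue",
    "doc2": "cough,headache",
    "doc3": "fever,fatigue",
}
with open("datasets/covid19_symptoms.pickle", "wb") as f:
    pickle.dump(docs, f)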

Example 2: Find how model goodness changes over the number of topics

from tm_eval import *

if __name__ == "__main__":
    # start configure
    # input: a pickled dict mapping each document ID to its comma-separated term list
    input_file = "datasets/covid19_symptoms.pickle"
    output_folder = "outputs"
    model_name = "symptom"
    start = 2  # smallest number of topics to try
    end = 5    # largest number of topics to try
    # end configure

    # fit and evaluate one model per topic count
    list_results = explore_topic_model_metrics(input_file=input_file,
                                               output_folder=output_folder,
                                               model_name=model_name,
                                               start=start,
                                               end=end)

    # summarize results into a CSV table
    show_topic_model_metric_change(list_results, save=True,
                                   save_path=f"{output_folder}/metrics.csv")

    # plot metric changes over the number of topics
    plot_tm_metric_change(csv_path=f"{output_folder}/metrics.csv",
                          save=True, save_folder=output_folder)
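
Once metrics.csv exists, a natural follow-up is to pick the topic count with the best score. A sketch assuming the exported table has num_topics and c_v columns (the column names are an assumption, not confirmed by the docs):

import pandas as pd

df = pd.read_csv("outputs/metrics.csv")
# "num_topics" and "c_v" are assumed column names in the exported CSV
best = df.loc[df["c_v"].idxmax()]
print("best number of topics by c_v:", best["num_topics"])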

Output results

The run saves one metric-change plot per coherence measure: c_v, u_mass, c_npmi, and c_uci.

License

The tm-eval toolkit is provided by Donghua Chen under the MIT License.

References

  1. Topic Modeling in Python: Latent Dirichlet Allocation (LDA)
  2. Evaluate Topic Models: Latent Dirichlet Allocation (LDA)
