import pandas as pd
from delab_trees import TreeManager

d = {'tree_id': [1] * 4,
     'post_id': [1, 2, 3, 4],
     'parent_id': [None, 1, 2, 1],
     'author_id': ["james", "mark", "steven", "john"],
     'text': ["I am James", "I am Mark", " I am Steven", "I am John"],
     "created_at": [pd.Timestamp('2017-01-01T01'),
                    pd.Timestamp('2017-01-01T02'),
                    pd.Timestamp('2017-01-01T03'),
                    pd.Timestamp('2017-01-01T04')]}
df = pd.DataFrame(data=d)
manager = TreeManager(df)  # creates one tree
test_tree = manager.random()

Note that the tree structure is built by matching each row's parent_id to another row's post_id (the root post, here post 1, has no parent).
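To make the linking rule concrete, here is a minimal sketch in plain Python. It does not use delab_trees itself, and the helper name `build_children` is hypothetical; it only groups the example rows by `parent_id`:

```python
from collections import defaultdict

# Same post_id / parent_id columns as the example DataFrame above.
rows = [
    {"post_id": 1, "parent_id": None},
    {"post_id": 2, "parent_id": 1},
    {"post_id": 3, "parent_id": 2},
    {"post_id": 4, "parent_id": 1},
]


def build_children(rows):
    """Map each post_id to the list of post_ids that reply to it."""
    children = defaultdict(list)
    for row in rows:
        if row["parent_id"] is not None:
            children[row["parent_id"]].append(row["post_id"])
    return dict(children)


children = build_children(rows)
roots = [row["post_id"] for row in rows if row["parent_id"] is None]
print(roots)     # [1]
print(children)  # {1: [2, 4], 2: [3]}
```

Note that once the data is in a pandas DataFrame, the missing parent of the root shows up as NaN rather than None, because the column mixes integers with a missing value.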

You can now analyze the reply tree's basic metrics:


from delab_trees.test_data_manager import get_test_tree
from delab_trees.delab_tree import DelabTree

test_tree: DelabTree = get_test_tree()
assert test_tree.total_number_of_posts() == 4
assert test_tree.average_branching_factor() > 0
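For intuition about what these two metrics measure, the sketch below computes them by hand for the four-post example. It assumes the test tree has the same shape as that example, and it uses one common definition of the average branching factor (the mean number of replies per post that has at least one reply). The library may define it differently.

```python
# Child lists for the example tree: post 1 has replies 2 and 4, post 2 has reply 3.
children = {1: [2, 4], 2: [3], 3: [], 4: []}

total_posts = len(children)

# Assumed definition: average number of children over non-leaf posts.
reply_counts = [len(kids) for kids in children.values() if kids]
average_branching = sum(reply_counts) / len(reply_counts)

print(total_posts)        # 4
print(average_branching)  # 1.5
```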

A summary of basic per-author metrics can be obtained by calling:


from delab_trees.test_data_manager import get_test_tree
from delab_trees.delab_tree import DelabTree

test_tree: DelabTree = get_test_tree()
print(test_tree.get_author_metrics())

# >>> removed [] and changed {} (merging subsequent posts of the same author)
# >>>{'james': <delab_trees.delab_author_metric.AuthorMetric object at 0x7fa9c5496110>, 'steven': <delab_trees.delab_author_metric.AuthorMetric object at 0x7fa9c5497dc0>, 'john': <delab_trees.delab_author_metric.AuthorMetric object at 0x7fa9c5497a00>, 'mark': <delab_trees.delab_author_metric.AuthorMetric object at 0x7fa9c5497bb0>}
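The printed dictionary only shows object addresses. A generic way to look inside such objects is Python's built-in `vars()`. The sketch below uses a hypothetical stand-in class with invented attribute names, since the attributes of `AuthorMetric` are not listed here; `vars()` works on any instance that stores its attributes in `__dict__`.

```python
class StandInMetric:
    """Hypothetical placeholder for an AuthorMetric-like object."""

    def __init__(self, posts_written, replies_received):
        self.posts_written = posts_written
        self.replies_received = replies_received


# Invented values, shaped like the {author_id: metric_object} dict above.
author_metrics = {
    "james": StandInMetric(1, 2),
    "mark": StandInMetric(1, 1),
}

# vars() returns the attribute dictionary of each instance.
readable = {author: vars(metric) for author, metric in author_metrics.items()}
print(readable["james"])  # {'posts_written': 1, 'replies_received': 2}
```

The same comprehension could be applied to the result of `test_tree.get_author_metrics()`, assuming the returned objects expose a `__dict__`.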

More complex metrics, which use the full dataset for training, can be obtained from the manager:


import pandas as pd
from delab_trees import TreeManager

d = {'tree_id': [1] * 4,
     'post_id': [1, 2, 3, 4],
     'parent_id': [None, 1, 2, 1],
     'author_id': ["james", "mark", "steven", "john"],
     'text': ["I am James", "I am Mark", " I am Steven", "I am John"],
     "created_at": [pd.Timestamp('2017-01-01T01'),
                    pd.Timestamp('2017-01-01T02'),
                    pd.Timestamp('2017-01-01T03'),
                    pd.Timestamp('2017-01-01T04')]}
df = pd.DataFrame(data=d)
manager = TreeManager(df)  # creates one tree
rb_vision_dictionary: dict["tree_id", dict["author_id", "vision_metric"]] = manager.get_rb_vision()
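The type hint describes a nested dictionary keyed first by tree and then by author. Here is a minimal sketch of how such a structure can be flattened into rows for further analysis. The scores are made-up placeholders, not real output of `get_rb_vision()`, and the helper name `flatten` is hypothetical:

```python
# Placeholder data shaped like dict[tree_id, dict[author_id, vision_metric]].
# The numbers are invented for illustration only.
rb_vision_example = {
    1: {"james": 0.0, "mark": 0.5, "steven": 0.25, "john": 0.5},
}


def flatten(nested):
    """Turn {tree_id: {author_id: value}} into (tree_id, author_id, value) rows."""
    return [
        (tree_id, author_id, value)
        for tree_id, per_author in nested.items()
        for author_id, value in per_author.items()
    ]


rows = flatten(rb_vision_example)
for tree_id, author_id, value in rows:
    print(tree_id, author_id, value)
```

A flat list of tuples like this is easy to pass to `pd.DataFrame(rows, columns=[...])` if a tabular view is preferred.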

The following two complex metrics are implemented:


from delab_trees.test_data_manager import get_test_manager

manager = get_test_manager()
rb_vision_dictionary = manager.get_rb_vision()  # predict an author having seen a post
pb_vision_dictionary = manager.get_pb_vision()  # predict an author to write the next post

How to cite


    @article{dehne_dtrees_23,
        author = {Dehne, Julian},
        title  = {Delab-Trees: measuring deliberation in online conversations},
        url    = {https://github.com/juliandehne/delab-trees},
        year   = {2023},
    }
