
A lightweight and efficient library for transforming emoticons into their semantic meanings. This is particularly useful for NLP preprocessing where emoticons need to be preserved as meaningful text.
An emoticon (short for "emotion icon") is a pictorial representation of a facial expression using characters—usually punctuation marks, numbers, and letters—to express a person's feelings or mood. The first ASCII emoticons, :-) and :-(, were written by Scott Fahlman in 1982, but emoticons actually originated on the PLATO IV computer system in 1972.
Kaomoji (顔文字) are Japanese emoticons that are read horizontally and are more elaborate than traditional Western emoticons. They often use Unicode characters to create more complex expressions and can represent a wider range of emotions and actions. For example, (。♥‿♥。) represents being in love, and (ノ°益°)ノ shows rage. Unlike Western emoticons, which are read by tilting your head sideways, kaomoji are meant to be viewed straight on.
emoticon_fix supports a wide variety of kaomoji, making it particularly useful for processing text from Asian social media or any platform where kaomoji are commonly used.
When preprocessing text for NLP models, simply removing punctuation can leave emoticons and kaomoji as meaningless characters. For example, :D (laugh) would become just D, and (。♥‿♥。) (in love) would be completely lost. This can negatively impact model performance. By transforming emoticons and kaomoji into their textual meanings, we preserve the emotional context in a format that's more meaningful for NLP tasks.
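To see the problem concretely, here is a minimal sketch using only the standard library (the regex is just one common way of stripping punctuation, not something emoticon_fix does):
import re
# a naive "strip punctuation" pass, as is often done in NLP preprocessing
text = 'Great day :D (。♥‿♥。)'
stripped = re.sub(r'[^\w\s]', '', text)
print(stripped)  # the laugh survives only as a bare 'D' and the kaomoji all but vanishes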
pip install emoticon-fix
from emoticon_fix import emoticon_fix, remove_emoticons, replace_emoticons
# Basic usage - transform emoticons to their meanings
text = 'Hello :) World :D'
result = emoticon_fix(text)
print(result) # Output: 'Hello Smile World Laugh'
# Remove emoticons completely
stripped_text = remove_emoticons(text)
print(stripped_text) # Output: 'Hello World'
# Replace with NER-friendly tags (customizable format)
ner_text = replace_emoticons(text, tag_format="__EMO_{tag}__")
print(ner_text) # Output: 'Hello __EMO_Smile__ World __EMO_Laugh__'
# Works with multiple emoticons
text = 'I am :-) but sometimes :-( and occasionally :-D'
result = emoticon_fix(text)
print(result) # Output: 'I am Smile but sometimes Sad and occasionally Laugh'
from emoticon_fix import analyze_sentiment, get_sentiment_score, classify_sentiment
# Analyze sentiment of emoticons in text
text = "Having a great day :) :D!"
analysis = analyze_sentiment(text)
print(f"Sentiment: {analysis.classification}") # "Very Positive"
print(f"Score: {analysis.average_score:.3f}") # "0.800"
# Get just the sentiment score (-1.0 to 1.0)
score = get_sentiment_score("Happy :) but sad :(")
print(score) # 0.05 (slightly positive)
# Get sentiment classification
classification = classify_sentiment("Love this (。♥‿♥。) so much!")
print(classification) # "Very Positive"
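The aggregate score is presumably an average over per-emoticon scores (extract_emotions, shown further below, exposes these). The values here are hypothetical, chosen only to reproduce the 0.05 above; the library's actual weights may differ:
# hypothetical per-emoticon scores, NOT the library's real values
smile_score, sad_score = 0.6, -0.5
print((smile_score + sad_score) / 2)  # 0.05, matching get_sentiment_score above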
The analytics extension provides comprehensive emoticon usage analysis:
from emoticon_fix import (
    get_emoticon_statistics,
    create_emotion_profile,
    compare_emotion_profiles,
    get_emoticon_trends
)
# Get detailed statistics about emoticon usage
text = "Happy :) very :) extremely :D and sometimes sad :("
stats = get_emoticon_statistics(text)
print(f"Total emoticons: {stats.total_emoticons}") # 4
print(f"Unique emoticons: {stats.unique_emoticons}") # 3
print(f"Dominant emotion: {stats.dominant_emotion}") # "Smile"
print(f"Average sentiment: {stats.average_sentiment:.3f}") # 0.525
print(f"Emoticon density: {stats.get_emoticon_density():.1f}%") # per 100 chars
# Get top emoticons and emotions
print("Top emoticons:", stats.get_top_emoticons(3))
print("Top emotions:", stats.get_top_emotions(3))
Create comprehensive emotion profiles for users or text collections:
# Create emotion profile from multiple texts
texts = [
    "Having a great day :) :D",
    "Feeling sad today :(",
    "Mixed emotions :) but also :/ sometimes",
    "Super excited! :D :D (。♥‿♥。)"
]
profile = create_emotion_profile(texts, "User Profile")
print(f"Profile: {profile.name}")
print(f"Texts analyzed: {profile.texts_analyzed}")
print(f"Total emoticons: {profile.total_emoticons}")
print(f"Overall sentiment: {profile.get_overall_sentiment():.3f}")
print(f"Emotion diversity: {profile.get_emotion_diversity():.3f}")
print(f"Sentiment consistency: {profile.get_sentiment_consistency():.3f}")
# Get dominant emotions across all texts
dominant_emotions = profile.get_dominant_emotions(5)
print("Dominant emotions:", dominant_emotions)
Compare emotion patterns between different users or text collections:
# Create multiple profiles
happy_user = create_emotion_profile([
    "Great day :D", "So happy :)", "Love this! (。♥‿♥。)"
], "Happy User")
sad_user = create_emotion_profile([
    "Feeling down :(", "Bad day :(", "Not good :("
], "Sad User")
mixed_user = create_emotion_profile([
    "Happy :) but worried :(", "Good :) and bad :(", "Mixed feelings :/ :)"
], "Mixed User")
# Compare profiles
comparison = compare_emotion_profiles([happy_user, sad_user, mixed_user])
print(f"Profiles compared: {comparison['profiles_compared']}")
print("Sentiment range:", comparison['overall_comparison']['sentiment_range'])
print("Diversity range:", comparison['overall_comparison']['diversity_range'])
# Individual profile summaries
for profile in comparison['profile_summaries']:
    print(f"{profile['name']}: sentiment={profile['overall_sentiment']:.3f}")
Analyze emoticon trends across multiple texts or time periods:
# Analyze trends across multiple texts
texts = [
    "Day 1: Excited to start :D",
    "Day 2: Going well :)",
    "Day 3: Some challenges :/",
    "Day 4: Feeling better :)",
    "Day 5: Great finish :D :D"
]
labels = [f"Day {i+1}" for i in range(len(texts))]
trends = get_emoticon_trends(texts, labels)
print(f"Total texts analyzed: {trends['total_texts']}")
print("Sentiment trend:", trends['trend_summary']['sentiment_trend'])
print("Average sentiment:", trends['trend_summary']['average_sentiment_across_texts'])
# Most common emotions across all texts
print("Most common emotions:", trends['trend_summary']['most_common_emotions'])
The sentiment analysis extension provides powerful emotion detection capabilities:
from emoticon_fix import analyze_sentiment, extract_emotions, batch_analyze
# Detailed sentiment analysis
text = "Mixed feelings :) but also :( about this"
analysis = analyze_sentiment(text)
print(analysis.summary())
# Extract individual emotions
emotions = extract_emotions("Happy :) but worried :(")
for emoticon, emotion, score in emotions:
    print(f"'{emoticon}' → {emotion} (score: {score:.3f})")
# Batch processing
texts = ["Happy :)", "Sad :(", "Excited :D"]
results = batch_analyze(texts)
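A minimal sketch of consuming the batch results, assuming each entry mirrors the analysis object returned by analyze_sentiment above:
# assumption: batch_analyze returns one analysis object per input text, in order
for text, analysis in zip(texts, results):
    print(f"{text!r}: {analysis.classification}")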
Beyond the summary statistics shown above, the analytics extension also exposes per-emoticon positions, sentiment distributions, and incrementally built custom profiles:
from emoticon_fix import get_emoticon_statistics, EmoticonProfile
# Detailed emoticon statistics
text = "Super happy :D today! Great mood :) and excited (。♥‿♥。) for later!"
stats = get_emoticon_statistics(text)
# Access detailed information
print(f"Emoticon positions: {stats.emoticon_positions}")
print(f"Sentiment distribution: {stats.sentiment_distribution}")
print(f"Top 3 emoticons: {stats.get_top_emoticons(3)}")
print(f"Analysis timestamp: {stats.analysis_timestamp}")
# Create custom profile
profile = EmoticonProfile("Custom Analysis")
profile.add_text("First text :)", "text_1")
profile.add_text("Second text :(", "text_2")
profile.add_text("Third text :D", "text_3")
print(f"Emotion diversity: {profile.get_emotion_diversity():.3f}")
print(f"Sentiment consistency: {profile.get_sentiment_consistency():.3f}")
Export analysis results for further processing or visualization:
from emoticon_fix import export_analysis, get_emoticon_statistics, create_emotion_profile
# Export statistics to JSON
text = "Happy :) day with multiple :D emoticons!"
stats = get_emoticon_statistics(text)
# Export to JSON (default format)
json_file = export_analysis(stats, format="json", filename="emoticon_stats.json")
print(f"Exported to: {json_file}")
# Export to CSV
csv_file = export_analysis(stats, format="csv", filename="emoticon_stats.csv")
print(f"Exported to: {csv_file}")
# Export emotion profile
texts = ["Happy :)", "Sad :(", "Excited :D"]
profile = create_emotion_profile(texts, "Sample Profile")
profile_file = export_analysis(profile, format="json", filename="emotion_profile.json")
# Auto-generate filename with timestamp
auto_file = export_analysis(stats) # Creates: emoticon_analysis_YYYYMMDD_HHMMSS.json
JSON Export: Complete data structure with all metrics and metadata
{
    "total_emoticons": 3,
    "unique_emoticons": 2,
    "emoticon_density": 12.5,
    "emoticon_frequency": {":)": 2, ":D": 1},
    "emotion_frequency": {"Smile": 2, "Laugh": 1},
    "sentiment_distribution": {"positive": 3, "negative": 0, "neutral": 0},
    "average_sentiment": 0.8,
    "dominant_emotion": "Smile",
    "analysis_timestamp": "2024-01-15T10:30:00"
}
CSV Export: Structured tabular format for spreadsheet analysis
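A minimal sketch of reading the exported CSV back with the standard library (the exact column layout depends on the export format, so this just prints whatever the file contains):
import csv
# inspect the rows produced by export_analysis(..., format="csv")
with open('emoticon_stats.csv', newline='') as f:
    for row in csv.reader(f):
        print(row)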
from emoticon_fix import emoticon_fix
text = 'test :) test :D test'
result = emoticon_fix(text)
print(result) # Output: 'test Smile test Laugh test'
from emoticon_fix import emoticon_fix
text = 'Feeling (。♥‿♥。) today! When things go wrong ┗(^0^)┓ keep dancing!'
result = emoticon_fix(text)
print(result) # Output: 'Feeling In Love today! When things go wrong Dancing Joy keep dancing!'
from emoticon_fix import emoticon_fix
text = 'Western :) meets Eastern (◕‿◕✿) style!'
result = emoticon_fix(text)
print(result) # Output: 'Western Smile meets Eastern Sweet Smile style!'
from emoticon_fix import remove_emoticons
text = 'This message :D contains some (。♥‿♥。) emoticons that need to be removed!'
result = remove_emoticons(text)
print(result) # Output: 'This message contains some emoticons that need to be removed!'
from emoticon_fix import replace_emoticons
# Default format: __EMO_{tag}__
text = 'Happy customers :) are returning customers!'
result = replace_emoticons(text)
print(result) # Output: 'Happy customers __EMO_Smile__ are returning customers!'
# Custom format
text = 'User feedback: Product was great :D but shipping was slow :('
result = replace_emoticons(text, tag_format="<EMOTION type='{tag}'>")
print(result) # Output: 'User feedback: Product was great <EMOTION type='Laugh'> but shipping was slow <EMOTION type='Sad'>'
from emoticon_fix import create_emotion_profile, compare_emotion_profiles, export_analysis
# Analyze social media posts from different users
user1_posts = [
    "Amazing product! :D Love it!",
    "Great customer service :)",
    "Highly recommended! (。♥‿♥。)"
]
user2_posts = [
    "Product was okay :/",
    "Shipping was slow :(",
    "Could be better... :/"
]
user3_posts = [
    "Mixed experience :) good product but :( bad delivery",
    "Happy with purchase :) but upset about delay :(",
    "Overall satisfied :) despite issues :/"
]
# Create emotion profiles
user1_profile = create_emotion_profile(user1_posts, "Satisfied Customer")
user2_profile = create_emotion_profile(user2_posts, "Dissatisfied Customer")
user3_profile = create_emotion_profile(user3_posts, "Mixed Customer")
# Compare profiles
comparison = compare_emotion_profiles([user1_profile, user2_profile, user3_profile])
# Export results
export_analysis(comparison, format="json", filename="customer_sentiment_analysis.json")
print("Customer sentiment analysis completed!")
print(f"Satisfied customer sentiment: {user1_profile.get_overall_sentiment():.3f}")
print(f"Dissatisfied customer sentiment: {user2_profile.get_overall_sentiment():.3f}")
print(f"Mixed customer sentiment: {user3_profile.get_overall_sentiment():.3f}")
from emoticon_fix import get_emoticon_trends, export_analysis
# Analyze emotional progression over time
weekly_posts = [
    "Week 1: Starting new job :) excited!",
    "Week 2: Learning lots :D challenging but fun!",
    "Week 3: Feeling overwhelmed :( too much work",
    "Week 4: Getting better :) finding my rhythm",
    "Week 5: Confident now :D loving the work!",
    "Week 6: Stress again :( big project deadline",
    "Week 7: Relief! :D Project completed successfully!",
    "Week 8: Balanced now :) happy with progress"
]
week_labels = [f"Week {i+1}" for i in range(len(weekly_posts))]
trends = get_emoticon_trends(weekly_posts, week_labels)
# Export trend analysis
export_analysis(trends, format="json", filename="emotional_journey.json")
print("Emotional journey analysis:")
sentiment_trend = trends['trend_summary']['sentiment_trend']
for i, sentiment in enumerate(sentiment_trend):
    print(f"Week {i+1}: {sentiment:.3f}")
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
1. Create your feature branch (git checkout -b feature/amazing-feature)
2. Commit your changes (git commit -m 'Add some amazing feature')
3. Push to the branch (git push origin feature/amazing-feature)
4. Open a Pull Request
The package includes a comprehensive test suite. To run the tests:
pip install -e ".[dev]"
pytest
This project is licensed under the MIT License - see the LICENSE file for details.