
@yoctol/analytics
Module | Description |
---|---|
yoctol-analytics | Yoctol's analytics module, which supports chatbot message and Facebook post analytics. |
yoctol-analytics-sdk | Analytics SDK for chatbot to track chat logs. |
With this analytics module, you can generate analytics tables from your chatbot logs.
$ npm install @yoctol/analytics
$ npm install @yoctol/analytics-sdk
First install the SDK and log data into the bot database using a monk or knex client. The analytics module can then generate the corresponding files based on the reporter settings.
interactionLogs
interaction_logs
switchToHumanPayloads is only supported on the messenger platform.
{
  reporterSettings: [
    {
      title: '總覽', // "Overview"
      className: 'Statistics',
      config: {
        switchToHumanPayloads: {
          postback: ['__SWITCH_TO_HUMAN__'],
          quick_reply: ['__SWITCH_TO_HUMAN__', '__INTENT_轉接專人__'],
        },
      },
    },
    {
      title: '互動次數(每日)', // "Interaction count (daily)"
      className: 'Histogram',
      config: {
        switchToHumanPayloads: {
          postback: ['__SWITCH_TO_HUMAN__'],
          quick_reply: ['__SWITCH_TO_HUMAN__', '__INTENT_轉接專人__'],
        },
        periodMinutes: 24 * 60,
      },
    },
    {
      title: '互動次數(每小時)', // "Interaction count (hourly)"
      className: 'Histogram',
      config: {
        switchToHumanPayloads: {
          postback: ['__SWITCH_TO_HUMAN__'],
          quick_reply: ['__SWITCH_TO_HUMAN__', '__INTENT_轉接專人__'],
        },
        periodMinutes: 60,
      },
    },
    {
      title: '按鈕觸發次數', // "Button trigger count"
      className: 'Postback',
      config: {},
    },
    {
      title: '訊息記錄', // "Message log"
      className: 'Log',
      config: {},
    },
  ],
}
const logger = new AnalyticsLogger({ knexClient, monkClient, logDbName });
logger.insertLog({ id, platform, platformChannelId, direction, event, triggers })
Parameter | Description |
---|---|
id | analytics request id |
platform | platform name, available values 'line', 'messenger', 'universal' |
platformChannelId | Bot binding channel (for LINE)/page id (for Facebook) |
direction | user message (incoming) or bot message (outgoing), available values: 'incoming', 'outgoing' |
event | raw event from the specified platform. |
triggers | trigger context from NLU triggers (including intent-entity model, regular expression and keywords), see here |
kuratorProjectId | optional (special case for Fubon) |
triggers: trigger | [trigger] // single object or array of objects
trigger: {
  intentId: "NLU INTENT ID",
  intentName: "NLU INTENT NAME",
  entityId: "NLU ENTITY ID",
  entityName: "NLU ENTITY NAME",
  entityValueId: "NLU ENTITY VALUE ID",
  entityValueName: "NLU ENTITY VALUE NAME",
  regexp: "^REGEXP$/g",
  keywords: ["KEYWORD1", "KEYWORD2", "KEYWORD3"],
  displayName: "ACTION NAME WHEN TRIGGERED BY PAYLOAD OR UNKNOWN",
  actionId: "ACTION ID WHEN TRIGGERED BY PAYLOAD OR UNKNOWN",
  isFallback: Boolean // whether the message was unknown (fallback) or not
}
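As a minimal sketch of how the parameters above fit together, the following helper assembles and validates the object passed to `logger.insertLog()`. The helper name `buildLogEntry` and its validation rules are assumptions made for illustration; they are not part of the SDK.

```javascript
// Hypothetical helper (not part of @yoctol/analytics-sdk): checks the
// documented fields and returns the object to hand to logger.insertLog().
const PLATFORMS = ['line', 'messenger', 'universal'];
const DIRECTIONS = ['incoming', 'outgoing'];

function buildLogEntry({ id, platform, platformChannelId, direction, event, triggers }) {
  if (!PLATFORMS.includes(platform)) {
    throw new Error(`Unsupported platform: ${platform}`);
  }
  if (!DIRECTIONS.includes(direction)) {
    throw new Error(`Unsupported direction: ${direction}`);
  }
  // `triggers` may be a single trigger object or an array; normalize to an array.
  const triggerList = triggers == null ? [] : [].concat(triggers);
  return { id, platform, platformChannelId, direction, event, triggers: triggerList };
}

const entry = buildLogEntry({
  id: 'request-1',
  platform: 'messenger',
  platformChannelId: 'PAGE_ID',
  direction: 'incoming',
  event: { message: { text: 'hello' } },
  triggers: { intentId: 'INTENT_ID', intentName: 'greeting', isFallback: false },
});
console.log(entry.triggers.length); // 1
```

The returned object could then be passed as `logger.insertLog(entry)`.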
Before messages enter the analytics pipeline, an adapter transforms each platform-specific message into a generic format:
{
id: STRING (uuid)
direction: STRING ('incoming' | 'outgoing')
type: STRING ('text' | 'quick_reply' | 'button' | ...)
platform: STRING ('messenger' | 'line' | 'universal' | ...)
text: STRING
postback: STRING
context: {
trigger: JSON
... extendibleFields
},
raw: JSON
... extendibleFields
}
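To make the generic format concrete, here is a rough sketch of an adapter for Messenger webhook events. The function name `adaptMessengerLog` and the mapping rules (for example, treating a postback as type 'button') are assumptions for illustration, not the module's actual adapter.

```javascript
// Sketch only: maps one logged Messenger event to the generic message format.
// The field mapping is an assumption, not the module's real adapter.
function adaptMessengerLog({ id, direction, event, triggers }) {
  const message = event.message || {};
  const quickReply = message.quick_reply;

  let type = 'text';
  let postback;
  if (quickReply) {
    type = 'quick_reply';
    postback = quickReply.payload;
  } else if (event.postback) {
    type = 'button';
    postback = event.postback.payload;
  }

  return {
    id,
    direction,
    type,
    platform: 'messenger',
    text: message.text,
    postback,
    context: { trigger: triggers },
    raw: event, // keep the untouched platform event for later inspection
  };
}

const generic = adaptMessengerLog({
  id: 'log-1',
  direction: 'incoming',
  event: { message: { text: 'Talk to a human', quick_reply: { payload: '__SWITCH_TO_HUMAN__' } } },
  triggers: [],
});
console.log(generic.type, generic.postback); // quick_reply __SWITCH_TO_HUMAN__
```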
The analytics module for Yoctol chatbots is arranged as a pipeline.
Raw message logs from the database (MongoDB/MSSQL) pass through each component in turn (adapter, analyzers, reporters, writers) and are aggregated into analytics tables (in .xlsx/.csv/.json format).
To save memory, the module fetches messages from the database in chunks and aggregates each chunk into the final result table(s).
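The chunked aggregation idea can be sketched as follows. This is an illustration under stated assumptions, not the module's implementation: `fetchInChunks` and `aggregateChunk` are hypothetical names, and an in-memory array stands in for the database cursor (a real source would fetch each chunk asynchronously).

```javascript
// Hypothetical log source: yields logs in chunks of `chunkSize`.
// A real implementation would page through a database cursor instead.
function* fetchInChunks(logs, chunkSize) {
  for (let i = 0; i < logs.length; i += chunkSize) {
    yield logs.slice(i, i + chunkSize);
  }
}

// Fold one chunk into the running totals so only the totals and the
// current chunk need to be kept around.
function aggregateChunk(totals, chunk) {
  for (const log of chunk) {
    totals[log.direction] = (totals[log.direction] || 0) + 1;
  }
  return totals;
}

function countByDirection(logs, chunkSize) {
  const totals = {};
  for (const chunk of fetchInChunks(logs, chunkSize)) {
    aggregateChunk(totals, chunk);
  }
  return totals;
}

const sampleLogs = [
  { direction: 'incoming' },
  { direction: 'outgoing' },
  { direction: 'incoming' },
];
console.log(countByDirection(sampleLogs, 2)); // { incoming: 2, outgoing: 1 }
```

The result is the same for any chunk size, because each chunk only increments the shared totals.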
I18n tables are in src/locale/{bot, analytics}/{en, zh}.json.
Use the functions in src/i18n.js to implement i18n in tables:
setLocale(locale): set the locale to 'zh' or 'en'.
translate(key, params, namespace = 'analytics'): use the template of key from namespace to generate an i18n string with parameters params.
e.g.
import { setLocale, translate as t } from '../i18n';
setLocale('zh');
const intentName = '表示開心';
const title = t('title_entities_of_intent', {
intentName,
});
console.log(title);
// output: 表示開心 的抽換詞類統計
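For readers curious how such template lookup might work, here is a self-contained sketch. It is not the code in src/i18n.js: the `{{name}}` placeholder syntax, the in-memory `tables` object, and the English template are all assumptions made for illustration.

```javascript
// Hypothetical locale tables; the real ones live in src/locale/.
// The {{name}} placeholder syntax is an assumption.
const tables = {
  zh: { analytics: { title_entities_of_intent: '{{intentName}} 的抽換詞類統計' } },
  en: { analytics: { title_entities_of_intent: 'Entity statistics of {{intentName}}' } },
};

let currentLocale = 'en';

function setLocale(locale) {
  if (!tables[locale]) throw new Error(`Unsupported locale: ${locale}`);
  currentLocale = locale;
}

function translate(key, params = {}, namespace = 'analytics') {
  const table = (tables[currentLocale] || {})[namespace] || {};
  const template = table[key];
  if (template === undefined) return key; // fall back to the key itself
  // Replace each {{name}} with the matching parameter, if one was given.
  return template.replace(/\{\{(\w+)\}\}/g, (match, name) =>
    (name in params ? String(params[name]) : match));
}

setLocale('zh');
console.log(translate('title_entities_of_intent', { intentName: '表示開心' }));
// 表示開心 的抽換詞類統計
```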
new Analytics({ ... options });
Parameter | Description |
---|---|
kuratorProjectId | kurator project id for filtering specific project. |
platformChannelId | LINE channel id or messenger recipient id for filtering specific channel. |
mongoUrl | URL of the MongoDB instance. |
mongoClient | MongoDB client compatible with the basic native driver APIs; this project uses monk. |
knexClient | knex client for relational databases; only MSSQL has been tested. |
collectionNames | collection/table names of DB for analytics. See here |
platform | Messaging platform. Currently supported: 'line', 'messenger', 'generic'. |
startDate | starting date |
endDate | ending date |
analyzerSettings | settings for analyzers, see here. |
reporterSettings | settings for reporters, see here. |
locale | Locale for analytics. Currently supported: 'en', 'zh'. |
customAdapter | Adapter for generic message analytics. |
stream | stream processing mode (boolean) |
chunkSize | chunk size for stream, enabled only when stream is true |
definition | kurator definition for retrieving action names. |
collectionNames default:
{
logsName: 'interactionLogs',
sessionsName: 'sessions',
}
analyzerSettings, e.g.
{
filteredUserIds: [],
conversationSplitMinutes: 15,
}
reporterSettings, e.g.
[
{
title: 'title_overview',
className: 'Statistics',
config: {
switchToHumanPayloads: {
postback: ['__SWITCH_TO_HUMAN__'],
quick_reply: ['__SWITCH_TO_HUMAN__', '__INTENT_轉接專人__'],
},
},
},
]
An adapter transforms raw chatbot logs into the generic message format for later processing.
An analyzer is an auxiliary data processor for reporters. It consumes message logs and produces a specific type of structured data.
The following are structures of each analyzer's output data.
Output: Map of Intent
Intent: {
id: Number,
count: Number,
entities: Map of Entity,
}
Entity: {
id: Number,
count: Number,
entityValues: Map of EntityValue,
}
EntityValue: {
id: Number,
count: Number,
}
e.g.
{
'intentName1': {
id: 1,
count: 123,
entities: {
'entityName1': {
id: 1,
count: 34,
entityValues: {
'entityValueName1': {
id: 1,
count: 16
},
'entityValueName2': {
id: 2,
count: 12
},
},
},
'entityName2': {
id: 2,
count: 47,
entityValues: {}
}
}
},
'intentName2': {
id: 2,
count: 456,
entities: {},
},
}
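A structure like the one above could be accumulated from the trigger objects attached to each log. The sketch below assumes each trigger carries the intent, entity, and entity value fields listed earlier; `addTrigger` and `getOrCreate` are hypothetical names, and the counting rules are an assumption rather than the analyzer's actual logic.

```javascript
// Return map[key], creating it with `init()` on first use.
function getOrCreate(map, key, init) {
  if (!(key in map)) map[key] = init();
  return map[key];
}

// Hypothetical accumulator: counts one trigger into the nested
// intent -> entity -> entity value structure.
function addTrigger(intents, trigger) {
  const { intentId, intentName, entityId, entityName, entityValueId, entityValueName } = trigger;
  if (!intentName) return intents;
  const intent = getOrCreate(intents, intentName, () => ({ id: intentId, count: 0, entities: {} }));
  intent.count += 1;

  if (!entityName) return intents;
  const entity = getOrCreate(intent.entities, entityName, () => ({ id: entityId, count: 0, entityValues: {} }));
  entity.count += 1;

  if (!entityValueName) return intents;
  const value = getOrCreate(entity.entityValues, entityValueName, () => ({ id: entityValueId, count: 0 }));
  value.count += 1;
  return intents;
}

const triggers = [
  { intentId: 1, intentName: 'intentName1', entityId: 1, entityName: 'entityName1', entityValueId: 1, entityValueName: 'entityValueName1' },
  { intentId: 1, intentName: 'intentName1', entityId: 1, entityName: 'entityName1', entityValueId: 2, entityValueName: 'entityValueName2' },
  { intentId: 2, intentName: 'intentName2' },
];
const intents = triggers.reduce(addTrigger, {});
console.log(intents.intentName1.count); // 2
```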
UserAnalyzer produces a list of user ids.
Output: List of Strings
UserConversationAnalyzer produces logs grouped by conversations, and conversations grouped by users.
Output: List of UserConversation
UserConversation: {
id: String,
conversations: List of Conversation,
conversationsCount: Number,
}
Conversation: List of ProcessedLog // logs processed by adapters
e.g.
[
{
id: 'userId1',
conversations: [
[ {...rawLog1 }, { ...rawLog2 }, { ...rawLog3 }],
[ {...rawLog4 }, { ...rawLog5 }, { ...rawLog6 }],
],
conversationsCount: 2,
},
...
]
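One plausible way to build this grouping is to start a new conversation whenever the gap between two consecutive logs from the same user exceeds `conversationSplitMinutes`. That splitting rule, the function name `splitConversations`, and the `userId` and `createdAt` (epoch milliseconds) log fields are assumptions for illustration.

```javascript
// Sketch: group logs by user, then split each user's logs into
// conversations wherever the time gap exceeds the configured threshold.
function splitConversations(logs, conversationSplitMinutes) {
  const gapMs = conversationSplitMinutes * 60 * 1000;
  const byUser = new Map();
  const sorted = [...logs].sort((a, b) => a.createdAt - b.createdAt);

  for (const log of sorted) {
    if (!byUser.has(log.userId)) byUser.set(log.userId, []);
    const conversations = byUser.get(log.userId);
    const current = conversations[conversations.length - 1];
    const last = current && current[current.length - 1];
    if (last && log.createdAt - last.createdAt <= gapMs) {
      current.push(log); // still within the same conversation
    } else {
      conversations.push([log]); // gap too large (or first log): new conversation
    }
  }

  return [...byUser].map(([id, conversations]) => ({
    id,
    conversations,
    conversationsCount: conversations.length,
  }));
}

const minute = 60 * 1000;
const grouped = splitConversations(
  [
    { userId: 'userId1', createdAt: 0 },
    { userId: 'userId1', createdAt: 5 * minute },
    { userId: 'userId1', createdAt: 30 * minute },
  ],
  15,
);
console.log(grouped[0].conversationsCount); // 2
```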
A reporter aggregates structured data into tables.
Here is the list of all reporters with brief descriptions.
These 4 reporters simply give (field value, count) pairs for a specific field in the chat logs.
HistogramReporter produces statistics grouped by time period (daily, hourly).
example config:
{
switchToHumanPayloads: {
postback: ['__SWITCH_TO_HUMAN__'],
quick_reply: ['__SWITCH_TO_HUMAN__', '__INTENT_轉接專人__'],
},
periodMinutes: 24 * 60,
}
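The role of `periodMinutes` can be illustrated with a small bucketing sketch. This is not HistogramReporter's code: `bucketCounts` is a hypothetical name, timestamps are assumed to be epoch milliseconds, and buckets are aligned to UTC, whereas the real reporter may align them to a local time zone.

```javascript
// Sketch: count timestamps per bucket of `periodMinutes`, aligned to UTC.
function bucketCounts(timestamps, periodMinutes) {
  const periodMs = periodMinutes * 60 * 1000;
  const counts = new Map();
  for (const ts of timestamps) {
    const bucketStart = Math.floor(ts / periodMs) * periodMs;
    counts.set(bucketStart, (counts.get(bucketStart) || 0) + 1);
  }
  return [...counts]
    .sort((a, b) => a[0] - b[0])
    .map(([start, count]) => ({ start: new Date(start).toISOString(), count }));
}

const hourly = bucketCounts(
  [
    Date.UTC(2024, 0, 1, 0, 10),
    Date.UTC(2024, 0, 1, 0, 50),
    Date.UTC(2024, 0, 1, 1, 5),
  ],
  60,
);
console.log(hourly);
// [ { start: '2024-01-01T00:00:00.000Z', count: 2 },
//   { start: '2024-01-01T01:00:00.000Z', count: 1 } ]
```

With `periodMinutes: 24 * 60`, the same three timestamps would fall into a single daily bucket.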
These 2 reporters produce intent statistics and, for each intent, entity statistics.
The Unknown reporter shows all messages with an unknown intent.
LogReporter produces raw logs.
StatisticsReporter produces the general statistics table.
example config:
{
  switchToHumanPayloads: {
    postback: ['__SWITCH_TO_HUMAN__'],
    quick_reply: ['__SWITCH_TO_HUMAN__', '__INTENT_轉接專人__'],
  },
}
A writer writes the JSON produced by reporters into files. Currently there are 3 writers for different output formats.
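As a rough idea of what a writer does for one format, the sketch below serializes a table (an array of rows) to CSV text. The function name `toCsv` and the row-of-arrays input shape are assumptions; the module's own writers may structure their input differently.

```javascript
// Sketch: turn an array of rows into CSV text. Fields containing a comma,
// double quote, or line break are quoted, with inner quotes doubled.
function toCsv(rows) {
  const escapeField = (value) => {
    const text = value == null ? '' : String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  return rows.map((row) => row.map(escapeField).join(',')).join('\r\n');
}

const csv = toCsv([
  ['title', 'count'],
  ['hello, world', 3],
]);
console.log(csv);
```

The resulting string could be written to disk with Node's `fs.writeFileSync`.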
FAQs
Analytics for Yoctol bots
The npm package @yoctol/analytics receives a total of 14 weekly downloads, so its popularity is classified as not popular.
We found that @yoctol/analytics has an unhealthy version release cadence and level of project activity, because the last version was released a year ago. It has 6 open source maintainers collaborating on the project.