ytid 1.3.0 (npm)
YouTube-like ID

Create URL friendly short IDs just like YouTube.

Suitable for generating -

  • short IDs for new users.
  • referral codes for users in an affiliate program.
  • file names for user-uploaded documents / resources.
  • short URLs (like Bitly) for sharing links on social media platforms.
  • URL slugs for dynamically generated content like blog posts, articles, or product pages.

Works with ES6 (ECMAScript):

Demo GIF of ytid working in ES6 (ECMAScript)

as well as with CommonJS:

Demo GIF of ytid working with CommonJS

Installation

Using npm:

npm i ytid

Using yarn:

yarn add ytid

Using pnpm:

pnpm i ytid

Usage

With ES6 (ECMAScript):

import { ytid } from "ytid";

console.log(ytid()); // gocwRvLhDf8

With CommonJS:

const { ytid } = require("ytid");

console.log(ytid()); // dQw4w9WgXcQ

FAQs

What are the possible characters in the ID?

YouTube uses 0-9, A-Z, a-z, _ and - as the possible characters for its IDs, so each position in an ID can hold one of 64 characters. However, because capital I and lowercase l look nearly identical in URLs, ytid excludes both of them.

Hence, ytid uses 0-9, A-H, J-Z, a-k, m-z, _ and - as the possible characters in the ID, for 62 characters in total.

Why should URL IDs be short?

A Backlinko study, based on an analysis of 11.8 million Google search results, found that short URLs rank above long URLs.

And a Brafton study found a correlation between short URLs and more social shares, especially on platforms such as Twitter which have character limits.

These studies highlight the benefits of short URLs over long ones.

What if the ID contains any offensive word or phrase?

All the generated IDs are checked against a dataset of offensive / profane words to ensure they do not contain any inappropriate language.

As a result, ytid doesn't generate IDs like 7-GoToHell3 or shit9RcYjcM.
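Conceptually, the check works by rejecting any candidate ID that contains a blocked word. The sketch below illustrates that idea under stated assumptions: the blocklist entries and the separator-stripping normalization are hypothetical, not ytid's actual matching logic:

```javascript
// Sketch of a blocklist check like the one described above (not ytid's
// internal code). Hypothetical sample entries; the real dataset has
// thousands of instances across several languages.
const BLOCKLIST = ["gotohell", "shit"];

function isSafe(id) {
  // Normalize: lowercase and drop the - and _ separators so that
  // IDs like "7-GoToHell3" still match a blocked word.
  const normalized = id.toLowerCase().replace(/[-_]/g, "");
  return !BLOCKLIST.some((word) => normalized.includes(word));
}

console.log(isSafe("7-GoToHell3")); // false
console.log(isSafe("dQw4w9WgXcQ")); // true
```

A generator would simply call `isSafe` on each candidate and draw a new ID until the check passes.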

The dataset of offensive / profane words is a combination of various datasets -

| Language | Dataset | Source | Instances (Rows) |
| --- | --- | --- | --- |
| English | Google's "what do you love" project | https://gist.github.com/jamiew/1112488 | 451 |
| English | Bad Bad Words | https://www.kaggle.com/datasets/nicapotato/bad-bad-words | 1617 |
| English | Surge AI's The Obscenity List | https://github.com/surge-ai/profanity | 1598 |
| English | washyourmouthoutwithsoap | https://github.com/thisandagain/washyourmouthoutwithsoap | 147 |
| Spanish | Multilingual swear profanity | https://www.kaggle.com/datasets/miklgr500/jigsaw-multilingual-swear-profanity | 366 |
| Spanish | Surge AI's Spanish Dataset | https://www.surgehq.ai/datasets/spanish-profanity-list | 178 |
| Spanish | washyourmouthoutwithsoap | https://github.com/thisandagain/washyourmouthoutwithsoap | 125 |
| German | Bad Words in German | https://data.world/wordlists/dirty-naughty-obscene-and-otherwise-bad-words-in-german | 65 |
| German | Surge AI's German Dataset | https://www.surgehq.ai/datasets/german-profanity-list | 165 |
| German | washyourmouthoutwithsoap | https://github.com/thisandagain/washyourmouthoutwithsoap | 133 |
| French | Bad Words in French | https://data.world/wordlists/dirty-naughty-obscene-and-otherwise-bad-words-in-french | 91 |
| French | Multilingual swear profanity | https://www.kaggle.com/datasets/miklgr500/jigsaw-multilingual-swear-profanity | 178 |
| French | washyourmouthoutwithsoap | https://github.com/thisandagain/washyourmouthoutwithsoap | 126 |

These datasets undergo the following preprocessing steps -

  • First, all the datasets are combined into a single dataset.
  • Then duplicate instances are removed.
  • Then two new datasets are created -
    • one in which all spaces are replaced with -.
    • one in which all spaces are replaced with _.
  • These two datasets are then combined into a new dataset.
    This ensures that the dataset contains multi-word phrases both as hyphen-separated and as underscore-separated words.
  • Then duplicate values are removed from this new dataset.
  • Finally, only the instances that match the regex pattern ^[A-Za-z0-9_-]{0,11}$ are kept, while the rest are removed. This keeps the number of instances to a minimum, since a word or phrase longer than 11 characters can never appear inside an ID.
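The steps above can be sketched as a small pipeline. The sample entries here are hypothetical placeholders (the real preprocessing ran on the combined profanity datasets in a notebook, not this code):

```javascript
// Sketch of the preprocessing steps described above, assuming each
// dataset is a plain array of lowercase words / phrases (sample data).
const datasets = [
  ["go to hell", "bad word"],
  ["bad word", "some extremely long offensive phrase"],
];

// 1. Combine all datasets into one and remove duplicates.
const combined = [...new Set(datasets.flat())];

// 2. Create hyphen- and underscore-separated variants, then dedupe.
const variants = new Set([
  ...combined.map((w) => w.replace(/ /g, "-")),
  ...combined.map((w) => w.replace(/ /g, "_")),
]);

// 3. Keep only instances short enough to fit inside an 11-character ID.
const final = [...variants].filter((w) => /^[A-Za-z0-9_-]{0,11}$/.test(w));

console.log(final); // ["go-to-hell", "bad-word", "go_to_hell", "bad_word"]
```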

Preprocessing yields a dataset of 3656 instances, which helps ensure the generated IDs are safe to use in URLs and to share on social media platforms.

The preprocessing was done on this Colab Jupyter notebook.

Future release(s) will expand the dataset to include words / phrases from other languages that use the English alphabet.

License

Apache-2.0

Keywords

YouTube

Package last updated on 25 Nov 2023
