
predict-data-types
A lightweight, zero-dependency npm package that predicts data types for comma-separated values, including JSON objects, and validates URLs, phone numbers, email addresses, IP addresses, colors, percentages, and currency within string values.
When users upload CSV or JSON files, everything arrives as strings.
TypeScript and JavaScript can't help you here:
// ❌ TypeScript only knows static types
const userInput = "test@example.com"; // TypeScript thinks: string
const csvValue = "2024-01-01"; // TypeScript thinks: string
const formData = "42"; // TypeScript thinks: string
// TypeScript CANNOT detect these are email, date, and number at runtime
This library solves that problem with runtime type detection:
const { infer } = require("predict-data-types");
infer("test@example.com"); // → 'email' ✅
infer("2024-01-01"); // → 'date' ✅
infer("42"); // → 'number' ✅
infer('11:59 PM'); // → 'time' ✅
infer(["true", "false", "true"]);
// → 'boolean' ✅
infer({ name: "Alice", age: "25", email: "alice@example.com" });
// → { name: 'string', age: 'number', email: 'email' } ✅
infer([
{ name: "Alice", age: "25" },
{ name: "Bob", age: "30" },
]);
// → { name: 'string', age: 'number' } ✅
One smart function. Any input type.
Zero-dependency package for automatic data type detection from strings, arrays, and JSON objects. Detects 19+ data types including primitives, emails, URLs, UUIDs, dates, times, IPs, colors, percentages, hashtags, mentions, currency, and file paths.
💡 Important: This library performs runtime type detection on string values, not static type checking. TypeScript is a compile-time type system for your code structure; this library analyzes actual data content at runtime. They solve completely different problems!
infer() function handles strings, arrays, objects, and arrays of objects
DataTypes for type-safe comparisons instead of string literals
npm install predict-data-types
Real-world use cases showing what you can build:
📊 CSV Import Tool
// Auto-detect column types and transform data
const employees = parseCSV(file); // All values are strings
const schema = infer(employees);
// → { name: 'string', email: 'email', salary: 'currency', hire_date: 'date' }
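Once a schema is known, the "transform data" step can be sketched as below. This is a minimal illustration, not part of the library: coerceValue and coerceRow are hypothetical helpers, the schema is hard-coded in the shape infer() is documented to return, and only a few type names are converted.

```javascript
// Hypothetical helper: convert one string value according to a detected type name.
// Only a few types are handled; everything else is returned unchanged.
function coerceValue(value, type) {
  switch (type) {
    case "number":
      return Number(value);
    case "boolean":
      return ["true", "yes"].includes(value.toLowerCase());
    case "date":
      return new Date(value);
    default:
      return value;
  }
}

// Apply the schema to every field of a row.
function coerceRow(row, schema) {
  const out = {};
  for (const [key, value] of Object.entries(row)) {
    out[key] = coerceValue(value, schema[key]);
  }
  return out;
}

// Sample schema and row (assumed data, shaped like the infer() output above).
const schema = { name: "string", age: "number", active: "boolean", hire_date: "date" };
const row = { name: "Alice", age: "25", active: "true", hire_date: "2023-12-31" };

const typed = coerceRow(row, schema);
console.log(typed.age + 1); // 26
console.log(typed.active); // true
console.log(typed.hire_date.getUTCFullYear()); // 2023
```

Types such as currency or percentage would need their own parsing rules, which this sketch leaves as strings.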
🎨 Form Builder
// Auto-generate form fields with correct input types
const userData = { email: 'alice@example.com', age: '25', website: 'https://alice.dev' };
const types = infer(userData);
// → { email: 'email', age: 'number', website: 'url' }
// Generate: <input type="email">, <input type="number">, <input type="url">
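The "Generate" step above could look roughly like this. The mapping table and the buildInputs helper are assumptions made for illustration; the types object is the result documented above, written out by hand so the snippet runs without the package.

```javascript
// Assumed mapping from detected type names to HTML input types.
// Types not listed here fall back to a plain text input.
const INPUT_TYPES = new Map([
  ["email", "email"],
  ["number", "number"],
  ["url", "url"],
  ["date", "date"],
  ["time", "time"],
  ["phone", "tel"],
]);

// Hypothetical helper: build one <input> tag per field in the schema.
function buildInputs(types) {
  return Object.entries(types).map(
    ([field, type]) => `<input type="${INPUT_TYPES.get(type) || "text"}" name="${field}">`
  );
}

// The object infer() is documented to return for the userData example above.
const types = { email: "email", age: "number", website: "url" };
console.log(buildInputs(types).join("\n"));
// <input type="email" name="email">
// <input type="number" name="age">
// <input type="url" name="website">
```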
🌐 API Analyzer
// Generate JSON Schema and TypeScript interfaces from API responses
const response = await fetch('/api/users').then(r => r.json());
const jsonSchema = infer(response, Formats.JSONSCHEMA);
// Use with Ajv, joi, or generate TypeScript types
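One possible way to get from a simple (non-JSON-Schema) result to a TypeScript interface is sketched below. toTsType and toInterface are hypothetical helpers, not library functions, and the mapping is an assumption: only number and boolean get their own TypeScript type, and every other detected type is emitted as string.

```javascript
// Assumed mapping from detected type names to TypeScript types.
// Anything that is not a number or boolean stays a string.
function toTsType(type) {
  if (type === "number") return "number";
  if (type === "boolean") return "boolean";
  return "string";
}

// Hypothetical helper: render a flat schema as a TypeScript interface.
function toInterface(name, schema) {
  const fields = Object.entries(schema).map(
    ([key, type]) => `  ${key}: ${toTsType(type)}; // detected: ${type}`
  );
  return [`interface ${name} {`, ...fields, "}"].join("\n");
}

// Sample schema, shaped like the simple-format output of infer().
const schema = { name: "string", age: "number", email: "email" };
console.log(toInterface("User", schema));
// interface User {
//   name: string; // detected: string
//   age: number; // detected: number
//   email: string; // detected: email
// }
```

Keys that are not valid identifiers would need quoting, which this sketch does not handle.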
✅ Data Validator
// Validate imported data quality
const expected = { email: DataTypes.EMAIL, age: DataTypes.NUMBER };
const actual = infer(importedData);
// Detect mismatches, missing fields, wrong types
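The comparison step can be as simple as the sketch below. diffSchemas is a hypothetical helper, and the two schemas are sample data: string literals stand in for DataTypes.EMAIL and DataTypes.NUMBER so the snippet runs without the package.

```javascript
// Hypothetical helper: compare an expected schema with a detected one.
function diffSchemas(expected, actual) {
  const problems = [];
  for (const [field, type] of Object.entries(expected)) {
    if (!Object.prototype.hasOwnProperty.call(actual, field)) {
      problems.push({ field, issue: "missing" });
    } else if (actual[field] !== type) {
      problems.push({ field, issue: "wrong type", expected: type, actual: actual[field] });
    }
  }
  return problems;
}

// 'email' and 'number' are the documented values of DataTypes.EMAIL and DataTypes.NUMBER.
const expected = { email: "email", age: "number" };
// Sample detected schema, as infer(importedData) might return it.
const actual = { email: "string" };

console.log(diffSchemas(expected, actual));
```

Here the email field is reported as a wrong type and the age field as missing; fields present only in the detected schema are ignored.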
👉 See full runnable examples in the examples/ directory
| Type | Examples |
|---|---|
| string | 'John', 'Hello World' |
| number | 42, 3.14, -17, 1e10 |
| boolean | true, false, yes, no |
| email | user@example.com |
| phone | 555-555-5555, (555) 555-5555 |
| url | https://example.com |
| uuid | 550e8400-e29b-41d4-a716-446655440000 |
| date | 2023-12-31, 31/12/2023 |
| ip | 192.168.1.1, 2001:0db8::1 |
| macaddress | 00:1B:63:84:45:E6, 00-1B-63-84-45-E6 |
| color | #FF0000, #fff, rgb(255, 0, 0), rgba(0, 255, 0, 0.5) |
| percentage | 50%, -25% |
| currency | $100, €50.99 |
| hashtag | #hello, #OpenSource, #dev_community |
| mention | @username, @user_name123, @john-doe |
| cron | 0 0 * * *, */5 * * * *, 0 9-17 * * 1-5 |
| emoji | 😀, 🎉, ❤️, 👍, ❌ |
| filepath | /usr/local/bin, C:\\Program Files\\node.exe, ./src/index.js |
| isbn | 978-0-596-52068-7, 0596520689, 043942089X |
| array | [1, 2, 3] |
| object | {"name": "John"} |
| semver | 0.0.0 |
| time | 23:59:59, 2:30 PM, 14:30 |
Use DataTypes constants instead of string literals for type-safe comparisons:
const { infer, DataTypes } = require("predict-data-types");
const type = infer("test@example.com");
// ✅ Type-safe with constants
if (type === DataTypes.EMAIL) {
console.log("Valid email!");
}
// ❌ Avoid string literals (error-prone)
if (type === 'email') { /* ... */ }
// All available constants:
DataTypes.STRING // 'string'
DataTypes.NUMBER // 'number'
DataTypes.BOOLEAN // 'boolean'
DataTypes.EMAIL // 'email'
DataTypes.PHONE // 'phone'
DataTypes.URL // 'url'
DataTypes.UUID // 'uuid'
DataTypes.DATE // 'date'
DataTypes.ARRAY // 'array'
DataTypes.OBJECT // 'object'
DataTypes.IP // 'ip'
DataTypes.MACADDRESS // 'macaddress'
DataTypes.COLOR // 'color'
DataTypes.PERCENTAGE // 'percentage'
DataTypes.CURRENCY // 'currency'
DataTypes.MENTION // 'mention'
DataTypes.CRON // 'cron'
DataTypes.HASHTAG // 'hashtag'
DataTypes.EMOJI // 'emoji'
DataTypes.FILEPATH // 'filepath'
DataTypes.SEMVER // 'semver'
DataTypes.TIME // 'time'
const predictDataTypes = require("predict-data-types");
const text = "John, 30, true, john@example.com, 2023-01-01, 0.0.0";
const types = predictDataTypes(text);
console.log(types);
// {
// 'John': 'string',
// '30': 'number',
// 'true': 'boolean',
// 'john@example.com': 'email',
// '2023-01-01': 'date',
// '0.0.0': 'semver'
// }
infer() Function
The infer() function automatically adapts to any input type:
const { infer, DataTypes } = require("predict-data-types");
// Single value → DataType
infer("2024-01-01"); // → 'date'
infer("12:05 AM"); // → 'time'
infer("test@example.com"); // → 'email'
infer("@username"); // → 'mention'
infer("42"); // → 'number'
infer("#OpenSource"); // → 'hashtag'
infer("/usr/local/bin"); // → 'filepath'
infer(["#dev", "#opensource", "#community"]); // → 'hashtag'
// Ambiguous 3-char values (can be hex color or hashtag)
infer("#bad"); // → 'color' (default: hex takes priority)
infer("#bad", "none", { preferHashtagOver3CharHex: true }); // → 'hashtag'
// Array of values → Common DataType
infer(["1", "2", "3"]); // → 'number'
infer(["true", "false", "yes"]); // → 'boolean'
// Object → Schema
infer({
name: "Alice",
age: "25",
active: "true",
});
// → { name: 'string', age: 'number', active: 'boolean' }
// Array of objects → Schema
infer([
{ name: "Alice", age: "25", email: "alice@example.com" },
{ name: "Bob", age: "30", email: "bob@example.com" },
]);
// → { name: 'string', age: 'number', email: 'email' }
Generate standard JSON Schema for validation libraries (Ajv, etc.):
const { infer, Formats } = require("predict-data-types");
const data = {
name: "Alice",
age: "25",
email: "alice@example.com",
website: "https://example.com",
};
// Simple format (default)
infer(data);
// → { name: 'string', age: 'number', email: 'email', website: 'url' }
// JSON Schema format
infer(data, Formats.JSONSCHEMA);
// → {
// type: 'object',
// properties: {
// name: { type: 'string' },
// age: { type: 'number' },
// email: { type: 'string', format: 'email' },
// website: { type: 'string', format: 'uri' }
// },
// required: ['name', 'age', 'email', 'website']
// }
// Use with validation libraries
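const Ajv = require("ajv");
const addFormats = require("ajv-formats"); // Ajv v8+ needs this plugin for the 'email' and 'uri' formats
const ajv = new Ajv();
addFormats(ajv);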
const schema = infer(data, Formats.JSONSCHEMA);
const validate = ajv.compile(schema);
const valid = validate({
name: "Bob",
age: 30,
email: "bob@example.com",
website: "https://bob.dev",
});
const csvData = `name,age,active,email
John,30,true,john@example.com`;
const types = predictDataTypes(csvData, true);
// {
// 'name': 'string',
// 'age': 'number',
// 'active': 'boolean',
// 'email': 'email'
// }
The examples/ directory contains full, runnable code for real-world scenarios:
Each example includes:
Run any example:
cd examples/csv-import
node example.js
const { infer } = require('predict-data-types');
const complexString = "192.168.1.1, #FF0000, 50%, $100, 2023-12-31";
const types = infer(complexString.split(', ').map(v => ({ value: v })));
// { value: 'ip' } // Takes the most specific type found
// Or analyze each value separately:
const values = "192.168.1.1, #FF0000, 50%, $100, 2023-12-31".split(', ');
values.forEach(val => {
console.log(`${val}: ${infer(val)}`);
});
// 192.168.1.1: ip
// #FF0000: color
// 50%: percentage
// $100: currency
// 2023-12-31: date
infer(input, format?, options?)
The main function - handles any input type:
Parameters:
input (string | string[] | Object | Object[]): Value(s) to analyze
format (optional): Output format - Formats.NONE (default) or Formats.JSONSCHEMA
options (optional): Configuration options
  preferHashtagOver3CharHex (boolean, default: false): When true, treats ambiguous 3-character values like #bad, #ace as hashtags instead of hex colors
Returns:
DataType (string) - for single values and arrays of values
Schema (Object) - for objects and arrays of objects
JSONSchema (Object) - when format is Formats.JSONSCHEMA
Examples:
const { infer, Formats, DataTypes } = require('predict-data-types');
// Single values
infer("42"); // → 'number'
infer("test@example.com"); // → 'email'
// Arrays
infer(["1", "2", "3"]); // → 'number'
// Objects
infer({ age: "25", email: "test@example.com" });
// → { age: 'number', email: 'email' }
// Arrays of objects
infer([{ age: "25" }, { age: "30" }]);
// → { age: 'number' }
// JSON Schema format
infer({ name: "Alice", age: "25" }, Formats.JSONSCHEMA);
// → { type: 'object', properties: {...}, required: [...] }
// Hashtag field example
infer({ tag: "#OpenSource" }, Formats.JSONSCHEMA);
// {
// tag: { type: 'string', pattern: '^#[A-Za-z0-9_]+$' }
// }
DataTypes - Type-safe constants for comparisons:
DataTypes.STRING, DataTypes.NUMBER, DataTypes.BOOLEAN, DataTypes.EMAIL,
DataTypes.PHONE, DataTypes.URL, DataTypes.UUID, DataTypes.DATE,
DataTypes.IP, DataTypes.COLOR, DataTypes.PERCENTAGE, DataTypes.CURRENCY, DataTypes.HASHTAG, DataTypes.FILEPATH,
DataTypes.ARRAY, DataTypes.OBJECT, DataTypes.SEMVER, DataTypes.TIME
Formats - Output format constants:
Formats.NONE // Default simple schema
Formats.JSONSCHEMA // JSON Schema format
predictDataTypes(input, firstRowIsHeader) - For CSV strings only (use infer() instead)
Parameters:
input (string): Comma-separated string to analyze
firstRowIsHeader (boolean): Treat first row as headers (default: false)
Returns: Object mapping field names/values to their data types
Example:
const types = predictDataTypes('name,age\nAlice,25', true);
// { name: 'string', age: 'number' }
Note: This function is maintained for backwards compatibility. New code should use infer().
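When moving CSV handling from predictDataTypes() to infer(), the CSV text first has to become an array of objects, which is one of the input shapes infer() is documented to accept. The sketch below shows one way to do that. csvToObjects is a hypothetical helper and deliberately naive: it assumes a header row and does not handle quoted fields or embedded commas.

```javascript
// Hypothetical helper: turn a simple CSV string (header row first) into an
// array of objects. Naive on purpose: no quoted fields or embedded commas.
function csvToObjects(csv) {
  const [headerLine, ...lines] = csv.trim().split(/\r?\n/);
  const headers = headerLine.split(",").map((h) => h.trim());
  return lines
    .filter((line) => line.trim() !== "")
    .map((line) => {
      const cells = line.split(",").map((c) => c.trim());
      // Missing trailing cells become empty strings.
      return Object.fromEntries(headers.map((h, i) => [h, cells[i] ?? ""]));
    });
}

const rows = csvToObjects("name,age\nAlice,25\nBob,30");
console.log(rows);
// [ { name: 'Alice', age: '25' }, { name: 'Bob', age: '30' } ]
// rows can then be passed to infer(rows) to obtain a schema.
```

For real-world files, a dedicated CSV parser is the safer choice before handing rows to infer().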
Common Misconception: "Doesn't TypeScript already do this?"
No! TypeScript and this library serve completely different purposes:
| Feature | TypeScript | This Library |
|---|---|---|
| When it works | Compile-time | Runtime |
| What it checks | Your code structure | Actual data content |
| Scope | Static type annotations | Dynamic string analysis |
| Use case | Prevent coding errors | Analyze user-provided data |
Example:
// TypeScript
const value: string = "test@example.com";
// TypeScript knows: "value is a string"
// TypeScript DOESN'T know: "value contains an email address"
// This Library
const type = infer("test@example.com");
// Returns: 'email' ✅
// Detects the ACTUAL CONTENT at runtime
When to use this library:
TypeScript can't help with any of these - you need runtime type detection!
npm test # Run tests
npm run test:coverage # Run tests with coverage
npm run lint # Check code quality
npm run lint:fix # Fix lint issues
MIT License - see LICENSE file for details.
See CONTRIBUTING.md for contribution guidelines.
Author: Melih Birim
FAQs
We found that predict-data-types demonstrated a healthy version release cadence and project activity because the last version was released less than a year ago. It has 1 open source maintainer collaborating on the project.