sensitive-data-masker
A high-performance TypeScript library for detecting and masking sensitive data in strings. Protect PII, API keys, tokens, credentials, and other confidential information with intelligent masking algorithms and configurable accuracy levels.

Features
- 200+ Detection Patterns: Comprehensive coverage for modern security needs
- High Performance: Optimized regex engine with pattern caching
- Accuracy Control: Configure detection sensitivity (high/medium/low)
- Flexible Masking: Smart partial masking that preserves readability
- Zero Dependencies: Lightweight and secure
- International Support: Handles US, UK, Canadian, and international formats
- Pattern Filtering: Include or exclude specific pattern types
- Detailed Results: Get match counts, positions, and masked values
Installation
npm install sensitive-data-masker
# or
yarn add sensitive-data-masker
Quick Start
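import { mask, hasSensitiveContent, getPatternMatches } from 'sensitive-data-masker';

const text = 'My email is john@example.com and my SSN is 123-45-6789';

// Mask every detected value and inspect the result
const result = mask(text);
console.log(result.output); // masked string
console.log(result.found);  // match count per pattern name

// Check for sensitive data without masking
const isSensitive = hasSensitiveContent(text);
console.log(isSensitive);

// List each match together with its position
const matches = getPatternMatches(text);
console.log(matches);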
API Reference
mask(input: string, options?: MaskingOptions): MaskResult
Masks sensitive content in a string using intelligent partial masking.
Options
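interface MaskingOptions {
  // Character used for masking
  maskChar?: string;
  preserveLength?: boolean;
  // Pattern names to skip
  excludePatterns?: string[];
  // Restrict detection to these pattern names
  onlyPatterns?: string[];
  // Detection sensitivity; 'medium' is the default
  matchAccuracy?: 'high' | 'medium' | 'low';
}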
Returns
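interface MaskResult {
  output: string;                     // input with sensitive values masked
  found: { [name: string]: number };  // match count per pattern name
  matches: string[];                  // original sensitive values (handle with care)
  masked: string[];                   // masked values
}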
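To make "partial masking" concrete, here is a minimal sketch of one way it could work: keep a few characters at each end and mask the middle. `partialMask` and its `visible` parameter are hypothetical names used for illustration only; the library's actual algorithm may differ.

```typescript
// Hypothetical helper, not part of sensitive-data-masker's API.
// Keeps `visible` characters at each end and masks everything in between.
function partialMask(value: string, maskChar = '*', visible = 2): string {
  // Values too short to keep both ends readable are masked entirely.
  if (value.length <= visible * 2) {
    return maskChar.repeat(value.length);
  }
  const head = value.slice(0, visible);
  // Index from the start so that visible = 0 yields an empty tail.
  const tail = value.slice(value.length - visible);
  const hidden = maskChar.repeat(value.length - visible * 2);
  return head + hidden + tail;
}

console.log(partialMask('secret123')); // se*****23
```

Keeping the ends visible lets a reader tell two masked values apart in a log without exposing the full secret.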
hasSensitiveContent(input: string, options?): boolean
Quickly check if a string contains sensitive data without performing masking.
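import { hasSensitiveContent } from 'sensitive-data-masker';

hasSensitiveContent('user@example.com'); // input contains an email address
hasSensitiveContent('hello world');      // input contains nothing sensitive

// Detection options can be passed as a second argument
hasSensitiveContent('sk-1234567890abcdef', {
  matchAccuracy: 'high',
  excludePatterns: ['genericId']
});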
getPatternMatches(input: string, options?): PatternMatch[]
Get detailed information about all pattern matches including their positions.
import { getPatternMatches } from 'sensitive-data-masker';
const matches = getPatternMatches('Contact: admin@test.com and key: sk-123abc');
console.log(matches);
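The shape of `PatternMatch` is not spelled out here, so the following is only a sketch of how match positions can be collected with plain regular expressions. `SketchPattern`, `SketchMatch`, `findMatches`, and the two demo regexes are assumptions made for illustration, not the library's types or patterns.

```typescript
// Hypothetical shapes; the library's PatternMatch type may differ.
interface SketchPattern {
  name: string;
  regex: RegExp;
}

interface SketchMatch {
  pattern: string;
  value: string;
  start: number; // index of the first matched character
  end: number;   // index just past the last matched character
}

function findMatches(input: string, patterns: SketchPattern[]): SketchMatch[] {
  const found: SketchMatch[] = [];
  for (const { name, regex } of patterns) {
    // Clone with the g flag so a shared lastIndex never leaks between calls.
    const flags = regex.flags.includes('g') ? regex.flags : regex.flags + 'g';
    const re = new RegExp(regex.source, flags);
    let m: RegExpExecArray | null;
    while ((m = re.exec(input)) !== null) {
      found.push({ pattern: name, value: m[0], start: m.index, end: m.index + m[0].length });
      // Guard against zero-length matches looping forever.
      if (m[0].length === 0) re.lastIndex++;
    }
  }
  // Report matches in the order they appear in the input.
  return found.sort((a, b) => a.start - b.start);
}

// Simplified demo patterns, far looser than production-grade ones.
const demoPatterns: SketchPattern[] = [
  { name: 'email', regex: /[\w.+-]+@[\w-]+\.[\w.-]+/g },
  { name: 'skKey', regex: /\bsk-[A-Za-z0-9]+/g },
];

console.log(findMatches('Contact: admin@test.com and key: sk-123abc', demoPatterns));
```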
Advanced Usage
Custom Masking Options
import { mask } from 'sensitive-data-masker';

// Custom mask character
const result = mask('API key: sk-1234567890abcdef', { maskChar: '#' });
console.log(result.output);

// preserveLength option
const result2 = mask('secret123', { preserveLength: true });
console.log(result2.output);

// High-accuracy patterns only
const result3 = mask('sk-1234567890abcdef', { matchAccuracy: 'high' });
console.log(result3.output);
Pattern Filtering
// Only detect specific patterns
const result = mask('Email: user@test.com, API: sk-123', {
  onlyPatterns: ['email', 'openaiApiKey']
});

// Exclude specific patterns
const result2 = mask('Email: user@test.com, UUID: 123e4567-e89b-12d3-a456-426614174000', {
  excludePatterns: ['uuid', 'genericId']
});

// Combine filtering with an accuracy level (sensitiveText is any input string)
const result3 = mask(sensitiveText, {
  matchAccuracy: 'high',
  excludePatterns: ['uuid']
});
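How `onlyPatterns` and `excludePatterns` interact when both are set is not stated here. The sketch below assumes the allow-list is applied first and exclusions are removed afterwards; `selectPatterns` and `FilterOptions` are hypothetical names, and the library may resolve the two options differently.

```typescript
// Hypothetical helper illustrating one possible filtering order.
interface FilterOptions {
  onlyPatterns?: string[];
  excludePatterns?: string[];
}

function selectPatterns(allNames: string[], options: FilterOptions = {}): string[] {
  const { onlyPatterns, excludePatterns = [] } = options;
  // Assumption: a non-empty allow-list narrows the set first...
  const allowed = onlyPatterns && onlyPatterns.length > 0
    ? allNames.filter(name => onlyPatterns.includes(name))
    : allNames;
  // ...and exclusions are then dropped from whatever remains.
  return allowed.filter(name => !excludePatterns.includes(name));
}

const names = ['email', 'uuid', 'genericId', 'openaiApiKey'];
console.log(selectPatterns(names, { onlyPatterns: ['email', 'openaiApiKey'] }));
console.log(selectPatterns(names, { excludePatterns: ['uuid', 'genericId'] }));
```

Under this assumption both calls select the same two patterns, `email` and `openaiApiKey`, which is a quick way to check that a filter does what you expect.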
Supported Pattern Categories
The library detects sensitive data across 25 categories with 200+ patterns:
Personally Identifiable Information (PII)
- Email addresses (multiple formats)
- Phone numbers (US, International, E.164)
- Social Security Numbers (US with various formats)
- Driver's license numbers, Medical record numbers
- Tax IDs (TIN/EIN), Canadian SIN, UK National Insurance Numbers
Cloud Provider Credentials
- AWS: Access keys, secret keys, session tokens, account IDs
- AWS Resources: EC2, S3, RDS, Lambda ARNs, VPC IDs
- Azure: Subscription IDs, client secrets, resource IDs
- Google Cloud: API keys, service account keys, project IDs
Financial & Payment Services
- Credit card numbers (Visa, MasterCard, Amex, Discover)
- Stripe: Secret keys, publishable keys, webhook secrets
- PayPal: Access tokens, client IDs
- Square: Access tokens, application IDs
- Bank account numbers (US routing numbers, IBAN)
AI Provider Credentials
- OpenAI: API keys, organization IDs
- Anthropic/Claude: API keys
- Google AI: Gemini API keys, Vertex AI tokens
- Hugging Face: Access tokens, API keys
- Other AI: Groq, Perplexity, Replicate, Together AI
Authentication & Security
- JWT tokens, Bearer tokens
- OAuth access tokens, refresh tokens
- API keys in headers (X-API-Key, Authorization)
- Session IDs, CSRF tokens
- Generic secret patterns in environment variables
Developer Tools & Services
- GitHub: Personal access tokens, app tokens
- Slack: Bot tokens, webhook URLs, app secrets
- Discord: Bot tokens, webhook URLs
- Analytics: Google Analytics, Mixpanel, Amplitude
- Monitoring: Datadog, New Relic, Sentry keys
Database & Storage
- Database connection strings (PostgreSQL, MySQL, MongoDB)
- File Storage: S3 bucket URLs, Azure Blob Storage
- CDN: CloudFront URLs, Azure CDN
- Redis connection strings, Elasticsearch URLs
Cryptographic Materials
- RSA private keys, SSH private keys
- EC private keys, DSA private keys
- X.509 certificates, PGP private key blocks
- JSON Web Keys (JWK), PKCS#8 keys
Network & Location
- IPv4/IPv6 addresses, MAC addresses
- Geographic coordinates (latitude/longitude)
- Private network ranges, subnet masks
- URL patterns with embedded secrets
Communication Services
- Messaging: Twilio, SendGrid, Mailgun keys
- Social Media: Twitter, Facebook, Instagram tokens
- Email Services: Mailchimp, Postmark, SparkPost
- SMS/Voice: Nexmo, Plivo, MessageBird
Infrastructure & DevOps
- Container Registries: Docker Hub, ECR, GCR tokens
- CI/CD: Jenkins, GitLab CI, CircleCI tokens
- Deployment: Vercel, Netlify, Heroku tokens
- Monitoring: PagerDuty, Datadog, New Relic
Enterprise & Business
- CRM: Salesforce, HubSpot tokens
- E-commerce: Shopify, WooCommerce keys
- Business Tools: Slack, Microsoft Teams tokens
- Analytics: Google Analytics, Adobe Analytics
Generic Patterns
- UUID v4, Generic IDs
- Base64 encoded secrets
- Hex-encoded keys (32, 64, 128 bit)
- Custom secret patterns in configuration files
URL & Reference Patterns
- URLs with embedded tokens
- Database connection URIs
- API endpoints with keys
- Webhook URLs with secrets
Version Control & Code
- Git repository URLs with tokens
- Package manager tokens (npm, PyPI)
- Container registry credentials
- Code hosting platform tokens
Pattern Accuracy Levels
Control detection sensitivity to balance detection coverage against false positives:
High Accuracy
- Most specific patterns with minimal false positives
- Examples: AWS access keys with AKIA prefix, specific API key formats
- Best for production environments
Medium Accuracy (Default)
- Balanced detection with reasonable false positive rates
- Examples: Generic API keys, common secret patterns
- Good for most use cases
Low Accuracy
- Broadest detection, may have higher false positive rates
- Examples: Generic IDs, loose pattern matching
- Useful for comprehensive scanning
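// Production: most specific patterns only
const prodResult = mask(text, { matchAccuracy: 'high' });
// Development: balanced detection (the default)
const devResult = mask(text, { matchAccuracy: 'medium' });
// Comprehensive scanning: broadest detection
const scanResult = mask(text, { matchAccuracy: 'low' });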
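One way to picture the three levels: each pattern carries its own accuracy tag, and the requested level acts as a threshold. The sketch below assumes that `'high'` enables only high-accuracy patterns, `'medium'` adds medium ones, and `'low'` enables everything. That reading fits the descriptions above but is an assumption; `patternsFor`, `TaggedPattern`, and the sample pattern names are hypothetical.

```typescript
type Accuracy = 'high' | 'medium' | 'low';

// Hypothetical pattern record carrying only the fields this sketch needs.
interface TaggedPattern {
  name: string;
  matchAccuracy: Accuracy;
}

// A higher rank means a more specific pattern.
const RANK: Record<Accuracy, number> = { low: 0, medium: 1, high: 2 };

// Keep every pattern at least as specific as the requested level.
function patternsFor(level: Accuracy, patterns: TaggedPattern[]): TaggedPattern[] {
  return patterns.filter(p => RANK[p.matchAccuracy] >= RANK[level]);
}

const tagged: TaggedPattern[] = [
  { name: 'awsAccessKey', matchAccuracy: 'high' },
  { name: 'genericApiKey', matchAccuracy: 'medium' },
  { name: 'genericId', matchAccuracy: 'low' },
];

console.log(patternsFor('high', tagged).map(p => p.name));   // [ 'awsAccessKey' ]
console.log(patternsFor('medium', tagged).map(p => p.name)); // [ 'awsAccessKey', 'genericApiKey' ]
console.log(patternsFor('low', tagged).length);              // 3
```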
TypeScript Support
Full TypeScript support with complete type definitions:
import { mask, hasSensitiveContent, getPatternMatches } from 'sensitive-data-masker';
import type { MaskResult, MaskingOptions } from 'sensitive-data-masker';
const options: MaskingOptions = {
  maskChar: '#',
  matchAccuracy: 'high',
  excludePatterns: ['uuid']
};

const result: MaskResult = mask(text, options);
Real-World Examples
Log File Sanitization
import { mask } from 'sensitive-data-masker';
const logEntry = `
[2024-01-15 10:30:45] INFO User john@company.com logged in
[2024-01-15 10:31:12] DEBUG API call with key sk-1234567890abcdef
[2024-01-15 10:31:15] ERROR Payment failed for card 4111-1111-1111-1111
[2024-01-15 10:31:20] WARN SSN in request: 123-45-6789
`;
const sanitized = mask(logEntry);
console.log(sanitized.output);
console.log(sanitized.found);
Configuration File Security
import { mask } from 'sensitive-data-masker';

const config = `
DATABASE_URL=postgresql://user:password123@localhost:5432/db
OPENAI_API_KEY=sk-1234567890abcdef1234567890abcdef
STRIPE_SECRET_KEY=sk_live_abcdef123456
ADMIN_EMAIL=admin@company.com
JWT_SECRET=super-secret-key-123
`;
const result = mask(config);
console.log(result.output);
Multi-Environment Setup
import { mask } from 'sensitive-data-masker';

// Production: high accuracy
const prodResult = mask(sensitiveData, { matchAccuracy: 'high' });

// Development: medium accuracy, with emails left unmasked
const devResult = mask(sensitiveData, {
  matchAccuracy: 'medium',
  excludePatterns: ['email']
});

// Testing: only financial and identity patterns
const testResult = mask(sensitiveData, {
  onlyPatterns: ['creditCard', 'bankAccount', 'ssn'],
  matchAccuracy: 'high'
});
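Rather than repeating these option objects at each call site, the per-environment choice can live in one function. This is a sketch only: `optionsForEnv` and `SketchOptions` are hypothetical names, the interface simply mirrors the documented `MaskingOptions` fields so the example stands alone, and the environment names are assumptions.

```typescript
// Mirrors the documented MaskingOptions fields; redeclared so the sketch is self-contained.
interface SketchOptions {
  matchAccuracy?: 'high' | 'medium' | 'low';
  excludePatterns?: string[];
  onlyPatterns?: string[];
}

// Hypothetical helper: map an environment name to masking options.
function optionsForEnv(env: string): SketchOptions {
  switch (env) {
    case 'production':
      return { matchAccuracy: 'high' };
    case 'test':
      return { matchAccuracy: 'high', onlyPatterns: ['creditCard', 'bankAccount', 'ssn'] };
    default:
      // Development and anything unrecognized: balanced detection, emails left visible.
      return { matchAccuracy: 'medium', excludePatterns: ['email'] };
  }
}

console.log(optionsForEnv('production'));
```

In a Node.js service the argument would typically come from an environment variable such as `NODE_ENV`, and the result would be passed as the second argument to `mask`.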
Data Pipeline Processing
import { hasSensitiveContent, mask } from 'sensitive-data-masker';

// Share one options object so the pre-check and the masking step agree
const options = { matchAccuracy: 'high' as const };

function processBatch(records: string[]) {
  return records.map(record => {
    // Cheap pre-check lets clean records skip the masking step
    if (hasSensitiveContent(record, options)) {
      const masked = mask(record, options);
      return {
        data: masked.output,
        hadSensitiveData: true,
        patternsFound: Object.keys(masked.found)
      };
    }
    return { data: record, hadSensitiveData: false, patternsFound: [] as string[] };
  });
}
Performance Considerations
- Optimized Regex Engine: Patterns are compiled and cached on first use
- Single-Pass Processing: Efficient string traversal with minimal overhead
- Memory Efficient: No unnecessary string copies or allocations
- Pattern Filtering: Use onlyPatterns when you know which types to look for
- Accuracy Optimization: Higher accuracy modes are faster due to more specific patterns
// Narrowest: a single pattern type
const emailsOnly = mask(text, { onlyPatterns: ['email'] });
// Specific patterns only
const highAccuracy = mask(text, { matchAccuracy: 'high' });
// Broadest scan
const comprehensive = mask(text, { matchAccuracy: 'low' });
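"Compiled and cached on first use" can be illustrated with a small lazy cache. The sketch below is a generic version of that technique, not the library's internals: `getRegex`, `patternSources`, and the two simplified regex sources are assumptions.

```typescript
// Pattern sources kept as strings so nothing is compiled until requested.
const patternSources = new Map<string, string>([
  ['email', '[\\w.+-]+@[\\w-]+\\.[\\w.-]+'],
  ['ssn', '\\b\\d{3}-\\d{2}-\\d{4}\\b'],
]);

const compiled = new Map<string, RegExp>();
let compileCount = 0; // tracked only to make the caching visible

function getRegex(name: string): RegExp {
  let re = compiled.get(name);
  if (re === undefined) {
    const source = patternSources.get(name);
    if (source === undefined) throw new Error(`Unknown pattern: ${name}`);
    re = new RegExp(source, 'g');
    compiled.set(name, re);
    compileCount++;
  }
  // Global regexes are stateful, so reset before handing out the shared instance.
  re.lastIndex = 0;
  return re;
}

const a = getRegex('email');
const b = getRegex('email');
console.log(a === b, compileCount); // true 1
```

Resetting `lastIndex` matters because a cached regex with the `g` flag remembers where its previous search ended.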
Security Best Practices
- Always mask before logging: Ensure sensitive data is masked before writing to logs
- Use appropriate accuracy: Higher accuracy for production, lower for development/testing
- Store results securely: The matches array contains original sensitive values
- Regular updates: Keep the library updated for new pattern definitions
- Test your patterns: Verify masking works correctly with your specific data formats
- Environment-specific config: Use different settings for dev/staging/production
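The first practice, masking before logging, is easiest to enforce when the logger itself does the masking. Here is a minimal sketch: `createSafeLogger`, `MaskFn`, and `toyMask` are hypothetical. The toy masker exists only so the example runs on its own; with this library, the masking function could be `s => mask(s).output`.

```typescript
// Any function that turns raw text into masked text.
type MaskFn = (input: string) => string;

// Hypothetical wrapper: every message passes through maskFn before reaching the sink.
function createSafeLogger(maskFn: MaskFn, sink: (line: string) => void = console.log) {
  return (message: string): void => {
    sink(maskFn(message));
  };
}

// Toy masker for the demo only: hides anything shaped like a US SSN.
const toyMask: MaskFn = s => s.replace(/\b\d{3}-\d{2}-\d{4}\b/g, '***-**-****');

const lines: string[] = [];
const log = createSafeLogger(toyMask, line => lines.push(line));
log('SSN in request: 123-45-6789');
console.log(lines[0]); // SSN in request: ***-**-****
```

Because callers only ever receive the wrapped function, there is no code path that writes an unmasked message to the sink.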
Development
Prerequisites
- Node.js >= 18.12.0
- Yarn or npm
Setup
git clone https://github.com/bgauryy/sensitive-data-mask.git
cd sensitive-data-mask
yarn install
Commands
yarn build
yarn dev
yarn lint
yarn test
yarn typecheck
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Adding New Patterns
- Choose the appropriate category file in src/regexes/
- Add your pattern following the existing structure:
  {
    name: 'myPattern',
    regex: /your-regex-here/gi,
    description: 'Description of what this detects',
    matchAccuracy: 'medium'
  }
- Run tests to ensure no regressions
- Submit a PR with a clear description
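Before opening a PR, it can help to check a new pattern against a few positive and negative samples. The sketch below is one way to do that. `checkPattern` is a hypothetical helper (the repository's own test setup may look different), the `myk_` key format is made up, and requiring the `g` flag is an assumption based on the template above.

```typescript
// Same fields as the pattern structure shown above.
interface PatternDefinition {
  name: string;
  regex: RegExp;
  description: string;
  matchAccuracy: 'high' | 'medium' | 'low';
}

// Returns a list of problems; an empty list means the definition passed.
function checkPattern(def: PatternDefinition, shouldMatch: string[], shouldNotMatch: string[]): string[] {
  const problems: string[] = [];
  if (!def.regex.global) problems.push('regex is missing the g flag');
  const matches = (sample: string): boolean => {
    def.regex.lastIndex = 0; // global regexes remember their position
    return def.regex.test(sample);
  };
  for (const sample of shouldMatch) {
    if (!matches(sample)) problems.push(`missed: ${sample}`);
  }
  for (const sample of shouldNotMatch) {
    if (matches(sample)) problems.push(`false positive: ${sample}`);
  }
  return problems;
}

// Made-up key format for illustration: "myk_" followed by 16 hex characters.
const myPattern: PatternDefinition = {
  name: 'myPattern',
  regex: /\bmyk_[0-9a-f]{16}\b/gi,
  description: 'Example vendor key (made-up format)',
  matchAccuracy: 'high',
};

console.log(checkPattern(myPattern, ['myk_0123456789abcdef'], ['myk_123', 'hello world']));
```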
License
MIT © guybary
Security
If you discover a security vulnerability, please email guybary@wix.com instead of using the issue tracker.
Made with ❤️ for developers who care about data security