
layoutlens
Traditional UI testing is painful: complex selectors, brittle assertions, and elaborate setup.
LayoutLens lets you test UIs the way humans see them, using natural language and domain-expert knowledge:
# Basic analysis
result = await lens.analyze("https://example.com", "Is the navigation user-friendly?")
# Expert-powered analysis
result = await lens.audit_accessibility("https://example.com", compliance_level="AA")
# Returns: "WCAG AA compliant with 4.7:1 contrast ratio. Focus indicators visible..."
Instead of writing complex selectors and assertions, just ask questions like "Is the navigation user-friendly?" or "Is the header properly aligned?".
Get expert-level insights from built-in domain knowledge in accessibility, conversion optimization, mobile UX, and more.
✅ 95.2% accuracy on real-world UI testing benchmarks
pip install layoutlens
playwright install chromium # For screenshot capture
from layoutlens import LayoutLens
# Initialize (uses OPENAI_API_KEY env var)
lens = LayoutLens()
# Test any website or local HTML
result = await lens.analyze("https://your-site.com", "Is the header properly aligned?")
print(f"Answer: {result.answer}")
print(f"Confidence: {result.confidence:.1%}")
That's it! No selectors, no complex setup, just natural language questions.
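The snippets in this README use top-level await, which works in a notebook or IPython session. To run the same quick start from a plain Python script, wrap the calls in asyncio.run(); here is a minimal sketch using only the API shown above:
# Running the quick start from a regular script (top-level await only works
# in notebooks/IPython); everything here comes from the snippets above.
import asyncio
from layoutlens import LayoutLens

async def main():
    lens = LayoutLens()  # Uses the OPENAI_API_KEY environment variable
    result = await lens.analyze("https://your-site.com", "Is the header properly aligned?")
    print(f"Answer: {result.answer}")
    print(f"Confidence: {result.confidence:.1%}")

asyncio.run(main())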
Test single pages with custom questions:
# Test local HTML files
result = await lens.analyze("checkout.html", "Is the payment form user-friendly?")
# Test with expert context
from layoutlens.prompts import Instructions, UserContext
instructions = Instructions(
    expert_persona="conversion_expert",
    user_context=UserContext(
        business_goals=["reduce_cart_abandonment"],
        target_audience="mobile_shoppers"
    )
)
result = await lens.analyze(
    "checkout.html",
    "How can we optimize this checkout flow?",
    instructions=instructions
)
Perfect for A/B testing and redesign validation:
result = await lens.compare(
    ["old-design.html", "new-design.html"],
    "Which design is more accessible?"
)
print(f"Winner: {result.answer}")
Domain expert knowledge with one line of code:
# Professional accessibility audit (WCAG expert)
result = await lens.audit_accessibility("product-page.html", compliance_level="AA")
# Conversion rate optimization (CRO expert)
result = await lens.optimize_conversions("landing.html",
    business_goals=["increase_signups"], industry="saas")
# Mobile UX analysis (Mobile expert)
result = await lens.analyze_mobile_ux("app.html", performance_focus=True)
# E-commerce audit (Retail expert)
result = await lens.audit_ecommerce("checkout.html", page_type="checkout")
# Legacy methods still work
result = await lens.check_accessibility("product-page.html") # Backward compatible
Test multiple pages efficiently:
results = await lens.analyze(
    sources=["home.html", "about.html", "contact.html"],
    queries=["Is it accessible?", "Is it mobile-friendly?"]
)
# Processes 6 tests in parallel

# Async for maximum throughput
result = await lens.analyze(
    sources=["page1.html", "page2.html", "page3.html"],
    queries=["Is it accessible?"],
    max_concurrent=5
)
All results provide clean, typed JSON for automation:
result = await lens.analyze("page.html", "Is it accessible?")
# Export to clean JSON
json_data = result.to_json() # Returns typed JSON string
print(json_data)
# {
#   "source": "page.html",
#   "query": "Is it accessible?",
#   "answer": "Yes, the page follows accessibility standards...",
#   "confidence": 0.85,
#   "reasoning": "The page has proper heading structure...",
#   "screenshot_path": "/path/to/screenshot.png",
#   "viewport": "desktop",
#   "timestamp": "2024-01-15 10:30:00",
#   "execution_time": 2.3,
#   "metadata": {}
# }
# Type-safe structured access
from layoutlens.types import AnalysisResultJSON
import json
data: AnalysisResultJSON = json.loads(result.to_json())
confidence = data["confidence"] # Fully typed: float
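Because the export is plain JSON, results drop straight into reports or CI gates. A small sketch follows; the reports/ path and the pass criteria are arbitrary examples, not LayoutLens conventions:
# Sketch: persist a result and gate a pipeline on its structured fields.
# The output path and thresholds below are arbitrary examples.
import json
from pathlib import Path

result = await lens.analyze("page.html", "Is it accessible?")
data = json.loads(result.to_json())

Path("reports").mkdir(exist_ok=True)
Path("reports/accessibility.json").write_text(json.dumps(data, indent=2))

if data["confidence"] < 0.8 or not data["answer"].lower().startswith("yes"):
    raise SystemExit("UI check failed or confidence too low")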
Choose from 6 built-in domain experts with specialized knowledge:
# Available experts: accessibility_expert, conversion_expert, mobile_expert,
# ecommerce_expert, healthcare_expert, finance_expert
# Use any expert with custom analysis
result = await lens.analyze_with_expert(
    source="healthcare-portal.html",
    query="How can we improve patient experience?",
    expert_persona="healthcare_expert",
    focus_areas=["patient_privacy", "health_literacy"],
    user_context={
        "target_audience": "elderly_patients",
        "accessibility_needs": ["large_text", "simple_navigation"],
        "industry": "healthcare"
    }
)
# Expert comparison analysis
result = await lens.compare_with_expert(
    sources=["old-design.html", "new-design.html"],
    query="Which design converts better?",
    expert_persona="conversion_expert",
    focus_areas=["cta_prominence", "trust_signals"]
)
# Analyze a single page
layoutlens https://example.com "Is this accessible?"
# Analyze local files
layoutlens page.html "Is the design professional?"
# Compare two designs
layoutlens page1.html page2.html --compare
# Analyze with different viewport
layoutlens site.com "Is it mobile-friendly?" --viewport mobile
# JSON output for automation
layoutlens page.html "Is it accessible?" --output json
- name: Visual UI Test
  run: |
    pip install layoutlens
    playwright install chromium
    layoutlens ${{ env.PREVIEW_URL }} "Is it accessible and mobile-friendly?"
import pytest
from layoutlens import LayoutLens

@pytest.mark.asyncio
async def test_homepage_quality():
    lens = LayoutLens()
    result = await lens.analyze("homepage.html", "Is this production-ready?")
    assert result.confidence > 0.8
    assert "yes" in result.answer.lower()
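The same pattern scales to multiple pages with parametrized tests. A sketch that relies only on the analyze() call shown above; the page/question pairs are placeholders:
import pytest
from layoutlens import LayoutLens

# Placeholder page/question pairs; swap in your own files or URLs.
CASES = [
    ("homepage.html", "Is it accessible?"),
    ("checkout.html", "Is the payment form user-friendly?"),
]

@pytest.mark.asyncio
@pytest.mark.parametrize("source,question", CASES)
async def test_page_quality(source, question):
    lens = LayoutLens()
    result = await lens.analyze(source, question)
    assert result.confidence > 0.8
    assert "yes" in result.answer.lower()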
LayoutLens includes a comprehensive benchmarking system to validate AI performance:
# Run LayoutLens against test data
python benchmarks/run_benchmark.py --api-key sk-your-key
# With custom settings
python benchmarks/run_benchmark.py \
  --api-key sk-your-key \
  --output benchmarks/my_results \
  --no-batch \
  --filename custom_results.json

# Evaluate results against ground truth
python benchmarks/evaluation/evaluator.py \
  --answer-keys benchmarks/answer_keys \
  --results benchmarks/layoutlens_output \
  --output evaluation_report.json
The benchmark runner outputs clean JSON for analysis:
# Example benchmark result structure
{
  "benchmark_info": {
    "total_tests": 150,
    "successful_tests": 143,
    "failed_tests": 7,
    "success_rate": 0.953,
    "batch_processing_used": true,
    "model_used": "gpt-4o-mini"
  },
  "results": [
    {
      "html_file": "good_contrast.html",
      "query": "Is this page accessible?",
      "answer": "Yes, the page has good color contrast...",
      "confidence": 0.89,
      "reasoning": "WCAG guidelines are followed...",
      "success": true,
      "error": null,
      "metadata": {"category": "accessibility"}
    }
  ]
}
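Since the runner writes plain JSON, a few lines of Python can summarize a results file shaped like the example above. In this sketch the file path is a placeholder for wherever your run wrote its output:
# Sketch: summarize a benchmark results file with the structure shown above.
# "benchmark_results.json" is a placeholder path; point it at your runner output.
import json
from collections import Counter

with open("benchmark_results.json") as f:
    report = json.load(f)

info = report["benchmark_info"]
print(f"Success rate: {info['success_rate']:.1%} ({info['successful_tests']}/{info['total_tests']})")

# Per-category pass counts, using each result's metadata.category field
by_category = Counter(
    r["metadata"].get("category", "uncategorized")
    for r in report["results"]
    if r["success"]
)
print(by_category)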
Create your own test data and answer keys:
# Use the async API for custom benchmark workflows
from layoutlens import LayoutLens
async def run_custom_benchmark():
    lens = LayoutLens()
    test_cases = [
        {"source": "page1.html", "query": "Is it accessible?"},
        {"source": "page2.html", "query": "Is it mobile-friendly?"}
    ]
    results = []
    for case in test_cases:
        result = await lens.analyze(case["source"], case["query"])
        results.append({
            "test": case,
            "result": result.to_json(),  # Clean JSON output
            "passed": result.confidence > 0.7
        })
    return results
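To execute that workflow from a script and keep the output for later inspection, something like this sketch works; the output file name is arbitrary:
import asyncio
import json

# Run the custom benchmark defined above and save its output.
results = asyncio.run(run_custom_benchmark())

passed = sum(1 for r in results if r["passed"])
print(f"{passed}/{len(results)} checks passed")

with open("custom_benchmark_results.json", "w") as f:
    json.dump(results, f, indent=2)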
Simple configuration options:
# Via environment
export OPENAI_API_KEY="sk-..."
# Via code
lens = LayoutLens(
    api_key="sk-...",
    model="gpt-4o-mini",   # or "gpt-4o" for higher accuracy
    cache_enabled=True,    # Reduce API costs
    cache_type="memory",   # "memory" or "file"
)
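For CI, a small sketch of environment-driven configuration using only the constructor options shown above; the LAYOUTLENS_MODEL variable name is just an example, not a recognized setting:
# Sketch: choose the model per environment without code changes.
# LAYOUTLENS_MODEL is a hypothetical variable name, read by this snippet only.
import os
from layoutlens import LayoutLens

lens = LayoutLens(
    model=os.environ.get("LAYOUTLENS_MODEL", "gpt-4o-mini"),
    cache_enabled=True,
    cache_type="file",  # "memory" or "file", per the options above
)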
Making UI testing as simple as asking "Does this look right?"