pipesync
Pull data from any API into any destination. Zero platform-specific code.
pipesync is a dynamic sync engine for Pica-connected platforms. You give it a JSON mapping config, and it handles pagination, field mapping, deduplication, and incremental sync. The AI writes the config. The engine does the work.
Install
npm install @withone/pipesync
Quick Start
npx pipesync init --pica-key <your-key>
npx pipesync add gmail-emails --config '{
"platform": "gmail",
"connectionKey": "live::gmail::default::abc123",
"actionId": "conn_mod_def::xyz::list-messages",
"request": { "method": "GET", "path": "/users/me/messages", "queryParams": { "maxResults": 100 } },
"pagination": { "type": "cursor", "requestParam": "pageToken", "responseField": "nextPageToken", "itemsField": "messages" },
"record": { "type": "email", "mapping": { "subject": "payload.headers.Subject", "from": "payload.headers.From", "snippet": "snippet" }, "tags": ["gmail"] },
"externalRef": { "system": "gmail", "idField": "id" }
}'
npx pipesync pull gmail-emails
npx pipesync pull gmail-emails | jq '.data.subject'
npx pipesync pull gmail-emails > emails.jsonl
npm install @withone/mem
npx pipesync pull gmail-emails --output mem
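The config passed to `pipesync add` can be read as a typed shape. The sketch below is inferred only from the Quick Start example: the names `SyncConfig` and `PaginationType` are hypothetical, and which fields are optional is a guess, not the package's published schema.

```typescript
// Sketch only: field names are copied from the Quick Start config.
// The type names and the optional markers are assumptions.
type PaginationType =
  | "cursor" | "offset" | "sync-token" | "link-header" | "page-number" | "none";

interface SyncConfig {
  platform: string;
  connectionKey: string;
  actionId: string;
  request: {
    method: string;
    path: string;
    queryParams?: Record<string, string | number>;
  };
  pagination: {
    type: PaginationType;
    requestParam?: string;
    responseField?: string;
    itemsField?: string;
  };
  record: {
    type: string;
    mapping: Record<string, string>; // output field -> dot-path in the API item
    tags?: string[];
  };
  externalRef: { system: string; idField: string };
}

const gmailEmails: SyncConfig = {
  platform: "gmail",
  connectionKey: "live::gmail::default::abc123",
  actionId: "conn_mod_def::xyz::list-messages",
  request: { method: "GET", path: "/users/me/messages", queryParams: { maxResults: 100 } },
  pagination: {
    type: "cursor",
    requestParam: "pageToken",
    responseField: "nextPageToken",
    itemsField: "messages",
  },
  record: {
    type: "email",
    mapping: { subject: "payload.headers.Subject", from: "payload.headers.From", snippet: "snippet" },
    tags: ["gmail"],
  },
  externalRef: { system: "gmail", idField: "id" },
};

// Serialize to a JSON string of the kind given to --config.
const configJson = JSON.stringify(gmailEmails);
console.log(configJson);
```

Building the config as a typed object and serializing it avoids hand-quoting JSON in the shell.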
How It Works
┌──────────────────────────┐
│       Your AI Tool       │
│  (Claude, Cursor, etc.)  │
│                          │
│  1. Pica MCP: learn API  │
│  2. Generate JSON config │
│  3. Run pipesync CLI     │
└────────────┬─────────────┘
             │
             ▼
┌──────────────────────────┐
│         pipesync         │
│                          │
│  Pull → Map → Dedup →    │
│  Output (stdout or mem)  │
└──────┬───────────┬───────┘
       │           │
       ▼           ▼
 ┌───────────┐ ┌────────┐
 │ Pica API  │ │ stdout │
 │(200+ APIs)│ │ or mem │
 └───────────┘ └────────┘
- AI generates a mapping config using Pica MCP to understand the platform's API shape
- pipesync pulls data through Pica's passthrough API, handling pagination automatically
- Field mapping extracts what you need using dot-path notation (payload.headers.Subject)
- Deduplication via keys or external references prevents duplicates on re-sync
- Output goes to stdout (NDJSON) by default, or @withone/mem for searchable storage
No AI runs at sync time. The engine is fully deterministic.
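The map and dedup steps can be sketched in a few lines. This is an illustration of the technique, not pipesync's actual code: `getPath`, `mapFields`, and `dedupe` are hypothetical helpers, and the sample items use a simplified shape (headers as a plain object) rather than a real Gmail response.

```typescript
// Sketch only: helper names and sample data are assumptions for illustration.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Resolve a dot-path such as "payload.headers.Subject" against a nested object.
// Returns undefined as soon as a segment cannot be followed.
function getPath(source: Json, path: string): Json | undefined {
  let current: Json | undefined = source;
  for (const segment of path.split(".")) {
    if (current === null || typeof current !== "object" || Array.isArray(current)) {
      return undefined;
    }
    current = current[segment];
  }
  return current;
}

// Build the output record: each output field is read from its dot-path.
// Fields whose path does not resolve are omitted.
function mapFields(item: Json, mapping: Record<string, string>): Record<string, Json> {
  const data: Record<string, Json> = {};
  for (const [field, path] of Object.entries(mapping)) {
    const value = getPath(item, path);
    if (value !== undefined) data[field] = value;
  }
  return data;
}

// Keep only items whose key has not been seen; record new keys as we go.
function dedupe<T>(items: T[], keyOf: (item: T) => string, seen: Set<string>): T[] {
  const fresh: T[] = [];
  for (const item of items) {
    const key = keyOf(item);
    if (seen.has(key)) continue;
    seen.add(key);
    fresh.push(item);
  }
  return fresh;
}

const rawItems: Json[] = [
  { id: "m1", snippet: "Hello", payload: { headers: { Subject: "Welcome" } } },
  { id: "m1", snippet: "Hello", payload: { headers: { Subject: "Welcome" } } },
  { id: "m2", snippet: "Invoice attached" },
];

// External reference key: system name plus the value of the configured idField.
const seenRefs = new Set<string>();
const uniqueItems = dedupe(rawItems, (item) => `gmail:${String(getPath(item, "id"))}`, seenRefs);
const mappedRecords = uniqueItems.map((item) =>
  mapFields(item, { subject: "payload.headers.Subject", snippet: "snippet" }),
);
console.log(mappedRecords);
```

Both steps are pure functions of the config and the API response, which is what makes a sync run repeatable. Keeping the set of seen keys between runs is one way a re-sync could skip records it has already emitted.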
CLI
pipesync init [--pica-key <key>]
pipesync add <name> --config '<json>'
pipesync pull [name] [--output mem]
pipesync pull --full
pipesync status
pipesync list
pipesync show <name>
pipesync watch [--interval 30m]
pipesync remove <name>
pipesync update <name> --config '<json>'
The AI picks the right pagination strategy based on the platform's API:
| Type | How it works | Example platforms |
| --- | --- | --- |
| cursor | Token-based paging | Gmail, Notion, Slack |
| offset | Offset + limit | Attio, HubSpot |
| sync-token | Delta sync via token | Google Calendar |
| link-header | Parse Link header | GitHub |
| page-number | Increment page number | Many REST APIs |
| none | Single request | Small endpoints |
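As an illustration of the cursor strategy, here is a minimal loop driven by the three pagination fields from the Quick Start config (`requestParam`, `responseField`, `itemsField`). The function `pullAllPages`, the `FetchPage` callback, and the fake pages are hypothetical stand-ins. The real engine goes through Pica's passthrough API, which is not modeled here, and the other strategies are not shown.

```typescript
// Sketch only: pullAllPages, FetchPage, and CursorPagination are hypothetical names.
type Page = Record<string, unknown>;
type Query = Record<string, string | number>;
type FetchPage = (query: Query) => Promise<Page>;

interface CursorPagination {
  requestParam: string;  // query parameter that carries the cursor, e.g. "pageToken"
  responseField: string; // response field with the next cursor, e.g. "nextPageToken"
  itemsField: string;    // response field with the array of items, e.g. "messages"
}

// Request pages until the response no longer carries a next cursor.
async function pullAllPages(
  fetchPage: FetchPage,
  baseQuery: Query,
  pagination: CursorPagination,
): Promise<unknown[]> {
  const items: unknown[] = [];
  let cursor: string | undefined = undefined;
  do {
    const query: Query = { ...baseQuery };
    if (cursor !== undefined) query[pagination.requestParam] = cursor;
    const page = await fetchPage(query);
    const batch = page[pagination.itemsField];
    if (Array.isArray(batch)) items.push(...batch);
    const next = page[pagination.responseField];
    cursor = typeof next === "string" && next.length > 0 ? next : undefined;
  } while (cursor !== undefined);
  return items;
}

// Fake two-page API standing in for a real endpoint.
const fakePages: Record<string, Page> = {
  first: { messages: [{ id: "m1" }, { id: "m2" }], nextPageToken: "t2" },
  t2: { messages: [{ id: "m3" }] },
};
const fakeFetch: FetchPage = async (query) => {
  const token = query["pageToken"];
  return fakePages[token === undefined ? "first" : String(token)] ?? {};
};

const pulled = pullAllPages(fakeFetch, { maxResults: 100 }, {
  requestParam: "pageToken",
  responseField: "nextPageToken",
  itemsField: "messages",
});
pulled.then((items) => console.log(items.length)); // prints 3
```

Because the loop reads only field names from the config, the same code can serve any platform whose API pages by token.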
Output Adapters
stdout (default) - NDJSON lines, pipe anywhere:
pipesync pull contacts | jq '.data.email'
pipesync pull contacts > contacts.jsonl
pipesync pull contacts | your-custom-script
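A custom script on the receiving end of the pipe only needs to parse one JSON document per line. The sketch below assumes each record carries its mapped fields under `data`, as the `jq '.data.email'` example suggests. `parseNdjson` is a hypothetical helper and the sample lines are made up.

```typescript
// Sketch only: the record shape ({ type, data }) is inferred from the jq examples.
// Parse NDJSON text: one JSON document per non-empty line.
function parseNdjson(text: string): unknown[] {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as unknown);
}

const sampleOutput =
  '{"type":"contact","data":{"email":"a@example.com"}}\n' +
  '{"type":"contact","data":{"email":"b@example.com"}}\n';

const contactEmails = parseNdjson(sampleOutput).map(
  (record) => (record as { data: { email: string } }).data.email,
);
console.log(contactEmails); // prints [ 'a@example.com', 'b@example.com' ]
```

In a real script the text would come from standard input rather than a string literal.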
@withone/mem (optional) - Searchable database with hybrid search:
npm install @withone/mem
pipesync pull contacts --output mem
npx mem search "john" -t contact
AI-Assisted Setup
Works best with an AI coding tool that has Pica MCP connected:
User: "Sync my Gmail emails"
AI:
1. Discovers Gmail actions via Pica MCP
2. Reads API response shape
3. Generates mapping config
4. Runs: pipesync add gmail-emails --config '...'
5. Runs: pipesync pull gmail-emails
6. Done: "Synced 1,247 emails"
Claude Code skills are included for a streamlined experience:
cp -r node_modules/@withone/pipesync/skills/* .claude/skills/
Then use /sync-setup and /sync-pull.
Docs
License
MIT