pi-web-utils

Configurable web search, markdown-first webpage fetching, GitHub local repo search tools for pi coding agent

Version: 0.1.1 (latest)
Source: npm
Weekly downloads: 19
Maintainers: 1

pi-web-utils

Configurable web tooling extension for pi-coding-agent.

It adds four tools:

  • web_search
  • fetch_webpage
  • clone_github_repo
  • search_local_repo

What it does

  • Search with configurable engines (google, duckduckgo, searxng, or custom) and ordered fallback.
  • Append engine-specific query params and headers from config and per-call overrides.
  • If a search engine returns HTML, optionally convert that raw HTML with the same formatter used by webpage fetch (markdown or structured json).
  • Fetch webpages as markdown by default.
  • Try markdown.new (https://markdown.new/<url>) first, then fall back to local HTML -> markdown/json conversion.
  • Clone GitHub repos from root, tree, and blob URLs with a cached local path.
  • Search cloned repos (or any local folder) with rg (fallback to grep).

Install

pi install npm:pi-web-utils

Alternatively, package or publish it yourself and install it through npm or git, as with other pi packages.

Tool quick examples

// 1) Search with fallback chain from config
web_search({ query: "TypeScript project architecture patterns" })

// 2) Force a specific engine but still allow fallback
web_search({
  query: "SearXNG self-host setup",
  engineId: "searxng",
  allowFallback: true
})

// 3) Pass per-call query params to engine URL
web_search({
  query: "React suspense",
  engineId: "google",
  extraParams: { hl: "en", num: "10" }
})

// 4) Format raw HTML search response as markdown
web_search({
  query: "site:github.com pi-coding-agent extensions",
  engineId: "duckduckgo",
  rawHtmlFormat: "markdown"
})

// 5) Fetch webpage as markdown (markdown.new first)
fetch_webpage({ url: "https://docs.example.com/guide" })

// 6) Fetch webpage as structured JSON
fetch_webpage({
  url: "https://docs.example.com/guide",
  output: "json"
})

// 7) Clone GitHub repo URL
clone_github_repo({ url: "https://github.com/owner/repo" })

// 8) Clone tree/blob URLs too
clone_github_repo({ url: "https://github.com/owner/repo/tree/main/src" })
clone_github_repo({ url: "https://github.com/owner/repo/blob/main/README.md" })

// 9) Search latest cloned repo
search_local_repo({ query: "registerTool" })

// 10) Search a specific repo key
search_local_repo({
  repo: "owner/repo@main",
  query: "fetchWithTimeout",
  glob: "*.ts"
})

Configuration

Configuration file path (default):

~/.pi/web-tools.json

You can override the path with an environment variable:

PI_WEB_TOOLS_CONFIG=/path/to/config.json
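The lookup described above can be sketched as a small pure function. This is an illustration only, not the extension's actual code: `resolveConfigPath` is a hypothetical name, and treating an empty `PI_WEB_TOOLS_CONFIG` value as "not set" is an assumption.

```typescript
// Hypothetical helper: env and homeDir are passed in so the function stays pure
// and easy to test. In real use they would come from process.env and os.homedir().
function resolveConfigPath(
  env: Record<string, string | undefined>,
  homeDir: string,
): string {
  // Assumption: a blank override is ignored and the default path is used.
  const override = env["PI_WEB_TOOLS_CONFIG"]?.trim();
  if (override) return override;

  // Default location from this README: ~/.pi/web-tools.json
  const home = homeDir.replace(/\/+$/, ""); // drop trailing slashes
  return `${home}/.pi/web-tools.json`;
}
```

Passing the environment in as an argument keeps the sketch free of Node-specific imports; a caller would typically supply `process.env` and the user's home directory.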

Example config

{
  "search": {
    "includeBuiltins": true,
    "engines": [
      {
        "id": "searxng",
        "kind": "searxng",
        "baseUrl": "https://searx.example.com/search",
        "queryParams": {
          "format": "json",
          "language": "en-US"
        },
        "headers": {
          "x-api-key": "optional"
        },
        "timeoutMs": 20000
      },
      {
        "id": "duckduckgo",
        "enabled": true
      },
      {
        "id": "google",
        "queryParams": {
          "hl": "en"
        }
      },
      {
        "id": "my-custom-engine",
        "kind": "custom",
        "baseUrl": "https://search.example.com/query",
        "queryParam": "q",
        "queryParams": {
          "api": "v2"
        },
        "responseFormat": "json"
      }
    ],
    "fallbackOrder": ["searxng", "duckduckgo", "google", "my-custom-engine"],
    "maxResults": 8,
    "timeoutMs": 15000
  },
  "fetch": {
    "timeoutMs": 30000,
    "maxBodyChars": 120000,
    "markdownNew": {
      "enabled": true,
      "baseUrl": "https://markdown.new/",
      "timeoutMs": 20000
    }
  },
  "github": {
    "enabled": true,
    "clonePath": "/tmp/pi-web-utils/repos",
    "cloneTimeoutMs": 30000,
    "maxRepoSizeMB": 350,
    "maxTreeEntries": 200,
    "maxInlineFileChars": 100000
  },
  "localSearch": {
    "defaultMaxMatches": 80,
    "maxMatches": 300,
    "previewChars": 220,
    "timeoutMs": 15000
  }
}
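To make the engine fields concrete, here is a rough sketch of how `baseUrl`, `queryParam`, `queryParams`, and per-call `extraParams` might combine into a request URL. The function name `buildSearchUrl`, the `"q"` default for `queryParam`, and the rule that per-call params override config params are assumptions for illustration; the extension may merge them differently.

```typescript
// Subset of the engine config shown above (only the fields this sketch needs).
interface EngineConfig {
  id: string;
  baseUrl: string;
  queryParam?: string; // name of the parameter carrying the query (assumed default: "q")
  queryParams?: Record<string, string>;
}

// Hypothetical helper, not part of the published API.
function buildSearchUrl(
  engine: EngineConfig,
  query: string,
  extraParams: Record<string, string> = {},
): string {
  const url = new URL(engine.baseUrl);

  // Config params first, then per-call params (assumed to take precedence).
  const merged = { ...engine.queryParams, ...extraParams };
  for (const [key, value] of Object.entries(merged)) {
    url.searchParams.set(key, value);
  }

  // The query itself goes last, under the engine's query parameter name.
  url.searchParams.set(engine.queryParam ?? "q", query);
  return url.toString();
}
```

`URLSearchParams` handles percent-encoding, so queries with spaces or reserved characters do not need manual escaping.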

Tool details

web_search

Parameters:

  • query (required)
  • engineId (optional)
  • fallbackOrder (optional)
  • maxResults (optional)
  • extraParams (optional key/value params appended to request)
  • allowFallback (optional, default true)
  • rawHtmlFormat (optional: none | markdown | json)
  • includeRawResponse (optional)

Behavior:

  • Picks an engine from config and tries the remaining engines in fallback order when a request fails or parsing yields no results.
  • Parses JSON result formats when possible.
  • Parses HTML result pages for Google/DDG/generic anchors.
  • Optionally formats raw HTML response via shared webpage formatter.
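The fallback behavior above could look roughly like the following. This is a minimal sketch under stated assumptions, not the extension's implementation: `searchWithFallback`, `EngineRunner`, and `SearchResult` are hypothetical names, each engine is modeled as an injected async function, and an empty result list is treated the same as a failure.

```typescript
interface SearchResult {
  title: string;
  url: string;
}

// Hypothetical: one async function per engine id, injected by the caller.
type EngineRunner = (query: string) => Promise<SearchResult[]>;

async function searchWithFallback(
  query: string,
  order: string[],
  runners: Record<string, EngineRunner>,
  allowFallback = true,
): Promise<{ engineId: string; results: SearchResult[] }> {
  // Without fallback, only the first engine in the order is attempted.
  const candidates = allowFallback ? order : order.slice(0, 1);
  const errors: string[] = [];

  for (const engineId of candidates) {
    const run = runners[engineId];
    if (!run) {
      errors.push(`${engineId}: not configured`);
      continue;
    }
    try {
      const results = await run(query);
      if (results.length > 0) return { engineId, results };
      errors.push(`${engineId}: empty result set`);
    } catch (err) {
      errors.push(`${engineId}: ${err instanceof Error ? err.message : String(err)}`);
    }
  }
  throw new Error(`All engines failed: ${errors.join("; ")}`);
}
```

Collecting one error message per engine makes it easier to see why the whole chain failed, which matters when HTML parsing breaks silently after a provider changes its markup.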

fetch_webpage

Parameters:

  • url (required)
  • output (markdown | json, default markdown)
  • preferMarkdownNew (default true)
  • maxChars
  • includeRawHtml

Behavior:

  • Try markdown.new endpoint (https://markdown.new/<url>).
  • If the endpoint is unavailable or returns an invalid response, fetch the page directly.
  • Convert HTML locally with Readability + Turndown (markdown) or DOM extraction (json).
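As a sketch of the markdown.new-first flow, the snippet below builds the `https://markdown.new/<url>` address and falls back to a local conversion step. The names `markdownNewUrl`, `fetchAsMarkdown`, and `Fetcher` are hypothetical, the two fetch functions are injected rather than implemented, and treating a blank body as an invalid response is an assumption.

```typescript
// Hypothetical: fetches something for a URL and resolves to text, rejecting on failure.
type Fetcher = (url: string) => Promise<string>;

// Appends the target URL to the base as-is, matching the https://markdown.new/<url> form.
function markdownNewUrl(target: string, baseUrl = "https://markdown.new/"): string {
  const base = baseUrl.endsWith("/") ? baseUrl : `${baseUrl}/`;
  return `${base}${target}`;
}

async function fetchAsMarkdown(
  target: string,
  fetchText: Fetcher, // plain HTTP GET returning the response body
  convertLocally: Fetcher, // fetches the page directly and converts HTML to markdown
  preferMarkdownNew = true,
): Promise<{ source: "markdown.new" | "local"; markdown: string }> {
  if (preferMarkdownNew) {
    try {
      const markdown = await fetchText(markdownNewUrl(target));
      // Assumption: an empty body counts as an invalid response.
      if (markdown.trim().length > 0) return { source: "markdown.new", markdown };
    } catch {
      // Service unavailable: fall through to the local path.
    }
  }
  return { source: "local", markdown: await convertLocally(target) };
}
```

Returning the `source` alongside the content lets a caller tell which path produced the markdown, which can help when the two converters format a page differently.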

clone_github_repo

Parameters:

  • url (required)
  • forceClone
  • refresh
  • maxTreeEntries

Behavior:

  • Handles GitHub root/tree/blob URLs.
  • Clones to configured local cache path.
  • For large repos (over maxRepoSizeMB), returns API preview unless forceClone: true.
  • For commit-SHA URLs, returns API preview.
  • Returns local path and structured preview for follow-up tooling.
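Handling root, tree, and blob URLs comes down to splitting the URL path. The sketch below is one possible approach, not the extension's parser: `parseGitHubUrl` and `GitHubTarget` are hypothetical names, and it assumes the ref is a single path segment, so branch names containing slashes would need an extra lookup that is not shown here.

```typescript
interface GitHubTarget {
  owner: string;
  repo: string;
  kind: "root" | "tree" | "blob";
  ref?: string; // branch, tag, or commit SHA (assumed to be one path segment)
  path?: string; // directory or file path inside the repo
}

// Hypothetical helper: returns null for anything it does not recognize.
function parseGitHubUrl(input: string): GitHubTarget | null {
  let url: URL;
  try {
    url = new URL(input);
  } catch {
    return null; // not a valid absolute URL
  }
  if (url.hostname !== "github.com") return null;

  const segments = url.pathname.split("/").filter(Boolean);
  const [owner, rawRepo, kind, ref, ...rest] = segments;
  if (!owner || !rawRepo) return null;

  const repo = rawRepo.replace(/\.git$/, "");
  if (!kind) return { owner, repo, kind: "root" };
  if ((kind !== "tree" && kind !== "blob") || !ref) return null;

  const path = rest.join("/");
  return path ? { owner, repo, kind, ref, path } : { owner, repo, kind, ref };
}
```

A parsed target like this carries enough information to derive a cache key such as `owner/repo@ref`, similar to the repo keys used by search_local_repo.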

search_local_repo

Parameters:

  • query (required)
  • repo (optional clone key, e.g. owner/repo or owner/repo@branch)
  • path (optional)
  • glob (optional)
  • maxMatches (optional)
  • caseSensitive (optional)

Behavior:

  • Uses rg if available, otherwise grep.
  • Defaults to the most recently cloned repo if neither repo nor path is provided.
  • Returns file/line/column match list.
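The rg-or-grep choice can be illustrated by building the argument list for each tool. This is a hedged sketch rather than the extension's code: `buildSearchCommand` and `LocalSearchOptions` are hypothetical names, the search is assumed to be case-insensitive unless `caseSensitive` is true, and the query is passed as a regular expression pattern. Note that plain grep reports file and line but no column.

```typescript
interface LocalSearchOptions {
  query: string;
  dir: string; // directory to search, e.g. the cloned repo path
  glob?: string;
  caseSensitive?: boolean; // assumed default: false
}

// Hypothetical helper: returns the command and arguments without executing anything.
function buildSearchCommand(
  opts: LocalSearchOptions,
  hasRipgrep: boolean,
): { command: string; args: string[] } {
  if (hasRipgrep) {
    // file:line:column:text output, no grouping headers, no color codes
    const args = ["--line-number", "--column", "--no-heading", "--color", "never"];
    if (!opts.caseSensitive) args.push("--ignore-case");
    if (opts.glob) args.push("--glob", opts.glob);
    args.push("--", opts.query, opts.dir); // "--" keeps the query from being read as a flag
    return { command: "rg", args };
  }

  // grep fallback: recursive search with line numbers
  const args = ["-r", "-n"];
  if (!opts.caseSensitive) args.push("-i");
  if (opts.glob) args.push(`--include=${opts.glob}`);
  args.push("--", opts.query, opts.dir);
  return { command: "grep", args };
}
```

Building an argument array instead of a shell string means the result can be handed to a process-spawning API without shell quoting, which avoids injection through the query text.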

Notes

  • This extension executes network requests and local git/shell commands.
  • Only install from trusted sources.
  • Search parsing for public HTML engines can change when provider markup changes; fallback order is important.
  • Google HTML scraping may be rate-limited/challenged depending on region/IP.

Development

bun install
bun run typecheck

Publish to npm

bun run typecheck
bun publish --access public

Before publishing, update these values in package.json:

  • author
  • repository.url
  • homepage
  • bugs.url
  • name (if pi-web-utils is already taken on npm)

Keywords

pi-package

Package last updated on 25 Feb 2026
