Big News: Socket raises $60M Series C at a $1B valuation to secure software supply chains for AI-driven development.Announcement β†’
Sign In

pi-web-providers

Package Overview
Dependencies
Maintainers
1
Versions
17
Alerts
File Explorer

Advanced tools

Socket logo

Install Socket

Detect and block malicious and high-risk dependencies

Install

pi-web-providers

Configurable web access extension for pi with per-tool provider routing and explicit provider option schemas for search, contents, quick grounded answers, and research.

latest
Source
npmnpm
Version
3.3.0
Version published
Weekly downloads
380
2.43%
Maintainers
1
Weekly downloads
Β 
Created
Source

🌍 pi-web-providers

A meta web extension for pi that routes search, content extraction, quick grounded answers, and research through configurable per-tool providers, with explicit provider-specific option schemas for each managed tool.

Why?

Most web extensions hard-wire a single backend. pi-web-providers lets you mix and match providers per tool instead, so web_search, web_contents, web_answer, and web_research can each use a different backend or be turned off entirely. Treat web_answer as a fast path for simple grounded questions, not as a replacement for source inspection or deeper research.

✨ Features

  • Multiple providers: Brave, Claude, Cloudflare, Codex, Exa, Firecrawl, Gemini, Linkup, Ollama, OpenAI, Perplexity, Parallel, Serper, Tavily, Valyu
  • Provider-aware tool options: pi only exposes the provider settings that actually apply to the backend you selected, so tool calls are easier to discover and harder to get wrong
  • Batched search and answers: run several related queries or questions in a single web_search or web_answer call and get grouped results back in one response
  • Background contents prefetch: optionally start web_contents extraction from web_search results in the background and reuse the cached pages later for faster follow-up reads

πŸš€ Installation

pi install npm:pi-web-providers

βš™οΈ Configuration

Run:

/web-providers

This edits the global config file ~/.pi/agent/web-providers.json. The settings UI mirrors the three sections below: tools, providers, and settings.

Each tool can be routed to any compatible provider:

Provider credentials can be literal values, environment variable names, or !command references. pi-web-providers resolves those secrets lazily on the first matching tool use, not when a session starts. This avoids startup delays from password-manager commands such as op; missing or failing secrets are reported when the tool is actually called.

Built-in local providers

ProvidersearchcontentsanswerresearchAuth
Claudeβœ”βœ”Local Claude Code auth
Codexβœ”Local Codex CLI auth

API-backed providers

ProvidersearchcontentsanswerresearchAuth
Braveβœ”βœ”βœ”BRAVE_SEARCH_API_KEY / BRAVE_ANSWERS_API_KEY
Cloudflareβœ”CLOUDFLARE_API_TOKEN + CLOUDFLARE_ACCOUNT_ID
Exaβœ”βœ”βœ”βœ”EXA_API_KEY
Firecrawlβœ”βœ”βœ”FIRECRAWL_API_KEY
Geminiβœ”βœ”βœ”GOOGLE_API_KEY
Linkupβœ”βœ”βœ”LINKUP_API_KEY
Ollamaβœ”βœ”OLLAMA_API_KEY
OpenAIβœ”βœ”βœ”OPENAI_API_KEY
Parallelβœ”βœ”PARALLEL_API_KEY
Perplexityβœ”βœ”βœ”PERPLEXITY_API_KEY
Serperβœ”SERPER_API_KEY
Tavilyβœ”βœ”TAVILY_API_KEY
Valyuβœ”βœ”βœ”βœ”VALYU_API_KEY

Advanced option: custom can route any managed tool through a local wrapper command using a JSON stdin/stdout contract.

See example-config.json for the minimal default configuration.

Tools

Each managed tool maps to one provider id under the top-level tools key. Removing a tool mapping turns that tool off. A tool is only exposed when it is mapped to a compatible provider and that provider is currently available. Shared defaults and tool-specific settings live under settings; search-specific settings live under settings.search, and async research uses settings.researchTimeoutMs. Provider option schemas are strict: only the keys shown for the active provider are accepted.

Search the public web for up to 10 queries in one call. It returns grouped titles, URLs, and snippets for each query. Batch related queries when grouped comparison matters; use separate sibling web_search calls when independent results should arrive as soon as they are ready.

Parameters and behavior
ParameterTypeDefaultDescription
queriesstring[]requiredOne or more search queries to run (max 10)
maxResultsinteger5Result count per query, clamped to 1–20
optionsobjectβ€”Provider-specific settings exposed by the selected provider schema

options is omitted when the configured search provider has no per-call provider options. Runtime controls are not accepted in tool calls. Configure retry, timeout, and background contents prefetch under settings and settings.search; prefetch starts only when settings.search.provider is set.

web_contents

Read the main text from one or more web pages. It reuses cached pages when they match and fetches only missing or stale URLs. Batch related pages when they are meant to be read as one bundle; use separate sibling web_contents calls when each page can be acted on independently.

Parameters and behavior
ParameterTypeDefaultDescription
urlsstring[]requiredOne or more URLs to extract
optionsobjectβ€”Provider-specific settings exposed by the selected provider schema

web_contents reuses any matching cached pages already present in the local in-memory cacheβ€”whether they came from prefetch or an earlier readβ€”and only fetches missing or stale URLs.

web_answer

Answer one or more simple factual questions using web-grounded evidence. Use it as a lightweight shortcut when you want a concise grounded answer without manually selecting and reading sources. Prefer web_search plus web_contents when source selection matters or you need to inspect primary sources directly; prefer web_research for open-ended, controversial, or multi-step investigations.

When you ask more than one question, the response is grouped into per-question sections. Batch related questions when the answers belong together; split them into sibling calls when earlier independent answers can unblock the next step. Collapsed results show a short answer preview; expand the tool result to read the full answer.

Parameters and behavior
ParameterTypeDefaultDescription
queriesstring[]requiredOne or more questions to answer in one call (max 10)
optionsobjectβ€”Provider-specific settings exposed by the selected provider schema

Responses are grouped into per-question sections when more than one question is provided.

web_research

Investigate a topic across web sources and produce a longer report. web_research is always asynchronous: it starts a background run, returns a short dispatch notice immediately, and later posts a completion message with a saved report path.

Parameters and behavior
ParameterTypeDefaultDescription
inputstringrequiredResearch brief or question
optionsobjectβ€”Provider-specific settings exposed by the selected provider schema

options is provider-specific. Equivalent concepts can use different field names across SDKsβ€”for example Perplexity uses country, Exa uses userLocation, and Valyu uses countryCode. Runtime controls are not accepted in tool calls.

Unlike the other managed tools, web_research does not accept local timeout, retry, polling, or resume controls. Research has one opinionated execution style: pi starts it asynchronously, tracks it locally, and saves the final report under .pi/artifacts/research/.

Providers

The built-in providers below integrate with official SDKs or documented APIs.

Brave
  • API: Brave Search API and Brave Answers API
  • Supports web_search via Web Search, plus optional llm_context, news, videos, images, and places search modes
  • Supports web_answer and web_research via Brave Answers streaming chat completions
  • Uses the Answers API replacement for Brave's deprecated Summarizer API; answer and research search controls are sent through web_search_options
  • Exposes the Answers model option (brave or brave-pro) for answer and research calls
  • web_contents stays routed to URL-fetch providers; Brave LLM Context is query-based retrieval and is exposed as a search mode instead
  • Brave Answers may require a different key or plan than Brave Search

Setup

{
  "tools": {
    "search": "brave",
    "answer": "brave",
    "research": "brave"
  },
  "providers": {
    "brave": {
      "credentials": {
        "search": "BRAVE_SEARCH_API_KEY",
        "answers": "BRAVE_ANSWERS_API_KEY"
      }
    }
  }
}

Use providers.brave.options.search.mode or per-call search options to select llm_context, news, videos, images, or places. Places details and descriptions are opt-in because they can add calls, latency, and place-specific semantics.

Claude
  • SDK: @anthropic-ai/claude-agent-sdk
  • Uses Claude Code's built-in WebSearch and WebFetch tools with structured JSON output
  • Exposes model, thinking, effort, maxThinkingTokens, maxTurns, and maxBudgetUsd as provider options for search and answer calls
  • Great for search plus grounded answers if you already use Claude Code locally
Cloudflare
  • SDK: cloudflare
  • Supports web_contents via Cloudflare Browser Rendering's /markdown endpoint
  • Good for JavaScript-heavy pages that need a real browser render before extraction
  • Exposes gotoOptions.waitUntil as the provider-specific contents option

Setup

  • In the Cloudflare dashboard, create an API token.
  • Grant it this permission:
    • Account | Browser Rendering | Edit
  • Scope it to the account you want to use.
  • Copy that account's Account ID from the Cloudflare dashboard.
  • Configure pi with both values:
{
  "tools": {
    "contents": "cloudflare"
  },
  "providers": {
    "cloudflare": {
      "credentials": {
        "api": "CLOUDFLARE_API_TOKEN"
      },
      "accountId": "CLOUDFLARE_ACCOUNT_ID"
    }
  }
}

If Cloudflare returns 401 Authentication error, the token permission, token scope, or account ID is usually wrong.

Codex
  • SDK: @openai/codex-sdk
  • Runs in read-only mode with web search enabled
  • Exposes model, modelReasoningEffort, and webSearchMode as provider options for web_search
  • Best if you already use the local Codex CLI and auth flow
Exa
  • SDK: exa-js
  • Supports web_search, web_contents, web_answer, and web_research
  • web_research is exposed through pi's async research workflow
  • Neural, keyword, hybrid, and deep-research search modes
  • Inline text-content extraction on search results
  • Exposes search options such as category, type, date filters, includeDomains, excludeDomains, userLocation, and contents
  • Persisted Exa defaults are scoped under providers.exa.options.search
  • web_contents, web_answer, and web_research currently use fixed provider behavior with no extra per-call provider options
Firecrawl
  • SDK: @mendable/firecrawl-js
  • Supports web_search, web_contents, and page-scoped web_answer
  • Search can optionally include Firecrawl scrape-backed result enrichment
  • Contents extraction uses Firecrawl scrape with markdown-first defaults
  • Answers use Firecrawl scrape's question format against one explicit page URL; set options.url in the web_answer call or providers.firecrawl.options.answer.url as a default
  • Exposes search options such as lang, country, sources, categories, location, timeout, and scrapeOptions
  • Exposes contents options such as formats, onlyMainContent, includeTags, excludeTags, waitFor, headers, location, mobile, and proxy
  • Exposes answer options url, onlyMainContent, includeTags, excludeTags, waitFor, headers, location, mobile, and proxy
  • Firecrawl charges 5 credits per page for the question format
  • Optional baseUrl overrides are supported for self-hosted Firecrawl instances, proxies, and testing. API keys are required for Firecrawl Cloud, but can be omitted for self-hosted endpoints that do not enforce authentication.
Gemini
  • SDK: @google/genai
  • Supports web_search, web_answer, and web_research
  • web_research is exposed through pi's async research workflow
  • Google Search grounding for answers
  • Deep-research agents via Google's Gemini API
  • Exposes model and generation_config for search, model and config for answers, and only the conservative deep-research option agent_config.thinking_summaries for research
  • Gemini research intentionally does not expose or send Interactions API tools, response_format, response_modalities, or system_instruction because the default deep-research agent rejects several of those fields
Linkup
  • SDK: linkup-sdk
  • Supports web_search via Linkup Search with fixed searchResults output
  • Supports web_contents via Linkup Fetch and always returns markdown
  • Supports web_research via Linkup's async Research API
  • Exposes search options depth, includeImages, includeDomains, excludeDomains, fromDate, and toDate
  • Exposes contents options renderJs, includeRawHtml, and extractImages
  • Exposes research options outputType, mode, reasoningDepth, includeDomains, excludeDomains, fromDate, toDate, and structuredOutputSchema
  • Research defaults to sourcedAnswer; provide structuredOutputSchema for structured output
  • Good fit for search, markdown extraction, and long-running sourced research without extra provider wiring
Ollama
  • API: Ollama Web Search and Fetch API
  • Supports web_search via Ollama's POST /api/web_search endpoint
  • Supports web_contents via Ollama's POST /api/web_fetch endpoint
  • Authenticates with an Ollama API key using OLLAMA_API_KEY by default
  • Optional baseUrl overrides the default https://ollama.com API host for proxies or compatible endpoints
  • Ollama caps search requests at 10 results, so web_search.maxResults is clamped to 1–10 for this provider

Minimal config:

{
  "tools": {
    "search": "ollama",
    "contents": "ollama"
  },
  "providers": {
    "ollama": {
      "credentials": {
        "api": "OLLAMA_API_KEY"
      }
    }
  }
}
OpenAI
  • SDK: openai
  • Supports web_search, web_answer, and web_research
  • Uses the Responses API for structured web search, grounded answers, and deep-research runs
  • Always enables OpenAI's built-in web_search_preview tool for search, answer, and research calls
  • Exposes model and instructions for web_search and web_answer
  • Exposes model, instructions, and max_tool_calls for web_research
  • Good fit when you want official OpenAI web-grounded search, answers, and deep research behind pi's managed tool abstractions

Setup

  • Create or reuse an OpenAI API key.
  • Configure pi to route web_search, web_answer, web_research, or any subset of them to openai.
  • Optionally set default models under providers.openai.options.search.model, providers.openai.options.answer.model, and providers.openai.options.research.model.
{
  "tools": {
    "search": "openai",
    "answer": "openai",
    "research": "openai"
  },
  "providers": {
    "openai": {
      "credentials": {
        "api": "OPENAI_API_KEY"
      },
      "options": {
        "search": {
          "model": "gpt-4.1"
        },
        "answer": {
          "model": "gpt-4.1"
        },
        "research": {
          "model": "o4-mini-deep-research"
        }
      }
    }
  }
}

You can also set instructions as a provider default under providers.openai.options.search, providers.openai.options.answer, or providers.openai.options.research, and set max_tool_calls under providers.openai.options.research. All of them can also be overridden per call.

Perplexity
  • SDK: @perplexity-ai/perplexity_ai
  • Supports web_search, web_answer, and web_research
  • web_research is exposed through pi's async research workflow
  • Uses Perplexity Search for web_search
  • Uses Sonar for web_answer and sonar-deep-research for web_research
  • Exposes search options country, search_mode, search_domain_filter, and search_recency_filter
  • Exposes model for answer and research calls
Parallel
  • SDK: parallel-web
  • Agentic and one-shot search modes
  • Page content extraction with excerpt and full-content toggles
  • Exposes search option mode
  • Exposes contents options excerpts and full_content
Serper
  • API: Serper HTTP API
  • Supports web_search via Serper's Google endpoints for web, image, video, places, maps, reviews, news, shopping, product reviews, Lens, Scholar, patents, autocomplete, and webpage results
  • Good fit for fast, straightforward Google-style search results
  • Exposes search options mode, gl, hl, location, page, tbs, autocorrect, and mode-specific fields such as url, ll, placeId, cid, fid, productId, nextPageToken, and webpage include flags. Reviews mode uses the top-level query when no place identifier is provided. includeMarkdown defaults to true for webpage scraping
  • Preserves rich metadata from Serper responses, including ranking position, sitelinks, attributes, and top-level response context such as knowledgeGraph, answerBox, peopleAlsoAsk, and relatedSearches
  • Optional baseUrl overrides are supported for proxies and testing

Minimal config:

{
  "tools": {
    "search": "serper"
  },
  "providers": {
    "serper": {
      "credentials": {
        "api": "SERPER_API_KEY"
      }
    }
  }
}
Tavily
  • SDK: @tavily/core
  • Supports web_search via Tavily Search
  • Supports web_contents via Tavily Extract
  • Good for pairing LLM-oriented web search with lightweight page extraction
  • Exposes search options topic, searchDepth, timeRange, country, exactMatch, includeAnswer, includeRawContent, includeImages, includeFavicon, includeDomains, excludeDomains, and days
  • Exposes contents options extractDepth, format, includeImages, query, chunksPerSource, and includeFavicon
Valyu
  • SDK: valyu-js
  • Supports web_search, web_contents, web_answer, and web_research
  • web_research is exposed through pi's async research workflow
  • Web, proprietary, and news search types
  • Exposes search options searchType, responseLength, and countryCode
  • Exposes answer and research options responseLength and countryCode
  • Persisted Valyu defaults are scoped under providers.valyu.options.search, providers.valyu.options.answer, and providers.valyu.options.research
  • web_contents currently uses fixed provider behavior with no extra per-call provider options

Custom provider

The custom provider lets you bring your own wrapper command for any managed tool. Each capability can point at a different local command under providers["custom"].options.

custom does not expose standard per-call options fields. Put provider-specific behavior in the wrapper configuration or in the wrapper implementation.

The repo includes actual wrapper examples under examples/custom/wrappers/. They are small bash scripts that use jq for JSON handling. Each one uses a different backend pattern:

  • codex --search exec for web_search
  • Gemini API via curl for web_contents
  • claude -p for web_answer
  • Perplexity API via curl for web_research
Configuration example

Copy the example wrappers into a local ./wrappers/ directory, then configure:

{
  "tools": {
    "search": "custom",
    "contents": "custom",
    "answer": "custom",
    "research": "custom"
  },
  "providers": {
    "custom": {
      "options": {
        "search": {
          "argv": ["bash", "./wrappers/codex-search.sh"]
        },
        "contents": {
          "argv": ["bash", "./wrappers/gemini-contents.sh"]
        },
        "answer": {
          "argv": ["bash", "./wrappers/claude-answer.sh"]
        },
        "research": {
          "argv": ["bash", "./wrappers/perplexity-research.sh"]
        }
      }
    }
  }
}

Those example wrappers deliberately use different local CLIs and APIs so you can see several wrapper styles in one setup without extra glue code.

Each capability can also set an optional cwd and env block. Use cwd when one wrapper must run from a specific directory. Use env for per-command variables; each value can be a literal string, an environment variable name, or !command.

web_research uses the same async workflow as every other research provider: pi starts the wrapper in the background, tracks the job locally, and writes the final report to a file when it finishes.

Wrapper contract:

  • stdin: one JSON request object with capability plus the per-call managed inputs (query, urls, input, maxResults, options, cwd)
  • stdout: one JSON response object
    • search: { "results": [{ "title", "url", "snippet" }] }
    • contents: { "answers": [{ "url", "content"?: "...", "summary"?: unknown, "metadata"?: {}, "error"?: "..." }] }
    • answer / research: { "text": "...", "summary"?: "...", "itemCount"?: 1, "metadata"?: {} }
  • stderr: optional progress lines
  • exit code 0: success
  • non-zero exit code: failure

See examples/custom/README.md for a copy-and-pasteable setup, and see examples/custom/wrappers/ for the actual wrapper files.

Settings

The settings block holds shared execution defaults that apply to all providers unless overridden in a provider's own settings block:

FieldDefaultDescription
requestTimeoutMs30000Maximum time for a single provider request
retryCount3Retries for transient failures
retryDelayMs2000Initial delay before retrying
researchTimeoutMs1800000Maximum total time for an async web_research job (30 min)

🧹 Uninstall

pi remove npm:pi-web-providers

πŸ“„ License

MIT

Keywords

pi-package

FAQs

Package last updated on 31 May 2026

Did you know?

Socket

Socket for GitHub automatically highlights issues in each pull request and monitors the health of all your open source dependencies. Discover the contents of your packages and block harmful activity before you install or update your dependencies.

Install

Related posts