
@upstash/docs2vector
A tool to process markdown files from GitHub repositories and store them in Upstash Vector
A Node.js tool to process Markdown files from GitHub repositories, generate embeddings, and store them in Upstash Vector database. Perfect for building document search systems, AI-driven documentation assistants, or knowledge bases.
Processes Markdown (.md) and MDX (.mdx) files in any GitHub repository.

To create a GitHub token:
1. Go to Settings > Developer settings > Personal access tokens > Tokens (classic).
2. Click Generate new token > Generate new token (classic).
3. Select the following scopes:
   repo (Full control of private repositories)
   read:org (Read organization data)
4. Click Generate token.
mkdir github-docs-vectorizer
cd github-docs-vectorizer
Ensure the following files are included in your directory:
script.js: The main script for processing
package.json: Manages project dependencies
.env: Contains your environment variables (explained below)
Install dependencies:
npm install @upstash/docs2vector
Create a .env file in the root directory of your project with your credentials:
# Required for accessing GitHub repositories
GITHUB_TOKEN=your_github_token
# Required for storing vectors in Upstash
UPSTASH_VECTOR_REST_URL=your_upstash_vector_url
UPSTASH_VECTOR_REST_TOKEN=your_upstash_vector_token
# Optional: Provide if using OpenAI embeddings
OPENAI_API_KEY=your_openai_api_key
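A missing credential is easier to diagnose before any processing starts. The following is a minimal sketch of such a check. The helper name `findMissingEnv` and the `REQUIRED_VARS` list are hypothetical and not part of @upstash/docs2vector; the variable names themselves come from the .env example above.

```javascript
// Hypothetical helper: not part of @upstash/docs2vector.
// The three names below are the variables marked "Required" in the .env example.
const REQUIRED_VARS = [
  'GITHUB_TOKEN',
  'UPSTASH_VECTOR_REST_URL',
  'UPSTASH_VECTOR_REST_TOKEN',
];

// Returns the names of required variables that are unset or blank.
function findMissingEnv(env, required = REQUIRED_VARS) {
  return required.filter((name) => !env[name] || env[name].trim() === '');
}

// Example: warn about gaps in the current process environment.
const missing = findMissingEnv(process.env);
if (missing.length > 0) {
  console.warn(`Missing environment variables: ${missing.join(', ')}`);
}
```

OPENAI_API_KEY is left out of the required list on purpose, since the .env example marks it as optional.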
Run the script by providing the GitHub repository URL as an argument:
node script.js https://github.com/username/repository
Example:
node script.js https://github.com/facebook/react
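To illustrate what the URL argument contains, here is a small sketch that splits a repository URL into owner and repository name. `parseGitHubRepoUrl` is a hypothetical helper written for this example; how the tool itself interprets the argument is not documented here and may differ.

```javascript
// Hypothetical helper: the tool's actual parsing logic may differ.
function parseGitHubRepoUrl(input) {
  const url = new URL(input); // throws a TypeError on malformed input
  if (url.hostname !== 'github.com') {
    throw new Error(`Expected a github.com URL, got: ${url.hostname}`);
  }
  // "/facebook/react" -> ["facebook", "react"]
  const [owner, repo] = url.pathname.split('/').filter(Boolean);
  if (!owner || !repo) {
    throw new Error('Expected a URL of the form https://github.com/owner/repo');
  }
  // Tolerate clone-style URLs that end in ".git".
  return { owner, repo: repo.replace(/\.git$/, '') };
}

console.log(parseGitHubRepoUrl('https://github.com/facebook/react'));
```

Validating the argument up front gives a clearer message than a failed API request later in the run.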
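The script will fetch the Markdown files from the repository, split them into chunks, generate embeddings, and store the resulting vectors in Upstash Vector.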
OpenAI Embeddings (default if an API key is provided): set OPENAI_API_KEY in .env.
Upstash Embeddings (used when no OpenAI API key is provided).
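The selection rule described above can be sketched as a single function. `chooseEmbeddingProvider` and its return values `'openai'` and `'upstash'` are hypothetical labels for this example, not identifiers from the package.

```javascript
// Hypothetical sketch of the documented rule: OpenAI embeddings when
// OPENAI_API_KEY is set, Upstash embeddings otherwise.
function chooseEmbeddingProvider(env) {
  const key = env.OPENAI_API_KEY;
  return key && key.trim() !== '' ? 'openai' : 'upstash';
}

console.log(`Embedding provider: ${chooseEmbeddingProvider(process.env)}`);
```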
To adjust how documents are split into chunks, update the configuration in script.js:
const textSplitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,  // Adjust chunk size as needed
  chunkOverlap: 200 // Adjust overlap as needed
});
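To show how chunkSize and chunkOverlap interact, here is a simplified fixed-window splitter. It is an illustration only and not RecursiveCharacterTextSplitter, which additionally tries to break on separators such as paragraphs and sentences. The function name `splitWithOverlap` is hypothetical.

```javascript
// Simplified illustration: each chunk holds up to `chunkSize` characters and
// repeats the last `chunkOverlap` characters of the previous chunk.
function splitWithOverlap(text, chunkSize, chunkOverlap) {
  if (chunkOverlap >= chunkSize) {
    throw new Error('chunkOverlap must be smaller than chunkSize');
  }
  const step = chunkSize - chunkOverlap; // how far each window advances
  const chunks = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}

// With chunkSize 4 and chunkOverlap 2, neighbouring chunks share 2 characters.
console.log(splitWithOverlap('abcdefghij', 4, 2));
// prints [ 'abcd', 'cdef', 'efgh', 'ghij' ]
```

A larger overlap preserves more context across chunk boundaries at the cost of storing more vectors.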
npm install @upstash/docs2vector dotenv
import Docs2Vector from '@upstash/docs2vector';
import dotenv from 'dotenv';

// Load environment variables
dotenv.config();

async function main() {
  try {
    // Step 1: Define the GitHub repository URL
    const githubRepoUrl = 'YOUR_GITHUB_URL';

    // Print start message
    console.log(`Starting processing for the repository: ${githubRepoUrl}`);

    // Step 2: Initialize the Docs2Vector SDK
    const converter = new Docs2Vector();

    // Step 3: Run the processing flow with Docs2Vector's `run` method
    await converter.run(githubRepoUrl);

    // Print success message
    console.log(`Successfully processed repository: ${githubRepoUrl}`);
    console.log('Vectors stored in Upstash Vector database.');
  } catch (error) {
    console.error('An error occurred while processing the repository:', error.message);
  }
}

main();
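The same `run` call can be applied to several repositories in turn. The sketch below assumes only what the example above shows, namely an object with an async `run(url)` method. `processRepositories` is a hypothetical helper and not part of the SDK.

```javascript
// Hypothetical helper. `converter` is any object with an async run(url)
// method, such as the Docs2Vector instance created in the example above.
async function processRepositories(converter, repoUrls) {
  const results = [];
  for (const url of repoUrls) {
    try {
      await converter.run(url); // process one repository at a time
      results.push({ url, ok: true });
    } catch (error) {
      // Record the failure and continue with the remaining repositories.
      results.push({ url, ok: false, message: error.message });
    }
  }
  return results;
}
```

Processing sequentially keeps the log output readable and avoids sending many requests at once; whether the SDK is safe to run concurrently is not stated in this document.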
Metadata accompanies each stored chunk for improved context:
The script is designed to handle errors gracefully in the following cases:
In case of errors, the script will:
Feel free to submit issues and enhancement requests!
MIT License - feel free to use this tool for any purpose.
This tool uses the following open-source packages: