
The official Neo4j GraphRAG package for Python enables developers to build graph retrieval augmented generation (GraphRAG) applications using the power of Neo4j and Python. As a first-party library, it offers a robust, feature-rich, and high-performance solution, with the added assurance of long-term support and maintenance directly from Neo4j.
Documentation can be found here
A series of blog posts demonstrating how to use this package:
A list of Neo4j GenAI-related features can also be found at Neo4j GenAI Ecosystem.
Python version | Supported? |
---|---|
3.13 | ✓ |
3.12 | ✓ |
3.11 | ✓ |
3.10 | ✓ |
3.9 | ✓ |
3.8 | ✗ |
To install the latest stable version, run:
pip install neo4j-graphrag
This package has some optional features that can be enabled using extra dependencies, including:
sentence-transformers: to use embeddings from the sentence-transformers Python package
extras for use with External Retrievers
Install package with optional dependencies with (for instance):
pip install "neo4j-graphrag[openai]"
The scripts below demonstrate how to get started with the package and make use of its key features.
To run these examples, ensure that you have a Neo4j instance up and running and update the NEO4J_URI, NEO4J_USERNAME, and NEO4J_PASSWORD variables in each script with the details of your Neo4j instance.
For the examples, make sure to export your OpenAI key as an environment variable named OPENAI_API_KEY.
Additional examples are available in the examples folder.
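As an alternative to editing each script, the connection details can be read from the environment. The following is a minimal sketch, not part of the package: the helper names load_neo4j_settings and require_openai_key are hypothetical, and the fallback values simply mirror the placeholders used in the example scripts.

```python
import os
from typing import Mapping, NamedTuple, Optional


class Neo4jSettings(NamedTuple):
    uri: str
    username: str
    password: str


def load_neo4j_settings(env: Optional[Mapping[str, str]] = None) -> Neo4jSettings:
    """Read connection details from a mapping (defaults to os.environ)."""
    source = os.environ if env is None else env
    # Hypothetical fallbacks: the placeholder values from the example scripts.
    return Neo4jSettings(
        uri=source.get("NEO4J_URI", "neo4j://localhost:7687"),
        username=source.get("NEO4J_USERNAME", "neo4j"),
        password=source.get("NEO4J_PASSWORD", "password"),
    )


def require_openai_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return OPENAI_API_KEY, or fail early with a readable message."""
    source = os.environ if env is None else env
    key = source.get("OPENAI_API_KEY", "")
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the examples."
        )
    return key
```

The returned values can then be passed to GraphDatabase.driver in place of the hard-coded constants.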
NOTE: The APOC core library must be installed in your Neo4j instance in order to use this feature
This package offers two methods for constructing a knowledge graph.
The Pipeline class provides extensive customization options, making it ideal for advanced use cases.
See the examples/pipeline folder for examples of how to use this class.
For a more streamlined approach, the SimpleKGPipeline class offers a simplified abstraction layer over the Pipeline, making it easier to build knowledge graphs.
Both classes support working directly with text and PDFs.
import asyncio
from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.experimental.pipeline.kg_builder import SimpleKGPipeline
from neo4j_graphrag.llm import OpenAILLM
NEO4J_URI = "neo4j://localhost:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "password"
# Connect to the Neo4j database
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))
# List the entities and relations the LLM should look for in the text
node_types = ["Person", "House", "Planet"]
relationship_types = ["PARENT_OF", "HEIR_OF", "RULES"]
patterns = [
    ("Person", "PARENT_OF", "Person"),
    ("Person", "HEIR_OF", "House"),
    ("House", "RULES", "Planet"),
]
# Create an Embedder object
embedder = OpenAIEmbeddings(model="text-embedding-3-large")
# Instantiate the LLM
llm = OpenAILLM(
    model_name="gpt-4o",
    model_params={
        "max_tokens": 2000,
        "response_format": {"type": "json_object"},
        "temperature": 0,
    },
)
# Instantiate the SimpleKGPipeline
kg_builder = SimpleKGPipeline(
    llm=llm,
    driver=driver,
    embedder=embedder,
    schema={
        "node_types": node_types,
        "relationship_types": relationship_types,
        "patterns": patterns,
    },
    on_error="IGNORE",
    from_pdf=False,
)
# Run the pipeline on a piece of text
text = (
    "The son of Duke Leto Atreides and the Lady Jessica, Paul is the heir of House "
    "Atreides, an aristocratic family that rules the planet Caladan."
)
asyncio.run(kg_builder.run_async(text=text))
driver.close()
Warning: In order to run this code, the openai Python package needs to be installed: pip install "neo4j_graphrag[openai]"
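The schema passed to SimpleKGPipeline in the script above combines node types, relationship types, and patterns. Before running the pipeline, it can help to confirm that every pattern only refers to declared names. The check below is a sketch under stated assumptions: find_schema_issues is a hypothetical helper, not a function from the package, and it assumes node and relationship types are given as plain strings, as in the script.

```python
from typing import List, Sequence, Tuple

Pattern = Tuple[str, str, str]


def find_schema_issues(
    node_types: Sequence[str],
    relationship_types: Sequence[str],
    patterns: Sequence[Pattern],
) -> List[str]:
    """Return one message for every undeclared name used in a pattern."""
    known_nodes = set(node_types)
    known_rels = set(relationship_types)
    issues: List[str] = []
    for source, rel, target in patterns:
        if source not in known_nodes:
            issues.append(f"{source!r} is not a declared node type")
        if rel not in known_rels:
            issues.append(f"{rel!r} is not a declared relationship type")
        if target not in known_nodes:
            issues.append(f"{target!r} is not a declared node type")
    return issues


# Same schema as in the script above.
node_types = ["Person", "House", "Planet"]
relationship_types = ["PARENT_OF", "HEIR_OF", "RULES"]
patterns = [
    ("Person", "PARENT_OF", "Person"),
    ("Person", "HEIR_OF", "House"),
    ("House", "RULES", "Planet"),
]
issues = find_schema_issues(node_types, relationship_types, patterns)
```

An empty list means the patterns are consistent with the declared types.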
Example knowledge graph created using the above script:
When creating a vector index, make sure you match the number of dimensions in the index with the number of dimensions your embeddings have.
from neo4j import GraphDatabase
from neo4j_graphrag.indexes import create_vector_index
NEO4J_URI = "neo4j://localhost:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "password"
INDEX_NAME = "vector-index-name"
# Connect to the Neo4j database
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))
# Create the index
create_vector_index(
    driver,
    INDEX_NAME,
    label="Chunk",
    embedding_property="embedding",
    dimensions=3072,
    similarity_fn="euclidean",
)
driver.close()
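The dimension requirement mentioned above can be checked in plain Python before any vector is written. This is a minimal sketch: check_dimensions is a hypothetical helper, not part of the package, and 3072 is the value used for the index in the example above.

```python
from typing import Sequence


def check_dimensions(vector: Sequence[float], index_dimensions: int) -> None:
    """Raise ValueError if the vector length differs from the index dimensions."""
    if len(vector) != index_dimensions:
        raise ValueError(
            f"embedding has {len(vector)} dimensions, "
            f"but the index expects {index_dimensions}"
        )


# A stand-in vector; in practice this would come from the embedder.
check_dimensions([0.0] * 3072, index_dimensions=3072)
```

Calling it on the output of embedder.embed_query(...) before upserting surfaces a mismatch early, with a readable message.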
This example demonstrates one method for upserting data in your Neo4j database. It's important to note that there are alternative approaches, such as using the Neo4j Python driver.
Ensure that your vector index is created prior to executing this example.
from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.indexes import upsert_vectors
from neo4j_graphrag.types import EntityType
NEO4J_URI = "neo4j://localhost:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "password"
# Connect to the Neo4j database
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))
# Create an Embedder object
embedder = OpenAIEmbeddings(model="text-embedding-3-large")
# Generate an embedding for some text
text = (
    "The son of Duke Leto Atreides and the Lady Jessica, Paul is the heir of House "
    "Atreides, an aristocratic family that rules the planet Caladan."
)
vector = embedder.embed_query(text)
# Upsert the vector
upsert_vectors(
    driver,
    ids=["1234"],
    embedding_property="vectorProperty",
    embeddings=[vector],
    entity_type=EntityType.NODE,
)
driver.close()
Please note that when querying a Neo4j vector index, approximate nearest neighbor search is used, which may not always deliver exact results. For more information, refer to the Neo4j documentation on limitations and issues of vector indexes.
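To see what "exact" means here, a small dataset can be searched by brute force and compared with the results from the index. The sketch below is an illustration only and does not use the package: exact_top_k is a hypothetical helper, and it ranks by Euclidean distance on the assumption that the index was created with similarity_fn="euclidean".

```python
import math
from typing import Dict, List, Sequence, Tuple


def exact_top_k(
    query: Sequence[float],
    vectors: Dict[str, Sequence[float]],
    top_k: int = 5,
) -> List[Tuple[str, float]]:
    """Return the top_k closest ids by exact Euclidean distance, nearest first."""
    scored = [(node_id, math.dist(query, vec)) for node_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:top_k]


# Toy two-dimensional vectors keyed by a made-up node id.
toy_vectors = {"a": [0.0, 0.0], "b": [1.0, 0.0], "c": [5.0, 5.0]}
nearest = exact_top_k([0.9, 0.0], toy_vectors, top_k=2)
```

Brute force scans every vector, so it only suits small samples; the index trades some exactness for speed on large ones.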
In the example below, we perform a simple vector search using a retriever that conducts a similarity search over the vector-index-name vector index.
This library provides more retrievers beyond just the VectorRetriever.
See the examples folder for examples of how to use these retrievers.
Before running this example, make sure your vector index has been created and populated.
from neo4j import GraphDatabase
from neo4j_graphrag.embeddings import OpenAIEmbeddings
from neo4j_graphrag.generation import GraphRAG
from neo4j_graphrag.llm import OpenAILLM
from neo4j_graphrag.retrievers import VectorRetriever
NEO4J_URI = "neo4j://localhost:7687"
NEO4J_USERNAME = "neo4j"
NEO4J_PASSWORD = "password"
INDEX_NAME = "vector-index-name"
# Connect to the Neo4j database
driver = GraphDatabase.driver(NEO4J_URI, auth=(NEO4J_USERNAME, NEO4J_PASSWORD))
# Create an Embedder object
embedder = OpenAIEmbeddings(model="text-embedding-3-large")
# Initialize the retriever
retriever = VectorRetriever(driver, INDEX_NAME, embedder)
# Instantiate the LLM
llm = OpenAILLM(model_name="gpt-4o", model_params={"temperature": 0})
# Instantiate the RAG pipeline
rag = GraphRAG(retriever=retriever, llm=llm)
# Query the graph
query_text = "Who is Paul Atreides?"
response = rag.search(query_text=query_text, retriever_config={"top_k": 5})
print(response.answer)
driver.close()
You must sign the contributors license agreement in order to make contributions to this project.
Our Python dependencies are managed using Poetry. If Poetry is not yet installed on your system, you can follow the instructions here to set it up. To begin development on this project, start by cloning the repository and then install all necessary dependencies, including the development dependencies, with the following command:
poetry install --with dev
If you have a bug to report or feature to request, first search to see if an issue already exists. If a related issue doesn't exist, please raise a new issue using the issue form.
If you're a Neo4j Enterprise customer, you can also reach out to Customer Support.
If you don't have a bug to report or a feature to request but need a hand with the library, community support is available via Neo4j Online Community and/or Discord.
Create a working branch from main and start with your changes!
Our codebase follows strict formatting and linting standards using Ruff for code quality checks and Mypy for type checking. Before contributing, ensure that all code is properly formatted, free of linting issues, and includes accurate type annotations.
Adherence to these standards is required for contributions to be accepted.
We recommend setting up pre-commit to automate code quality checks. This ensures your changes meet our guidelines before committing.
Install pre-commit by following the installation guide.
Set up the pre-commit hooks by running:
pre-commit install
To manually check if a file meets the quality requirements, run:
pre-commit run --file path/to/file
When you're finished with your changes, create a pull request (PR) using the following workflow.
Ensure that the base of your PR is set to main.
Update CHANGELOG.md if you have made significant changes to the project.
Keep CHANGELOG.md changes brief and focus on the most important changes.
A changelog suggestion can be requested by commenting on the PR with:
@CodiumAI-Agent /update_changelog
Then update the CHANGELOG.md file under 'Next'.
To be able to run all tests, all extra packages need to be installed. This is achieved by:
poetry install --all-extras
Install the project dependencies then run the following command to run the unit tests locally:
poetry run pytest tests/unit
To execute end-to-end (e2e) tests, you need the following services to be running locally:
The simplest way to set these up is by using Docker Compose:
docker compose -f tests/e2e/docker-compose.yml up
(tip: If you encounter any caching issues within the databases, you can completely remove them by running docker compose -f tests/e2e/docker-compose.yml down)
Once all the services are running, execute the following command to run the e2e tests:
poetry run pytest tests/e2e