NebulaGraph Python Client
Getting Started
Note: Make sure you are using the correct version. See the Compatibility Matrix for how Python client versions correspond to NebulaGraph Database versions.
Accessing NebulaGraph
Handling Query Results
Jupyter Notebook Integration
If you plan to access NebulaGraph from Jupyter Notebook, consider the NebulaGraph Jupyter Extension, which provides a more interactive way to access NebulaGraph. See also NebulaGraph on Google Colab.
Obtaining nebula3-python
Method 1: Installation via pip
pip install nebula3-python==$version  # for NebulaGraph 3.x
pip install nebula2-python==$version  # for NebulaGraph 2.x
Method 2: Installation via source
git clone https://github.com/vesoft-inc/nebula-python.git
cd nebula-python
For Python >= 3.7.0:
pip install .
For Python >= 3.6.2, < 3.7.0:
python3 setup.py install
Quick Example: Connecting to GraphD Using Graph Client
from nebula3.gclient.net import ConnectionPool
from nebula3.Config import Config

# Define the pool configuration
config = Config()
config.max_connection_pool_size = 10

# Initialize the connection pool with the graphd address
connection_pool = ConnectionPool()
ok = connection_pool.init([('127.0.0.1', 9669)], config)

# Option 1: acquire and release the session yourself
session = connection_pool.get_session('root', 'nebula')
session.execute('USE basketballplayer')
result = session.execute('SHOW TAGS')
print(result)
session.release()

# Option 2: use session_context so the session is released automatically
with connection_pool.session_context('root', 'nebula') as session:
    session.execute('USE basketballplayer')
    result = session.execute('SHOW TAGS')
    print(result)

# Close the pool
connection_pool.close()
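The quick example prints whatever execute() returns without checking whether the statement succeeded. The following is a minimal sketch of a checked wrapper. It assumes the result object exposes is_succeeded() (used elsewhere in this README) and error_msg(); execute_checked and QueryError are hypothetical names introduced here for illustration, not part of nebula3-python.

```python
class QueryError(RuntimeError):
    """Raised when a statement does not succeed (hypothetical helper type)."""


def execute_checked(session, statement):
    """Run `statement` on `session` and return the result only on success.

    `session` is any object with an execute(str) method, such as the session
    obtained from the connection pool in the quick example.
    """
    result = session.execute(statement)
    # Assumption: the result object provides is_succeeded() and error_msg().
    if not result.is_succeeded():
        raise QueryError(f"{statement!r} failed: {result.error_msg()}")
    return result
```

With a real session this would be called as execute_checked(session, 'USE basketballplayer'), so that a missing space surfaces as an exception instead of a silently failed result.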
Using the Session Pool: A Guide
The session pool manages a collection of sessions. It is designed to make session management more efficient and to reduce the overhead of creating and destroying sessions.
Session Pool comes with the following assumptions:
- A space must already exist in the database prior to the initialization of the session pool.
- Each session pool is associated with a single user and a single space to ensure consistent access control for the user. For instance, a user may possess different access permissions across various spaces. To execute queries in multiple spaces, consider utilizing several session pools.
- Whenever sessionPool.execute() is invoked, the session executes the query within the space specified in the session pool configuration.
- It is imperative to avoid executing commands through the session pool that would alter passwords or remove users.
For more details, see SessionPoolExample.py.
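The assumptions above can be sketched as follows. This is not the content of SessionPoolExample.py; it is a minimal illustration that assumes the SessionPool(username, password, space_name, addresses) constructor and SessionPoolConfig class, which you should verify against your installed client version. The helper names (is_pool_safe, open_session_pool, pool_execute) and the list of rejected statements are hypothetical.

```python
import re

# Statements that should not go through a session pool: changing passwords
# or removing users. This pattern list is an assumption, not an official one.
_FORBIDDEN = re.compile(
    r"^\s*(ALTER\s+USER|CHANGE\s+PASSWORD|DROP\s+USER)\b", re.IGNORECASE
)


def is_pool_safe(statement):
    """Return False if any ';'-separated part starts with a forbidden command."""
    return not any(_FORBIDDEN.match(part) for part in statement.split(";"))


def open_session_pool(user, password, space, addresses):
    """Create a pool bound to one user and one space that already exists."""
    # Imported lazily so the guard above works without the client installed.
    from nebula3.Config import SessionPoolConfig
    from nebula3.gclient.net.SessionPool import SessionPool

    pool = SessionPool(user, password, space, addresses)
    if not pool.init(SessionPoolConfig()):
        raise RuntimeError(f"could not initialise session pool for {space!r}")
    return pool


def pool_execute(pool, statement):
    """Execute `statement` in the pool's space after the safety check."""
    if not is_pool_safe(statement):
        raise ValueError(f"refusing to run through the session pool: {statement!r}")
    return pool.execute(statement)
```

To query several spaces, one pool per space would be opened, for example open_session_pool('root', 'nebula', 'basketballplayer', [('127.0.0.1', 9669)]), and each query routed to the pool of its space.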
Example: Extracting Edge and Vertex Lists from Query Results
For graph visualization, the following snippet shows how to extract lists of edges and vertices from any query result with the ResultSet.dict_for_vis() method.
result = session.execute(
'GET SUBGRAPH WITH PROP 2 STEPS FROM "player101" YIELD VERTICES AS nodes, EDGES AS relationships;')
data_for_vis = result.dict_for_vis()
Then, we could pass data_for_vis to a front-end visualization library such as vis.js, d3.js, or Apache ECharts. There is an Apache ECharts example in example/apache_echarts.html.
The dict/JSON structure returned by dict_for_vis() is as follows:
{
    'nodes': [
        {
            'id': 'player100',
            'labels': ['player'],
            'props': {
                'name': 'Tim Duncan',
                'age': '42',
                'id': 'player100'
            }
        },
        {
            'id': 'player101',
            'labels': ['player'],
            'props': {
                'age': '36',
                'name': 'Tony Parker',
                'id': 'player101'
            }
        }
    ],
    'edges': [
        {
            'src': 'player100',
            'dst': 'player101',
            'name': 'follow',
            'props': {
                'degree': '95'
            }
        }
    ],
    'nodes_dict': {
        'player100': {
            'id': 'player100',
            'labels': ['player'],
            'props': {
                'name': 'Tim Duncan',
                'age': '42',
                'id': 'player100'
            }
        },
        'player101': {
            'id': 'player101',
            'labels': ['player'],
            'props': {
                'age': '36',
                'name': 'Tony Parker',
                'id': 'player101'
            }
        }
    },
    'edges_dict': {
        ('player100', 'player101', 0, 'follow'): {
            'src': 'player100',
            'dst': 'player101',
            'name': 'follow',
            'props': {
                'degree': '95'
            }
        }
    },
    'nodes_count': 2,
    'edges_count': 1
}
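As an illustration of handing this structure to a front-end library, the sketch below maps the 'nodes' and 'edges' lists to the node and edge shape that vis.js Network expects (id/label/group for nodes, from/to/label for edges). The function name to_visjs, the choice of the 'name' property as the label, and the trimmed sample are assumptions made for this example.

```python
import json


def to_visjs(data):
    """Map a dict_for_vis()-style dict to vis.js Network node and edge lists."""
    nodes = [
        {
            "id": node["id"],
            # Use the 'name' property as the label when present, else the id.
            "label": node["props"].get("name", node["id"]),
            "group": ",".join(node["labels"]),
        }
        for node in data["nodes"]
    ]
    edges = [
        {"from": edge["src"], "to": edge["dst"], "label": edge["name"]}
        for edge in data["edges"]
    ]
    return {"nodes": nodes, "edges": edges}


# Trimmed sample in the structure shown above.
sample = {
    "nodes": [
        {"id": "player100", "labels": ["player"],
         "props": {"name": "Tim Duncan", "age": "42", "id": "player100"}},
        {"id": "player101", "labels": ["player"],
         "props": {"age": "36", "name": "Tony Parker", "id": "player101"}},
    ],
    "edges": [
        {"src": "player100", "dst": "player101", "name": "follow",
         "props": {"degree": "95"}},
    ],
}

# The converted structure contains only strings and lists, so it serializes.
payload = json.dumps(to_visjs(sample))
```

Only the list forms are converted here. The full dictionary cannot be passed to json.dumps directly, because 'edges_dict' uses tuple keys, which the json module rejects.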
Example: Fetching Query Results into a Pandas DataFrame
For nebula3-python>=3.6.0:
Assuming you have pandas installed, you can use the following code to fetch query results into a pandas DataFrame:
pip3 install pandas
result = session.execute('<your query>')
df = result.as_data_frame()
For nebula3-python<3.6.0:
from nebula3.gclient.net import ConnectionPool
from nebula3.Config import Config
import pandas as pd
from typing import Dict
from nebula3.data.ResultSet import ResultSet


def result_to_df(result: ResultSet) -> pd.DataFrame:
    """
    Build a list for each column, then assemble the lists into a DataFrame.
    """
    assert result.is_succeeded()
    columns = result.keys()
    d: Dict[str, list] = {}
    for col_num in range(result.col_size()):
        col_name = columns[col_num]
        col_list = result.column_values(col_name)
        # cast() converts each wrapped value to a plain Python value
        d[col_name] = [x.cast() for x in col_list]
    return pd.DataFrame(d)


# Define the pool configuration and initialize the connection pool
config = Config()
connection_pool = ConnectionPool()
ok = connection_pool.init([('127.0.0.1', 9669)], config)

# The session is released automatically when the block exits
with connection_pool.session_context('root', 'nebula') as session:
    session.execute('USE <your graph space>')
    result = session.execute('<your query>')
    df = result_to_df(result)
    print(df)

# Close the pool
connection_pool.close()
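For code that has to run against either client version, one option is to detect the built-in method at runtime and fall back to a helper such as result_to_df above. The sketch below is an assumption-laden illustration: to_data_frame is a hypothetical name, and the fallback is passed in as a parameter so the snippet does not depend on pandas itself.

```python
def to_data_frame(result, fallback):
    """Use result.as_data_frame() when the client provides it, else `fallback`.

    `fallback` is a callable taking the result, for example the
    result_to_df function defined for nebula3-python<3.6.0.
    """
    as_df = getattr(result, "as_data_frame", None)
    if callable(as_df):
        return as_df()
    return fallback(result)
```

A call would then look like df = to_data_frame(result, result_to_df).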
Quick Example: Using Storage Client to Scan Vertices and Edges
Storage Client enables you to scan vertices and edges directly from the storage service instead of going through the graph service with nGQL/Cypher. This is useful when you need to scan a large amount of data.
Make sure the scan client can connect to the storage addresses listed by SHOW HOSTS.
from nebula3.mclient import MetaCache, HostAddr
from nebula3.sclient.GraphStorageClient import GraphStorageClient

# Addresses of the metad servers
meta_cache = MetaCache([('172.28.1.1', 9559),
                        ('172.28.1.2', 9559),
                        ('172.28.1.3', 9559)],
                       50000)

# Option 1: let the client obtain the storage addresses through metad
graph_storage_client = GraphStorageClient(meta_cache)

# Option 2: specify the storage addresses manually
storage_addrs = [HostAddr(host='172.28.1.4', port=9779),
                 HostAddr(host='172.28.1.5', port=9779),
                 HostAddr(host='172.28.1.6', port=9779)]
graph_storage_client = GraphStorageClient(meta_cache, storage_addrs)

# Scan vertices with tag 'person' in space 'ScanSpace'
resp = graph_storage_client.scan_vertex(
    space_name='ScanSpace',
    tag_name='person')
while resp.has_next():
    result = resp.next()
    for vertex_data in result:
        print(vertex_data)

# Scan edges of type 'friend' in space 'ScanSpace'
resp = graph_storage_client.scan_edge(
    space_name='ScanSpace',
    edge_name='friend')
while resp.has_next():
    result = resp.next()
    for edge_data in result:
        print(edge_data)
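The two scan loops share the same has_next()/next() pattern, so they can be folded into one generator. This is a minimal sketch; iter_scan is a hypothetical helper, and skipping a None batch is a defensive assumption rather than documented behavior.

```python
def iter_scan(resp):
    """Yield every item from a scan response, one batch at a time."""
    while resp.has_next():
        batch = resp.next()
        # Assumption: a batch may be None (for example, an empty part); skip it.
        if batch is None:
            continue
        yield from batch
```

With the client from the example, the vertex loop would become: for vertex_data in iter_scan(graph_storage_client.scan_vertex(space_name='ScanSpace', tag_name='person')): print(vertex_data). Because the generator is lazy, batches are still fetched only as they are consumed.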
See ScanVertexEdgeExample.py for more details.
Compatibility Matrix
Nebula-Python Version | Compatible NebulaGraph Versions | Notes |
--- | --- | --- |
3.5.1 | 3.x | Highly recommended. Latest release for NebulaGraph 3.x series. |
master | master | Includes recent changes. Not yet released. |
3.0.0 ~ 3.5.0 | 3.x | Compatible with any released version within the NebulaGraph 3.x series. |
2.6.0 | 2.6.0, 2.6.1 | |
2.5.0 | 2.5.0 | |
2.0.0 | 2.0.0, 2.0.1 | |
1.0 | 1.x | |
Directory Structure Overview
.
└──nebula-python
│
├── nebula3 // client source code
│ ├── fbthrift // the RPC code generated from thrift protocol
│ ├── common
│ ├── data
│ ├── graph
│ ├── meta
│ ├── net // the net code for graph client
│ ├── storage // the storage client code
│ ├── Config.py // the pool config
│ └── Exception.py // the exceptions
│
├── examples
│ ├── FormatResp.py // the format response example
│ ├── SessionPoolExample.py // the session pool example
│ ├── GraphClientMultiThreadExample.py // the multi thread example
│ ├── GraphClientSimpleExample.py // the simple example
│ └── ScanVertexEdgeExample.py // the scan vertex and edge example(storage client)
│
├── tests // the test code
│
├── setup.py // used to install or package
│
└── README.md // the introduction of nebula3-python
Contribute to Nebula-Python
To contribute, start by forking the repository. Next, clone your forked repository to your local machine. Remember to substitute {username} with your actual GitHub username in the URL below:
git clone https://github.com/{username}/nebula-python.git
cd nebula-python
For package management, we utilize PDM. Please begin by installing it:
pipx install pdm
Visit the PDM documentation for alternative installation methods.
Install the package and all dev dependencies:
pdm install
Make sure the Nebula server is running, then run the tests with pytest:
pdm test
The default formatter is black. Please run pdm fmt to format Python code before submitting.
See How to contribute for the general process of contributing to Nebula projects.