databricks-utils
databricks-utils
is a Python package that provides several utility classes and functions
to improve ease of use in Databricks notebooks.
Installation
pip install databricks-utils
Features
Documentation
API documentation can be found at https://e2fyi.github.io/databricks-utils/.
Quick start
S3Bucket
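import json
from databricks_utils.aws import S3Bucket

# attach the notebook's dbutils before using S3Bucket
S3Bucket.attach_dbutils(dbutils)
# create the bucket handle, pass in the notebook's SparkContext (sc), and mount it
bucket = (S3Bucket("somebucketname", "SOMEACCESSKEY", "SOMESECRETKEY")
          .allow_spark(sc)
          .mount("somebucketname"))

# list the contents of the bucket's "resource/" folder
bucket.ls("resource/")
# read a JSON file from the bucket; the mode "r" belongs to open(), not bucket.local()
with open(bucket.local("resource/somefile.json"), "r") as f:
    data = json.load(f)
# read a parquet dataset from the bucket via Spark
dataframe = spark.read.parquet(bucket.s3("resource/somedf.parquet"))

# unmount the bucket when done
bucket.umount()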
Vega
Vega and Vega-Lite
are high-level grammars of interactive graphics. They provide a concise JSON
syntax for rapidly generating visualizations to support analysis.
from databricks_utils.vega import vega_embed

spec = {
    "data": {
        "values": [
            {"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43},
            {"a": "D", "b": 91}, {"a": "E", "b": 81}, {"a": "F", "b": 53},
            {"a": "G", "b": 19}, {"a": "H", "b": 87}, {"a": "I", "b": 52}
        ]
    },
    "mark": "bar",
    "encoding": {
        "x": {"field": "a", "type": "ordinal"},
        "y": {"field": "b", "type": "quantitative"}
    }
}
displayHTML(vega_embed(spec=spec))
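Because a Vega-Lite spec is plain JSON-like data, a spec such as the one above can also be assembled from records in Python before it is embedded. The following is a minimal sketch under that assumption; `bar_chart_spec` is a hypothetical helper written for illustration and is not part of databricks-utils.

```python
import json


def bar_chart_spec(records, x_field, y_field):
    """Build a Vega-Lite bar chart spec (a plain dict) from a list of dicts.

    Hypothetical helper for illustration; not provided by databricks-utils.
    """
    return {
        "data": {"values": list(records)},
        "mark": "bar",
        "encoding": {
            "x": {"field": x_field, "type": "ordinal"},
            "y": {"field": y_field, "type": "quantitative"},
        },
    }


records = [{"a": "A", "b": 28}, {"a": "B", "b": 55}, {"a": "C", "b": 43}]
spec = bar_chart_spec(records, "a", "b")

# The spec is JSON-serializable, so it can be inspected or saved as text.
spec_json = json.dumps(spec, indent=2)

# In a Databricks notebook, the spec could then be rendered as in the
# example above:
# displayHTML(vega_embed(spec=spec))
```

Keeping the spec as an ordinary dict means it can be checked or modified with normal Python code before it is handed to `vega_embed`.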
Developer
. add_tag.sh <VERSION>