AzFS
AzFS provides convenient Python read/write functions for Azure Storage Accounts.
AzFS can
- list files in blob (the wildcard * is also supported),
- check if a file exists,
- read csv as pd.DataFrame, and json as dict from blob,
- write pd.DataFrame as csv, and dict as json to blob.
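Wildcard listing can be pictured as filtering blob names against a shell-style pattern. The following is a minimal sketch using only the standard library; the helper `filter_blob_names` and the sample blob names are hypothetical, and this is not necessarily how AzFS matches wildcards internally.

```python
from fnmatch import fnmatchcase


def filter_blob_names(names, pattern):
    """Return the names that match a shell-style wildcard pattern."""
    # fnmatchcase is case-sensitive on every platform.
    # Note that "*" here also matches "/" inside a blob name.
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical blob names inside one container.
blob_names = ["test_file.csv", "2020/data.csv", "notes.json", "README.md"]

csv_names = filter_blob_names(blob_names, "*.csv")
print(csv_names)
```

Because `fnmatchcase` lets `*` cross `/`, nested names such as `2020/data.csv` also match `*.csv`; a stricter matcher would be needed to stay within one virtual folder.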
install
$ pip install azfs
usage
For Blob Storage
import azfs
from azure.identity import DefaultAzureCredential
import pandas as pd

# create a client without passing a credential
azc = azfs.AzFileClient()
# or pass a credential explicitly
credential = "[your storage account credential]"
# or
credential = DefaultAzureCredential()
azc = azfs.AzFileClient(credential=credential)
# or pass a connection string
connection_string = "DefaultEndpointsProtocol=https;AccountName=xxxx;AccountKey=xxxx;EndpointSuffix=core.windows.net"
azc = azfs.AzFileClient(connection_string=connection_string)

# read csv as pd.DataFrame
csv_path = "https://testazfs.blob.core.windows.net/test_caontainer/test_file.csv"
df = azc.read_csv(csv_path, index_col=0)
# or, inside the client context, through pandas
with azc:
    df = pd.read_csv_az(csv_path, header=None)

# write pd.DataFrame as csv
azc.write_csv(path=csv_path, df=df)
# or
with azc:
    df.to_csv_az(path=csv_path, index=False)

# read every csv file that matches the wildcard
csv_pattern_path = "https://testazfs.blob.core.windows.net/test_caontainer/*.csv"
df = azc.read().csv(csv_pattern_path)
# apply a function (here, a row filter on id) as part of the read
df = azc.read().apply(function=lambda x: x[x['id'] == 'AAA']).csv(csv_pattern_path)
# same, with use_mp=True to read the files using multiprocessing
df = azc.read(use_mp=True).apply(function=lambda x: x[x['id'] == 'AAA']).csv(csv_pattern_path)
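The multi-file read with a filter function can be thought of as three steps: parse each matched file, apply the function, and concatenate the results. The sketch below illustrates that idea with the standard library only, using in-memory CSV text in place of blobs. The names `read_csv_text`, `read_many`, and the sample data are hypothetical, and applying the function once per file is an assumption rather than a description of how AzFS works internally.

```python
import csv
import io


def read_csv_text(text):
    """Parse CSV text with a header row into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def read_many(files, function=None):
    """Parse each file, optionally apply `function` to its rows, then concatenate."""
    combined = []
    for name in sorted(files):
        rows = read_csv_text(files[name])
        if function is not None:
            rows = function(rows)
        combined.extend(rows)
    return combined


# Hypothetical in-memory stand-ins for the blobs matched by "*.csv".
files = {
    "a.csv": "id,value\nAAA,1\nBBB,2\n",
    "b.csv": "id,value\nAAA,3\nCCC,4\n",
}

# Keep only the rows whose id is "AAA", mirroring the filter in the usage example.
only_aaa = read_many(files, function=lambda rows: [r for r in rows if r["id"] == "AAA"])
print(only_aaa)
```

Filtering before concatenation keeps the combined result small, which is one plausible reason to pass the function into the read rather than filter afterwards.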
For Queue Storage
import azfs
queue_url = "https://{storage_account}.queue.core.windows.net/{queue_name}"
azc = azfs.AzFileClient()
queue_message = azc.get(queue_url)
queue_content = queue_message.get('content')
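Both the blob and queue examples identify a resource by its URL, where the host name carries the storage account and the storage type. As a rough illustration, such a URL can be split with the standard library as below. The helper `split_storage_url` and the names `myaccount` and `myqueue` are hypothetical and are not part of the AzFS API.

```python
from urllib.parse import urlparse


def split_storage_url(url):
    """Split a storage URL into (account, storage_type, container_or_queue, path)."""
    parsed = urlparse(url)
    # The host looks like "<account>.<storage_type>.core.windows.net".
    account, storage_type = parsed.netloc.split(".")[:2]
    # The first path segment is the container (or queue); the rest is the blob path.
    first, _, rest = parsed.path.lstrip("/").partition("/")
    return account, storage_type, first, rest


blob_parts = split_storage_url(
    "https://testazfs.blob.core.windows.net/test_caontainer/test_file.csv"
)
# Hypothetical account and queue names, for illustration only.
queue_parts = split_storage_url("https://myaccount.queue.core.windows.net/myqueue")
print(blob_parts)
print(queue_parts)
```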
For Table Storage
import azfs
cons = {
    "account_name": "{storage_account_name}",
    "account_key": "{credential}",
    "database_name": "{database_name}"
}
table_client = azfs.TableStorageWrapper(**cons)
table_client.put(id_="1", message="hello_world")
table_client.get(id_="1")
check more details in
types of authorization
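Supported authentication types, as used in the usage examples above, are
- a credential passed as credential (a storage account credential string, or an azure.identity credential such as DefaultAzureCredential),
- a connection string passed as connection_string.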
types of storage account kind
The table below shows whether AzFS provides read/write functions for each storage type.

account kind | Blob | Data Lake | Queue | File | Table |
---|---|---|---|---|---|
StorageV2 | O | O | O | X | O |
StorageV1 | O | O | O | X | O |
BlobStorage | O | - | - | - | - |

- O: provides basic functions
- X: not provided
- -: storage type unavailable
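For scripts that need to check support before choosing a storage type, the table can be transcribed into a lookup. This is only a sketch: the names `SUPPORT` and `is_supported` are hypothetical and AzFS itself does not necessarily expose such a helper.

```python
# Support matrix transcribed from the table above.
# "O": basic functions provided, "X": not provided, "-": storage type unavailable.
SUPPORT = {
    "StorageV2": {"Blob": "O", "Data Lake": "O", "Queue": "O", "File": "X", "Table": "O"},
    "StorageV1": {"Blob": "O", "Data Lake": "O", "Queue": "O", "File": "X", "Table": "O"},
    "BlobStorage": {"Blob": "O", "Data Lake": "-", "Queue": "-", "File": "-", "Table": "-"},
}


def is_supported(account_kind, storage_type):
    """Return True only when the table marks the combination with "O"."""
    return SUPPORT[account_kind][storage_type] == "O"


print(is_supported("StorageV2", "Queue"))
print(is_supported("StorageV2", "File"))
print(is_supported("BlobStorage", "Table"))
```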
dependencies
pandas
azure-identity >= "1.3.1"
azure-storage-blob >= "12.3.0"
azure-storage-file-datalake >= "12.0.0"
azure-storage-queue >= "12.1.1"
azure-cosmosdb-table