
cloudpathlib
Our goal is to be the meringue of file management libraries: the subtle sweetness of pathlib working in harmony with the ethereal lightness of the cloud.
A Python library with classes that mimic pathlib.Path's interface for URIs from different cloud storage services.
with CloudPath("s3://bucket/filename.txt").open("w+") as f:
    f.write("Send my changes to the cloud!")
- If you know how to interact with Path, you know how to interact with CloudPath. All of the cloud-relevant Path methods are implemented.
- Implementing MyPath and MyClient is all you need to add support for a new cloud storage service.
- Using the write_text, write_bytes, or .open('w') methods will upload your changes to cloud storage without any additional file management on your part as a developer.

cloudpathlib depends on the cloud services' SDKs (e.g., boto3, google-cloud-storage, azure-storage-blob) to communicate with their respective storage services. If you try to use cloud paths for a cloud service whose dependencies you have not installed, cloudpathlib will raise an error and tell you what you need to install.
To install a cloud service's SDK dependency when installing cloudpathlib, you need to specify it using pip's "extras" specification. For example:
pip install cloudpathlib[s3,gs,azure]
With some shells, you may need to use quotes:
pip install "cloudpathlib[s3,gs,azure]"
Currently supported cloud storage services are: azure, gs, s3. You can also use all to install all available services' dependencies.
If you do not specify any extras or separately install any cloud SDKs, you will only be able to develop with the base classes for rolling your own cloud path class.
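One way to see ahead of time which extras are usable is to check whether each SDK can be located by the import system. The sketch below uses only the standard library. The mapping from extra name to module name is an assumption based on the SDKs named above, and `is_importable` and `available_extras` are hypothetical helpers, not part of cloudpathlib.

```python
import importlib.util

# Assumed mapping from a cloudpathlib extra to the SDK's import name.
SDK_MODULES = {
    "s3": "boto3",
    "gs": "google.cloud.storage",
    "azure": "azure.storage.blob",
}


def is_importable(module_name: str) -> bool:
    """Return True if the import system can locate module_name."""
    try:
        # For dotted names, parent packages are imported during the lookup.
        return importlib.util.find_spec(module_name) is not None
    except (ImportError, ValueError):
        # A missing parent package raises ModuleNotFoundError (an ImportError).
        return False


def available_extras(sdk_modules=SDK_MODULES) -> dict:
    """Map each extra name to whether its SDK appears to be installed."""
    return {extra: is_importable(mod) for extra, mod in sdk_modules.items()}


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra}: {'installed' if ok else 'missing'}")
```

Running this before constructing a cloud path gives a quick overview of which services are ready to use in the current environment; the error cloudpathlib itself raises remains the authoritative check.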
cloudpathlib is also available using conda from conda-forge. Note that to install the necessary cloud service SDK dependency, you should include the appropriate suffix in the package name. For example:
conda install cloudpathlib-s3 -c conda-forge
If no suffix is used, only the base classes will be usable. See the conda-forge/cloudpathlib-feedstock for all installation options.
You can get the latest development version from GitHub:
pip install "cloudpathlib[all] @ git+https://github.com/drivendataorg/cloudpathlib.git"
Note that you similarly need to specify cloud service dependencies, such as all in the above example command.
Here's an example to get the gist of using the package. By default, cloudpathlib authenticates with the environment variables supported by each respective cloud service SDK. For more details and advanced authentication options, see the "Authentication" documentation.
from cloudpathlib import CloudPath
# dispatches to S3Path based on prefix
root_dir = CloudPath("s3://drivendata-public-assets/")
root_dir
#> S3Path('s3://drivendata-public-assets/')
# there's only one file, but globbing works in nested folders
for f in root_dir.glob('**/*.txt'):
    text_data = f.read_text()
    print(f)
    print(text_data)
#> s3://drivendata-public-assets/odsc-west-2019/DATA_DICTIONARY.txt
#> Eviction Lab Data Dictionary
#>
#> Additional information in our FAQ evictionlab.org/help-faq/
#> Full methodology evictionlab.org/methods/
#>
#> ... (additional text output truncated)
# use / to join paths (and, in this case, create a new file)
new_file_copy = root_dir / "nested_dir/copy_file.txt"
new_file_copy
#> S3Path('s3://drivendata-public-assets/nested_dir/copy_file.txt')
# show things work and the file does not exist yet
new_file_copy.exists()
#> False
# writing text data to the new file in the cloud
new_file_copy.write_text(text_data)
#> 6933
# file now listed
list(root_dir.glob('**/*.txt'))
#> [S3Path('s3://drivendata-public-assets/nested_dir/copy_file.txt'),
#> S3Path('s3://drivendata-public-assets/odsc-west-2019/DATA_DICTIONARY.txt')]
# but, we can remove it
new_file_copy.unlink()
# no longer there
list(root_dir.glob('**/*.txt'))
#> [S3Path('s3://drivendata-public-assets/odsc-west-2019/DATA_DICTIONARY.txt')]
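Because the classes mimic pathlib.Path, the same sequence of calls can be rehearsed against a local temporary directory using only the standard library. The following is a minimal sketch of that idea, not cloudpathlib code: `rehearse` is a hypothetical helper, and the seeded file content is a stand-in for the real data dictionary.

```python
import tempfile
from pathlib import Path


def rehearse(root_dir: Path) -> dict:
    """Run the glob / join / write / unlink sequence and record each result."""
    results = {}

    # Seed one nested text file, standing in for the existing cloud object.
    nested = root_dir / "odsc-west-2019"
    nested.mkdir()
    (nested / "DATA_DICTIONARY.txt").write_text("Eviction Lab Data Dictionary\n")

    # Globbing descends into nested folders.
    text_data = ""
    for f in root_dir.glob("**/*.txt"):
        text_data = f.read_text()

    # Use / to join paths; the file does not exist yet.
    new_file_copy = root_dir / "nested_dir" / "copy_file.txt"
    results["exists_before"] = new_file_copy.exists()

    # Unlike the cloud example above, a local filesystem needs the parent folder first.
    new_file_copy.parent.mkdir()
    results["chars_written"] = new_file_copy.write_text(text_data)
    results["after_write"] = sorted(p.name for p in root_dir.glob("**/*.txt"))

    # Remove the copy again and confirm it is gone.
    new_file_copy.unlink()
    results["after_unlink"] = sorted(p.name for p in root_dir.glob("**/*.txt"))
    return results


with tempfile.TemporaryDirectory() as tmp:
    outcome = rehearse(Path(tmp))
    print(outcome)
```

A rehearsal like this can be handy for checking path logic before pointing the same calls at a real bucket, where writes and deletes have lasting effects.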
Most methods and properties from pathlib.Path are supported, except for the ones that don't make sense in a cloud context. There are also a few additional methods and properties that either relate to a specific cloud service or apply only to cloud paths.
| Methods + properties | AzureBlobPath | GSPath | HttpsPath | S3Path |
|---|---|---|---|---|
absolute | ✅ | ✅ | ✅ | ✅ |
anchor | ✅ | ✅ | ✅ | ✅ |
as_uri | ✅ | ✅ | ✅ | ✅ |
drive | ✅ | ✅ | ✅ | ✅ |
exists | ✅ | ✅ | ✅ | ✅ |
glob | ✅ | ✅ | ✅ | ✅ |
is_absolute | ✅ | ✅ | ✅ | ✅ |
is_dir | ✅ | ✅ | ✅ | ✅ |
is_file | ✅ | ✅ | ✅ | ✅ |
is_junction | ✅ | ✅ | ✅ | ✅ |
is_relative_to | ✅ | ✅ | ✅ | ✅ |
iterdir | ✅ | ✅ | ✅ | ✅ |
joinpath | ✅ | ✅ | ✅ | ✅ |
match | ✅ | ✅ | ✅ | ✅ |
mkdir | ✅ | ✅ | ✅ | ✅ |
name | ✅ | ✅ | ✅ | ✅ |
open | ✅ | ✅ | ✅ | ✅ |
parent | ✅ | ✅ | ✅ | ✅ |
parents | ✅ | ✅ | ✅ | ✅ |
parts | ✅ | ✅ | ✅ | ✅ |
read_bytes | ✅ | ✅ | ✅ | ✅ |
read_text | ✅ | ✅ | ✅ | ✅ |
relative_to | ✅ | ✅ | ✅ | ✅ |
rename | ✅ | ✅ | ✅ | ✅ |
replace | ✅ | ✅ | ✅ | ✅ |
resolve | ✅ | ✅ | ✅ | ✅ |
rglob | ✅ | ✅ | ✅ | ✅ |
rmdir | ✅ | ✅ | ✅ | ✅ |
samefile | ✅ | ✅ | ✅ | ✅ |
stat | ✅ | ✅ | ✅ | ✅ |
stem | ✅ | ✅ | ✅ | ✅ |
suffix | ✅ | ✅ | ✅ | ✅ |
suffixes | ✅ | ✅ | ✅ | ✅ |
touch | ✅ | ✅ | ✅ | ✅ |
unlink | ✅ | ✅ | ✅ | ✅ |
walk | ✅ | ✅ | ✅ | ✅ |
with_name | ✅ | ✅ | ✅ | ✅ |
with_segments | ✅ | ✅ | ✅ | ✅ |
with_stem | ✅ | ✅ | ✅ | ✅ |
with_suffix | ✅ | ✅ | ✅ | ✅ |
write_bytes | ✅ | ✅ | ✅ | ✅ |
write_text | ✅ | ✅ | ✅ | ✅ |
as_posix | ❌ | ❌ | ❌ | ❌ |
chmod | ❌ | ❌ | ❌ | ❌ |
cwd | ❌ | ❌ | ❌ | ❌ |
expanduser | ❌ | ❌ | ❌ | ❌ |
group | ❌ | ❌ | ❌ | ❌ |
hardlink_to | ❌ | ❌ | ❌ | ❌ |
home | ❌ | ❌ | ❌ | ❌ |
is_block_device | ❌ | ❌ | ❌ | ❌ |
is_char_device | ❌ | ❌ | ❌ | ❌ |
is_fifo | ❌ | ❌ | ❌ | ❌ |
is_mount | ❌ | ❌ | ❌ | ❌ |
is_reserved | ❌ | ❌ | ❌ | ❌ |
is_socket | ❌ | ❌ | ❌ | ❌ |
is_symlink | ❌ | ❌ | ❌ | ❌ |
lchmod | ❌ | ❌ | ❌ | ❌ |
lstat | ❌ | ❌ | ❌ | ❌ |
owner | ❌ | ❌ | ❌ | ❌ |
readlink | ❌ | ❌ | ❌ | ❌ |
root | ❌ | ❌ | ❌ | ❌ |
symlink_to | ❌ | ❌ | ❌ | ❌ |
as_url | ✅ | ✅ | ✅ | ✅ |
clear_cache | ✅ | ✅ | ✅ | ✅ |
client | ✅ | ✅ | ✅ | ✅ |
cloud_prefix | ✅ | ✅ | ✅ | ✅ |
copy | ✅ | ✅ | ✅ | ✅ |
copytree | ✅ | ✅ | ✅ | ✅ |
download_to | ✅ | ✅ | ✅ | ✅ |
from_uri | ✅ | ✅ | ✅ | ✅ |
fspath | ✅ | ✅ | ✅ | ✅ |
full_match | ✅ | ✅ | ✅ | ✅ |
is_valid_cloudpath | ✅ | ✅ | ✅ | ✅ |
parser | ✅ | ✅ | ✅ | ✅ |
rmtree | ✅ | ✅ | ✅ | ✅ |
upload_from | ✅ | ✅ | ✅ | ✅ |
validate | ✅ | ✅ | ✅ | ✅ |
etag | ✅ | ✅ | ❌ | ✅ |
blob | ✅ | ✅ | ❌ | ❌ |
bucket | ❌ | ✅ | ❌ | ✅ |
md5 | ✅ | ✅ | ❌ | ❌ |
container | ✅ | ❌ | ❌ | ❌ |
delete | ❌ | ❌ | ✅ | ❌ |
get | ❌ | ❌ | ✅ | ❌ |
head | ❌ | ❌ | ✅ | ❌ |
key | ❌ | ❌ | ❌ | ✅ |
parsed_url | ❌ | ❌ | ✅ | ❌ |
post | ❌ | ❌ | ✅ | ❌ |
put | ❌ | ❌ | ✅ | ❌ |
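To give a feel for what service-specific properties such as cloud_prefix, bucket, and key refer to, here is a minimal standard-library sketch that splits a cloud URI into those pieces. It is an illustration only, not how cloudpathlib computes them: `CloudUriParts` and `split_cloud_uri` are hypothetical names, and treating the first URI component as the bucket (or container) is an assumption.

```python
from typing import NamedTuple
from urllib.parse import urlparse


class CloudUriParts(NamedTuple):
    cloud_prefix: str  # scheme plus "://", e.g. "s3://"
    bucket: str        # first URI component (bucket or container)
    key: str           # remainder of the path, without a leading slash


def split_cloud_uri(uri: str) -> CloudUriParts:
    """Split a URI like s3://bucket/some/key into prefix, bucket, and key."""
    parsed = urlparse(uri)
    if not parsed.scheme or not parsed.netloc:
        raise ValueError(f"not a cloud URI: {uri!r}")
    return CloudUriParts(
        cloud_prefix=f"{parsed.scheme}://",
        bucket=parsed.netloc,
        key=parsed.path.lstrip("/"),
    )


parts = split_cloud_uri(
    "s3://drivendata-public-assets/odsc-west-2019/DATA_DICTIONARY.txt"
)
print(parts.cloud_prefix)
print(parts.bucket)
print(parts.key)
```

In practice the path classes expose these values directly, so a helper like this is only useful for reasoning about URIs outside the library.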
Icon made by srip from www.flaticon.com.
Sample code block generated using the reprexpy package.