wraps gsutil, a command-line interface to Google Cloud Storage.
gsutilwrap
gsutilwrap wraps gsutil, the command-line interface to Google Cloud Storage, in order to simplify deployment and backup tasks. It provides a set of data manipulation commands including copying, reading, writing and hashing stored data.
We primarily needed something simple that can still leverage multi-threading, has decent progress output and implements robust pattern matching. Since the gsutil CLI already provides all this functionality, we decided to wrap it. The wrapper adds type-annotated arguments and supports code inspection and autocompletion in an IDE such as PyCharm.
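To illustrate what wrapping the CLI means, here is a minimal sketch of how a typed Python function could assemble a gsutil command line. The helper `build_copy_command` is hypothetical and is not taken from gsutilwrap's source; only the `gsutil` executable, its top-level `-m` flag and the `cp` command are real.

```python
from typing import List


def build_copy_command(pattern: str, target: str,
                       multithreaded: bool = False) -> List[str]:
    """Build the argument list for a gsutil copy (hypothetical helper)."""
    cmd = ["gsutil"]
    if multithreaded:
        # -m is gsutil's top-level flag for running supported
        # operations such as cp in parallel.
        cmd.append("-m")
    cmd.extend(["cp", pattern, target])
    return cmd


command = build_copy_command(
    pattern="gs://some-bucket/some-path/*.txt",
    target="/some/dir",
    multithreaded=True)
# The list could then be passed to subprocess.run(command, check=True).
```

Passing the arguments as a list rather than a single shell string avoids shell quoting problems with wildcard patterns.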
Additionally, since gsutil lacked copying of multiple patterns to multiple targets, we created this extra feature in gsutilwrap.
If you need to transfer data from/to Google Cloud Storage in the core of your application, we recommend the google-cloud-storage library provided by Google itself. That library is much more sophisticated in terms of features and spares you the overhead of authorizing and spawning a process for each operation. However, it lacks pattern matching (except for matching prefixes), and you have to manage multi-threading and progress output yourself.
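When only prefix matching is available, wildcard matching has to be done on the client side. The sketch below shows one way to do that with the standard-library `fnmatch` module. The helper `filter_names` and the hard-coded list of object names are assumptions for illustration; in practice the names would come from a prefix listing. Note that `fnmatch` lets `*` match `/` as well, so its semantics are not identical to gsutil wildcards.

```python
import fnmatch
from typing import Iterable, List


def filter_names(names: Iterable[str], pattern: str) -> List[str]:
    """Keep only the names matching a shell-style pattern (hypothetical helper)."""
    return [name for name in names
            if fnmatch.fnmatchcase(name, pattern)]


# Stand-in for object names returned by a prefix listing.
names = [
    "some-path/a.txt",
    "some-path/sub/b.txt",
    "some-path/c.xml",
]
matched = filter_names(names, "some-path/*.txt")
```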
Usage
import pathlib
import gsutilwrap
# list
lst = gsutilwrap.ls(
    'gs://some-bucket/some-path/**/*.txt')
lst = gsutilwrap.ls_many(
    ['gs://some-bucket/some-path/**/*.txt',
     'gs://another-bucket/another-path/**/*.xml'],
    multithreaded=True)
# if you need a listing with size and update time, use long_ls
entries = gsutilwrap.long_ls(
    'gs://some-bucket/some-path/**/*.txt')
for entry in entries:
    print("File size and update time of {}: {} {}".format(
        entry.url, entry.size, entry.update_time))
# write/read text
gsutilwrap.write_text(
    url='gs://some-bucket/some-path/some-file.txt',
    text='some text')
text = gsutilwrap.read_text(
    url='gs://some-bucket/some-path/some-file.txt')
# write/read bytes
gsutilwrap.write_bytes(
    url='gs://some-bucket/some-path/some-file.bin',
    data=b'\xDE\xAD\xBE\xEF')
data = gsutilwrap.read_bytes(
    url='gs://some-bucket/some-path/some-file.bin')
# copy
gsutilwrap.copy(
    pattern="gs://some-bucket/some-path/*.txt",
    target="/some/dir")
gsutilwrap.copy_many_to_one(
    patterns=[
        "gs://some-bucket/some-path/*.txt",
        "gs://some-bucket/some-path/*.xml"
    ],
    target="/some/dir")
gsutilwrap.copy_many_to_many(
    patterns_targets=[
        ("gs://some-bucket/some-path/*.txt", "/some/dir"),
        ("gs://some-bucket/some-path/*.xml", "/some/other/dir")
    ])
# stat an object
stat = gsutilwrap.stat(
    url='gs://some-bucket/some-path/some-file.txt')
print("Modification time: {}".format(stat.file_mtime))
print("Size: {}".format(stat.content_length))
print("MD5: {}".format(stat.md5.hex()))
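The MD5 reported by stat can be used to check a local copy of an object. The following is a minimal sketch of the local half of that check; `local_md5` is a hypothetical helper, and it assumes that `stat.md5` holds the raw digest bytes (the usage above calls `.hex()` on it).

```python
import hashlib
import pathlib


def local_md5(path: pathlib.Path, chunk_size: int = 1 << 20) -> bytes:
    """Compute the raw MD5 digest of a local file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fid:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: fid.read(chunk_size), b""):
            digest.update(chunk)
    return digest.digest()


# Possible comparison against a remote object (assumption, not run here):
#   stat = gsutilwrap.stat(url='gs://some-bucket/some-path/some-file.txt')
#   matches = local_md5(pathlib.Path('/some/dir/some-file.txt')) == stat.md5
```

Reading in chunks keeps memory use bounded for large files.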
Installation
Create a virtual environment:
python3 -m venv venv3
Activate it:
source venv3/bin/activate
Install gsutilwrap with pip:
pip3 install gsutilwrap
Make sure that the gsutil command-line interface is installed (see the gsutil installation instructions).
Development
Check out the repository.
In the repository root, create the virtual environment:
python3 -m venv venv3
Activate the virtual environment:
source venv3/bin/activate
Install the development dependencies:
pip3 install -e .[dev]
We provide a set of live tests. The live tests need an existing bucket in Google Cloud Storage. Set the URL prefix to be used for all the live tests via the environment variable TEST_GSUTILWRAP_URL_PREFIX.
Mind that the live tests use Google Cloud resources for which you will be billed. Always check that no resources remain in use after the tests have finished so that you don’t incur unnecessary costs!
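As an illustration of how a test might consume the URL prefix, here is a small sketch. The helper `live_test_url` is hypothetical and not part of the repository's test suite; only the variable name TEST_GSUTILWRAP_URL_PREFIX comes from the description above.

```python
import os
from typing import Mapping, Optional


def live_test_url(name: str,
                  environ: Optional[Mapping[str, str]] = None) -> str:
    """Join the configured URL prefix with an object name (hypothetical helper)."""
    env = os.environ if environ is None else environ
    prefix = env.get("TEST_GSUTILWRAP_URL_PREFIX")
    if not prefix:
        # Fail early instead of touching an unintended bucket.
        raise RuntimeError(
            "Set TEST_GSUTILWRAP_URL_PREFIX before running the live tests.")
    return "{}/{}".format(prefix.rstrip("/"), name)
```

Failing loudly when the variable is unset avoids writing test objects to an unexpected location.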
We use tox for testing and packaging the distribution. Assuming that the virtual environment has been activated and the development dependencies have been installed, run:
tox
We also provide a set of pre-commit checks that lint and check code for formatting. Run them locally from an activated virtual environment with development dependencies:
./precommit.py
The pre-commit script can also automatically format the code:
./precommit.py --overwrite
Versioning
We follow Semantic Versioning. The version X.Y.Z indicates:
X is the major version (backward-incompatible),
Y is the minor version (backward-compatible), and
Z is the patch version (backward-compatible bug fix).
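Under this scheme, a caller can decide whether an installed version satisfies a requirement by comparing the parts. The sketch below assumes plain X.Y.Z strings without pre-release suffixes; `parse_version` and `is_compatible` are hypothetical helpers written for illustration.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Split 'X.Y.Z' into integers (no pre-release suffixes supported)."""
    major, minor, patch = (int(part) for part in text.split("."))
    return major, minor, patch


def is_compatible(installed: str, required: str) -> bool:
    """Same major version, and installed is at least the required version."""
    inst = parse_version(installed)
    req = parse_version(required)
    # Tuples compare element by element: major, then minor, then patch.
    return inst[0] == req[0] and inst >= req
```

For example, `is_compatible("1.1.2", "1.0.0")` holds because the major versions agree and 1.1.2 is newer, while a 2.x release would not satisfy a 1.x requirement.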
Download files
Source Distribution
File details
Details for the file gsutilwrap-1.1.2.tar.gz.
File metadata
- Download URL: gsutilwrap-1.1.2.tar.gz
- Upload date:
- Size: 11.5 kB
- Tags: Source
- Uploaded using Trusted Publishing? No
- Uploaded via: twine/1.11.0 pkginfo/1.4.2 requests/2.19.1 setuptools/40.1.0 requests-toolbelt/0.8.0 tqdm/4.24.0 CPython/3.5.2
File hashes
Algorithm | Hash digest
---|---
SHA256 | 8c45ce8aa4b940d7e217ae6c2c6ab8488f17057d0cd82db44bdbfce3037a3f4c
MD5 | 9ec8da4f258e5e3be0e00040bcb3cd03
BLAKE2b-256 | f8701f199e69236e22d2126ad8e572de29462fdd2e91f6162eddf40d00c936da