An S3-backed ContentsManager implementation for Jupyter

S3Contents

An S3- and GCS-backed ContentsManager implementation for Jupyter.

It aims to be a transparent, drop-in replacement for Jupyter's standard filesystem-backed storage system. With this implementation of a Jupyter Contents Manager you can save all your notebooks, regular files, and directory structure directly to an S3/GCS bucket. The bucket can be on AWS/GCP or on a self-hosted, S3 API-compatible service such as minio.

While there are some implementations of this functionality already available online (s3nb or s3drive), I wasn't able to make them work in newer Jupyter Notebook installations. This project aims to be better tested by being based heavily on the awesome PGContents.

Prerequisites

Write access (valid credentials) to an S3/GCS bucket. The bucket can be on AWS/GCP or on a self-hosted S3-compatible service such as minio.
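
Before pointing Jupyter at a bucket, it can help to confirm that the credentials really allow writes. The following is a minimal sketch, not part of s3contents: the helper name can_write and the probe key name are made up for illustration, and s3_client is assumed to be a boto3 S3 client (or anything with the same put_object and delete_object methods).

```python
import uuid


def can_write(s3_client, bucket, prefix=""):
    """Return True if a tiny probe object can be written to and removed from `bucket`."""
    # Use a unique key so the probe never collides with real content.
    name = f".write-check-{uuid.uuid4().hex}"
    prefix = prefix.strip("/")
    key = f"{prefix}/{name}" if prefix else name
    try:
        s3_client.put_object(Bucket=bucket, Key=key, Body=b"")
        s3_client.delete_object(Bucket=bucket, Key=key)
    except Exception as exc:  # with boto3 this is typically a botocore ClientError
        print(f"Write check failed for {bucket!r}: {exc}")
        return False
    return True
```

With boto3 installed, a client can be created with boto3.client("s3") (passing endpoint_url for a self-hosted service) and handed to can_write together with the bucket name.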

Installation

$ pip install s3contents

Jupyter config

Edit ~/.jupyter/jupyter_notebook_config.py by filling in the missing values:

S3

from s3contents import S3ContentsManager

c = get_config()

# Tell Jupyter to use S3ContentsManager for all storage.
c.NotebookApp.contents_manager_class = S3ContentsManager
c.S3ContentsManager.access_key_id = "<AWS Access Key ID / IAM Access Key ID>"
c.S3ContentsManager.secret_access_key = "<AWS Secret Access Key / IAM Secret Access Key>"
c.S3ContentsManager.session_token = "<AWS Session Token / IAM Session Token>"
c.S3ContentsManager.bucket = "<bucket-name>"

# Optional settings:
c.S3ContentsManager.prefix = "this/is/a/prefix"
c.S3ContentsManager.sse = "AES256"
c.S3ContentsManager.signature_version = "s3v4"
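
To keep secrets out of the config file, the values can be read from the environment instead of being written inline. This is only a sketch under stated assumptions: the helper s3_settings_from_env is hypothetical, the AWS_* variable names are the standard AWS ones, and the S3CONTENTS_* names are invented for this example.

```python
import os


def s3_settings_from_env(env=None):
    """Collect S3ContentsManager settings from environment variables.

    Only variables that are set and non-empty are returned, so anything
    missing keeps its default value.
    """
    env = os.environ if env is None else env
    # Setting name -> environment variable. The AWS_* names are standard;
    # the S3CONTENTS_* names are hypothetical, chosen for this sketch.
    mapping = {
        "access_key_id": "AWS_ACCESS_KEY_ID",
        "secret_access_key": "AWS_SECRET_ACCESS_KEY",
        "session_token": "AWS_SESSION_TOKEN",
        "bucket": "S3CONTENTS_BUCKET",
        "prefix": "S3CONTENTS_PREFIX",
    }
    return {name: env[var] for name, var in mapping.items() if env.get(var)}
```

In the config file, the result could then be applied with a loop such as: for name, value in s3_settings_from_env().items(): setattr(c.S3ContentsManager, name, value).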

Example for play.minio.io:9000:

from s3contents import S3ContentsManager

c = get_config()

# Tell Jupyter to use S3ContentsManager for all storage.
c.NotebookApp.contents_manager_class = S3ContentsManager
c.S3ContentsManager.access_key_id = "Q3AM3UQ867SPQQA43P2F"
c.S3ContentsManager.secret_access_key = "zuf+tfteSlswRu7BJ86wekitnifILbZam1KYY3TG"
c.S3ContentsManager.endpoint_url = "http://play.minio.io:9000"
c.S3ContentsManager.bucket = "s3contents-demo"
c.S3ContentsManager.prefix = "notebooks/test"

GCP

Note that the path ~/.config/gcloud/application_default_credentials.json assumes a POSIX system on which you have already run gcloud init.

from s3contents import GCSContentsManager

c = get_config()

c.NotebookApp.contents_manager_class = GCSContentsManager
c.GCSContentsManager.project = "<your-project>"
c.GCSContentsManager.token = "~/.config/gcloud/application_default_credentials.json"
c.GCSContentsManager.bucket = "<bucket-name>"
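
Because the token path above is POSIX-specific, a small helper can pick the location per platform. This is a sketch, not part of s3contents: the function name is hypothetical, and the Windows location (%APPDATA%\gcloud) is an assumption about where gcloud stores the file there.

```python
import ntpath
import os
import posixpath


def default_gcloud_credentials_path(os_name=None, env=None):
    """Return the expected location of application_default_credentials.json."""
    os_name = os.name if os_name is None else os_name
    env = os.environ if env is None else env
    filename = "application_default_credentials.json"
    if os_name == "nt":
        # Assumption: on Windows, gcloud keeps this file under %APPDATA%\gcloud.
        # A missing APPDATA variable raises KeyError rather than guessing.
        return ntpath.join(env["APPDATA"], "gcloud", filename)
    # POSIX: the same path as in the config above, with ~ expanded.
    home = env.get("HOME") or posixpath.expanduser("~")
    return posixpath.join(home, ".config", "gcloud", filename)
```

The returned string could be assigned to c.GCSContentsManager.token in place of the hard-coded path.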

AWS IAM

It is also possible to use IAM Role-based access to the S3 bucket from an Amazon EC2 instance; to do that, just leave access_key_id and secret_access_key set to their default values (None), and ensure that the EC2 instance has an IAM role which provides sufficient permissions for the bucket and the operations necessary.
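
As a starting point for the role's permissions, the sketch below builds a policy document that allows listing the bucket and reading, writing, and deleting its objects. The helper name is hypothetical, and this set of actions is an assumption that has not been checked against every call s3contents makes, so it may need adjusting.

```python
import json


def s3_bucket_policy(bucket):
    """Build an IAM policy document granting list/read/write/delete on `bucket`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is granted on the bucket itself.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
            },
            {
                # Object operations are granted on the keys inside the bucket.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            },
        ],
    }


# Print the policy as JSON so it can be pasted into the IAM console.
print(json.dumps(s3_bucket_policy("<bucket-name>"), indent=2))
```

With such a role attached to the instance, the Jupyter config only needs c.S3ContentsManager.bucket (and any optional settings); the credential settings stay at their defaults.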

Access local files

To access local files as well as remote files in S3, you can use pgcontents.

First:

pip install pgcontents

And use a configuration like this:

from s3contents import S3ContentsManager
from pgcontents.hybridmanager import HybridContentsManager
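# FileContentsManager lives in the notebook package in current Jupyter
# Notebook installs (IPython.html is the old, deprecated location).
from notebook.services.contents.filemanager import FileContentsManager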

c = get_config()

c.NotebookApp.contents_manager_class = HybridContentsManager

c.HybridContentsManager.manager_classes = {
    # Associate the root directory with an S3ContentsManager.
    # This manager will receive all requests that don't fall under any of the
    # other managers.
    "": S3ContentsManager,
    # Associate /local_directory with a FileContentsManager.
    "local_directory": FileContentsManager,
}

c.HybridContentsManager.manager_kwargs = {
    # Args for the root S3ContentsManager.
    "": {
        "access_key_id": "access-key",
        "secret_access_key": "secret-key",
        "endpoint_url": "http://localhost:9000",
        "bucket": "notebooks",
    },
    # Args for the FileContentsManager mapped to /local_directory
    "local_directory": {
        "root_dir": "/Users/drodriguez/Downloads",
    },
}
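
The comments in the configuration above describe routing by the first path component: paths under local_directory go to the FileContentsManager, everything else to the root manager. The function below is only an illustrative model of that rule, with a hypothetical name; it is not pgcontents' actual code.

```python
def route(path, prefixes):
    """Split `path` into (manager prefix, path relative to that manager)."""
    path = path.strip("/")
    first, _, rest = path.partition("/")
    if first and first in prefixes:
        # The first component names a registered manager.
        return first, rest
    # Anything else falls through to the root manager, registered under "".
    return "", path


prefixes = {"", "local_directory"}
print(route("local_directory/report.ipynb", prefixes))
print(route("analysis/model.ipynb", prefixes))
```

Under this model, the first call is handled by the FileContentsManager and the second by the S3ContentsManager.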

