
Streamlit FilesConnection

Connect to cloud (or local) file storage from your Streamlit app. Powered by st.experimental_connection() and fsspec. Works with Streamlit >= 1.22.

Any fsspec-compatible protocol should work; the corresponding implementation package just needs to be installed (see below). Read more about Streamlit Connections in the official docs.
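For example, S3 support is provided by the s3fs package and GCS support by the gcsfs package (the standard fsspec implementations):

pip install s3fs
pip install gcsfs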

Quickstart

See the example directory for a fuller example using S3 and/or GCS.

pip install streamlit
pip install git+https://github.com/streamlit/files-connection

Note: Installation from PyPI is coming soon.

import streamlit as st
from st_files_connection import FilesConnection

"# Minimal FilesConnection example"
with st.echo():
    conn = st.experimental_connection('my_connection', type=FilesConnection)

# Write a file to the local directory if it doesn't exist
test_file = "test.txt"
try:
    _ = conn.read(test_file, input_format='text')
except FileNotFoundError:
    with conn.open(test_file, "wt") as f:
        f.write("Hello, world!")

with st.echo():
    # Read back the contents of the file
    st.write(conn.read(test_file, input_format='text'))

Using for cloud file storage

For any known fsspec protocol, you can pass the protocol name to st.experimental_connection() as the first argument:

# Create an S3 connection
conn = st.experimental_connection('s3', type=FilesConnection)

# Create a GCS connection
conn = st.experimental_connection('gcs', type=FilesConnection)

# Create a Weights & Biases connection
conn = st.experimental_connection('wandb', type=FilesConnection)

For cloud file storage (or anything else that needs configuration or credentials), you can provide the configuration in two ways:

  • Using the native configuration / credential approach of the underlying library (e.g. config file or environment variables)
  • Using Streamlit secrets.

For Streamlit secrets, create a section called [connections.<name>] in your .streamlit/secrets.toml file and add parameters there (see the sketch after this list). You can pass in anything you would pass to an fsspec file system constructor. Additionally:

  • For GCS, the contents of the secrets section are assumed to be the keys of a token file (i.e. they are passed as a {"token": {<secrets>}} dict)
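As a minimal sketch, an S3 connection named s3 might be configured like this (the key and secret parameter names come from s3fs; the values are placeholders):

# .streamlit/secrets.toml
[connections.s3]
key = "..."     # your AWS access key ID
secret = "..."  # your AWS secret access key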

Main methods

read()

conn.read("path/to/file", input_format="text|csv|parquet|json|jsonl" or None, ttl=None) -> str | dict | list | pd.DataFrame

Specify a path to the file and an input format. Optionally specify a TTL (time to live) for caching.

Valid values for input_format:

  • text returns a string
  • json returns a dict or list (depending on the JSON object); only one object per file is supported
  • csv, parquet, jsonl return a pandas DataFrame
  • None will attempt to infer the input format from the file extension of the path
  • Anything else (or an unrecognized inferred type) raises a ValueError

conn = st.experimental_connection("s3", type=FilesConnection)
df = conn.read("my-s3-bucket/path/to/file.parquet", input_format='parquet')
st.dataframe(df)
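For example, to cache reads from a (hypothetical) bucket for ten minutes, assuming ttl follows the st.cache_data convention of seconds:

# Re-read from the bucket at most once every 600 seconds
df = conn.read("my-s3-bucket/path/to/file.csv", input_format='csv', ttl=600)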

Note: We want to add a format= argument to specify the output format with more options. Contributions welcome!

open()

conn.open("path/to/file", mode="rb", *args, **kwargs) -> Iterator[TextIOWrapper | AbstractBufferedFile]

Works just like fsspec AbstractFileSystem.open().
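A minimal sketch of writing a file through the connection and reading it back (the bucket name is hypothetical):

# Write a small text file to a bucket, then read it back
with conn.open("my-s3-bucket/hello.txt", "wt") as f:
    f.write("Hello from FilesConnection!")

with conn.open("my-s3-bucket/hello.txt", "rt") as f:
    st.write(f.read())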

fs

Use conn.fs to access the underlying FileSystem object API.
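Any fsspec filesystem method can be called directly on it; here is a sketch using ls() on a hypothetical bucket:

# List objects under a bucket prefix via the raw fsspec API
files = conn.fs.ls("my-s3-bucket/data")
st.write(files)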

Contributing

Contributions to this repo are welcome. The project is still taking shape, and we expect it to see some usage over time. We want to keep the API simple and avoid growing the maintenance surface area too much. If you are interested in helping to maintain it, please reach out. Thanks for your patience if it takes a few days to respond.

The best way to submit ideas you might want to work on is to open an issue and tag @sfc-gh-jcarroll and/or any other listed contributors. Please check with us before spending significant time on a PR; otherwise you risk the work being wasted, which is frustrating for everyone.

Also note that the Streamlit experimental_connection() interface is open to third-party packages, and we look forward to promoting high-quality ones in the ecosystem. If you have an idea that differs from our direction here, we would love for you to fork or clone this repo, build it, and share it with us and the wider community. Thank you!
