
Proxy batch job requests to Kubernetes.

Project description

kbatch-proxy

A simple Kubernetes proxy that lets JupyterHub users make requests to the Kubernetes API without having direct access to it.

Motivation

We want kbatch users to be able to create Kubernetes Jobs, access logs, etc., but we

  1. don't want to grant them direct access to the Kubernetes API
  2. don't want to maintain a separate web application with any state that's independent of Kubernetes

Enter kbatch-proxy

Design

A simple FastAPI application that sits between kbatch users and the Kubernetes API. The kbatch-proxy application is expected to have access to the Kubernetes API, with permission to create namespaces, jobs, etc. It will often be run as a JupyterHub service.

Users will make requests to kbatch-proxy. Upon request we will

  1. Validate that the user is authenticated with JupyterHub (checking the Bearer token)
  2. Validate that data the user is submitting or requesting meets our security model
  3. Make the request to the Kubernetes API on behalf of the user
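
For illustration, a request from the kbatch client to the proxy carries the user's JupyterHub token and is otherwise an ordinary HTTP call. A rough sketch (the /services/kbatch/jobs/ path and hostname are illustrative assumptions, not a documented endpoint):

$ curl -H "Authorization: Bearer $JUPYTERHUB_API_TOKEN" \
    "https://<jupyterhub-url>/services/kbatch/jobs/"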

Security model

This security model remains to be proven effective, but the hope is to let users do whatever they want in their own namespace and nothing outside of it.

Container images

We provide container images at https://github.com/kbatch-dev/kbatch/pkgs/container/kbatch-proxy.

$ docker pull ghcr.io/kbatch-dev/kbatch-proxy:latest

Deployment

kbatch-proxy is most easily deployed as a JupyterHub service using Helm. A few values need to be configured:

# file: config.yaml
app:
  jupyterhub_api_token: "<jupyterhub-api-token>"
  jupyterhub_api_url: "https://<jupyterhub-url>/hub/api/"
  extra_env:
    KBATCH_PREFIX: "/services/kbatch"

# image:
#   tag: "0.1.4"  # you likely want to pin this to the latest release

Note: we don't currently publish a helm chart, so you have to git clone the kbatch repository.
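
For example (the clone URL is the kbatch GitHub repository; adjust the destination to taste):

$ git clone https://github.com/kbatch-dev/kbatch.git
$ cd kbatch/kbatch-proxy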

From the kbatch/kbatch-proxy directory, use Helm to install the chart:

$ helm upgrade --install kbatch-proxy ../helm/kbatch-proxy/ \
    -n "<namespace>" \
    -f config.yaml

You'll need to configure kbatch as a JupyterHub service. This example makes it available at /services/kbatch (this should match KBATCH_PREFIX above):

jupyterhub:
  hub:
    services:
      kbatch:
        admin: true
        api_token: "<jupyterhub-api-token>"  # match the api token above
        url: "http://kbatch-proxy.<kbatch-namespace>.svc.cluster.local"

That example relies on kbatch being deployed to the same Kubernetes cluster as JupyterHub, so JupyterHub can proxy requests to kbatch-proxy using Kubernetes' internal DNS. The namespace in that URL should match the namespace where kbatch was deployed.
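
To sanity-check the deployment, you can confirm the Service is up in that namespace (the kbatch-proxy service name follows from the URL above; adjust if your release is named differently):

$ kubectl get svc kbatch-proxy -n "<kbatch-namespace>"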

Dask Gateway Integration

If your JupyterHub is deployed with Dask Gateway, you might want to set a few additional environment variables on the jobs so that they behave similarly to the singleuser notebook pod.

app:
  extra_env:
    KBATCH_JOB_EXTRA_ENV: |
      {
        "DASK_GATEWAY__AUTH__TYPE": "jupyterhub",
        "DASK_GATEWAY__CLUSTER__OPTIONS__IMAGE": "{JUPYTER_IMAGE_SPEC}",
        "DASK_GATEWAY__ADDRESS":  "https://<JUPYTERHUB_URL>/services/dask-gateway",
        "DASK_GATEWAY__PROXY_ADDRESS": "gateway://<DASK_GATEWAY_ADDRESS>:80"
      }

Development setup

We don't have a fully working Dockerized setup, since we (i.e. Tom) don't know how to run Kubernetes within Docker. So the current setup relies on

  1. k3d for Kubernetes
  2. JupyterHub as a regular Python process
  3. kbatch-proxy as a regular Python process

Create a cluster

$ k3d cluster create ksubmit
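
k3d updates your kubeconfig by default, so kubectl should point at the new cluster right away:

$ kubectl get nodes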

Create a Hub

Make sure configurable-http-proxy is installed first.
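
If it isn't installed yet, a global npm install should work (assuming Node.js and npm are available):

$ npm install -g configurable-http-proxy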

$ cd hub
$ jupyterhub

Start kbatch-proxy

KBATCH_PREFIX=/services/kbatch \
  KBATCH_PROFILE_FILE=tests/profile_template.yaml \
  JUPYTERHUB_API_TOKEN=super-secret \
  JUPYTERHUB_API_URL=http://127.0.0.1:8000/hub/api \
  JUPYTERHUB_HOST=http://127.0.0.1:8000 \
  uvicorn kbatch_proxy.main:app --reload --port=8050

You might want to log in and create a token at http://localhost:8000/hub/token, then configure the kbatch client with that token.
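
For example, something along these lines (the exact kbatch configure flags depend on your kbatch client version, and the URL assumes the uvicorn port and KBATCH_PREFIX used above):

$ kbatch configure --kbatch-url=http://127.0.0.1:8050/services/kbatch --token="<your-token>"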

Download files

Download the file for your platform.

Source Distribution

kbatch-proxy-0.3.2.tar.gz (12.9 kB, Source)

Built Distribution

kbatch_proxy-0.3.2-py3-none-any.whl (13.0 kB, Python 3)

File details

Details for the file kbatch-proxy-0.3.2.tar.gz.

File metadata

  • Download URL: kbatch-proxy-0.3.2.tar.gz
  • Upload date:
  • Size: 12.9 kB
  • Tags: Source
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.10.1

File hashes

Hashes for kbatch-proxy-0.3.2.tar.gz

  • SHA256: c293dbbdc93f1906b753004411b83f0a1a476df5442564401fe2ff3f59179f78
  • MD5: 74493013ce3d83c65872eb0eb5c26b67
  • BLAKE2b-256: 70d239869fdc5ab452bf1c450060f8d8c649a2229adf93d8105cb1166ebccf38


File details

Details for the file kbatch_proxy-0.3.2-py3-none-any.whl.

File metadata

  • Download URL: kbatch_proxy-0.3.2-py3-none-any.whl
  • Upload date:
  • Size: 13.0 kB
  • Tags: Python 3
  • Uploaded using Trusted Publishing? No
  • Uploaded via: twine/3.7.1 importlib_metadata/4.10.1 pkginfo/1.8.2 requests/2.27.1 requests-toolbelt/0.9.1 tqdm/4.62.3 CPython/3.10.1

File hashes

Hashes for kbatch_proxy-0.3.2-py3-none-any.whl

  • SHA256: 8e9a28f7d7b1b6608f36c68aa12808fceba90538be8e95b4744608df9f6fca26
  • MD5: 8de50fe2447a1b5b9244058443da6d31
  • BLAKE2b-256: f77a4cdd310c7b0cc431c4f4859dc46d23db6daf727420fbe06188430c2efe87

