Install on a Kubernetes Cluster¶
Here we provide instructions for installing and configuring
dask-gateway-server on a Kubernetes Cluster.
Create a Kubernetes Cluster (optional)¶
If you don’t already have a cluster running, you’ll want to create one. There are plenty of guides online for how to do this. We recommend following the excellent documentation provided by zero-to-jupyterhub-k8s.
Install Helm (optional)¶
If you don’t already have Helm installed, you’ll need to install it locally
and ensure tiller is running on your cluster. As above, there are
plenty of guides online for doing this. We recommend following the guide
provided by zero-to-jupyterhub-k8s.
At this point you should have a Kubernetes cluster with Helm installed and configured. You are now ready to install Dask-Gateway on your cluster.
Add the Helm Chart Repository¶
To avoid downloading the chart locally from GitHub, you can use the Dask-Gateway Helm chart repository.
$ helm repo add dask-gateway https://dask.org/dask-gateway-helm-repo/
$ helm repo update
The Helm chart provides access to configure most aspects of the
dask-gateway-server. These are provided via a configuration YAML file (the
name of this file doesn’t matter; we’ll use config.yaml).
At a minimum, you’ll need to set a value for gateway.proxyToken. This is a
random hex string representing 32 bytes, used as a security token between the
gateway and its proxies. You can generate this using:
$ openssl rand -hex 32
Write the following into a new file config.yaml, replacing <RANDOM TOKEN>
with the output of the previous command.
gateway:
  proxyToken: "<RANDOM TOKEN>"
The Helm chart exposes many more configuration values; see the default values.yaml file for more information.
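For readers who prefer to script this step, here is a minimal Python sketch of the same token generation. It assumes the standard-library `secrets` module as a stand-in for `openssl rand -hex 32`; the helper names `generate_proxy_token` and `render_config` are hypothetical and not part of Dask-Gateway.

```python
import secrets


def generate_proxy_token(nbytes=32):
    """Return a random hex string representing `nbytes` bytes (hypothetical helper)."""
    return secrets.token_hex(nbytes)


def render_config(token):
    """Render the minimal config.yaml contents described above (hypothetical helper)."""
    return f'gateway:\n  proxyToken: "{token}"\n'


if __name__ == "__main__":
    # Print the file contents; redirect to config.yaml if desired.
    print(render_config(generate_proxy_token()), end="")
```

A 32-byte token is 64 hexadecimal characters long, the same length that the `openssl` command produces.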
Install the Helm Chart¶
To install the Dask-Gateway Helm chart, run the following command:
RELEASE=dask-gateway
NAMESPACE=dask-gateway
VERSION=0.5.0

helm upgrade --install \
    --namespace $NAMESPACE \
    --version $VERSION \
    --values path/to/your/config.yaml \
    $RELEASE \
    dask-gateway/dask-gateway
RELEASE is the Helm release name to use (we suggest dask-gateway, but any release name is fine).
NAMESPACE is the Kubernetes namespace to install the gateway into (we suggest dask-gateway, but any namespace is fine).
VERSION is the Helm chart version to use. To use the latest published version you can omit the --version flag entirely. See the Helm chart repository for an index of all available versions.
path/to/your/config.yaml is the path to your config.yaml file created above.
Running this command may take some time, as resources are created and images
are downloaded. When everything’s ready, running the following command will
show the EXTERNAL-IP addresses for all LoadBalancer services:
$ kubectl get service --namespace dask-gateway
NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE
scheduler-api-dask-gateway      ClusterIP      10.51.245.233   <none>           8001/TCP         6m54s
scheduler-public-dask-gateway   LoadBalancer   10.51.253.105   22.214.171.124   8786:31172/TCP   6m54s
web-api-dask-gateway            ClusterIP      10.51.250.11    <none>           8001/TCP         6m54s
web-public-dask-gateway         LoadBalancer   10.51.247.160   126.96.36.199    80:30304/TCP     6m54s
At this point, you have a fully running dask-gateway-server.
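When scripting a deployment, it can be handy to pull the external IPs out of the `kubectl get service` output automatically. The following is a minimal sketch that parses the plain-text table shown above. The function name `external_ips` is hypothetical, and the parsing assumes the default whitespace-separated column layout.

```python
def external_ips(kubectl_output):
    """Map service name -> EXTERNAL-IP for LoadBalancer rows of `kubectl get service` output."""
    ips = {}
    lines = kubectl_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        name, svc_type, _cluster_ip, external_ip = fields[:4]
        # Skip services that have no external address (or none assigned yet).
        if svc_type == "LoadBalancer" and external_ip not in ("<none>", "<pending>"):
            ips[name] = external_ip
    return ips


# Sample output copied from the listing above.
SAMPLE = """\
NAME                            TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)          AGE
scheduler-api-dask-gateway      ClusterIP      10.51.245.233   <none>           8001/TCP         6m54s
scheduler-public-dask-gateway   LoadBalancer   10.51.253.105   22.214.171.124   8786:31172/TCP   6m54s
web-api-dask-gateway            ClusterIP      10.51.250.11    <none>           8001/TCP         6m54s
web-public-dask-gateway         LoadBalancer   10.51.247.160   126.96.36.199    80:30304/TCP     6m54s
"""

if __name__ == "__main__":
    print(external_ips(SAMPLE))
```

In practice the input would come from running the `kubectl` command itself, for example via `subprocess.run`, rather than from a hard-coded string.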
Connecting to the gateway¶
To connect to the running dask-gateway-server, you’ll need the external
IPs from both the web-public-* and scheduler-public-* services above. The
web-public-* service provides access to API requests, and also proxies
out the Dask Dashboards. The scheduler-public-* service proxies TCP
traffic between Dask clients and schedulers.
To connect, create a dask_gateway.Gateway object, specifying both addresses
(the scheduler-public-* IP/port goes under proxy_address).
Using the same values as above:
>>> from dask_gateway import Gateway
>>> gateway = Gateway(
...     "http://126.96.36.199",
...     proxy_address="tls://22.214.171.124:8786"
... )
You should now be able to make API calls. Try
dask_gateway.Gateway.list_clusters(); this should return an empty list.
>>> gateway.list_clusters()
[]
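The two addresses passed to the Gateway object follow a simple pattern: `http://` plus the web-public IP, and `tls://` plus the scheduler-public IP and port. The sketch below builds them from the two IPs. The helper name `gateway_kwargs` is hypothetical, and it assumes the first argument of `Gateway` is named `address`.

```python
def gateway_kwargs(web_ip, scheduler_ip, scheduler_port=8786):
    """Build keyword arguments for dask_gateway.Gateway (hypothetical helper).

    Assumes the web-public service is served over plain HTTP on port 80 and
    the scheduler-public service uses TLS, as in the example above.
    """
    return {
        "address": f"http://{web_ip}",
        "proxy_address": f"tls://{scheduler_ip}:{scheduler_port}",
    }


if __name__ == "__main__":
    kwargs = gateway_kwargs("126.96.36.199", "22.214.171.124")
    print(kwargs)
    # With dask-gateway installed and a running deployment, one could then do:
    #   from dask_gateway import Gateway
    #   gateway = Gateway(**kwargs)
```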
Shutting everything down¶
When you’re done with the gateway, you’ll want to delete your deployment and
clean everything up. You can do this with
$ helm delete --purge $RELEASE
The Kubernetes API is large, and not all configuration fields you may want to set on scheduler/worker pods are directly exposed by the Helm chart. To address this, we provide a few extra*Config fields (for example, gateway.clusterManager.scheduler.extraPodConfig) for forwarding configuration directly to the underlying Kubernetes objects.
These allow configuring any unexposed fields on the pod/container for
schedulers and workers. Each takes a mapping of key-value pairs,
which is deep-merged with any settings set by dask-gateway itself (with
preference given to the extra*Config values). Note that keys should be
camelCase (rather than snake_case) to match those in the Kubernetes API.
For example, this can be useful for setting things like tolerations or node affinities on scheduler or worker pods. Here we configure a node anti-affinity for scheduler pods to avoid preemptible nodes:
gateway:
  clusterManager:
    scheduler:
      extraPodConfig:
        affinity:
          nodeAffinity:
            requiredDuringSchedulingIgnoredDuringExecution:
              nodeSelectorTerms:
                - matchExpressions:
                  - key: cloud.google.com/gke-preemptible
                    operator: DoesNotExist
For information on allowed fields, see the Kubernetes API documentation for the PodSpec and Container objects.
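To illustrate what "deep-merged, with preference given to the extra*Config values" means, here is a minimal sketch of a recursive merge. This is not dask-gateway's actual implementation. The function `deep_merge` and the sample settings are hypothetical, and the choice to replace (rather than concatenate) lists is an assumption made for this example.

```python
import copy


def deep_merge(base, override):
    """Recursively merge `override` into a copy of `base`; `override` wins on conflicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars and lists from `override` replace the base value (assumption).
            merged[key] = copy.deepcopy(value)
    return merged


# Hypothetical settings, not the chart's real defaults.
gateway_pod = {"restartPolicy": "Never", "securityContext": {"runAsUser": 1000}}
extra_pod_config = {
    "securityContext": {"fsGroup": 100},
    "tolerations": [{"operator": "Exists"}],
}

if __name__ == "__main__":
    print(deep_merge(gateway_pod, extra_pod_config))
```

Nested mappings such as `securityContext` keep keys from both sides, while a key present in both is taken from the extra config.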
Authenticating with JupyterHub¶
JupyterHub provides a multi-user interactive notebook environment. Through
the zero-to-jupyterhub-k8s project, many companies and institutions have set up
JupyterHub to run on Kubernetes. When deploying Dask-Gateway alongside
JupyterHub, you can configure Dask-Gateway to use JupyterHub for
authentication. To do this, we register dask-gateway as a JupyterHub
Service.
First we need to generate an API token. This is commonly done using openssl:
$ openssl rand -hex 32
Then add the following lines to your config.yaml file, replacing
<API TOKEN> with the output from above.
gateway:
  auth:
    type: jupyterhub
    jupyterhub:
      apiToken: "<API TOKEN>"
If you’re not deploying Dask-Gateway in the same cluster and namespace as
JupyterHub, you’ll also need to specify JupyterHub’s API URL. This is usually
of the form https://<JUPYTERHUB-HOST>:<JUPYTERHUB-PORT>/hub/api. If
JupyterHub and Dask-Gateway are on the same cluster and namespace you can omit
this configuration key; the address will be inferred automatically.
gateway:
  auth:
    type: jupyterhub
    jupyterhub:
      apiToken: "<API TOKEN>"
      apiUrl: "<API URL>"
You’ll also need to add the following to the config.yaml file for your
JupyterHub Helm chart, again replacing <API TOKEN> with the output from above.
hub:
  services:
    dask-gateway:
      apiToken: "<API TOKEN>"
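Because the same API token must appear in both configuration files, it may help to generate the two fragments from a single value. The sketch below builds both as plain Python dictionaries mirroring the YAML above. The function name `jupyterhub_auth_configs` is hypothetical, and serializing the result to YAML is left out so the example needs no third-party package.

```python
import secrets


def jupyterhub_auth_configs(api_token, api_url=None):
    """Return (dask-gateway config, JupyterHub config) sharing one API token."""
    jupyterhub = {"apiToken": api_token}
    if api_url is not None:
        # Only needed when JupyterHub is in a different cluster or namespace.
        jupyterhub["apiUrl"] = api_url
    gateway_config = {
        "gateway": {"auth": {"type": "jupyterhub", "jupyterhub": jupyterhub}}
    }
    hub_config = {"hub": {"services": {"dask-gateway": {"apiToken": api_token}}}}
    return gateway_config, hub_config


if __name__ == "__main__":
    token = secrets.token_hex(32)  # stand-in for `openssl rand -hex 32`
    gateway_config, hub_config = jupyterhub_auth_configs(token)
    print(gateway_config)
    print(hub_config)
```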
With this configuration, JupyterHub will be used to authenticate requests
between users and the dask-gateway-server. Note that users will need to add
auth="jupyterhub" when they create a Gateway object.
>>> from dask_gateway import Gateway
>>> gateway = Gateway(
...     "http://126.96.36.199",
...     proxy_address="tls://22.214.171.124:8786",
...     auth="jupyterhub",
... )