Exposing Cluster Options

By default, cluster configuration (e.g. worker memory, Docker image, etc.) is set statically by an administrator in the server configuration file. To allow users to change certain parameters when creating a new cluster, an administrator must explicitly expose them in the configuration.

User Experience

On the user side, exposing options allows users to:

  • Query which options are available using Gateway.cluster_options:

>>> options = gateway.cluster_options()
>>> options
Options<worker_cores=1, worker_memory=1.0, environment='basic'>

  • Specify overrides for the exposed fields as keyword arguments to Gateway.new_cluster or GatewayCluster:

# Using Gateway.new_cluster
>>> cluster = gateway.new_cluster(worker_cores=2, environment="tensorflow")

# Or using the GatewayCluster constructor
>>> cluster = GatewayCluster(worker_cores=2, environment="tensorflow")

  • If working in a notebook, use the ipywidgets-based GUI to configure a cluster.

Figure: Cluster options widget

See Configure a cluster for more information on the user experience.

Server Configuration

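Options are exposed to the user by setting c.Backend.cluster_options. This configuration field takes either a dask_gateway_server.options.Options object, or a function that receives a dask_gateway_server.models.User object and returns an Options object (see Different Options per User Group).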

Options(*fields[, handler])

A declarative specification of exposed cluster options.

A dask_gateway_server.options.Options object takes two arguments:

  • *fields: One or more dask_gateway_server.options.Field objects, which provide a typed declarative specification of each user-facing option.

  • handler: An optional handler function for translating the values set by those options into configuration values to set on the corresponding ClusterConfig. It should have the signature handler(options) or handler(options, user), where options is the validated dict of user options, and user is a User model for that user.

Field objects provide typed specifications for a user-facing option. There are several different Field classes available, each representing a different common type:

Integer(field[, default, min, max, label, …])

An integer field, with optional bounds.

Float(field[, default, min, max, label, target])

A float field, with optional bounds.

Bool(field[, default, label, target])

A boolean field.

String(field[, default, label, target])

A string field.

Select(field, options[, default, label, target])

A select field, allowing users to select between a few choices.

Mapping(field[, default, label, target])

A mapping field.

Each field supports the following standard parameters:

  • field: The field name to use. Must be a valid Python identifier. This will be the keyword users use to set this field programmatically (e.g. "worker_cores").

  • default: The default value if the user doesn’t specify this field.

  • label: A human-readable label that will be used in GUI representations (e.g. "Worker Cores"). Optional; if not provided, field is used.

  • target: The target key to set in the processed options dict. Must be a valid Python identifier. Optional; if not provided, field is used.

After validation (type, bounds, etc…), a dictionary of all options for a requested cluster is passed to a handler function. This function is run on the dask-gateway-server. Here any additional validation can be done (errors raised in the handler are forwarded to the user), as well as any conversion needed between the exposed option fields and configuration fields on the backing ClusterConfig. The default handler returns the provided options unchanged.
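As a minimal sketch of additional validation in a handler, the example below enforces a cross-field rule that per-field bounds cannot express on their own. The one-GiB-per-core policy and the MIN_GIB_PER_CORE name are hypothetical, and SimpleNamespace is only a local stand-in for the validated options object that dask-gateway-server passes to the handler.

```python
from types import SimpleNamespace

# Hypothetical policy value, not a dask-gateway setting
MIN_GIB_PER_CORE = 1


def options_handler(options):
    # Cross-field rule: require at least MIN_GIB_PER_CORE GiB of memory per core
    if options.worker_memory < options.worker_cores * MIN_GIB_PER_CORE:
        raise ValueError(
            f"worker_memory must be at least {MIN_GIB_PER_CORE} GiB per core"
        )
    # Convert the exposed fields into configuration overrides (GiB to bytes)
    return {
        "worker_cores": options.worker_cores,
        "worker_memory": int(options.worker_memory * 2 ** 30),
    }


# Local stand-in for a valid request: 2 cores with 4 GiB of memory
ok = options_handler(SimpleNamespace(worker_cores=2, worker_memory=4.0))

# Local stand-in for an invalid request: 4 cores with only 2 GiB of memory
try:
    options_handler(SimpleNamespace(worker_cores=4, worker_memory=2.0))
    rejected = False
except ValueError:
    rejected = True
```

When such a handler runs on the server, the ValueError is the kind of error that, as described above, is forwarded to the user who requested the cluster.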

Available options are backend specific. For example, if running on Kubernetes, an options handler can return overrides for any configuration fields on KubeClusterConfig. See Cluster Backends for information on what cluster configuration fields are available for your backend.

Examples

Worker Cores and Memory

Here we expose options for users to configure c.ClusterConfig.worker_cores and c.ClusterConfig.worker_memory. We set bounds on each resource to prevent users from requesting too large a worker. The handler is used to convert the user-specified memory from GiB to bytes (as expected by c.ClusterConfig.worker_memory).

from dask_gateway_server.options import Options, Integer, Float

def options_handler(options):
    return {
        "worker_cores": options.worker_cores,
        "worker_memory": int(options.worker_memory * 2 ** 30),
    }

c.Backend.cluster_options = Options(
    Integer("worker_cores", default=1, min=1, max=4, label="Worker Cores"),
    Float("worker_memory", default=1, min=1, max=8, label="Worker Memory (GiB)"),
    handler=options_handler,
)
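To make the GiB-to-bytes conversion concrete, the handler can be called directly outside the server. This is only a sketch: SimpleNamespace stands in for the validated options object, and the GIB constant is a name introduced here for readability.

```python
from types import SimpleNamespace

GIB = 2 ** 30  # bytes in one GiB (name introduced for this sketch)


def options_handler(options):
    return {
        "worker_cores": options.worker_cores,
        "worker_memory": int(options.worker_memory * GIB),
    }


# Local stand-in for a user requesting 2 cores and 1.5 GiB per worker
request = SimpleNamespace(worker_cores=2, worker_memory=1.5)
overrides = options_handler(request)
print(overrides)  # {'worker_cores': 2, 'worker_memory': 1610612736}
```

Because the Float field accepts fractional values, the int(...) call matters: it keeps the resulting byte count an integer.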

Cluster Profiles

Instead of exposing individual options, you may wish to expose “profiles”: user-friendly names for common groups of options. For example, here we provide three cluster profiles (small, medium, and large) that a user can select from.

from dask_gateway_server.options import Options, Select

# A mapping from profile name to configuration overrides
profiles = {
    "small": {"worker_cores": 2, "worker_memory": "4 G"},
    "medium": {"worker_cores": 4, "worker_memory": "8 G"},
    "large": {"worker_cores": 8, "worker_memory": "16 G"},
}

# Expose `profile` as an option, valid values are 'small', 'medium', or
# 'large'. A handler is used to convert the profile name to the
# corresponding configuration overrides.
c.Backend.cluster_options = Options(
    Select(
        "profile",
        ["small", "medium", "large"],
        default="medium",
        label="Cluster Profile",
    ),
    handler=lambda options: profiles[options.profile],
)

Different Options per User Group

Cluster options may be configured to differ based on the user by providing a function for c.Backend.cluster_options. This function receives a dask_gateway_server.models.User object and should return a dask_gateway_server.options.Options object. It may optionally be an async function.

Similar to the last examples, here we expose options for users to configure c.ClusterConfig.worker_cores and c.ClusterConfig.worker_memory. However, we offer different ranges depending on whether or not the user is a member of the “power-users” group.

from dask_gateway_server.options import Options, Integer, Float

def options_handler(options):
    return {
        "worker_cores": options.worker_cores,
        "worker_memory": int(options.worker_memory * 2 ** 30),
    }

def generate_options(user):
    if "power-users" in user.groups:
        options = Options(
            Integer("worker_cores", default=1, min=1, max=8, label="Worker Cores"),
            Float("worker_memory", default=1, min=1, max=16, label="Worker Memory (GiB)"),
            handler=options_handler,
        )
    else:
        options = Options(
            Integer("worker_cores", default=1, min=1, max=4, label="Worker Cores"),
            Float("worker_memory", default=1, min=1, max=8, label="Worker Memory (GiB)"),
            handler=options_handler,
        )
    return options

c.Backend.cluster_options = generate_options

User-specific Configuration

Since the handler function can optionally take in the User object, you can use this to add user-specific configuration. Note that you don’t have to expose any configuration options to make use of this; the options handler is called regardless.

Here we configure the worker cores and memory based on the user’s groups:

from dask_gateway_server.options import Options

def options_handler(options, user):
    if "power-users" in user.groups:
        return {
            "worker_cores": 8,
            "worker_memory": "16 G"
        }
    else:
        return {
            "worker_cores": 4,
            "worker_memory": "8 G"
        }

c.Backend.cluster_options = Options(handler=options_handler)
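The profile and user-aware handler patterns can also be combined. The sketch below assumes a policy (hypothetical, as is the RESTRICTED_PROFILES name) under which only members of the “power-users” group may request the large profile. SimpleNamespace objects stand in for the options and User objects the server would pass to the handler.

```python
from types import SimpleNamespace

# A mapping from profile name to configuration overrides
profiles = {
    "small": {"worker_cores": 2, "worker_memory": "4 G"},
    "medium": {"worker_cores": 4, "worker_memory": "8 G"},
    "large": {"worker_cores": 8, "worker_memory": "16 G"},
}

# Hypothetical policy: profiles reserved for the "power-users" group
RESTRICTED_PROFILES = {"large"}


def options_handler(options, user):
    # Reject restricted profiles for users outside the privileged group
    if options.profile in RESTRICTED_PROFILES and "power-users" not in user.groups:
        raise ValueError(
            f"Profile {options.profile!r} is limited to the power-users group"
        )
    # Return a copy so callers cannot mutate the shared profiles mapping
    return dict(profiles[options.profile])


# Local stand-ins for the objects passed in by the server
power_user = SimpleNamespace(groups={"power-users"})
regular_user = SimpleNamespace(groups={"analysts"})

granted = options_handler(SimpleNamespace(profile="large"), power_user)

try:
    options_handler(SimpleNamespace(profile="large"), regular_user)
    denied = False
except ValueError:
    denied = True
```

In a server configuration, a handler like this would be passed as handler= to an Options object containing the Select field from the Cluster Profiles example.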