Gatekeeper

Note

Last modified by: Rob Barnsley
Last modified date: 14/11/2024
Support: https://skao.slack.com/archives/C05A1D147FV
Code Repository: https://gitlab.com/ska-telescope/src/src-dm/ska-src-dm-da-service-gatekeeper

Purpose

This service is an nginx reverse proxy that acts as a protected access point for firewalled compute services by first authorising requests via the Permissions API.

Prerequisites

A Kubernetes cluster
A new IAM client for service-to-service communication with the Site Capabilities API
Corresponding entries in the Site Capabilities API for the services to be proxied
User group membership for service access and, depending on the service being accessed, any additional groups it requires. For example, soda is expected to access data, so users will need to be members of the corresponding /data/namespaces/<namespace> group

Creating a new IAM Client

A new IAM client is required for service-to-service communication between the Gatekeeper and the Site Capabilities API. There should be a separate client per deployment.

Please follow the instructions at IAM Client Configuration for creating and managing an IAM client.

This new client requires the following grant types:

client_credentials

and the following scopes:

site-capabilities-api-service

Tip

Make a note of the resulting client ID and secret. These will be needed during the deployment process.

Hint

A token can be retrieved from this client by using the following command:

$ curl -XPOST https://ska-iam.stfc.ac.uk/token --data grant_type=client_credentials \
--data client_id=<client_id> --data client_secret=<client_secret> \
--data audience=site-capabilities-api

and decoded (client side) using jwt.io. This is a quick way to check that the client has been registered successfully and can obtain a token with the required scope.
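As an alternative to pasting the token into jwt.io, the claims can be inspected locally. The following is a minimal sketch and not part of the Gatekeeper codebase: it assumes the token is a standard three-part JWT and that granted scopes appear in a space-separated "scope" claim. The helper names decode_jwt_payload and has_scope are hypothetical. The signature is not verified, so this is suitable for inspection only.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated parts")
    payload = parts[1]
    # base64url payloads omit padding; restore it before decoding
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def has_scope(claims: dict, scope: str) -> bool:
    """Assumption: scopes are held in a space-separated 'scope' claim."""
    return scope in claims.get("scope", "").split()
```

The token string is taken from the access_token field of the JSON response returned by the curl command above, after which has_scope(decode_jwt_payload(token), "site-capabilities-api-service") should be true for a correctly registered client.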

Adding Services to the Site Capabilities API

For each firewalled service to be proxied by the Gatekeeper, a corresponding entry must exist in the Site Capabilities database.

Please follow the instructions at Site Capabilities for creating and managing a site in the Site Capabilities database. Of relevance here is the Compute > Associated local services section.

Attention

A separate service entry must exist in the section for each service that the Gatekeeper is proxying, not just a single service for the Gatekeeper.

For each service in this section, the following fields should be populated. Note that the prefix, hostname, port and path all point toward the Gatekeeper service, as this is the entry point for the firewalled service that it is proxying requests to.

Type: Select whatever type of service the Gatekeeper is proxying, e.g. echo, soda_sync
Prefix: Set to the prefix of the (externally exposed) gatekeeper service, e.g. https
Hostname: Set to the hostname of the (externally exposed) gatekeeper service, e.g. gatekeeper.srcdev.skao.int
Port: Set to the port of the (externally exposed) gatekeeper service, e.g.
Path: Set to the request path, e.g. /echo, /soda, corresponding to the request route that the Gatekeeper is proxying
Via proxy: Set to enabled
Mandatory: Set to enabled if the service is mandatory

The page should now be reloaded. A uuid should have been generated for this service and will appear in the ID box of the corresponding service’s tab in Compute > Associated local services.

Tip

Make a note of this uuid; it will be needed during the deployment process.
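To illustrate how the Prefix, Hostname, Port and Path fields combine into the externally visible entry point, here is a small sketch. It is an assumption about how a consumer of the catalogue might join the fields, not a description of the Site Capabilities API itself. The function name service_url is hypothetical, and the port 443 in the usage note is an illustrative value only.

```python
def service_url(prefix: str, hostname: str, port: int, path: str) -> str:
    """Join the service entry fields into a single URL (illustrative only)."""
    # Accept the prefix written either as "https" or as "https://"
    scheme = prefix.split("://")[0]
    # Accept the path written either as "echo" or as "/echo"
    path = "/" + path.lstrip("/")
    return f"{scheme}://{hostname}:{port}{path}"
```

For example, service_url("https", "gatekeeper.srcdev.skao.int", 443, "echo") gives a URL that points at the Gatekeeper rather than at the firewalled backend, which is the intent of the field values described above.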

Adding users to the required groups for access

Access to services is managed through the Permissions API, which in turn checks IAM user group membership. To use a proxied service, the user must be a member of the following group:

services/<src>/<service_type>

where <src> is one of the formally recognised names in the SRC naming scheme and <service_type> corresponds to the Type field in the Compute > Associated local services section of the Site Capabilities API (or alternatively see here for the list).

Caution

Depending on which service is being proxied, additional groups may be required, e.g. for services that access data in a particular <namespace>, data/namespace/<namespace> membership will be required. For more information, please refer to the corresponding service’s documentation.
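The group requirement can be checked ahead of time against a user's IAM group list. The sketch below is illustrative only: the function names service_group and missing_groups are hypothetical, and it assumes that group names may be compared after stripping leading and trailing slashes. It does not call the Permissions API, which remains the authority on access.

```python
def service_group(src: str, service_type: str) -> str:
    """Build the service access group name described above."""
    return f"services/{src}/{service_type}"


def missing_groups(user_groups, required):
    """Return the required groups that the user does not hold."""
    # Assumption: "/a/b", "a/b" and "a/b/" all name the same group
    def norm(group):
        return group.strip("/")

    held = {norm(g) for g in user_groups}
    return [g for g in required if norm(g) not in held]
```

Any additional groups a service needs can simply be appended to the required list before calling missing_groups; an empty result means the user holds every listed group.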

Deployment

With docker-compose

Deployment with docker-compose is possible, but is expected to be useful only for development purposes. As such, instructions for this are currently kept only at the code repository linked in the note at the top of this page.

With Helm

Deployment via Helm follows the standard procedure for Helm charts. A packaged Helm chart is not currently available, so the code repository must be cloned first and the chart installed from it:

$ git clone https://gitlab.com/ska-telescope/src/src-dm/ska-src-dm-da-service-gatekeeper
$ cd ska-src-dm-da-service-gatekeeper
$ helm upgrade --install --create-namespace -n service-gatekeeper --values values.yaml service-gatekeeper etc/helm/

An example values.yaml.template is included in the repository. Included in the deployment is an “echo” service, which can be used to verify that the Gatekeeper service has been deployed successfully; it is a dummy backend that responds to requests with the request parameters and body.

The key values an operator may wish/need to configure are:

deployment_echo:
  namespace: service-gatekeeper-echo
  image: harbor.srcdev.skao.int/ska-src-dm-da-service-gatekeeper/service-gatekeeper-echo:1.0.1

gatekeeper:
  namespace: service-gatekeeper
  ingress_proxyBodySize: 5000m
  ingress_proxyBuffering: "off"
  ingress_proxyRequestBuffering: "off"
  iam_token_endpoint: https://ska-iam.stfc.ac.uk/token
  permissions_api_plugin_authz_endpoint: https://permissions.srcdev.skao.int/api/v1/authorise/plugin/
  site_capabilities_api_get_service_by_id_endpoint: https://site-capabilities.srcdev.skao.int/api/v1/services/
  site_capabilities_gatekeeper_client_id:         # the client id for site-capabilities gatekeeper service client for this node
  site_capabilities_gatekeeper_client_secret:     # the client secret for site-capabilities gatekeeper service client for this node
  site_capabilities_gatekeeper_client_scopes: site-capabilities-api-service
  site_capabilities_gatekeeper_client_audience: site-capabilities-api
  services_cache_ttl: 3600
  services:
    - route: "/echo"                          # request route
      namespace: service-gatekeeper-echo      # namespace the service will run in, can be different to gatekeeper ns
      prefix: "http://"                       # usually http:// assuming SSL termination occurs upstream
      service_name: "service-gatekeeper-echo" # to proxied address
      ingress_host: ""                        # Host domain the Ingress rules will apply to
      port: 8080
      uuid: ""                                # autogenerated by the site capabilities catalogue; a corresponding entry must exist in iam

The gatekeeper section holds the settings required to configure the Gatekeeper service itself. This includes the site_capabilities_gatekeeper_client_ settings, which allow the Gatekeeper to retrieve up-to-date service information, and information about the services to be protected. These services should be added to the gatekeeper.services section. They are used both to configure routes in the nginx service and to create an Ingress resource for each protected service.

The ingress settings will be applied to all of these Ingress resources created on deployment.
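Before deploying, it can help to sanity-check each entry in gatekeeper.services. The following is a hedged sketch rather than anything shipped with the chart: check_service is a hypothetical helper, the set of keys it treats as required is an assumption based on the example above (ingress_host is left optional because the example leaves it empty), and it expects each entry as a plain dict, however the YAML was loaded.

```python
import uuid

# Assumption: keys taken from the example values above
REQUIRED_KEYS = ("route", "namespace", "prefix", "service_name", "port", "uuid")


def check_service(entry: dict) -> list:
    """Return a list of human-readable problems found in one services entry."""
    problems = []
    for key in REQUIRED_KEYS:
        if entry.get(key) in (None, ""):
            problems.append(f"missing value for '{key}'")
    route = entry.get("route") or ""
    if route and not route.startswith("/"):
        problems.append("route should start with '/'")
    raw_uuid = entry.get("uuid") or ""
    if raw_uuid:
        try:
            uuid.UUID(str(raw_uuid))
        except ValueError:
            problems.append("uuid is not a valid UUID")
    port = entry.get("port")
    if port not in (None, "") and not (isinstance(port, int) and 0 < port < 65536):
        problems.append("port should be an integer between 1 and 65535")
    return problems
```

An empty list means no problems were detected. Note that the echo entry in the example would be reported as missing its uuid until the value generated by the Site Capabilities catalogue is filled in.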

Notes

Deployment assumes that a separate (nginx) ingress controller will be created, as the logic to perform the permissions checking has been made part of the controller. It is recognised that this would naturally lead to the creation of another load balancer. Deployments that do not want another load balancer are advised to route traffic into this controller from their existing controller.

Testing

To test the Gatekeeper service with the echo backend, you must first get a user token. This can be retrieved using the ska-src-clients Python package with the following command to log in:

$ srcnet-oper token request

following the on-screen prompts and then:

$ srcnet-oper token get authn-api

to get the token. Alternatively, one can use oidc-agent but the usage of this tool is not within the remit of this documentation.

This token can then be used in the Authorization header of the request to the echo service, e.g.

$ curl "https://<gatekeeper_hostname>:<gatekeeper_port>/echo?hello=world" -H "Authorization: Bearer <token>"

where <gatekeeper_hostname> and <gatekeeper_port> are the hostname and port of the Gatekeeper service respectively, and <token> is the token output by the above command.

The successful result of this command should be:

{"request_uri":"/?hello=world","request_parameters":{"hello":"world"}
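The same request can be made from Python using only the standard library. This is a minimal sketch under stated assumptions: build_echo_request is a hypothetical helper, the Gatekeeper is assumed to be served over https, and the hostname, port and token must be supplied by the caller.

```python
import json
import urllib.parse
import urllib.request


def build_echo_request(hostname: str, port: int, token: str, params: dict):
    """Prepare an authorised GET request for the /echo route."""
    query = urllib.parse.urlencode(params)
    url = f"https://{hostname}:{port}/echo?{query}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})


def call_echo(request) -> dict:
    """Send the request and parse the JSON body returned by the echo backend."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling call_echo(build_echo_request("<gatekeeper_hostname>", <gatekeeper_port>, "<token>", {"hello": "world"})) should return the echoed request details. If authorisation fails, urlopen raises urllib.error.HTTPError carrying the response status code.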