Local monitoring

To monitor SRCNet Local services, one option is to deploy a monitoring stack based on Grafana (to view the metrics visually), Prometheus (to store the metrics database) and blackbox exporter (to probe the services and expose the resulting metrics). With this deployment, it is possible to monitor both the SRC services that are exposed externally and those that are only exposed locally.

Deployment of Exporter, Prometheus and Grafana

Docker

This deployment is based on docker-compose. Before starting, docker and docker-compose must be installed on the system that will act as the central point for collecting metrics. This system must have access to the exposed local and external services.

We need to create the following configuration files:

docker-compose.yaml

With the following content:

version: '3.7'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus/data:/prometheus
    ports:
      - "EXTERNAL_PORT1:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=1y'

  grafana:
    #image: grafana/grafana:latest
    image: grafana/grafana-enterprise:11.2.0-ubuntu
    container_name: grafana
    environment:
      - GF_PATHS_PROVISIONING=/etc/grafana/provisioning
      - GF_SERVER_ROOT_URL=<SERVER ROOT>
    volumes:
      - ./grafana/grafana.ini:/etc/grafana/grafana.ini
      - ./grafana/data:/var/lib/grafana
    ports:
      - "EXTERNAL_PORT2:3000"

  blackbox_exporter:
    image: prom/blackbox-exporter:latest
    container_name: blackbox_exporter
    volumes:
      - ./blackbox_exporter/blackbox.yml:/etc/blackbox_exporter/config.yml
    ports:
      - "EXTERNAL_PORT3:9115"
    command:
      - '--config.file=/etc/blackbox_exporter/config.yml'

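Replace <SERVER ROOT> with the URL of your Grafana service, and EXTERNAL_PORT1, EXTERNAL_PORT2 and EXTERNAL_PORT3 with the host ports on which each service should be exposed.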

prometheus/prometheus.yml

With the following content:

global:
  scrape_interval: 120s

scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      # Prometheus scrapes itself from inside its container, where it listens on 9090
      - targets: ['localhost:9090']

  - job_name: 'blackbox'
    metrics_path: /probe
    params:
      module: [http_2xx]
    static_configs:
      - targets: # ADD HERE THE URLS OF ALL THE SERVICES YOU WANT TO MONITOR
          - https://spsrc25.iaa.csic.es
          - <OTHER SERVICES>
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: blackbox_exporter:9115

Change or add the services you want to monitor with an HTTP 2xx check. This example uses https://spsrc25.iaa.csic.es, but any local or external service can be monitored by adding its URL in place of <OTHER SERVICES>.
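To make the relabelling in this configuration more concrete, the following Python sketch builds the URL that Prometheus effectively requests from blackbox exporter for each target: the original target becomes the target query parameter, and the request itself is sent to blackbox_exporter:9115/probe. The helper name probe_url is hypothetical and only used for illustration; the host, port and module values are the ones from the configuration above.

```python
from urllib.parse import urlencode


def probe_url(target, module="http_2xx", exporter="blackbox_exporter:9115"):
    """Return the blackbox exporter URL that probes `target` with `module`."""
    # The relabel rules copy the original target into the `target` query
    # parameter and redirect the scrape itself to the exporter address.
    query = urlencode({"module": module, "target": target})
    return f"http://{exporter}/probe?{query}"


if __name__ == "__main__":
    print(probe_url("https://spsrc25.iaa.csic.es"))
```

From the Docker host, the same probe can be tried by hand by replacing blackbox_exporter:9115 with localhost and the port chosen for EXTERNAL_PORT3.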

grafana/grafana.ini

[server]
# Uncomment if you need to use a specific port
#http_port = 3000
root_url = <SERVICE URL>

[security]
admin_user = admin
admin_password = <YOUR_PASSWORD>

[auth]
oauth_allow_insecure_email_lookup=true

# The following configuration is to use the SKAO IAM service
[auth.generic_oauth]
enabled = true
# Name shown on the button in the login page
name = "IAM Provider"
# Allow new users to register
allow_sign_up = true
# IAM client ID
client_id = <CLIENT ID>
# IAM client secret
client_secret = <CLIENT SECRET>
# Scopes
scopes = openid profile email
auth_url = https://ska-iam.stfc.ac.uk/authorize
token_url = https://ska-iam.stfc.ac.uk/token
api_url = https://ska-iam.stfc.ac.uk/userinfo
redirect_uri = https://your.domain.com/login/generic_oauth

Replace <SERVICE URL> with the URL of your Grafana service endpoint. Set a password for the admin account in <YOUR_PASSWORD>, and set <CLIENT ID> and <CLIENT SECRET> to the values of an SKA IAM client that you have previously created.
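Since the three files contain several placeholders, it is easy to leave one unreplaced. The following Python sketch scans a configuration text for anything that still looks like <SOME NAME>. The helper name find_placeholders and the placeholder pattern are assumptions made for this illustration, based on how placeholders are written in this guide.

```python
import re

# Matches placeholders written as <UPPERCASE WORDS>, e.g. <CLIENT ID> or <YOUR_PASSWORD>
PLACEHOLDER = re.compile(r"<[A-Z][A-Z _]*>")


def find_placeholders(text):
    """Return (line_number, placeholder) pairs still present in `text`."""
    found = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            found.append((number, match.group(0)))
    return found


if __name__ == "__main__":
    sample = "root_url = <SERVICE URL>\nadmin_user = admin\nclient_id = <CLIENT ID>\n"
    for number, placeholder in find_placeholders(sample):
        print(f"line {number}: {placeholder} has not been replaced")
```

The same function can be applied to the contents of docker-compose.yaml and prometheus/prometheus.yml before starting the deployment.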

Finally, run:

docker compose up -d
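Once the containers are running, one way to confirm that the probes work is to ask Prometheus for the probe_success metric, which blackbox exporter sets to 1 when a probe succeeds and to 0 otherwise. The sketch below uses the instant-query endpoint of the Prometheus HTTP API (/api/v1/query). The base URL http://localhost:9090 is an assumption and should be adapted to the host and the port chosen for EXTERNAL_PORT1; the helper names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(base_url, expression):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': expression})}"


def probe_status(payload):
    """Map each instance label to True (probe succeeded) or False."""
    results = payload["data"]["result"]
    # Each sample value is a [timestamp, "value"] pair; the value is a string.
    return {r["metric"].get("instance", "?"): r["value"][1] == "1" for r in results}


def fetch_probe_status(base_url="http://localhost:9090"):
    """Query a running Prometheus; the base URL is an assumption to adapt."""
    with urlopen(query_url(base_url, "probe_success"), timeout=10) as response:
        return probe_status(json.load(response))


if __name__ == "__main__":
    # Illustrative payload in the shape returned by /api/v1/query, not real data
    sample = {"status": "success", "data": {"resultType": "vector", "result": [
        {"metric": {"instance": "https://spsrc25.iaa.csic.es"}, "value": [0, "1"]}]}}
    print(probe_status(sample))
```

Calling fetch_probe_status() against the running deployment should return one entry per target listed in prometheus/prometheus.yml, after at least one scrape interval has elapsed.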

To visualise the metrics of the exposed services in Grafana, install a Grafana dashboard that provides a predefined interface for this purpose. A dashboard can be selected to test and check the metrics:

TBD Node Exporter Deployment (nodes)

As we have seen, the solution implemented above is based on blackbox exporter, which is in charge of reading the state of each service from the outside. To monitor the machines themselves, both physical and virtual, the same solution can be deployed based on node exporter instead.

We need to create the following configuration files:

docker-compose.yaml

With the following content:

version: '3.7'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    volumes:
      - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml
      - ./prometheus/data:/prometheus
    ports:
      - "EXTERNAL_PORT1:9090"
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
      - '--storage.tsdb.retention.time=1y'

  grafana:
    #image: grafana/grafana:latest
    image: grafana/grafana-enterprise:11.2.0-ubuntu
    container_name: grafana
    environment:
      - GF_PATHS_PROVISIONING=/etc/grafana/provisioning
      - GF_SERVER_ROOT_URL=<SERVER ROOT>
    volumes:
      - ./grafana/grafana.ini:/etc/grafana/grafana.ini
      - ./grafana/data:/var/lib/grafana
    ports:
      - "EXTERNAL_PORT2:3000"

Replace <SERVER ROOT> with the URL of your Grafana service.

prometheus/prometheus.yml

With the following content:

global:
  evaluation_interval: 1m
  scrape_interval: 1m
  scrape_timeout: 1m
  external_labels:
    environment: <SERVER ROOT>

scrape_configs:
  - job_name: 'prometheus-node-monitoring'
    static_configs:
      # Prometheus scrapes itself from inside its container, where it listens on 9090
      - targets: ['localhost:9090']

  - job_name: 'vm2monitor'
    metrics_path: /metrics
    params:
      module: [Mod_to_use]
    static_configs:
      - targets: # ADD HERE THE URLS OF ALL THE SERVICES YOU WANT TO MONITOR
          - https://spsrc25.iaa.csic.es
          - <OTHER SERVICES>
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
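Node exporter serves its metrics in the Prometheus text exposition format on the /metrics path (by default on port 9100). As a rough illustration of what Prometheus collects from each machine, the following Python sketch parses the simplest kind of sample, a metric name followed by a value, and skips comments and labelled series. The helper name parse_simple_metrics is hypothetical, and the sample text is illustrative only, not a real measurement.

```python
def parse_simple_metrics(text):
    """Parse unlabelled `name value` samples from Prometheus text format."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, HELP/TYPE comments, and labelled series such as name{...}
        if not line or line.startswith("#") or "{" in line:
            continue
        name, value = line.split()[:2]
        samples[name] = float(value)
    return samples


if __name__ == "__main__":
    # Illustrative text only, not real measurements
    sample = """# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.5
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""
    print(parse_simple_metrics(sample))
```

In a real deployment this parsing is done by Prometheus itself; the sketch is only meant to show the shape of the data behind the /metrics path configured above.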

Integration of Monitoring

Add Monitoring service to SRCNet Site Capabilities

To perform this operation, you first have to apply for membership of the following groups from your SKAO IAM account control panel, replacing <SRC> with the name of your SRC:

services/site-capabilities-api/roles/<SRC>/manager
services/site-capabilities-api/roles/<SRC>/viewer

For example:

services/site-capabilities-api/roles/SPSRC/manager
services/site-capabilities-api/roles/SPSRC/viewer
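The group names follow a fixed pattern, so they can be generated for any SRC. The short Python sketch below does this; the helper name site_capabilities_groups is hypothetical, and the pattern and role names are the ones listed above.

```python
def site_capabilities_groups(src, roles=("manager", "viewer")):
    """Return the IAM group paths to request for the given SRC name."""
    return [f"services/site-capabilities-api/roles/{src}/{role}" for role in roles]


if __name__ == "__main__":
    for group in site_capabilities_groups("SPSRC"):
        print(group)
```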

Once the request is accepted, you can add this new capability to the SRCNet Site Capabilities database by following the steps below.

1. Go to https://site-capabilities.srcdev.skao.int/api/v1/www/sites/add/<SRC> and change <SRC> to the name of your SRC (e.g. https://site-capabilities.srcdev.skao.int/api/v1/www/sites/add/ESSRC).
2. Log in with your SKAO-IAM credentials and you will see the message "You are now logged in".
3. Then go back to the same link as above: https://site-capabilities.srcdev.skao.int/api/v1/www/sites/add/<SRC>.
4. You will see a webpage to manage all the capabilities of your SRC.

Finally, add your monitoring information. To add this new capability, go to the Compute section and click the + icon at the end of the section. Then fill in the following fields:

  • Type: select monitoring

  • Prefix: e.g. https

  • Hostname: e.g. that of your SRC Grafana dashboard or another service

  • Port (if applicable)

  • Path (if applicable)

Finally, click Add at the bottom of the page to save this new service.
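As an illustration of how the Prefix, Hostname, Port and Path fields relate to the address of the monitoring service, the following Python sketch combines them in the usual prefix://hostname:port/path form. This convention is an assumption; how Site Capabilities stores or displays the entry is not described here. The helper name service_url and the hostname grafana.example.org are hypothetical.

```python
def service_url(prefix, hostname, port=None, path=""):
    """Combine the form fields into a single URL (assumed convention)."""
    url = f"{prefix}://{hostname}"
    if port:
        url += f":{port}"
    if path:
        url += "/" + path.lstrip("/")
    return url


if __name__ == "__main__":
    # grafana.example.org is a placeholder hostname
    print(service_url("https", "grafana.example.org", path="/dashboards"))
```

Checking that the resulting address opens in a browser before saving the entry helps to catch a wrong port or path early.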