Rucio Storage Element
Note
Last modified by: Rob Barnsley
Last modified date: 21/11/2024
Code Repository (Rucio Client): https://gitlab.com/ska-telescope/src/src-dm/ska-src-dm-da-rucio-client
Purpose
A Rucio Storage Element (RSE) is Rucio’s logical abstraction of a storage system. It corresponds to the physical space in which SRCNet data will live.
Prerequisites
A new IAM client (depending on storage backend)
Corresponding entries in the Site Capabilities API for the storage, access protocols and areas to be deployed
Attention
For SRCNet v0.1 the storage backing at least one of the RSEs at each site should be POSIX compatible and accessible as a traditional file system (mountable). This is to ensure that integration with other services that assume POSIX access is possible.
Creating a new IAM Client
Depending on which storage backend is selected, it may be necessary to create a new IAM client.
Please follow the instructions at IAM Client Configuration for creating and managing an IAM client. The scopes, grant types and redirect URIs depend on the storage backend chosen; please refer to the backend’s documentation and select them accordingly.
Adding Storages, Protocols and Areas to the Site Capabilities API
Please follow the instructions at Site Capabilities for creating and managing a site in the Site Capabilities database. Of relevance here are the Storage, Storage > Supported Protocols and Storage > Areas sections.
First, create a new storage block and populate the following fields:
Hostname
: Set to the hostname of the storage backend, e.g. rucio.espsrc.iaa.csic.es
Base path
: Set to the base path that the storage is served from (if applicable), e.g. /disk
Latitude
: Set to the latitude of the storage site, e.g. 37.1342
Longitude
: Set to the longitude of the storage site, e.g. -3.613
SRM
: Set to the storage backend type, e.g. storm, xrd or ceph
Size of storage in TB
: Set to the storage capacity in TB, e.g. 10
Next add any supported protocols for this storage:
Prefix
: access protocol prefix, e.g. https
Port
: access protocol port, e.g. 443
Finally, add a storage area relating to the RSE:
Hint
Storage areas equate to paths on a particular storage. This can be a many-to-one mapping, i.e. there can be multiple RSEs (storage areas) per storage.
Type
: The storage area type, set this to rse
Relative path
: The path (relative to the base path above) to the RSE storage area
Identifier
: The name of this RSE as known by Rucio
Attention
It is critical that the identifier field here matches the name used by Rucio. Please ask if you are unsure about this.
Deployment
For deployment, you must first choose which backend implementation to use for your storage element. There is currently no preference here.
StoRM WebDAV: Manual, Docker (WebDav) and Helm (WebDav)
xrootd: Manual (xrootd)
Testing
Before requesting that the RSE be integrated into the datalake, a precursor connectivity test should be undertaken. This checks for common faults that will cause problems during integration.
For connectivity testing, the operator toolbox should be used.
Note
The operator toolbox requires Docker.
First, clone the repository:
git clone https://gitlab.com/ska-telescope/src/operations/ska-src-operator-toolbox.git
cd ska-src-operator-toolbox
Then export the following environment variables:
export RUCIO_CFG_ACCOUNT=<SKAO IAM username>
export ENDPOINT_URL=<full path to storage endpoint>
For example, if you want to test the ESPSRC RSE:
export RUCIO_CFG_ACCOUNT=mparra
export ENDPOINT_URL=https://spsrc14.iaa.csic.es:18027/disk
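Before running the make target, it can help to confirm that both variables are actually set in the current shell. The following is a minimal sketch only: `check_rse_env` is a hypothetical helper, not part of the operator toolbox, and the check that the endpoint contains a scheme is an assumption based on the example value above.

```shell
# Hypothetical helper (not part of the operator toolbox): confirm that the
# two variables exported above are set before starting the connectivity test.
check_rse_env() {
    rc=0
    if [ -z "${RUCIO_CFG_ACCOUNT:-}" ]; then
        echo "RUCIO_CFG_ACCOUNT is not set" >&2
        rc=1
    fi
    if [ -z "${ENDPOINT_URL:-}" ]; then
        echo "ENDPOINT_URL is not set" >&2
        rc=1
    else
        # Assumption: the endpoint is a full URL including a scheme,
        # as in the example (https://...).
        case "$ENDPOINT_URL" in
            *://*) ;;
            *) echo "ENDPOINT_URL has no scheme (e.g. https://)" >&2
               rc=1 ;;
        esac
    fi
    return "$rc"
}
```

Calling `check_rse_env` in the same shell returns a non-zero status and prints a message for each problem it finds.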
Finally, run:
make report-rse-connectivity
and follow the prompts to retrieve a token.
A report should be generated for the connectivity test. A successful test looks like:
BEGIN REPORT
============
report date: Thu Nov 14 14:48:56 UTC 2024
report endpoint: https://spsrc14.iaa.csic.es:18027/disk
Port open.
OpenSSL verification succeeded
gfal-ls check succeeded
gfal-copy check succeeded
gfal-sum succeeded
gfal-rm check succeeded
davix-ls check is disabled.
davix-put check is disabled.
davix-rm check is disabled.
rucio upload check is disabled.
==========
END REPORT
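When reviewing a report, note that checks marked as disabled have not been exercised at all. As a rough sketch, assuming the report text has been saved to a file, the hypothetical helper below counts succeeded and disabled checks. It relies only on the wording shown in the sample report; the wording of a failing check is not shown here, so the helper does not try to detect failures.

```shell
# Hypothetical helper: summarise a connectivity report read from stdin.
# Matches only the phrases seen in the sample report above.
summarise_report() {
    awk '
        /succeeded/   { ok++ }        # e.g. "gfal-ls check succeeded"
        /is disabled/ { skipped++ }   # e.g. "davix-ls check is disabled."
        END { printf "succeeded=%d disabled=%d\n", ok, skipped }
    '
}

# Usage, with report.txt as an assumed file name:
#   summarise_report < report.txt
```

For the sample report above this prints `succeeded=5 disabled=4`.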
At this point, you can proceed to requesting integration.
Integration
For the RSE to be recognised by the datalake, it must first be added as an endpoint to the Rucio server.
Request addition of RSE
Once the deployment tests are successful, you will need to contact the datalake operators in the Rucio Slack channel to request addition of your RSE to the SKAO datalake. Datalake operators require the following information about the access protocols for your RSE:
scheme
e.g. https
hostname
e.g. spsrc14.iaa.csic.es
path
e.g. /disk
port
e.g. 18027
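The example values suggest that these four fields are the components of the endpoint URL used in the connectivity test. The sketch below assumes a plain scheme://hostname:port/path concatenation; `build_endpoint_url` is a hypothetical helper for cross-checking the values before sending them to the operators.

```shell
# Hypothetical helper: assemble an endpoint URL from the four protocol fields.
# Assumption: the URL is a simple scheme://hostname:port/path concatenation.
build_endpoint_url() {
    scheme=$1
    hostname=$2
    port=$3
    path=$4
    # Add a leading slash if the path lacks one.
    path="/${path#/}"
    printf '%s://%s:%s%s\n' "$scheme" "$hostname" "$port" "$path"
}

build_endpoint_url https spsrc14.iaa.csic.es 18027 /disk
# prints: https://spsrc14.iaa.csic.es:18027/disk
```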
Operators will:
Testing
Manual
For initial testing, the RSE is best interacted with manually via the Rucio CLI. An image of the modified client that contains the configuration for the SKAO datalake can be found here. Before trying any more complicated operations, you should first try authenticating by issuing a rucio whoami.
Attention
To have an account to authenticate against, you must request to be a member of the /services/rucio/roles/user group in IAM.
The complete usage of the Rucio command line client is not within the remit of this document; please see the official guide here.
Note
Depending on the RSE, additional scopes may be required during the authentication process. Specifically, for RSEs that enforce fine-grained authorisation using the WLCG path specification, you may require one or more of storage.read:/, storage.modify:/ and storage.create:/; which claim(s) you require depends on the operation you wish to perform. These claims need to be added to the Rucio configuration file at /opt/rucio/etc/rucio.cfg.
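To see which of these scopes are already present in a configuration file, a simple text search is usually enough. The sketch below uses a hypothetical helper, `missing_storage_scopes`, that prints each scope not found anywhere in the file. It only checks for the text of the scope and makes no assumption about which configuration key holds it.

```shell
# Hypothetical helper: list the WLCG storage scopes that do not appear
# anywhere in a Rucio configuration file (default path taken from the text).
missing_storage_scopes() {
    cfg=${1:-/opt/rucio/etc/rucio.cfg}
    for scope in storage.read:/ storage.modify:/ storage.create:/; do
        # -F treats the scope as a fixed string; -q suppresses grep output.
        if ! grep -qF -- "$scope" "$cfg"; then
            echo "$scope"
        fi
    done
}

# Usage:
#   missing_storage_scopes                 # checks /opt/rucio/etc/rucio.cfg
#   missing_storage_scopes ./rucio.cfg     # checks a local copy
```

A missing scope is not necessarily a problem, since which scopes are needed depends on the operation being performed.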
Functional Testing
Once the RSE has been integrated and added to the (automated) functional tests, the Rucio events dashboard should be reviewed periodically. Specifically, operators should check if there are any entries in the table for failed transfers and failed deletions (towards the bottom of the dashboard) for their RSE and address any underlying issues.