.. _rucio-storage-element:

Rucio Storage Element
=====================

.. note::

    - **Last modified by**: Rob Barnsley
    - **Last modified date**: 21/11/2024
    - **Support**: https://skao.slack.com/archives/C047DPDKRN0
    - **Code Repository (Rucio Client)**: https://gitlab.com/ska-telescope/src/src-dm/ska-src-dm-da-rucio-client

Purpose
-------

A Rucio Storage Element (RSE) is Rucio's logical abstraction of a storage system. It corresponds to the physical space in which SRCNet data will live.

Prerequisites
-------------

- A new IAM client (depending on storage backend)
- Corresponding entries in the Site Capabilities API for the storage, access protocols and areas to be deployed

.. attention::

    For SRCNet v0.1, the storage backing at least one of the RSEs at each site should be POSIX compatible and accessible as a traditional (mountable) file system. This ensures that integration with other services that assume POSIX access is possible.

.. _iam-rucio-storage-element:

Creating a new IAM Client
^^^^^^^^^^^^^^^^^^^^^^^^^

Depending on which storage backend is selected, it may be necessary to create a new IAM client.

Please follow the instructions at :doc:`/services/dependent/iam-client/iam-client` for creating and managing an IAM client.

The scopes, grant types and redirect URIs depend on the storage backend chosen; please refer to the backend's documentation and select them accordingly.

Adding Storages, Protocols and Areas to the Site Capabilities API
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Please follow the instructions at :doc:`/services/dependent/site-capabilities-api/site-capabilities-api` for creating and managing a site in the Site Capabilities database.

Of relevance here are the ``Storage``, ``Storage > Supported Protocols`` and ``Storage > Areas`` sections.

First, create a new **storage** block and populate the following fields:

- ``Hostname``: Set to the hostname of the storage backend, e.g. ``rucio.espsrc.iaa.csic.es``
- ``Base path``: Set to the base path that the storage is served from (if applicable), e.g. ``/disk``
- ``Latitude``: Set to the latitude of the storage site, e.g. 37.1342
- ``Longitude``: Set to the longitude of the storage site, e.g. -3.613
- ``SRM``: Set to the storage backend type, e.g. ``storm``, ``xrd`` or ``ceph``
- ``Size of storage in TB``: Set to the storage capacity in TB, e.g. 10

Next, add any **supported protocols** for this **storage**:

- ``Prefix``: The access protocol prefix, e.g. https
- ``Port``: The access protocol port, e.g. 443

Finally, add a **storage area** relating to the RSE:

.. hint::

    Storage **areas** equate to paths on a particular **storage**. This can be a many-to-one mapping, i.e. there can be multiple RSEs (storage areas) per storage.

- ``Type``: The storage area type; set this to ``rse``
- ``Relative path``: The path (relative to the base path above) to the RSE **storage area**
- ``Identifier``: The name of this RSE **as known by Rucio**

.. attention::

    It is critical that the identifier field here matches the name used by Rucio. Please ask if you are unsure about this.

Deployment
----------

For deployment, you must first choose which backend implementation to use for your :doc:`storage element <../../../dependent/storage-element/storage-element>`. There is currently no preference here.

- StoRM WebDAV: :ref:`Manual `, :ref:`Docker (WebDav) ` and :ref:`Helm (WebDav) `
- xrootd: :ref:`Manual (xrootd) `

Testing
^^^^^^^

Before requesting that the RSE be integrated into the datalake, a precursor connectivity test should be undertaken. This checks for common faults that will cause problems during integration.

For connectivity testing, the `operator toolbox `_ should be used.

.. note::

    The operator toolbox requires Docker.

First, clone the repository:

.. code-block:: bash

    git clone https://gitlab.com/ska-telescope/src/operations/ska-src-operator-toolbox.git
    cd ska-src-operator-toolbox

Then export the following environment variables:

.. code-block:: bash

    export RUCIO_CFG_ACCOUNT=   # your Rucio account name
    export ENDPOINT_URL=        # URL of the storage endpoint to test

For example, if you want to test the ``ESPSRC`` RSE:

.. code-block:: bash

    export RUCIO_CFG_ACCOUNT=mparra
    export ENDPOINT_URL=https://spsrc14.iaa.csic.es:18027/disk

Finally, run:

.. code-block:: bash

    make report-rse-connectivity

and follow the prompts to retrieve a token.

A report should be generated for the connectivity test. A successful test looks like:

.. code-block:: console

    BEGIN REPORT
    ============
    report date: Thu Nov 14 14:48:56 UTC 2024
    report endpoint: https://spsrc14.iaa.csic.es:18027/disk
    Port open.
    OpenSSL verification succeeded
    gfal-ls check succeeded
    gfal-copy check succeeded
    gfal-sum succeeded
    gfal-rm check succeeded
    davix-ls check is disabled.
    davix-put check is disabled.
    davix-rm check is disabled.
    rucio upload check is disabled.
    ==========
    END REPORT

At this point, you can proceed to requesting integration.

Integration
-----------

For the RSE to be recognised by the datalake, it must first be added as an endpoint to the Rucio server.

Request addition of RSE
^^^^^^^^^^^^^^^^^^^^^^^

Once the deployment tests are successful, you will need to contact datalake operators in the `Rucio `_ Slack channel to request addition of your RSE to the SKAO datalake.

Datalake operators require the following information about access protocols for your RSE:

- ``scheme``, e.g. https
- ``hostname``, e.g. spsrc14.iaa.csic.es
- ``path``, e.g. /disk
- ``port``, e.g. 18027

Operators will:

- add the RSE to the Rucio datalake by following :doc:`this <../../../global/rucio/guides/add-rse>` guide, and
- include this RSE in functional testing by adding it to any corresponding `tasks `_ (private).

Testing
-------

Manual
^^^^^^

For initial testing, the RSE is best interacted with manually via the Rucio CLI. An image of the modified client that contains the configuration for the SKAO datalake can be found `here `_.

Before trying any more complicated operations, you should first try authenticating by issuing a ``rucio whoami``.

.. attention::

    To have an account to authenticate against, you must request to be a member of the ``/services/rucio/roles/user`` group in IAM.

Complete usage of the Rucio command line client is outside the remit of this document; please see the official guide `here `_.

.. note::

    Depending on the RSE, additional scopes may be required during the authentication process. Specifically, for RSEs that enforce fine-grained authorisation using the WLCG path specification, you may require one or all of ``storage.read:/``, ``storage.modify:/`` and ``storage.create:/``; which of these you require depends on the operation you wish to perform. They need to be added to the Rucio configuration file at ``/opt/rucio/etc/rucio.cfg``.

Functional Testing
^^^^^^^^^^^^^^^^^^

Once the RSE has been integrated and added to the (automated) functional tests, the `Rucio events dashboard `_ should be reviewed periodically.

Specifically, operators should check whether the tables of failed transfers and failed deletions (towards the bottom of the dashboard) contain any entries for their RSE, and address any underlying issues.
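
As a cross-check before requesting integration, note that in the ``ESPSRC`` example the four access-protocol fields sent to datalake operators (``scheme``, ``hostname``, ``port`` and ``path``) combine into the same URL that was used as ``ENDPOINT_URL`` for the connectivity test. The following is a minimal sketch of that composition. The ``build_endpoint_url`` helper is hypothetical and is not part of the operator toolbox; the values are those of the example in this document.

.. code-block:: bash

    # Hypothetical helper (not provided by the operator toolbox):
    # join the four access-protocol fields into one endpoint URL.
    # Arguments: $1=scheme  $2=hostname  $3=port  $4=path
    build_endpoint_url() {
        # Strip one leading slash from the path (if present) so the
        # format string can always insert exactly one separator.
        printf '%s://%s:%s/%s\n' "$1" "$2" "$3" "${4#/}"
    }

    # Values from the ESPSRC example used throughout this page.
    ENDPOINT_URL="$(build_endpoint_url https spsrc14.iaa.csic.es 18027 /disk)"
    export ENDPOINT_URL

    echo "$ENDPOINT_URL"
    # prints: https://spsrc14.iaa.csic.es:18027/disk

If the URL produced from the values you intend to send to operators differs from the one that passed the connectivity test, it is worth resolving the discrepancy before requesting integration.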