{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## Backups in the HSM archive\n", "\n", "Before data is irreversibly removed from the pool disk storage, we **back up** all of it in the [HSM](https://docs.dkrz.de/doc/datastorage/hsm/index.html) (Hierarchical Storage Management System). For that, we use the package [`packems`](https://code.mpimet.mpg.de/projects/esmenv/wiki/Packems-122), developed in collaboration by MPI-M and DKRZ. Here is a short introduction to how we use it on Levante.\n", "\n", "> `packems` takes directories to be packed into tar files, and optionally pushes these directly to the tape archiving system. Features are parallel operation, batch job, and error recovery. It keeps track of the user's archiving operations to re-use this information for later retrieval of data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "1. First, we load the necessary module." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%%bash\n", "module load packems" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Log in to the HSM:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "!slk login" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. We construct a list of all files that should be transferred to tape, one line per file with its path and its size in bytes. That list serves as input for `packems`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sourcePath = \"/work/ik1017/CMIP6/data/CMIP6\"\n", "MIP = \"DCPP\"\n", "filelist = MIP + \".txt\"\n", "# GNU find's -printf writes \"<path> <size in bytes>\" for each regular file\n", "!find {sourcePath} -type f -printf '%p %s\\n' > {filelist}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "4. If you extend an existing archive with new `.tar` files, you need the number of the last tar file that is already archived. Retrieve the index file from the archive and read the highest tar number from it." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "index = \"INDEX_\" + MIP + \".txt\"\n", "# The shell output is returned as a list of lines; the last line holds the highest tar number\n", "lastTar = !grep '\\.tar' {index} | cut -d '>' -f 2 | cut -d '_' -f 2 | cut -d '.' -f 1 | sort -g | tail -n 1\n", "lastTar = int(lastTar[0])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "5. Run `packems`:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "archiveDir = \"/arch/ik1017/cmip6/CMIP6_retracted\"\n", "# -j: parallelisation processes, -L: follow links, -F: fail fast,\n", "# -a: generate archiving commands for hsm transfer, -x: add group to index file,\n", "# -S: destination, -s: number of the first new .tar package,\n", "# -d: output dir for tar files, -o: prefix for tar files,\n", "# -D: how to add files to archive, -i: input file list\n", "!packems -j 12 -L -F -a -x \\\n", "    -S {archiveDir} \\\n", "    -s {lastTar + 1} \\\n", "    -d {MIP} \\\n", "    -o {MIP} \\\n", "    -O by_name \\\n", "    -D by_order \\\n", "    -i {filelist} \\\n", "    --archive-index {index}" ] } ], "metadata": { "kernelspec": { "display_name": "python3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.9" }, "nbsphinx": { "execute": "never" } }, "nbformat": 4, "nbformat_minor": 4 }