Backups in the archive HPSS/HSM
Before data is irreversibly removed from the pool disk storage, we back up all of it in the HPSS (High Performance Storage System). For that, we use the package packems, developed in collaboration by MPI-M and DKRZ. Here is a short introduction to how we use it on mistral.
packems takes directories to be packed into tar files and optionally pushes these directly to the tape archiving system. Its features include parallel operation, batch jobs, and error recovery. It keeps track of the user's archiving operations so that this information can be reused for later retrieval of data. Currently, only HPSS's pftp interface is supported for archiving.
In a first step, we load the necessary modules.
%%bash
module unload python
module load python3/2020.02-gcc-9.1.0
module load packems/1.2.2
You need a valid Kerberos ticket to connect to the HPSS. On mistral, you can use kinit to authenticate with Kerberos.
!kinit -AVR -l 42d
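Before starting a long transfer, it can help to check that a ticket is actually in place. The following is a minimal sketch, assuming the klist tool is installed and supports the -s option (silent mode, exit status 0 only if a valid ticket cache exists). The function name has_valid_ticket and its run parameter are hypothetical helpers introduced only for this illustration.

```python
import subprocess


def has_valid_ticket(run=subprocess.run):
    """Return True if `klist -s` reports a valid, unexpired ticket cache.

    `run` is injectable so the check can be tried out without Kerberos.
    """
    try:
        result = run(["klist", "-s"], check=False)
    except FileNotFoundError:
        # klist is not installed on this machine
        return False
    return result.returncode == 0
```

If has_valid_ticket() returns False, run kinit again before calling packems.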
We construct a list of all files that should be transferred to HPSS. That list serves as input for packems.
sourcePath="/mnt/lustre02/work/ik1017/CMIP6/data/CMIP6_retracted"
MIP="ScenarioMIP"
filelist=MIP+".txt"
# GNU find prints "<path> <size in bytes>" for every regular file
!find {sourcePath} -type f -printf '%p %s\n' > {filelist}
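The same file list can also be built without leaving Python. This is only a sketch under the assumption that the list should contain one "<path> <size in bytes>" line per regular file, as produced by the shell command above. The function name write_filelist is hypothetical.

```python
import os


def write_filelist(source_path, filelist):
    """Write one '<path> <size in bytes>' line per regular file under source_path.

    Returns the number of files written.
    """
    count = 0
    with open(filelist, "w") as out:
        for root, _dirs, files in os.walk(source_path):
            for name in sorted(files):
                path = os.path.join(root, name)
                # mirror `find -type f`: skip symlinks and special files
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                out.write(f"{path} {os.path.getsize(path)}\n")
                count += 1
    return count
```

In the notebook this would be called as write_filelist(sourcePath, filelist). Note that a path containing spaces would make the two columns ambiguous, in this sketch as well as in the shell version.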
If you extend the archive with new .tar files, you need to know the number of the last existing tar file. To find it, retrieve the index file from the archive and read the highest tar number from it.
index="INDEX_"+MIP+".txt"
# highest tar number listed in the index; int() also drops leading zeros
lastTar = !grep '\.tar' {index} | cut -d '>' -f 2 | cut -d '_' -f 2 | cut -d '.' -f 1 | sort -g | tail -n 1
lastTar = int(lastTar[0])
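The lookup of the last tar number can be sketched in plain Python as well. The layout of the index file is an assumption here: the sketch only expects tar names that end in an underscore, a number, and ".tar" (for example a hypothetical ScenarioMIP_0012.tar) somewhere on the line. The function name last_tar_number is hypothetical too.

```python
import re

# Assumed naming pattern: "<prefix>_<number>.tar"
TAR_NUMBER = re.compile(r"_(\d+)\.tar\b")


def last_tar_number(index_lines):
    """Return the highest tar number found in the index lines, or 0 if none."""
    numbers = [
        int(match.group(1))
        for line in index_lines
        for match in TAR_NUMBER.finditer(line)
    ]
    return max(numbers, default=0)
```

In the notebook, the index file would be opened and passed in directly, for example with open(index) as f: lastTar = last_tar_number(f). Returning 0 for an empty index means the first new tar file gets number 1.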
Run packems:
%%bash -s "$MIP" "$lastTar" "$filelist"
# MIP, lastTar and filelist are passed in from the Python variables defined above
MIP="$1"
lastTar="$2"
filelist="$3"
archiveDir="/hpss/arch/ik1017/cmip6/CMIP6_retracted"
args=(
    -j 12                    # parallelisation processes
    -L                       # follow links
    -F                       # fail fast
    -a                       # generate archiving commands for hpss transfer
    -x                       # add group to index file
    -S "${archiveDir}"       # destination
    -s "$((lastTar + 1))"    # number of the first new .tar package
    -d "${MIP}"              # output dir for tar files
    -o "${MIP}"              # prefix for tar files
    -O by_name
    -D by_order              # how to add files to archive
    -i "${filelist}"         # input file list
    --archive-index "INDEX_${MIP}.txt"
)
packems "${args[@]}"