This workflow is applicable to users of Arkivum’s Digital Preservation Platform. It allows asset files to be deleted from a tranche once they have been uploaded to Arkivum, because it provides a mechanism for reinstating those files in the tranche when required.
Run gfs_validate_tranche_checksum.py with <checksum type> set to “md5” and <asset file location> set to “gfs”.
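The internals of gfs_validate_tranche_checksum.py are not shown in this document, but an MD5 validation step of this kind can be sketched as follows. The function names and the manifest structure (a mapping of relative path to expected digest) are illustrative assumptions, not the script’s actual code.

```python
# Hedged sketch of an MD5 validation pass: compute the MD5 of each asset
# file and compare it against a recorded value. The manifest format here
# is an assumption for illustration only.
import hashlib
from pathlib import Path

def md5_of_file(path: Path) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def validate(manifest: dict, root: Path) -> list:
    """Return the relative paths whose MD5 does not match the manifest."""
    return [rel for rel, expected in manifest.items()
            if md5_of_file(root / rel) != expected]
```

A run that returns an empty list indicates that every listed asset file matches its recorded checksum.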
Run gfs_create_arkivum_upload.py with <checksum type> set to “md5”.
Upload the resulting package to the S3 bucket and ingest it into Arkivum.
Run gfs_create_or_delete_tranche_asset_folder.py with <action> set to “delete_only_content_of_folder” for the relevant asset_folders.
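A hedged sketch of what a “delete_only_content_of_folder” action might do: remove everything inside an asset folder while keeping the (now empty) folder itself, so the tranche structure is preserved for later reinstatement. The function name mirrors the action value for readability; the real script’s behaviour may differ.

```python
# Remove all files and sub-folders inside `folder`, but keep `folder`
# itself so the tranche's folder structure stays intact.
import shutil
from pathlib import Path

def delete_only_content_of_folder(folder: Path) -> None:
    for entry in folder.iterdir():
        if entry.is_dir():
            shutil.rmtree(entry)   # remove a sub-folder and its contents
        else:
            entry.unlink()         # remove a single file
```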
If a need arises to reinstate the asset files to the tranche:
Access the Arkivum dashboard and follow the instructions at the top of the “gfs_distribute_arkivum_export_to_tranche.py” script to use the “Bulk Export” function to export a BagIt folder that corresponds to the uploaded tranche.
Unzip the exported zip file containing the BagIt folder.
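The unzip step can be performed with the standard library; the file and destination names below are placeholders, not values from the workflow.

```python
# Extract the exported zip (containing the BagIt folder) to a destination
# directory. Paths are illustrative placeholders.
import zipfile
from pathlib import Path

def unzip_export(zip_path: Path, destination: Path) -> None:
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(destination)
```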
Run “gfs_distribute_arkivum_export_to_tranche.py” with <checksum type> set to “sha256” to move the asset files from the BagIt folder to the asset folders in the specified tranche. The parameter must be “sha256” rather than “md5” because that is the checksum type used in the checksum manifest file of the exported BagIt folder.
This script checks that the checksums listed in the BagIt folder’s checksum manifest file match the checksums of the asset files after they have been moved.
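A check of this kind, assuming the standard BagIt layout in which manifest-sha256.txt contains one “<checksum> <relative path>” entry per line, might look like the sketch below. This only approximates what gfs_distribute_arkivum_export_to_tranche.py is described as doing; the real script may differ in detail.

```python
# Verify files against a BagIt manifest-sha256.txt: each line holds a
# SHA-256 digest followed by a path relative to the bag root.
import hashlib
from pathlib import Path

def sha256_of_file(path: Path) -> str:
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_bagit_manifest(bag_root: Path) -> list:
    """Return the manifest entries whose checksum does not match."""
    mismatches = []
    manifest = bag_root / "manifest-sha256.txt"
    for line in manifest.read_text().splitlines():
        if not line.strip():
            continue
        expected, rel_path = line.split(maxsplit=1)
        if sha256_of_file(bag_root / rel_path) != expected:
            mismatches.append(rel_path)
    return mismatches
```

An empty result means every file listed in the manifest is intact after the move.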
Run “gfs_validate_tranche_checksum.py” with <checksum type> set to “md5” and <asset file location> set to “gfs”.
## General Notes
Note that this workflow relies on the structure of the tranche folder remaining unchanged between the time the gfs_create_arkivum_upload.py script is run and the time the gfs_distribute_arkivum_export_to_tranche.py script is run.
It also relies on Arkivum continuing to make the “Bulk Export” functionality available in their product, and not changing the format of the export file, between the time the gfs_create_arkivum_upload.py script is run and the time the gfs_distribute_arkivum_export_to_tranche.py script is run.
Given the two caveats above, it would be prudent to use this workflow only if the drive on which the GFS resides is backed up regularly, and the backup retention period is taken into account.
Detailed instructions on how to run each of the scripts mentioned in this document are contained in the comments at the top of each script.