|
|
|
|
|
One of the first things to consider when embarking on either a digitisation project or a migration of born-digital material is the level to which the material should be divided up (the granularity). For example, in a project that involves digitising bound volumes of pamphlets, should each bound volume be treated as a single item and given just one metadata entry, or should each pamphlet have its own metadata entry? The latter approach improves the discoverability of the material once it has been uploaded to a website, and keeps the download size of the files smaller, but it requires more cataloguer-time to achieve. The toolkit provides a mechanism for expressing the required granularity. The smallest division is the child-folder of a parent-folder. So, in the example of the bound volume of pamphlets, if we wished to take the more granular of the two approaches, each pamphlet could correspond to a child-folder and the bound volume to the parent-folder. The tranche csv files contain columns that allow this parent-child relationship to be expressed.
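As a minimal sketch of how that parent-child relationship could be read from a tranche csv file, the following Python example groups child-folders under their parent-folders. The column names `parent_folder` and `child_folder` are assumptions for illustration only; consult the toolkit's tranche csv template for the actual column headings.

```python
import csv
from collections import defaultdict

def group_children(tranche_csv_path):
    """Group child-folders under their parent-folders from a tranche csv.

    The column names used here (parent_folder, child_folder) are
    illustrative, not the toolkit's actual headings.
    """
    children_by_parent = defaultdict(list)
    with open(tranche_csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            children_by_parent[row["parent_folder"]].append(row["child_folder"])
    return children_by_parent

# Example: a bound volume of pamphlets treated at the more granular level,
# with one child-folder per pamphlet, might come back as
#   {"bound_volume_01": ["pamphlet_001", "pamphlet_002", ...]}
```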
|
|
|
|
|
|
|
|
|
For those familiar with archival terminology, a project might equate to the collection level, a tranche to the series level, a parent-folder to the subseries level, and a child-folder to the folder level. The archival "folder level" contains one or more digital asset files.
|
|
|
|
|
|
It is important that the granularity of a project is considered before a tranche folder structure is created and populated, because it is very time-consuming to rectify mistakes made in this aspect of a project post-digitisation. The information gained from assessing the granularity will also allow a project manager to predict the resourcing levels required to complete the project. See the [Workflows](https://itsagit.lse.ac.uk/hub/lse_digital_toolkit/-/wikis/LSE-Digital-Toolkit#workflows) section for more information about assessing the appropriate level of granularity for material.
|
|
|
|
|
|
|
|
|
Once the column-entries in the tranche csv file have been filled out and validated, a script is run that creates a corresponding folder structure. It is this folder structure (along with the tranche csv file) that can either be given to the digitisation provider to populate, or be the receptacle for migrated files.
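As a rough illustration of the folder-creation step (not the toolkit's actual script), the sketch below creates a child-folder inside each parent-folder listed in the tranche csv file, again assuming the hypothetical `parent_folder` and `child_folder` column names used above.

```python
import csv
from pathlib import Path

def create_tranche_folders(tranche_csv_path, tranche_root):
    """Create the parent/child folder structure described by a tranche csv."""
    root = Path(tranche_root)
    with open(tranche_csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # Each child-folder is created inside its parent-folder;
            # folders that already exist are left untouched.
            (root / row["parent_folder"] / row["child_folder"]).mkdir(
                parents=True, exist_ok=True
            )
```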
|
|
|
|
|
|
There is a script to validate that the digitisation provider (or migration script, or the archivist responsible for born-digital material) has populated the tranche folder structure correctly, in terms of both the existence of asset-files in the correct folders and, if the GFS naming convention is used for the files, that the asset-filenames contain continuous sequence numbers. It also checks that the number of files in each set of derivatives matches the number of master files; for example, that the number of jpg files matches the number of tif files in each child-item-set.
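The checks described above could be sketched along the following lines. This is a simplified illustration, not the validation script itself: it assumes tif masters, jpg derivatives, and filenames whose GFS sequence number appears as trailing digits, all of which may differ in practice.

```python
import re
from pathlib import Path

def check_child_folder(child_folder):
    """Report simple problems with one child-folder's asset files."""
    child = Path(child_folder)
    problems = []

    # 1. The derivative count should match the master count (e.g. jpg vs tif).
    masters = sorted(child.glob("*.tif"))
    derivatives = sorted(child.glob("*.jpg"))
    if len(derivatives) != len(masters):
        problems.append(
            f"{child.name}: {len(masters)} tif masters "
            f"but {len(derivatives)} jpg derivatives"
        )

    # 2. If filenames end in a sequence number, check it is continuous.
    #    The trailing-digits pattern is an assumption about the convention.
    numbers = sorted(
        int(m.group(1))
        for f in masters
        if (m := re.search(r"(\d+)\.tif$", f.name))
    )
    if numbers and numbers != list(range(numbers[0], numbers[0] + len(numbers))):
        problems.append(f"{child.name}: sequence numbers are not continuous")

    return problems
```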
|
|
|
|