... | ... | @@ -68,9 +68,9 @@ The use of codes and numbers allows for the automatic creation of unique IDs at |
|
|
|
|
|
The levels of the GFS are reflected in an (optional) naming convention for the asset-files. If this naming convention is adopted, the asset-files can be manipulated more easily by the scripts. It also ensures that each asset-filename is unique. All but one of the scripts will still function if the asset-file-naming convention is not adopted for a tranche within a project, so long as the asset-filenames meet the minimum requirements listed in the [The Generic Folder Structure (GFS)](https://itsagit.lse.ac.uk/hub/lse_digital_toolkit/-/wikis/LSE-Digital-Toolkit#the-generic-folder-structure-gfs) section.
|
|
|
|
|
|
The documentation is currently skewed towards archival processing, and specifically, upload to, and download from, Arkivum’s Digital Preservation Platform, which uses the ISAD(G) schema. However, if external developers wish to write scripts for other upload targets and download sources that use different schema, the documentation could become more generic, with all such scripts given their own sections in the documentation. For example, if there is a need to migrate a legacy collection of images of algae to the GFS, the Darwin Core Schema could be used, and an upload script could be written that has a biological database as a target.
|
|
|
The documentation is currently skewed towards archival processing, and specifically, upload to, and download from, the Arkivum Digital Preservation Platform, which uses the ISAD(G) schema. However, if external developers wish to write scripts for other upload targets and download sources that use different schemas, the documentation could become more generic, with all such scripts given their own sections in the documentation. For example, if there is a need to migrate a legacy collection of images of algae to the GFS, the Darwin Core Schema could be used, and an upload script could be written that has a biological database as a target.
|
|
|
|
|
|
One of the first things to consider when embarking on either a digitisation project, or a migration of born-digital material, is the level to which the material should be divided up (the granularity). For example, in the case of a digitisation project that involves the digitisation of bound volumes of phamphlets, should a bound volume be treated as a single item, and given just one metadata entry, or should each pamphlet have its own metadata entry? If it is the latter, the discoverability of the material will be improved, once it has been uploaded to a website, and the download size of the files will be smaller. However, it will require more cataloguer-time to achieve this outcome. The toolkit provides a mechanism for expressing the required granularity. The smallest division is the child-folder of a parent-folder. So in the example of the bound volume of pamphlets, if we wished to take the more granular of the two approaches, each pamphlet could correspond to a child-folder, and the bound volume the parent-folder. The tranche csv files contain columns that allow this parent-child relationship to be expressed.
|
|
|
One of the first things to consider when embarking on either a digitisation project, or a migration of born-digital material, is the level to which the material should be divided up (the granularity). For example, in the case of a digitisation project that involves the digitisation of bound volumes of pamphlets, should a bound volume be treated as a single item, and given just one metadata entry, or should each pamphlet have its own metadata entry? If it is the latter, the discoverability of the material will be improved, once it has been uploaded to a website, and the download size of the files will be smaller. However, it will require more cataloguer-time to achieve this outcome. The toolkit provides a mechanism for expressing the required granularity. The smallest division is the child-folder of a parent-folder. So in the example of the bound volume of pamphlets, if we wished to take the more granular of the two approaches, each pamphlet could correspond to a child-folder, and the bound volume the parent-folder. The tranche csv files contain columns that allow this parent-child relationship to be expressed.
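
As a rough illustration of how a parent-child relationship might be read back out of a tranche csv file, the sketch below groups child-folders under their parent-folder. The column names `parent_folder` and `child_folder`, and the sample rows, are hypothetical: the text only states that such columns exist, so the real headers should be taken from the toolkit's own csv files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names: the text says only that such columns exist.
PARENT_COLUMN = "parent_folder"
CHILD_COLUMN = "child_folder"


def group_children_by_parent(csv_text):
    """Map each parent-folder to the list of its child-folders, in file order."""
    children = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        children[row[PARENT_COLUMN]].append(row[CHILD_COLUMN])
    return dict(children)


# Invented sample data: one bound volume (parent) holding two pamphlets (children),
# and a second volume holding one.
SAMPLE = """parent_folder,child_folder,title
volume_001,pamphlet_001,First pamphlet
volume_001,pamphlet_002,Second pamphlet
volume_002,pamphlet_001,Third pamphlet
"""

if __name__ == "__main__":
    for parent, kids in group_children_by_parent(SAMPLE).items():
        print(parent, kids)
```

In this sketch, the coarser approach (one metadata entry per bound volume) would simply correspond to one row per parent-folder.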
|
|
|
|
|
|
For those familiar with archival terminology, a project might equate to the collection level, a tranche to the series level, a parent-folder to the subseries level, and a child-folder to the folder level. The archival "folder level" contains one or more digital files.
|
|
|
|
... | ... | @@ -88,7 +88,7 @@ The toolkit is only mature in the relatively narrow band of activity for which t |
|
|
|
|
|
When the toolkit is used with the ISAD(G) schema, it can be configured for "Library Processing". This allows certain fields, which are commonly used in bibliographic cataloguing, but are not present in the ISAD(G) schema, such as “Personal author”, “Corporate author”, “Publisher”, and “Note”, to have their own columns in the tranche csv files.
|
|
|
|
|
|
When an upload target such as Arkivum's Perpetua is used, the content of these columns are combined and formatted within the “isadg.scopeAndContent” column.
|
|
|
When an upload target such as Arkivum is used, the contents of these columns are combined and formatted within the “isadg.scopeAndContent” column.
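
The combining step could look something like the following minimal sketch. It assumes that the csv column headers match the field labels quoted above and that each field is appended as a `Label: value` line; both are assumptions, since the toolkit's actual headers and formatting are not shown here. Only the “isadg.scopeAndContent” column name is taken from the text.

```python
import csv
import io

# Assumption: the column headers match the field labels quoted in the text;
# the real headers in a tranche csv file may differ.
LIBRARY_COLUMNS = ["Personal author", "Corporate author", "Publisher", "Note"]
TARGET_COLUMN = "isadg.scopeAndContent"


def fold_library_columns(row):
    """Return a copy of row with non-empty library fields appended to the target column."""
    parts = [(row.get(TARGET_COLUMN) or "").strip()]
    for name in LIBRARY_COLUMNS:
        value = (row.get(name) or "").strip()
        if value:
            # Assumed layout "Label: value"; the toolkit's own formatting is not documented here.
            parts.append(f"{name}: {value}")
    merged = dict(row)
    merged[TARGET_COLUMN] = "\n".join(p for p in parts if p)
    return merged


def fold_tranche_csv(csv_text):
    """Apply fold_library_columns to every row of the given csv text."""
    return [fold_library_columns(r) for r in csv.DictReader(io.StringIO(csv_text))]
```

Empty fields are skipped so that the combined text contains only the labels that have content, and the input row is left unmodified.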
|
|
|
|
|
|
Tags, plus their content, can be created “on the fly” by entering them in the “gfs.contextualInformation” column of the tranche csv files. These tags are formatted and added to the content of the “isadg.scopeAndContent” column, as can be seen [here](https://lse-atom.arkivum.net/uklse-ex1zt01001001). This is a dissemination to Arkivum’s Atom module that contains the uploaded content of the example GFS used in the [Getting started with the toolkit](https://itsagit.lse.ac.uk/hub/lse_digital_toolkit/-/wikis/Getting-started-with-the-toolkit) guide.
|
|
|
|
... | ... | @@ -124,20 +124,6 @@ This feature is documented in the the "Library Processing" sub-section of the wo |
|
|
|
|
|
[Arkivum tranche cycle workflow](https://itsagit.lse.ac.uk/hub/lse_digital_toolkit/-/wikis/Arkivum-tranche-cycle-workflow)
|
|
|
|
|
|
## Cataloguing guides
|
|
|
|
|
|
The cataloguing guides for the digitisation and born digital workflows can be downloaded via the links below. They are the LSE's internal cataloguing guides and are specific to LSE's own requirements.
|
|
|
|
|
|
[LSE_Digital_Toolkit_Born_Digital_User_Guide_v2_1_0.docx](uploads/acdd08959e5eada4e6a5f514cf98e58e/LSE_Digital_Toolkit_Born_Digital_User_Guide_v2_1_0.docx)
|
|
|
|
|
|
[LSE_Digital_Toolkit_Digitisation_User_Guide__v2_1_0.docx](uploads/1f0a1e4bb11904a4c55c5c35faffe9b2/LSE_Digital_Toolkit_Digitisation_User_Guide__v2_1_0.docx)
|
|
|
|
|
|
[Command_line_examples_v2_1_0.txt](uploads/b75e070b67e2337412b2c52fa57005b4/Command_line_examples_v2_1_0.txt)
|
|
|
|
|
|
You may wish to consult these guides while evaluating the toolkit and then, if you decide to use the toolkit, modify the documents so that they correspond with the requirements of your own organisation.
|
|
|
|
|
|
The content of the digitisation guide matches the default configuration of the [example GFS](https://drive.google.com/file/d/1lV_FhzYtcbIDcUnox78DMpsTxRx2uVho/view?usp=share_link) that is used in the [Getting started with the toolkit](https://itsagit.lse.ac.uk/hub/lse_digital_toolkit/-/wikis/Getting-started-with-the-toolkit) section.
|
|
|
|
|
|
## Script groups
|
|
|
|
|
|
[Script groups](https://itsagit.lse.ac.uk/hub/lse_digital_toolkit/-/wikis/Script-groups)
|
... | ... | @@ -191,15 +177,10 @@ When a digitisation provider is in "production mode", it may have multiple staff |
|
|
|
|
|
A Graphical User Interface (GUI) for the toolkit - this is currently under development.
|
|
|
|
|
|
A script that mints Digital Object Identifiers (DOIs) from the metadata in the department, project, and tranche csv files, and writes the DOIs back to those csv files.
|
|
|
|
|
|
Development of a configurable utility to delete and substitute non-standard ascii characters in a file.
|
|
|
A script that mints Digital Object Identifiers (DOIs) from the metadata in the department, project, and tranche csv files.
|
|
|
|
|
|
An attempt to improve the ability of the toolkit to cater for non-English metadata text. Unfortunately, nothing can be guaranteed in this regard.
|
|
|
|
|
|
A facility to cater for archival levels of unlimited depth.
|
|
|
|
|
|
|
|
|
## Contact
|
|
|
|
|
|
Nick Bywell (Digital Library Developer)
|
... | ... | @@ -212,13 +193,12 @@ When I first joined the Digital Library Team at the LSE, six years ago, there wa |
|
|
|
|
|
I am grateful to the following colleagues for their input into the development of the toolkit:
|
|
|
|
|
|
- **Robert Miles** and **Fabi Barticioti**, our current and former digital assets managers whose archival expertise has been invaluable in developing various aspects of the toolkit. Fabi developed the LSE's internal cataloguing guide for digitised material.
|
|
|
- **Robert Miles** and **Fabi Barticioti**, our current and former digital assets managers whose archival expertise has been invaluable in developing various aspects of the toolkit.
|
|
|
|
|
|
- **Henry Rowsell** and **Neil Stewart**, my current and former line managers, who (along with those further up the management hierarchy) gave me the time to develop a generic toolkit, rather than one that is specific to the LSE's requirements.
|
|
|
|
|
|
- **Silvia Gallotti**, who developed the LSE's internal cataloguing guide for born digital material.
|
|
|
|
|
|
- **Emma Pizarro**
|
|
|
- **Silvia Gallotti**
|
|
|
- **Emma Pizarro**
|
|
|
- **Clare Mulhall**
|
|
|
- **George Jukes**
|
|
|
- **Andy Jack**
|
... | ... | @@ -232,4 +212,4 @@ I am also grateful to: |
|
|
|
|
|
It would be interesting to hear from anyone who starts using the toolkit, or has problems with it, or is willing to give some feedback on its functionality. It would also be encouraging to hear from any developers who wish to contribute new scripts for additional upload targets. I can be contacted at n.bywell@lse.ac.uk.
|
|
|
|
|
|
**Nick Bywell** (6th Feb 2024) |
|
|
\ No newline at end of file |
|
|
**Nick Bywell** (7th Feb 2024) |
|
|
\ No newline at end of file |