Setting consistent, intelligent file naming standards is smart in any digitization project, but books raise unique problems. In most cases, a book has both an overall sequence and an internal pagination, and at UNT both must be delineated before a user can access the item online. The overall sequence may or may not include the front and back covers, inside covers, end pages, title page[s], blank pages, and many other possibilities in addition to the pages actually paginated in the book. How does one begin organizing such a morass of items?

At UNT we use a system called MagickNumbers, which uses an 8-character code to record both the sequence (order) and the pagination of digitized items.* The first 4 characters are digits that set the sequence of the items, while the last 4 characters form the pagination code that our page-turning system uses to display the pagination online.

SSSSPPPP.tif, where SSSS are the sequence digits and PPPP is the pagination code.
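
To make the layout concrete, here is a minimal Python sketch that splits a file name of this form into its two halves. It is an illustration only, not the script mentioned later in this post: the names `MagickNumber` and `parse_magick_number` are hypothetical, and the sketch assumes nothing about the pagination code beyond its being exactly 4 characters long.

```python
import re
from typing import NamedTuple


class MagickNumber(NamedTuple):
    """The two halves of an SSSSPPPP file name (hypothetical helper type)."""
    sequence: int          # SSSS: position of the scan in the overall order
    pagination_code: str   # PPPP: kept as text, since it is described as "characters"


# Four digits, then any four characters, then the .tif extension.
_MAGICK_NAME = re.compile(r"(\d{4})(.{4})\.tif", re.IGNORECASE)


def parse_magick_number(filename: str) -> MagickNumber:
    """Split 'SSSSPPPP.tif' into its sequence number and pagination code."""
    match = _MAGICK_NAME.fullmatch(filename)
    if match is None:
        raise ValueError(f"not an SSSSPPPP.tif file name: {filename!r}")
    return MagickNumber(int(match.group(1)), match.group(2))


if __name__ == "__main__":
    parsed = parse_magick_number("00100010.tif")
    print(parsed.sequence, parsed.pagination_code)
```

The pagination code is deliberately left as a string rather than converted to a number, because only the sequence half is described as digits.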

The example below depicts a simple scenario: 12 TIFF scans of 12 pages, paginated 1 through 12. Each file name is followed by the pagination displayed in our page-turning system:

00010001.tif -  1
00020002.tif -  2
00030003.tif -  3
00040004.tif -  4
00050005.tif -  5
00060006.tif -  6
00070007.tif -  7
00080008.tif -  8
00090009.tif -  9
00100010.tif - 10
00110011.tif - 11
00120012.tif - 12
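
A listing like the one above can be generated in a few lines of Python. This is a rough sketch covering only this simple case, where every scan is a numbered page and the pagination code is the zero-padded page number. The function name `simple_magick_names` is hypothetical, and covers, blank pages, and other unpaginated scans are not handled.

```python
def simple_magick_names(page_count: int, extension: str = "tif") -> list[str]:
    """Build SSSSPPPP names for scans whose page numbers run 1..page_count.

    Assumes the simple case only: sequence and page number are identical.
    """
    if not 1 <= page_count <= 9999:
        raise ValueError("page_count must fit in 4 digits (1-9999)")
    return [
        f"{number:04d}{number:04d}.{extension}"
        for number in range(1, page_count + 1)
    ]


if __name__ == "__main__":
    # Print each name alongside the page number it stands for.
    for page, name in enumerate(simple_magick_names(12), start=1):
        print(f"{name} - {page:2d}")
```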

There’s more to come, but I was sidetracked by starting a script to aid in MagickNumbering. We currently use a commercial program, ACDSee Photo Manager, to MagickNumber our book scans, and providing an open source alternative may help others get on the road to file naming nirvana.

* MagickNumbers are used when we want to display the item as a series of “pages” as opposed to a more generalized “series”, such as the 2 images comprising the front and back of a scanned photograph.
