Andy Adams of FlakPhoto has a quote up on his Facebook page that has me on my soap box this morning:

“People think far too much about techniques and not enough about seeing.” – Henri Cartier-Bresson

The context of this quote is illuminating, as many will interpret it as hailing the almighty image without regard to technical expertise. This continually bothers me as a photographic/imaging educator because the actual photographic prints I see in galleries continue to slacken in quality. HCB was dealing with an entirely different set of technical issues as he set out to curate time and space through the cyclopean gaze of a Leica viewfinder.

“Constant new discoveries in chemistry and optics are widening considerably our field of action. It is up to us to apply them to our technique, to improve ourselves, but there is a whole group of fetishes which have developed on the subject of technique. Technique is important only insofar as you must master it in order to communicate what you see….The camera for us is a tool, not a pretty mechanical toy. In the precise functioning of the mechanical object perhaps there is an unconscious compensation for the anxieties and uncertainties of daily endeavor. In any case, people think far too much about techniques and not enough about seeing.” – HCB

Only after mastering technique can one truly discard it. Photographers who have mastered technique understand this, while those who need to spend more time on their craft will not. This is the crux of why this quote-mining is so misleading, and why the quote is not one to be re-Tweeted. Additionally, the shortened quote makes it sound like HCB is advocating thinking while photographing, which he never did. His work is often discussed in relation to the idea of “thin slicing”, which is the antithesis of thinking. Consider this quote, printed large across pages 2 & 3 of Westerbeck and Meyerowitz’s book Bystander: A History of Street Photography, in which they have a conversation about Winogrand:

Westerbeck: I remember hearing Garry and the rest of you often calling pictures “tough” or “beautiful.” Why was “tough” such a key word for you?

Meyerowitz: “Tough” meant it was an uncompromising image, something that came from your gut, out of instinct, raw, of the moment, something that couldn’t be described in any other way. So it was TOUGH. Tough to like, tough to see, tough to make, tough to understand. The tougher they were the more beautiful they became. It was our language.

HCB wanted to talk about this “tough language” which precedes thought. In my opinion, pulling this quote from the context of HCB’s discussion and dropping it into today’s photographic world puts it in an entirely different and incorrect context, at which HCB would be appalled. Of course, doing so has meant we are talking “tough language,” so he would probably approve. </soap box>


So I’m teaching a workshop at TCDL in 2013 and I’ve pasted my prospectus below. I’ll start fleshing this out more next week as I’m giving a practice lecture to a digital curation course here on campus before the end of the month. One thing I know I’ll need to add is a simple introduction to loops, e.g. “for i in *; do…”.

My workshop will introduce command-line image manipulation to its participants using the open source software ImageMagick. While many are familiar with image manipulation software using graphical user interfaces, or GUIs, such as Adobe Photoshop, GIMP, and Pixelmator, there is less awareness of programs built on command-line interfaces, or CLIs. In a CLI, the user issues commands to the program by typing lines of text. This workshop is targeted at first-time and novice CLI users and is structured to provide a strong foundation of skills to build upon.

Topics covered will include:

  • History of the command-line interface
  • Navigating folder hierarchies on the command-line
  • The “find” command
  • Installing ImageMagick in Linux, Mac OS X, and Windows XP/Vista/7
  • Creating an image by typing a single line of text
  • Convert versus mogrify
  • Image manipulations possible in ImageMagick
  • Creating web derivatives from master image files
  • The Identify command
  • Applying and removing image compression

Due to the nature of this workshop, participants will need to bring a laptop on which they have Administrative rights.

Building on Part 1 and Part 2, below are all of the special pagination codes currently in use at UNT for MagickNumbering the individual scans of book pages. Each code is followed by the text displayed in the drop-down list of our page-turning system, and then by a possible filename. These are only examples of possible file names; the actual file name would be whatever is correct to represent the book digitized.

####00fc - Front Cover - 000100fc.tif
####00fi - Front Inside - 000200fi.tif
####00tp - Title Page - 000500tp.tif
####r### - <Roman numeral ###> - ####r###.tif
####r001 - I - 0005r001.tif
####r004 - IV - 0008r004.tif
####pt01 - Plate 1* - 0013pt01.tif
####0000 - <Blank in the drop-down> - 00360000.tif
####00bi - Back Inside - 003700bi.tif
####00bc - Back Cover - 003800bc.tif

* Plates are only notated with the pt## code if the plate is numbered in the book. Unnumbered plates are named using the blank ####0000 code.
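
The list above can be read as a lookup from pagination code to drop-down text. Here is a minimal sketch of that lookup in Python. It is only my illustration of the mapping, not the code our page-turning system actually runs; the names SPECIAL_LABELS, to_roman, and display_label are hypothetical, and the fallback of stripping leading zeroes from ordinary page numbers is an assumption.

```python
import re

# Fixed special codes and their drop-down text, copied from the list above.
SPECIAL_LABELS = {
    "00fc": "Front Cover",
    "00fi": "Front Inside",
    "00tp": "Title Page",
    "00bi": "Back Inside",
    "00bc": "Back Cover",
    "0000": "",  # blank in the drop-down
}


def to_roman(n):
    """Convert a positive integer to an uppercase Roman numeral."""
    pairs = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    out = []
    for value, numeral in pairs:
        while n >= value:
            out.append(numeral)
            n -= value
    return "".join(out)


def display_label(code):
    """Return the drop-down text for a 4-character pagination code.

    Hypothetical helper: it mirrors the list in this post, not the real system.
    """
    if code in SPECIAL_LABELS:
        return SPECIAL_LABELS[code]
    match = re.fullmatch(r"r(\d{3})", code)
    if match:
        return to_roman(int(match.group(1)))
    match = re.fullmatch(r"pt(\d{2})", code)
    if match:
        return "Plate %d" % int(match.group(1))
    # Assumption: an ordinary page number is shown without its leading zeroes.
    return code.lstrip("0")
```

For example, display_label("r004") gives the same "IV" shown in the list, and display_label("0000") gives an empty string for the blank entry.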

In the last update of the MagickNumbering system, Roman numerals were made much easier to deal with using the r### pagination code. Previously, Roman numeral filenames had to be hand-entered and this functionality still lives in the system.

####000i.tif - ####r001.tif - I
####00iv.tif - ####r004.tif - IV
####xviii.tif - ####r018.tif - XVIII
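
As a rough sketch, converting a legacy hand-entered Roman numeral pagination into the newer r### code could look like the Python below. The function names are hypothetical and this is not taken from the system itself; it assumes the legacy pagination is lowercase Roman letters padded on the left with zeroes, as in the three examples above.

```python
import re

ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(numeral):
    """Convert a Roman numeral such as 'xviii' to an integer."""
    total = 0
    previous = 0
    # Walk right to left: a smaller value before a larger one is subtracted.
    for ch in reversed(numeral.lower()):
        value = ROMAN_VALUES[ch]
        if value < previous:
            total -= value
        else:
            total += value
            previous = value
    return total


def legacy_to_r_code(pagination):
    """Turn a legacy pagination ('000i', '00iv', 'xviii') into an r### code."""
    numeral = pagination.lstrip("0")
    if not re.fullmatch(r"[ivxlcdm]+", numeral):
        raise ValueError("not a legacy Roman numeral pagination: %r" % pagination)
    value = roman_to_int(numeral)
    if value > 999:
        raise ValueError("r### only has room for three digits: %r" % pagination)
    return "r%03d" % value
```

Only the pagination part is passed in; the four sequence digits in front of it stay exactly as they were.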

One interesting feature of the system is that it does not break on the 9-character file name needed to describe XVIII using the legacy Roman numeral pagination code. This is due to some intelligent parsing of MagickNumbers by the system. The first 4 digits of the (ostensibly) 8-digit code are grabbed for the sequence. The system ignores any leading zeroes, but displays anything else. Of course, the special pagination codes display something different than entered, but I have yet to encounter a page paginated “fc” or “tp”. I can see r100 as a possible pagination (gov docs are often funky), but in such a case I would probably resort to another feature in the system: it will also accept a text file with the pagination of each page written on a new line. This isn’t as useful as MagickNumbering the scans themselves because we use MagickNumbering as a quality control (QC) step. Performing a QC check of the images against a separate text file is an added headache when visually inspecting the scans to make sure the one-thousand-and-fourteenth item really is page nine-hundred-eighty-three, or 10140983.tif.
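
To make the parsing described above concrete, here is a small Python sketch. It is my guess at the general idea, not the system's actual code: take the first four digits as the sequence, treat whatever follows as the pagination, and drop leading zeroes for display. The function names and the seq_width parameter are hypothetical.

```python
from pathlib import Path


def split_magick_number(filename, seq_width=4):
    """Split a MagickNumber file name into (sequence, pagination).

    The sequence is the first seq_width digits; everything after that,
    however long, is kept as the pagination string.
    """
    stem = Path(filename).stem
    sequence, pagination = stem[:seq_width], stem[seq_width:]
    if len(sequence) != seq_width or not sequence.isdigit():
        raise ValueError("no %d-digit sequence in %r" % (seq_width, filename))
    return int(sequence), pagination


def shown_pagination(pagination):
    """Assumption: leading zeroes are dropped and the rest is displayed as-is."""
    return pagination.lstrip("0")
```

Because the pagination is simply "the rest of the name", a 9-character legacy name with xviii after the sequence splits just as cleanly as 10140983.tif, which comes back as sequence 1014 and pagination 0983.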

This leads to another quandary, which is what happens if you have a book with over ten thousand pages (unlikely) or a serial set that is paginated to over ten thousand pages (highly likely with gov docs!). MagickNumbers easily scale to handle this by switching from an 8-character file name comprising two 4-character codes to a 10-character file name comprising two 5-character codes. Such a code handles a sequence and pagination up to 99,999. I have yet to deal with anything requiring a 12-character file name, but that would be the next step up.
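
The scaling is just a matter of zero-padding both halves to a wider field. A minimal Python sketch follows, assuming plain numeric pagination; magick_number and its width parameter are names I made up for illustration.

```python
def magick_number(sequence, page, width=4, ext=".tif"):
    """Build a file name from a sequence number and a numeric page number.

    width=4 gives the usual 8-character name, width=5 the 10-character one.
    """
    limit = 10 ** width - 1
    for label, value in (("sequence", sequence), ("page", page)):
        if not 0 <= value <= limit:
            raise ValueError("%s %d does not fit in %d digits" % (label, value, width))
    return "%0*d%0*d%s" % (width, sequence, width, page, ext)
```

With the default width, magick_number(1014, 983) gives 10140983.tif; passing width=5 pads each half to five digits, and a value over the limit raises an error instead of silently producing a longer name.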

There are a few codes I think could help expand the system and make the drop-down page lists more useful. The first would be tc for “Table of Contents”. In many books, the table of contents is numbered and, like an index, not referenced as often when you have full-text search capabilities, but many gov docs don’t number their tables of contents. The other is something to delineate a blank page from an unpaginated page with content, such as an illustration. This can get confusing for the user quickly, though, and there are edge cases like gov docs with “page intentionally left blank” printed on them: is this a blank page or a page with content?

In part 4, I’ll mention some common mistakes I see people making and some ideas for programmatically fiddling with MagickNumbers. I also need to fully flesh out a workshop on manipulating images from the command-line using ImageMagick and am thinking this is a perfect place to do so.

Continuing from part 1, such a simple example as 12 images paginated 1 through 12 introduces MagickNumbers, but does not showcase the extensible nature of the file naming standard. The next example shows how the files for a hard cover book with end pages, a title page, and 14 pages (4 Roman numerals, 10 numbered), totaling 22 scans, are named. Each file name is followed by the pagination info displayed in our page-turning system. The pagination info is derived from the 4-character pagination code.

000100fc.tif - Front Cover
000200fi.tif - Front Inside
00030000.tif - 
00040000.tif - 
000500tp.tif - Title Page
0006r002.tif - II
0007r003.tif - III
0008r004.tif - IV
00090001.tif - 1
00100002.tif - 2
00110003.tif - 3
00120004.tif - 4
00130005.tif - 5
00140006.tif - 6
00150007.tif - 7
00160008.tif - 8
00170009.tif - 9
00180010.tif - 10
00190000.tif -
00200000.tif -
002100bi.tif - Back Inside
002200bc.tif - Back Cover

The following new pagination codes are introduced above:

####00fc - Front Cover
####00fi - Front Inside cover
####0000 - unnumbered page that displays no information
           in the drop-down list of page numbers
####00tp - Title Page
####r### - Roman numeral
    r002 - II
    r003 - III
    r004 - IV
####00bi - Back Inside cover
####00bc - Back Cover

Note in the example the Title Page is numbered Roman numeral I in the book, but instead of ####r001 the file has the special pagination code ####00tp. This is because we assume most users prefer having the ability to jump directly to the title page of a book.
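
Since the sequence half of each name is just the position of the scan, the 22 file names above can be generated from an ordered list of pagination codes. This Python sketch shows the idea; number_scans is a hypothetical helper, not part of any tool we currently use.

```python
def number_scans(pagination_codes, ext=".tif"):
    """Prefix each 4-character pagination code with its 1-based sequence."""
    names = []
    for sequence, code in enumerate(pagination_codes, start=1):
        if len(code) != 4:
            raise ValueError("pagination code must be 4 characters: %r" % code)
        names.append("%04d%s%s" % (sequence, code, ext))
    return names


# The hard cover book from the example above, in scanning order.
book = (
    ["00fc", "00fi", "0000", "0000", "00tp"]   # covers, end pages, title page
    + ["r%03d" % n for n in range(2, 5)]       # Roman numerals II through IV
    + ["%04d" % n for n in range(1, 11)]       # pages 1 through 10
    + ["0000", "0000", "00bi", "00bc"]         # end pages, back covers
)

names = number_scans(book)
for name in names:
    print(name)
```

Note that the title page is entered as 00tp rather than r001, matching the choice explained above.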

Still to come RE: MagickNumbers:

  • a complete list of our current pagination codes
  • possible additions to the pagination codes I have been considering suggesting
  • common problems easily solved with how MagickNumbers are created and used
  • common mistakes MagickNumber novices make which reduce file naming consistency
  • programmatic possibilities for writing and validating MagickNumbers

Setting consistent and intelligent file naming standards is always smart in any digitization project, but books raise unique problems. In most cases, books have an overall sequence and an internal pagination and at UNT both must be delineated before a user can access the item online. The overall sequence may or may not contain the front and back covers, inside covers, end pages, title page[s], blank pages, and many other possibilities in addition to pages actually paginated in the book. How does one begin organizing such a morass of items?

At UNT we use a system called MagickNumbers that utilizes an 8-character code to notate both the sequence, or order, and pagination of digitized items.* The first 4 digits set the sequence of the items while the last 4 characters denote the pagination code used by our page-turning system to display the pagination online.

SSSSPPPP.tif, where SSSS are the sequence digits and PPPP is the pagination code.

The example below depicts a simple scenario of 12 TIFF scans of 12 pages paginated 1 through 12. Each file name is followed by the pagination info displayed in our page-turning system:

00010001.tif -  1
00020002.tif -  2
00030003.tif -  3
00040004.tif -  4
00050005.tif -  5
00060006.tif -  6
00070007.tif -  7
00080008.tif -  8
00090009.tif -  9
00100010.tif - 10
00110011.tif - 11
00120012.tif - 12
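
A listing like this is easy to sanity-check by machine. Below is a rough Python sketch of such a check, under two assumptions of mine: every name is strictly in the 8-character SSSSPPPP.tif form (so the longer legacy names would be flagged), and the sequence should run 1, 2, 3 and so on with no gaps or repeats. NAME_PATTERN and check_sequence are hypothetical names.

```python
import re

# Four sequence digits, four pagination characters, then the extension.
NAME_PATTERN = re.compile(r"(\d{4})([0-9a-z]{4})\.tif")


def check_sequence(filenames):
    """Return a list of problems; an empty list means the names look consistent."""
    problems = []
    sequences = []
    for name in filenames:
        match = NAME_PATTERN.fullmatch(name)
        if match is None:
            problems.append("%s: not in SSSSPPPP.tif form" % name)
            continue
        sequences.append(int(match.group(1)))
    # Report only the first place where the sorted sequence departs from 1, 2, 3...
    for expected, found in enumerate(sorted(sequences), start=1):
        if expected != found:
            problems.append(
                "expected sequence %04d but found %04d (gap or duplicate)"
                % (expected, found)
            )
            break
    return problems
```

Run against the 12 names above it returns an empty list. It says nothing about whether the pagination half is right, which still takes a visual check against the scans.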

There’s more to come, but I was sidetracked by starting a script to aid in MagickNumbering. We currently use the commercial program ACDSee Photo Manager to MagickNumber our book scans, and being able to provide an open source alternative may help others get on the road to file naming nirvana.

* MagickNumbers are used when we want to display the item as a series of “pages” as opposed to a more generalized “series”, such as the 2 images comprising the front and back of a scanned photograph.