True to type
I read a report in Le Monde about how the French cryptologist David Naccache managed to discover an intentionally censored (blacked out) word in a CIA print document released by the White House on April 10. The phrase was “operative told an XXXXXXX service…”.
First they used OCR (check out Simson Garfinkel’s recent encomium of the virtues of character recognition technology) to identify the font used, since the font determines how many characters fit in a given length, here the 16 mm of the blacked-out word. Luckily the font, Arial, was proportional rather than monospace, which meant that the letter ‘i’ took up less space than ‘n’. So they used a dictionary to list the possible words (only 1,530!) that are 16 mm wide. Since the target word came after the string ‘an’, it had to begin with a vowel, which limited the possibilities to 346 nouns and adjectives. Of these, only 7 made possible sense in context (Ukrainian, uninvited, unofficial, incursive, Egyptian, indebted and Ugandan). Given extra-textual circumstances, ‘Egyptian’ was chosen as the most likely candidate.
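The filtering steps described above can be sketched roughly as follows. This is only an illustration, not the method Naccache actually used: the letter widths are invented three-level values in arbitrary units (not real Arial metrics), upper and lower case are treated alike, and the names `glyph_width`, `rendered_width` and `candidates` are hypothetical. The "starts with a vowel letter" test is also a crude stand-in for the grammatical constraint imposed by ‘an’.

```python
# Hypothetical width classes in arbitrary units; NOT real Arial metrics.
NARROW = set("ijlt")
WIDE = set("mw")


def glyph_width(ch):
    """Return an assumed advance width for one letter (case is ignored)."""
    ch = ch.lower()
    if ch in NARROW:
        return 1.0
    if ch in WIDE:
        return 3.0
    return 2.0


def rendered_width(word):
    """Sum the assumed widths of the letters in a word."""
    return sum(glyph_width(ch) for ch in word)


def candidates(words, target_width, tolerance=0.5, after_an=True):
    """Keep words whose width is within tolerance of the blacked-out span.

    If after_an is True, also require a leading vowel letter, a rough
    approximation of the constraint imposed by a preceding 'an'.
    """
    kept = []
    for word in words:
        if not word:
            continue
        if abs(rendered_width(word) - target_width) > tolerance:
            continue
        if after_an and word[0].lower() not in "aeiou":
            continue
        kept.append(word)
    return kept


# Toy word list; a real run would load a full dictionary instead.
sample = ["Egyptian", "Ugandan", "Russian", "minimum", "ill"]
matches = candidates(sample, target_width=14.0)
```

With real font metrics in place of the invented table, the same two filters (width, then the word before the gap) would do the mechanical part of the narrowing; picking among the survivors still takes a human reading of the context.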
None of this deciphering was automated, of course, and the actual decoding was hardly earth-shattering. But there’s obviously still semantic mileage in the formal properties of fonts.