Visualizing language

Visual Thesaurus is a fun, for-sale tool that represents synonymy (and antonymy) relations between English words as graph displays looking like Calder mobiles of interlinked word nodes. I haven’t examined it in detail, but you can click on each related word within a synonym graph to generate new graphs around your term, and also call up boxes with definitions and examples of usage for each term. The actual length of the line linking a word to a synonym is probably meaningful in itself (the longer the line the more ‘distant’ the semantic relationship with the core word or something?).
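To make the guess about line length concrete, here is a minimal sketch of how such a display could be modelled: a graph whose edges carry a 'semantic distance', plus a function that re-centres the view when you click a word. The words, the distance values and the names `GRAPH`, `neighbours` and `recentre` are all invented for illustration. None of this is taken from Visual Thesaurus, whose internals I haven't seen.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data: each word maps to its linked words and an invented
# 'semantic distance' (a longer line in the display means a larger number here).
SynonymGraph = Dict[str, Dict[str, float]]

GRAPH: SynonymGraph = {
    "happy": {"glad": 1.0, "content": 2.0, "fortunate": 3.0},
    "glad": {"happy": 1.0, "pleased": 1.5},
    "content": {"happy": 2.0, "satisfied": 1.0},
}


def neighbours(graph: SynonymGraph, word: str) -> List[Tuple[str, float]]:
    """Return the words linked to `word`, nearest first (ties broken by spelling)."""
    links = graph.get(word, {})
    return sorted(links.items(), key=lambda pair: (pair[1], pair[0]))


def recentre(graph: SynonymGraph, word: str) -> Tuple[str, List[Tuple[str, float]]]:
    """Simulate clicking a node: the clicked word becomes the new centre."""
    return word, neighbours(graph, word)


if __name__ == "__main__":
    centre, links = recentre(GRAPH, "happy")
    print(centre, links)
    # 'Click' on the nearest neighbour to redraw the graph around it.
    centre, links = recentre(GRAPH, links[0][0])
    print(centre, links)
```

A real tool would still need a layout step to turn those distances into line lengths on screen; the sketch only covers the data underneath.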

Fascinating though such a dynamic presentation is, I don’t personally find this sort of interface to language knowledge any more useful in practice than the standard listing of terms in a text-based thesaurus. It’s attractive to the eye, but what can you really do apart from click and watch? It doesn’t link to your documents, or drive a search engine. But its great virtue is to give a momentary illusion of language as a dynamic system of virtual relationships, not just a string of signs on a page. And this might open up interesting new directions for dynamically representing the meaning and style of texts themselves and their relations with each other in ‘knowledge space’. So here’s a wee riff on what you could do with better visualization tech.


Historically, the techniques available for ‘representing’ language (e.g. to teach it, to show how two languages or families of languages compare, or to demonstrate etymology or syntax) have boiled down to leveraging the affordances of:

. print (e.g. using special characters such as the IPA alphabet, and word lists and tables), and

. diagrams – trees to show syntactical structure or historical relations between languages, or transition networks to show relationships that underlie structures.

Phonetics (and now speech technology for pronunciation training) have benefited from spectrograms that show formant behavior for given pronunciations, but they hardly speak to the naïve inquirer. No one seems yet to have invented interesting digital visual models of a given natural language in its dynamic entirety, or of how humans might represent their linguistic knowledge.

Yet visualization techniques for displaying various sorts of knowledge are now going mainstream. You have search engines such as Mooter, Grokker, or The Brain that show search results as shimmering globes of categories or networks of nodes instead of text lists, or enterprise knowledge management tools that present content in terms of visual ‘maps’. Try this from xrefer’s new Research Mapper for a search on ‘machine translation’. There’s even a site that displays looping links between groups and singers according to the query you type in, and masses of other attempts to use geometry to replace text as the inherent interface to web links.

Now that there are reports of 3D screens about to reach the desktop, isn’t it time for someone to develop a few tools to expand the dynamic display space for language and textuality in general? They might not serve any pressing need for information collection or business process optimization, but they might play a role in teaching anything from writing to linguistics. I’m thinking of such visual applications as:

. showing dynamic ‘films’ of alternative parses of ambiguous sentences,

. demonstrating phonetic change over time in a fun way (e.g. the Great Vowel Shift in English),

. tracing etymologies as visual journeys through time, space and semantic fields,

. showing the translation process from the inside: animating how choices are made, how a rule-based or a data-driven MT system does its job,

. showing how a speech recognition device uses a database to map incoming acoustic entities onto language models and optimizes its choices,

. displaying ‘complex’ morphological processes in languages such as Russian or Arabic,

. and eventually, showing how a total stretch of speech can reveal its composition through the massive interaction of different types of linguistic phenomena, from prosody through syntax to semantics.

I would imagine being able to click on any link in any part of these displays to enter further into a particular dimension of the model in question, and in this way explore language itself as a system of systems. This kind of application of visualization technology might start as a teaching device, but there’s always someone out there who would hijack it and use it for something more adventurous such as a video game or even an ad. You could even offer it to translation clients who want to watch what happens to their documents rather like a FedEx user tracking their parcel through the geographical maze of the delivery process.

Andrew Joscelyne
European, a language technology industry watcher since Electric Word was first published, sometime journalist, consultant, market analyst and animateur of projects. Interested in technologies for augmenting human intellectual endeavour, multilingual méssage, the history of language machines, the future of translation, and the life of the digital mindset.
