Words that count

Sorry I’ve been offline again for a long time. Promise to keep on track now I’m back from la France profonde.

Let’s start back with a spot of fun – at least for English-speakers.

Don’t expect a superior lexeme counter for your translations if you go to WordCount. It’s actually an ‘artistic experiment’ carried out by Jonathan Harris, a designer with Number27. He presents the “86,800 most frequently used English words (from proper nouns to prepositions), ranked in order of commonality” along a line, in diminishing font size from left to right. What does he mean by commonality here? Dictionaries suggest it means “The possession, along with another or others, of a certain attribute or set of attributes: a political movement’s commonality of purpose, or A shared feature or attribute”. For WordCount this presumably means that the words share the feature that each one is “scaled to reflect its frequency relative to the words that precede and follow it”.

What you can do with this artistic toy is either find out how “frequent” a word is (e.g. the last word in the whole list is, oddly, conquistador, ahead of items like conflas and fwag), or enter any number up to 86,800 and see which word has that position in the rankings. Harris takes the data from the 100-million-word British National Corpus, which covers a very wide range of spoken and written sources. He plans to use WordCount to track “word usage within any desired text, website, and eventually the entire Internet.”
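For the curious, the two lookups (word to rank, rank to word) are easy to imitate on a small scale. What follows is only a toy sketch in Python, not how WordCount itself is built: the function names `rank_words`, `rank_of` and `word_at` are made up for illustration, the sample sentence stands in for a real corpus, and breaking ties alphabetically is my own assumption.

```python
import re
from collections import Counter


def rank_words(text):
    """Return the distinct words of `text`, most frequent first (rank 1)."""
    # Assumption: a "word" is any run of letters or apostrophes, lowercased.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Sort by descending count; ties are broken alphabetically (an assumption).
    return sorted(counts, key=lambda w: (-counts[w], w))


def rank_of(word, ranking):
    """Return the 1-based rank of `word`; raises ValueError if it is absent."""
    return ranking.index(word.lower()) + 1


def word_at(rank, ranking):
    """Return the word holding the given 1-based rank."""
    return ranking[rank - 1]


# A stand-in for a corpus; a real one would be millions of words long.
sample = "the cat and the dog and the bird"
ranking = rank_words(sample)

print(rank_of("and", ranking))  # rank of a given word
print(word_at(1, ranking))      # word at a given rank
```

With a real corpus in place of the sample sentence, the same two functions would answer both kinds of question, although the rankings would of course depend entirely on the texts you feed in.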

But what people really like about WordCount, apparently, is the fact that the arbitrary word line of relative frequencies sometimes generates surreal phrases: e.g. around translate you get “pussy patting translates geomorphology impasse”. Quite. And around globalization, you find “BASF Hokkaido repugnance globalization Sunderby”. Odd stuff. In other words, we keep on seeing meaning where the machine simply lines up arbitrary forms.

Andrew Joscelyne
European, a language technology industry watcher since Electric Word was first published, sometime journalist, consultant, market analyst and animateur of projects. Interested in technologies for augmenting human intellectual endeavour, multilingual méssage, the history of language machines, the future of translation, and the life of the digital mindset.



MultiLingual Media LLC