Sorry I’ve been offline again for a long time. Promise to keep on track now I’m back from la France profonde.
Let’s start back with a spot of fun, at least for English-speakers.
Don’t expect a superior lexeme counter for your translations if you go to WordCount. It’s actually an ‘artistic experiment’ carried out by Jonathan Harris, a designer with Number27. He presents the “86,800 most frequently used English words (from proper nouns to prepositions), ranked in order of commonality” along a line, in diminishing font size from left to right. What does he mean by commonality here? Dictionaries suggest it means “The possession, along with another or others, of a certain attribute or set of attributes: a political movement’s commonality of purpose”, or “A shared feature or attribute”. For WordCount this presumably means that the words share the feature of each being “scaled to reflect its frequency relative to the words that precede and follow it”.
What you can do with this artistic toy is either find out how “frequent” a word is (e.g. the last word in the whole list is, oddly, conquistador, ahead of items like conflas and fwag), or enter any number up to 86,800 and see which word has that position in the rankings. Harris takes the data from the 100-million-word British National Corpus, which covers a very wide range of spoken and written sources. He plans to use WordCount to track “word usage within any desired text, website, and eventually the entire Internet.”
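For the curious, the two lookups described above (word to rank, rank to word) are easy to sketch in Python. This is only a toy illustration, not Harris's code: the function names `build_ranking`, `rank_of` and `word_at` are made up for the example, the tie-breaking rule is an assumption, and a ten-word sentence stands in for the British National Corpus.

```python
from collections import Counter


def build_ranking(tokens):
    """Return the distinct words ordered from most to least frequent."""
    counts = Counter(tokens)
    # Assumption: ties are broken alphabetically so the order is deterministic.
    return sorted(counts, key=lambda word: (-counts[word], word))


def rank_of(ranking, word):
    """Return the 1-based rank of a word, or None if it is not listed."""
    try:
        return ranking.index(word) + 1
    except ValueError:
        return None


def word_at(ranking, rank):
    """Return the word at a 1-based rank, or None if the rank is out of range."""
    if 1 <= rank <= len(ranking):
        return ranking[rank - 1]
    return None


# Hypothetical stand-in for a real corpus.
sample = "the cat sat on the mat and the cat slept".split()
ranking = build_ranking(sample)

print(rank_of(ranking, "cat"))   # position of a given word
print(word_at(ranking, 1))       # word at a given position
```

A real corpus would want a dictionary from word to rank rather than `list.index`, which scans the list on every lookup, but for a toy the list keeps the two directions of lookup easy to see.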
But what people really like about WordCount, apparently, is the fact that the arbitrary word line of relative frequencies sometimes generates surreal phrases: e.g. around translate you get “pussy patting translates geomorphology impasse”. Quite. And around globalization, you find “BASF Hokkaido repugnance globalization Sunderby”. Odd stuff. In other words, we keep on seeing meaning where the machine simply lines up arbitrary forms.