Text analytics

April 30, 2004

Bernard Huber has recently published Textalyser, a web server with a French-English interface that analyses any input text and returns statistical data about its word tokens. This sort of information is useful for pricing translations, sizing website content and anticipating difficulties in text-to-speech conversion for a text’s ‘longest’ words.
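To give a rough idea of what word-token statistics of this kind involve, here is a minimal Python sketch. It is not Textalyser's actual method, which the post does not describe; the function name `token_stats`, its `top_n` parameter and the tokenisation rule (runs of letters, optionally joined by apostrophes) are all assumptions made for illustration.

```python
import re
from collections import Counter


def token_stats(text, top_n=5):
    """Return simple word-token statistics for a text (illustrative only)."""
    # Assumed tokenisation: lowercase, then take runs of letters
    # (accented letters included), optionally joined by an apostrophe.
    tokens = re.findall(r"[^\W\d_]+(?:['’][^\W\d_]+)*", text.lower())
    counts = Counter(tokens)
    # Longest distinct words first; ties are broken alphabetically.
    longest = sorted(set(tokens), key=lambda w: (-len(w), w))[:top_n]
    return {
        "token_count": len(tokens),   # running words, useful for pricing
        "type_count": len(counts),    # distinct word forms
        "most_common": counts.most_common(top_n),
        "longest_words": longest,     # candidates for text-to-speech trouble
    }
```

Calling `token_stats(some_text)` returns a dictionary holding the total word count, the number of distinct words, the most frequent words and the longest words, which are roughly the figures the uses mentioned above would draw on.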

What text workers need is a fast-reaction tool box including at least word analytics, a concordancer to see words in context, and term extraction capabilities, ideally accessible by clicking on any word in a text. We should be able to experience electronic words as portals to knowledge about their lexical dominion and their given instantiation in a document. But we cannot capture this ‘knowledge’ on a personal hard disc: what might look like a useful add-on to a word processing application actually needs to be web-based to benefit from richer, broader knowledge streams about words and language. Maybe the data that Textalyser generates about texts, for example, can itself be aggregated to provide a further level of useful statistics about web-wide textual practice. But you would probably need some sort of classificatory metadata about the semantic, rather than purely formal, content of the texts themselves to make this useful.
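The concordancer mentioned above can be sketched in a few lines of Python as a keyword-in-context listing. This is only an assumption about what such a tool might minimally do; the function name `concordance` and the `width` parameter (characters of context shown on each side) are hypothetical.

```python
import re


def concordance(text, keyword, width=30):
    """List each occurrence of keyword with `width` characters of context."""
    # Match the keyword as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines
```

A call such as `concordance(document, "word")` returns one line per occurrence, with the keyword bracketed in the middle. A web-based version could draw its context from many documents rather than the single text in hand.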
