Terminology Glosses: Semantic painting

In languages, a lot has to do with labeling. We label concepts by naming them and, ideally, each label uniquely identifies one referent. This simple mechanism is summarized visually in the semantic triangle.

As depicted by C.K. Ogden and I.A. Richards in 1923 in The Meaning of Meaning, the semantic triangle (see above) illustrates the relationship between any given concept (thought or reference), the corresponding object (referent), and the word or any other representation (symbol) we use to express it. The keyword here is semantic, a qualifier that entered English through French from the Greek sēma (sign) to express a connection with meaning.

Recently, the term semantic painting captured my attention. In this adjective-noun combination, the first constituent relates to meaning, while the second indicates the act of applying color by using different pigments, media, and tools. By painting, however, we create a depiction (symbol) representing the concept we want to express. So from an exclusively linguistic standpoint, the use of semantic and painting together is somewhat redundant. From a terminology management perspective, the term well deserves a place in our ideal termbase, where we will record it as a compound noun whose meaning goes beyond the sum of its individual components.

When creating an entry for semantic painting, we will clearly state that it belongs to machine learning, thus defining its domain of occurrence. The term refers to a method used to help computers learn associatively. In “Semantic Paint Makes Real-World 3D Labelling Child’s Play,” David Amerland summarizes it as the use “of color recognition to create object segmentation in the real world in real time.” The segmentation is obtained by breaking the environment into classes, such as chair, floor, and so on. The authors of “SemanticPaint: Interactive Segmentation and Learning of 3D Worlds” explain that “users interact physically with the real world touching objects and using voice commands to assign them appropriate labels.” Then, thanks to specific algorithms, computers refine what they learn, adding new layers of meaning.

Generally speaking, though, if a term is somewhat dim from a linguistic point of view but vibrant in terminology management, then we are faced with a disconnect somewhere. Consider for a moment the adjective semantic and its corresponding noun semantics. Their use in IT is not new, and it is actually rather diversified. Webopedia notes that in computer programming the term is “frequently used to refer to the meaning of an instruction as opposed to its format.”

In our ideal termbase, we use metadata to delineate the boundaries between domains, and in so doing we compose a detailed mosaic of the various senses at a given point in time, in what is called a synchronic perspective. The coexistence of several related meanings (all sharing a common origin) in a single term is technically called polysemy. Polysemic terms haunt every termbase and daunt terminologists. In the language industry and in localization, we are very familiar with polysemy, and in terminology management we can say without fear that the better the polysemic terms are handled, the higher the quality of the termbase.
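To make the role of domain metadata more concrete, here is a minimal sketch of how such a termbase might be modeled. It is only an illustration under stated assumptions: the names TermEntry, Termbase, and build_demo_termbase, the field layout, and the sample definitions (loose paraphrases of the senses discussed in this column) are all hypothetical and do not correspond to any particular terminology management tool.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TermEntry:
    """One sense of a term, scoped to a single domain (hypothetical schema)."""
    term: str
    domain: str
    definition: str
    part_of_speech: str = "noun"


class Termbase:
    """Toy in-memory termbase that indexes entries by lowercased term."""

    def __init__(self):
        self._by_term = defaultdict(list)

    def add(self, entry):
        self._by_term[entry.term.lower()].append(entry)

    def senses(self, term, domain=None):
        """Return all senses of a term, optionally limited to one domain."""
        entries = self._by_term.get(term.lower(), [])
        if domain is None:
            return list(entries)
        return [e for e in entries if e.domain == domain]

    def polysemous_terms(self):
        """Return terms recorded in more than one domain, sorted by name."""
        return sorted(
            term
            for term, entries in self._by_term.items()
            if len({e.domain for e in entries}) > 1
        )


def build_demo_termbase():
    """Populate a termbase with illustrative (assumed) sample entries."""
    tb = Termbase()
    tb.add(TermEntry("semantic", "linguistics",
                     "relating to meaning in language", "adjective"))
    tb.add(TermEntry("semantic", "computer programming",
                     "relating to the meaning of an instruction "
                     "as opposed to its format", "adjective"))
    tb.add(TermEntry("semantic painting", "machine learning",
                     "interactive method for labeling and segmenting "
                     "objects in a 3D scene"))
    return tb


if __name__ == "__main__":
    tb = build_demo_termbase()
    for term in tb.polysemous_terms():
        for sense in tb.senses(term):
            print(f"{sense.term} [{sense.domain}]: {sense.definition}")
```

Keeping one record per sense, each tagged with its domain, is one simple way to let a termbase flag polysemic terms automatically instead of relying on the terminologist's memory; a real termbase would carry more metadata, such as sources, dates, and usage notes.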

But there is more. Looking at the evolution of the qualifier semantic and, in particular, at how it is used in semantic painting within machine learning and artificial intelligence, it becomes evident that the labeling and classifying tasks are no longer performed solely to help humans categorize and make sense of the world, but rather to teach computers. Simple, very concrete objects are being labeled so that computers can learn to recognize them and become better at it through the application of corrective algorithms. Concurrently, the domain of use has moved from linguistics to programming to artificial intelligence. If we adopt a diachronic perspective (one that takes into account the passing of time), then we are looking at a paradigm shift and, with machine translation, our industry has already embraced this new course. In what other ways is the language industry as we know it today ready for this?

The original idea still underlies the whole approach: semantic painting has a lot to do with categorizing and labeling. Even though the documentation is still limited, there are sporadic examples of multilingual semantic labeling projects in which the labeling work was occasionally completed with the help of a translator or, in some cases, machine translation. In a video on computer vision titled “How we teach computers to understand pictures,” Fei-Fei Li, associate professor at Stanford University, describes how her team built a database of 15 million photos (ImageNet) to teach computers to understand pictures. For the labels, they used crowdsourcing through the Amazon Mechanical Turk platform and the lexical resources of WordNet, developed at Princeton University. At the other end of the spectrum, if we extend the idea of labeling to that of creating structured, methodological ontologies, moving from a lexicographic (semasiological) to a terminological (onomasiological) approach, new openings for cross-functional collaboration involving professional linguists and terminologists would be desirable.

As for machine translation, its effects are already clearly visible to those translators and localizers who are now asked to post-edit machine-translated texts. However, the initial fear for the profession is somewhat mitigated by the statistics. David Rumsey, president of the American Translators Association, recently told CNBC reporter Kate Rogers in “Where the Jobs Are” that “the opportunities for people with advanced language skills will continue to grow sharply.” According to the Bureau of Labor Statistics, the number of translators will increase by 29% through 2024, after doubling in the last seven years thanks to the effects of globalization. In the same press report, one major player declared that for complex tasks it is still better to have human translators; technology just helps speed up the translation process. If today’s statistical forecasts and conclusions from the field hold true, the next few years may still be a matter of designing innovative landscapes that complement consolidated practices rather than erasing them completely.