Patent news

Just before Christmas, Microsoft Research discreetly filed a patent application for a data-driven translation system:

An adaptive machine translation service for improving the performance of a user’s automatic machine translation system is disclosed. A user submits a source document to an automatic translation system. The source document and at least a portion of an automatically generated translation are then transmitted to a reliable modification source (i.e., a human translator) for review and correction. Training material is generated automatically based on modifications made by the reliable source. The training material is sent back to the user together with the corrected translation. The user’s automatic translation system is adapted based on the training material, thereby enabling the translation system to become customized through the normal workflow of acquiring corrected translations from a reliable source.

But don’t give up on that project you had to beat the rest of them to market. Just this week, IBM announced that it would be freeing up 500 patents for use by the Open Source Software movement. Included in the list, I found these covering aspects of automatic translation:

US5644775 Method and system for facilitating language translation using string-formatting libraries

US5251130 Method and apparatus for facilitating contextual language translation within an interactive software application

US5640575 Method and apparatus of translation based on patterns

US5267156 Method for constructing a knowledge base, knowledge base system, machine translation method and system therefor

US6236958 Method and system for extracting pairs of multilingual terminology from an aligned multilingual text

Others include:

US5640487 Building scalable n-gram language models using maximum likelihood maximum entropy n-gram models

US5636291 Continuous parameter hidden Markov model approach to automatic handwriting recognition

US5220621 Character recognition system using the generalized Hough transformation and method

US6249605 Key character extraction and lexicon reduction for cursive text recognition

US6182115 Method and system for interactive sharing of text in a networked environment

US5678052 Methods and system for converting a text-based grammar to a compressed syntax diagram

US6311177 Accessing databases when viewing text on the web

US6216102 Natural language determination using partial words

Andrew Joscelyne
European, a language technology industry watcher since Electric Word was first published, sometime journalist, consultant, market analyst and animateur of projects. Interested in technologies for augmenting human intellectual endeavour, multilingual méssage, the history of language machines, the future of translation, and the life of the digital mindset.



MultiLingual Media LLC