Statistics as a medical translation specialization

By Luciana Cecilia Ramos August 13, 2012

In daily life, whether or not we have anything to do with the science of statistics, we often observe, probe, consult, research and form a hypothesis before reaching a decision, seeking the greatest possible benefit at the lowest risk and cost. We may do this without thinking about it, or we may be the type who makes lists of pros and cons.

Likewise, when a medical professional prescribes treatment to a patient or seeks approval from an authority, he or she should be reasonably certain of the efficacy of that therapy and know its anticipated pros and cons. Since the late 1990s, the trend has been that any medical procedure, whether for preventive, diagnostic, therapeutic, prognostic or rehabilitative purposes, should be defined based on its level of scientific evidence: so-called evidence-based medicine. In the medical arena, statistics plays a predominant role as a subfield of medical specialization, particularly in the branches of epidemiology and clinical trials.

Clinical trials are research studies that test the effect of new clinical approaches in humans. The investigator introduces and controls variables, and patients are randomly assigned to the different treatments being compared. Clinical trials attempt to answer scientific questions and search for the best practices in order to prevent, explore, diagnose or treat a disease or condition. Often, clinical trials compare new treatments against others already cleared by the US Food and Drug Administration (or any other relevant agency, depending on the geographical location) and available on the market. Each clinical trial is governed by a particular protocol or action plan, and statistics is a core component of data analysis and testing. It is not part of the hypothesis, as is often mistakenly believed, since hypotheses are formulated by the investigator based on his or her findings.

As a research method itself, the clinical trial should reflect the steps shown in Figure 1, and translators should be able to understand and identify each link of this chain to proactively spot the terminological focus germane to each step, and plan for research and documentation efficiently, and in advance.

In clinical trials, cases (or events) are usually complex and variable, and tend to be influenced by multiple causes and linked by diverse relationships. The role of science is to observe facts appropriately and identify consistent elements. The scientific method tells us whether or not a hypothesis should be rejected, and it comprises a set of principles and procedures for systematically yielding knowledge. There are two key words in the definition of science: knowledge and method. Knowledge refers to what is being searched for, and method refers to how knowledge is yielded.

In any research, it is essential to put an issue or hypothesis under consideration and to draw conclusions from observed cases derived from clinical data, diagnoses or treatments, using the appropriate method. Thus, the need to sift data through a scientific methodology and statistical analysis becomes apparent: it avoids misleading conclusions based on the investigator's wish for a particular result or on his or her subjective view of the case.

Hence, complex terms from statistics, sometimes opaque even to linguists, pour into our source texts of biomedical content. As linguists, we should value the relevance of this science in the sphere of clinical trials, and become aware of and interested in deepening our knowledge to make full sense of the information being conveyed. Many linguists truly versed in biomedical translation just aim at getting the correct rendering and grammatical form of such terms and inflecting them appropriately in their target language. Although this approach may work for some minor jobs, digesting the concept beneath the superficial term may pave the way toward language specialization.

When you, as a translator, embark upon researching concepts beyond target equivalents, you will learn not only a term but also its shades of meaning, its synonyms, the interchangeability of terms and additional related lexical components of the discipline, which in turn will build your strengths in the subject matter. Statistics has the advantage that its application in clinical trials is rather systematic; therefore, the time invested in your initial research will prove valuable for any further translation in the field of epidemiology or clinical research. We will cover some basic statistical terms as used in clinical trials, with their equivalents in universal Spanish. In Table 1 you can see statistical terms embedded as part of the medical text to translate into Spanish.

Obviously, the highlighted terms come from the science of statistics and are sometimes used vaguely in both source and target texts. Of course, looking them up in the appropriate bilingual glossary may guide the translator to their Spanish equivalent, but being aware of what the concept means, what calculations lie underneath or what specific shade of meaning the term has may give the translator a clearer idea of the course of the investigation or its findings. This would put him or her in a better position to render the meaning accurately and comprehensively rather than just delivering a word-for-word translation. We will dissect a few such terms for illustrative purposes.

For example, every ratio is a quotient (the result of the division of one number or quantity by another); its mathematical expression is c = a/b. In general, the term ratio is used in a broad sense, like an umbrella term encompassing the concepts of rate, proportion and percentage. However, in epidemiology or genetics, it may have a more constrained meaning, as in 1:3, expressing the relationship between one male and three females, or one hospitalized and three non-hospitalized patients. In Spanish, this should be translated as razón. Other terms are frequently used, such as relación, but ratio in this sense should never be translated as proporción. Why? Let’s elaborate a bit more on the concepts of proportion and percentage. A proportion (in Spanish, translated plainly as proporción) is a ratio in which the numerator is included in the denominator; it is part of the whole. A 1:3 ratio of males to females, for instance, corresponds to a proportion of one male in every four people. Then why is proportion so frequently, and correctly, translated as percentage (porcentaje)? Proportions are usually expressed as percentages, but for a proportion to become a percentage, it has to be multiplied by 100. This is common practice, since percentages give a clearer idea to both the specialized and the lay reader. So, when you’re aware of how the proportion has previously been expressed or will be shown in the text you are translating, you may be entitled to translate proportion as porcentaje in Spanish. If the proportion is one in every ten women, it is a tenth. If you multiply a tenth by 100, you get the percentage figure for that proportion, which is 10% of the women in that population.
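The distinction between ratio, proportion and percentage can be illustrated with a minimal Python sketch. The function names are hypothetical and chosen only for this illustration; the figures are the ones used in the paragraph above (a 1:3 ratio and one woman in every ten).

```python
from fractions import Fraction


def ratio(a, b):
    """Razón: quotient a/b of two separate groups (e.g. males to females)."""
    return Fraction(a, b)


def proportion(part, whole):
    """Proporción: a ratio whose numerator is included in the denominator."""
    if not 0 <= part <= whole:
        raise ValueError("part must lie between 0 and whole")
    return Fraction(part, whole)


def percentage(part, whole):
    """Porcentaje: the proportion multiplied by 100."""
    return float(proportion(part, whole) * 100)


# One male for every three females: a ratio of 1:3 ...
male_to_female = ratio(1, 3)
# ... but males make up one in four of the whole group (1 + 3 people).
male_proportion = proportion(1, 1 + 3)

# One in every ten women is a tenth, that is, 10%.
women_proportion = proportion(1, 10)
women_percentage = percentage(1, 10)

print(male_to_female, male_proportion, women_proportion, women_percentage)
```

The same pair of groups yields 1/3 as a ratio and 1/4 as a proportion, which is one way to see why razón and proporción are not interchangeable.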

In both epidemiology and demography, a rate (in Spanish, tasa) is defined as a relative measure expressing the frequency of a demographic event (marriages, births, deaths and so on) in a defined population during a specific period of time. However, it is worth insisting that even though all rates are quotients, not all quotients are rates. I dare say this term, as plain and simple as it may seem, leads to a great many inaccuracies. For instance, the expression case-fatality ratio should be translated as tasa de letalidad, which is the Spanish term for the proportion of recorded deaths among people suffering from a particular disease during a specific period of time. The translator should be careful not to use it interchangeably with fatality rate in English, which refers to the proportion of deaths recorded for a set of individuals affected by one single event (for instance, people involved in the same catastrophe). In Spanish, case-fatality ratio and fatality rate are translated the same way. From this example, tasa de letalidad, a quotient calculated from the number of deaths due to a particular disease, I will try to point out the differences between the Spanish terms índice (index) and tasa (rate), which also tend to be used interchangeably. A rate, in this field of specialization, is calculated by dividing the number of deaths due to a particular disease by the number of people who were suffering from that condition during, for example, the first half of 2011. To make that rate easy to understand, it is multiplied by a constant (such as 1,000 or 10,000, depending on the denominator) to avoid nonsensical figures such as 0.03 people. This can be illustrated as follows: if 15 deaths are recorded among 1,250 people suffering from breast cancer, the resulting quotient is 0.012; this value is then multiplied by 1,000 to obtain a rate expressed as 12 out of 1,000.
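The rate calculation in the example above can be reproduced with a short Python sketch. The function name rate_per and its parameters are assumptions made for illustration; the figures (15 deaths among 1,250 patients, multiplier 1,000) come from the text.

```python
def rate_per(events, population, multiplier=1000):
    """Tasa: events divided by the affected population, scaled by a constant."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing so that whole-number results stay exact.
    return events * multiplier / population


deaths = 15
patients = 1250

quotient = deaths / patients               # the raw quotient, about 0.012
case_fatality = rate_per(deaths, patients)  # 12 deaths per 1,000 patients

print(f"quotient: {quotient:.3f}")
print(f"tasa de letalidad: {case_fatality:g} per 1,000")
```

With a much larger denominator, a multiplier of 10,000 or 100,000 would serve the same purpose of producing a figure that reads as whole people.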

In this sense, rates (tasas) are not expressed as percentages, and this is a key concept the translator should understand. What about index? By definition, indexes (índices, in Spanish) are numerical indicators that rely on a formula usually based on two or more indicators. The index is the figure expressing the relation between a series of data, from which conclusions can be drawn. Index variation, not the index itself, is expressed as a percentage. Even though rates are the values most commonly used in medical research, once again, knowing the concept will lead you to the most accurate equivalent. Looking at an example of the price index will help you understand the math beneath the concept.

A pocket bilingual dictionary costs $80 in 2012, while in 2011 its price was $75. The price index is calculated by dividing 80 by 75 and multiplying the quotient by 100. Thus, the index is 106.7. To interpret and communicate the variation in a meaningful way, the operation 106.7 – 100 is done to explain that the price of the dictionary increased by 6.7% in 2012 compared to 2011. That 6.7% is not the index, but the variation, and definitely not a rate.
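The arithmetic of the dictionary example (80 divided by 75, multiplied by 100, then minus 100) can be checked with a minimal Python sketch. The function names are hypothetical; only the two prices are taken from the text.

```python
def price_index(current_price, base_price):
    """Índice: current price relative to the base-year price, times 100."""
    if base_price <= 0:
        raise ValueError("base_price must be positive")
    return current_price / base_price * 100


def variation_pct(index):
    """Variation of the index with respect to the base value of 100."""
    return index - 100


index_2012 = price_index(80, 75)       # 2012 price against the 2011 base
variation = variation_pct(index_2012)  # percentage change, not the index

print(f"index: {index_2012:.1f}")
print(f"variation: {variation:.1f}%")
```

Keeping the index and its variation as two separate values mirrors the terminological point: the index is a figure relative to a base of 100, and only its variation is read as a percentage.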

In the context of clinical trials, the term index drifts into other semantic areas that have nothing to do with quotients, as in the case of index group, which can be translated as grupo experimental, grupo indicador or grupo control, among other alternatives, depending on the type of research. This is nothing new: the more versed you are in the topic, the farther you get from error, so there is no excuse for not drilling down a bit more into statistics.

Even a term as simple and transparent as indicator (indicador in Spanish), a value that summarizes or reflects a particular aspect of a population at a certain place and at a certain moment, may require the translator to clarify it when he or she knows how the indicator is presented in the paper, since the most frequently used indicators are rates, ratios and proportions. This term is also used attributively in English, in phrases such as indicator variable, a synonym of dummy variable, both translatable as variable indicadora. Even when the term indicator is not explicit, the savvy linguist will be able to connect this concept with the relevant figures in the text.

The term outcome refers to the event, value or measure (in Spanish, acontecimiento, valoración, valor, medida) found in a subject or a therapeutic unit from the clinical trial, either during such trial or as a consequence of it. It is used to assess the safety and efficacy of the investigational treatment. In the context of clinical research, this term is commonly seen as outcome variable (or implicitly in the term endpoint), referring to the appraisal criterion (criterio de valoración), and it is usually combined in expressions such as primary outcome (criterio principal de valoración), surrogate outcome (criterio indirecto de valoración), to name a few. In other medical and pharmacological contexts, it may be translated into Spanish as desenlace, beneficio, consecuencia, efecto, evolución, puntuación, reacción adversa, acontecimiento adverso, respuesta and so on. As seen from these examples, this term is highly polysemic, demanding a shrewd eye. Table 2 shows some other terms whose definitions will empower the translator to convey a message clearly.

Mean, median and mode are three types of averages (promedios) and, in fact, it is not rare to see them translated simply as such. We cannot claim this to be incorrect, but it is certainly too vague for certain contexts. In the field of statistics there are several types of averages, and these are the most frequently used, particularly in the sphere of clinical trials and protocols. Some basic concepts that the translator should know about them are the following:

Mean (in Spanish, media) can be used as both an adjective and a noun. It is the one most frequently translated as average, and it is obtained by adding up all the figures in the set and dividing the sum by the number of figures. In statistics, it generally refers to the arithmetic mean (media aritmética), but the translator should bear in mind that there are several different statistical means: geometric mean (media geométrica), harmonic mean (media armónica) or quadratic mean (media cuadrática), to name a few. A frequent phrase from statistics in the context of clinical trials is mean dose (dosis media), which should not be mistaken for median dose.

Median (in Spanish, mediana) is also translated as centil 50 or porcentil 50 (please note that percentil is not correct in Spanish). It is a measure of central tendency; it reflects the value taken by the central item of the sample when ordered from the lowest value to the highest, that is, the value in the middle of the scale. To calculate the median, values should be sorted in ascending numeric order; when the number of values is even, the median is the mean of the two central values.

Mode (in Spanish, moda) is a measure of central tendency, representing the most frequent value. If there are no repeated values, there is no mode.

Range covers the stretch from the minimum to the maximum value taken by the variable. In Spanish, this concept can be perfectly described as amplitud de la variable, but the most frequent jargon equivalent is rango.
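The four measures defined above can be compared side by side with Python's standard statistics module. This is only a sketch: the list of doses is invented for illustration, and the helper modes_or_none is a hypothetical function that follows the convention stated above, under which a sample with no repeated values has no mode.

```python
import statistics
from collections import Counter


def modes_or_none(values):
    """Moda: most frequent value(s); empty list if nothing is repeated."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []
    return [value for value, count in counts.items() if count == top]


# Hypothetical doses in mg, invented for this illustration.
doses = [10, 20, 20, 30, 70]

mean_dose = statistics.mean(doses)      # media aritmética: 150 / 5
median_dose = statistics.median(doses)  # mediana: central value once sorted
mode_dose = modes_or_none(doses)        # moda: the most frequent value
dose_range = max(doses) - min(doses)    # rango: maximum minus minimum

# Other means mentioned in the text; for positive data they never
# exceed the arithmetic mean.
geometric = statistics.geometric_mean(doses)  # media geométrica
harmonic = statistics.harmonic_mean(doses)    # media armónica

print(mean_dose, median_dose, mode_dose, dose_range)
print(round(geometric, 2), round(harmonic, 2))
```

In this sample the mean dose (30) and the median dose (20) differ because one high value pulls the mean upward, which shows why dosis media and mediana cannot be swapped in a translation.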

Needless to say, translators are not usually fond of statistics, and after reading this article probably not a single one would dare think of taking up algebraic calculus. I do not mean we will have to do the estimates ourselves: firstly, because we are not paid to, and, secondly, because respectable researchers have been busy enough juggling them, figuratively speaking. But being aware that these terms are not interchangeable may help the translator convey the message clearly and accurately even when the source is not so explicit. In any case, if changing terminology or deviating from a glossary is not an appropriate step, the versed translator will be well informed enough to sense that there is too much play in the set of ideas presented by the author, and will therefore be able to ask a precise and straightforward question, increasing the chances of getting a relevant and useful reply from the content owner.

Even though concepts from statistics may seem too awkward at first sight, just as in any kind of learning experience, comfort comes as you practice. The appetite for real-life examples is the best way to start digesting them. The array of sciences involved in the field of biomedical translation that linguists face is vast and challenging — statistics is only one of those sciences.