Physics and the meaning of natural language text

The characteristics of natural language text meaning have many similarities to Newtonian physics, relativity theory and quantum theory. The mathematical models of physics can be utilized to develop analogous, although heavily modified, models for meaning.

The analogous models may provide insights into how text constructs meaning. They may also provide a format for meaning that is easily manipulated through discrete signal processing methodology.

To offer an overview of these models for the meaning of text, we will avoid the use of mathematical equations and instead describe the characteristics of physics and of the corresponding models for meaning.

The basics

Several basic concepts must be identified before the characteristics of either physics or the meaning of text can be addressed. These concepts will lay a foundation for how the analogous models are related.

The coordinate systems of physics: The concept of coordinate systems is used throughout all discussions of physics. We are all familiar with the three-dimensional coordinate system. It is composed of the orthogonal x, y and z coordinates that represent the horizontal, vertical and depth offsets from a point of origin.

Associated with the coordinate systems are offsets along the different axes. These offsets are continuous and are defined in units of measure that are appropriate for each specific physics problem.

Value at specific coordinates for physics: The coordinate system allows us to identify a specific point in the three-dimensional space. The mathematics used to model the physics problem will determine a numeric value associated with that point. That value may be dependent on the offsets along the axes used to identify the point, or be a completely independent calculation.

The coordinate systems of meaning: Similar to physics, we will utilize coordinate systems to identify the position of specific meanings of text.

Since the meaning of text is dependent on the order of text, a sequence number axis will always be utilized.

The most common measures of text and the meaning of text are the discrete units of words or phrases.

Sometimes these measures are combined with part of speech or phrase tags that indicate the grammatical function of the text. The choice of axes used to locate a meaning may be based on the specific word, phrase or the grammatical tags.

These choices have several disadvantages. If an axis is chosen to represent each specific word, the number of axes will be very large. Even if an axis is chosen to represent each of the smaller number of grammatical tags, the text might be associated with multiple, different grammatical tags. For example, an adverbial phrase is usually associated with an adverb phrase tag. However, an adverbial phrase sometimes acts as an adjective and should then be associated with an adjective phrase tag.

It is for this reason that most of the following discussion of coordinate systems will use the concept of composition units. These are defined as portions of text that can be assigned a unique composition unit tag. The definition of each composition unit tag is shown in Table 1.

These tags have been defined so that there are no situations where text might be associated with two different composition unit tags. Take the sentence “the man sitting on a bench wearing a red shirt waved wildly to the man in the street.” Each composition unit and the associated tag are shown in Table 2.

The composition unit cannot be associated with any other composition unit tags. The axes used to represent these different composition unit tags will be orthogonal to each other.

The combination of the different composition unit tags with the sequence number will result in a coordinate system of eight dimensions.
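As a rough illustration, a position in this coordinate system can be written as a sequence number paired with a single composition unit tag. The sketch below is an assumption, not the notation of the model itself: it supposes seven tag axes plus the sequence number, only the tag names “Item” and “Activity” appear in this article, and the remaining names are placeholders for the tags defined in Table 1.

```python
from dataclasses import dataclass

# Assumption: seven tag axes plus the sequence number give eight dimensions.
# "Item" and "Activity" are named in the article; the rest are placeholders.
TAGS = ("Item", "Activity", "Tag3", "Tag4", "Tag5", "Tag6", "Tag7")


@dataclass(frozen=True)
class Position:
    sequence: int  # order of the composition unit within the sentence
    tag: str       # the single composition unit tag assigned to it

    def as_vector(self):
        """Return [sequence, one-hot tag components]."""
        if self.tag not in TAGS:
            raise ValueError(f"unknown tag: {self.tag}")
        return [self.sequence] + [1 if t == self.tag else 0 for t in TAGS]


def dot(u, v):
    """Plain dot product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


first = Position(1, "Item")
second = Position(2, "Activity")
print(first.as_vector())
# The tag components of two different tags never overlap.
print(dot(first.as_vector()[1:], second.as_vector()[1:]))
```

Because each composition unit carries exactly one tag, the tag components of two different tags have a dot product of zero, which is one simple way to picture the orthogonality of the tag axes.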

Value at specific coordinates for meaning

We now have coordinate systems to identify the position of a meaning of text. We want to associate meaning with that position, but how will the value of meaning be determined?

Let’s use the phrase “alternative text” to represent the meaning. The original text cannot be used to define the meaning. That would be similar to the dictionary providing a definition of the word “building” as the repeated word “building.” Alternative text such as “a structure with a roof and walls” would be independent of the original text and provide an independent description of the meaning.

A simple dictionary lookup might be sufficient to provide this alternative text for individual words with single definitions. For text with multiple potential meanings, alternative text will be chosen that has the highest probability of matching the meaning of the original text.
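The selection of an alternative text can be sketched in a few lines. This is only an illustration: the second candidate and both probabilities are invented for the example, and the function name best_alternative is hypothetical.

```python
def best_alternative(candidates):
    """Return the alternative text with the highest probability."""
    return max(candidates, key=candidates.get)


# Hypothetical candidates for the word "building"; the probabilities are
# made up for illustration and are not measured values.
candidates = {
    "a structure with a roof and walls": 0.8,
    "the act of constructing something": 0.2,
}

print(best_alternative(candidates))
```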

Newtonian models

We will first describe the similarities between the Newtonian model of physics and the Newtonian model of the meaning of text.

Newtonian model of physics: This is a class of physics problems that studies large, but not massive, items moving at speeds that do not approach the speed of light. It is the study of the world around us and how we interact with it.

There are a variety of equations that relate force, mass and acceleration, velocity and speed or work and energy.

The major characteristic of these equations is that they are linear. If a force (F1) acts upon an object of mass (M1), the object will experience acceleration (A1).

If double the force (2*F1) acts upon the same object, the object will experience twice the acceleration (2*A1).

Newtonian model of the meaning of text: There are situations where the meaning of text is also linear. An example is shown in the sentence “Tim climbs the ladder.” The meaning of each word can be analyzed as in Table 3.

The meaning of the complete text is a simple combination of the meanings of each individual word. Unfortunately, this situation of linear meaning is not a common occurrence in natural language.
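A minimal sketch of this linear case follows. The alternative texts in LEXICON are hypothetical stand-ins for the entries of Table 3, and linear_meaning is an illustrative name. The point is only that the meaning of the whole sentence is the meanings of its parts placed side by side.

```python
# Hypothetical alternative texts; Table 3 holds the article's own analysis.
LEXICON = {
    "Tim": "a person named Tim",
    "climbs": "moves upward on",
    "the": "a specific",
    "ladder": "a physical structure used for climbing",
}


def linear_meaning(text):
    """Look up each word independently and keep the results in order."""
    words = text.rstrip(".").split()
    return [LEXICON[word] for word in words]


whole = linear_meaning("Tim climbs the ladder.")
parts = linear_meaning("Tim climbs") + linear_meaning("the ladder")
print(whole)
print(whole == parts)  # the whole equals the combination of its parts
```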

Relativity models

The special theory of relativity and the general theory of relativity of physics describe environments of massive proportions and involve speeds approaching the speed of light.

The meaning of text does not encounter such extreme physical environments. However, the concepts utilized for those environments can also be applied to the meaning of text. Let’s look at two concepts, relativity and curvature, for both spacetime and the domain of meaning.

Relativity model of physics: The special theory of relativity expanded the three-dimensional x, y, z coordinate system to include the additional fourth dimension of time (t).

This theory also recognized that the observation of a variety of attributes of physics had undergone a change. Measurements that had been universal to all observers in Newtonian physics were now perceived differently by observers in separate locations. Some of these measurements included time, velocity and distance.

Curvature and the relativity model of physics: The general theory of relativity recognized that spacetime is not a linear four-dimensional mathematical space. Spacetime is distorted, or curved, by the existence of objects with mass. The identification of this cause and effect allows a concise set of mathematical equations to model this phenomenon.

Relativity model of the meaning of text: The meaning of text also changes depending on the observer. The most extreme case is text that has a meaning in the original language, but has no meaning to an observer who does not understand that language.

A less extreme, but more common, example is “It bit him.” Readers of different documents may identify different meanings for this sentence. They may identify a different object for “It” and a different person for “him.” This is because the meaning of text is dependent on the types of information given from a preceding text, as seen in Table 4.

Curvature and the relativity model of the meaning of text: The meaning of text is often constructed in a nonlinear manner. This lack of linearity is an attribute of a curved domain of meaning. An example will be shown based on the linear meaning of text from the earlier example sentence “Tim climbs the ladder.” The sentence is now modified to become “Tim climbs the corporate ladder.”

The meanings of the individual words in both the linear example and the modified sentence are shown in Table 5.

We can see that the addition of the word “corporate” has not only added its own meaning to the sentence, but it has also changed the meaning conveyed by the word “ladder.” Instead of the meaning of a physical structure, the meaning of the word has changed to become “hierarchy or organization structure.”

We cannot simply aggregate the meanings of the individual words of this modified sentence to determine the meaning of the complete sentence. This sentence does not have a linear meaning. The meaning is nonlinear. This may be caused by imagery, analogy, metaphor, colloquialism or other reasons.

Another way to view this situation is that the coordinate systems used to map to the meaning are curved, sometimes abruptly. A linear coordinate system would map text to the linear combination of meanings of the individual text. Instead, the coordinate system representing the domain of meaning is curved and the text is mapped to a nonlinear meaning.
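The “corporate ladder” example can be sketched as a lookup in which one word shifts the alternative text chosen for another. Only the shifted meaning “hierarchy or organization structure” comes from the discussion above. The other alternative texts, the CONTEXT_SHIFTS table and the function name meaning are assumptions made for this illustration.

```python
# Hypothetical alternative texts for the individual words.
LEXICON = {
    "Tim": "a person named Tim",
    "climbs": "moves upward on",
    "the": "a specific",
    "corporate": "relating to a company",
    "ladder": "a physical structure used for climbing",
}

# (modifying word, modified word) -> replacement alternative text.
# The replacement for "ladder" is the one given in the article.
CONTEXT_SHIFTS = {
    ("corporate", "ladder"): "hierarchy or organization structure",
}


def meaning(sentence):
    """Look up each word, then let other words in the sentence shift it."""
    words = sentence.rstrip(".").split()
    result = []
    for word in words:
        alternative = LEXICON[word]
        for other in words:
            alternative = CONTEXT_SHIFTS.get((other, word), alternative)
        result.append((word, alternative))
    return result


print(meaning("Tim climbs the ladder.")[-1])
print(meaning("Tim climbs the corporate ladder.")[-1])
```

The word “ladder” receives a different alternative text in the second sentence even though its own lexicon entry is unchanged, so the meaning of the sentence is no longer a simple combination of independent lookups.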

These multiple, and not fully understood, causes of nonlinearity make the meaning of text a very different problem from the relativity model of physics. That theory identified the mass of an object as the single measurable cause of curvature and nonlinearity.

The relativity model of the meaning of text

Several new concepts for the relativity model of the meaning of text have been identified:

•The development of a sequence number and composition unit tag coordinate system.

•The use of “alternative text” to provide a value for the meaning of text.

•Similarities with the relativity model of physics.

•The identification of dependencies of meaning on information from previous text.

•The coordinate system representing the domain of meaning is curved.

These new concepts enable the identification of a single equation that represents the development of meaning as text is incrementally added to a sentence. This equation is comparable to the equation for the relativity model of physics.

The relativity model for physics has a single, quantified cause of the curvature of spacetime. The model is self-contained.

Unfortunately, without a consistent, well-defined cause of nonlinearity and curvature, the relativity model of meaning is not self-contained. It does provide a method to comprehensively map the meaning of text, however. This mapping can be used to build a repository of meaning.

Quantum models

The quantum theory of physics (also referred to as quantum mechanics) describes environments of minute proportions. The actions of matter in these smallest of environments are very different from the actions of matter in the massive proportions of relativity models. Because of this, the models for quantum physics are very different from those of the relativity model.

Again, the meaning of text does not encounter these extreme environments of quantum theory.

Instead of a completely different model being needed for the meaning of text, updates can be developed to the relativity model of the meaning of text. These updates will integrate the concepts found in the quantum models of physics.

Quantum model of physics: The quantum model of physics displays the following characteristics:

•Energy, momentum and other measurements are restricted to discrete values.

•Objects display characteristics of both discrete particles and waves.

•Particles can be represented by wave functions, which provide information about the probability of position, momentum and other physical properties of a particle.

Quantum model of the meaning of text: The quantum model of the meaning of text displays characteristics comparable to the quantum model of physics. They include:

•Text and meaning are restricted to discrete values (this has always been true).

•Meaning is represented by both a discrete probabilistic model and a wave function model.

•A wave function representing meaning provides information about the probability of each individual part of the meaning of text.

Both the discrete probabilistic model and the probabilistic wave function model will be addressed in this section.

Discrete probabilistic model of the meaning of text

Previously, the relativistic model of the meaning of text assumed that a single alternative text, or definition, would be identified to define the meaning of text. The alternative text that was chosen would have the highest probability of matching the meaning of the original text.

We will now update this definition of meaning for the discrete probabilistic model. The meaning of text will now incorporate multiple possible alternative texts, each with a probability of correctly matching the meaning of the original text.

The addition of text to a sentence will always provide a probability distribution for the meaning of that particular text. It may also provide multiple probability distributions for modifications to the meanings of other text. A method must be developed to handle the many different sets of probabilities.

The method used here will be to utilize the composition unit tags that were previously used as the basis for coordinate systems. Each text is associated with a specific composition unit tag. The probability distributions for meaning will be oriented along the axis used to represent that particular composition unit tag. This will generate an eight-dimensional probabilistic model for the meaning of text. A visualization of this approach is shown in Figure 1.

If text #1 was associated with a tag of “Item,” it would have the probability distribution of meanings oriented along the “Item” axis, as seen in Figure 2.

If text #2 was associated with the tag of “Activity” and modified the probabilities of meaning of text #1, there would be two probability distributions of meanings, as seen in Figure 3:

•Sequence number 2 along the “Activity” axis.

•Modification of the probabilities for sequence number 1 along the “Item” axis.

These diagrams can be visualized to be stacked one on top of another as new text is added to a sentence.

When a sentence has been completed, the original probability distribution and all modifications to the probabilities will be combined for each sequence number. This will provide a final probability distribution. The possible meaning with the highest final probability will be selected.
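The article does not state how the original distribution and its modifications are combined. The sketch below assumes one simple possibility: each modification is a set of multiplicative weights, and the result is renormalized so that the probabilities again sum to one. The meanings, probabilities, weights and function names are all hypothetical.

```python
def normalize(distribution):
    """Scale the values so that they sum to one."""
    total = sum(distribution.values())
    return {m: p / total for m, p in distribution.items()}


def apply_modification(distribution, weights):
    """Assumed rule: multiply each probability by its weight, then renormalize.

    Meanings without a weight are left unchanged (weight 1.0).
    """
    reweighted = {m: p * weights.get(m, 1.0) for m, p in distribution.items()}
    return normalize(reweighted)


# Original distribution for sequence number 1 (hypothetical values).
original = {"meaning A": 0.7, "meaning B": 0.3}

# Modifications contributed by text added later in the sentence (hypothetical).
modifications = [{"meaning B": 9.0}]

final = original
for weights in modifications:
    final = apply_modification(final, weights)

selected = max(final, key=final.get)
print(final)
print(selected)
```

In this example, the later text raises the weight of “meaning B” enough that it overtakes “meaning A” in the final distribution. Other combination rules are possible, and the choice here is only for illustration.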

The diagrams shown above are strictly a visualization of this approach. The alignment along a composition unit tag axis is shown as a solid line. The actual probabilities are located at distinct points along the axis.

Also, the different axes are shown in the diagram with angles of less than 90 degrees between them. As discussed earlier, the composition unit tags have been developed so that they are actually orthogonal to each other.

Probabilistic wave function model for the meaning of text

Unlike the quantum model of physics, the quantum model of the meaning of text does not have to deal with wave functions. The discrete model described in the previous section is sufficient.

However, wave functions provide us with an opportunity for further processing of meaning. Wave functions are familiar to electronic communication engineers. The use of a variety of discrete signal processing procedures may result in a better understanding of the meaning of text.

The method that will be used to create the wave functions will be based on the Fourier Transform. This procedure combines a variety of sine waves of different frequencies, magnitudes and offsets from the origin to replicate the original information.

The advantage of the wave functions is that the processing of the signal may become much easier.

The Fourier Transform processes continuous functions. Since the probability distributions are only defined for discrete meanings, we will be using the Discrete Fourier Transform (DFT).
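To make the DFT step concrete, the sketch below transforms one small probability distribution and then inverts the transform to recover it. The probability values are hypothetical, and the direct summation is written out for readability rather than speed. A fast Fourier transform library would normally be used instead.

```python
import cmath


def dft(values):
    """Discrete Fourier Transform by direct summation."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(coefficients):
    """Inverse DFT; recovers the original values from the coefficients."""
    n = len(coefficients)
    return [
        sum(coefficients[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


# Hypothetical probabilities of four possible meanings of one text.
probabilities = [0.5, 0.25, 0.125, 0.125]

coefficients = dft(probabilities)
recovered = [value.real for value in inverse_dft(coefficients)]

# The zero-frequency coefficient is the sum of the probabilities.
print(abs(coefficients[0]))
print(recovered)
```

No information is lost in the transform: up to floating-point rounding, the inverse DFT returns the original probabilities, so the wave representation and the discrete distribution are two views of the same data.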

The values that will be transformed by the DFT are the eight-dimensional probability distributions for the meaning of text. However, the mathematics of the Discrete Fourier Transform in multiple dimensions can quickly become complex.

The choice of the orthogonal composition unit tags provides a simplification to this issue.

The probability distributions are oriented along multiple different axes. Because the axes are orthogonal, however, the DFT for the complete set of probability distributions can be decomposed into DFTs of the distributions along the different composition unit tag axes. The wave function resulting from each DFT will be oriented along the same tag axis as the original probability distribution.

The DFTs of each specific probability distribution can then be combined to create an eight-dimensional DFT wave function representing the complete set of probabilities.
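The decomposition by axis can be checked on a small case. The sketch below uses two axes instead of eight to stay readable, and the grid values are hypothetical. It computes a two-dimensional DFT directly from its definition and compares it with the result of applying one-dimensional DFTs along each axis in turn.

```python
import cmath


def dft(values):
    """One-dimensional DFT by direct summation."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dft2_direct(grid):
    """Two-dimensional DFT computed straight from the double sum."""
    rows, cols = len(grid), len(grid[0])
    return [
        [
            sum(
                grid[r][c] * cmath.exp(-2j * cmath.pi * (u * r / rows + v * c / cols))
                for r in range(rows)
                for c in range(cols)
            )
            for v in range(cols)
        ]
        for u in range(rows)
    ]


def dft2_by_axis(grid):
    """Apply 1-D DFTs along the second axis, then along the first axis."""
    row_pass = [dft(row) for row in grid]
    column_pass = [dft(list(column)) for column in zip(*row_pass)]
    return [list(row) for row in zip(*column_pass)]  # transpose back


# Hypothetical probabilities laid out on two axes.
grid = [
    [0.10, 0.20, 0.05],
    [0.30, 0.15, 0.20],
]

direct = dft2_direct(grid)
by_axis = dft2_by_axis(grid)
largest_gap = max(
    abs(a - b) for row_a, row_b in zip(direct, by_axis) for a, b in zip(row_a, row_b)
)
print(largest_gap < 1e-9)
```

The same axis-by-axis procedure extends to more dimensions, which is what keeps the eight-dimensional case manageable.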

One observation that stands out from these probabilistic models is that the nonlinear equations that were developed for the relativity model of the meaning of text have now produced a set of probability distributions. We have moved away from problems dealing with curvature and nonlinearity.

These new probability distributions and wave functions can now be further processed in a linear manner. This allows the decomposition of a large and complex problem into separate components. Each of these component problems can be solved and then combined with other component solutions to build a solution to the original complex problem.

Summary

There are many similarities between the meaning of natural language text and Newtonian physics, relativity theory and quantum theory. Models were developed for the meaning of text that are analogous to the physics models.

The “Newtonian” model of meaning provided a simple aggregation of the individual meanings of text in a sentence. The relativity model of meaning recognized the more common nonlinearity of meaning and the curvature of the domain of the meaning of text.

The discrete probabilistic model of meaning integrated the concept of probability from quantum theory. This probabilistic approach recognized that a single meaning of text is often difficult to identify.

And finally, the probabilistic wave function model of meaning utilized the Discrete Fourier Transform to represent meaning as an eight-dimensional wave function. This format may create many opportunities for further processing of the meaning of text. This new processing may utilize methods developed for electronic communication systems.

The modeling of the meaning of text has progressed from a linear model, to a nonlinear model, to a nonlinear probabilistic model that produced a linear eight-dimensional probability distribution.