Symbolic AI vs. machine learning in natural language processing

Jordi Torras

Jordi Torras founded Inbenta in 2005 to help clients improve online relationships with their customers. The first beta version of the Inbenta Semantic Search Engine was released in 2010.


Since its founding as an academic discipline in 1956, the AI research field has been divided into two main camps: symbolic AI and machine learning. Symbolic AI dominated the first decades, while machine learning has taken the lead in recent years, so let’s try to understand each of these approaches and their main differences when applied to natural language processing (NLP).

Machine learning is a branch of AI in which statistical models perform specific tasks without explicit instructions, relying instead on patterns and inference. Machine learning algorithms build mathematical models from training data in order to make predictions.

Machine learning technology uses algorithms to teach the computer how to solve problems and how to gain insights from solving them. The computer learns largely without explicit programming: it observes data, looks for patterns and uses feedback loops to monitor and improve its predictions. While humans would be overwhelmed by masses of data, machine learning thrives on them and refines its understanding, based on the examples provided to it, in order to make better decisions in the future.

Machine learning applied to NLP

Machine learning can be applied to many disciplines. One of them is NLP, which is used in AI-powered conversational chatbots.

Here’s how machine learning works in this specific case: the person who oversees the bot, usually called a botmaster, feeds the engine with as much relevant data as possible. The bot then gets asked questions by its users and it automatically decides which answer to push for every intent it’s queried for. An intent, in this context, is a kind of baseline query. You can type “show me today’s news” or “what’s the news today?” and the bot should recognize that the intent is the same.
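To make the idea of an intent concrete, here is a minimal sketch of a statistical intent classifier, assuming a tiny naive Bayes model trained on a handful of labeled utterances. The training sentences, the intent names and the IntentClassifier class are invented for illustration and do not describe any particular product's engine.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class IntentClassifier:
    """Toy multinomial naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # intent -> token counts
        self.intent_counts = Counter()           # intent -> number of examples
        self.vocab = set()

    def train(self, examples):
        for utterance, intent in examples:
            tokens = tokenize(utterance)
            self.word_counts[intent].update(tokens)
            self.intent_counts[intent] += 1
            self.vocab.update(tokens)

    def predict(self, utterance):
        tokens = tokenize(utterance)
        total = sum(self.intent_counts.values())
        best_intent, best_score = None, -math.inf
        for intent, n in self.intent_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n / total)
            denom = sum(self.word_counts[intent].values()) + len(self.vocab)
            for tok in tokens:
                score += math.log((self.word_counts[intent][tok] + 1) / denom)
            if score > best_score:
                best_intent, best_score = intent, score
        return best_intent


# Hypothetical training data fed in by the botmaster.
TRAINING = [
    ("show me today's news", "news"),
    ("what are the latest news headlines", "news"),
    ("give me the news", "news"),
    ("will it rain tomorrow", "weather"),
    ("what's the weather forecast", "weather"),
    ("is it sunny outside", "weather"),
]

clf = IntentClassifier()
clf.train(TRAINING)
print(clf.predict("what's the news today?"))
print(clf.predict("will it rain today?"))
```

Both phrasings of the news question end up with the same intent even though only one of them appears in the training data. That is the behavior the botmaster is after, and it only holds as long as the training examples cover the vocabulary users actually type.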

The botmaster then needs to review those responses and has to manually tell the engine which answers were correct and which ones were not. That is how the machine learns how to serve the correct answer.
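The review loop can be sketched as follows, assuming a simple perceptron-style update in which each correction from the botmaster strengthens the right intent and weakens the wrong one. FeedbackBot, its review method and the sample utterances are hypothetical names chosen for this illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class FeedbackBot:
    """Toy bot that learns only from botmaster corrections (illustrative)."""

    def __init__(self, intents):
        self.intents = list(intents)
        # weights[intent][token] -> learned association strength
        self.weights = {intent: {} for intent in self.intents}

    def score(self, intent, tokens):
        return sum(self.weights[intent].get(t, 0.0) for t in tokens)

    def predict(self, utterance):
        tokens = tokenize(utterance)
        # On ties, max() keeps the first intent in the list.
        return max(self.intents, key=lambda i: self.score(i, tokens))

    def review(self, utterance, predicted, correct):
        """Apply the botmaster's verdict; return True if a correction was made."""
        if predicted == correct:
            return False
        for t in tokenize(utterance):
            self.weights[correct][t] = self.weights[correct].get(t, 0.0) + 1.0
            self.weights[predicted][t] = self.weights[predicted].get(t, 0.0) - 1.0
        return True


# Hypothetical log of user questions with the answer the botmaster marked correct.
reviewed_log = [
    ("what's the news today?", "news"),
    ("what's the weather today", "weather"),
]

bot = FeedbackBot(["weather", "news"])
corrections = 0
for utterance, correct in reviewed_log:
    predicted = bot.predict(utterance)
    if bot.review(utterance, predicted, correct):
        corrections += 1

print(corrections)
print(bot.predict("show me today's news"))
```

Every wrong answer costs the botmaster one manual correction, and each new way of phrasing a question may need its own. The sketch makes that cost visible: the bot starts out knowing nothing and only improves at the pace of the reviews.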

As you can easily imagine, this is a very time-consuming job, as there are many ways of asking or formulating the same question. And if you take into account that a knowledge base holds around 300 intents on average, you can see how repetitive maintaining one becomes when using machine learning.

Don’t get me wrong, machine learning is an amazing tool that lets us unlock great potential in AI disciplines such as image recognition or voice recognition. But when it comes to NLP, I’m firmly convinced that machine learning is not the best technology to use.

Symbolic AI

Symbolic AI, also known as good old-fashioned AI (GOFAI), uses human-readable symbols that represent real-world entities or concepts, together with formal logic, to create rules for manipulating those symbols. The result is a rule-based system.

In a nutshell, symbolic AI involves the explicit embedding of human knowledge and behavior rules into computer programs.

One of the many uses of symbolic AI is with NLP for conversational chatbots. With this approach, also called “deterministic,” the idea is to teach the machine how to understand languages in the same way we humans have learned how to read and how to write. In order to do so, we went to school and we learned how to structure language through rules, grammar, conjugation and vocabulary. Computational linguists do exactly the same: they use rules, lexicons and semantics in order to teach the bot’s engine how to understand a language.

Using symbolic AI, everything is visible, understandable and explainable, leading to what is called a “transparent box,” as opposed to the “black box” created by machine learning.
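As a rough illustration of the transparent box, the sketch below matches intents with a hand-written lexicon and rules, and returns a trace explaining each decision. The lexicon entries, the rule format and the match_intent function are assumptions made for this example; a real computational-linguistics engine relies on far richer grammar and semantics.

```python
import re

# Hypothetical lexicon: surface words mapped to concept symbols.
LEXICON = {
    "news": "NEWS",
    "headlines": "NEWS",
    "today": "TODAY",
    "today's": "TODAY",
    "weather": "WEATHER",
    "forecast": "WEATHER",
    "rain": "WEATHER",
}

# Hypothetical rules: an intent fires when all of its required concepts are present.
RULES = [
    ("todays_news", {"NEWS", "TODAY"}),
    ("weather_report", {"WEATHER"}),
]


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def match_intent(utterance):
    """Return (intent, trace); the trace lists every step behind the decision."""
    trace = []
    concepts = set()
    for token in tokenize(utterance):
        concept = LEXICON.get(token)
        if concept is not None:
            concepts.add(concept)
            trace.append(f"lexicon: '{token}' -> {concept}")
    for intent, required in RULES:
        if required <= concepts:  # all required concepts were found
            trace.append(f"rule: {sorted(required)} satisfied -> {intent}")
            return intent, trace
    trace.append("no rule satisfied")
    return None, trace


intent, trace = match_intent("what's the news today?")
print(intent)
for step in trace:
    print(step)
```

When the bot answers wrongly, the trace points to the lexicon entry or rule responsible, so the fix is an edit to that entry rather than another round of training examples.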

As a consequence, the botmaster’s job with symbolic AI is completely different from the job with machine learning-based technology: the botmaster focuses on writing new content for the knowledge base rather than adding utterances for existing content. The botmaster also has full transparency on how to fine-tune the engine when it doesn’t work properly, as it’s possible to understand why a specific decision was made and what tools are needed to fix it.

To summarize, one of the main differences between machine learning and traditional symbolic reasoning is how the learning happens. In machine learning, the algorithm learns rules as it establishes correlations between inputs and outputs. In symbolic reasoning, the rules are created through human intervention and then hard-coded into a static program.

Although machine learning can appear revolutionary at first, its lack of transparency and the large amount of data it needs in order to learn are its two main flaws. Companies now realize how important it is to have a transparent AI, not only for ethical reasons but also for operational ones, and the deterministic (or symbolic) approach is becoming popular again.