Humans, Machines, and Semantics
King, as per the Oxford dictionary, is defined as “the male ruler of an independent state that has a royal family”.
Now, if we were to change just a single word in this definition from male to female, giving “the female ruler of an independent state that has a royal family”, what would it define? Queen, right! It is, after all, the first word that pops up in our minds. Semantically speaking, a King can be defined in abstract form as a combination of a man and royalty. Likewise, a Queen is a combination of a woman and royalty. Knowing this abstract formulation, we can easily define a Prince as a boy plus royalty and a Princess as a girl plus royalty.
Given years of evolution and our interaction with language from the day we are born, we have learned to communicate even in abstract forms. For us, finding such semantic relations comes naturally; but what if we put the same question to an Artificial Intelligence (AI) system? The problem then becomes incredibly difficult, and the answer lies in a rapidly growing subfield of AI: Natural Language Processing, widely known by its abbreviation NLP. NLP enables modern-day computers to interpret human language, establish semantic relations among pieces of text, and consequently communicate back with humans.
The roots of NLP date back to the early 1900s, when a Swiss professor of linguistics, Ferdinand de Saussure, laid out the idea of patterns and functions of a language. Saussure asserted that a language is a social phenomenon, a structured system that can be viewed synchronically and diachronically. In other words, a language exists at any particular time, and it changes over time. He argued that meaning is created inside language, through relationships and contrasts, as in humour, sarcasm, irony, etc., where the literal meaning is far from the intended meaning.
The Oscar-nominated movie ‘The Imitation Game’ (2014) is based on a biography of Alan Turing, the British mathematician and early computer scientist who is widely considered to be the father of theoretical computer science and AI. In 1950 he wrote a paper describing a test for a “thinking” machine: a way to evaluate a machine’s ability to exhibit intelligent behaviour equivalent to that of a human. Turing proposed that a human examiner engage in a conversation with two players, a machine and a human. If the examiner failed to identify which player was the machine, the AI system had passed the Turing test. Turing’s work laid the foundations of today’s NLP systems, as seen in the advancements of the chatbot industry.
“Machines may have passed the Turing test, but they have a long way to traverse before they begin to create thoughts.”
NLP: Academia and Industry
Between 1950 and 1980, progress in the field of NLP was limited by the available computational power and data. Today, we have both in abundance; consequently, algorithms have demonstrated significant improvement in text processing. The 2019 data from the ‘Annual Meeting of the Association for Computational Linguistics’, one of the top-ranked conferences in linguistics, showed dramatic growth in the number of submitted research articles related to NLP. The attention the field receives from AI research groups almost doubled from 2018 to 2019.
Today we have a significant number of software packages and coding libraries, which enable the development of effective NLP algorithms. Although NLP is several decades old, it can be said that it has just begun to take baby steps!
With such unprecedented growth of NLP in academia, the industry is also rapidly deploying machine learning systems to exploit its benefits. NLP can be applied to various tasks such as sentiment analysis, topic extraction, text summarisation, semantic similarity, analysis of unstructured documents, content classification, news article categorisation, product description analysis, etc. NLP has become a bridge of communication between computers and humans. Machines now have a framework which allows them to understand us better. With the adoption of advanced AI methods such as deep learning, the use of NLP in industry is seeing exponential growth, thanks to big data, computational power, and rising interest in human-to-computer communications. The global natural language processing market is estimated to be worth 34.80 billion by 2025, registering a compound annual growth rate of 21.5% during 2020-2025.
“AI will become our greatest companion in day-to-day activities, just like our mobile phones but smarter and more intelligent.”
Relationships Among Words
So, coming back to our question: how does an AI system understand that royalty plus a woman equals a queen? The answer lies in “word embeddings”, a type of word representation that allows words with similar meanings to have similar representations. Each word is represented as a vector in a high-dimensional space, and these vectors capture semantic relationships among words based on the contexts in which the words appear in sentences. Bengio et al. (2003) published a research article introducing the concept of word embeddings, which forms the basis of various NLP algorithms today. The following picture illustrates the answer to our question. The semantic relation “the king is to the queen as a man is to a woman” is encoded in word embeddings. Similarly, word embeddings can also capture verb-tense relationships, “walking is to walked as swimming is to swam”, and country-capital relationships, “Vienna is to Austria as Rome is to Italy as Berlin is to Germany and so on…”.
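To make the “king is to queen as man is to woman” relation a little more concrete, here is a minimal sketch in Python. Please note the assumptions: the vectors below are hand-made toy values with four invented dimensions (royalty, male, female, young), not embeddings learned from text, and the names TOY_EMBEDDINGS, cosine and solve_analogy are hypothetical helpers written only for this illustration. Real embeddings have hundreds of dimensions and their analogies hold only approximately.

```python
import math

# Hand-made toy vectors (an assumption for illustration, NOT learned embeddings).
# Dimensions: [royalty, male, female, young]
TOY_EMBEDDINGS = {
    "king":     [1.0, 1.0, 0.0, 0.0],
    "queen":    [1.0, 0.0, 1.0, 0.0],
    "prince":   [1.0, 1.0, 0.0, 1.0],
    "princess": [1.0, 0.0, 1.0, 1.0],
    "man":      [0.0, 1.0, 0.0, 0.0],
    "woman":    [0.0, 0.0, 1.0, 0.0],
    "boy":      [0.0, 1.0, 0.0, 1.0],
    "girl":     [0.0, 0.0, 1.0, 1.0],
}


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def solve_analogy(a, b, c, embeddings=TOY_EMBEDDINGS):
    """Answer 'a is to b as c is to ?' using the vector b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])]
    # Skip the three input words so the answer is a genuinely new word.
    candidates = [w for w in embeddings if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))


print(solve_analogy("man", "woman", "king"))   # queen
print(solve_analogy("boy", "girl", "prince"))  # princess
```

The arithmetic works here because the toy dimensions were chosen by hand to line up with the concepts. In a trained model no single dimension means “royalty”; the relation emerges from the contexts in which the words were seen.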
The focus is rapidly shifting towards designing online applications which can process textual information. The advantages of employing an AI method have been well tested throughout the industry, boosting the efficiency of a system. We want to develop AI systems that understand the human way of communicating. It is easier to teach an AI our ways of life than the other way around. A language, though a structured system, is exceptionally complex within itself. Understanding a language involves knowing numerous variables such as culture, dialects, history, science, art, etc. A perfect example could be the moment that marked Charlie Chaplin and Albert Einstein’s friendship at the premiere of Chaplin’s film ‘City Lights’. The following conversation allegedly took place between the two masterminds:
“What I admire most about your art,” Albert Einstein said, “is its universality. You do not say a word, and yet … the world understands you.”
“It’s true,” replied Chaplin. “But your fame is even greater… the world admires you when nobody understands you.”
Today, NLP has only begun to demonstrate its ability, and the future will see an overwhelming integration of NLP into business scenarios. We at STRG keep ourselves up to date with the advances in NLP. Drop us an email at email@example.com to learn more about how NLP can be incorporated into your business framework, or visit our website https://www.strg.at/ to get a view of our services.
by Dadhichi Shukla