By being constantly connected to online networks, we leave behind small fragments of information that, when processed properly, can become highly valuable. Analyzing this kind of data is the job of NLP, or Natural Language Processing, a field that has existed for decades.

It consists of a series of techniques for analyzing human communication, mainly text and speech. These techniques use artificial intelligence to break a text or speech input into small pieces that a neural network can handle.

A basic pipeline works as follows. If necessary, audio is first converted to text. Then algorithms based on the rules of the target language are loaded and the text is cleaned, removing words that carry little meaning, such as articles. Next, those algorithms assign a label to each word based on its context. The result is a sufficient basis for applying artificial intelligence to the text.
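The cleaning and labeling steps described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the stop-word list, the toy lexicon, and the function names are all assumptions made for this example, and the labeling here is a plain dictionary lookup rather than the context-based labeling a real NLP library would perform.

```python
import re

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch);
# real projects use a much larger, language-specific list.
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "is", "in", "on"}

# Hypothetical toy lexicon mapping words to labels (assumption for this sketch).
# A real tagger would decide the label from the surrounding context.
TOY_LEXICON = {
    "customers": "NOUN",
    "love": "VERB",
    "new": "ADJ",
    "product": "NOUN",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens):
    """Drop words that carry little meaning, such as articles."""
    return [token for token in tokens if token not in STOP_WORDS]


def label_tokens(tokens):
    """Attach a label to each token; unknown words get 'UNKNOWN'."""
    return [(token, TOY_LEXICON.get(token, "UNKNOWN")) for token in tokens]


def preprocess(text):
    """Run the three steps in order: tokenize, clean, label."""
    return label_tokens(remove_stop_words(tokenize(text)))


if __name__ == "__main__":
    print(preprocess("The customers love the new product"))
```

The labeled tokens returned by `preprocess` are the kind of structured input that a later model could consume.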

The objectives of these techniques are varied. They can be applied to obtain constant feedback on customers' opinions (positive or negative) of a specific product, to detect keywords in a sentence, to detect hate speech, fake news, or emotions, and to build advanced chatbots, among many other possibilities.
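As a rough illustration of the first use case, customer opinion can be approximated by counting positive and negative words in a review. The sketch below is a simple word-list approach, not a trained model; the word lists and the `classify_opinion` function are assumptions invented for this example, and the extra "Neutral" result is added only to handle reviews with no matching words.

```python
import re

# Hypothetical word lists (assumptions for this sketch); production systems
# typically rely on trained models or much larger curated lexicons.
POSITIVE_WORDS = {"good", "great", "love", "excellent", "happy"}
NEGATIVE_WORDS = {"bad", "poor", "hate", "terrible", "broken"}


def classify_opinion(review):
    """Return 'Positive', 'Negative', or 'Neutral' based on word counts."""
    tokens = re.findall(r"[a-z']+", review.lower())
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    score = positives - negatives
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


if __name__ == "__main__":
    for review in ["I love this great product", "Terrible, it arrived broken"]:
        print(review, "->", classify_opinion(review))
```

A counter like this ignores negation and sarcasm ("not good" still counts as positive), which is one reason real systems use models that take context into account.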

Two recent, widely publicized cases come from the large technology companies Facebook and Twitter. Facebook applied NLP and computer vision to detect whether a post was about Covid-19 and, if so, showed the user a warning in an attempt to reduce the amount of fake news surrounding the topic.

Twitter applied it to detect strong hate speech and hide it as quickly as possible to prevent its spread on the social network. This type of algorithm ended up silencing numerous tweets from politicians during the presidential elections in the United States. The most affected was Donald Trump, whose account was permanently suspended after several incidents. Additionally, Twitter applies this technology to every published tweet to classify it into a series of predefined categories such as sports, politics, or entertainment.
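To give an idea of what classifying a post into predefined categories can look like, here is a minimal keyword-overlap sketch. It is not Twitter's actual method, which is not described here; the keyword sets, the `categorize` function, and the "uncategorized" fallback are all assumptions made for illustration.

```python
import re

# Hypothetical keyword sets per category (assumptions for this sketch).
CATEGORY_KEYWORDS = {
    "sports": {"match", "goal", "team", "league"},
    "politics": {"election", "senate", "vote", "president"},
    "entertainment": {"movie", "concert", "album", "series"},
}


def categorize(post):
    """Pick the category whose keywords overlap most with the post."""
    tokens = set(re.findall(r"[a-z']+", post.lower()))
    scores = {
        category: len(tokens & keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    # Fall back when no keyword from any category appears in the post.
    return best if scores[best] > 0 else "uncategorized"


if __name__ == "__main__":
    print(categorize("The team scored a late goal in the match"))
```

In practice a trained text classifier would replace the hand-written keyword sets, but the input and output have the same shape: a post goes in and a category label comes out.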

This type of algorithm is widely used on the modern internet. Similar to NLP, computer vision algorithms apply AI to image analysis and object recognition. In future blog posts we will talk in depth about computer vision and its integration with text analysis techniques.

Posted by

Juan Rambal – Systems Engineer