Natural Language Processing

Research for Business Applications

What is NLP

Natural Language Processing (NLP) is a field of artificial intelligence that enables computers to analyze and understand human language, and in particular to process large amounts of natural language data. Challenges in NLP frequently involve speech recognition, natural language understanding, and natural language generation.

The Questit NLP Platform has been developed for textual analysis. Given plain text as input, the platform produces a structured object that is returned via API as JSON. This object is a kind of ‘enhanced’ text: its fields are filled with information produced at different levels of analysis:

  • morphological
  • syntactic
  • semantic

The NLP Platform has different layers of analysis. Each layer inherits the analysis properties of the previous one.
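As a rough illustration of layers that build on one another, here is a minimal sketch in Python. The layer functions and the JSON field names (`text`, `tokens`, `capitalized`) are invented for this example and are not the platform's actual API or schema.

```python
import json

def tokenize(doc):
    # First layer: split the raw text on whitespace.
    doc["tokens"] = doc["text"].split()
    return doc

def tag_capitalized(doc):
    # Later layer: relies on the tokens produced by the previous layer.
    doc["capitalized"] = [t for t in doc["tokens"] if t[:1].isupper()]
    return doc

def analyze(text, layers):
    # Run the layers in order; each one enriches the same object.
    doc = {"text": text}
    for layer in layers:
        doc = layer(doc)
    return json.dumps(doc)

result = analyze("Rome is in Italy", [tokenize, tag_capitalized])
print(result)
```

Each layer only adds fields, so a later layer can always read what an earlier one produced.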

How NLP Uses Deep Neural Networks

Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns. They interpret sensory data through a kind of machine perception, labeling or clustering raw input. The patterns they recognize are numerical, contained in vectors, into which all real-world data, be it images, sound, text or time series, must be translated.
Our NLP Platform uses Deep Learning architectures and algorithms. Deep learning has proven effective in the area of NLP: Part-of-Speech Tagging (POS tagging), character generation, and learning word embeddings are common applications of Deep Learning.

Word Embedding

Word embedding is the collective name for a set of language modeling and feature learning techniques in natural language processing (NLP) in which words or phrases from the vocabulary are mapped to vectors of real numbers. Conceptually, it involves a mathematical embedding from a space with one dimension per word to a continuous vector space with a much lower dimension.
Word2vec is a two-layer neural net that processes text. It detects similarities between words mathematically by learning from the context given by the surrounding words. Its input is a text corpus and its output is a set of vectors: feature vectors for the words in that corpus. While Word2vec is not a deep neural network, it turns text into a numerical form that deep nets can understand.
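Once words are vectors, similarity between them is commonly measured with cosine similarity. The sketch below uses three toy 3-dimensional vectors invented for illustration; real Word2vec vectors are learned from a corpus and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: dot product over the norms.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors, invented for this example (not learned from data).
vectors = {
    "king": [0.9, 0.8, 0.1],
    "queen": [0.85, 0.75, 0.2],
    "banana": [0.1, 0.2, 0.9],
}

royal = cosine_similarity(vectors["king"], vectors["queen"])
fruit = cosine_similarity(vectors["king"], vectors["banana"])
print(royal > fruit)
```

With these toy values, "king" is closer to "queen" than to "banana", which is the kind of relationship learned embeddings are meant to capture.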

Layers of analysis

  1. Tokenizer
  2. Sentencer
  3. Lemmatizer
  4. POS Tagger
  5. Collocation Detector
  6. Chunker
  7. WSD
  8. NER
  9. NotNER
  10. SVC/SVO Extractor
  11. Domain Tagger
  12. Main Concepts Extractor
  13. Main Words Extractor
  14. Sentiment Analysis
  15. Quote Extractor
  16. Emotion Analysis

WSD, that is Word Sense Disambiguation, which produces the right sense of a word given its context (it also returns similar lemmas for that sense);

  1. input: ‘i put money in the bank’ -> [bank] -> sense: [a financial institution that accepts deposits and channels the money into lending], similar lemmas: [depository financial institution, banking company, banking concern]

where the sense IDs carry the right meaning, as defined by their similar lemmas:
v:W0000017835: put, place, set, lay, position, pose
n:W0000129954: money
n:W0000078346: bank, banking company, banking concern, depository financial institution
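One classic way to pick a sense from context is the simplified Lesk algorithm, which chooses the sense whose dictionary gloss shares the most words with the sentence. The sketch below is only an illustration of that idea, not the platform's method (which is not described here); the sense labels and the second gloss are assumptions made for the example.

```python
def simplified_lesk(context, senses):
    # Pick the sense whose gloss overlaps most with the context words.
    context_words = set(context.lower().split())
    best_sense, best_overlap = None, -1
    for sense_id, gloss in senses.items():
        overlap = len(context_words & set(gloss.lower().split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense

# Hypothetical two-sense inventory for "bank"; the labels are invented.
BANK_SENSES = {
    "bank/financial": "a financial institution that accepts deposits "
                      "and channels the money into lending",
    "bank/river": "sloping land beside a body of water",
}

chosen = simplified_lesk("i put money in the bank", BANK_SENSES)
print(chosen)
```

This naive version counts every shared word, including function words such as "the"; a practical implementation would at least filter stop words and lemmatize first.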

Chunker, which splits text into chunks, that is, syntagmas.

Named Entity Recognition (NER)

NER seeks to locate and classify named entities in text into pre-defined categories such as the names of persons, organizations, or locations. It detects entities classified as “Proper Noun” and assigns each one a label that represents the type of the entity.

NotNER

Like Named Entity Recognition, NotNER detects entities, but those classified as “Common Noun”, such as expressions of time, quantities, monetary values, and percentages.
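Expressions such as percentages, monetary values, and clock times often follow regular surface patterns, so a very small detector can be sketched with regular expressions. This is an illustrative baseline under assumed patterns and label names (`PERCENT`, `MONEY`, `TIME`), not a description of how NotNER is implemented.

```python
import re

# Hypothetical labels and deliberately simple patterns.
PATTERNS = {
    "PERCENT": re.compile(r"\b\d+(?:\.\d+)?\s?%"),
    "MONEY": re.compile(r"[$€£]\s?\d+(?:[.,]\d+)*"),
    "TIME": re.compile(r"\b\d{1,2}:\d{2}\b"),
}

def find_expressions(text):
    # Collect every match with its start offset, then return in text order.
    found = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            found.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(found)]

hits = find_expressions("Sales rose 12.5% to $3,400 by 17:30")
print(hits)
```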

Temporal Expression Extractor

The NLP Platform detects the temporal expression, and the IE Platform normalizes it to a specific date (day/month/year) using a powerful reasoning engine.
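Normalizing a temporal expression means resolving it against a reference date. The sketch below handles only a handful of English expressions and is an assumption-laden toy: the function name `normalize` and the supported phrases are invented here, and the actual reasoning engine is not described in this text.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

def normalize(expression, reference):
    # Resolve a relative expression to a concrete date, or None if unknown.
    expr = expression.lower().strip()
    if expr == "today":
        return reference
    if expr == "tomorrow":
        return reference + timedelta(days=1)
    if expr == "yesterday":
        return reference - timedelta(days=1)
    if expr.startswith("next ") and expr[5:] in WEEKDAYS:
        target = WEEKDAYS.index(expr[5:])
        # Days until the next occurrence, always between 1 and 7.
        delta = (target - reference.weekday() - 1) % 7 + 1
        return reference + timedelta(days=delta)
    return None

reference_day = date(2024, 1, 10)  # a Wednesday
resolved = normalize("next friday", reference_day)
print(resolved.strftime("%d/%m/%Y"))
```

The same expression resolves to different dates depending on the reference day, which is why the detection step and the normalization step are kept separate.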

Sentiment Analysis, which detects the sentiment polarity (negative or positive) of the text;

  1. input: ‘i hate war’

The output shows the words that carry a sentiment, together with an overall score for the text.
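The two kinds of output, per-word sentiment and an overall score, can be sketched with a simple lexicon lookup. The lexicon values below are invented for illustration, and averaging word scores is just one possible way to aggregate; the platform's own scoring is not documented here.

```python
# Hypothetical polarity lexicon: negative values are negative sentiment.
LEXICON = {"hate": -1.0, "war": -0.5, "love": 1.0, "happy": 0.8}

def sentiment(text):
    # Score each known word, then average the scores for the whole text.
    tokens = text.lower().split()
    scored = [(t, LEXICON[t]) for t in tokens if t in LEXICON]
    if not scored:
        return scored, 0.0
    overall = sum(score for _, score in scored) / len(scored)
    return scored, overall

words, overall = sentiment("i hate war")
print(words, overall)
```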

Emotion Analysis, which detects all emotions of the text;

  1. input: ‘I’m very happy’

The output shows the words that carry an emotion, together with an overall score for the text.
