10 More Common NLP Terms Explained for the Text Analysis Novice


This is the second edition of our NLP terms explained blog posts. The first edition dealt with some simple terms and NLP tasks, while this edition gets a little more complicated. Again, we’ve chosen some common terms at random and tried to break them down in plain English to make them easier to understand.

Part of Speech tagging (POS tagging)

Sometimes referred to as grammatical tagging or word-category disambiguation, part of speech tagging refers to the process of determining the part of speech for each word in a given sentence based on the definition of that word and its context. Many words, especially common ones, can serve as multiple parts of speech. For example, “book” can be a noun (“the book on the table”) or verb (“to book a flight”).
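To make the “book” example concrete, here is a minimal sketch of how context can settle the part of speech. The `DETERMINERS` set and the `tag_book` function are hypothetical names invented for this illustration, and the two hard-coded rules are a simplifying assumption: real taggers learn such patterns from annotated corpora.

```python
# Toy context rules for the ambiguous word "book".
# Assumption: a preceding determiner signals a noun, a preceding "to" signals a verb.
DETERMINERS = {"a", "an", "the", "this", "that"}


def tag_book(tokens):
    """Return (token, tag) pairs; only the word 'book' receives a tag."""
    tagged = []
    for i, token in enumerate(tokens):
        tag = None
        if token.lower() == "book":
            previous = tokens[i - 1].lower() if i > 0 else ""
            if previous in DETERMINERS:
                tag = "NOUN"
            elif previous == "to":
                tag = "VERB"
        tagged.append((token, tag))
    return tagged


noun_example = tag_book("the book on the table".split())
verb_example = tag_book("to book a flight".split())
```

The same word receives a different tag in each phrase purely because of the word that precedes it.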


Parsing

Parsing is a major task of NLP. It focuses on determining the grammatical structure, or parse tree, of a given sentence. There are two forms of parse trees: constituency-based and dependency-based.
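The two kinds of parse tree can be sketched as plain data structures. This is an illustration only: the sentence, the nested-tuple layout, and the helper names `leaves` and `head_of` are assumptions made for this example, with labels borrowed from the Penn Treebank and Universal Dependencies conventions.

```python
# Constituency tree: phrases nested inside phrases, written as (label, children...).
constituency = (
    "S",
    ("NP", ("DT", "the"), ("NN", "dog")),
    ("VP", ("VBZ", "barks")),
)

# Dependency tree: one (head, relation, dependent) arc per word.
dependency = [
    ("ROOT", "root", "barks"),
    ("barks", "nsubj", "dog"),
    ("dog", "det", "the"),
]


def leaves(tree):
    """Collect the words at the leaves of a constituency tree, left to right."""
    _label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    words = []
    for child in children:
        words.extend(leaves(child))
    return words


def head_of(word, arcs):
    """Return the head that governs `word` in a list of dependency arcs."""
    for head, _relation, dependent in arcs:
        if dependent == word:
            return head
    return None
```

Both structures describe the same sentence: the constituency tree groups words into phrases, while the dependency tree links each word directly to the word it depends on.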

Semantic Role Labeling

This is an important step towards making sense of the meaning of a sentence. It focuses on detecting the semantic arguments associated with a verb or verbs in a sentence and classifying those arguments into specific roles, such as agent or patient.

Machine Translation

A sub-field of computational linguistics, MT investigates the use of software to translate text or speech from one language to another.

Statistical Machine Translation

SMT is one of a few different approaches to Machine Translation. Commonly used in NLP, it relies on statistical methods based on bilingual corpora, such as the Canadian Hansard corpus. Other approaches to Machine Translation include Rule-Based Translation and Example-Based Translation.

Bayesian Classification

Bayesian classification is a classification method based on Bayes’ Theorem. It is commonly used in Machine Learning and Natural Language Processing to classify text and documents. You can read more about it in Naive Bayes for Dummies.
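As a rough illustration, the sketch below implements a tiny Naive Bayes text classifier, one common form of Bayesian classification. The training sentences, the labels, and the function names `train` and `predict` are all made up for this example, and add-one smoothing is an assumption chosen to keep unseen words from zeroing out a score.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count documents per label and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(text, model):
    """Return the label with the highest log P(label) + sum of log P(word | label)."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    scores = {}
    for label, count in label_counts.items():
        score = math.log(count / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one smoothing (an assumption of this sketch).
            smoothed = word_counts[label][word] + 1
            score += math.log(smoothed / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training data for illustration only.
training_data = [
    ("great match and a late goal", "sports"),
    ("the team won the game", "sports"),
    ("new phone with a faster chip", "tech"),
    ("software update for the laptop", "tech"),
]
model = train(training_data)
sports_guess = predict("the team scored a goal", model)
tech_guess = predict("laptop software chip", model)
```

The “naive” part is the assumption that words are independent of each other given the label, which is why the per-word probabilities can simply be multiplied (or, in log space, added).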

Hidden Markov Model (HMM)

To understand an HMM, we first need to define a Markov model. A Markov model is used to model randomly changing systems where it is assumed that future states depend only on the present state, not on the sequence of events that preceded it.
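The “depends only on the present state” assumption means the probability of a whole sequence is just a chain of one-step transitions. Here is a minimal sketch with a hypothetical two-state weather model; the states and all the numbers are invented for illustration.

```python
# Hypothetical weather model: the probabilities are made up.
initial = {"sunny": 0.6, "rainy": 0.4}
transition = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.5, "rainy": 0.5},
}


def sequence_probability(states):
    """Probability of a state sequence: start probability times each one-step transition."""
    probability = initial[states[0]]
    for current, following in zip(states, states[1:]):
        probability *= transition[current][following]
    return probability


# 0.6 (start sunny) * 0.8 (sunny -> sunny) * 0.2 (sunny -> rainy)
p = sequence_probability(["sunny", "sunny", "rainy"])
```

Each factor looks only at the current state and the next one, never further back.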

An HMM is a Markov model where the system being modeled is assumed to have unobserved, or hidden, states. A number of common algorithms are used with hidden Markov models. For example, the Viterbi algorithm computes the most likely sequence of hidden states for a given sequence of observations, and the forward algorithm computes the probability of a sequence of observations. Both are often used in NLP applications.

In hidden Markov models, the state is not directly visible, but the output, which depends on the state, is visible. Each state has a probability distribution over the possible output tokens. Therefore, the sequence of tokens generated by an HMM gives some information about the sequence of states.
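The Viterbi and forward algorithms mentioned above can be sketched on a toy HMM. The model here is a hypothetical one: the hidden states (“Healthy”, “Fever”), the observable tokens, and every probability are assumptions chosen for illustration, and the code favors readability over numerical robustness (a real implementation would usually work in log space).

```python
# Toy HMM: hidden health states emit observable symptoms. All numbers are made up.
states = ["Healthy", "Fever"]
start = {"Healthy": 0.6, "Fever": 0.4}
trans = {
    "Healthy": {"Healthy": 0.7, "Fever": 0.3},
    "Fever": {"Healthy": 0.4, "Fever": 0.6},
}
emit = {
    "Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
    "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6},
}


def viterbi(observations):
    """Return the most likely hidden state sequence and its probability."""
    # Each column maps a state to (best path probability ending there, previous state).
    best = [{s: (start[s] * emit[s][observations[0]], None) for s in states}]
    for obs in observations[1:]:
        column = {}
        for s in states:
            prob, prev = max((best[-1][p][0] * trans[p][s], p) for p in states)
            column[s] = (prob * emit[s][obs], prev)
        best.append(column)
    # Trace back from the most probable final state.
    last = max(states, key=lambda s: best[-1][s][0])
    probability = best[-1][last][0]
    path = [last]
    for column in reversed(best[1:]):
        path.append(column[path[-1]][1])
    path.reverse()
    return path, probability


def forward(observations):
    """Return the total probability of the observations, summed over all state paths."""
    alpha = {s: start[s] * emit[s][observations[0]] for s in states}
    for obs in observations[1:]:
        alpha = {
            s: sum(alpha[p] * trans[p][s] for p in states) * emit[s][obs]
            for s in states
        }
    return sum(alpha.values())


observed = ["normal", "cold", "dizzy"]
best_path, best_probability = viterbi(observed)
total_probability = forward(observed)
```

The two functions differ only in how they combine incoming paths: Viterbi keeps the single best one (`max`), while the forward algorithm adds them all up (`sum`).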

Conditional Random Fields (CRFs)

CRFs are a class of statistical modeling methods often applied in pattern recognition and machine learning, where they are used for structured prediction. An ordinary classifier predicts the label for a sample without taking neighboring samples into account; a CRF, however, takes context into account. CRFs are commonly used in NLP (e.g. in Named Entity Extraction) and, more recently, in image recognition.

Affinity Propagation (AP)

AP is a clustering algorithm commonly used in Data Mining. Unlike other clustering algorithms such as k-means, AP does not require the number of clusters to be specified before running the algorithm. A semi-supervised version of AP is commonly used in NLP.

Relationship extraction

Given a chunk of words or a piece of text, relationship extraction is the task of determining the relationships between the named entities it mentions.
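As a very rough sketch, a single hand-written pattern can pull one kind of relationship out of text. Everything here is hypothetical: the `FOUNDED` pattern, the `founder_of` relation name, and the example sentence with its fictional person and company. Treating any run of capitalized words as a named entity is a crude assumption; real systems use a named entity recognizer and learned models instead.

```python
import re

# Assumption: a named entity is one or more capitalized words in a row.
ENTITY = r"[A-Z]\w+(?: [A-Z]\w+)*"
FOUNDED = re.compile(rf"(?P<person>{ENTITY}) founded (?P<org>{ENTITY})")


def extract_founder_relations(text):
    """Return (person, 'founder_of', organization) triples matched by the pattern."""
    return [
        (match.group("person"), "founder_of", match.group("org"))
        for match in FOUNDED.finditer(text)
    ]


# Fictional sentence for illustration only.
relations = extract_founder_relations("Ada Smith founded Example Corp in 2010.")
```

The output is a list of triples, a common way to represent extracted relationships: two entities plus the relation that connects them.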

