Natural Language Processing

Topic Modelling

Topic modelling is a type of statistical modelling for discovering the abstract “topics” that occur in a collection of documents. Latent Dirichlet Allocation (LDA) is one example of a topic model; it is used to assign the text in a document to one or more topics. It builds a topics-per-document model and a words-per-topic model, each drawn from a Dirichlet distribution.
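To make the two distributions above concrete, here is a minimal sketch of the generative story that LDA assumes, using only the Python standard library. It is an illustration, not the fitting code used in the notebook: the function names, the hyperparameters `alpha` and `beta`, and the tiny vocabulary are all assumptions chosen for the example.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one sample from a Dirichlet distribution via normalised Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(vocab, n_topics, n_docs, doc_len, alpha=0.5, beta=0.5, seed=0):
    """Generate toy documents following the LDA generative process.

    alpha and beta are assumed symmetric Dirichlet hyperparameters.
    """
    rng = random.Random(seed)
    # Words-per-topic model: one distribution over the vocabulary per topic.
    topic_word = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    docs, doc_topic = [], []
    for _ in range(n_docs):
        # Topics-per-document model: one distribution over topics per document.
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            z = rng.choices(range(n_topics), weights=theta)[0]  # pick a topic
            w = rng.choices(vocab, weights=topic_word[z])[0]    # pick a word from it
            words.append(w)
        docs.append(words)
        doc_topic.append(theta)
    return docs, doc_topic, topic_word

# Hypothetical vocabulary, for illustration only.
vocab = ["battery", "range", "charging", "price", "truck", "software"]
docs, doc_topic, topic_word = generate_corpus(vocab, n_topics=2, n_docs=3, doc_len=8)
```

Fitting LDA runs this story in reverse: given only the documents, it estimates the per-document topic mixtures and the per-topic word distributions that most plausibly produced them.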

Here we apply LDA to comments from the Electric Vehicle, Tesla, and Rivian subreddits, collected with PRAW (the Python Reddit API Wrapper), and split them into topics.
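Before LDA can be applied, each comment has to be turned into word counts. The sketch below shows one simple way to do that with the standard library. It is an assumption about the general shape of this step, not the preprocessing used in the notebook: the stopword list, the minimum token length, and the sample comments are all made-up placeholders.

```python
import re
from collections import Counter

# A tiny, hypothetical stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "and", "to", "of", "my", "it", "in", "for"}

def tokenize(comment):
    """Lowercase a comment, keep alphabetic tokens, drop stopwords and very short words."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    return [t for t in tokens if t not in STOPWORDS and len(t) > 2]

def bag_of_words(comments):
    """Turn each comment into a word-count mapping, the input form LDA works on."""
    return [Counter(tokenize(c)) for c in comments]

# Placeholder comments, invented for illustration (not real subreddit data).
comments = [
    "The charging network is great for road trips",
    "Range drops a lot in cold weather",
]
counts = bag_of_words(comments)
```

Each `Counter` plays the role of one document: word order is discarded and only the counts are kept, which is the bag-of-words assumption LDA relies on.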

Let’s get started!


LDA visualisation. For the interactive version, scroll down to the notebook.