Text summarization is the process of generating a compact, meaningful synopsis from a large volume of text. Sources for such text include news articles, blogs, social media posts, and documentation of all kinds. If you are new to NLP and want to read more about text summarization, this article will help you understand the basic and advanced concepts. The purpose of this article is to demonstrate the implementation of an extractive text summarizer using state-of-the-art contextual embeddings.


The approach to building the summarizer can be divided into the following steps:




Satish Silveri

