Text Analytics using Natural Language Processing

Natural Language Processing (NLP) combines artificial intelligence and machine learning techniques with linguistics to process and understand human language. With NLP, unstructured data sources such as social media, call (text) logs, and emails can be mined for actionable insights. Applications include text processing for information retrieval, sentiment analysis, and question answering.

The core difficulty is that natural languages evolve constantly and their vocabularies keep growing. In addition, inherent aspects of language such as grammar, syntax, semantics, and varied writing styles add to the complexity of analysis. This makes it hard to arrive at definitive rules when building systems that make sense of language. As a result, a parsing system is better built around techniques chosen for the specific application and its domain than around general-purpose rules.

Some of the techniques I have used:

NLP using the Natural Language Toolkit (NLTK) library for Python

Using the open-source NLTK 3.0 library for Python, I was able to track the trend of a set of ailments (in the medical domain). The approach was to count how often these words (the ailment names) appeared in call (text) logs from a given time period.
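
The counting idea can be sketched in a few lines. This is only an illustration, not the code from the project: the ailment list, the log lines, and the period labels are all made up, and the sketch uses a plain regular expression and the standard library's collections.Counter instead of NLTK so that it runs without any downloads. In an NLTK setup, nltk.word_tokenize and nltk.FreqDist (itself a Counter subclass) would play the same roles.

```python
import re
from collections import Counter

# Hypothetical ailment vocabulary and call-log lines, invented for illustration.
AILMENTS = {"asthma", "diabetes", "migraine"}

CALL_LOGS = [
    ("2015-01", "Caller reports a migraine and asks about asthma inhalers."),
    ("2015-01", "Follow-up on diabetes medication; migraine has eased."),
    ("2015-02", "Asthma flare-up after exercise. Asthma plan reviewed."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def ailment_counts_by_period(logs, ailments):
    """Return {period: Counter of ailment mentions} for the given logs."""
    counts = {}
    for period, text in logs:
        period_counter = counts.setdefault(period, Counter())
        # Keep only tokens that are in the ailment vocabulary.
        period_counter.update(tok for tok in tokenize(text) if tok in ailments)
    return counts


trend = ailment_counts_by_period(CALL_LOGS, AILMENTS)
for period in sorted(trend):
    print(period, dict(trend[period]))
```

Comparing the counters across periods gives the trend for each ailment. A real log would likely need more care than this, for example handling multi-word ailment names, misspellings, and plurals.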

Stanford NLP

In another NLP application, I used the Stanford NLP libraries to understand customer opinion. Specifically, I performed sentiment analysis on Yelp reviews.