Natural Language Processing (NLP)

Analysis of Customer Service Tweets of Indian Telcos

Why this use case?

For two simple reasons:

1. I spent almost a decade in the telecom domain with quite a few operators across multiple countries, so I am aware that social media is becoming an increasingly cost-effective and impactful customer service channel, attracting a fair amount of leadership focus and investment in recent times.

2. I have also spent two decades bringing commercial and practical bearing to data-driven decisions, both as a data scientist and as a commercial resource, so it makes a lot of sense to me to bring some commercial flavor to the otherwise technical domain of natural language processing.

Disclaimer(s)

Public information: I am using data from the publicly available Twitter APIs and the customer care handles stated by the respective telecom operators.

Personal opinion: All the opinions (mostly abstract) expressed here are my own and purely driven by (my interpretation of) the data and machine learning techniques, with the sole objective of bringing a commercial narrative (I said it again!) to the technical domain of data science.

Depth of analysis: I am neither developing a classification algorithm, nor am I training a model (I do that for '$') in this use case. All I am doing is putting together a logical flow of (mostly) Python code, well documented and publicly available (e.g. python.org, github.com and several blogs written over the last 4-5 years), to derive insights using simple NLP techniques.

Objective

The flare-up of the digital ecosystem and the exponential adoption of digital channels in recent times have presented tremendous opportunities and challenges to any customer-facing business. On one hand, these channels have made customer conversations far more impactful (in the brand promotion sense); on the other, they have brought a new level of efficiency to interacting with end customers.

Digital channels such as Pinterest, Twitter, Instagram, Line, Houzz etc. are used not only for acquisition, monetization (advertising) and re-targeting, but also as customer touch points (similar to the good old customer care hotline or call centre) for one-on-one customer interactions around grievances and service information delivery.

While there have been many advancements (and much published work) in recent times on monitoring the sentiment of Twitter feeds (mostly what people are saying about a brand and its services), I haven't found much literature on how Twitter is utilized by organizations when interacting with their customers; specifically, on gauging an organization's strategy in providing Twitter support to its current or potential customers.

This work is by no means an attempt to benchmark the 'ideal' expected behavior for Twitter as a support channel, or to assess the superiority of one organization over another; instead, it analyzes the differences in how organizations come across when interacting on their official Twitter support handles.

Common expectations (of a customer) from any customer touch point are:

--> Instant connect.

--> Non-mechanical conversation.

--> Compassion.

--> Single point of contact.

Scope

The top four Indian telecom operators, namely Airtel, Vodafone, Jio and Idea, and their official customer care Twitter handles.

Stack and Methodology

Anaconda Python 3.6.5 (64-bit, Windows) with Spyder 3.2.8.

Step 1: Collecting data from these Twitter handles by calling the public Twitter API (using Python's tweepy library).

import tweepy

# <<api authorization>>  (OAuth handshake omitted)
for tweet in tweepy.Cursor(api.user_timeline,
                           screen_name='@idea_cares').items():
    ...  # collect tweet.text, tweet.created_at, etc.

Note: Twitter imposes a restriction of around 3,200 latest tweets via user_timeline / screen_name, and therefore the whole study is based upon the 3,200 latest tweets from each of the companies in scope (which is fair for a like-for-like comparison).

Step 2: Text scrubbing, cleaning and feature extraction.

The tweet content ('text') is of prime interest for this analysis and requires a series of pre-processing steps, including:

Tokenization – breaking 'long' sentences into more useful collocational patterns ('tokens'), which reduces the length of the text while retaining its intent. Used Python's nltk library features, such as tokenization and/or NER (named entity recognition).

HTML decoding – a tweet's 'text' contains instances of HTML/XML references that need to be decoded for useful interpretation of the information. Used Python's BeautifulSoup library.

Miscellaneous steps – removing non-useful strings (@mentions, hashtags, stopwords, punctuation) and stemming. Used Python's regular expression re library.

import pandas as pd
import re
from bs4 import BeautifulSoup
from nltk.tokenize import WordPunctTokenizer
from nltk import ne_chunk  # ne_chunk lives in nltk's chunk module, not nltk.tokenize
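The scrubbing steps above can be sketched as one small cleaning function. For portability this sketch uses only the standard library (html.unescape in place of BeautifulSoup's entity decoding, and a plain regex in place of nltk's tokenizer); the sample tweet is made up for illustration.

```python
import re
from html import unescape  # stdlib stand-in for BeautifulSoup's entity decoding

def clean_tweet(text):
    """Scrub a raw tweet: decode HTML entities, strip @mentions, URLs,
    hashtags and non-letter characters, then lowercase and tokenize."""
    text = unescape(text)                      # e.g. '&#39;' -> "'"
    text = re.sub(r'@\w+', '', text)           # drop @mentions
    text = re.sub(r'https?://\S+', '', text)   # drop URLs
    text = re.sub(r'#\w+', '', text)           # drop hashtags
    text = re.sub(r'[^a-zA-Z\s]', ' ', text)   # keep letters only
    return text.lower().split()

# Hypothetical tweet, purely for illustration
print(clean_tweet("@Idea_cares My 4G isn&#39;t working! http://t.co/xyz #fail"))
```

Stopword removal and stemming (via nltk) would slot in as extra passes over the returned token list.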

Step 3: Sentiment / opinion / subjectivity / polarity extraction.

Some background

The majority of sentiment extraction methods rely on looking at words (or combinations of words) in isolation (lexicon-based), looking up pre-derived assessments (of subjectivity, polarity etc., derived from labelled texts, sentences or words) for these patterns. The final label (positivity, negativity, neutrality, subjectivity etc.) is a mathematical combination (mostly a summation) of the scores of the individual lexicon entries in the text being analysed. This is the simplest form of sentiment extraction and does not necessarily involve heavy-duty machine learning algorithms.
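This lexicon-based summation can be sketched in a few lines. The lexicon below is a toy, made-up one, not taken from any real resource such as SentiWordNet:

```python
# Toy lexicon: illustrative scores only, not from any real sentiment resource
LEXICON = {"great": 1.0, "thanks": 0.5, "sorry": -0.2, "worst": -1.0, "delay": -0.5}

def lexicon_sentiment(tokens):
    """Sum the per-word scores; the sign of the total gives the label."""
    total = sum(LEXICON.get(w, 0.0) for w in tokens)  # unknown words score 0
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"

print(lexicon_sentiment(["thanks", "great", "support"]))  # -> positive
print(lexicon_sentiment(["worst", "network", "delay"]))   # -> negative
```

Real lexicons also handle negation, intensifiers and part-of-speech; this shows only the core scoring idea.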

The second category of popular sentiment extraction mechanisms (supervised) utilizes Bayes' theorem. A naïve Bayes classifier works by figuring out the probability of different attributes of the data being associated with a certain class; the attributes are assumed to be independent of each other (the 'naïve' assumption). A simple way to visualize this classifier is to imagine the factors we would process in our brains (e.g. wind speed, cloudy/sunny day, water temperature etc.) to decide experientially, based on past good or bad days for fishing, looking at each factor independently. Applying those past experiences to the current conditions, we decide to go fishing ('positive decision') or not ('negative decision'). One more example here.
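The fishing intuition translates into surprisingly little code. Below is a minimal from-scratch naïve Bayes word classifier, with made-up training tweets, Laplace (+1) smoothing, and log probabilities to avoid numeric underflow:

```python
import math
from collections import Counter

# Made-up labelled token lists, purely for illustration
train = [
    (["great", "service", "thanks"], "pos"),
    (["quick", "resolution", "great"], "pos"),
    (["no", "network", "worst", "service"], "neg"),
    (["complaint", "ignored", "worst"], "neg"),
]

# Count words per class and documents per class
word_counts = {"pos": Counter(), "neg": Counter()}
label_counts = Counter()
for tokens, label in train:
    label_counts[label] += 1
    word_counts[label].update(tokens)
vocab = {w for c in word_counts.values() for w in c}

def classify(tokens):
    """Pick the label maximising log P(label) + sum log P(word | label)."""
    def score(label):
        total = sum(word_counts[label].values())
        s = math.log(label_counts[label] / sum(label_counts.values()))
        for w in tokens:  # Laplace (+1) smoothing over the vocabulary
            s += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        return s
    return max(label_counts, key=score)

print(classify(["worst", "network"]))  # -> neg
```

The independence assumption shows up in the plain summation of per-word log probabilities: no word interactions are modelled.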

The third category of sentiment extraction deals with the compositional construct of a statement (recursive deep models). These models are trained on 'tree-like' labelled statement compositions and are claimed to be superior to the other classification methods. More information here.

For this study

For this use case I experimented with lexicon classification as well as naïve Bayes classification, and settled on NB as I found it to be slightly more accurate in predicting labelled data.

  • For lexicon classification – used ‘sentlex’ library
  • For NB classification – used ‘TextBlob’ library
import sentlex                      # lexicon-based scoring, e.g. sentlex.SWN3Lexicon()
from textblob import TextBlob
from textblob.sentiments import NaiveBayesAnalyzer  # TextBlob's NB analyzer

Results:

Sentiment distribution by operator response

This is where the Twitter responses of the customer care handles are classified into negative, neutral and positive sentiments. As described before, I ultimately used naïve Bayes to score individual tweet responses by operator, and the following is the summary:

-> Almost 50% of Airtel, Jio and Vodafone response sentiments are positive, while Idea is far behind at 31%.

-> Interestingly, about half (~50%) of Vodafone and Idea response sentiments turned out to be subjective (or neutral) in intent. Since the prime objective of a customer care Twitter handle should be the resolution of customer concerns, subjective (neutral) intent isn't necessarily a good thing for this purpose.

-> At 16%, the negative sentiment in Jio's customer-issue communications is closely followed by Airtel (14%) and Idea (12%), whereas Vodafone's 2% negative sentiment looks like a distinctly different strategy from the rest.

Human vs. Robotic responses

After text processing, I grouped the responses from the collected tweets to observe how often the exact same response narrative repeats. My objective in this analysis is to gauge the extent of personalized focus (which is the better strategy) in resolving customer complaints versus standard (and perhaps automated) responses. I categorize the uniqueness of responses into three buckets:

-> Human response: the verbiage (exact phrase) is distinct and does not repeat across more than 10 tweets in the entire dump.

-> Humanoid response: the same verbiage repeats across up to 50 tweets.

-> Robotic response: the same verbiage repeats across more than 50 tweets.
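With those thresholds, the bucketing is a straightforward frequency count over the cleaned response texts. A minimal sketch (the sample replies below are fabricated for illustration):

```python
from collections import Counter

def categorize_responses(texts, human_max=10, humanoid_max=50):
    """Bucket each response by how often its exact verbiage repeats:
    <= human_max repeats -> human, <= humanoid_max -> humanoid, else robotic."""
    freq = Counter(texts)
    buckets = Counter()
    for text, n in freq.items():
        if n <= human_max:
            buckets["human"] += n
        elif n <= humanoid_max:
            buckets["humanoid"] += n
        else:
            buckets["robotic"] += n
    return buckets

# Fabricated dump: one canned reply repeated 60x, one 20x, five distinct replies
sample = (["please dm your number"] * 60 + ["team will revert"] * 20
          + ["r1", "r2", "r3", "r4", "r5"])
print(categorize_responses(sample))  # robotic: 60, humanoid: 20, human: 5
```

Dividing each bucket by the total number of tweets then gives the percentage split reported below.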

As an end customer, I would like to receive a specific answer to my concern rather than generic feedback, and here Jio surpasses the rest of the telcos with a whopping 97% distinct responses, followed by Airtel (87%), Idea (78%) and Vodafone (74%).

'I' vs. 'We'

While analysing the word clouds of the Twitter handles of each of the telcos, I observed that, barring one operator (Airtel), everyone else uses terms depicting collective responsibility (i.e. 'we', 'our', 'team' etc.) for resolving an issue, instead of individual ownership (i.e. 'I', 'me', etc.) in their responses.

While it might be a bigger topic for research, personally, as a customer I feel more comfortable sensing that an individual is taking ownership of my concern, rather than having the onus put back on 'others'. In the end it might still be the same mechanism that resolves the problem (collectively, by the team), but for communication comfort, I like Airtel's approach.
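One rough way to quantify the 'I' vs. 'we' observation is a simple token tally over the responses. The word sets and sample tweets below are my own illustrative choices, not the study's actual lists:

```python
import re
from collections import Counter

INDIVIDUAL = {"i", "me", "my"}            # individual-ownership terms (assumed set)
COLLECTIVE = {"we", "our", "us", "team"}  # collective-responsibility terms (assumed set)

def ownership_profile(tweets):
    """Count individual vs. collective ownership terms across tweets."""
    counts = Counter()
    for t in tweets:
        for w in re.findall(r"[a-z']+", t.lower()):
            if w in INDIVIDUAL:
                counts["individual"] += 1
            elif w in COLLECTIVE:
                counts["collective"] += 1
    return counts

# Fabricated responses, for illustration
print(ownership_profile(["We are sorry, our team will look into this",
                         "I have raised a ticket, I will keep you posted"]))
```

Comparing the two tallies per operator gives a crude but direct measure of the communication stance described above.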

Word Cloud

A word cloud (tag cloud) is an interesting way (though slightly controversial in the data science community) of visualizing the frequent usage of specific tokens (words, tuples etc.). There are observations on the usage of specific tokens across the negative vs. positive vs. neutral categories of tweets, which might be useful for deeper assessment (and perhaps a change in response strategy) towards promoting a more comfortable narrative to end customers.

I used Python’s wordcloud and matplotlib libraries to visualize these.

from wordcloud import WordCloud
import matplotlib.pyplot as plt
plt.style.use('seaborn-white')
# e.g. cloud = WordCloud(background_color='white').generate(' '.join(tokens))
#      plt.imshow(cloud, interpolation='bilinear'); plt.axis('off'); plt.show()

Conclusion

The prime objective of documenting this study was to showcase an interesting use case from one of the many application areas of natural language processing, and to analyse the outcome of these insights in a commercial context.

In my experience, many organizations are already working along these lines and benefiting from the immense value that advancements in machine learning and data-driven strategies bring to the table. With more and more data becoming accessible for analysis, the cost of storage and compute declining, and data science communities contributing openly, many interesting use cases will continue to emerge.

Exciting times!