Instead of highlighting single words, find the important word combinations in the text and highlight the most frequent ones. A combination of two words is called a bigram, a combination of three words a trigram, and so on.
The code below first extracts the important word combinations (noun phrases) from the text using the textblob library, then visualizes them as a word cloud.
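To make the bigram and trigram idea concrete, here is a minimal sketch that counts word combinations using only the Python standard library. The helper name `ngrams`, the simple regex tokenizer, and the sample sentence are assumptions made for illustration; they are not part of the original tutorial.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Return every run of n consecutive words in text as a tuple."""
    # Lowercase the text and keep runs of letters/apostrophes as words
    # (a deliberately simple tokenizer, assumed for this sketch)
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of length n across the word list
    return list(zip(*(words[i:] for i in range(n))))


# Hypothetical sample sentence
sample = "the quick brown fox jumps over the quick brown dog"

bigram_counts = Counter(ngrams(sample, 2))
trigram_counts = Counter(ngrams(sample, 3))

# Show the most frequent combinations
print(bigram_counts.most_common(2))
print(trigram_counts.most_common(1))
```

Counting every consecutive pair or triple treats all combinations equally, including uninformative ones such as "over the". Noun phrase extraction is one way to keep only the combinations that name something.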
```python
# Install the textblob library
# !pip install textblob
# noun_phrases relies on corpora that are downloaded once with:
# !python -m textblob.download_corpora
# Install the wordcloud library
# !pip install wordcloud

# A sample text snippet from a web article about Donald Trump
Article = '''Trump-critical media do continue to find elite audiences. Their investigations still win Pulitzer Prizes; their reporters accept invitations to anxious conferences about corruption, digital-journalism standards, the end of nato, and the rise of populist authoritarianism. Yet somehow all of this earnest effort feels less and less relevant to American politics. President Trump communicates with the people directly via his Twitter account, ushering his supporters toward favorable information at Fox News or Breitbart. Despite the hand-wringing, the country has in many ways changed much less than some feared or hoped four years ago. Ambitious Republican plans notwithstanding, the American social-welfare system, as most people encounter it, has remained largely intact during Trump’s first term. The predicted wave of mass deportations of illegal immigrants never materialized. A large illegal workforce remains in the country, with the tacit understanding that so long as these immigrants avoid politics, keeping their heads down and their mouths shut, nobody will look very hard for them.
'''

########################################################################
# Finding the important word combinations using textblob
from textblob import TextBlob

# Converting the sample text to a blob
SampleTextInBlobFormat = TextBlob(Article)

# Finding the noun phrases (important keyword combinations) in the text
# This helps to find out which entities the text talks about
NounPhrases = SampleTextInBlobFormat.noun_phrases

# Joining the words of each noun phrase with an underscore
# so that the word cloud treats every phrase as a single token
NewNounList = []
for words in NounPhrases:
    NewNounList.append(words.replace(" ", "_"))

# Converting the list into a string to plot the word cloud
NewNounString = ' '.join(NewNounList)
print('##### Important word combinations ####')
print(NewNounString)

########################################################################
# Plotting the wordcloud
# Jupyter magic; remove this line when running as a plain Python script
%matplotlib inline
import matplotlib.pyplot as plt
from wordcloud import WordCloud, STOPWORDS

# Creating a custom list of stopwords
customStopwords = list(STOPWORDS) + ['less', 'Trump', 'American', 'politics', 'country']

wordcloudimage = WordCloud(
    max_words=50,
    font_step=2,
    max_font_size=500,
    stopwords=customStopwords,
    background_color='black',
    width=1000,
    height=720
).generate(NewNounString)

plt.figure(figsize=(20, 8))
plt.imshow(wordcloudimage)
plt.axis("off")
plt.show()
```
Sample Output:

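Since the goal is to highlight the most frequent combinations, it can help to count the phrases explicitly before plotting. The sketch below uses only the standard library. The helper `phrase_frequencies` and the example phrases are hypothetical; in the tutorial code the phrases would come from `NounPhrases`.

```python
from collections import Counter


def phrase_frequencies(phrases, top_n=None):
    """Count phrases after joining their words with underscores.

    Returns a dict ordered from most to least frequent; top_n=None keeps all.
    """
    # Skip empty entries and join multi-word phrases into single tokens
    joined = (p.strip().replace(" ", "_") for p in phrases if p.strip())
    counts = Counter(joined)
    return dict(counts.most_common(top_n))


# Hypothetical noun phrases standing in for TextBlob's output
example_phrases = [
    "fox news",
    "social-welfare system",
    "fox news",
    "twitter account",
]

frequencies = phrase_frequencies(example_phrases)
print(frequencies)
```

A dictionary like this can be passed to `WordCloud(...).generate_from_frequencies(frequencies)` in place of `generate(NewNounString)`. That path takes the counts as given, so any stopword filtering would need to happen before the counting step.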
Author Details
Lead Data Scientist
Farukh is an innovator in solving industry problems using artificial intelligence. His expertise is backed by 10 years of industry experience. As a senior data scientist, he is responsible for designing AI/ML solutions that provide maximum gains for clients. As a thought leader, he focuses on solving the key business problems of the CPG industry. He has worked across different domains such as Telecom, Insurance, and Logistics, and with global tech leaders including Infosys, IBM, and Persistent Systems. His passion for teaching inspired him to create this website!