Word Frequency in a Python String

This tutorial provides several techniques to count the frequency of each word in a Python string, followed by simple examples.

Here, we have to write a Python program that accepts a string as input and counts the occurrences of each word in it. There are several ways to approach this problem. Let’s go through each solution one by one.

Python Program – Compute Frequency of Words in a String

It is always exciting to solve a problem by taking diverse approaches. A real programmer keeps trying and considers doing things in a better way.

Using List to count the word frequency in a string

Let’s see how we can use a list to count the occurrences of each word in a string. The steps are as follows:

  • First, we convert the string to a list. Python strings have a split() method that breaks a string on a separator and returns a list of the pieces. Called with no arguments, as in our case, it splits on whitespace.
  • Next, we create a second list that is initially empty.
  • After that, we store the unique values of the first list in the second one.
  • Finally, we use Python’s range() to loop over the indices of the list of unique words.
  • Inside the loop, the count() method gives us the number of times each unique word occurs in the list of words from the original string.
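
As a quick check of the two building blocks used in these steps, the short sketch below shows how split() and list.count() behave on a small string. The sample text and variable names are illustrative only and are not part of the full program.

```python
# Illustrative sample; any whitespace-separated text behaves the same way.
sample = "python  php python"

# split() with no arguments splits on runs of whitespace,
# so the double space does not produce an empty entry.
words = sample.split()
print(words)                  # ['python', 'php', 'python']

# list.count() returns how many times a value occurs in the list.
print(words.count("python"))  # 2
print(words.count("java"))    # 0
```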

The full logic is shown in the code snippet below.

"""
Program:
 Python program to count frequency of each word in a string
"""
def get_word_freq(input_string): 

   # convert the input string into a list of words
   input_string_list = input_string.split()     
   
   print("*******************")
   print("input_string_list = ", input_string_list)
   print("*******************\n")
    
   unique_string_list = [] 

   # iterate the input string list and find unique words 
   for i in input_string_list:         

      # test for duplicate values 
      if i not in unique_string_list: 

         # add unique words to second list
         unique_string_list.append(i) 

   print("*******************")
   print("unique_string_list = ", unique_string_list)
   print("*******************\n")
   
   print("*******************")
   for i in range(0, len(unique_string_list)): 

      # compute word frequency in input string 
      print('Word Frequency [{}]: {}'.format(unique_string_list[i], input_string_list.count(unique_string_list[i])))
    
   print("*******************")

def Driver(): 
   input_string ='python csharp javascript php python javascript csharp python csharp php'
   get_word_freq(input_string)                

if __name__=="__main__": 
   Driver()          # call Driver() function 

The result of the above coding snippet is as follows:

*******************
input_string_list =  ['python', 'csharp', 'javascript', 'php', 'python', 'javascript', 'csharp', 'python', 'csharp', 'php']
*******************

*******************
unique_string_list =  ['python', 'csharp', 'javascript', 'php']
*******************

*******************
Word Frequency [python]: 3
Word Frequency [csharp]: 3
Word Frequency [javascript]: 2
Word Frequency [php]: 2
*******************

Sometimes, you may also need to convert a list back to a string, so it is worth reviewing how to do that as well.

Using Python set() to get the word frequency

Alternatively, we can use Python’s set() function to compute the frequency of each word in a string. The high-level steps are given below.

  • As in the first method, we start by splitting the input string into a list of words.
  • After that, we use a Python set to remove the duplicates from that list. By definition, a set holds only unique values and discards repeated ones.
  • Finally, we iterate over the set and count the occurrences of each word in the list.
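
The deduplication step can be seen in isolation in the minimal sketch below. The sample list is illustrative only. Note that a set does not preserve the order of the original words, so the sketch sorts the set before printing to get a stable order.

```python
# Illustrative list of words, as split() would return it.
words = ["python", "php", "python", "csharp", "php"]

# set() keeps one copy of each distinct word.
unique_words = set(words)
print(len(unique_words))  # 3

# Sets are unordered; sort them when a stable display order matters.
for word in sorted(unique_words):
    print(word, words.count(word))
```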

The full logic is shown in the code snippet below.

"""
Program:
 Python program to count frequency of each word in a string
"""
def get_word_freq(input_string): 

   # break the string into list of words 
   input_string_list = input_string.split() 

   # gives set of unique words 
   unique_string_set = set(input_string_list) 
   
   print("*******************")
   print("input_string_list = ", input_string_list)
   print("*******************\n")
    
   print("*******************")
   print("unique_string_set = ", unique_string_set)
   print("*******************\n")

   for entry in unique_string_set : 
      print('Frequency of ', entry , 'is :', input_string_list.count(entry)) 

# driver code 
if __name__ == "__main__": 
   
   input_string ='python csharp javascript php python javascript csharp python csharp php'
   
   # calling the freq function 
   get_word_freq(input_string) 

The result of the above coding snippet is as follows. Since a set is unordered, the order of the words may differ from run to run:

*******************
input_string_list =  ['python', 'csharp', 'javascript', 'php', 'python', 'javascript', 'csharp', 'python', 'csharp', 'php']
*******************

*******************
unique_string_set =  {'csharp', 'javascript', 'python', 'php'}
*******************

Frequency of  csharp is : 3
Frequency of  javascript is : 2
Frequency of  python is : 3
Frequency of  php is : 2
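
Both programs above call count() once per unique word, and each call rescans the whole list. As an alternative that is not part of the original programs, the sketch below uses collections.Counter from the standard library to tally every word in a single pass. The function name count_words is hypothetical and chosen only for this illustration.

```python
from collections import Counter

def count_words(input_string):
    """Return a Counter mapping each word to its number of occurrences."""
    # split() breaks the string on whitespace; Counter tallies the words.
    return Counter(input_string.split())

input_string = 'python csharp javascript php python javascript csharp python csharp php'
word_freq = count_words(input_string)

# A Counter iterates in first-seen order (Python 3.7+).
for word, freq in word_freq.items():
    print('Word Frequency [{}]: {}'.format(word, freq))
```

For long inputs this avoids the repeated scans, at the cost of relying on a helper class rather than plain lists or sets.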

To learn more, read our flagship Python tutorial for beginners and advanced learners.
