Unit 4 Session 2 Standard
Understand what the interviewer is asking for by using test cases and questions about the problem.
Plan the solution with appropriate visualizations and pseudocode.
General Idea: Convert the text to lowercase and remove punctuation. Split the text into words and count the frequency of each word using a dictionary. Then, identify the word(s) with the highest frequency.
1) Convert the entire `text` to lowercase to ensure case insensitivity.
2) Remove punctuation from the text, keeping only alphanumeric characters and whitespace.
3) Split the `text` into individual words.
4) Initialize an empty dictionary `frequency_dict` to store word frequencies.
5) Iterate through the list of words:
a) If the word is already in `frequency_dict`, increment its count.
b) If the word is not in `frequency_dict`, add it with a count of 1.
6) Determine the maximum frequency in `frequency_dict`.
7) Initialize a list `most_frequent_words` to store words with the highest frequency.
8) Iterate through `frequency_dict` and add words with the maximum frequency to `most_frequent_words`.
9) Return `frequency_dict` and `most_frequent_words`.
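Steps 6 through 8 can be sketched as two separate passes over an already built dictionary. This is a minimal illustration of those steps only, not the reference solution; the helper name `find_most_frequent` and the sample counts are made up for the example.

```python
def find_most_frequent(frequency_dict):
    # Hypothetical helper illustrating steps 6-8 only.
    # Step 6: find the maximum frequency (stays 0 if the dictionary is empty).
    max_frequency = 0
    for freq in frequency_dict.values():
        if freq > max_frequency:
            max_frequency = freq

    # Steps 7-8: collect every word whose count equals the maximum.
    most_frequent_words = []
    for word, freq in frequency_dict.items():
        if freq == max_frequency:
            most_frequent_words.append(word)
    return most_frequent_words


sample_counts = {"stay": 2, "calm": 2, "now": 1}  # made-up counts with a tie
print(find_most_frequent(sample_counts))  # ['stay', 'calm']
```

Comparing with `==` in the second pass is what keeps tied words together; stopping at the first word that reaches the maximum would drop `'calm'`.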
**⚠️ Common Mistakes**
- Not handling punctuation correctly, leading to incorrect word counts.
- Forgetting to account for case insensitivity when counting word frequencies.
- Not correctly identifying all words with the highest frequency in case of ties.
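To see why the first two mistakes matter, here is a small sketch of a deliberately flawed counter that skips both lowercasing and punctuation removal. The function name `naive_counts` and the sample sentence are assumptions made up for this illustration.

```python
def naive_counts(text):
    # Deliberately flawed: no lowercasing, no punctuation removal.
    counts = {}
    for word in text.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


sample = "Stay calm. stay calm, Stay!"  # made-up input
print(naive_counts(sample))
# {'Stay': 1, 'calm.': 1, 'stay': 1, 'calm,': 1, 'Stay!': 1}
```

Every token is counted as a distinct word, even though a reader would expect `stay` to appear three times and `calm` twice.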
def word_frequency_analysis(text):
    # Convert the text to lowercase and remove punctuation manually
    text = text.lower()
    clean_text = ''
    for char in text:
        if char.isalnum() or char.isspace():
            clean_text += char

    # Split the text into words
    words = clean_text.split()

    # Dictionary to store word frequencies
    frequency_dict = {}
    for word in words:
        if word in frequency_dict:
            frequency_dict[word] += 1
        else:
            frequency_dict[word] = 1

    # Find the maximum frequency without using max
    max_frequency = -1
    most_frequent_words = []
    for word, freq in frequency_dict.items():
        if freq > max_frequency:
            max_frequency = freq
            most_frequent_words = [word]
        elif freq == max_frequency:
            most_frequent_words.append(word)

    return frequency_dict, most_frequent_words
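If standard-library helpers are allowed, the same idea could be expressed more compactly with `collections.Counter` and `str.translate`. The sketch below is an alternative, not the solution above; the name `word_frequency_counter` is hypothetical. It also behaves slightly differently: it strips only the ASCII punctuation listed in `string.punctuation`, whereas the manual loop drops every character that is neither alphanumeric nor whitespace.

```python
import string
from collections import Counter


def word_frequency_counter(text):
    # Hypothetical alternative; not part of the original solution.
    # Delete ASCII punctuation in one pass, then split the lowercased text.
    table = str.maketrans("", "", string.punctuation)
    words = text.lower().translate(table).split()

    counts = Counter(words)
    if not counts:
        # No words at all: nothing to report as most frequent.
        return {}, []

    # Keep every word that reaches the top count so ties are preserved.
    top = max(counts.values())
    most_frequent = [word for word, freq in counts.items() if freq == top]
    return dict(counts), most_frequent


print(word_frequency_counter("Stay connected. Stay productive. Stay happy."))
# ({'stay': 3, 'connected': 1, 'productive': 1, 'happy': 1}, ['stay'])
```

The explicit empty check is needed because `max` raises `ValueError` on an empty sequence.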
Example Usage:
text = "The quick brown fox jumps over the lazy dog. The dog was not amused."
print(word_frequency_analysis(text))
# Output: ({'the': 3, 'quick': 1, 'brown': 1, 'fox': 1, 'jumps': 1, 'over': 1, 'lazy': 1, 'dog': 2, 'was': 1, 'not': 1, 'amused': 1}, ['the'])
text_2 = "Digital nomads love to travel. Travel is their passion."
print(word_frequency_analysis(text_2))
# Output: ({'digital': 1, 'nomads': 1, 'love': 1, 'to': 1, 'travel': 2, 'is': 1, 'their': 1, 'passion': 1}, ['travel'])
text_3 = "Stay connected. Stay productive. Stay happy."
print(word_frequency_analysis(text_3))
# Output: ({'stay': 3, 'connected': 1, 'productive': 1, 'happy': 1}, ['stay'])