The code below tries to check whether all the words in searchWords appear in newsPaperWords. Both lists can contain duplicates. If a word appears n times in searchWords, it has to appear at least n times in newsPaperWords for the method to return true. I thought that the time complexity was 2*O(n) + O(m), but the interviewer told me that it is 2*O(n log n) + O(m log m).
    /**
     * @param searchWords The words we're looking for. Can contain duplicates
     * @param newsPaperWords The list to look into
     */
    public boolean wordMatch(List<String> searchWords, List<String> newsPaperWords) {
        Map<String, Integer> searchWordCount = getWordCountMap(searchWords);
        Map<String, Integer> newspaperWordCount = getWordCountMap(newsPaperWords);
        for (Map.Entry<String, Integer> searchEntry : searchWordCount.entrySet()) {
            Integer occurrencesInNewspaper = newspaperWordCount.get(searchEntry.getKey());
            if (occurrencesInNewspaper == null || occurrencesInNewspaper < searchEntry.getValue()) {
                return false;
            }
        }
        return true;
    }

    private Map<String, Integer> getWordCountMap(List<String> words) {
        Map<String, Integer> result = new HashMap<>();
        for (String word : words) {
            Integer occurrencesThisWord = result.get(word);
            if (occurrencesThisWord == null) {
                result.put(word, 1);
            } else {
                result.put(word, occurrencesThisWord + 1);
            }
        }
        return result;
    }
As I see it, the time complexity of the method is 2*O(n) + O(m) (where n is the number of elements in searchWords and m is the number of elements in newsPaperWords):
- The method getWordCountMap() has a complexity of O(n), where n is the number of elements in the given list. The method loops over the list once, assuming that the calls to result.get(word) and result.put() are O(1).
- Then, the iteration over searchWordCount.entrySet() is, worst-case, O(n), assuming, again, that calls to HashMap.get() are O(1).
So, simply adding: O(n) + O(m) to build the two maps, plus O(n) for the last loop.
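To make the counting above concrete, here is a minimal, self-contained sketch of the same logic that also tallies the map calls. The class name WordMatchDemo and the mapOps counter are illustrative additions, not part of the original code. The tally only shows how many get/put calls are made (at most 2n + 2m + n); it says nothing about how much each individual call costs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class; only the two methods mirror the code in the question.
class WordMatchDemo {
    // Illustrative counter: number of HashMap get/put calls made so far.
    static long mapOps = 0;

    static boolean wordMatch(List<String> searchWords, List<String> newsPaperWords) {
        Map<String, Integer> searchWordCount = getWordCountMap(searchWords);
        Map<String, Integer> newspaperWordCount = getWordCountMap(newsPaperWords);
        for (Map.Entry<String, Integer> searchEntry : searchWordCount.entrySet()) {
            mapOps++; // one get per distinct search word
            Integer occurrencesInNewspaper = newspaperWordCount.get(searchEntry.getKey());
            if (occurrencesInNewspaper == null || occurrencesInNewspaper < searchEntry.getValue()) {
                return false;
            }
        }
        return true;
    }

    static Map<String, Integer> getWordCountMap(List<String> words) {
        Map<String, Integer> result = new HashMap<>();
        for (String word : words) {
            mapOps += 2; // one get and one put per word
            Integer occurrencesThisWord = result.get(word);
            result.put(word, occurrencesThisWord == null ? 1 : occurrencesThisWord + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> search = Arrays.asList("a", "b", "a");          // n = 3
        List<String> paper = Arrays.asList("a", "c", "b", "a", "d"); // m = 5
        System.out.println(wordMatch(search, paper)); // prints true
        System.out.println(mapOps);                   // prints 18 (6 + 10 + 2)
    }
}
```

The number of calls grows linearly with n and m, so any extra factor in the total has to come from the cost of a single get() or put().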
After reading this answer, and taking O(n) as the worst-case complexity for HashMap.get(), I could understand that the complexity of getWordCountMap() goes up to O(n*2n) and that of the final loop to O(n*n), which would give a total complexity of O(n*2n) + O(m*2m) + O(n*n).
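The worst case referred to above arises when many keys end up in the same bucket. A minimal sketch of that situation, assuming a hypothetical key class CollidingKey whose hashCode() is deliberately constant so that every key collides; the equalsCalls counter is likewise an illustrative addition. How many comparisons actually happen depends on the HashMap implementation and the JDK version, so no expected count is shown.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo: forces every key into the same bucket.
class CollisionDemo {
    // Illustrative counter: how often the map had to compare keys with equals().
    static long equalsCalls = 0;

    // Hypothetical key type with a deliberately bad (constant) hash code.
    static final class CollidingKey {
        final String word;

        CollidingKey(String word) {
            this.word = word;
        }

        @Override
        public int hashCode() {
            return 42; // every key collides
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof CollidingKey && ((CollidingKey) other).word.equals(word);
        }
    }

    // Same counting logic as getWordCountMap(), but with colliding keys.
    static Map<CollidingKey, Integer> countWords(String[] words) {
        Map<CollidingKey, Integer> result = new HashMap<>();
        for (String word : words) {
            CollidingKey key = new CollidingKey(word);
            Integer occurrences = result.get(key);
            result.put(key, occurrences == null ? 1 : occurrences + 1);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<CollidingKey, Integer> counts = countWords(new String[] {"a", "b", "a", "c"});
        System.out.println(counts.get(new CollidingKey("a"))); // prints 2
        System.out.println("equals() calls: " + equalsCalls);  // implementation-dependent
    }
}
```

The counts stay correct, but each get() and put() now has to compare against other keys in the bucket instead of finding its entry directly, which is where the per-call cost stops being O(1).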
But how is it 2*O(n log n) + O(m log m)?