题目描述
Given a non-empty list of words, return the k most frequent elements.
Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.
Example 1:
Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words.
Note that "i" comes before "love" due to a lower alphabetical order.
Example 2:
Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words,
with the number of occurrence being 4, 3, 2 and 1 respectively.
Note:
- You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
- Input words contain only lowercase letters.
Follow up:
- Try to solve it in O(n log k) time and O(n) extra space.
题意:给一非空的单词列表,返回前 k 个出现次数最多的单词。返回的答案应该按单词出现频率由高到低排序。如果不同的单词有相同出现频率,按字母顺序排序。
算法1
(优先队列)
题解:因为要求是$O(nlogk)$的时间复杂度,提示我们使用优先队列(堆)这个数据结构来解决问题。首先我们使用一个hash表预先统计出每个单词出现的次数,然后我们定一个word的结构体存储每个单词以及出现次数。然后我们使用一个容量为K的优先队列(小顶堆序)。
先将前k
个单词塞进去,然后遍历剩余单词,如果当前单词比堆顶元素大,那么我们就弹出堆顶元素,将当前单词加入队列。
遍历完所有单词后,优先队列中存储的就是最大的k
个单词,然后依次弹出,因为是小顶堆所以弹出的单词是从小到大的,我们再做一次反转就可以了。
这里有一些小trick,c++默认是大顶堆序的,并且内部排序比较大小的时候是默认使用<
,我们为了使其成为小顶堆序,我们需要运算符重载小于号。根据条件,频次越高并且字典序越小的单词应该是更大的那个,为了让它成为小顶堆序,在重载小于号的时候我们认为频次越高并且字典序越小的单词应该是更小的那个。
这里还需要说明的是,在判断堆顶元素和当前元素大小的时候,由于我们把原来的小于号重载了,如果我们仍然使用小于号的话,会得到完全相反的结果,所以我们将大于号也重载,这时候我们是认为频次越高并且字典序越小的单词应该是更大的那个。
这时候有趣的现象就是这个两个运算符重载的函数竟然一模一样!!!
C++ 代码
class Solution {
public:
struct word{
string str;
int count;
word(){}
word(string str,int count):str(str),count(count){};
bool operator< (const word & b) const {
return (count == b.count)?(str < b.str):(count > b.count);
}
bool operator> (const word & b) const
{
return (count == b.count)?(str < b.str):(count > b.count);
}
};
vector<string> topKFrequent(vector<string>& words, int k) {
unordered_map<string,int> hash;
for(auto w:words) hash[w] ++;
vector<word> dict;
for(auto it:hash) dict.push_back(word(it.first,it.second));
priority_queue<word> q;
for(int i = 0 ; i < k ; i ++)
q.push(dict[i]);
for(int i = k ; i < dict.size() ; i ++)
{
if(dict[i] > q.top())
{
q.pop();
q.push(dict[i]);
}
}
vector<string> res;
while(!q.empty())
{
res.push_back(q.top().str);
q.pop();
}
reverse(res.begin(),res.end());
return res;
}
};