题目描述
Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.
For example,
[2,3,4], the median is 3
[2,3], the median is (2 + 3) / 2 = 2.5
Design a data structure that supports the following two operations:
void addNum(int num) - Add a integer number from the data stream to the data structure.
double findMedian() - Return the median of all elements so far.
Example:
addNum(1)
addNum(2)
findMedian() -> 1.5
addNum(3)
findMedian() -> 2
Follow up:
If all integer numbers from the stream are between 0 and 100, how would you optimize it?
If 99% of all integer numbers from the stream are between 0 and 100, how would you optimize it?
写法1【比较推荐】
双priority queue,使用even bool来记录当前的情况
中位数即排好序的数组中的中间的位置的数字,如果有偶数个数字,那么取中间两个值的平均值。
xxxx m yyyy
q1: xxxx大堆
q2: yyyy小堆
xxxxm yyyy z // push from odd
1. xxxxzm yyyy -> xxxxxx yyyy -> xxxxx yyyyx -> xxxxx yyyyy -> m = (x + y) / 2
2. xxxxm zyyyy -> xxxxm yyyyy -> xxxxx yyyyy -> m = (x + y) / 2
xxxm yyyy z // push from even
1. xxxzm yyyy -> xxxx yyyyx -> xxxxm yyyy
2. xxxm yyyyz -> xxxx yyyyy -> xxxxm yyyy
if there are even num of nodes
then push to large then push the top of large to small // now it’s odd
else
then push to small then push the top of small to large // now it’s even
C++ 代码
class MedianFinder {
private:
priority_queue<double> small, large;
bool even;
public:
/** initialize your data structure here. */
MedianFinder() {
even = true;// 0 is even
}
void addNum(int num)
{
if (even) { // small.size() == large.size()
large.push(-num);
small.push(-large.top());
large.pop();
} else { // small.size() - large.size() == 1
small.push(num);
large.push(-small.top());
small.pop();
}
even = !even;
}
double findMedian()
{
return !even ? small.top() : 0.5 * (small.top() - large.top());
}
};
/**
* Your MedianFinder object will be instantiated and called as such:
* MedianFinder* obj = new MedianFinder();
* obj->addNum(num);
* double param_2 = obj->findMedian();
*/
写法2
双priority queue,使用两个queue的大小来记录当前的情况
2
small: 2 -> empty -> 2
large: -2 -> empty
median: 2
2, 3
small: 2 -> 3,2 -> 2
large: empty -> -3
median: $\frac{2 - (-3)}{2} = 2.5$
2, 3, 4
small: 2 -> 4, 2 -> 2 -> 3, 2
large: -3 -> -3, -4 -> -4
median: 3
C++ 代码
class MedianFinder {
private:
priority_queue<long> small, large;
public:
/** initialize your data structure here. */
MedianFinder() {}
void addNum(int num) {
// blindly push in two pq
small.push(num);
large.push(-small.top());
// small is always greater than large in size, at most greater by one, equal is fine
small.pop(); // leave the new element in large
if (small.size() < large.size()) {
small.push(-large.top());
large.pop(); // large has one more, so move it to small
}
}
double findMedian() {
return small.size() != large.size() ? small.top()
: 0.5 * (small.top() - large.top());
}
};
/**
* Your MedianFinder object will be instantiated and called as such:
* MedianFinder* obj = new MedianFinder();
* obj->addNum(num);
* double param_2 = obj->findMedian();
*/
时间复杂度
priority_queue的每次添加是$O(\log{n})$, 访问头节点是$O(1)$,所以加在一起是$O(\log{n})$
但是空间复杂度较高,因为需要储存所有的数字,所以是$O(n)$。
在followup 1中说如果所有的数字都在0-100的范围内,那么其实可以牺牲一些find的时间但是提高储存的空间:
使用桶排序$O(100)$
class MedianFinder {
private:
int A[101], n;
public:
MedianFinder() {
n = 0;
memset(A, 0, sizeof A);
}
void addNum(int num) {
A[num]++;
n++;
}
double findMedian() {
int count = 0, i = 0;
while (count < n/2) count += A[i++];
int j = i;
while (count < n/2 + 1) count += A[j++];
return (n % 2 == 1) ? j-1 : (i + j - 2) / 2.0;
}
};
第二个followup没有什么想法……但是感觉可以根据第一个follow做优化?
你的第三个 写法是
基数排序
,不是桶排序