I'm trying to find the median of a std::set. Since std::set already sorts everything, I just have to pick the middle element. My idea is to advance to the half: std::advance(e, rtspUrls.size() / 2);, but I'm not sure how it'll behave. What about numbers like 1.5? Will it advance to something?

I'm using a try catch to try to not advance into something undefined. Is this safe?

According to http://www.cplusplus.com/reference/algorithm/min_element/?kw=min_element, std::advance throws if the iterator throws. I'm not sure if the iterator for std::set throws when we try to ++ it (https://en.cppreference.com/w/cpp/named_req/BidirectionalIterator does not say anything).

std::set<RTSPUrl, decltype(compare_rtsp_url)*> rtspUrls(compare_rtsp_url);
for (const RTSPUrl &rtspUrl : stream.rtsp_urls())
{
    if (rtspUrl.has_resolution())
    {
        rtspUrls.insert(rtspUrl); // std::set has insert(), not push_back()
    }
}
// Take begin() only after the set is filled: begin() of an empty set is end().
std::set<RTSPUrl, decltype(compare_rtsp_url)*>::iterator e = rtspUrls.begin();
try
{
    std::advance(e, rtspUrls.size() / 2);
    return *e;
}
catch (std::exception &ex)
{
    return std::nullopt;
}
smac89
Guerlando OCs
  • You need to check if the number of elements is even/odd as that will determine what the *median* is [Median - wikipedia](https://en.wikipedia.org/wiki/Median) – David C. Rankin Sep 14 '20 at 05:52

3 Answers

I just have to pick the middle element. My idea is to advance to the half: std::advance(e, rtspUrls.size() / 2);, but I'm not sure how it'll behave. What about numbers like 1.5? Will it advance to something?

rtspUrls.size() returns an unsigned integer (size_type, usually size_t), so rtspUrls.size() / 2 is an integer division. For a set with 3 elements it yields 1, never 1.5.

I'm not sure if the iterator for std::set throws when we try to ++

No, it will not, but advancing beyond end() is undefined behaviour.

A true median for a set with an even number of elements would be the average of the two middle elements, but that requires the type you store in your std::set to support both + and /. Example:

std::set<double> foo{1., 2., 3., 10.};

if(foo.empty()) throw std::runtime_error("no elements in set");

double median;

if(foo.size() % 2 == 0) {                 // even number of elements 
    auto lo = std::next(foo.begin(), foo.size() / 2 - 1);
    auto hi = std::next(lo);
    median = (*lo + *hi) / 2.;
} else {                                  // odd number of elements
    median = *std::next(foo.begin(), foo.size() / 2);
}

std::cout << median << '\n'; // prints 2.5

In your case, the type stored in the set does not appear to support + and /, so you cannot average two RTSPUrls when the number of elements is even. You should probably just pick one of the two middle elements in that case, either by returning an iterator (so the caller can check whether it is rtspUrls.end()):

return std::next(rtspUrls.begin(), rtspUrls.size() / 2);

Or by returning a reference to, or copy of, the element:

if(rtspUrls.empty()) throw std::runtime_error("no elements in set");
return *std::next(rtspUrls.begin(), rtspUrls.size() / 2);
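
Since the code in the question returns std::nullopt on failure, a third option is to return a std::optional instead of throwing. The following is only a minimal sketch, assuming C++17; middle_element is a hypothetical name that does not appear in the question, and for an even number of elements it picks the upper of the two middle elements, as size() / 2 does above.

```cpp
#include <iterator>
#include <optional>
#include <set>

// Hypothetical helper (not from the question): returns a copy of the middle
// element of a std::set, or std::nullopt if the set is empty.
// For an even number of elements this is the upper of the two middle ones.
template <class T, class Compare, class Alloc>
std::optional<T> middle_element(const std::set<T, Compare, Alloc>& s) {
    if (s.empty()) {
        return std::nullopt;   // begin() == end(): nothing to dereference
    }
    // size() / 2 is always a valid offset for a non-empty set
    return *std::next(s.begin(), s.size() / 2);
}
```

With this, the emptiness check replaces the try/catch: no exception is involved, and end() is never dereferenced.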
Ted Lyngmo

With std::set you are limited to using iterators: iterate to the middle element (for an odd number of entries in your set), or iterate to middle-1 and middle and take the average (for an even number of entries) to determine the median.

A simple loop and a counter is about as straight-forward as it gets. A short example would be:

#include <iostream>
#include <set>

int main (void) {
    
#ifdef ODD
    std::set<std::pair<char,int>> s {{'a',1}, {'b',2}, {'c',3}, {'d',4}, {'e',5}};
#else
    std::set<std::pair<char,int>> s {{'a',1}, {'b',2}, {'c',3}, {'d',4}, {'e',5}, {'f',6}};
#endif
    double median = 0.;
    size_t n = 0;
    
    for (auto iter = s.begin(); iter != s.end(); iter++, n++) {
        if (n == s.size() / 2 - 1 && s.size() % 2 == 0) {
            median += iter->second;
            std::cout << iter->first << "  " << iter->second << '\n';
        }
        if (n == s.size() / 2) {
            median += iter->second;
            if (s.size() % 2 == 0)
                median /= 2.;
            std::cout << iter->first << "  " << iter->second
                    << "\n\nmedian " << median << '\n';
            break;
        }
    }
}

(of course you will have to adjust the types to meet your data)

Example Use/Output

Compiled with ODD defined:

$ ./bin/set_median
c  3

median 3

Compiled without additional definition for the EVEN case:

$ ./bin/set_median
c  3
d  4

median 3.5

std::next

You can use std::next to get the iterator n positions after the current one. It does not modify its argument, so you must assign the result:

    median = 0.;
    auto iter = s.begin();
    
    if (s.size() % 2 == 0) {
        iter = std::next(iter, s.size() / 2 - 1);
        median += iter->second;
        iter = std::next(iter);
        median += iter->second;
        median /= 2.;
    }
    else {
        iter = std::next(iter, s.size() / 2);
        median += iter->second;
    }
    std::cout << "\nmedian " << median << '\n';

std::advance

std::advance moves the iterator passed as its first argument forward by n positions, modifying it in place:

    median = 0.;
    iter = s.begin();
    if (s.size() % 2 == 0) {
        std::advance(iter, s.size() / 2 - 1);
        median += iter->second;
        std::advance(iter, 1);
        median += iter->second;
        median /= 2.;
    }
    else {
        std::advance(iter, s.size() / 2);
        median += iter->second;
    }
    std::cout << "\nmedian " << median << '\n';

(the output for median is the same as with the loop above)

Look things over and let me know if you have further questions.

David C. Rankin

I just have to pick the middle element

Only when the set contains an odd number of elements. When the size is even, the median is defined as the mean of the two middle values, which are sometimes called the lower and upper median.
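
For element types that cannot be averaged, one possibility is to locate both middle values and let the caller decide which one to use. This is just a sketch: median_bounds is a made-up name, and it assumes the set is not empty.

```cpp
#include <iterator>
#include <set>
#include <utility>

// Hypothetical helper: returns iterators to the lower and upper median.
// For an odd size both iterators refer to the same element.
// Precondition (assumed, not checked): the set is not empty.
template <class Set>
std::pair<typename Set::const_iterator, typename Set::const_iterator>
median_bounds(const Set& s) {
    auto lower = std::next(s.begin(), (s.size() - 1) / 2);
    auto upper = std::next(s.begin(), s.size() / 2);
    return {lower, upper};
}
```

For a set holding 1, 2, 3, 10 the two iterators point at 2 and 3; for a set holding 1, 2, 3 both point at 2.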

What about numbers like 1.5?

You will never get that, since rtspUrls.size() / 2 is an integer division that discards any fractional part.
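
A small sketch of that arithmetic, with middle_offset as a hypothetical helper name used only for illustration:

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper: the offset that would be passed to std::advance.
// size() returns an unsigned integer, so the division is an integer division.
template <class Container>
std::size_t middle_offset(const Container& c) {
    return c.size() / 2;   // 3 / 2 == 1, 4 / 2 == 2, 0 / 2 == 0
}
```

Note that for an empty set the offset is 0, so the iterator stays at begin(), which equals end() and must not be dereferenced.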

I think passing a float or double as the second parameter, like std::advance(e, 1.5), shouldn't compile. As far as I can see, the reference does not specify the type of the second parameter. However, the "possible implementations" section always uses the difference type specific to the first parameter, which is usually an integral type and seems reasonable.

I'm using a try catch to try to not advance into something undefined. Is this safe?

No, dereferencing or incrementing an invalid iterator is undefined behaviour and is not required to throw any exception. Many implementations do provide extensive error checking in debug builds and are nice enough to throw an exception or abort when UB occurs, but you cannot rely on that. Advancing by half the set's size will not be a problem in itself; just make sure the set is not empty before dereferencing the result, because begin() of an empty set is end().
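
If the step count could ever exceed the remaining range, an explicit bounds check does what the try/catch cannot. The following is a sketch only; advance_checked is a hypothetical helper, not a standard library function.

```cpp
#include <cstddef>
#include <iterator>
#include <set>

// Hypothetical helper: moves `it` forward by up to `n` steps but never past
// `last`. Returns true only if all `n` steps were taken.
template <class Iterator>
bool advance_checked(Iterator& it, Iterator last, std::size_t n) {
    while (n > 0 && it != last) {
        ++it;
        --n;
    }
    return n == 0;
}
```

Even when this returns true, the iterator may be equal to `last`, so the caller should still compare it against end() before dereferencing.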

churill