0

this is the code for merge sort and sometimes it gives the right output but sometimes it gives output where one value is changed .

#include "bits/stdc++.h"
using namespace std;

//function to merge two array
vector<int> merging(vector<int> a,vector<int> b){   
    int x = (int)a.size() + (int)b.size();
    vector<int> v(x);
    int p = 0;
    int q = 0;
    for(int i=0;i<x;++i){
        if((q<(int)b.size())?a[p]<b[q]:true && p<(int)a.size()){
            v[i] = a[p];
            p++;
        }else{
            v[i] = b[q];
            q++;
        }
    } 
    return v;
}


//splitting the array and then merging the array
vector<int> mergeSort(vector<int> k){
   int x = (int)k.size();
   if(x<2){
       return k;
   }
   vector<int> a(k.begin(),k.begin()+(x/2));
   vector<int> b(k.begin()+(x/2),k.end());
   return merging(mergeSort(a),mergeSort(b));
}


int main(){
    vector<int> v = {3,5,34,11,32,7,35,54,67,89,23,4,3};
    //calling the merge function
    vector<int> b = mergeSort(v);
    for(int i=0;i<(int)b.size();++i){
        cout << b[i] << "\n";
    }
  return 0;
}

sometime output is expected 3 3 4 5 7 11 23 32 34 35 54 67 89

sometime output is 3 3 4 5 7 11 23 32 34 -423887504 35 54 67

gsamaras
  • 66,800
  • 33
  • 152
  • 256
  • 2
    The [first](https://stackoverflow.com/q/31816095/10077) [two](https://stackoverflow.com/q/1452721/10077) lines are bad news. – Fred Larson Jan 27 '20 at 19:52
  • 3
    @FredLarson dumping these two lines into the same bucket is like using the worst hashing algorithm ever. In any case, iabhishek15, this is not your problem, but yeah, the first line is no good for sure (it's not good practice). – gsamaras Jan 27 '20 at 19:53
  • i know about that . but i do cp so i like to minimize my time by writing less code. – iabhishek15 Jan 27 '20 at 19:54
  • 1
    This looks like a job for the debugger. Provide a small input set that demonstrates bad behaviour and step through the program until you catch the program doing the unexpected. – user4581301 Jan 27 '20 at 19:55
  • 1
    Unrelated in `vector merging(vector a,vector b)` passing `a` and `b` by value could prove expensive. – user4581301 Jan 27 '20 at 19:56
  • 4
    Note: `#include "bits/stdc++.h"` increases build times by an order of magnitude for small programs. The time saved on typing fewer headers is consumed by the second or third build unless you are a slow typist. – user4581301 Jan 27 '20 at 19:58
  • 1
    i am using geany and man it is hard to setup debugger in geany on windows. i have spent so much but still was not able to setup the debugger. – iabhishek15 Jan 27 '20 at 19:59
  • You're going off the end of a vector. I'm not sure where yet, but I replaced all the `[]` vector accesses in `merging()` with `.at()`, and it's throwing. – Fred Larson Jan 27 '20 at 20:01
  • 1
    If a debugger for your IDE is out and using GDB from the command line or IDE with built in support is impractical, add extra debugging outputs. Printing out the lists after each step can help narrow down the point of failure. – user4581301 Jan 27 '20 at 20:01
  • Fred Larson i agree with you i think somewhere i am going out of my vector range that's why it is showing unexpected behavior. – iabhishek15 Jan 27 '20 at 20:09

3 Answers3

5

Okay first off

if((q<(int)b.size())?a[p]<b[q]:true && p<(int)a.size()){

Yikes.

That's some truly convoluted logic. You've got a ternary inside an if, you have true && ... (which is the same thing as just ... there) and there's absolutely no spacing anywhere, making the whole thing even further unreadable. This would not pass any code review that occured within 100 feet of me. Even before I looked too far into your code, I guessed this was where the problem was.

So in your example, when you try to merge one final time you'll have the following values:

a = {35, 54, 67}
b = {3, 4, 23, 89}

Let's walk through it with your merge... All is well for the first few loops until you have:

p = 3
q = 3
x = 7
i = 6
v = {3, 4, 23, 35, 54, 67, 0}

Now, we're heading into the loop, i < x is true, so we're still going. And we get to your interesting if.

q < b.size() is true, 3 < 4. So we look at a[p] < b[q] or a[3] < b[3]. Uh oh! Do you see the problem? a[3] is out of range. Which means undefined behavior.

Instead of trying to do it all in one loop, I'd try to write clean code instead of short code and user another loop. This loop will empty out whichever of a or b you have left after you empty the other (or the way I've written it, it'll be two loops, only one of which will be run).

That'd look like:

size_t p = 0, q = 0, i = 0;

while(p < a.size() && q < b.size()) {
    if(a[p] < b[q]) {
        v[i++] = a[p++];
    } else {
        v[i++] = b[q++];
    }
}

while(p < a.size()) {
    v[i++] = a[p++];
}

while(q < b.size()) {
    v[i++] = b[q++];
}

You'll notice a couple things I've changed here:

  • I've moved from int to size_t so you can avoid all of those ugly casts everywhere since your variables will already match the type of the vector's sizes.
  • I've split off two loops to empty whichever of the two vectors still has values in it.
  • I've moved to while's, as I think they look nicer here.
scohe001
  • 13,879
  • 2
  • 28
  • 47
  • Yeap, that's the path to the solution - you got there first, nice. – gsamaras Jan 27 '20 at 20:09
  • I think it really is some undefined behavior magic happening, both array sizes are in fact checked (correctly-ish) in the conditional – rtpax Jan 27 '20 at 20:10
  • @rtpax As I explained, the check on `a`'s size comes too late. The out of bounds check is happening in that ternary. – scohe001 Jan 27 '20 at 20:11
  • Yes, but the second condition after the `&&` should still stop it from going into the first block in that case. – rtpax Jan 27 '20 at 20:12
  • 1
    @rtpax that condition will never be checked. The ternary evaluates like this (with better spacing and more parenthesis): `q < b.size() ? a[p] < b[q] : ( true && p < a.size() )`. Does that make sense? – scohe001 Jan 27 '20 at 20:15
  • Oooh, I got tripped up by operator precedence. You are correct – rtpax Jan 27 '20 at 20:16
1

Few things to notice:

Operator percedence - ((q < b.size()) ? a[p] < b[q] : true && p < a.size()) is equivalent to: (q < b.size() ? a[p] < b[q] : p < a.size()). Note that you don't consider p, which, unfortunately might be over the end (undefined behaviour) - Note that this is your bug. The fix is rather simple -

(q < b.size() && p < a.size() ? a[p] < b[q] : p < a.size()) 

Which applies the comparison only if there are any elements left on both the arrays.

Moreover, there are many bad patterns in your code:

First of all, you have a lot of redundant C style casts. This is bad for two reasons: 1. You are writing C++, use C++ casts! (static_cast in your case) 2. You are casting for no apparent reason. There is a very good reason the type of size in vectors (and generally in C++) is std::size_t. Use it!

Another thing in your code is that you pass arguments as value and not as const-ref which is a very bad practice and causes unnecessary copies happening all around.

Third, your variable names are meaningless.

And fourth you use the namespace std which is highly not recommended.

Michael
  • 880
  • 6
  • 14
0

May be the trouble in this line:

if((q<(int)b.size())?a[p]<b[q]:true && p<(int)a.size()){

you don't check the size of array 'a' before using a[p]. So, sometimes (due to waste in memory), a[p] can be less than b[q], sometimes no.

As a variant, of course. Lets check array size before using.

Evgeny
  • 721
  • 3
  • 5
  • as a bonus, you may try to build & run the code with clang address sanitizer which shows heap-buffer-overflow exactly in this line – Łukasz G. Jan 27 '20 at 20:10