0

I want to apply a NOT operation to a block of contiguous elements of a bool array and then read back the complete array. I am using the following code to perform the operation.

bool arr[100000]={0};
cin>>x>>y;
for(i=x; i<=y; i++)
 arr[i]=!arr[i];

//Some other operations on the array

for(i=0; i<=100000; i++)
 arr+=arr[i];

This works fine, but I am trying to increase the speed of the program. Is there a better way to perform the same operation?

Yash Singla
  • 1
    Did you try unrolling? Did you try packing the flags into 32-bit words, so that a single NOT operation inverts 32 of them at once? – huseyin tugrul buyukisik Sep 14 '12 at 21:18
  • 1
    `cin >> i` then `for(i = x...` why bother with the input if you're just going to replace it? – Jonathan Seng Sep 14 '12 at 21:19
  • 1
    This code makes no sense. Why do you populate `i` from `std::cin` then immediately overwrite it with `x`? What is `arr+=arr[i];` supposed to mean? – ildjarn Sep 14 '12 at 21:20
  • I think he is changing the address of array relative to other elements and array address starts dangling there and there – huseyin tugrul buyukisik Sep 14 '12 at 21:21
  • Sorry guys, I didn't copy and paste the code here. I typed it in and by mistake wrote cin>>i instead of cin>>x>>y. Sorry for the confusion, I have edited the question now – Yash Singla Sep 14 '12 at 21:21
  • @tuğrulbüyükışık can you please elaborate on how to perform a 32-bit NOT operation in this case? – Yash Singla Sep 14 '12 at 21:23
  • 1
    Pack 32 bits into an int (assuming it is 32 bits on your machine), then use bitwise "not" on the int. You can even process 4 ints at the same time using SIMD instructions. Did you try compiler optimizations? – huseyin tugrul buyukisik Sep 14 '12 at 21:25
  • @tuğrul büyükışık would a union also work? – andre Sep 14 '12 at 21:28
  • @ahenderson people say union is dangerous. – huseyin tugrul buyukisik Sep 14 '12 at 21:29
  • @ahenderson : Reading from a different union field than was last written to violates aliasing rules. – ildjarn Sep 14 '12 at 21:30
  • @tuğrulbüyükışık If that were an answer, I'd upvote it. – Jonathan Seng Sep 14 '12 at 21:30
  • I don't know about the efficiency of bitset, but really I would just make an array of int or char (`int arr[10000/sizeof(int)+1];` or `char arr[10000/sizeof(char)+1];`). And to set/clear just `memset` the damn thing (with 0xff or 0 respectively). C++ is great but C is still awesome. – paddy Sep 14 '12 at 21:43
  • @paddy memset won't work here, as I need to flip the bits. – Yash Singla Sep 14 '12 at 21:44
  • Oh sorry, misunderstood. Well, with ints you can flip 32 bits at a time with XOR. – paddy Sep 14 '12 at 21:47
  • But I see the dilemma... You have to handle the ends separately. You can still setup the appropriate mask with a couple of bitshifts and ANDs. And, if the distance between x and y is significant, this would still be very fast. – paddy Sep 14 '12 at 21:50
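
The "mask with a couple of bitshifts and ANDs" idea from the comments above can be sketched for a single 32-bit word as follows. This is a minimal illustration, not code from the thread: `word_mask` is a hypothetical helper, and it assumes that bit 0 is the least significant bit of the word.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns a mask with bits lo..hi (inclusive) set.
// Assumes 0 <= lo <= hi <= 31 and that bit 0 is the least significant bit.
inline std::uint32_t word_mask(unsigned lo, unsigned hi) {
    assert(lo <= hi && hi <= 31u);
    const std::uint32_t all = 0xFFFFFFFFu;
    // (all << lo) keeps bits lo..31, (all >> (31 - hi)) keeps bits 0..hi;
    // ANDing the two leaves exactly bits lo..hi.
    return (all << lo) & (all >> (31u - hi));
}

// Hypothetical helper: flips bits lo..hi of one word with a single XOR.
inline std::uint32_t flip_bits(std::uint32_t word, unsigned lo, unsigned hi) {
    return word ^ word_mask(lo, hi);
}
```

For example, `flip_bits(0, 4, 7)` sets bits 4 to 7, and flipping bits 6 to 9 of that result clears bits 6 and 7 again while setting bits 8 and 9. Only the first and last word of a long range need such a mask; every word in between can be inverted whole.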

2 Answers

3

Consider using std::bitset. Compare the performance; it may be better.

std::bitset<100000> arr;
cin>>x>>y;
for(i=x; i<=y; i++)
 arr.flip(i);

//Some other operations on the array
unsigned int carr = arr.count();

For an even more optimized version (please measure, don't just believe it), you can write your own version of bitset<>. THIS IS NOT TESTED CODE:

const size_t arr_bitlen = 100000;
typedef unsigned int arr_type;
// bits (not bytes) per array element; CHAR_BIT comes from <climits>
const size_t arr_type_bits = sizeof(arr_type) * CHAR_BIT;
const size_t arr_len = (arr_bitlen + arr_type_bits - 1) / arr_type_bits;
arr_type arr[arr_len] = { 0 };
cin>>x>>y; // assumes 0 <= x <= y < arr_bitlen
size_t x_addr = x / arr_type_bits; // word holding the first bit of the range
size_t y_addr = y / arr_type_bits; // word holding the last bit of the range
size_t x_bit = x % arr_type_bits;
size_t y_bit = y % arr_type_bits;

// bit k of arr[w] stores the bool with index w * arr_type_bits + k
arr_type x_mask = ~arr_type(0) << x_bit;                       // bits x_bit..top
arr_type y_mask = ~arr_type(0) >> (arr_type_bits - 1 - y_bit); // bits 0..y_bit

if (x_addr == y_addr)
    arr[x_addr] ^= x_mask & y_mask; // whole range lies inside one word
else {
    arr[x_addr] ^= x_mask; // first, possibly partial, word
    for (i = x_addr + 1; i < y_addr; ++i)
        arr[i] = ~arr[i]; // invert all bits (bools) of each full word
    arr[y_addr] ^= y_mask; // last, possibly partial, word
}

//Some other operations on the array
// To count the set bits, see the implementation of std::bitset<N>::count(); it is very clever, so just copy it
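
For the counting step, here is a simple portable sketch of counting set bits over an array of packed words. It is not the implementation used by any particular `std::bitset`; library versions differ and often rely on compiler builtins. The names `popcount32` and `count_bits` are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: counts the set bits in one 32-bit word.
// Each iteration clears the lowest set bit, so the loop runs once per set bit.
inline unsigned popcount32(std::uint32_t w) {
    unsigned n = 0;
    while (w != 0) {
        w &= w - 1; // clear the lowest set bit
        ++n;
    }
    return n;
}

// Hypothetical helper: counts the set bits across an array of packed words.
inline std::size_t count_bits(const std::uint32_t* words, std::size_t len) {
    std::size_t total = 0;
    for (std::size_t i = 0; i < len; ++i)
        total += popcount32(words[i]);
    return total;
}
```

This replaces the original per-element summation loop with one pass over the packed words. Whether it beats `std::bitset::count()` should be measured rather than assumed.
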
PiotrNycz
  • This doesn't improve the performance much. I was hoping to remove the loop and perform the flip with a single operation, which would make a huge impact on the overall performance – Yash Singla Sep 14 '12 at 21:37
  • You can make your own structure like bitset and manipulate its internal data directly. Then you can reduce the number of negations by a factor of 32. – PiotrNycz Sep 14 '12 at 21:40
  • How do I manipulate the data directly? Could you please give a small example? I am new to C++, so I am having a little trouble with this. – Yash Singla Sep 14 '12 at 21:42
  • 1
    Search the internet for "C bit manipulation". I gave the second example, but it is just an example and not tested in any way. Unit test it before using it. Or maybe you can find something useful in the Boost library. I would recommend staying with your first solution or just using bitset. This last proposal will be very hard to implement and understand. – PiotrNycz Sep 14 '12 at 22:03
  • Thanks for the effort. I am reading about bit manipulation right now. I will read the bitset implementation and then try to solve this – Yash Singla Sep 14 '12 at 22:16
  • If only there were a function `std::bitset::flip(x,y)`. While reading the bitset code, think about implementing such a function. – PiotrNycz Sep 14 '12 at 22:19
  • That would have saved my whole day. Anyway, do you think there is any way to set (not flip) a range of bits in a bitset? – Yash Singla Sep 14 '12 at 22:28
  • bitset can't do it, see http://www.sgi.com/tech/stl/bitset.html. Maybe std::vector<bool> can; it has a similar implementation to bitset, but it is dynamic. – PiotrNycz Sep 14 '12 at 22:53
  • 1
    See my question http://stackoverflow.com/questions/12433154/are-stdfill-stdcopy-specialized-for-stdvectorbool - there are some hints in the comments. And do not forget about good old `memcpy` `memset` functions – PiotrNycz Sep 15 '12 at 06:27
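
The `flip(x,y)` wished for in the comments above does not exist in `std::bitset`, but something similar can be sketched with the shift and XOR operators that `std::bitset` does provide. `flip_range` below is a hypothetical helper, not a library function, and it assumes `0 <= x <= y < N`.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of std::bitset): flips bits x..y inclusive.
// Assumes 0 <= x <= y < N.
template <std::size_t N>
void flip_range(std::bitset<N>& bits, std::size_t x, std::size_t y) {
    assert(x <= y && y < N);
    std::bitset<N> mask;
    mask.set();                 // all N bits are one
    mask >>= N - (y - x + 1);   // keep y - x + 1 ones in the lowest positions
    mask <<= x;                 // move them up to positions x..y
    bits ^= mask;               // XOR with the mask flips exactly bits x..y
}
```

Using `bits |= mask` instead of `bits ^= mask` would set the range rather than flip it. Each shift and the XOR work across the whole bitset, so for short ranges the per-bit `flip(i)` loop may still be faster; this is worth measuring.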
1

Since I made the comment about using ints (or indeed int64), I may as well write it up and you can evaluate whether it's worth it. It would be something like this. Forgive any errors, as I'm just bunging this into a browser while my kids are watching ridiculously trashy Saturday-morning cartoons.

// I'm gonna assume 32-bit ints here.  Makes the other maths clearer.
// Sorry about all the '4' and '32' constants =P
const size_t arrLen = 100000 / 32 + 1;  // 32 bits per int, so 3126 ints hold 100000 bits
int arr[arrLen];

//This gets filled with your data...
memset((void*)arr, 0, arrLen*4);

cin >> x >> y;
int leftMask = 0xffffffff >> (x % 32);      // "(x & 0x1f)" faster?
int rightMask = ~(0x7fffffff >> (y % 32));  // "(y & 0x1f)" faster?
x /= 32;                                    // "x >>= 5" faster?
y /= 32;                                    // "y >>= 5" faster?

if( x == y )
{
    // Intersect the masks
    leftMask &= rightMask;
    arr[x] = (arr[x] & ~leftMask) | (~arr[x] & leftMask);
}
else if( x < y )
{
    // Flip the left and right ends
    arr[x] = (arr[x] & ~leftMask) | (~arr[x] & leftMask);
    arr[y] = (arr[y] & ~rightMask) | (~arr[y] & rightMask);

    // Flip everything in between
    for( int i = x+1; i < y; i++ ) {
        arr[i] ^= 0xffffffff;  // Or arr[i] = ~arr[i] -- whichever is faster
    }
}

Alternative for the above loop, if it makes any difference...

// Flip everything in between
for( int *a = arr+x+1, *b = arr+y; a < b; a++ ) {
    *a = ~*a;
}

Exercise is to try with 64-bit integers. Personally, I reckon this approach would be faster than anything else except in the cases where you are only flipping a few bits.

I might have an off-by-one-bit error in the right-hand mask. If anyone spots it please comment. Brain empty. =)
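
As one possible take on the 64-bit exercise, here is a minimal sketch under stated assumptions. `flip_range64` and `get_bit` are hypothetical names, and unlike the code above this version numbers bits from the least significant end of each word and uses unsigned 64-bit words in a `std::vector`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: flips elements x..y (inclusive) of a bit array packed
// into 64-bit words. Element k lives in bit (k % 64) of words[k / 64],
// counted from the least significant bit (an assumption of this sketch).
inline void flip_range64(std::vector<std::uint64_t>& words,
                         std::size_t x, std::size_t y) {
    assert(x <= y && y / 64 < words.size());
    const std::uint64_t all = ~std::uint64_t(0);
    const std::size_t xw = x / 64, yw = y / 64;
    const std::uint64_t left = all << (x % 64);        // bits x%64 .. 63
    const std::uint64_t right = all >> (63 - y % 64);  // bits 0 .. y%64
    if (xw == yw) {
        words[xw] ^= left & right;  // range lies inside a single word
        return;
    }
    words[xw] ^= left;              // partial first word
    for (std::size_t i = xw + 1; i < yw; ++i)
        words[i] = ~words[i];       // full words in between
    words[yw] ^= right;             // partial last word
}

// Hypothetical helper: reads element k back.
inline bool get_bit(const std::vector<std::uint64_t>& words, std::size_t k) {
    return ((words[k / 64] >> (k % 64)) & 1u) != 0;
}
```

A caller would size the vector as `(100000 + 63) / 64` words, zero-initialized, and call `flip_range64(words, x, y)` for each query. Because the ends are handled with XOR masks, the inclusive upper bound `y` is covered without a separate off-by-one adjustment.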

paddy
  • Ahh, corrected the right-hand bitmask to `0x7fffffff` to be inclusive of `y`. The off-by-one error I suspected was there. – paddy Sep 14 '12 at 22:31