A memcpy()-like function for bit vectors?

Question

I have a vector of bits, and I want to copy a slice of it to another vector (say, for simplicity, to the beginning of another vector). Note that all the bits may need to be shifted (or rather, rotated) in some direction, not just the first element, since the alignment of bits within each byte changes.

Suppose, for clarity, that the signature is:

void *memcpy_bits(
    char* destination,
    char* source,
    size_t offset_into_source_in_bits,
    size_t num_bits_to_copy);

Assume also that the data is stored in bytes, so there are no endianness issues, and that the lower bits come first in the vector. We could make the signature more complex to accommodate other assumptions, but never mind that for now.

So,

Is there some hardware support for doing this (on x86 or x86_64 CPUs I mean)?
Is there some standard/idiomatic/widely-used implementation of this function (or something similar enough)?

Have you looked at the `std::vector<bool>` specialization? It is a bit-based vector with iterator support, so you should be able to use `std::copy()`, for instance. — Remy Lebeau, Sep 11 '14 at 17:16
@RemyLebeau: But can I 'trust' this specialization to behave like I expect with every C++ library implementation? And can I wrap my raw bit vector with a new std::vector<bool> without writing arcane allocator template code? — einpoklum, Sep 11 '14 at 17:18
Yes, you can trust it. And my point is, why would you want to use a manual vector implementation and not use `vector<bool>` instead? — Remy Lebeau, Sep 11 '14 at 17:33
Because I need to: 1. serialize it quickly, and 2. use it from non-C++ code while it is in memory. Also, what's the basis for this trust? Do STL documents mandate its implementation specifics? — einpoklum, Sep 11 '14 at 17:35

score 1 · Answer 1 · answered Sep 11 '14 at 17:14

First you have to define how the data is stored. Is it stored in an array of uint8_t, uint16_t, uint32_t or uint64_t? Is bit #0 stored as a value 1u << 0? You should probably not use void* but the underlying type that is used for storing the data.

Second, you can obviously assume that offset_into_source_in_bits is less than the number of bits in the underlying data type (if it's not, what would you do?).

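Third, if that offset is 0 then you can just call memcpy. That case is important to handle separately, because the two-shift approach described next won't work if the offset is 0: one of the two shifts would then be by the full width of the underlying type, which is undefined behavior in C and C++ for uint32_t and uint64_t.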

Fourth, as long as num_bits_to_copy >= number of bits in the underlying type, you can calculate the next unit to store into destination using two shifts.

Fifth, if 0 < num_bits_to_copy < number of bits in the underlying type, then you need to be careful not to read any source bits that don't actually exist.
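
The third, fourth and fifth points can be put together roughly as follows. This is only a minimal sketch under stated assumptions, not a tuned or standard implementation: the underlying type is assumed to be uint8_t, bit #0 is assumed to be stored as 1u << 0, the buffers are assumed not to overlap, and the destination is assumed to start at bit 0. The name memcpy_bits_sketch is hypothetical, and a source offset of 8 or more is handled by skipping whole bytes first.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch only. Assumptions: uint8_t units, LSB-first bit order,
 * non-overlapping buffers, destination starts at bit 0. */
void memcpy_bits_sketch(uint8_t *dst, const uint8_t *src,
                        size_t src_bit_offset, size_t num_bits)
{
    src += src_bit_offset / 8;                  /* skip whole source bytes */
    unsigned shift = (unsigned)(src_bit_offset % 8);
    size_t whole = num_bits / 8;                /* full destination bytes */
    unsigned tail = (unsigned)(num_bits % 8);   /* leftover bits, 0..7 */

    if (shift == 0) {
        /* Byte-aligned source: the shift formula is not needed. */
        memcpy(dst, src, whole);
    } else {
        /* Two shifts per unit: the high bits of src[i] become the low bits
         * of dst[i], the low bits of src[i + 1] become its high bits. */
        for (size_t i = 0; i < whole; ++i) {
            dst[i] = (uint8_t)((src[i] >> shift) |
                               (src[i + 1] << (8 - shift)));
        }
    }

    if (tail != 0) {
        unsigned value = src[whole] >> shift;
        /* Read the next source byte only if the tail really spills into it. */
        if (shift + tail > 8) {
            value |= (unsigned)src[whole + 1] << (8 - shift);
        }
        /* Write only the low `tail` bits; keep the rest of dst[whole]. */
        uint8_t mask = (uint8_t)((1u << tail) - 1u);
        dst[whole] = (uint8_t)((dst[whole] & ~mask) | (value & mask));
    }
}
```

For example, with src = {0xF0, 0x0F, 0xAA} and dst = {0x00, 0xFF}, copying 12 bits from bit offset 4 leaves dst[0] == 0xFF and dst[1] == 0xF0: the upper four bits of dst[1] are untouched.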

You'd probably want to be careful not to overwrite any bits that you shouldn't overwrite, and personally I would have an offset into the destination bits as well, so you can copy arbitrary ranges of bits. I might implement a memmove_bits function as well.
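
To illustrate what a destination offset adds, here is a deliberately slow, bit-at-a-time sketch that copies an arbitrary range of bits and leaves every other destination bit alone. The name copy_bits_reference and its parameter order are hypothetical, and it assumes uint8_t units, LSB-first bit order and non-overlapping ranges. Something like this is mostly useful as a reference to test a faster shift-based version against.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch only. Assumptions: uint8_t units, LSB-first bit order,
 * source and destination ranges do not overlap. */
void copy_bits_reference(uint8_t *dst, size_t dst_bit_offset,
                         const uint8_t *src, size_t src_bit_offset,
                         size_t num_bits)
{
    for (size_t i = 0; i < num_bits; ++i) {
        size_t s = src_bit_offset + i;   /* absolute source bit index */
        size_t d = dst_bit_offset + i;   /* absolute destination bit index */
        unsigned bit = (src[s / 8] >> (s % 8)) & 1u;
        uint8_t mask = (uint8_t)(1u << (d % 8));
        /* Clear the target bit, then set it if the source bit is 1. */
        dst[d / 8] = (uint8_t)((dst[d / 8] & ~mask) | (bit ? mask : 0));
    }
}
```

A memmove_bits variant would additionally have to pick the copy direction (front to back or back to front) depending on how the two ranges overlap.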
