How to convert boost multiprecision integers into big endian from little endian?

Question

I am trying to fix this part of an abandonware program because I failed to find an alternative program.

As you can see, the data of PUSH instructions is in the wrong order, whereas Ethereum is a big-endian machine (addresses are correctly represented because they use a smaller type).
An alternative is to run porosity.exe --code '0x61004b60026319e44e32' --disassm

The u256 type is defined as

using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;

Here’s a minimal example to reproduce the bug:

#include <sstream>
#include <iostream>
#include <iomanip>
#include <boost/multiprecision/cpp_int.hpp>

using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;

int main() {
    std::stringstream stream;
    u256 data=0xFEDEFA;
    for (int i = 0; i<5; ++i) { // print only the first 5 bytes
        uint8_t dataByte = int(data & 0xFF);
        data >>= 8;
        stream << std::setfill('0') << std::setw(sizeof(char) * 2) << std::hex << int(dataByte) << "  ";
    }
    std::cout << stream.str();
}

So numbers are converted to string with a space between each byte (and only the first bytes).

But then I ran into an endianness problem: bytes were printed in the reverse order. I mean for example, 31722 is written 8a 02 02 on my machine and 02 02 8a when compiled for a big endian target.

So as I don’t know which boost function to call, I modified the code:

#include <sstream>
#include <iostream>
#include <iomanip>
#include <boost/multiprecision/cpp_int.hpp>

using u256 = boost::multiprecision::number<boost::multiprecision::cpp_int_backend<256, 256, boost::multiprecision::unsigned_magnitude, boost::multiprecision::unchecked, void>>;

int main() {
    std::stringstream stream;
    u256 data=0xFEDEFA;
    for (int i = 0; i<5; ++i) {
        uint8_t dataByte = int(data >> ((32 - i - 1) * 8));
        stream << std::setfill('0') << std::setw(sizeof(char) * 2) << std::hex << int(dataByte) << "  ";
    }
    std::cout << stream.str();
}

Now, why are my 256-bit integers printed mostly as a series of 00 00 00 00 00?

I'd recommend looping backwards over your numbers to get highest chunk first / MSD-first printing order, instead of byte-reversing the 32-byte object. Also, you could use `uint64_t` so you only need 4 chunks / 4 separate calls to `operator<<`. — Peter Cordes, Sep 29 '19 at 07:34
BTW, this is *not* an endianness issue; you aren't doing byte accesses to the object-representation. You're operating on it as a 256-bit integer and simply asking for the low 8 bits at a time. If you did know the endianness of the target C implementation and the data layout of the object, you *could* loop over it in descending address order with `unsigned char*`. — Peter Cordes, Sep 29 '19 at 07:37
@PeterCordes ok I fixed the number of iterations for the byte order loop. But I still have the exact same problem (even when printing numbers which don’t contain zeroes). **I can say when printed the internal representation prints the expected numbers fully correctly** (as if `long long long long` was a native compiler type) but in the little endian order. — user2284570, Sep 29 '19 at 07:56

Peter Cordes · Answer 1 · 2019-09-29T08:48:10.093

BTW, this is not an endianness issue; you aren't doing byte accesses to the object-representation. You're operating on it as a 256-bit integer and simply asking for the low 8 bits at a time with data & 0xFF.

If you did know the endianness of the target C implementation, and the data layout of the boost object, you could efficiently loop over it in descending address order with unsigned char*.
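To make the distinction concrete, here is a small sketch on a plain 32-bit integer rather than the boost type. The function names are made up for illustration. The first function works on the value with shifts and masks, so it gives the same answer on any target; the second copies the object representation in ascending address order, which is the only place where endianness shows up.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Value-based extraction: shifts and masks operate on the numeric value,
// so the result is the same on little- and big-endian machines.
std::array<unsigned char, 4> bytes_by_value(std::uint32_t v) {
    return {{
        static_cast<unsigned char>((v >> 24) & 0xFF),
        static_cast<unsigned char>((v >> 16) & 0xFF),
        static_cast<unsigned char>((v >> 8) & 0xFF),
        static_cast<unsigned char>(v & 0xFF),
    }};
}

// Object-representation access: copies the bytes in ascending address
// order, so the result depends on the endianness of the target.
std::array<unsigned char, 4> bytes_by_address(std::uint32_t v) {
    std::array<unsigned char, 4> out{};
    std::memcpy(out.data(), &v, sizeof v);
    return out;
}
```

For 0x00FEDEFA, bytes_by_value always yields 00 fe de fa, while bytes_by_address yields fa de fe 00 on a little-endian target and 00 fe de fa on a big-endian one.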

You're introducing the idea of endianness only because it's associated with byte-reversal, which is what you're trying to do. But that's really inefficient; just loop over the bytes of your bigint the other way.

I'm hesitant to recommend a specific solution because I don't know what will compile efficiently. But you might want something like this instead of byte-reversing ahead of time:

for (int j = 0; j < 4; j++) {      // outer loop: four 64-bit chunks, highest first
    uint64_t chunk = static_cast<uint64_t>(data >> (64*3));  // grab the highest 64-bit chunk
    data <<= 64;   // and shift everything up
    // alternative: maybe keep a shift-count in a variable instead of modifying `data`

    // Then pick apart the chunk into its component bytes, in MSB first order
    for (int i = 0 ; i<8 ; i++) {
        unsigned tmp = (chunk >> 56) & 0xFF;
        // do something with it
        chunk <<= 8;                   // bring the next byte to the top
    }
}

In the inner loop, a rotate can be more efficient than two shifts: it brings the high byte to the bottom (for & 0xFF) at the same time as it shifts the lower bytes upward. See "Best practices for circular shift (rotate) operations in C++".
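A rotate-based inner loop might look like the following sketch. The function name is hypothetical, and the rotate is written out by hand with two shifts and an OR, which compilers commonly recognize as a rotate; I haven't checked the generated asm.

```cpp
#include <array>
#include <cstdint>

// Splits a 64-bit chunk into bytes, most-significant first.
// Each rotate-left by 8 moves the current top byte to the bottom,
// where it can be masked off, while the remaining bytes move up.
std::array<std::uint8_t, 8> chunk_bytes_msb_first(std::uint64_t chunk) {
    std::array<std::uint8_t, 8> out{};
    for (int i = 0; i < 8; ++i) {
        chunk = (chunk << 8) | (chunk >> 56);  // rotate left by 8 bits
        out[i] = static_cast<std::uint8_t>(chunk & 0xFF);
    }
    return out;
}
```

With C++20 available, std::rotl from <bit> could replace the hand-written rotate.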

In the outer loop, IDK if boost::multiprecision::number has any APIs for efficient indexing of chunks built in; if so using that is probably more efficient.

I used nested loops because I assume data <<= 8 doesn't compile particularly efficiently, and neither would (data >> (256-8)) & 0xFF. But that's how you'd grab bytes from the top instead of the bottom.
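If efficiency doesn't matter, the flat single-loop version is short. The sketch below is a template so it can be tried on a built-in type. It only needs operator>>, operator& and an explicit conversion to unsigned, so it should also accept the u256 type with num_bytes = 32, though that part is an assumption I haven't compiled. The helper name is made up. Note that the second snippet in the question starts at byte 31 and prints only five bytes, so for a small value it shows nothing but the zero bytes at the top.

```cpp
#include <string>

// Hypothetical helper: formats the low `num_bytes` bytes of `value` as
// space-separated hex pairs, most-significant byte first.
// UInt only needs operator>>, operator& and explicit conversion to unsigned.
template <class UInt>
std::string hex_bytes_from_top(const UInt& value, unsigned num_bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (unsigned i = num_bytes; i-- > 0;) {
        // Shift the wanted byte down to the bottom, then mask it off.
        const unsigned byte = static_cast<unsigned>((value >> (8 * i)) & 0xFF);
        if (!out.empty()) out += ' ';
        out += digits[byte >> 4];
        out += digits[byte & 0xF];
    }
    return out;
}
```

Printing every byte, or skipping leading zero bytes before printing, avoids the run of 00 that the question asks about.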

Another option is the standard trick for converting numbers to strings: store characters into a buffer in descending order. A 256-bit (32-byte) number will take 64 hex digits, and you want another 32 bytes of spaces between them.

For example:

  // 96 = 32 * 2 hex digits + 32 spaces, plus 1 byte for the C string terminator
  // plus another 1 to spare
  char buf[98];            // small enough to use automatic storage
  char *outp = buf+96;     // pointer to the end
  *outp = 0;               // terminator
  const char *hex_lut = "0123456789abcdef";

  for (int i=0 ; i<32 ; i++) {
      unsigned byte = static_cast<unsigned>(data & 0xFF);  // conversion from u256 must be explicit
      *--outp = hex_lut[byte & 0xF];   // low nibble first, because we store backwards
      *--outp = hex_lut[byte >> 4];    // high nibble lands to its left
      *--outp = ' ';
      data >>= 8;
  }
  // outp points at an extra ' '
  outp++;
  // outp points at the first byte of a string like  "12 ab cd"
  stream << outp;

If you want to break that up into chunks to put a line break in there, you can do that too.

If you're interested in efficient conversion to hex for 8, 16 or 32 bytes of data at once, see "How to convert a number to hex?" for some x86 SIMD ways. The asm should port easily to C++ intrinsics. (You can use SIMD shuffles to handle putting bytes into MSB-first printing order after loading from little-endian integers.)

You could also use a SIMD shuffle to space-separate your pairs of hex digits before storing to memory like you apparently want here.

Bug in the code you added:

So I added this code before the loop above:
  for(unsigned int i=0,data,_data;i<33;++i)

unsigned i, data, _data declares new variables of type unsigned int that shadow the previous declarations of data and _data. That loop has zero effect on data or _data outside the scope of the loop. (And contains UB because you read _data and data without initializing them.)
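A stripped-down illustration of the shadowing, using plain unsigned variables and a made-up function name. Unlike the loop quoted above, the loop-local variable is initialized here so the sketch has no UB.

```cpp
// Returns the value of the outer `data` after a loop whose init-statement
// declares another variable with the same name.
unsigned outer_value_after_shadowing_loop() {
    unsigned data = 0xFEDEFA;  // outer variable
    for (unsigned i = 0, data = 0; i < 4; ++i) {
        data += i;  // modifies only the loop-local `data`
    }
    return data;  // the outer variable is unchanged: still 0xFEDEFA
}
```

Declaring only i in the for statement leaves data and _data referring to the outer variables.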

If those vars are actually both still the u256 vars of the outer scope, I don't see an obvious problem other than efficiency, but maybe I'm missing the obvious too. I didn't look very hard because using 64x 256-bit shifts and 32x ORs seems like a horrible idea. It's possible it could optimize away completely, or into bswap byte-reverse instructions on ISAs that have them, but I doubt it. Especially not through the extra complication of the boost::multiprecision::number wrapper functions.

With your method of using 8-byte chunks, shouldn’t the numbers end up in the wrong byte order inside each 64-bit chunk? — user2284570, Sep 29 '19 at 08:24
@user2284570: What doesn't work? Are you still trying to fix your inefficient byte-reverse loop? — Peter Cordes, Sep 29 '19 at 08:24
Yes with 2 number to print per execution, I don’t need it to be efficient. I edited the question with the scope error being corrected (the numbers appears to be random with the new loop) (again without the byte order loop they are written into ASCII correctly but in the wrong byte order). — user2284570, Sep 29 '19 at 08:26
@user2284570: My nested loops extract the bytes of `data` one at a time, in most-significant-first order. That's not the "wrong" order, it's how numbers work. Printing order is highest-first because the English writing system goes left to right, and our system for writing numbers is based on place-value Arabic numerals which put the most-significant digit on the left. So we need to print the most-significant 8 bits first. That's not "wrong". You use that nested loop *instead* of reversing bytes of a 256-bit number and then printing it in LSB-first order. — Peter Cordes, Sep 29 '19 at 08:27
I’m talking about printing English text since the beginning. For example, 31722 is written `8a 02 02` on my machine and `02 02 8a` when compiled for a big endian target. So I need to convert `data` from little endian representation to big endian/network byte order. The stringstream is shared across the whole program, so while I’m ok with using the boost library for this, I failed to find a function for it. — user2284570, Sep 29 '19 at 16:03
Actually some architectures like PowerPC use bit-shift instructions whose result depends on the underlying endianness. — user2284570, Sep 29 '19 at 23:22
@user2284570: You're writing in C++ where the compiler knows the asm target endianness and can correctly implement `>>` with C++ semantics. And BTW, Re-reading your previous comment, I think you're just showing the byte-order in memory of `31722`, not claiming that your code would print it that way. But like I said, that's not relevant. Your C++ code is endian-agnostic and will work the same on any implementation. The point is that to print in normal printing order, it's sufficient to just take them from the top instead of reversing and then taking from the bottom. — Peter Cordes, Sep 30 '19 at 00:16
