6

I'm trying to read bare Data into a Swift 4 struct using the withUnsafeBytes method. The problem

The network UDP packet has this format:

data: 0102 0A00 0000 0B00 0000

01           : 1 byte : majorVersion      (decimal 01)
02           : 1 byte : minorVersion      (decimal 02)
0A00 0000    : 4 bytes: applicationHostId (decimal 10)
0B00 0000    : 4 bytes: versionNumber     (decimal 11)

Then I have an extension on Data that takes a start and the length of bytes to read

extension Data {
    func scanValue<T>(start: Int, length: Int) -> T {
        return self.subdata(in: start..<start+length).withUnsafeBytes { $0.pointee }
    }
}

This works correctly when reading the values one by one:

// correctly read as decimal "1"
let majorVersion: UInt8 = data.scanValue(start: 0, length: 1)

// correctly read as decimal "2"
let minorVersion: UInt8 = data.scanValue(start: 1, length: 1)

// correctly read as decimal "10"
let applicationHostId: UInt32 = data.scanValue(start: 2, length: 4)

// correctly read as decimal "11"
let versionNumber: UInt32 = data.scanValue(start: 6, length: 4)

Then I created a struct that represents the entire packet as follows

struct XPLBeacon {
    var majorVersion: UInt8        // 1 Byte
    var minorVersion: UInt8        // 1 Byte
    var applicationHostId: UInt32  // 4 Bytes
    var versionNumber: UInt32      // 4 Bytes
}

But when I read the data directly into the structure I have some issues:

var beacon: XPLBeacon = data.scanValue(start: 0, length: data.count)

// correctly read as decimal "1"
beacon.majorVersion

// correctly read as decimal "2"
beacon.minorVersion

// not correctly read
beacon.applicationHostId

// not correctly read
beacon.versionNumber

I it supposed to work to parse an entire struct like this?

Jan
  • 6,366
  • 7
  • 40
  • 58
  • 1
    Note that you don't need to pass the length in your method signature. You can just use the size of your generic type `func scanValue(start: Int) -> T { return subdata(in: start...size).withUnsafeBytes { $0.pointee } }` – Leo Dabus Nov 28 '17 at 08:36
  • and change the naming `start` to `at` – Leo Dabus Nov 28 '17 at 08:44

3 Answers3

6

Since Swift 3 Data conforms to RandomAccessCollection, MutableCollection, RangeReplaceableCollection. So you can simply create a custom initializer to initialise your struct properties as follow:

struct XPLBeacon {
    let majorVersion, minorVersion: UInt8             // 1 + 1 = 2 Bytes
    let applicationHostId, versionNumber: UInt32      // 4 + 4 = 8 Bytes
    init(data: Data) {
        self.majorVersion = data[0]
        self.minorVersion = data[1]
        self.applicationHostId = data
            .subdata(in: 2..<6)
            .withUnsafeBytes { $0.load(as: UInt32.self) }
        self.versionNumber = data
            .subdata(in: 6..<10)
            .withUnsafeBytes { $0.load(as: UInt32.self) }
    }
}

var data = Data([0x01,0x02, 0x0A, 0x00, 0x00, 0x00, 0x0B, 0x00, 0x00,0x00])
print(data as NSData)     // "{length = 10, bytes = 0x01020a0000000b000000}\n" <01020a00 00000b00 0000>

let beacon = XPLBeacon(data: data)
beacon.majorVersion       // 1
beacon.minorVersion       // 2
beacon.applicationHostId  // 10
beacon.versionNumber      // 11
Leo Dabus
  • 198,248
  • 51
  • 423
  • 494
  • This of course works but I was trying to avoid that as it creates a lot of boiler plate code but it looks like it's the only way. – Jan Nov 28 '17 at 08:37
  • Sure but there is no way to avoid this, MartinR explained in his answer the reason of the failure. Check my comment also above how you can improve your extension method, – Leo Dabus Nov 28 '17 at 08:38
  • Yes the answer from MartinR explains it well. Now I understand what was wrong. Thanks for your answer as well – Jan Nov 28 '17 at 08:39
  • Note that you can simplify to `data[0...0].withUnsafeBytes { $0.pointee }` etc, you don't have to create an intermediate `Data` value. – Martin R Nov 28 '17 at 08:45
  • @MartinR yes I don't know why it gave me another error I thought it was a slice issue – Leo Dabus Nov 28 '17 at 08:49
  • I would also added `guard` checking `data` length in the `init`, and made `init` failable. – user28434'mstep Nov 28 '17 at 09:31
  • @user28434 The data size will never change, So a guard there it is not needed. He can also check the data size before initialising the structure – Leo Dabus Nov 28 '17 at 09:35
  • Note that the data count doesn't guarantee it is the correct data – Leo Dabus Nov 28 '17 at 09:39
  • 1
    @LeoDabus, indeed, it doesn't guarantee that data is correct, but it does guarantee that your init won't crash app with fatal error during subscript with range, but instead will provide nice catchable error. Also, if this type is not private boilerplate, it's always good idea to check inside init, than hope that caller did checked it before. – user28434'mstep Nov 28 '17 at 09:45
  • He can also make it non fallible and throw an error. Thats not related to the question. – Leo Dabus Nov 28 '17 at 09:48
  • @LeoDabus, it is related to the answer, for it is comment to answer, and not to the question. But anyways, it was just suggestion. – user28434'mstep Nov 28 '17 at 09:55
  • What I mean is we don't even know how OP is fetching the data so we don't even know if he needs it or not. Note that scanValue method would also crash if he pass the wrong length. – Leo Dabus Nov 28 '17 at 09:58
  • 1
    I did not mention on purpose how I get the data and whether it's valid or not to focus more on the question. Here we assume that the data does have the correct size. – Jan Nov 28 '17 at 11:12
  • There is a *potential* problem here. It is not *guaranteed* that `subdata(in:)` returns data with aligned memory, compare https://forums.swift.org/t/about-data-alignment/17447/2. That's why I recommended `copyBytes(to:)` in https://stackoverflow.com/a/38024025/1187415. – There were suggestions to make `load(as:)` work with unaligned data, but as far as I can see, nothing happened until now. – Martin R Sep 17 '20 at 16:59
  • @MartinR thanks Martin I thought that the problem was only happening when using subscript. I will check the links – Leo Dabus Sep 17 '20 at 17:01
  • @MartinR can you post an example where the approach above would fail? `var data = Data([0,0,0,0,0,0,0,0x01,0x02, 0x0A, 0x00, 0x00, 0x00, 0x0B, 0x00, 0x00,0x00]).subdata(in: 7..<17)` works but `var data = Data([0,0,0,0,0,0,0,0x01,0x02, 0x0A, 0x00, 0x00, 0x00, 0x0B, 0x00, 0x00,0x00])[7..<17]` doesn't. – Leo Dabus Sep 17 '20 at 17:19
  • 1
    @LeoDabus: I am afraid that I can't, that's why I said that it is a potential problem. It *seems* that `subdata(in:)` returns aligned memory, but nobody *guarantees* it. – That makes the `load(as:)` methods pretty useless in my opinion. – Martin R Sep 17 '20 at 17:28
  • @MartinR I was checking the differences between the subscript and subdata implementations. subscript initialises a new Data object from the subsequence `public func subdata(in range: Range) -> Data {` `if isEmpty || range.upperBound - range.lowerBound == 0 {` `return Data()` `}` `let slice = self[range]` `return slice.withUnsafeBytes { (buffer: UnsafeRawBufferPointer) -> Data in` `return Data(bytes: buffer.baseAddress!, count: buffer.count)` `}` `}` Doesn't it guarantee that the data is aligned? https://github.com/apple/swift-corelibs-foundation/blob/master/Sources/Foundation/Data.swift – Leo Dabus Sep 17 '20 at 18:23
  • 1
    @LeoDabus: Probably, but I don't know! Even the constructor `Data(bytes:count:)` does not explicitly state that the copied memory has some minimum alignment. It probably is, but at present that seems to be an implementation detail (which is a bit unsatisfying). – Martin R Sep 17 '20 at 19:21
6

Reading the entire structure from the data does not work because the struct members are padded to their natural boundary. The memory layout of struct XPLBeacon is

 A B x x C C C C D D D D

where

 offset    member
  0        A       - majorVersion (UInt8)
  1        B       - minorVersion (UInt8)
  2        x x     - padding
  4        C C C C - applicationHostId (UInt32)
  8        D D D D - versionNumber (UInt32)

and the padding is inserted so that the UInt32 members are aligned to memory addresses which are a multiple of their size. This is also confirmed by

print(MemoryLayout<XPLBeacon>.size) // 12

(For more information about alignment in Swift, see Type Layout).

If you read the entire data into the struct then the bytes are assigned as follows

 01 02 0A 00 00 00 0B 00 00 00
 A  B  x  x  C  C  C  C  D  D  D  D

which explains why major/minorVersion are correct, but applicationHostId and versionNumber are wrong. Reading all members separately from the data is the correct solution.

Martin R
  • 488,667
  • 78
  • 1,132
  • 1,248
  • I indeed noticed the size of `MemoryLayout.size` and was wondering why it's `12`. That explains a lot. I will use a custom initialiser even if it creates a lot of boiler plate code – Jan Nov 28 '17 at 08:38
0

Following on Leo Dabus answer, I created a slightly more readable constructor:

extension Data {
    func object<T>(at index: Index) -> T {
        subdata(in: index ..< index.advanced(by: MemoryLayout<T>.size))
            .withUnsafeBytes { $0.load(as: T.self) }
    }
}

struct XPLBeacon {
    var majorVersion: UInt8
    var minorVersion: UInt8
    var applicationHostId: UInt32
    var versionNumber: UInt32

    init(data: Data) {
        var index = data.startIndex
        majorVersion = data.object(at: index)

        index += MemoryLayout.size(ofValue: majorVersion)
        minorVersion = data.object(at: index)

        index += MemoryLayout.size(ofValue: minorVersion)
        applicationHostId = data.object(at: index)

        index += MemoryLayout.size(ofValue: applicationHostId)
        versionNumber = data.object(at: index)
    }
}

What is not part of this is of course the checking for the correctness of the data. As other mentioned in comments, this could be done either by having a failable init method or by throwing an Error.

Leo Dabus
  • 198,248
  • 51
  • 423
  • 494
Jan
  • 6,366
  • 7
  • 40
  • 58
  • I would also declare the properties constants. If you need to change the values better to create a new object – Leo Dabus Nov 28 '17 at 12:52