17

How can I get the RGB (or any other format) pixel value from a CVPixelBufferRef? I've tried many approaches, but none has worked so far.

func captureOutput(captureOutput: AVCaptureOutput!,
                   didOutputSampleBuffer sampleBuffer: CMSampleBuffer!,
                   fromConnection connection: AVCaptureConnection!) {
  let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
  CVPixelBufferLockBaseAddress(pixelBuffer, 0)
  let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)

  //Get individual pixel values here

  CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
}
Pibben
scord
  • A Core Video pixel buffer doesn't hold the information for a single pixel; it holds the pixel information for all the pixels captured from a scene (a bitmap matrix of pixels). Do you mean that you want to get the RGB values for every pixel inside the buffer? – Ayan Sengupta Jan 02 '16 at 19:31
  • Possible duplicate of [How to convert CVPixelBufferGetBaseAddress call to Swift?](http://stackoverflow.com/questions/29814143/how-to-convert-cvpixelbuffergetbaseaddress-call-to-swift). – Martin R Sep 12 '16 at 11:44

4 Answers

23

baseAddress is an unsafe mutable pointer, or more precisely an UnsafeMutablePointer<Void>. You can easily access the memory once you have converted the pointer from Void to a more specific type:

// Convert the base address to a typed pointer of the appropriate element type
let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)

// read the data (returns value of type UInt8)
let firstByte = byteBuffer[0]

// write data
byteBuffer[3] = 90

Make sure you use the correct type (8, 16 or 32 bit unsigned int). It depends on the video format. Most likely it's 8 bit.
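
In newer Swift versions the same conversion is usually written with `assumingMemoryBound(to:)` on the raw pointer. Here is a minimal sketch of that idea. It uses a manually allocated 16-byte block as a stand-in for the locked pixel buffer memory, and the function name `roundTripByte` is made up for illustration:

```swift
/// Writes one byte into a scratch block and reads it back through a typed pointer.
/// The scratch block is only a stand-in for the memory that
/// CVPixelBufferGetBaseAddress would return (assumption for this sketch).
func roundTripByte(_ value: UInt8, at offset: Int) -> UInt8 {
    let byteCount = 16
    precondition(offset >= 0 && offset < byteCount, "offset out of range")

    let raw = UnsafeMutableRawPointer.allocate(byteCount: byteCount, alignment: 4)
    defer { raw.deallocate() }

    // Zero the block so every read below is well defined.
    raw.initializeMemory(as: UInt8.self, repeating: 0, count: byteCount)

    // Rebind the raw pointer to bytes. The pointer is still unsafe, it is only typed now.
    let bytes = raw.assumingMemoryBound(to: UInt8.self)

    bytes[offset] = value   // write data
    return bytes[offset]    // read the data back (UInt8)
}
```

With a real pixel buffer the raw pointer would come from `CVPixelBufferGetBaseAddress` (unwrapped) instead of `allocate`.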

Update on buffer formats:

You can specify the format when you initialize the AVCaptureVideoDataOutput instance. You basically have the choice of:

  • BGRA: a single plane where the blue, green, red and alpha values of each pixel are stored together in one 32 bit integer (8 bits per component)
  • 420YpCbCr8BiPlanarFullRange: Two planes, the first containing a byte for each pixel with the Y (luma) value, the second containing the Cb and Cr (chroma) values for groups of pixels
  • 420YpCbCr8BiPlanarVideoRange: The same as 420YpCbCr8BiPlanarFullRange but the Y values are restricted to the range 16 – 235 (for historical reasons)

If you're interested in the color values and speed (or rather maximum frame rate) is not an issue, then go for the simpler BGRA format. Otherwise take one of the more efficient native video formats.

If you have two planes, you must get the base address of the desired plane (see video format example):

Video format example

let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddressOfPlane(pixelBuffer, 0)
let bytesPerRow = CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 0)
let byteBuffer = UnsafeMutablePointer<UInt8>(baseAddress)

// Get luma value for pixel (43, 17)
let luma = byteBuffer[17 * bytesPerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
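
For the chroma values you would read plane 1 instead of plane 0. As a rough sketch, assuming the bi-planar 4:2:0 layout described above (one interleaved Cb/Cr byte pair shared by each 2x2 block of pixels), the byte offsets can be computed like this. The helper name `chromaOffsets` is hypothetical, and `chromaBytesPerRow` stands for whatever `CVPixelBufferGetBytesPerRowOfPlane(pixelBuffer, 1)` returns:

```swift
/// Byte offsets of the Cb and Cr samples for pixel (x, y) inside the
/// chroma plane (plane 1) of a 4:2:0 bi-planar buffer.
func chromaOffsets(x: Int, y: Int, chromaBytesPerRow: Int) -> (cb: Int, cr: Int) {
    // Chroma is subsampled by 2 in both directions, and each sample pair
    // occupies two consecutive bytes: Cb first, then Cr.
    let cb = (y / 2) * chromaBytesPerRow + (x / 2) * 2
    return (cb: cb, cr: cb + 1)
}
```

The resulting offsets index into a `UInt8` pointer obtained from the base address of plane 1, in the same way the luma example indexes plane 0.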

BGRA example

let pixelBuffer: CVPixelBufferRef = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, 0)
let baseAddress = CVPixelBufferGetBaseAddress(pixelBuffer)
// bytesPerRow counts bytes; divide by 4 to get the row stride in UInt32 pixels
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer) / 4
let int32Buffer = UnsafeMutablePointer<UInt32>(baseAddress)

// Get BGRA value for pixel (43, 17)
let bgra = int32Buffer[17 * int32PerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, 0)
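
Once you have the 32 bit value, the individual components still have to be split out. Here is a small sketch of one way to do that. It assumes a little-endian CPU (so the blue byte, which comes first in memory, ends up in the lowest 8 bits) and a row stride measured in 4-byte pixels. Both function names are made up for this example:

```swift
/// Splits a 32 bit BGRA pixel, as read through a UInt32 pointer on a
/// little-endian CPU, into its four 8 bit components.
func components(ofBGRA pixel: UInt32) -> (b: UInt8, g: UInt8, r: UInt8, a: UInt8) {
    let b = UInt8(pixel & 0xFF)          // first byte in memory
    let g = UInt8((pixel >> 8) & 0xFF)
    let r = UInt8((pixel >> 16) & 0xFF)
    let a = UInt8((pixel >> 24) & 0xFF)  // last byte in memory
    return (b: b, g: g, r: r, a: a)
}

/// Index of pixel (x, y) in a UInt32 view of the buffer.
/// bytesPerRow is in bytes, so it is divided by 4 to count whole pixels.
func pixelIndex(x: Int, y: Int, bytesPerRow: Int) -> Int {
    return y * (bytesPerRow / 4) + x
}
```

If you index a `UInt8` pointer instead, the row stride stays in bytes and the x coordinate is multiplied by 4.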
Codo
  • let luma = int32Buffer[17 * int32Buffer + 43] does not compile. "Binary operator "*" cannot be applied to operands of type "Int" and "UnsafeMutablePointer" . Similar problem ive been encountering. will update if i find a way to convert this properly. – scord Jan 02 '16 at 22:05
  • Sorry. Typo. Fixed it. – Codo Jan 02 '16 at 22:11
  • How can you obtain an array of UInt8 from the entire CMSampleBuffer? CMSampleBuffer to [UInt8] – omarojo Mar 20 '17 at 13:43
  • @codo the swift4 conversion of `let int32Buffer = UnsafeMutablePointer(baseAddress)` seems to be `let int32Buffer = baseAddress.assumingMemoryBound(to: UInt32.self)` **But it does not seem to work!** Instead, `baseAddress.assumingMemoryBound(to: UInt8.self)` does. I can't understand why since my pixel buffer format is `kCVPixelFormatType_32BGRA`. Any clue? – Martin Dec 05 '18 at 16:29
  • /!\ there is a typo in the BGRA example: `let luma = int32Buffer[17 * int32PerRow + 43]` should be **`let bgra = int32Buffer[17 * int32PerRow + 43*4]`** Because each pixel has 4 values (B, G, R, A), the horizontal shift should be x4 – Martin Dec 05 '18 at 16:32
10

Here is a method for getting the individual RGB values from a BGRA pixel buffer. Note: the buffer must be locked before you call it.

func pixelFrom(x: Int, y: Int, movieFrame: CVPixelBuffer) -> (UInt8, UInt8, UInt8) {
    let baseAddress = CVPixelBufferGetBaseAddress(movieFrame)
    
    let bytesPerRow = CVPixelBufferGetBytesPerRow(movieFrame)
    let buffer = baseAddress!.assumingMemoryBound(to: UInt8.self)
    
    let index = x*4 + y*bytesPerRow
    let b = buffer[index]
    let g = buffer[index+1]
    let r = buffer[index+2]
    
    return (r, g, b)
}
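
If you need the whole frame rather than one pixel, the same row arithmetic can be applied row by row. The following is only a sketch: `packedBytes` is a hypothetical helper that copies the meaningful bytes of each row into a `[UInt8]` and skips any padding at the end of the row. For a BGRA buffer `rowLength` would be `width * 4`, and `base` would be the `UInt8` pointer obtained above:

```swift
/// Copies `height` rows of `rowLength` meaningful bytes each out of memory
/// whose rows start `bytesPerRow` bytes apart, dropping trailing row padding.
func packedBytes(from base: UnsafePointer<UInt8>,
                 rowLength: Int,
                 height: Int,
                 bytesPerRow: Int) -> [UInt8] {
    precondition(rowLength <= bytesPerRow, "a row cannot be longer than its stride")

    var result = [UInt8]()
    result.reserveCapacity(rowLength * height)
    for row in 0..<height {
        // Start of this row in the padded source memory.
        let rowStart = base.advanced(by: row * bytesPerRow)
        result.append(contentsOf: UnsafeBufferPointer(start: rowStart, count: rowLength))
    }
    return result
}
```

As with the single-pixel method, the buffer has to stay locked for as long as the pointer is in use.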
Josh Bernfeld
  • The width is given in number of pixels and each pixel of an BGRA-pixelbuffer is represented by 4 bytes. Therefore `index` should be `4*x + y*bytesPerRow`. – J.E.K Nov 19 '18 at 14:39
7

Update for Swift 3:

let pixelBuffer: CVPixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer)!
CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
// Unwrap the base address before rebinding it to UInt32
let int32Buffer = CVPixelBufferGetBaseAddress(pixelBuffer)!.assumingMemoryBound(to: UInt32.self)
// bytesPerRow counts bytes; divide by 4 to get the row stride in UInt32 pixels
let int32PerRow = CVPixelBufferGetBytesPerRow(pixelBuffer) / 4
// Get BGRA value for pixel (43, 17)
let bgra = int32Buffer[17 * int32PerRow + 43]

CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
swift taylor
  • How do you get an array of UInt8 out from this, that contains the entire image ? CMSampleBuffer to [UInt8] – omarojo Mar 20 '17 at 13:39
  • How can I get pixel values if I am using the kCVPixelFormatType_14Bayer_RGGB RAW format? – Matt Colliss May 25 '17 at 16:12
  • **Warning**: If you don't unwrap the call to `CVPixelBufferGetBaseAddress`, you might miss important warnings that will lead to undefined behavior. See [this answer](https://stackoverflow.com/a/65210114/35690) for more information. – Senseful Dec 09 '20 at 03:16
3

Swift 5

I had the same problem and ended up with the following solution. My CVPixelBuffer had dimensions 68 x 68, which you can inspect with:

CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))
print(CVPixelBufferGetWidth(pixelBuffer))
print(CVPixelBufferGetHeight(pixelBuffer))

You also have to know the bytes per row:

print(CVPixelBufferGetBytesPerRow(pixelBuffer))

which in my case was 320.

Furthermore, you need to know the data type of your pixel buffer, which was Float32 for me.

I then constructed a byte buffer and read the bytes consecutively as follows (remember to lock the base address as shown above):

// Unwrap the base address, then view the memory as Float32 values
var byteBuffer = CVPixelBufferGetBaseAddress(pixelBuffer)!.assumingMemoryBound(to: Float32.self)
var pixelArray: Array<Array<Float>> = Array(repeating: Array(repeating: 0, count: 68), count: 68)
for row in 0...67{
    for col in 0...67{
        pixelArray[row][col] = byteBuffer.pointee
        byteBuffer = byteBuffer.successor()    
    }
    byteBuffer = byteBuffer.advanced(by: 12)
}
CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0))

You might wonder about the part byteBuffer = byteBuffer.advanced(by: 12). The reason why we have to do this is as follows.

We know that we have 320 bytes per row. However, our buffer has width 68 and the data type is Float32, i.e. 4 bytes per value. That means only 68 * 4 = 272 bytes of each row hold pixel data; the remaining 48 bytes are padding, most likely added for memory alignment.

We, therefore, have to skip the last 48 bytes in each row which is done by byteBuffer = byteBuffer.advanced(by: 12) (12*4 = 48).

This approach is somewhat different from the other solutions because it advances the pointer value by value instead of computing an index for each pixel. However, I find this easier and more intuitive.
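
Rather than hard-coding the skip of 12, the padding can be derived from the bytes per row, the width and the element size. Below is a sketch under the same assumptions as above (Float32 data, rows padded at the end). The names `paddingElements` and `readFloats` are made up for this example, and `base` stands for the unwrapped base address of the locked buffer:

```swift
/// Number of padding elements at the end of each row.
func paddingElements(bytesPerRow: Int, width: Int, elementSize: Int) -> Int {
    return (bytesPerRow - width * elementSize) / elementSize
}

/// Reads a width x height grid of Float32 values from memory whose rows
/// start `bytesPerRow` bytes apart, ignoring the padding after each row.
func readFloats(from base: UnsafeRawPointer,
                width: Int,
                height: Int,
                bytesPerRow: Int) -> [[Float32]] {
    var rows = [[Float32]]()
    rows.reserveCapacity(height)
    for row in 0..<height {
        // Jump to the start of the row in bytes, then view it as Float32 values.
        let rowStart = base.advanced(by: row * bytesPerRow)
            .assumingMemoryBound(to: Float32.self)
        rows.append(Array(UnsafeBufferPointer(start: rowStart, count: width)))
    }
    return rows
}
```

For the 68 x 68 buffer with 320 bytes per row, `paddingElements(bytesPerRow: 320, width: 68, elementSize: 4)` gives the 12 used in the loop above.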

Leonard
  • ```let advance = (bytesPerRow - bufferWidth) / MemoryLayout.size``` – emrahgunduz Jul 17 '20 at 11:04
  • **Warning**: If you don't unwrap the call to `CVPixelBufferGetBaseAddress`, you might miss important warnings that will lead to undefined behavior. See [this answer](https://stackoverflow.com/a/65210114/35690) for more information. – Senseful Dec 09 '20 at 03:16