I'm writing some code to render camera preview using SkiaSharp. This is cross-platform but I came across a problem while writing the implementation for android.
I needed to convert YUV_420_888 to RGB8888 because that's what SkiaSharp supports and with the help of this thread, somehow managed to show decent quality images to my SkiaSharp canvas. The problem is the speed. At best I can get about 8 fps but usually it's just 4 or 5 fps. It turned out the biggest factor is the conversion. I now have about 3 versions of my ToRGB converter. I've even ended up trying "unsafe" code and parallel loops. I'll just show you my best one yet.
private unsafe byte[] ToRgb(byte[] yValuesArr, byte[] uValuesArr,
byte[] vValuesArr, int uvPixelStride, int uvRowStride)
{
var width = PixelSize.Width;
var height = PixelSize.Height;
var rgb = new byte[width * height * 4];
var partitions = Partitioner.Create(0, height);
Parallel.ForEach(partitions, range =>
{
var (item1, item2) = range;
Parallel.For(item1, item2, y =>
{
for (var x = 0; x < width; x++)
{
var yIndex = x + width * y;
var currentPosition = yIndex * 4;
var uvIndex = uvPixelStride * (x / 2) + uvRowStride * (y / 2);
fixed (byte* rgbFixed = rgb)
fixed (byte* yValuesFixed = yValuesArr)
fixed (byte* uValuesFixed = uValuesArr)
fixed (byte* vValuesFixed = vValuesArr)
{
var rgbPtr = rgbFixed;
var yValues = yValuesFixed;
var uValues = uValuesFixed;
var vValues = vValuesFixed;
var yy = *(yValues + yIndex);
var uu = *(uValues + uvIndex);
var vv = *(vValues + uvIndex);
var rTmp = yy + vv * 1436 / 1024 - 179;
var gTmp = yy - uu * 46549 / 131072 + 44 - vv * 93604 / 131072 + 91;
var bTmp = yy + uu * 1814 / 1024 - 227;
rgbPtr = rgbPtr + currentPosition;
*rgbPtr = (byte) (rTmp < 0 ? 0 : rTmp > 255 ? 255 : rTmp);
rgbPtr++;
*rgbPtr = (byte) (gTmp < 0 ? 0 : gTmp > 255 ? 255 : gTmp);
rgbPtr++;
*rgbPtr = (byte) (bTmp < 0 ? 0 : bTmp > 255 ? 255 : bTmp);
rgbPtr++;
*rgbPtr = 255;
}
}
});
});
return rgb;
}
You can also find it on my repo. You can also find on that same repo the part where I rendered the output to SkiaSharp
For a preview size of 1440x1080, running on my phone, this code takes about 120ms to finish. Even if all the other parts are optimized, the most I can get from that is 8fps. And no, it's not my hardware because the built-in camera app runs smoothly. By the way 1440x1080 is the output of my ChooseOptimalSize algorithm that I got from the mono-droid examples of android's Camera2 API. I don't know if it's the best way or if it lacks logic on detecting the fps and sizing down the preview to make it faster.