I wrote a video recording demo similar to Grafika's ContinuousCaptureActivity (source: ContinuousCaptureActivity.java).

The difference is that Grafika uses hardware encoding, whereas I use software encoding. For each video frame, I read the pixels back from the GPU through a PBO, which is very fast, copy the image data to FFmpeg, and then encode it as H.264.

Performance is acceptable on most devices: glMapBufferRange() takes less than 5 ms and memcpy() takes less than 10 ms.

On the Huawei Mate 7, however, performance is poor: glMapBufferRange() takes 15~30 ms and memcpy() takes 25~35 ms.

I have also tested a plain memcpy() on the Mate 7; it is much faster when copying ordinary memory.
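For comparison, a baseline copy of ordinary memory could be measured roughly like this. It is only a sketch: the class name `CopyBaseline`, the method `bestCopyMillis`, and the 1280x720 RGBA frame size are assumptions for illustration and are not taken from the demo.

```java
import java.nio.ByteBuffer;

final class CopyBaseline {
    // Copies `size` bytes out of a direct ByteBuffer `rounds` times
    // and returns the fastest observed copy time in milliseconds.
    static double bestCopyMillis(int size, int rounds) {
        // Ordinary native memory, NOT a mapped PBO.
        ByteBuffer src = ByteBuffer.allocateDirect(size);
        byte[] dst = new byte[size];
        long best = Long.MAX_VALUE;
        for (int i = 0; i < rounds; i++) {
            src.clear();                    // position = 0, limit = capacity
            long t0 = System.nanoTime();
            src.get(dst);                   // bulk copy into the heap array
            best = Math.min(best, System.nanoTime() - t0);
        }
        return best / 1e6;
    }

    public static void main(String[] args) {
        int size = 1280 * 720 * 4;          // hypothetical RGBA frame size
        System.out.println("best copy: " + bestCopyMillis(size, 20) + " ms");
    }
}
```

If this baseline is fast while copying from the mapped PBO is slow, the slowdown is specific to the mapped buffer memory rather than to memory copies in general.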

This is really strange. Can anyone help?

Device info:

chipset of the phone: HiSilicon Kirin 925
cpu of the phone: Quad-core 1.8 GHz Cortex-A15 & quad-core 1.3 GHz Cortex-A7

See details here: Huawei Mate 7

The PBO code is as follows:

final int buffer_num = 1;
final int pbo_id[] = new int[buffer_num];
private void getPixelFromPBO(int width, int height, boolean isDefaultFb) {
    try {
        long start = System.currentTimeMillis();

        final int pbo_size = width * height * 4;

        if (mFrameNum == 0) {
            GLES30.glGenBuffers(buffer_num, pbo_id, 0);
            Log.d(TAG, "glGenBuffers pbo_id[0]:" + pbo_id[0]);

            GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbo_id[0]);
            //glBufferData creates a new data store for the buffer object currently bound to target
            GLES30.glBufferData(GLES30.GL_PIXEL_PACK_BUFFER, pbo_size, null, GLES30.GL_DYNAMIC_READ);
            GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
        }

        GLES30.glPixelStorei(GLES30.GL_PACK_ALIGNMENT, 1);
        checkGlError("glPixelStorei");

        //we need to read GL_BACK when the default framebuffer is bound
        //glReadBuffer specifies a color buffer as the source for subsequent glReadPixels, glCopyTexImage2D, glCopyTexSubImage2D, and glCopyTexSubImage3D commands
        if (isDefaultFb) {
            GLES30.glReadBuffer(GLES30.GL_BACK);
        } else {
            GLES30.glReadBuffer(GLES30.GL_COLOR_ATTACHMENT0);
        }
        checkGlError("glReadBuffer");

        GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, pbo_id[0]);
        checkGlError("glBindBuffer 1 ");

        long ts = System.currentTimeMillis();
        glReadPixelsPBOJNI(0, 0, width, height, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, 0);
        Log.d(TAG, "glReadPixelsPBOJNI took " + (System.currentTimeMillis() - ts) + "ms\n\n\n");
        //GLES30.glReadPixels(0, 0, width, height, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, null);
        //glReadPixelsPBOJNI(0, 0, height, width, GLES30.GL_RGBA, GLES30.GL_UNSIGNED_BYTE, 0);
        checkGlError("glReadPixels");
        ts = System.currentTimeMillis();
        ByteBuffer buf = (ByteBuffer) GLES30.glMapBufferRange(
                GLES30.GL_PIXEL_PACK_BUFFER, 0, pbo_size, GLES30.GL_MAP_READ_BIT);
        checkGlError("glMapBufferRange");
        Log.d(TAG, "*****glMapBufferRange took " + (System.currentTimeMillis() - ts) + "ms");

        ts = System.currentTimeMillis();
        cpoyDataToFFmpeg(buf, 1, 1);
        Log.d(TAG, "####cpoyDataToFFmpeg took " + (System.currentTimeMillis() - ts) + "ms\n\n\n");


        GLES30.glUnmapBuffer(GLES30.GL_PIXEL_PACK_BUFFER);
        checkGlError("glUnmapBuffer");
        GLES30.glBindBuffer(GLES30.GL_PIXEL_PACK_BUFFER, 0);
        checkGlError("glBindBuffer 0 ");

    } catch (Exception e) {
        Log.e(TAG, "DO PBO  exp", e);
    }
}
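The code above reads into and maps the same single PBO (`buffer_num = 1`), so glMapBufferRange() has to wait for the readback of the current frame. One common way to hide that wait is to alternate between two or more PBOs and map a buffer filled on an earlier frame. The index bookkeeping could be sketched as below; `PboPingPong` is a hypothetical helper, the GL calls are shown only as comments, and whether this helps on the Mate 7 has not been tested.

```java
final class PboPingPong {
    private final int count;   // number of PBOs in the ring
    private long frame = 0;    // frames processed so far

    PboPingPong(int count) {
        if (count < 2) throw new IllegalArgumentException("need at least 2 buffers");
        this.count = count;
    }

    /** Index of the PBO that glReadPixels should write into on this frame. */
    int writeIndex() {
        return (int) (frame % count);
    }

    /**
     * Index of the oldest filled PBO, which is the one to map on this frame.
     * Returns -1 during the first (count - 1) frames, while the ring is still filling.
     */
    int readIndex() {
        return frame < count - 1 ? -1 : (int) ((frame + 1) % count);
    }

    /** Call once at the end of every frame. */
    void advance() {
        frame++;
    }
}

// Per-frame usage (GL calls as in the question, shown as comments only):
//   glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo_id[ring.writeIndex()]);
//   glReadPixels(..., 0);                       // asynchronous readback
//   if (ring.readIndex() >= 0) {
//       glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo_id[ring.readIndex()]);
//       glMapBufferRange(...); copy; glUnmapBuffer(...);
//   }
//   ring.advance();
```

The trade-off is latency: with `count` buffers, the frame handed to the encoder is `count - 1` frames behind the one just rendered.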
dragonfly
  • If you have a rooted device, you can use systrace with tags like `--freq` to monitor the clock frequencies on CPU and memory and see if the device is slowing down (due to perceived inactivity or thermal throttling). Is the memcpy() speed unique to the GLES buffer memory, or is it that slow just copying memory around in general? Is it any faster if you drag your finger across the screen constantly while the test runs? FWIW, prefer `System.nanoTime()`, as it uses the monotonic clock, which is not subject to resets. – fadden Feb 03 '16 at 17:23
  • I have tested normal memcpy(), it's much faster. And I am sure the performance is low all the time on that phone. Thanks! – dragonfly Feb 04 '16 at 02:18
  • Hey, dragon, would you mind sharing how you solve this later? I got this situation, too. – RxRead Feb 16 '17 at 02:33
  • @fadden please have a look at my new question :http://stackoverflow.com/questions/42508675/record-frames-displayed-on-textureview-to-mp4 – dragonfly Feb 28 '17 at 12:24
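The `System.nanoTime()` suggestion from the comments could be applied to the per-stage timings with a small helper along these lines. `StageTimer` and `lapMillis` are hypothetical names used for illustration; they are not part of the original demo.

```java
final class StageTimer {
    // System.nanoTime() is monotonic, so wall-clock adjustments do not affect it.
    private long start = System.nanoTime();

    /** Returns the milliseconds elapsed since the previous call (or construction) and restarts the timer. */
    double lapMillis() {
        long now = System.nanoTime();
        double ms = (now - start) / 1e6;
        start = now;
        return ms;
    }
}
```

In the question's method, one timer could be created before glReadPixelsPBOJNI() and `lapMillis()` logged after each stage (readback, map, copy), which also gives sub-millisecond resolution.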

0 Answers