
I have a calibrated camera (intrinsic matrix and distortion coefficients) and I want to find the camera position given some 3D points and their corresponding 2D points in the image.

I know that cv::solvePnP could help me, and after reading this and this I understand that the outputs of solvePnP, rvec and tvec, are the rotation and translation of the object in the camera coordinate system.

So I need to find out the camera rotation/translation in the world coordinate system.

From the links above it seems that the code is straightforward, in Python:

found,rvec,tvec = cv2.solvePnP(object_3d_points, object_2d_points, camera_matrix, dist_coefs)
rotM = cv2.Rodrigues(rvec)[0]
cameraPosition = -np.matrix(rotM).T * np.matrix(tvec)

I don't know Python/NumPy stuff (I'm using C++), but this does not make a lot of sense to me:

  • rvec and tvec output from solvePnP are 3x1 matrices (3-element vectors)
  • cv2.Rodrigues(rvec) is a 3x3 matrix
  • cv2.Rodrigues(rvec)[0] is a 3x1 matrix, a 3-element vector
  • cameraPosition is then a 3x1 * 1x3 matrix multiplication, which gives a... 3x3 matrix. How can I use this in OpenGL with simple glTranslatef and glRotate calls?
nkint

2 Answers


If with "world coordinates" you mean "object coordinates", you have to get the inverse transformation of the result given by the pnp algorithm.

There is a trick for inverting transformation matrices that lets you avoid the usually expensive general matrix inversion, and it explains the Python code above. Given a transformation [R|t], we have inv([R|t]) = [R'|-R'*t], where R' is the transpose of R; this works because R is a rotation (orthonormal) matrix, so its inverse is its transpose. So, you can code (not tested):

cv::Mat rvec, tvec;
solvePnP(..., rvec, tvec, ...);
// rvec is 3x1, tvec is 3x1

cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3

R = R.t();  // rotation of the inverse: R is orthonormal, so inv(R) = R'
tvec = -R * tvec; // translation of the inverse

cv::Mat T = cv::Mat::eye(4, 4, R.type()); // T is 4x4
R.copyTo( T( cv::Range(0,3), cv::Range(0,3) ) ); // copies R into the top-left of T
tvec.copyTo( T( cv::Range(0,3), cv::Range(3,4) ) ); // copies tvec into the right column of T

// T is a 4x4 matrix with the pose of the camera in the object frame

Update: to use T with OpenGL later, you have to keep in mind that the axes of the camera frame differ between OpenCV and OpenGL.

OpenCV uses the reference frame usual in computer vision: X points to the right, Y down, Z to the front (as in this image). The camera frame in OpenGL is: X points to the right, Y up, Z to the back (as in the left hand side of this image). So, you need to apply a rotation of 180 degrees around the X axis. The formula for this rotation matrix is on Wikipedia.

// T is your 4x4 matrix in the OpenCV frame
cv::Mat RotX = ...; // 4x4 matrix with a 180 deg rotation around X
cv::Mat Tgl = T * RotX; // OpenGL camera in the object frame
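
For completeness, here is a minimal sketch of one way to build that RotX (illustrative, not from the original answer; it assumes T is CV_64F, which is solvePnP's default output type). A 180 degree rotation around X maps (x, y, z) to (x, -y, -z), so in homogeneous form it is simply diag(1, -1, -1, 1):

cv::Mat RotX = (cv::Mat_<double>(4, 4) <<
    1,  0,  0, 0,
    0, -1,  0, 0,
    0,  0, -1, 0,
    0,  0,  0, 1); // 180 deg rotation around the X axis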

These transformations are always confusing and I may be wrong at some step, so take this with a grain of salt.

Finally, take into account that matrices in OpenCV are stored in row-major order in memory, and OpenGL ones, in column-major order.
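
As an illustration of that last point, here is one possible way to hand Tgl to legacy fixed-function OpenGL (a sketch, assuming Tgl is CV_64F and the GL headers are included):

cv::Mat TglColMajor;
cv::Mat(Tgl.t()).convertTo(TglColMajor, CV_32F); // transpose row-major -> column-major, convert to float
glMultMatrixf(TglColMajor.ptr<float>(0));        // safe: convertTo produces a continuous matrix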

ChronoTrigger
  • seems to work, I get the angles for `glRotatef` with a formula taken from this: http://www.euclideanspace.com/maths/geometry/rotations/conversions/matrixToEuler/index.htm and then the usual conversion from radians to degrees. But if I plug those values into OpenGL I still get a wrong camera rotation (the X rotation is off by something like 45°) and a slightly wrong translation. – nkint Sep 06 '13 at 07:21
  • That may be due to the fact that the camera frame in OpenCV and OpenGL are different. Check my extended answer. – ChronoTrigger Sep 06 '13 at 12:47
  • yes, I know about the difference in matrix memory order between OpenCV and OpenGL. And I also have to flip the Y and Z axes (=> use OpenCV's Y as OpenGL's Z, and OpenCV's Z as OpenGL's Y) – nkint Sep 06 '13 at 17:21
  • almost there! The results are coherent across several trials. It seems to me that there is an error of 45° on the X axis (that could be some difference between the OpenCV and OpenGL frames that I don't understand) and of 10° on the Y angle (that I don't know how to interpret) – nkint Sep 06 '13 at 17:21
  • Are you sure the axes are as you say? I think they are as in my example, but I might be wrong. Check also that the projection parameters of the camera are the same in OpenCV and OpenGL. – ChronoTrigger Sep 07 '13 at 07:25
  • ok so, it was an error somewhere else. I have to rotateY(180) and rotateZ(180); I haven't understood exactly why, but it seems to work very well. Thank you a lot for the complete answer! – nkint Sep 10 '13 at 14:32
  • (anyway, it seems to work well, but I'm looking for a way to test it numerically; suggestions are welcome!) – nkint Sep 10 '13 at 14:33
  • @ChronoTrigger do you know where the OpenCV image frame is centered? The image center or the top-left corner? When you give 2D image points to solvePnP, are those 2D coordinates relative to the image center or the top-left corner? – manatttta Apr 10 '15 at 09:50
  • @manatttta Use image coordinates starting at the top-left corner, because the camera calibration already contains the coordinates of the optical center. You may use any frame of reference other than the top-left corner by modifying the calibration parameters accordingly. – ChronoTrigger Apr 10 '15 at 09:58
  • For me it worked not to rotate the camera by 180 degrees; instead I rotated the whole scene around the X axis and left the camera pose matrix untouched. – TonyB Mar 23 '20 at 16:33

If you want to turn it into a standard 4x4 pose matrix specifying the position of your camera, use rotM as the top-left 3x3 block, tvec as the 3 elements of the rightmost column, and 0, 0, 0, 1 as the bottom row:

pose = [rotation   tvec(0)
        matrix     tvec(1)
        here       tvec(2)
        0  , 0, 0,  1]

then invert it (to get the pose of the camera instead of the pose of the world):
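
A minimal C++ sketch of that recipe (variable names are illustrative; it assumes rotM and tvec are the CV_64F outputs of cv::Rodrigues and solvePnP):

cv::Mat pose = cv::Mat::eye(4, 4, CV_64F);
rotM.copyTo(pose(cv::Rect(0, 0, 3, 3))); // top-left 3x3 block = rotation
tvec.copyTo(pose(cv::Rect(3, 0, 1, 3))); // rightmost column = translation
cv::Mat cameraPose = pose.inv();         // pose of the camera instead of pose of the world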

Hammer