solvePNP vs recoverPose by rotation composition: why translations are not same?

Question

Part 1: Before update

I am trying to estimate relative position using two different methods: solvePNP and recoverPose, and seems like R matrices are looks OK with respect to some error, but translation vectors are totally different. What am I doing wrong? In general, I need to find relative position from frame 1 to frame 2, using both methods.

    cv::solvePnP(constants::calibration::rig.rig3DPoints, corners1,
                 cameraMatrix, distortion, rvecPNP1, tvecPNP1);
    cv::solvePnP(constants::calibration::rig.rig3DPoints, corners2,
                 cameraMatrix, distortion, rvecPNP2, tvecPNP2);

    Mat rodriguesRPNP1, rodriguesRPNP2;
    cv::Rodrigues(rvecPNP1, rodriguesRPNP1);
    cv::Rodrigues(rvecPNP2, rodriguesRPNP2);

    rvecPNP = rodriguesRPNP1.inv() * rodriguesRPNP2;
    tvecPNP = rodriguesRPNP1.inv() * (tvecPNP2 - tvecPNP1);

    Mat E = findEssentialMat(corners1, corners2, cameraMatrix);

    recoverPose(E, corners1, corners2, cameraMatrix, rvecRecover, tvecRecover);

Output:

solvePnP: R: 
[0.99998963, 0.0020884471, 0.0040569459;
-0.0020977913, 0.99999511, 0.0023003994;
-0.0040521105, -0.0023088832, 0.99998915]

solvePnP: t:  
[0.0014444492; 0.00018377086; -0.00045027508]

recoverPose: R: 
[0.9999900052294586, 0.0001464890570028249, 0.004468554816042664;
-0.0001480011106435358, 0.9999999319097322, 0.0003380476328946509;
-0.004468504991498534, -0.0003387056052618761, 0.9999899588204144]

recoverPose: t: 
[0.1492094050828522; -0.007288328116585327; -0.9887787587261805]

Part 2: After update

Update: I have changed the way how R-s and t-s are composed after solvePnP:

    cv::solvePnP(constants::calibration::rig.rig3DPoints, corners1,
                 cameraMatrix, distortion, rvecPNP1, tvecPNP1);
    cv::solvePnP(constants::calibration::rig.rig3DPoints, corners2,
                 cameraMatrix, distortion, rvecPNP2, tvecPNP2);

    Mat rodriguesRPNP1, rodriguesRPNP2;
    cv::Rodrigues(rvecPNP1, rodriguesRPNP1);
    cv::Rodrigues(rvecPNP2, rodriguesRPNP2);

    rvecPNP = rodriguesRPNP1.inv() * rodriguesRPNP2;
    tvecPNP = rodriguesRPNP2 * tvecPNP2 - rodriguesRPNP1 * tvecPNP1;

This composition was checked with actual movements of the camera and seems correct.

Also, recoverPose now getting points from non-planar object, and these points are in general position. The motion tested was also not pure rotation to avoid degenerate case, but still translation vectors are very different.

First frame: First frame: Green points are tracked and matched and can be seen on the second frame (on the second frame they are blue though): Second frame: Second frame: Tracked points in general position for recoverPose:

    cv::solvePnP(constants::calibration::rig.rig3DPoints, corners1,
                 cameraMatrix, distortion, rvecPNP1, tvecPNP1);
    cv::solvePnP(constants::calibration::rig.rig3DPoints, corners2,
                 cameraMatrix, distortion, rvecPNP2, tvecPNP2);

    Mat rodriguesRPNP1, rodriguesRPNP2;
    cv::Rodrigues(rvecPNP1, rodriguesRPNP1);
    cv::Rodrigues(rvecPNP2, rodriguesRPNP2);

    rvecPNP = rodriguesRPNP1.inv() * rodriguesRPNP2;
    tvecPNP = rodriguesRPNP2 * tvecPNP2 - rodriguesRPNP1 * tvecPNP1;

    CMT cmt;
    // ...
    // ... cmt module finds and tracks points here
    // ...
    std::vector<Point2f> coords1 = cmt.getPoints();
    std::vector<int> classes1 = cmt.getClasses();

    cmt.processFrame(imGray2);

    std::vector<Point2f> coords2 = cmt.getPoints();
    std::vector<int> classes2 = cmt.getClasses();

    std::vector<Point2f> coords3, coords4;

    // Make sure that points and their classes are in the same order 
    // and the vectors of the same size
    utils::fuse(coords1, classes1, coords2, classes2, coords3, coords4,
                constants::marker::randomPointsInMark);

    Mat E = findEssentialMat(coords3, coords4, cameraMatrix, cv::RANSAC, 0.9999);

    int numOfInliers = recoverPose(E, coords3, coords4, cameraMatrix,
                                   rvecRecover, tvecRecover);

Output:

solvePnP: R:
[ 0.97944641,  0.072178222,  0.18834825;
 -0.07216832,  0.99736851,  -0.0069195116;
 -0.18835205, -0.0068155089, 0.98207784]

solvePnP: t:
[-0.041602995; 0.014756925; 0.025671512]

recoverPose: R:
[0.8115000456601129,  0.03013366385237642, -0.5835748779689431;
 0.05045522914264587, 0.9913266281414459,   0.1213498503908197;
 0.5821700316452212, -0.1279198133228133,   0.80294120308629]

recoverPose: t:
[0.6927871089455181; -0.1254653960405977; 0.7101439685551703]

recoverPose: numOfInliers: 18

I have also tried the case when camera stand still (no R, no t), and R-s are close but translations aren't. So what I am missing here?

Are your calibration rigs planar? If so, possibly that the essential matrix is not correctly estimated ([degenerate case](http://www.cs.unc.edu/~marc/tutorial/node58.html)). Also, does the relative translation make sense in the solvePnP case? — Catree, Aug 19 '18 at 12:14
Yes, its a regular calibration chessboard. In this particular case, **R** and **t** estimated from the corresponding points of the same pose, and I assume that solvePNP gives a correct results: **R** is close to **I** and **t** to zero. I did also tried two different views that involves both Rotation and translation, but result for recoverPose transformation still differs from solvePNP's. — stackoverflower, Aug 19 '18 at 15:04
So it is definitely a [degenerate case](http://www.cs.unc.edu/~marc/tutorial/node58.html) (planar scene) where the essential matrix cannot be correctly estimated ([see also](http://www.vision.jhu.edu/teaching/vision05/Lecture-D.pdf)). — Catree, Aug 19 '18 at 19:02
I have eliminated degenerate cases, and still have wrong results (see update). Any thoughts? — stackoverflower, Sep 18 '18 at 14:44

score 3 · Accepted Answer · answered Oct 26 '18 at 10:24

If you are using a monocular camera system to find relative position between frames, then by decomposing essential matrix, you could not get the absolute translation in the real world. Notice that all the translation vectors you get from recoverPose() are unit vectors. "By decomposing E, you can only get the direction of the translation, so the function returns unit t.", from document of decomposeEssentialMat().

And for solvePnP(), it uses 3D points of the world coordinate system. Thus, the translation calculated from solvePnP() should be the absolute value in the real world. For the rotation R, both methods produce the correct answer.

Thank you! Waited for an answer so long :) – stackoverflower Oct 26 '18 at 18:18 — stackoverflower, Oct 26 '18 at 18:18

solvePNP vs recoverPose by rotation composition: why translations are not same?

Part 1: Before update

Part 2: After update

1 Answers1