
I am trying to understand the solvePnP algorithm used in computer vision. I found the official EPnP implementation in MATLAB here: https://github.com/cvlab-epfl/EPnP.

I am comparing the results of the official implementation with the OpenCV version via the Python API.

Running a simple example with both methods gives me different R matrices and t vectors. The t vectors differ, and the signs of the two upper rows of R are flipped (see the output below the code examples). What am I missing here?

Thanks!

Example Python code:

import numpy as np
import cv2

cam_in = [[2445.72, 0.0, 819.29],
          [0.0, 2442.39, 660.13],
          [0.0, 0.0, 1.0]]
          
# 3p model from opencv
objPts = [[  0. ,   0. ,   0. ],
          [ 82.5,   0. ,   0. ],
          [165. ,   0. ,   0. ],
          [247.5,   0. ,   0. ],
          [ 55. ,  27.5,   0. ],
          [137.5,  27.5,   0. ],
          [220. ,  27.5,   0. ],
          [ 27.5,  55. ,   0. ],
          [110. ,  55. ,   0. ],
          [192.5,  55. ,   0. ],
          [  0. ,  82.5,   0. ],
          [ 82.5,  82.5,   0. ],
          [165. ,  82.5,   0. ],
          [247.5,  82.5,   0. ],
          [ 55. , 110. ,   0. ],
          [137.5, 110. ,   0. ],
          [220. , 110. ,   0. ],
          [ 27.5, 137.5,   0. ],
          [110. , 137.5,   0. ],
          [192.5, 137.5,   0. ],
          [  0. , 165. ,   0. ],
          [ 82.5, 165. ,   0. ],
          [165. , 165. ,   0. ],
          [247.5, 165. ,   0. ]]


imagePts = [[648.84735, 335.1484 ],
            [522.6854 , 317.74222],
            [400.24448, 301.46362],
            [281.39792, 285.43964],
            [560.8046 , 366.6523 ],
            [437.57022, 349.67358],
            [318.269  , 333.33557],
            [598.38196, 415.80203],
            [475.02866, 397.87906],
            [354.60062, 380.84167],
            [636.9289 , 465.04666],
            [512.3496 , 446.39185],
            [391.26932, 428.63168],
            [273.65057, 411.7955 ],
            [549.6402 , 495.04532],
            [428.12842, 476.45554],
            [309.60794, 458.5343 ],
            [587.5397 , 543.42163],
            [465.2291 , 524.39795],
            [346.24826, 505.79684],
            [624.71814, 591.7365 ],
            [502.51782, 572.03394],
            [382.8287 , 552.7545 ],
            [266.8465 , 534.29364]]

cam_in, objPts, imagePts = [np.array(x) for x in [cam_in, objPts, imagePts]]
distortion = None
ret, rvec, T = cv2.solvePnP(objPts, imagePts, cam_in, distortion, flags=cv2.SOLVEPNP_EPNP)
R, _ = cv2.Rodrigues(rvec)

print('R')
print(R)
print()
print('T:')
print(T)

Example MATLAB code (run in Octave):

clear all
close all

cam_in = [[2445.72, 0.0, 819.29]; ...
          [0.0, 2442.39, 660.13]; ...
          [0.0, 0.0, 1.0]]
          
% 3p model from opencv
objPts = [[  0. ,   0. ,   0. ]; ...
          [ 82.5,   0. ,   0. ]; ...
          [165. ,   0. ,   0. ]; ...
          [247.5,   0. ,   0. ]; ...
          [ 55. ,  27.5,   0. ]; ...
          [137.5,  27.5,   0. ]; ...
          [220. ,  27.5,   0. ]; ...
          [ 27.5,  55. ,   0. ]; ...
          [110. ,  55. ,   0. ]; ...
          [192.5,  55. ,   0. ]; ...
          [  0. ,  82.5,   0. ]; ...
          [ 82.5,  82.5,   0. ]; ...
          [165. ,  82.5,   0. ]; ...
          [247.5,  82.5,   0. ]; ...
          [ 55. , 110. ,   0. ]; ...
          [137.5, 110. ,   0. ]; ...
          [220. , 110. ,   0. ]; ...
          [ 27.5, 137.5,   0. ]; ...
          [110. , 137.5,   0. ]; ...
          [192.5, 137.5,   0. ]; ...
          [  0. , 165. ,   0. ]; ...
          [ 82.5, 165. ,   0. ]; ...
          [165. , 165. ,   0. ]; ...
          [247.5, 165. ,   0. ]];


imagePts = [[648.84735, 335.1484 ]; ...
            [522.6854 , 317.74222]; ...
            [400.24448, 301.46362]; ...
            [281.39792, 285.43964]; ...
            [560.8046 , 366.6523 ]; ...
            [437.57022, 349.67358]; ...
            [318.269  , 333.33557]; ...
            [598.38196, 415.80203]; ...
            [475.02866, 397.87906]; ...
            [354.60062, 380.84167]; ...
            [636.9289 , 465.04666]; ...
            [512.3496 , 446.39185]; ...
            [391.26932, 428.63168]; ...
            [273.65057, 411.7955 ]; ...
            [549.6402 , 495.04532]; ...
            [428.12842, 476.45554]; ...
            [309.60794, 458.5343 ]; ...
            [587.5397 , 543.42163]; ...
            [465.2291 , 524.39795]; ...
            [346.24826, 505.79684]; ...
            [624.71814, 591.7365 ]; ...
            [502.51782, 572.03394]; ...
            [382.8287 , 552.7545 ]; ...
            [266.8465 , 534.29364]];
         
[R T reprojectionPts] = efficient_pnp(objPts, imagePts, cam_in);

display(R)
display(T)

Python output:

R
[[-0.9553479  -0.10390386  0.27661231]
 [-0.15995044  0.96896901 -0.18845399]
 [-0.24844766 -0.22428338 -0.94232199]]

T:
[[-105.91385446]
 [-201.18325419]
 [1596.85002509]]

Octave output:

R =

   0.95539   0.10395  -0.27644
   0.15990  -0.96902   0.18824
  -0.24830  -0.22405  -0.94242

T =

   -359.306
    -80.755
   1595.419
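
The two results can also be compared numerically rather than by eye. The following is only a minimal NumPy sketch, not part of either implementation: it copies the printed R and T values from above, checks whether the Octave R equals the OpenCV R with its two upper rows negated, confirms that both matrices are proper rotations, and computes a pinhole reprojection error for each pose. The helper names `is_rotation` and `reprojection_rmse` are made up for illustration, lens distortion is assumed to be zero (as in the code above), and only the four corner correspondences (points 1, 4, 21 and 24) are used to keep the listing short.

```python
import numpy as np

# Intrinsics and printed poses copied from the question.
K = np.array([[2445.72, 0.0, 819.29],
              [0.0, 2442.39, 660.13],
              [0.0, 0.0, 1.0]])

R_cv = np.array([[-0.9553479, -0.10390386, 0.27661231],
                 [-0.15995044, 0.96896901, -0.18845399],
                 [-0.24844766, -0.22428338, -0.94232199]])
T_cv = np.array([-105.91385446, -201.18325419, 1596.85002509])

R_oct = np.array([[0.95539, 0.10395, -0.27644],
                  [0.15990, -0.96902, 0.18824],
                  [-0.24830, -0.22405, -0.94242]])
T_oct = np.array([-359.306, -80.755, 1595.419])

# Assumption: only the four corner correspondences are used here.
obj = np.array([[0.0, 0.0, 0.0],
                [247.5, 0.0, 0.0],
                [0.0, 165.0, 0.0],
                [247.5, 165.0, 0.0]])
img = np.array([[648.84735, 335.1484],
                [281.39792, 285.43964],
                [624.71814, 591.7365],
                [266.8465, 534.29364]])


def is_rotation(R, tol=1e-3):
    """True if R is orthonormal with determinant +1 (within tol)."""
    orthonormal = np.allclose(R @ R.T, np.eye(3), atol=tol)
    return bool(orthonormal and abs(np.linalg.det(R) - 1.0) < tol)


def reprojection_rmse(R, T, K, obj, img):
    """RMS pixel distance between projected object points and image points."""
    cam = obj @ R.T + T               # object points in the camera frame
    uvw = cam @ K.T                   # homogeneous pixel coordinates
    uv = uvw[:, :2] / uvw[:, 2:3]     # perspective division
    return float(np.sqrt(np.mean(np.sum((uv - img) ** 2, axis=1))))


# Negating the two upper rows is a left-multiplication by diag(-1, -1, 1).
flip = np.diag([-1.0, -1.0, 1.0])
flip_matches = bool(np.allclose(flip @ R_cv, R_oct, atol=1e-3))

rmse_cv = reprojection_rmse(R_cv, T_cv, K, obj, img)
rmse_oct = reprojection_rmse(R_oct, T_oct, K, obj, img)

print("Octave R == diag(-1,-1,1) @ OpenCV R:", flip_matches)
print("OpenCV R is a rotation:", is_rotation(R_cv))
print("Octave R is a rotation:", is_rotation(R_oct))
print("reprojection RMSE, OpenCV pose (px):", rmse_cv)
print("reprojection RMSE, Octave pose (px):", rmse_oct)
```

Running the same check on all 24 correspondences only requires replacing `obj` and `img` with the full lists from the code above.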