
I recently added a perspective transform to my Android app using OpenCV. Almost everything works, but one aspect still needs a lot of work.

The problem is that I do not know how to compute the correct aspect ratio of the destination image for the perspective transform. It should not have to be set manually: the result should match the aspect ratio of the real object regardless of the camera angle. Note that the source corners form a general quadrilateral, not necessarily a trapezoid.

Say I have a photograph of a book taken from approximately 45 degrees, and I want the destination image to have pretty much the same aspect ratio as the book itself. That is hard to do from a single 2D photo, but the CamScanner app does it perfectly. I wrote a very simple way to compute the size of my destination image (with no expectation that it would work the way I want), but at a 45 degree angle it makes the image about 20% shorter, and as the angle decreases the image height shrinks significantly, while CamScanner gets it right regardless of the angle:

[image]

Here, CamScanner keeps the aspect ratio of the destination image (the second one) the same as the book's; it was fairly accurate even at a ~20 degree angle.

Meanwhile, my code looks like this (the part that computes the destination size was never meant to solve the problem I am asking about):

public static Mat PerspectiveTransform(Point[] cropCoordinates, float ratioW, float ratioH, Bitmap croppedImage)
{
    if (cropCoordinates.length != 4) return null;

    double width1, width2, height1, height2, avgw, avgh;
    Mat src = new Mat();
    List<Point> startCoords = new ArrayList<>();
    List<Point> resultCoords = new ArrayList<>();

    Utils.bitmapToMat(croppedImage, src);

    for (int i = 0; i < 4; i++)
    {
        // Clamp negative y to 0 (the original statement built a Point and discarded it).
        double y = Math.max(cropCoordinates[i].y, 0);
        startCoords.add(new Point(cropCoordinates[i].x * ratioW, y * ratioH));
    }

    width1 = Math.sqrt(Math.pow(startCoords.get(2).x - startCoords.get(3).x,2) + Math.pow(startCoords.get(2).y - startCoords.get(3).y,2));
    width2 = Math.sqrt(Math.pow(startCoords.get(1).x - startCoords.get(0).x,2) + Math.pow(startCoords.get(1).y - startCoords.get(0).y,2));
    height1 = Math.sqrt(Math.pow(startCoords.get(1).x - startCoords.get(2).x, 2) + Math.pow(startCoords.get(1).y - startCoords.get(2).y, 2));
    height2 = Math.sqrt(Math.pow(startCoords.get(0).x - startCoords.get(3).x, 2) + Math.pow(startCoords.get(0).y - startCoords.get(3).y, 2));
    avgw = (width1 + width2) / 2;
    avgh = (height1 + height2) / 2;

    resultCoords.add(new Point(0, 0));
    resultCoords.add(new Point(avgw-1, 0));
    resultCoords.add(new Point(avgw-1, avgh-1));
    resultCoords.add(new Point(0, avgh-1));

    Mat start = Converters.vector_Point2f_to_Mat(startCoords);
    Mat result = Converters.vector_Point2d_to_Mat(resultCoords);
    start.convertTo(start, CvType.CV_32FC2);
    result.convertTo(result,CvType.CV_32FC2);

    Mat mat = new Mat();
    Mat perspective = Imgproc.getPerspectiveTransform(start, result);
    Imgproc.warpPerspective(src, mat, perspective, new Size(avgw, avgh));

    return mat;
}

From roughly the same angle, my method produces this result:

[image]

What I want to know is how this can be done. I am curious how they manage to compute the proportions of the object from just the coordinates of its 4 corners. If possible, please also provide some code, a mathematical explanation, or articles about the same or a similar problem.

Thank you in advance.

Dainius Šaltenis
  • do you know the aspect ratio of the real object (since float ratioW, float ratioH are input parameters)? – Micka Jul 13 '16 at 06:08
  • maybe they assume some camera intrinsics with square pixels and assume that the object is rectangular. Maybe you can compute the aspect ratio from the fact that 3 points are known on each border line. But I haven't tried it yet. – Micka Jul 13 '16 at 06:15
  • Ok, I guess I know how to compute the aspect ratio of the real rectangular object. I thought back to my computer graphics lectures and remembered 2-point perspective for measuring distances in projections. See http://computergraphics.stackexchange.com/questions/1762/calculate-aspect-ratio-from-2d-shape-in-3d-space and http://www.handprint.com/HP/WCL/perspect3.html. Once you know the aspect ratio of the real object you are able to solve the problem, right? – Micka Jul 13 '16 at 08:33
  • That was a lot of math and calculations and diagrams and geometry ;) @Micka – ZdaR Jul 13 '16 at 08:48
  • well, you could try the angle-ratio of the vanishing lines as suggested in the stackexchange answer. That's quite easy to compute and maybe that's good enough. But: Projective geometry IS a lot of math ;) – Micka Jul 13 '16 at 08:50
  • Well, looking at the example image I provided I think that method could still not work (nice source by the way, interesting), because it looks like there is only one vanishing angle (at the top), because top and bottom lines are nearly parallel, so the ratio would be extremely high. Tell me if I am wrong. – Dainius Šaltenis Jul 13 '16 at 10:12
  • @DainiusŠaltenis maybe there are similar ways for 1-point-perspective, not sure. – Micka Jul 13 '16 at 12:48
  • @DainiusŠaltenis in 1-point perspective you must know or guess the focal distance. With that you can compute the diagonal's vanishing point for a square, and from that you can compute the ratio (maybe directly with cross-ratios or in some kind of reverse engineering). – Micka Jul 13 '16 at 14:11
  • You may find http://stackoverflow.com/questions/1194352/proportions-of-a-perspective-deformed-rectangle interesting. – jodag Jul 14 '16 at 01:43
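
Micka's point about a known or guessed focal distance can be sketched in code. The following is a minimal, stdlib-only sketch (not taken from any answer here) that assumes a pinhole camera with square pixels, a focal length `f` in pixels that you already know or have guessed, and corners given in zigzag order (top-left, top-right, bottom-left, bottom-right). The name `aspect_ratio_known_focal` and its parameters are made up for illustration.

```python
import math


def _cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def aspect_ratio_known_focal(corners, f, center):
    """Width/height of the real rectangle, given an assumed focal length.

    corners: four (x, y) image points in zigzag order
             (top-left, top-right, bottom-left, bottom-right).
    f:       assumed focal length in pixels (hypothetical input).
    center:  assumed principal point (u0, v0), e.g. the image center.
    """
    u0, v0 = center
    m1, m2, m3, m4 = [(float(x), float(y), 1.0) for x, y in corners]

    # Relative depths of corners 2 and 3 with respect to corner 1.
    k2 = _dot(_cross(m1, m4), m3) / _dot(_cross(m2, m4), m3)
    k3 = _dot(_cross(m1, m4), m2) / _dot(_cross(m3, m4), m2)

    # Edge directions of the rectangle, still in pixel units.
    n2 = tuple(k2 * a - b for a, b in zip(m2, m1))
    n3 = tuple(k3 * a - b for a, b in zip(m3, m1))

    def back_project(n):
        # Multiply by the inverse of the intrinsic matrix [[f,0,u0],[0,f,v0],[0,0,1]].
        return ((n[0] - u0 * n[2]) / f, (n[1] - v0 * n[2]) / f, n[2])

    r1, r2 = back_project(n2), back_project(n3)
    return math.sqrt(_dot(r1, r1) / _dot(r2, r2))
```

For the corners (200, 200), (600, 200), (300, 400), (500, 400) with `f = 400` and principal point (400, 300), this gives a ratio of about 0.8, even though the top and bottom edges are parallel and the outline in the image looks much wider than tall. A wrong guess for `f` shifts the result, so treat the output as only as good as the focal length you feed in.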

2 Answers


This has come up a few times before on SO but I've never seen a full answer, so here goes. The implementation shown here is based on this paper which derives the full equations: http://research.microsoft.com/en-us/um/people/zhang/papers/tr03-39.pdf

Essentially, it shows that, assuming a pinhole camera model, it is possible to calculate the aspect ratio of a projected rectangle (but not its scale, unsurprisingly): one solves for the focal length first, then gets the aspect ratio from it. Here's a sample implementation in Python using OpenCV. Note that the 4 detected corners must be in the right order or it won't work: the order is a zigzag (top-left, top-right, bottom-left, bottom-right). The reported error rates are in the 3-5% range.
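
For reference, these are the quantities the script below computes, written out as a sketch in the script's own notation (the symbols follow the variable names in the code, with the corners as homogeneous vectors and the principal point assumed to be the image center). Note that the script takes the absolute value where the formula for the focal length has a leading minus sign.

```latex
\tilde{m}_i = (u_i,\, v_i,\, 1)^{T}, \quad i = 1,\dots,4 \ \text{(zigzag order)}

k_2 = \frac{(\tilde{m}_1 \times \tilde{m}_4)\cdot \tilde{m}_3}
           {(\tilde{m}_2 \times \tilde{m}_4)\cdot \tilde{m}_3},
\qquad
k_3 = \frac{(\tilde{m}_1 \times \tilde{m}_4)\cdot \tilde{m}_2}
           {(\tilde{m}_3 \times \tilde{m}_4)\cdot \tilde{m}_2}

n_2 = k_2\,\tilde{m}_2 - \tilde{m}_1,
\qquad
n_3 = k_3\,\tilde{m}_3 - \tilde{m}_1

f^{2} = -\frac{1}{n_{23}\,n_{33}}
\Big\{
  \big[\, n_{21}n_{31} - (n_{21}n_{33} + n_{23}n_{31})\,u_0 + n_{23}n_{33}\,u_0^{2} \,\big]
  + \big[\, n_{22}n_{32} - (n_{22}n_{33} + n_{23}n_{32})\,v_0 + n_{23}n_{33}\,v_0^{2} \,\big]
\Big\}

A = \begin{pmatrix} f & 0 & u_0 \\ 0 & f & v_0 \\ 0 & 0 & 1 \end{pmatrix},
\qquad
\left(\frac{w}{h}\right)^{2}
  = \frac{n_2^{T} A^{-T} A^{-1} n_2}{n_3^{T} A^{-T} A^{-1} n_3}
```

The focal length comes from requiring the two recovered edge directions $A^{-1} n_2$ and $A^{-1} n_3$ to be perpendicular, and the ratio from comparing their lengths. The formula for $f^2$ breaks down when $n_{23}$ or $n_{33}$ is zero, that is, when $k_2 = 1$ or $k_3 = 1$.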

import math
import cv2
import scipy.spatial.distance
import numpy as np

img = cv2.imread('img.png')
(rows,cols,_) = img.shape

#image center
u0 = (cols)/2.0
v0 = (rows)/2.0

#detected corners on the original image
p = []
p.append((67,74))
p.append((270,64))
p.append((10,344))
p.append((343,331))

#widths and heights of the projected image
w1 = scipy.spatial.distance.euclidean(p[0],p[1])
w2 = scipy.spatial.distance.euclidean(p[2],p[3])

h1 = scipy.spatial.distance.euclidean(p[0],p[2])
h2 = scipy.spatial.distance.euclidean(p[1],p[3])

w = max(w1,w2)
h = max(h1,h2)

#visible aspect ratio
ar_vis = float(w)/float(h)

#make numpy arrays and append 1 for linear algebra
m1 = np.array((p[0][0],p[0][1],1)).astype('float32')
m2 = np.array((p[1][0],p[1][1],1)).astype('float32')
m3 = np.array((p[2][0],p[2][1],1)).astype('float32')
m4 = np.array((p[3][0],p[3][1],1)).astype('float32')

#calculate the focal distance
k2 = np.dot(np.cross(m1,m4),m3) / np.dot(np.cross(m2,m4),m3)
k3 = np.dot(np.cross(m1,m4),m2) / np.dot(np.cross(m3,m4),m2)

n2 = k2 * m2 - m1
n3 = k3 * m3 - m1

n21 = n2[0]
n22 = n2[1]
n23 = n2[2]

n31 = n3[0]
n32 = n3[1]
n33 = n3[2]

f = math.sqrt(np.abs( (1.0/(n23*n33)) * ((n21*n31 - (n21*n33 + n23*n31)*u0 + n23*n33*u0*u0) + (n22*n32 - (n22*n33+n23*n32)*v0 + n23*n33*v0*v0))))

A = np.array([[f,0,u0],[0,f,v0],[0,0,1]]).astype('float32')

At = np.transpose(A)
Ati = np.linalg.inv(At)
Ai = np.linalg.inv(A)

#calculate the real aspect ratio
ar_real = math.sqrt(np.dot(np.dot(np.dot(n2,Ati),Ai),n2)/np.dot(np.dot(np.dot(n3,Ati),Ai),n3))

if ar_real < ar_vis:
    W = int(w)
    H = int(W / ar_real)
else:
    H = int(h)
    W = int(ar_real * H)

pts1 = np.array(p).astype('float32')
pts2 = np.float32([[0,0],[W,0],[0,H],[W,H]])

#project the image with the new w/h
M = cv2.getPerspectiveTransform(pts1,pts2)

dst = cv2.warpPerspective(img,M,(W,H))

cv2.imshow('img',img)
cv2.imshow('dst',dst)
cv2.imwrite('orig.png',img)
cv2.imwrite('proj.png',dst)

cv2.waitKey(0)
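
Since the script depends on the zigzag corner order, here is a minimal sketch of one way to bring four detected corners into that order. It is a simple heuristic, not part of the paper: it assumes image coordinates with y growing downward and a document that is rotated by clearly less than 45 degrees, so that the two topmost points really are the top edge. The helper name `order_corners_zigzag` is made up for illustration.

```python
def order_corners_zigzag(points):
    """Return 4 (x, y) points as top-left, top-right, bottom-left, bottom-right.

    Heuristic: the two points with the smallest y form the top edge,
    the other two the bottom edge; each pair is then sorted by x.
    """
    if len(points) != 4:
        raise ValueError("expected exactly 4 corner points")
    by_y = sorted(points, key=lambda p: p[1])
    top = sorted(by_y[:2], key=lambda p: p[0])
    bottom = sorted(by_y[2:], key=lambda p: p[0])
    return top + bottom


# The corners from the script above, deliberately shuffled.
shuffled = [(343, 331), (67, 74), (10, 344), (270, 64)]
ordered = order_corners_zigzag(shuffled)
# ordered == [(67, 74), (270, 64), (10, 344), (343, 331)]
```

For strongly rotated documents the top two points by y may belong to different edges, so a more careful ordering (for example by angle around the centroid) would be needed there.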

Original:

[image]

Projected (the resolution is very low since I cropped the image from your screenshot, but the aspect ratio seems correct):

[image]

David
yhenon
  • Any idea why I'm getting negative numbers for fSquare and thus f = NaN? I tried to implement this in Java... EDIT: This only happens sometimes... is it possible that my double is overflowing? – 1resu Nov 03 '17 at 16:41
  • Check if your 4 points are ordered as expected. They follow a particular ordering. – yhenon Nov 03 '17 at 17:45
  • Not sure if anyone is using this, but I figured out the f = NaN issue. If the image is skewed so that the top is narrower than the bottom, the number inside the sqrt is negative, hence having to negate it like in the code above. However, if the top is wider than the bottom, the number is positive, so the negation causes an issue. If you replace the negation inside the sqrt with an np.abs, it should work. – David Jun 17 '18 at 00:55
  • What about singularity? The system has no solution if `n23` and/or `n33` are (close to) zero. According to the paper this should happen when the image is already a rectangle, which would make the problem easy to solve. In my experience though it happens whenever two opposing lines are parallel (which unsurprisingly is rather often). I've yet to figure out the best way to solve this in this particular approach. – Elte Hupkes Jul 25 '18 at 14:39
  • @ElteHupkes the treatment of this case is detailed in section 2.3.2 of the paper. In particular, you should be able to calculate the aspect ratio using equation (31). Did you give that a try? I'll try and update my answer when I have some time. – yhenon Jul 25 '18 at 16:26
  • @yhenon It's all I've been doing this afternoon ;). I initially implemented it just like that but after running into lots of degenerate cases that were far from rectangular I started to investigate it further. As it turns out, `k2==1` whenever `m1-m2` and `m4-m3` are parallel (likewise for `k3` and the other two sides). I haven't yet figured out where the logic fails (the math _looks_ sound) but you can try it for yourself: construct a quadrangle such that `m1-m2` and `m3-m4` are parallel; calculate `k2` - it will be 1 regardless of the direction of the other lines. – Elte Hupkes Jul 25 '18 at 16:38
  • More Googling yields this blog post: http://andrewkay.name/blog/post/aspect-ratio-of-a-rectangle-in-perspective/. At the end he mentions that the problem cannot be solved if two lines are parallel and the other two are not. That settles that, I guess... – Elte Hupkes Jul 25 '18 at 17:03
  • @yhenon what license is this code under? I would like to include it in a CC-0 or CC-BY licensed image cropping tool I wrote. – blackcoffeerider May 15 '20 at 19:35
  • @blackcoffeerider I believe that code on SO is CC BY-SA. So you are free to use this code as long as there is attribution. Enjoy, and I'd love to see your project when it is done. – yhenon May 15 '20 at 19:51
  • @yhenon thanks a lot! You can find my first tiny steps in image processing here, if interested: https://github.com/blackcoffeerider/kkaffeedetect – blackcoffeerider May 16 '20 at 16:41
  • This method suffers from **runaway** numerical instability whenever any two lines start to get *almost* parallel, which happens quite often in my application. In my experience, this method is unreliable and I doubt that Microsoft is actually using it in production. They have probably made it more robust somehow, maybe by means of some heuristics. Beware: just "plugging in" this algorithm "as is" in your application will likely cause a mess. – gd1 Feb 11 '21 at 00:42
  • Interesting, could you share some samples that lead to numerical issues? – yhenon Feb 11 '21 at 19:14
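
The degenerate cases raised in these comments can at least be detected before they produce garbage. Below is a minimal, stdlib-only sketch of the same computation with guards added. It is not the paper's treatment and not a fix for the instability: the threshold `eps` is an arbitrary guess, the name `estimate_aspect_ratio` is made up, and instead of taking `np.abs` of a negative f² it simply reports failure by returning `None`. When both pairs of opposite sides are parallel it returns the plain side-length ratio; when only one pair is, it gives up, in line with the blog post linked above.

```python
import math


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def estimate_aspect_ratio(corners, center, eps=1e-3):
    """Estimate width/height of a rectangle from its 4 projected corners.

    corners: (x, y) points in zigzag order (top-left, top-right,
             bottom-left, bottom-right).
    center:  assumed principal point (u0, v0), typically the image center.
    eps:     heuristic threshold for "sides are (almost) parallel".
    Returns None when the focal length cannot be recovered.
    """
    u0, v0 = center
    # Shift the origin to the principal point so u0 = v0 = 0 in the formulas.
    m1, m2, m3, m4 = [(x - u0, y - v0, 1.0) for x, y in corners]

    den2 = _dot(_cross(m2, m4), m3)
    den3 = _dot(_cross(m3, m4), m2)
    if den2 == 0 or den3 == 0:
        return None  # three of the corners are collinear

    k2 = _dot(_cross(m1, m4), m3) / den2
    k3 = _dot(_cross(m1, m4), m2) / den3
    n2 = [k2 * a - b for a, b in zip(m2, m1)]
    n3 = [k3 * a - b for a, b in zip(m3, m1)]

    parallel_2 = abs(n2[2]) < eps  # k2 is (close to) 1
    parallel_3 = abs(n3[2]) < eps  # k3 is (close to) 1
    if parallel_2 and parallel_3:
        # No visible perspective: use the side lengths directly.
        return (math.hypot(m2[0] - m1[0], m2[1] - m1[1]) /
                math.hypot(m3[0] - m1[0], m3[1] - m1[1]))
    if parallel_2 or parallel_3:
        return None  # focal length is not recoverable from this view

    f_sq = -(n2[0] * n3[0] + n2[1] * n3[1]) / (n2[2] * n3[2])
    if f_sq <= 0:
        return None  # inconsistent with a pinhole view of a rectangle

    num = (n2[0] ** 2 + n2[1] ** 2) / f_sq + n2[2] ** 2
    den = (n3[0] ** 2 + n3[1] ** 2) / f_sq + n3[2] ** 2
    return math.sqrt(num / den)
```

A caller that gets `None` back has to fall back to something else, for example a guessed focal length or the visible side lengths. Returning `None` rather than a number is a deliberate choice here so that an unreliable estimate is never silently used.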

Thanks to y300 and this post https://stackoverflow.com/a/1222855/8746860 I got it implemented in Java. I'll leave this here in case someone has the same problems I had converting it to Java...

public float getRealAspectRatio(int imageWidth, int imageHeight) {

    // Principal point, assumed to be the image center (2.0 avoids integer division).
    double u0 = imageWidth / 2.0;
    double v0 = imageHeight / 2.0;
    double m1x = mTopLeft.x - u0;
    double m1y = mTopLeft.y - v0;
    double m2x = mTopRight.x - u0;
    double m2y = mTopRight.y - v0;
    double m3x = mBottomLeft.x - u0;
    double m3y = mBottomLeft.y - v0;
    double m4x = mBottomRight.x - u0;
    double m4y = mBottomRight.y - v0;

    double k2 = ((m1y - m4y)*m3x - (m1x - m4x)*m3y + m1x*m4y - m1y*m4x) /
            ((m2y - m4y)*m3x - (m2x - m4x)*m3y + m2x*m4y - m2y*m4x) ;

    double k3 = ((m1y - m4y)*m2x - (m1x - m4x)*m2y + m1x*m4y - m1y*m4x) /
            ((m3y - m4y)*m2x - (m3x - m4x)*m2y + m3x*m4y - m3y*m4x) ;

    double f_squared =
            -((k3*m3y - m1y)*(k2*m2y - m1y) + (k3*m3x - m1x)*(k2*m2x - m1x)) /
                    ((k3 - 1)*(k2 - 1)) ;

    double whRatio = Math.sqrt(
            (Math.pow((k2 - 1),2) + Math.pow((k2*m2y - m1y),2)/f_squared + Math.pow((k2*m2x - m1x),2)/f_squared) /
                    (Math.pow((k3 - 1),2) + Math.pow((k3*m3y - m1y),2)/f_squared + Math.pow((k3*m3x - m1x),2)/f_squared)
    ) ;

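    // Both pairs of opposite sides are parallel: f_squared above divides by zero,
    // so fall back to the plain side-length ratio.
    if (k2==1 && k3==1 ) {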
        whRatio = Math.sqrt(
                (Math.pow((m2y-m1y),2) + Math.pow((m2x-m1x),2)) /
                        (Math.pow((m3y-m1y),2) + Math.pow((m3x-m1x),2)));
    }

    return (float)(whRatio);
}
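
The method above only returns the ratio. As a minimal sketch of the remaining step, here is how a destination size could be picked from it, mirroring the branch at the end of the Python answer: keep the longer visible side and stretch the other one. The name `destination_size` is made up, and rounding instead of truncating with `int()` is a choice of this sketch.

```python
def destination_size(visible_w, visible_h, real_ratio):
    """Pick an output (width, height) that has the real aspect ratio.

    visible_w, visible_h: side lengths of the quadrilateral in the photo
                          (e.g. the longer of each pair of opposite sides).
    real_ratio:           estimated real width / height.
    """
    if visible_w <= 0 or visible_h <= 0 or real_ratio <= 0:
        raise ValueError("sizes and ratio must be positive")
    if real_ratio < visible_w / visible_h:
        # The photo looks wider than the object is: keep width, extend height.
        width = round(visible_w)
        height = round(width / real_ratio)
    else:
        # The photo looks taller than the object is: keep height, extend width.
        height = round(visible_h)
        width = round(real_ratio * height)
    return width, height


# A quadrilateral that appears 400 x 200 but belongs to a 1:2 (w:h) object.
size = destination_size(400, 200, 0.5)
# size == (400, 800)
```

The resulting pair can then be used both for the destination corner coordinates and for the output size passed to the warp.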
1resu