This tool calculates a perspective transformation matrix from four pairs of matching points using the getPerspectiveTransform function from OpenCV. Initially I wanted to implement it in pure JavaScript, but I couldn't be bothered to write a routine that solves the linear system with SVD.

So why is this useful? It's an easy way to apply a perspective warp to an image: you simply multiply the original pixel coordinates by the transformation matrix. Look at this example:

   

We want to turn these images into this:

To do this, we use four points in the sheep image (the four corners) and enter them as source coordinates. We then select four points in the box image and enter them as destination coordinates.

With this information you can calculate the transformation matrix that maps every pixel of the sheep image into the box image.

You might wonder how you can multiply your 2D coordinates by a 3x3 matrix. The answer is homogeneous coordinates. Simply append a 1 to your vector:
\begin{align*} \left[ \begin{array}{c} x\\ y \end{array} \right] \Rightarrow \left[ \begin{array}{c} x\\ y\\ 1 \end{array} \right] \end{align*}
Multiply that by the transformation matrix T:
\begin{align*} \left[ \begin{array}{c} x'\\ y'\\ z' \end{array} \right] = T\left[ \begin{array}{c} x\\ y\\ 1 \end{array} \right] \end{align*}
You end up with a vector that has three entries. Divide the first two components by the third (which must be nonzero) and you have your new x/y values:
\begin{align*} \left[ \begin{array}{c} \bar x\\ \bar y \end{array} \right] = \left[ \begin{array}{c} x'/z'\\ y'/z' \end{array} \right] \end{align*}
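The multiply-then-divide step can be sketched in a few lines of plain C++ without OpenCV. The names Mat3, Point2 and applyPerspective are my own for illustration, and I'm assuming the matrix is stored row by row, in the same order the tool prints its nine values:

```cpp
#include <array>

// 3x3 matrix stored row by row (assumed layout: the nine values in print order).
using Mat3 = std::array<double, 9>;

// Hypothetical helper type for a 2D point.
struct Point2 {
    double x;
    double y;
};

// Map p through the perspective matrix t using homogeneous coordinates.
Point2 applyPerspective(const Mat3& t, const Point2& p)
{
    // Multiply T by the homogeneous vector (x, y, 1).
    const double xh = t[0] * p.x + t[1] * p.y + t[2];
    const double yh = t[3] * p.x + t[4] * p.y + t[5];
    const double zh = t[6] * p.x + t[7] * p.y + t[8];

    // Divide by the third component to get back to 2D.
    // If zh is 0 the point is mapped to infinity; this sketch does not handle that case.
    return { xh / zh, yh / zh };
}
```

To warp a whole image you would run every pixel coordinate through a function like this (in practice you usually go the other way and look up source pixels with the inverse matrix, so that no holes appear in the result).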

To create the transformation matrix you need to specify four points in the original image and then specify where you want these points to be "mapped" (transformed) to.
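For the curious: four point pairs give eight equations, which is just enough to determine the matrix once its bottom-right entry is fixed to 1. Below is a minimal sketch of that calculation in plain C++. It is not what the tool runs (that is OpenCV's getPerspectiveTransform); I've swapped in ordinary Gauss-Jordan elimination instead of SVD, and the names solvePerspective, Mat3 and Point2 are made up for this example:

```cpp
#include <array>
#include <cmath>
#include <utility>

using Mat3 = std::array<double, 9>;   // row by row, t[8] is fixed to 1
struct Point2 { double x; double y; };

// Solve for the matrix that maps src[i] to dst[i], i = 0..3.
// Returns false if the system is (numerically) singular,
// e.g. when three of the points lie on one line.
bool solvePerspective(const std::array<Point2, 4>& src,
                      const std::array<Point2, 4>& dst,
                      Mat3& t)
{
    // Augmented 8x9 system. Each pair (x,y) -> (u,v) contributes two rows:
    //   t0*x + t1*y + t2 - t6*x*u - t7*y*u = u
    //   t3*x + t4*y + t5 - t6*x*v - t7*y*v = v
    double a[8][9] = {};
    for (int i = 0; i < 4; ++i) {
        const double x = src[i].x, y = src[i].y;
        const double u = dst[i].x, v = dst[i].y;
        double* r0 = a[2 * i];
        double* r1 = a[2 * i + 1];
        r0[0] = x; r0[1] = y; r0[2] = 1.0; r0[6] = -x * u; r0[7] = -y * u; r0[8] = u;
        r1[3] = x; r1[4] = y; r1[5] = 1.0; r1[6] = -x * v; r1[7] = -y * v; r1[8] = v;
    }

    // Gauss-Jordan elimination with partial pivoting.
    for (int col = 0; col < 8; ++col) {
        int pivot = col;
        for (int r = col + 1; r < 8; ++r)
            if (std::fabs(a[r][col]) > std::fabs(a[pivot][col]))
                pivot = r;
        if (std::fabs(a[pivot][col]) < 1e-12)
            return false;                       // no usable pivot: degenerate input
        for (int c = 0; c < 9; ++c)
            std::swap(a[col][c], a[pivot][c]);
        for (int r = 0; r < 8; ++r) {
            if (r == col) continue;
            const double f = a[r][col] / a[col][col];
            for (int c = col; c < 9; ++c)
                a[r][c] -= f * a[col][c];
        }
    }

    // The matrix is now diagonal; read off the eight unknowns.
    for (int i = 0; i < 8; ++i)
        t[i] = a[i][8] / a[i][i];
    t[8] = 1.0;
    return true;
}
```

Fixing the last entry to 1 is a common normalization, but it means this sketch cannot represent transformations whose true bottom-right entry is 0. For the usual "corners of one quadrilateral to corners of another" case that should not be a problem.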

Enter four points as source coordinates:
(x1, y1) = (, )
(x2, y2) = (, )
(x3, y3) = (, )
(x4, y4) = (, )

Now enter four matching points as destination coordinates:
(x1, y1) = (, )
(x2, y2) = (, )
(x3, y3) = (, )
(x4, y4) = (, )



If you're interested in what the OpenCV C++ code looks like, here you go. I've used Qt for argument parsing and output preparation. Also, it's written for readability rather than speed:

#include <iostream>

#include <QCoreApplication>
#include <QString>
#include <QStringList>

#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

using namespace cv;
using namespace std;

int main(int argc, char *argv[])
{
    // Check number of arguments
    QCoreApplication a(argc, argv);
    QStringList args = a.arguments();
    if(args.length() != 3) {
        cout << "Usage: persTransform <srcx1>,<srcy1>,... ";
        cout << "<dstx1>,<dsty1>,..." << endl;
        return -1;
    }

    // Check number of coordinates (4 x/y pairs = 8 values per argument)
    QStringList asrc = args.at(1).split(",");
    QStringList adst = args.at(2).split(",");
    if(asrc.length() != 8 || adst.length() != 8) {
        cout << "Invalid use: need 4 x/y pairs for source ";
        cout << "and destination coordinates!" << endl;
        return -1;
    }

    // Assign source/destination coordinates
    Point2f src[4], dst[4];
    for(int i=0; i<4; i++) {
        src[i].x = asrc.at(2*i).toFloat();
        src[i].y = asrc.at(2*i + 1).toFloat();
        dst[i].x = adst.at(2*i).toFloat();
        dst[i].y = adst.at(2*i + 1).toFloat();
    }

    // Calculate transformation matrix
    Mat t = getPerspectiveTransform(src, dst);

    // Collect the matrix entries row by row as strings
    QStringList output;
    for(int r = 0; r < t.rows; r++)
        for(int c = 0; c < t.cols; c++) {
            QString m;
            // Please note that .at takes (row, column), i.e. (y, x), not (x, y)
            m.setNum(t.at<double>(r, c));
            output.push_back(m);
        }

    // Output matrix as comma-separated list
    cout << output.join(",").toStdString();
    cout << endl;

    return 0;
}
 
