Segmenting Trading Cards

 
A few years ago, I wrote some code to detect the names of Ashes: Rise of the Phoenixborn game cards from photographs. It was a learning experience for me, and it touched on a handful of topics, including image segmentation and text detection and extraction. I don't want to spend time on the latter today, largely because technology moves fast and I'm sure there are better approaches to that problem now than there were when I wrote this. But I do want to talk about the former, because I think it is still broadly applicable to computer vision and image processing. So let's talk about it, starting with a problem statement:

I have some trading cards on a flat surface. How can I detect them?



I'm going to rely on OpenCV here, because it's what's familiar to me and it's well suited to tasks like this. Conceptually, though, you could use whatever library or framework you want, or even roll your own code by hand.

To start with, we're going to convert our image to grayscale and then apply a Gaussian blur to it. This smoothing will aid us in our next step, which is detecting the edges with Canny edge detection.




Once you have the Mat containing the Canny edges, we want to prepare it for segmentation by applying some morphological transformations. Canny edge detection leaves the background a solid black, with the outlines of the cards in white. However, all of the detail within the borders of the cards is problematic. By applying dilation followed by erosion, we can make this detail more "chunky", with larger areas of contiguous white and black pixels.


Having the cards like this is useful, because we can perform some magic to really close the shapes. I learned this neat little trick years back, and it's been a powerful tool in my toolbox ever since. The idea is that we invert the image, then flood-fill it so that all (or most) of the interior pixels get filled in. This gets us something like this:




This entire process looks something like:

#include <opencv2/opencv.hpp>

// Produce a binary image that segments the cards from the surface they are on:
// grayscale -> blur -> Canny -> dilate/erode -> fill holes
void color_to_thresh(cv::Mat & src, cv::Mat & dst) {
    cv::Mat img_bw;
    cv::cvtColor(src, img_bw, cv::ColorConversionCodes::COLOR_BGR2GRAY);
    cv::GaussianBlur(img_bw, img_bw, cv::Size(3, 3), 7);
    cv::Mat canny;
    cv::Canny(img_bw, canny, 50, 255);
    cv::Mat elem = cv::getStructuringElement(cv::MORPH_ELLIPSE, cv::Size(7, 7));
    cv::dilate(canny, canny, elem, cv::Point(-1, -1), 2);
    cv::erode(canny, canny, elem, cv::Point(-1, -1), 2);
    dst = canny.clone();
    cv::Mat tmp = dst.clone();
    fillHoles(tmp, dst);
}

It isn't perfect, but it doesn't have to be. The shapes are clean, and that allows us to do what we need to do next, which is use OpenCV's findContours function to trace the outlines of these shapes. If we were to render them, it would look something like the following.



Doing this is a one-liner in OpenCV:

cv::findContours(thresh, contours, hierarchy, cv::RetrievalModes::RETR_TREE, cv::ContourApproximationModes::CHAIN_APPROX_TC89_KCOS);

We're now most of the way there, but we aren't done just yet. First we want to clean up the contours, and we can do that in three steps. First, we eliminate all of the contours that are too small (or too large). findContours returns each contour as a vector of points, and passing one to OpenCV's contourArea function gives its area, so with a bit of experimental trial and error you can figure out what sizes work for the shapes you are detecting, under the conditions you are detecting them in. After that, we can run a looping test to remove any contour that is contained inside another contour. Lastly, we can utilize OpenCV's convexHull function to take the contours that we have left -- the ones we haven't pruned -- and get the outer edge surrounding them. This is really where the contour gets cleaned up.

// Removes contours nested inside larger ones, and replaces each surviving
// contour with its convex hull, to avoid contours bleeding into the background
// Also removes especially large and especially small contours
void cleanContours(const std::vector<std::vector<cv::Point>>& inputContours, std::vector<std::vector<cv::Point>>& outputContours, double minArea, double maxArea) {
    std::vector<bool> keepContour(inputContours.size(), true);

    // Step 1: Remove contours outside the expected size range, and contours
    // nested inside another contour
    for (size_t i = 0; i < inputContours.size(); ++i) {
        double area = cv::contourArea(inputContours[i]);
        if (area < minArea || area > maxArea) {
            keepContour[i] = false;
            continue;
        }
        for (size_t j = 0; j < inputContours.size(); ++j) {
            if (i != j && keepContour[j]) {
                // Test a point of contour j against contour i;
                // >= 0 means inside or on the edge
                if (cv::pointPolygonTest(inputContours[i], inputContours[j][0], false) >= 0) {
                    keepContour[j] = false;
                }
            }
        }
    }

    // Step 2: Generate the convex hull of the remaining contours
    for (size_t i = 0; i < inputContours.size(); ++i) {
        if (keepContour[i]) {
            std::vector<cv::Point> hull;
            cv::convexHull(inputContours[i], hull);
            outputContours.push_back(hull);
        }
    }
}

And voila, we have neatly segmented Ashes cards! I'd wager this process would work for segmenting most trading-card-shaped objects.


