Posts

Solving Sudoku with Computer Vision

I decided today that it was time to get some mental exercise, and that I wanted to create a program that could take a picture of a Sudoku puzzle and then solve it. A quick Google search will reveal that this idea is far from novel. I'm not the first person to do this, and I surely will not be the last. Nonetheless, working through the problem was an enjoyable experience, and a nice way to stretch my problem-solving skills. So let's just hop into how I did it. First, I wanted to preprocess the image. The first step in reading the digits from the board is to actually find where the board is in the image. There are tools at our disposal for doing that, but none of them will perform well on an image like this. Instead, we should binarize the image, and clean it up as best we can.

#-- Preprocess the image so that it is a clean black and white image
#-- that the puzzle can be extracted from
def preprocessImage(img, visualize=False):
    #-- Convert the ...
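To make the binarization idea a bit more concrete, here is a minimal, dependency-free sketch of a local-mean (adaptive) threshold. This is not the post's `preprocessImage`, which is cut off above; the function name `binarize` and its `radius` and `offset` parameters are assumptions made up for illustration, and a real pipeline would more likely lean on an image library than on nested Python lists.

```python
def binarize(gray, radius=1, offset=10):
    """Threshold a grayscale image (list of rows, values 0-255) against its local mean.

    Pixels noticeably darker than their neighbourhood (ink on paper) become 255,
    everything else becomes 0. `radius` and `offset` are illustrative knobs.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Gather the neighbourhood around (x, y), clipped at the borders.
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            window = [gray[r][c] for r in rows for c in cols]
            local_mean = sum(window) / len(window)
            # Mark the pixel as foreground only if it is clearly darker
            # than its surroundings.
            if gray[y][x] < local_mean - offset:
                out[y][x] = 255
    return out


# A tiny 3x3 "photo": light paper with one dark pixel in the middle.
sample = [
    [200, 200, 200],
    [200, 20, 200],
    [200, 200, 200],
]
mask = binarize(sample)
```

Comparing each pixel to its own neighbourhood rather than to one global cutoff is what tends to help with uneven lighting in a photograph, which is the usual reason for choosing an adaptive threshold.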

Projecting New Points onto a Mediapipe Face Mesh

Alright, so let me start by saying: there won't be any code shared this time. While that might be disappointing, I'm not entirely sure to what extent I'm allowed to write and share code on this topic, due to non-compete clauses I may have signed while working with my previous employer, and I'd rather not risk crossing any lines. I will, however, talk about a really cool problem that I was able to solve, and lay out the high-level approach that I took to solving it. If you're familiar with Google's Mediapipe, you may know that it provides a powerful framework for detecting vertices on the surface of a subject's face. One limitation, however, is that the mesh produced by Mediapipe stops about halfway up the forehead. For what I was working on at the time, that wasn't enough for us; it was important that we generate vertices that extend to the top of the forehead as well. And after some amount of problem solving and engineering, I ...

Segmenting Trading Cards

A few years ago, I wrote some code for detecting the names of Ashes: Rise of the Phoenixborn game cards from photographs. That was done as a learning experience for myself, and dealt with a handful of different topics, such as image segmentation and text extraction and detection. Today, I don't want to spend time talking about the latter, largely because technology moves fast and I'm sure there are better approaches to the problem today than there were when I created this. But I do want to talk about the former, because I think it is still broadly applicable to the topic of computer vision and image processing. So let's talk about it, starting with a problem statement: I have some trading cards on a flat surface. How can I detect them? I'm going to rely on OpenCV functionality here, because it's what's familiar to me, and it just makes sense to use it for tasks like this. But conceptually, you could use whatever library or framework you want, or ...

Connecting AI to Google Places

I have a proof-of-concept web app called Ready Yup included in my portfolio. You can see a live demo of it here, but in short, it is a demonstration of an integration between a large language model (LLM) and Google's Maps and Places APIs. I thought it was an interesting idea, and I wanted to talk a bit about how I made it. So first of all, let's talk about what I wanted the application to do, because that's what informed my choices for which technologies I used. My idea was for an application that would allow you to get a short list of restaurants in a given area, and to then give you basic information such as distance and an average customer rating. I also wanted it to produce very short blurb-summaries of the good and bad points of each restaurant, to help a user decide between potential locations at a quick glance. The first part was easy enough. Google Places supports querying through a RESTful API, and it supports powerful text-based search queries. For exam...
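As a rough illustration of what a text-based Places query can look like, here is a small sketch that builds a request URL for the legacy Places Text Search endpoint and pulls names and ratings out of a response. The helper names, the query string, the `YOUR_API_KEY` placeholder, and the sample payload are all made up for this example; it is not the code behind Ready Yup, and it deliberately makes no network call.

```python
from urllib.parse import urlencode

# Legacy Places "Text Search" endpoint (JSON output).
TEXT_SEARCH_URL = "https://maps.googleapis.com/maps/api/place/textsearch/json"


def build_text_search_url(query, api_key):
    """Return the GET URL for a free-text Places search (hypothetical helper)."""
    params = {"query": query, "key": api_key}
    return f"{TEXT_SEARCH_URL}?{urlencode(params)}"


def extract_name_and_rating(payload):
    """Pull (name, rating) pairs out of a decoded Text Search response."""
    return [
        (place.get("name"), place.get("rating"))
        for place in payload.get("results", [])
    ]


url = build_text_search_url("pizza near Boston", "YOUR_API_KEY")

# Hand-written sample shaped like a Text Search response; not real data.
sample_payload = {
    "results": [
        {"name": "Example Pizzeria", "rating": 4.5},
        {"name": "Sample Slice", "rating": 4.1},
    ],
    "status": "OK",
}
places = extract_name_and_rating(sample_payload)
```

Keeping URL construction and response parsing as separate pure functions makes both easy to test without an API key; the actual HTTP request could then be made with whichever client the application already uses.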