A computer vision app that scans physical puzzles and provides real-time solving assistance.
Smart Toys & Games is a Belgian company that designs physical and digital puzzles and games for children and families. Their logic puzzles challenge players to fit uniquely shaped pieces onto a board: easy to start, but hard to finish.
When a player got stuck, the only option was a printed booklet, but it showed the full solution at once. There was no way to get just a small hint. And if the player was trying a challenge outside the booklet, there was no help at all: the booklet covered only 120 challenges and their solutions, while the puzzle had far more possible configurations.


That's where I came in. I developed a computer vision app that scanned the physical puzzle board through the phone camera, detected every piece and its position, and rebuilt the board digitally. From that digital state, the app could check whether the puzzle was still solvable and provide progressive hints, starting with a single cell, then revealing more, up to the full piece placement. If the current placement led to a dead end, it identified exactly which pieces needed to be removed. Everything ran on the user's device, with no backend and no installation needed.
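To make the progressive hints concrete, here is a minimal sketch of how a hint level could decide how much of one piece's solved placement to reveal. The function name `progressive_hint`, the `(row, col)` cell representation, and the choice to reveal cells in sorted order are assumptions for illustration, not the app's actual code.

```python
from typing import List, Tuple

Cell = Tuple[int, int]  # (row, col) on the digital grid (assumed representation)


def progressive_hint(solution_cells: List[Cell], level: int) -> List[Cell]:
    """Return the cells of one solved piece to reveal at a given hint level.

    Level 1 reveals one cell, level 2 reveals two cells, and level 3 or
    higher reveals the whole piece. Levels below 1 reveal nothing.
    """
    if level < 1:
        return []
    ordered = sorted(solution_cells)  # deterministic reveal order (an assumption)
    if level >= 3:
        return ordered
    return ordered[:level]


# Hypothetical piece occupying three cells in the solved board.
piece = [(1, 2), (0, 1), (1, 1)]
first_hint = progressive_hint(piece, 1)   # a single cell
second_hint = progressive_hint(piece, 2)  # two cells
full_hint = progressive_hint(piece, 3)    # the full piece placement
```

Each higher level is a superset of the previous one, so a player who asks for more help never sees a hint that contradicts an earlier one.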
The app processed each photo through a full computer vision pipeline. The image was first preprocessed and fed into a YOLOv26 Nano model that detected every piece and the board using instance segmentation. The detected masks were then decoded, and the camera angle was corrected using homography to produce a top-down view. From there, each piece was mapped to its position on the digital grid. Finally, a backtracking solver determined whether the puzzle was still solvable and generated hints or a full solution.
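The perspective-correction and grid-mapping steps can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: it assumes the four board corners are already known from the segmentation mask, that the top-down board is a square of `board_size` pixels, and that the grid is a regular `rows` by `cols` layout. All function names are hypothetical. In a Python/OpenCV pipeline, `cv2.getPerspectiveTransform` would normally replace the hand-written solver.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Matrix = List[List[float]]


def solve_linear(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate corner configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def find_homography(src: List[Point], dst: List[Point]) -> Matrix:
    """Estimate the 3x3 homography mapping four src points onto four dst points."""
    a: Matrix = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)  # eight unknowns; the ninth entry is fixed to 1
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def warp_point(h: Matrix, p: Point) -> Point:
    """Apply the homography to one image point (with perspective division)."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


def to_cell(p: Point, board_size: float, rows: int, cols: int) -> Tuple[int, int]:
    """Map a top-down point to a (row, col) grid cell, clamped to the board."""
    x, y = p
    col = min(cols - 1, max(0, int(x / board_size * cols)))
    row = min(rows - 1, max(0, int(y / board_size * rows)))
    return row, col


# Hypothetical board corners in the photo, clockwise from the top-left.
corners = [(100.0, 50.0), (500.0, 80.0), (540.0, 420.0), (60.0, 380.0)]
square = [(0.0, 0.0), (400.0, 0.0), (400.0, 400.0), (0.0, 400.0)]
H = find_homography(corners, square)
```

With `H` in hand, a point from a detected piece mask (for example its centroid) can be passed through `warp_point` and then `to_cell` to find the grid cell it covers.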
Built the dataset two ways: 1,800+ synthetic images rendered in Blender with fully automated annotation, plus 240+ real photos hand-labeled with instance segmentation in Roboflow.
Trained a YOLOv26 Nano model in two stages, pretraining on synthetic data and then fine-tuning on real photos, reaching 99.5% mAP.
Ported the entire computer-vision pipeline from Python/OpenCV to pure JavaScript, running fully in the browser, with the YOLO model executed in WebAssembly via ONNX Runtime. No server.
Built a backtracking solver with progressive hints (one cell → two cells → full piece) and solvability checking.
Applied the same architecture to a second puzzle with a completely different board, grid, and piece shapes.
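The solver and dead-end check can be illustrated with a small backtracking sketch. Everything here is an assumption made for the example: the board is a plain rectangle that must be covered exactly, each piece is a list of allowed orientations, and the names `solve`, `pieces_to_remove`, and `normalise` are hypothetical. The real puzzles use different boards and piece shapes.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Optional, Tuple

Cell = Tuple[int, int]
Shape = Tuple[Cell, ...]
Placement = Dict[str, FrozenSet[Cell]]


def normalise(cells: Iterable[Cell]) -> Shape:
    """Shift a shape so its first cell in row-major order becomes (0, 0)."""
    ordered = sorted(cells)
    r0, c0 = ordered[0]
    return tuple((r - r0, c - c0) for r, c in ordered)


def solve(rows: int, cols: int, pieces: Dict[str, List[Shape]],
          placed: Placement) -> Optional[Placement]:
    """Complete the board by backtracking; return None if it is a dead end."""

    def search(occupied, remaining, result):
        # The first empty cell in row-major order must be covered next.
        empty = next(((r, c) for r in range(rows) for c in range(cols)
                      if (r, c) not in occupied), None)
        if empty is None:
            return result if not remaining else None
        for name in sorted(remaining):
            for shape in pieces[name]:
                cells = frozenset((empty[0] + dr, empty[1] + dc)
                                  for dr, dc in shape)
                if all(0 <= r < rows and 0 <= c < cols
                       and (r, c) not in occupied for r, c in cells):
                    found = search(occupied | cells, remaining - {name},
                                   {**result, name: cells})
                    if found is not None:
                        return found
        return None  # no piece fits here, so the caller must backtrack

    occupied = frozenset().union(*placed.values())
    return search(occupied, frozenset(pieces) - frozenset(placed), dict(placed))


def pieces_to_remove(rows: int, cols: int, pieces: Dict[str, List[Shape]],
                     placed: Placement) -> Optional[List[str]]:
    """Smallest set of placed pieces whose removal makes the board solvable."""
    for k in range(len(placed) + 1):
        for names in combinations(sorted(placed), k):
            kept = {n: c for n, c in placed.items() if n not in names}
            if solve(rows, cols, pieces, kept) is not None:
                return list(names)
    return None


# Toy example: two L-shaped trominoes on a 2x3 board.
L_SHAPES = [normalise(s) for s in (
    [(0, 0), (0, 1), (1, 0)], [(0, 0), (0, 1), (1, 1)],
    [(0, 0), (1, 0), (1, 1)], [(0, 1), (1, 0), (1, 1)],
)]
PIECES = {"red": L_SHAPES, "blue": L_SHAPES}
dead_end = {"red": frozenset({(0, 0), (0, 1), (1, 1)})}
to_remove = pieces_to_remove(2, 3, PIECES, dead_end)
```

Anchoring every placement at the first empty cell keeps the search from trying the same arrangement in different orders. The removal check tries subsets in order of size, which is fine for a handful of pieces but would need pruning for larger sets.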
The same computer-vision pipeline running on two different puzzles, scanning the board, detecting every piece, and generating live hints, all in the browser.
The first puzzle — wave-shaped pieces on the original board.
The second puzzle — same architecture, a different board and pieces.
This internship was my first experience working on an AI project in a professional environment, and it was very different from anything I had done at school. The project I built will actually be used by children playing Smart NV puzzles, and knowing that my work has a real and lasting impact makes me proud of what I accomplished.
Working alongside senior developers taught me the value of asking the right questions. AI tools can help you code faster, but real experience and guidance from people who have been through similar challenges save you from wasting time on approaches that will not work. On the technical side, I grew significantly in computer vision, model training, data generation, and building full applications from scratch. On the personal side, I learned when to push through a challenge and when to speak up and redirect, a skill just as important as any technical one.
Three documents that tell the full story of the internship: the project plan, the realization thesis, and a personal reflection.
Scope, goals, approach, and planning set out at the start of the internship.
The full thesis detailing what was built and how, with results and analysis.
A personal account of the internship: the challenges I faced, what I learned, and how I grew.