If you haven’t heard of Sudoku puzzles (数独 , sūdoku) you’ve either been sleeping under a rock or been space traveling for quite a while. These 9×9 square puzzles originating from around 1900 became an international hit in 2005. Sudokus appear in newspapers, online and special sudoku puzzle books all-over-the-world. And as if that is not yet enough Sudoku TV shows and all kinds of variants of the puzzle are made. One can solve a sudoku using logic only. Because of this computational algorithms to solve every possible Sudoku must exist. This is part one in the series on such algorithms.
We must speak of multiple algorithms (strategies) because one algorithm would consist of multiple solve strategies. The most simple strategy not included in this article is a brute force or trail and error ‘attack’ on given input. This is not logic and could take quite a while as there are 70759827985602812313600000000 possibilities to a default 9×9 sudoku, not considering variations (in Dutch) of e.g. 16×16 hex or jigsaw sudokus. The strategies discussed by Andrew Stuart are based on the default 9×9 sudoku and some are limited to those puzzles and some block only strategies overlap with row column strategies. What I want to try with this article is programming a more generic solution which implements some strategies making the result cover all strategies. I think this is possible if you imagine a sudoku board not as a set of cells, divided in rows, columns and blocks, but as a set of equal sections which might have one or more cells in common. This way it will also be easy to implement variants simply by extending the model for a sudoku, resulting in e.g. the X-sudoku or hypersudoku and strategies need to be implemented only once.
Sudoku model
First of all, a model is needed which represents every possible sudoku. That model should include the character set (numbers 1 to 9), the cells (81) and the sections (27). Defaults for a 9×9 sudoku are filled out between brackets. If not mentioned otherwise, every example will concern such a default 9×9 sudoku.
In the model, each cell is identified by a number (0-80). This array keeps an array of possibilities for every cell. On initilization, every possibility is set to true, as everything still is possible but while solving the sudoku more and more will toggle to false, finally leaving only one to true.
$possibilities = array( false, //false if unsolved, if solved integer of solution (redundant*) true, true, true, true, true, true, true, true, true //1 - 9 are possible (true) ); for($i = 0; $i < 81; $i++) { $cells[$i] = $possibilities; }
* The first key in the ‘possibilities’ array is redundant: it could be determined by the other keys (if only one is true, it is solved and has that value). But this way, the array keys match the value they represent and it might increase speed by avoiding determining if the cell is solved or not over and over again.
The sections are probably the most important piece of the model as they define how cells relate to each other. There are 27 sections, each containing nPossibilities (9) cells. There are the rows, columns and blocks. Example resp. top row, left column and top left block with the cell identifiers:
$section = array(); $section[] = array(0, 1, 2, 3, 4, 5, 6, 7, 8); //top row //... 8 more ... $section[] = array(0, 9, 18, 27, 36, 45, 54, 63, 72); //left column //... 8 more ... $section[] = array(0, 1, 2, 9, 10, 11, 18, 19, 20); //top left block //... 8 more ...
Also, some default actions (methods) can be added to this model (class). A way to set a cell to a value, one to set a possibility to false and one to count instances of possible locations for a digit in a section. They can call each other to implement recursion:
- setCellValue() calls removeFromCell() for all other cells for the same value
- removeFromCell() calls countPossibleLocations() for that value on every section in which that cell occurs, and if the count equals one
- countPossibleLocations() calls setCellValue() for that value
Solving loop
When you defined the model and filled out some initial digits the algorithm should include a loop searching for the solution. I have included a ‘changed’ property to the sudoku model. This boolean is initially set to true. The first line in the while(sudoku.changed) loop is setting this boolean to false. Next, when iterating over every section trying to find solutions on every change (setting or deleting a number from a cell), this boolean is set to true causing another iteration to occur (a kind of recursion).
Now we have a setup where we can add solve strategies to the loop. In this article, the first strategy will be explained.
Simple elimination strategy
This first strategy to implement in the section iteration is elimination. This is pretty straight forward. As a digit can only occur once in a section (row, column or block), other cells still having this digit set to true can toggle it. This iterative process continues while this resolves possibilities in a single cell to be unique for a section, all other possibles have been set to false.
In Sudoku Logic – part II more strategies will be discussed. Also some sudoku extensions, to simulate sudoku variants and a live demonstration of an implementation of this algorithm in PHP will be included.
#1 by Bart - May 9th, 2009 at 21:58
I’de like to see this one :)