Generating Sudoku Boards pt. 1: Algorithm & Structures

Part my Sudoku Board Generation Series:

I was looking for non-trivial problem as a foil for looking at web assembly, and I decided to write a Sudoku puzzle solver. I ended up writing solutions in Typescript, C++, and Rust. I’ll talk about the differences in implementation and what I learned in future posts. In this post I’ll talk about how I built build a Sudoku puzzle generator, which could easily be converted into a Sudoku puzzle solver

Sudoku has a bunch of different varieties. In the most common a board of 81 cells is divided in to 9 rows, columns and blocks. Each row is 9 cells wide, 1 tall. Each column is 1 cells wide, 9 tall. Each block is 3 cells at a side. The game starts with some of the cells filled in with numbers, and the goal is to fill the remaining cells without repeating a number within a row, column, or block.

One effective strategy is to look at each cell in turn and then try to eliminate possible numbers. The more neighbors (which I define as cells that are in the same row, column, or block) that a cell has filled in the fewer possibility options that cell has. If a cell only has one valid possible move, is can be filled. As cells get filled it’s neighbors get fewer and fewer options, and more cells can be filled.

More difficult puzzles require a bit more work as there aren’t enough cells filled in for this simple method.

Sudoku Board — By Tim Stellmach, CC0

Sudoku Solution — By Tim Stellmach, CC0

My initial plan was to work from an existing set of Sudoku puzzles, and implement a solver with those inputs. So, the first thing I did was try to find a test set of a Sudoku boards that would be easy to digest into a command line app.

When I failed to find such a data-set, I shifted gears and started working on a Sudoku puzzle generator. In this post I’ll be showing some truncated source from the C++ implementation. You can see the source repository here. The Typescript and Rust implementations are slightly different, and I’ll be going into those differences in future

Rough Naive Algorithm

Start at first cell
Pick a value that isn’t present in any neighbors
If at last cell, exit
Other wise go to next cell
Go to step 2

The above ended up being insufficient, but gave me a good starting point.

##Structure

Now that I had a goal in mind. I started laying out the structure of the algorith. I needed to have some notion of the a Sudoku board, it’s constinuent cells, and their relationships.

Coord

struct coord {
    int i = 0;
    int j = 0;
};

// equality comparison between two coors
inline bool operator == (const coord &lhs, const coord &rhs) {
    return lhs.i == rhs.i && lhs.j == rhs.j;
}

// less then comparison between two coords
inline bool operator < (const coord &lhs, const coord &rhs) {
    return lhs.i < rhs.i || (lhs.i == rhs.i && lhs.j < rhs.j);
}

The coord struct defines two members i, and j, to represent to two position dimensions.

After the struct the definitaions of the structs, we can see two functions that define comparison operations between two structs. Basically when using < and == the program looks at the dimensions of the struct instead of just the address.

SudokuCell and it’s neighbors

class SudokuCell {
public:
    coord pos;
    set<coord> neighbors{};
    int value = 0;

    // set position and generate neighbors
    void setPosition(coord pos);

private
    // create the set of coords associated with neighboring cells
    void generateAllNeighbors();
}

The SudokuCell has three members: position, value, neighbors. The position is the coord representing it’s place inside the board. The value of a cell is the number placed inside of it. The neighbors member is only slightly more complicated.

# neighbors
----------------------------------
|          | 7        |          |
|          | 7        |          |
|          | 7        |          |
----------------------------------
| 7  7  7  | 6  7  7  | 7  7  7  |
|          | 7  7  7  |          |
|          | 7  7  7  |          |
----------------------------------
|          | 7        |          |
|          | 7        |          |
|          | 7        |          |
----------------------------------

The neighbors are stored in an std::set. This data structure is nice because it doesn’t allow duplicate elements. sets are implemented as binomial search trees, and they utilize the < and = operators to define position and uniqueness. But why use std::set

In figure neighbors.txt, all of the cells marked with the value 7 are the neighbors of the cell holding the value 6 at position {3,3}. There are four cells (5 when counting {3,3}) that are in both the block and either a row or column. By using a std::set we get uniqueness for free.

SudokuBoard

using cellQueue = queue<shared_ptr<SudokuCell>>;

class SudokuBoard {
    cellQueue cells;

public:
    // initialize board
    SudokuBoard(){/* ... */}
    // fill the board with valid solution
    void fillCells(){/* ... */}

    // take index and return 2d coord
    coord resolveIndex(int index) {
        return coord{
                index / SIZE,
                index % SIZE
        };
    }

    //Take coord and return index
    int resolvePosition(const coord &position) {
        return position.i * SIZE + position.j;
    }

    // get cell by index
    shared_ptr<SudokuCell> at(int index) {/* ... */}
    // get cell by coord
    shared_ptr<SudokuCell> at(coord position) {/* ... */}
};

The SudokuBoard has one member property and multiple member methods.

The cellQueue is structured as a std::queue because the members are create in order when the board is initialized. Because the queue is fifo, taking the first element off until the cellQueue is empty will visit each node in order. Also the SudokuCell items are stored as std::shared_ptr meaning that the reference can be shared, and mutating the value will be reflected on the board.

I highly recommend looking at smart pointers if you haven’t look at C++ for a while. They are a massive improvement.

The fillCells method performs the algorithm above on the board.

There are two utility methods, and two access methods. The resolveCoord and resolvePosition methods convert between the 2D coord and the 1D index that can be used with std::queue. It is possible to use a 2D array, but they are highly inconvenient, and don’t provide the features and interoperability that the STL containers do. I found converting the coordinates to be easier than wrangling a multidimensional structure.

The two polymorphic at methods allow access via coord or usize index.

Now that all the ingredients are together, it’s time to start cooking.

Implementation Details

Any non trivial application has lots of interesting implementation details that come up. In this section I’ll be talking about two that stood out to me

----------------------------------
|          | 7        |          |
|          | 7        |          |
|          | 7        |          |
----------------------------------
| 7  7  7  | 6  7  7  | 7  7  7  |
|          | 7  7  7  |          |
|          | 7  7  7  |          |
----------------------------------
|          | 7        |          |
|          | 7        |          |
|          | 7        |          |
----------------------------------

Recall the position of neighbors relative to the cell. Finding the neighbors in a cells collumn is relatively easy. If the current cell is in position {i, j} simply visit each coordinate {i, n} and {n, j}, where n is in the set of integers between 1 and 9 (0 and 8 when 0-indexed). In the source below, we can see that these two groups are handled in a single loop.

The neighbors found in the same block are only slightly more complicated. We know that the blocks are 3 to a side, but what index should we start at.

The blocks start with every third row and every third collumn. We need get the first index of the block holding the current cell. To do that we need to know what block we are in. To get the index of the block relative to the other blocks, we divide the current cells position components by the block size, and leave off the remainder. Because the indexes are of type usize we can use the / operator. Now we multiply the components of resulting position by 3 to get the coordinates of first cell in the block.

Once we have the position of the first cell, we just have to include all the cells within the next two columns and next two rows.

class SudokuBoard {
    void generateAllNeighbors () {
        // generate row & col neighbors
        for (int n = 0; n < 9; ++n) {
            neighbors.insert({n, pos.j});
            neighbors.insert({pos.i, n});
        }

        auto iFloor = (pos.i / 3) * THIRD;
        auto jFloor = (pos.j / 3) * THIRD;

        // generate cell neighbors
        for (int n = iFloor; n < iFloor + 3; ++n) {
            for (int m = jFloor; m < jFloor + 3; ++m) {
                neighbors.insert({n, m});
            }
        }
    }
}

Available Cell Value Options

Determining what values are available is a two step process. We need to find out what values have been used by the neighbors. Then we need to find the set differnce between the full set of valid options, and the set of values used by the neighbors.

bool fillCells() {
    for(auto cell : this->cells()) {

        // get first cell and tail
        auto cell = remainingCells.front();
        remainingCells.pop_front();

        set<int> neighborValues = {};

        for(auto &neighbor : cell->neighbors) {
            auto value = this->at(neighbor)->value;
            neighborValues.insert(value);
        }

        vector<int> options;
        set_difference(VALID_VALUES.begin(), VALID_VALUES.end(),
                        neighborValues.begin(), neighborValues.end(),
                        inserter(options, options.begin())
        );
        random_shuffle(options.begin(), options.end());
        cell->value = options[0];
    }
}

I like this solution because of the economy of processing involed. Lines 11 through 22 succinctly pack two fairly complex operations into a relatively comprehensible form.

Lines 9 though 14 fills the set of neighborValue using a simplified iterator loop (introduced in C++11). Because neighborValue is an std::std, we don’t have to check for duplicates later!

Lines 16 through 21 show how a random value is selected. The values found in the set of neighbors is subtracted from VALID_VALUES using std::set_difference, which performs a set difference operation on two sets (defined by begin and end iterators) and stores the result in the third parameter (a set wrapped in an std::inserter function). This means an insert operation will be used to add the values to the esult set.

The STL takes a lot of time investment, but it is very flexible, and the but used in a lot of creative ways.

Failure

Dissapointingly, but not unexpectedly, the first interation of the algorithm is a failure. It terminates before reaching the final cell.

By printing out the state of the board after every cell is marked, we are able to see the intermediate state.

Looking at the output, it appears as if the algorithm works for quite a few iterations. The first 5 rows are filled correctly. But, once we reach the second column of the the sixth row we run into trouble. Why?

Looking at the neighbors of the cell at position {5, 1}: We get the values {9, 3, 6, 4, 1} from the column, the value {8} from the row, and the values {5, 4, 2, 6, 1, 7, 8} from the block. This means that the neighbors of this cell have exhausted the VALID_VALUES. There are no values left for the current cell.

So how can we fix the algorithm?

----------------------------------  ----------------------------------
| 7  9  1  | 6  8  4  | 3  2  5  |  | 7  9  1  | 6  8  4  | 3  2  5  |
| 4  3  8  | 5  2  7  | 9  1  6  |  | 4  3  8  | 5  2  7  | 9  1  6  |
| 2  6  5  | 3  1  9  | 8  4  7  |  | 2  6  5  | 3  1  9  | 8  4  7  |
----------------------------------  ----------------------------------
| 5  4  2  | 7  9  3  | 1  6  8  |  | 5  4  2  | 7  9  3  | 1  6  8  |
| 6  1  7  | 2  4  8  | 5        |  | 6  1  7  | 2  4  8  | 5  9     |
|          |          |          |  |          |          |          |
----------------------------------  ----------------------------------
|          |          |          |  |          |          |          |
|          |          |          |  |          |          |          |
|          |          |          |  |          |          |          |
----------------------------------  ----------------------------------

----------------------------------  ----------------------------------
| 7  9  1  | 6  8  4  | 3  2  5  |  | 7  9  1  | 6  8  4  | 3  2  5  |
| 4  3  8  | 5  2  7  | 9  1  6  |  | 4  3  8  | 5  2  7  | 9  1  6  |
| 2  6  5  | 3  1  9  | 8  4  7  |  | 2  6  5  | 3  1  9  | 8  4  7  |
----------------------------------  ----------------------------------
| 5  4  2  | 7  9  3  | 1  6  8  |  | 5  4  2  | 7  9  3  | 1  6  8  |
| 6  1  7  | 2  4  8  | 5  9  3  |  | 6  1  7  | 2  4  8  | 5  9  3  |
|          |          |          |  | 8        |          |          |
----------------------------------  ----------------------------------
|          |          |          |  |          |          |          |
|          |          |          |  |          |          |          |
|          |          |          |  |          |          |          |
----------------------------------  ----------------------------------

[1] 13821 segmentation fault (core dumped)

Backtracking Forward

Adjusting the algorithm

Now we know that the algorithm can work itself into blind alleys. Either we need to know how to avoid those situations, or we need to know how to retrace our steps and escape.

To resolve this situtuation, I decided to utilize backtracking and recursion.

When creating a recursive algorithm, it’s helpful to start by thinking about base cases, what. The base cases used in the this implementation as follows:

Finish Successfully if there are no remaining cells to mark.
Backup if there are no viable options.
Otherwise, continue to the next cell.

Adjusting the Implementation

Bellow are the updates to fillCells that were made to implement the recursive algorithm.

using cellQueue = std::deque<SudokuCell>;

class SudokuBoard {
    /*
     * fill the board with valid solution
     */
    void fillCells() {
        auto remaininCells = cellQueue{cells};

        if(!doFillCells(remaininCells)) {
            cout << "Unable to fill board" << endl;
        }
    }

    bool doFillCells(cellQueue &remainingCells) {
        // get first cell and tail
        auto cell = remainingCells.front();
        remainingCells.pop_front();

        // collected from all the neighbor values
        set<int> neighborValues = {};
        // produced by set difference VALID_VALUES \ neighbord values
        vector<int> options;

        for(auto option : options) {
            cell->value = option;

            // base case (no remaining cells)
            if (remainingCells.empty()) {
                return true;
            }

            // recursive step
            if (doFillCells(remainingCells)){
                return true;
            };
        }

        // base case (push cell back to set, and backtrack)
        cell->value = 0;
        remainingCells.push_front(cell);
        return false;
    }
}

The cellQueue type has been changed to a std::deque in order to support recursion.

doFillCells starting with a full copy of the remaining cells, the first thing done is to pop the first cell off the front of the deque. Then the value is set on that cell (same as before). But then we can see the first difference on line 25. Now we are going loop over and try (potentially) every one of the options.

On line 29 we can see one of the base cases. This will be triggered if we’ve hit the end of the board and were able to fill all the cells. Returning true here, will return true all the way back through the call stack.

On line 34, we can see the recursive step. doFill is called again, but without the current cell. If this step returns false, the loop will go around again. If the recursive step returns true then return true back up the call stack. If there are no more options to try, then the loop exits.

One line 38 the other base case. If there are no options left to try, the current cell is pushed back onto the deque, the value is reset to 0, and doFill returns false.

Lets see it in action.

We can see that at line 42 the algorithm backtracks from position {7,6} to {7,5}. And then on line 57 the algorithm backtracks three spots from {7,6} to {7,4} before continuing.

The final result of this run can be seen bellow.

----------------------------------
| 3  6  9  | 1  2  4  | 5  8  7  |
| 7  2  8  | 6  5  9  | 3  1  4  |
| 1  4  5  | 7  3  8  | 2  6  9  |
----------------------------------
| 2  9  7  | 3  6  1  | 8  4  5  |
| 5  8  3  | 9  4  2  | 6  7  1  |
| 6  1  4  | 5  8  7  | 9  2  3  |
----------------------------------
| 9  7  2  | 8  1  5  | 4  3  6  |
| 4  5  6  | 2  7  3  | 1  9  8  |
| 8  3  1  | 4  9  6  | 7  5  2  |
----------------------------------

Conclusion

So there we have it. We’ve worked from conception to first iteration, to second interation with recursion. We’ve seen how to create a program to fill a sudoku board. Interstingly, this will also solve any Sudoku board. Think about it! All that would change is that initially there will be some non zero neighbor values. The algorithm has no knowledge of the position of neighbor values. So, it can handle any Sudoku board without modification.

Future Topics

Implementation Comparison: As mentioned before, I have implemented this program in Rust, C++, and Typescript. In a future post I want to answer a couple questions. How the implementations differ? What parts felt natural and unatural in each language? And after figuring out a web assembly build, how do the build pipelines and performance compare?

Complexity and optimization: You may have guess this already, it’s not necessary for the algorithm to visit all the neighbor cells will filling. I’ll be talking about how this optimization affects the complexity of the algorithm.

Creating games with unique solutions: An interesting problem would be to extend this to create sudoku boards with unique solutions. That is, how can we remove cell values without changing the possible values that other cells can take on?