Sudoku doesn't fit this scenario. Its combinatorial complexity is far too high for a neural network, even if you add many layers; it is a different kind of problem altogether. You simply can't "regress" the right values of a solved sudoku here, because the digits are not quantities like pixel intensities in an image. They are interchangeable labels bound by hard constraints, so a prediction that is numerically "close" is still simply wrong.
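
To make the "can't regress" point concrete, here is a minimal sketch in Python. The names `is_valid_solution`, `solved`, and `near_miss` are my own, made up for illustration, and the solved grid is built from a simple shifting pattern rather than taken from a real puzzle. It shows that a grid that is off by 1 in a single cell, which a regression loss would score as nearly perfect, is not a valid solution at all.

```python
def is_valid_solution(grid):
    """Return True if every row, column, and 3x3 box holds the digits 1..9 exactly once."""
    target = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(col) for col in zip(*grid)]
    boxes = [
        {grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(group == target for group in rows + cols + boxes)


# A valid solved grid built from a shifting pattern (illustrative, not a real puzzle).
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]

# Copy it and nudge one cell by 1, the kind of small error a regressor would make.
near_miss = [row[:] for row in solved]
near_miss[0][0] = near_miss[0][0] % 9 + 1

# Total absolute error over all 81 cells, as a regression loss would see it.
abs_error = sum(
    abs(a - b)
    for row_a, row_b in zip(solved, near_miss)
    for a, b in zip(row_a, row_b)
)

print(is_valid_solution(solved))     # the original grid satisfies every constraint
print(abs_error)                     # numeric error of the perturbed grid
print(is_valid_solution(near_miss))  # yet the perturbed grid breaks the constraints
```

With pixel intensities, an error that small would be invisible. Here it breaks a row, a column, and a box at once, which is why the digits behave like categories under constraints rather than like values you can approximate.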