Looking forward into the past
- Published on
- ∘ 34 min read ∘ ––– views
Previous Article
Intro
Conway's Game of Life is perhaps the most famous instance of a system of deterministic cellular automata. Defined by a few simple rules, the Game of Life gives rise to a depth of beautiful interactions.
A typical Game of Life interaction –for the uninitiated who don't look at binary grids in terms of potential glider factories– involves setting a (sometimes) intentionally designed initial configuration and then watching it grow and decay into the residual, if any, periodic blobs to sate the question "I wonder what this will turn into."
In this post, I'm interested in answering the reverse question: what did this originate from. This question proved to be surprisingly difficult.
Cellular Automata: Conway's Game of Life
The game is deterministic in the forwards direction, meaning you and I can start with two copies of the same board state and independently compute the next identical board states that follow.
The beauty of Conway's Game of Life lies in the simplicity of the rules, behind which lie intricate and powerful possibilities such as games of Tetris,1 Turing machines,2 simulating the Game of Life itself3 etc.
You can tinker around with it to get a feel for how it behaves here, or continue reading for a rote summary.
The rules are as follows:
- Sustain: [living] cells with 2-3 neighbors survive
- Succumb: cells with < 2 neighbors die out
- Starve: cells with > 3 neighbors die out
- Spawn: dead cells4 with exactly 3 neighbors die out
which can be simplified to just two rules:
- In order to remain living, a cell must have 2-3 living neighbors
- In order to be born, a dead cell must have exactly 3 living neighbors
We can observe these transition rules with some well-known CGoL constructs such as gliders:
This nifty lil shape is self-repeating on a 4-cycle which also shifts itself down and to the right by one unit. There are many such self-replicating configurations, many of which are far more complex, such as constructs which produce gliders5, and constructs which produces constructs which produce gliders6, etc.
so it's like sudoku?
Since the game is deterministic, every configuration has a singular successor state, but the transition rules are not bijective. In other words, multiple states might all produce the same successive state .
but if we wanted to go the opposite direction taking a step backward in discrete time , the problem is no longer deterministic since any number of previous configurations could have produced configuration not just making the problem probabilistic, but insanely dense:
This is a byproduct of the size of the problem space. We can visually enumerate all possible board states for various sizes of boards (or any system of binary variables for that matter) as graphs where vertices represent each cell's state.
For a 1x1 board, the "graph" is just a line connecting the unit on/off states:
For a 2x1 board (a system with two variables), we add another dimension and our line becomes a square:
A 3x1 board can be depicted as a cube:
We can repeat this pattern, adding a new dimension for each of the variables we wish to model, requiring vertices to enumerate all possible board states. The graph representation of even just a 4-variate system is a tesseract. Organizing the vertices in this manner helps illustrate their distance from one another as the number of edges separating any two vertices. Note that the delta between any two vertices connected by an edge is just one variable.
Can you guess what the graph of a system of five variables looks like?
And since finding a valid previous configuration in this search space might entail flipping the values of several of the variables, the relative distance between two states of Conway's Game of Life according to the reversed transition rules might be prohibitively large. Furthermore probing this search space is an all or nothing gambit: we get no indication about whether or not the flips we might make form a valid configuration until a solution is achieved, kind of like sudoku.
Okay, so it's like multidimensional sudoku
For dense and large hyper-problem space like this, we might take a random step to see if that brings us closer to a desired/valid target state and repeat exploration by just evaluating the "fitness" of a fraction of the total possible states sparing us several FLOP-years of computation. However, using a technique like gradient ascent fails for this search space because, while we might be able to arbitrarily flip some grid cells to optimize for a fitness score along the lines of where is the path connecting them on a system-graph, we can easily get stuck in local maxima without ever being able to traverse an immeasurably/unknowably large valley of low-fitness scores between an arbitrary starting state and any of the valid previous board states.
The analogy of sudoku is low-fidelity as well since that assumes that we can even know the correct path to take between states. In fact, the two-dimensional (or any -dimensional) projection of the configuration hyperspace for the Game of Life has so many local maxima that it's inordinately difficult/computationally prohibitive to explore the troughs separating maxima with an -greedy optimizations that don't regress to random walks.
While there are some mechanisms to eliminate swaths of problem space if our algorithm were to summit a local maxima that did not match the criteria of a target solution, these techniques are also highly susceptible to erroneously nuking the narrow paths between maxima in the process.
Okay so it's like multidimensional sudoku, but actually a lot harder
The problem is hard. NP-hard in fact, which though at first seems daunting, actually represents a glimmer of hope.
NP-Hardness
As a brief review of vs. (see also the lengthier corollary sections of this post), -hard problems are those that are upper-bounded by an exponential function (rather than a polynomial function).
If we compare the graphs of two functions on :
we can observe that, for some constants , the polynomial is larger than the exponential , but after some point in time (or for our case, some sufficiently large number of inputs), effectively becomes a vertical wall while 's slope climbs much more steadily.
The glimmer of hope arises out of the fact that all -complete problems are reducible to any other problem in in a polynomial amount of time. In other words, after a certain number of inputs, the polynomial cost of translating our problem into a known problem which has a well studied solution we can appropriate well become negligible.
And while we're still haunted by the specter of early-onset infinity in the limit of the exponential class of problems, we can do well in minimizing the constants parameterizing the exponential which can be interpreted as:
- : setup/translation time
- : time to verify a result
- : search efficiency
"but Peter, you said and are constant terms, but the definition of -completeness says a solution can be verified in polynomial time, so which is it?"
Choosing the right problem
When broaching a new topic such as this, it is useful to refer to Sacred Scripture for insight. That's right The Art of Computer Programming8 by the don himself:
The canonical NP-complete problem is boolean satisfiability which seeks to find a set of values for a given expression in propositional logic restricted to zeroth-order binary operators ( which yield ~all of Boolean algebra) which satisfies a set of constraints.
For example, consider a dinner party with twelve guests through that need to be distributed between two 6-tops, together with the following requirements:
Amalfitano is Borges' plus one, so they must be sat together. Similarly, Joyce and Kafka both selected each other for their responses to the icebreaker questionnaire about "who they'd want to sit next to at a dinner party." Erdős and Faulkner have to sit next to each other because it would be funny. Distributing the guests alphabetically around the two tables meets these requirements:
Golding and Hofstatder are Erdős and Faulkner's students who will cause a fuss if left to their own devices, so they have to sit at the same table as their patrons. Simple enough, we can just swap Clarke with Hofstatder, and Golding with Dante:
Italo has to sit at the first table due to a food allergy, and either Dante or Erdős has to accompany him at that table since Italo didn't wear pants with pockets, so they're each holding one of his epipens. Again we can make some swaps to accommodate his non-functional wardrobe decisions:
Dante can't sit at the same table as Borges because they'll curse at each other in Portuguese (and it's not the cussing that's so much the problem as the fact that it would exclude everyone else at the dinner party), so we have to move one of them, how about Dante:
And lastly, Dante, Erdős, and Italo can't all sit at the same table because of their clashing views on the Balkans which –no matter how unbelievable– would actually be harder to manage than an -hard problem:
So, we bend over backwards swapping Borges with Hofstatder, Golding with Joyce, Kafka with Erdős, and Amalfitano with Faulkner:
but in doing so, we've placed Dante within a cuspir distance Borges again. Better fix that before either of them pens a new circle of hell for the other:
And this gives us our final configuration:
Everyone happy?!
Solving problems like this by hand with more than a dozen variables becomes incredibly tedious (especially sans anthropomorphic variable mnemonics). Hell, it's tedious even with a computer, but luckily there are libraries9 which expedite the process of guess-and-checking. If we can find a polynomial time translation of our problem of reverse-stepping through the Game of Life into an instance of the Boolean Satisfiability problem, we can leverage such a library to find solution(s) for us.
Code
We'll start with a simple class modeling a grid system, a pretty representation of it, and some QoL helpers:
import numpy as np
class Board():
def __init__(self, width: int, height: int):
self.width, self.height = width, height
self._board = np.zeros((width, height))
self.t = 0
def __getitem__(self, x):
return self._board[x]
def __repr__(self):
padding = max(len(str(len(self._board))), 2)
result = "".ljust(padding + 1)
for i in range(len(self._board)):
result += f"{i}".center(padding)
result += f"\tt: {self.t}\n"
for x in range(self.width):
result += f"{x}".rjust(padding) + " "
for y in range(self.height):
if self[x][y] == 1: result += "■".center(padding)
else: result += "▢".center(padding)
result += "\n"
result += "\n"
return result
if __name__ == "__main__":
b = Board(3, 3)
print(b)
This produces a nice game state visualization for boards of various sizes:
$ python seagull.py
0 1 2 t: 0
0 ▢ ▢ ▢
1 ▢ ▢ ▢
2 ▢ ▢ ▢
and also works for huge boards:
We can add some static class methods to allow us to construct Board
s from existing np
arrays:
def from_array(arr):
w, h = arr.shape
b = Board(w, h)
b._board = np.array([row[:] for row in arr], dtype=int)
return b
def make_glider(n):
if n < 3: raise Exception("n too small to make a glider")
b = np.zeros(n * n, dtype=int).reshape((n, n))
b[1][0] = 1
b[2][1] = 1
b[0][2] = 1
b[1][2] = 1
b[2][2] = 1
return Board.from_array(b)
which produces some convenient starting states for us to work from:
$ python seagull.py
0 1 2 3 t: 0
0 ▢ ▢ ■ ▢
1 ■ ▢ ■ ▢
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
Implementing a forward state via the transition rules is simple enough as well:
def count_neighbors(self, x, y):
"""counts the number of living cells surrounding the cell at `x`, `y`"""
total = 0
for i in [-1, 0, 1]:
for j in [-1, 0, 1]:
if (x + i <= self.width and x + i >= 0) and \
(y + j <= self.height and y + j >= 0):
total += self[x + i][y + j]
return total - self[x][y]
def compute_next(self):
"""computes the next state for the current board"""
nxt = np.array(self._board, dtype=int)
for x in range(self.width):
for y in range(self.height):
state = self[x][y]
neighbors = self.count_neighbors(x, y)
if state == 0 and neighbors == 3:
nxt[x][y] = 1
elif state == 1 and (neighbors < 2 or neighbors > 3):
nxt[x][y] = 0
else:
nxt[x][y] = state
def step(self):
"""takes one step forwards"""
self._board = self.compute_next()
self.t += 1
def play(self, n):
"""takes `n` steps forwards"""
print(self)
for _ in range(n):
self.step()
print(self)
Now the hard part, taking steps backwards...
Z3
Let's quickly take a detour to crash course our way through Z3's python API. I'll re-use the above dinner party example problem to illustrate how we can use symbolic logic propositions to model the problem and then find a solution.
from z3 import Bools, Or, Not, And, Solver, PbEq
"""
A = B
J = K
E = F
G = F, G = E
H = F, H = E
I = True
D or E
not (D = B)
not (D and E and I)
(implicitly): A + B + C + D + E + F + G + H + I + J + K + L = 6 since there are only 6 seats at each table
"""
variables = Bools("A B C D E F G H I J K L")
A, B, C, D, E, F, G, H, I, J, K, L = variables
solver = Solver()
solver.add(A == B)
solver.add(J == K)
solver.add(E == F)
solver.add(G == F)
solver.add(G == E)
solver.add(H == F)
solver.add(H == E)
solver.add(I == True)
solver.add(Or(D, E))
solver.add(Not(D == B))
solver.add(Not(And(D, B, I))
solver.add(PbEq([(v, 1) for v in variables], 6))
print(solver.check(), solver.model())
This produces the following output where the truthiness associated with any individual corresponds to which table they can be seated at:
$ python seating.py
sat [C = False,
L = False,
K = False,
D = True,
G = True,
B = False,
J = False,
A = False,
E = True,
I = True,
H = True,
F = True]
But wait, this doesn't look like our solution from earlier, what's up with that? It's entirely possible (and indeed preferable) that Z3 might generate a different solution based on some seeded initial guesses it makes. This is desirable for us since, ideally, I'd like to compute a backwards step for a glider, but don't want one of the dumb trivial solutions.
We need a way to generate all possible solutions to a problem, and the straightforward way to do this is by adding the current solution as a constraint to be passed into the solver, thus preventing it from deeming the current solution as valid any more.
My initial naive implementation simply added every variable in the solution's model as another constraint and repeating till the solver failed to find a solution, but as Bjørner points out, this is inefficient since doing so discards several intermediate, internal lemmas which may still be useful for computing other solutions. In the book Programming Z3 which he coauthored, published by Microsoft Research (responsible for Z3),10 we can do better:
def all_smt(s, initial_terms):
def block_term(s, m, t):
s.add(t != m.eval(t))
def fix_term(s, m, t):
s.add(t == m.eval(t))
def all_smt_rec(terms):
if sat == s.check():
m = s.model()
yield m
for i in range(len(terms)):
s.push()
block_term(s, m, terms[i])
for j in range(i):
fix_term(s, m, terms[j])
yield from all_smt_rec(terms[i:])
s.pop()
yield from all_smt_rec(list(initial_terms))
We can also tidy up our evaluation to partition the two solutions by table, and omitting all such intermediary variables about which we don't necessarily care:
for i, sol in enumerate(list(all_smt(solver, variables))):
print("solution ", i)
print("\ttable 1: ", sorted(list(filter(lambda v: sol[v] == True, sol)), key=lambda c: str(c)))
print("\ttable 2: ", sorted(list(filter(lambda v: sol[v] == False, sol)), key=lambda c: str(c)))
and then observe that the second solution z3 generates matches the one in my diagram:
$ python seating.py
solution 0
table 1: [D, E, F, G, H, I]
table 2: [A, B, C, J, K, L]
solution 1
table 1: [C, D, I, J, K, L]
table 2: [A, B, E, F, G, H]
Using Z3 to time travel
Returning to our Game of Life reverse-solver, we can now implement the spicy method compute_previous
:
def compute_previous(self):
# an array of apt dimension containing symbolic variables for Z3 to compute a valid, previous board state
target = np.array([[Int(f"t_{r}_{c}") for r in range(self.width)] for c in range(self.height)])
# an "intermediate" array of similar dimension to contain the number of neighbors each cell is allowed to have per the transition rules
neighbors = np.array([[Int(f"n_{r}_{c}") for r in range(self.width)] for c in range(self.height)])
solver = Solver()
for x in range(self.width):
for y in range(self.height):
curr_cell = self[x][y]
target_cell = target[x][y]
num_nieghbors = neighbors[x][y]
# every cell can only be a 0 or a 1
solver.add(Or(target_cell == 1, target_cell == 0))
# compute allowable living nieghbors
neighbors_allowed = -target_cell # we don't want to count the current cell as a neighbor
for i in [-1, 0, 1]:
for j in [-1, 0, 1]:
if (x + i <= self.width-1 and x + i >= 0) and \
(y + j <= self.height-1 and y + j >= 0):
neighbors_allowed += target[x + i][y + j]
solver.add(neighbors[x][y] == neighbors_allowed)
# encode the transition rules as SL statements
if curr_cell == 1:
alive_rule = Or(num_nieghbors == 3, And(target_cell == 1, num_nieghbors == 2))
solver.add(alive_rule)
else:
dead_rule = Or(
And(target_cell == 0, Not(num_nieghbors == 3)),
And(target_cell == 1, Or(num_nieghbors < 2, num_nieghbors > 3))
)
solver.add(dead_rule)
# naively return the first satisfiable solution
if solver.check() == sat:
target_model = sorted(list(filter(lambda v: "t_" in str(v), sol)), key=lambda v: str(v))
target_vals = [sol[v].as_long() for v in target_model]
result = np.reshape(np.array(target_vals), (self.width, self.height))
print(result)
return result
For our first pass, we'll just return the first valid solution that Z3 generates. But we can see it shits out something that doesn't look particularly glider-y:
δ python seagull.py
taking 1 steps forward from t=0
0 1 2 3 t: 0
0 ▢ ▢ ■ ▢
1 ■ ▢ ■ ▢
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
0 1 2 3 t: 1
0 ▢ ■ ▢ ▢
1 ▢ ▢ ■ ■
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
taking 1 steps backwards from t=1
0 1 2 3 t: 1
0 ▢ ■ ▢ ▢
1 ▢ ▢ ■ ■
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
0 1 2 3 t: 0
0 ■ ■ ▢ ▢
1 ▢ ▢ ▢ ■
2 ■ ▢ ■ ▢
3 ▢ ■ ▢ ▢
Ideally, we'd like to filter out that solution. To do so, we can throw that hoe into the all_smt
block and print all possible solutions:
def __all_previous(self, s, initial_terms):
def block_term(s, m, t):
s.add(t != m.eval(t))
def fix_term(s, m, t):
s.add(t == m.eval(t))
def all_smt_rec(terms):
if sat == s.check():
m = s.model()
yield m
for i in range(len(terms)):
s.push()
block_term(s, m, terms[i])
for j in range(i):
fix_term(s, m, terms[j])
yield from all_smt_rec(terms[i:])
s.pop()
yield from all_smt_rec(list(initial_terms))
and then back at the bottom of compute_previous
, we'll iterate over all possible solutions. But we still need to choose one. A heuristic I selected which works for this goal is to select the candidate which has the minimum number of living cells:
def get_best(candidates):
best = None
best_sum = np.Inf
for c in candidates:
curr_sum = np.sum(c)
if best is None or curr_sum < best_sum:
best = c
best_sum = curr_sum
return best
initial_terms = target.flatten().tolist() + neighbors.flatten().tolist()
models = list(self.__all_previous(solver, initial_terms))
candidates = [self.model_to_board(b) for b in models]
best = get_best(candidates)
return best
and now we can step forwards and backwards through time to our heart's content:
if __name__ == "__main__":
b = Board.make_glider(4)
b.play(1)
b.rewind(1)
b.play(1)
b.rewind(1)
produces:
$ python seagull.py
taking 1 steps forward from t=0
0 1 2 3 t: 0
0 ▢ ▢ ■ ▢
1 ■ ▢ ■ ▢
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
0 1 2 3 t: 1
0 ▢ ■ ▢ ▢
1 ▢ ▢ ■ ■
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
taking 1 steps backwards from t=1
0 1 2 3 t: 1
0 ▢ ■ ▢ ▢
1 ▢ ▢ ■ ■
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
0 1 2 3 t: 0
0 ▢ ▢ ■ ▢
1 ■ ▢ ■ ▢
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
taking 1 steps forward from t=0
0 1 2 3 t: 0
0 ▢ ▢ ■ ▢
1 ■ ▢ ■ ▢
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
0 1 2 3 t: 1
0 ▢ ■ ▢ ▢
1 ▢ ▢ ■ ■
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
taking 1 steps backwards from t=1
0 1 2 3 t: 1
0 ▢ ■ ▢ ▢
1 ▢ ▢ ■ ■
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
0 1 2 3 t: 0
0 ▢ ▢ ■ ▢
1 ■ ▢ ■ ▢
2 ▢ ■ ■ ▢
3 ▢ ▢ ▢ ▢
For larger boards, we might want to save each verboten board state so we don't have to compute them on the fly. For the sake of saving space, we can also minify each board state into a single number whose binary representation expands to the desired board. We can then save and reference these during the "ban" phase of compute_previous
:
def get_id(board: np.ndarray):
return str(board.flatten().dot(2 ** np.arange(board.size)[::-1]))
def decode_id(id: str, w: int, h: int):
b_id = int(id)
leading_zeros = w * h
bin_str = format(b_id, f"0{leading_zeros}b")
as_arr = np.array([int(bit) for bit in bin_str]).reshape((w, h))
return as_arr
def save_states(boards, t:int, path="forbidden_states/", debug=True):
if debug:
for board in boards:
w, h = board.shape
b_id = Board.get_id(board)
fname = f"t{t}_({w}x{h})_{b_id}.npy"
np.savetxt(f"{path}{fname}", board, fmt="%s")
else:
w, h = boards[0].shape
fname = f"t{t}_({w}x{h})_all.txt"
with open(f"{path}{fname}", "a") as f:
for board in boards:
b_id = Board.get_id(board)
f.write(f"{b_id}\n")
def load_states(self, path="forbidden_states/"):
w, h = self._board.shape
boards = [i for i in range(10)]
for fname in os.listdir(path):
if f"({w}x{h})" in fname and "_all" in fname:
t = int(fname[1:2])
boards[t] = []
with open(f"{path}{fname}") as f:
lines = f.readlines()
for line in lines:
if line.strip() != "":
boards[t].append(Board.decode_id(line, w, h))
return boards
And we can drop these into the constuctor and compute_previous
methods respectively:
def __init__(self, width: int, height: int):
self.width, self.height = width, height
self._board = np.zeros((width, height), dtype=int)
self.t = 0
self.forbidden = self.load_states()
def compute_previous(self):
# ...
initial_terms = target.flatten().tolist() + neighbors.flatten().tolist()
models = list(self.__all_previous(solver, initial_terms))
candidates = [self.model_to_board(b) for b in models]
def get_best(candidates):
best = None
best_sum = np.Inf
for c in candidates:
# print("curr_sum: ",curr_sum)
curr_sum = np.sum(c)
if best is None or curr_sum < best_sum:
best = c
best_sum = curr_sum
return best
# add the rejected candidates to the blacklist
best = get_best(candidates)
best_id = Board.get_id(best)
rest = [c for c in candidates if Board.get_id(c) != best_id]
Board.save_states(rest, self.t-1, debug=False)
return best
naturally, we might want different previous states than those selected by the simplistic get_best
heuristic. Alternate iterations of my implementation involved displaying each candidate and having the user select which one they wanted, but that too became tedious for larger boards. The logical next step would be to build out a DP solution which backtracks through candidates according to the maximum number of non-cyclic previous states (see also Eden states11) to see how far backwards in time one could go, but I'm satisfied with this solution and also too depressed to keep throwing brain power at this.
source code here: https://github.com/MurphyPone/game-of-life-reverse
Footnotes
Footnotes
https://codegolf.stackexchange.com/questions/11880/build-a-working-game-of-tetris-in-conways-game-of-life ↩
- ↩
Dead cells mentioned
https://www.offconvex.org/2018/11/07/optimization-beyond-landscape/ ↩
Knuth E., Donald. "The Art of Computer Programming Volume 4B: Combinatorial Algorithms Part 2." Addison-Wesley, 2015. ↩
I'll use Z3, an SMT solver ↩
Bjørner, Nikolaj et al. "Programming Z3." Microsoft Research, 2019. ↩
https://en.wikipedia.org/wiki/Garden_of_Eden_(cellular_automaton) ↩