Continuity

Consider an N × N N \times N grid, initially completely unpopulated. We can mark squares in this grid in at least 2 ( N N ) 2^{(N \cdot N)} ways, such as for a 2 × 2 2 \times 2 grid:

Of these 16 16 grids, precisely 13 13 are continuous . That is, there exists a path, using edge adjacency (no corners/diagonals) from every black square to every other black square, and there is at least one black square.

Of all 3 × 3 3 \times 3 grids, 218 218 are continuous.


How many 10 × 10 10 \times 10 grids are continuous?


The answer is 284374318545830329487309785.

This section requires Javascript.
You are seeing this because something didn't load right. We suggest you, (a) try refreshing the page, (b) enabling javascript if it is disabled on your browser and, finally, (c) loading the non-javascript version of this page . We're sorry about the hassle.

1 solution

Daniel Ploch
Aug 24, 2014

This is an exceptionally difficult Dynamic Programming problem. Brute force works well enough for N 5 N \leq 5 , but immediately becomes intractable afterwards. What's worse, searching on OEIS only gives the answer up to N = 7 N = 7 , and links to no papers or formulas! How do you count continuous grids?

One such solution is to go column by column, multiplying results. Consider, for example, the graph below:

Imgur Imgur

Suppose we only know the number of ways there are to fill the blue squares, such that the middle column appears as it is. We only need that knowledge of the middle column's appearance, in this case, to compute how many ways there are to fill the green squares - then, we can just multiply that result by the number of ways there are to fill the blue squares.

This works in any case where the middle column is just one continuous segment of black squares. If we face a situation like the one below, we need more information:

Imgur Imgur

We need to know how the segments are connected, if at all, as that will govern how we can fill out the rest of the grid. It turns out this information, the connectivity of the segments, is really easy to normalize: Consider all the connected components in the blue squares, and list them in order of height (that is, the highest y-coordinate in the middle column that they touch). Using that order, assign an identifier to each connected component, and then simply label each segment in the middle column by that identifier.

This connectivity information is all we need to keep advancing. We can consider, for any possible 'middle' column, the set of all 2 N 2^N possible 'next' columns, and for each one, compute the new connectivity state. There are, approximately, 2 1.5 N 2^{1.5N} connectivity states, (identifiers + middle-column state), so it would take about 2 2.5 N 2^{2.5N} operations to compute the number of ways there are to transition to any connectivity state, from any prior connectivity state. For N = 10 N = 10 , this is on the order of 2 25 2^{25} , which is doable.

The empty case - when the middle column is empty - is a little ugly. We could add another bit to our connectivity state, to keep track of whether we've "already built" the blob or not. Or, we could add a little bit of complexity to our counting routine at the end instead.

Let C C be the set of all possible connectivity states. Let M C , C M_{C,C} be the matrix, which defines the number of ways there are to transition from any connectivity state c 1 c_1 to c 2 c_2 , storing 0 0 if there are none. We can compute the total number of continuous N × N N \times N grids in the following pseudo-code manner:

# We don't let the empty state count here
map<C, num> init = {possible_first_C_states} -> 1 # 2^N - 1
sum = sumValidEndingCStates(init) * N
map<C, num> last_map = init
for i in 2..N:
  map<C, num> next_map = new map(default = 0)
  for c1 in last_map.keys:
    for c2 in M.row(c1):
      next_map[c2] += last_map[c1] * M.val(c1, c2)
  sum += sumValidEndingCStates(next_map) * (N + 1 - i)
  last_map = next_map

print sum

The 'sumValidEndingCStates' routine merely sums the values in the given map, for where the keys yield only one identifier in the connectivity state. (Meaning that we could stop adding columns then, and what we've built so far would be fully connected).

The full algorithm runs in time O ( 2 2.5 N + N M C , C ) O(2^{2.5N} + N \cdot \left| M_{C,C} \right| ) . It turns out that M C , C M_{C,C} is, predictably, a relatively sparse matrix, with much fewer than 2 2.5 N 2^{2.5N} elements, and so the major runtime of the algorithm is approximately O ( 2 2.5 N ) O(2^{2.5N}) .

The above routine can be examined in the main method of the source code . It can be verified that the output of the code does in fact match the OEIS sequence up through N = 7 N = 7 , as well.


And running it with N = 10 N = 10 yields the enormous result:

284374318545830329487309785 \boxed{284374318545830329487309785}

Well said, Daniel.

I'm guessing that your program was able to run, for N=10, in a fraction of a minute?

Peter Byers - 6 years, 9 months ago

Log in to reply

UPDATE: Did clock it - takes about 28s consistently.

Stats:

  • Default JVM Settings (Java 8)
  • Eclipse Kepler SR2 Runtime
  • Intel i7-3770 (4 cores @3.40GHz)

I didn't clock it, but yes, it ran somewhere in the 10-50 seconds range. There is some room for optimization in a few of the routines to make it faster as well (not to mention that I didn't parallelize anything, when there's clear room for such, and so I took no advantage of my 4 cores either).

This is definitely the longest-running solution I have to a CS problem I've posted; everything else can be done in mere milliseconds. I try to keep below Project Euler's standard 1 minute limit for everything at a minimum.

Daniel Ploch - 6 years, 9 months ago

Log in to reply

A speculation ...

As you pointed out, that website stopped at N=7. If they didn't have a sophisticated program like the one you wrote, another possibility is that they used a modified-brute-force method. What I mean is, there are only about 5 trillion solutions for N=7, out of about 500 trillion grids to check. Thus, compared with simple brute force, run time can be cut down significantly just by checking along the way for "dead ends". (For example, starting a grid with

XO

OX

OX

Etc.

is a dead end since that upper left X will be isolated regardless of what columns are placed to the right.)

Of course, I'm just speculating. It is also possible that they did have a sophisticated program like the one you wrote, and just decided to stop reporting at N=7 for some reason.

Peter Byers - 6 years, 9 months ago

0 pending reports

×

Problem Loading...

Note Loading...

Set Loading...