The Lost Woods - Evil Trees

Computer Science Level 5

The darkness of the woods continue, though nothing new or dangerous confronts the hero in these narrow, more compact realms of the wood. Full of magic and confident in his planning skills, Link now decides to tackle two challenges he's already faced at once: Collect all the Heart Pieces in the wood using as little magic as possible to remove Cuttable Trees. He quickly finds that, despite the apparent simplicity of the task, it's much harder than he first realized.

This link (1) contains a zip file, with a single file in the Input Format specified for this collection of problems, of $1000$ mazes.

Let $S$ be the sum of the score for each maze. We define the score to be the index of the maze (1-based) multiplied by the minimum number of Trees Link must cut if he wishes to obtain all of the Heart Pieces in the maze. There will be at least one Heart Piece, and at most three Heart Pieces in each maze, and all of them will be accessible. Note, however, that when Link cuts down a Tree, it stays cut down - he need not expend extra magic to pass through the same Tree in the same maze twice.

Find $\boxed{S}$ .

Example:

Input:

3
5 5
TT.TH
.#T#T
.T.#.
T#.#T
LT.T.
5 3
H.T.H
T###T
T.L.T
9 9
####H####
#.T...T.#
#T##T##T#
#.##.##.#
L.T.T.T.H
#.##.##.#
#T##T##T#
#.T...T.#
####H####

Output:

Output Explained:

In the first maze, there is only one Heart Piece, so we need only find the shortest path (in terms of number of trees) between Link and the Heart Piece. The best route is to go right, then all the way up, then right again, hitting only $3$ trees, so the score for the maze is $1$ (the index) times $3$ .

In the second maze, Link can cut two trees to his left, and two trees to his right, to retrieve each of two Heart Pieces, but only one Tree separates the Heart Pieces them selves, so it is better to take that route after fetching the first Heart Piece. The score is thus $2$ times $3$ .

In the third maze, each of three Heart Pieces, and Link, are separated by two trees in a clockwise fashion, so Link could go around the perimeter, cutting down $6$ trees, to collect all of them. However, in the center of the maze, there is a configuration of $5$ trees that separates everything. While it takes more magic to go from Link to any one Heart Piece through the middle than through the sides, it is more efficient to clear out the center trees, since he can then reach all of the Heart Pieces using only $5$ cuts, instead of $6$ .

$1 \times 3 + 2 \times 3 + 3 \times 5 = 24$

Details:

Link will always appear once per maze, as an 'L'.
There will be at least one, and at most three Heart Pieces per maze, as 'H'.
Open ground is given as '.'.
Impenetrable trees are given as '#'.
Cuttable trees are given as 'T'.
No other characters will be present in each maze.
Smaller mazes this time: $1 \leq W,H \leq 10$ in all mazes.

(1) - This link currently points to my Dropbox account, since there is currently no way to upload non-image content to Brilliant. Admin, feel free to delete this footnote if this is changed.

This is a part of the collection entitled The Lost Woods , a series of problems in CS / Graph Theory.

Image Credit: http://www.sodahead.com/fun/do-you-like-to-walk-in-the-woods/question-2168027/

The answer is 5843193.

1 solution

Daniel Ploch
Jan 17, 2016

As the title, the narrative, and (if you got here from the note) the problem note warns, this is a very difficult problem, and deceptively so. It's not a shortest path problem , since there could be more than one destination, and it's not a Travelling Salesman Problem either, because our maze, unlike most graphs, has state - once Trees are cut, they stay cut, so walking through the same squares twice need not yield the same cost each time. With a little thinking, and research, we can come to identify this particular challenge as a Minimum Spanning Tree problem, or more specifically, the Steiner Tree Problem .

Trees are tricky to model as part of a graph. If each Tree is adjacent to precisely two passable nodes, then reducing the representation of the Tree to an edge with weight $1$ gives the desired effect, but of course this is no guarantee, as the examples show. What we can do, though, is reduce the graph to a set of connected components , with some representing nodes, and others representing edges, to form a hypergraph representation of the maze. While this might seem a little unnecessary, the analysis that follows makes simpler solutions seem quite unlikely: If you have one, share it!

The overall approach we'll take to solve this problem is:

Break the maze down into connected components of interest.
Assemble the components into a hypergraph.
Compute the weights of the edges of the hypergraph.
Compute the MST of the hypergraph.

Assembling the Hypergraph

First, we identify connected components to reduce the problem to a graph we can properly work with. Since the goal is to make all of the Heart Pieces accessible, we'll start with identifying these nodes - Link, and the Heart Pieces - and grouping them as connected components where open ground can connect them. In this manner, the cost of wandering to any node in these connected components (if already in them) is always zero.

Now, we remove all of the nodes in the components we've identified thus far from consideration, and group the rest of the passable nodes - that is, open ground, and cuttable Trees - into connected components. We'll refer to this second set of connected components as 'edge components', and note that the ones of interest are precisely those which touch at least two primary connected components. There is no reason for Link to ever enter any other edge component.

Once we have the primary and secondary components, we have the data we need to assemble the hypergraph structure (but we don't have the edge weights yet). Every edge component represents an edge between all of the primary components it touches. Note that, at this step, we are also dealing with a multigraph as well as a hypergraph, since not only may edges touch more than two nodes, there may be multiple edges for certain sets of nodes. For clarity, consider the example graph below:

9 9
..T.H.T..
..#...#..
T#T...T#T
...#T#...
L..T.T..H
...#T#...
T#T...T#T
..#...#..
..T.H.T..

On our first pass, we identify the connected components with Link and the Heart Pieces. Those components we'll call 1, 2, 3, and 4, and they look like this:

9 9
..T222T..
..#222#..
T#T222T#T
111#T#333
111T.T333
111#T#333
T#T444T#T
..#444#..
..T444T..

Note that it's not important which node has Link in it. All of the nodes need to be connected regardless, so the starting position doesn't matter. On the second pass of identifying connected components, we'll spot 8 edge components of degree 2, and 1 hyper edge of degree 4, labeled ABCDEFGH and I respectively:

9 9
AAA222CCC
AA#222#CC
A#B222D#C
111#I#333
111III333
111#I#333
E#F444H#G
EE#444#GG
EEE444GGG

We'll need to remember the contents of the edge components, of course, but with the break down, we're ready to start computing edge weights. Since the edge components contain all of the trees, we'll be able to properly encapsulate the cost of cutting a Tree into our graph representation in a way that makes MST compute what we want.

Computing Edge Weights

Since the problem is limited to three Heart Pieces, there are at most four nodes of interest, so we need only consider edges of three different degrees: $2$ , $3$ , and $4$ . Computing the edge weights for such edge components is easy, tricky, and hard, respectively.

First, what is the edge weight, precisely? We define the weight of an edge connecting nodes $N$ to be the minimum number of Trees that must be cut in the corresponding edge component to render all the nodes in $N$ to be connected to each other through that edge. Note that, for edges of degree $\gt 2$ , there are multiple node sets one might want to connect through that edge component. This is fine, we will simply compute edge weights for all node sets we might connect through that component, and we won't worry about inaccuracies or duplicate edges in the resultant graph. The MST algorithm will ensure we get an optimal solution, so solutions that double-count trees by way of using overlapping edges will get filtered out by better solutions that don't use overlapping edges.

In the case of a degree 2 edge, computing the edge weight is simple: Run Dijkstra's Algorithm from one primary component to the other, modeling Trees as having incoming edges with weight $1$ , and all other edges with weight $0$ . This is the core of the solution to Cutting Trees , and it works just fine.

In the case of a degree 3 edge, computing the edge weight is trickier. In fact, on inspection, it seems no easier than the problem we first started out with! We are, once again, looking for a Steiner Tree, just on a subset of the graph. However, we can see than any such spanning tree must, when all nodes of degree 2 are reduced to edges, look like one of the following trees:

No other tree configuration makes sense. The thing both trees have in common is a central node , and it is easy to see that the path cut from each primary node to the central node must necessarily be a shortest path for the cost of the tree to be optimal. In the former case, one of the primary nodes is the center, and in the latter case, some point other than the three primary nodes is the center. By running Dijkstra, sourced at each of the primary nodes, over the domain of the primary nodes and the edge component, and using some careful arithmetic for tree counting, we can find the minimum cost of connecting the three nodes in $O(N)$ time, where $N$ is the number of nodes in the edge component, bounded above by $W \cdot H$ .

In the case of a degree 4 edge, things get trickier. There are many more tree type cases to consider that might be optimal:

However, once again, there is a common property to all of these being minimal: A pair of central nodes, forcibly joined, with all other nodes taking the shortest path to the closest one, produces each of these graphs. Red and Blue are the central nodes in the top left graph, Red and any other color form them in the next one. The central node and any primary node would form the bottom-left graph, two points in space would define bottom-center, and on the right, Red, and the arbitrary node between red, blue, and green, would induce such a graph.

We can run the Floyd-Warshall Algorithm , or just Dijkstra $N$ times, in order to precompute the distances between all of the nodes in the degree 4 edge component. After that, it is an $N^2$ operation to find the optimal pair of central nodes that produces the lowest cost spanning tree when connected to. And with that edge weight computed, we're ready to run MST.

Computing the MST

The scale of the problem is small enough that we don't actually need to run Prim's or Kruskal's algorithms to efficiently compute an MST, and in fact those algorithms don't even apply completely due to the presence of hyper edges. While we could be lazy and simply treat the whole maze as a degree $M$ hyper edge, and compute the weight of it using the algorithms described above, this could take unnecessarily long for simpler cases, and perhaps even impractically long on weaker computers (I have not tested how slow this is). More importantly, such a brute-force approach teaches us very little, and is almost guaranteed not to scale to larger graphs if we're interested in them.

To compute the MST of the hypergraph, we first independently consider any available hyper edges, since those greatly simplify the remainder of the process. We simply need to consider, on a case by case basis, the following options, choosing only the one that produces minimum cost:

1 primary component:
- A trivial edge case, but we can't forget it. In this case, the cost is $0$ .
2 primary components:
- There should be one edge component of degree 2 with minimal weight. Return that.
3 primary components:
- Either...
  - Use the degree 3 hyperedge, if there is one, or
  - Use the two smallest weight degree 2 edges
4 primary components:
- Either...
  - Use the degree 4 hyperedge, if there is one, or
  - Use a degree 3 hyperedge, if there is one, plus the cheapest degree 2 edge that connects one of the three primary components to the last one, or
  - Use three degree 2 edges, a la Kruskal's or Prim's algorithm (which is trivially hard-coded for so few nodes)

And that's it! Once you have the weight of the MST, multiply it by the index, add it to the total, and proceed.

The overall runtime of this algorithm is $O(W \cdot H)$ to determine connected components, and for each connected component of $C$ nodes, it is $O(C)$ for edges of degree 2, $O(C)$ for edges of degree 3 as well, but $O(C^2)$ for edges of degree 4. Since $C$ may be $O(W \cdot H)$ , the overall runtime is $O({(W \cdot H)}^2)$ in the worst case. For $W \cdot H \leq 100$ , this is generally fine. For larger graphs, such as $1 \leq W,H \leq 100$ , the problem quickly becomes slow to solve.

Click this link to view the scores for each maze from the input in order, sans multiplication by the index. If you were not able to get the right answer, consulting these individual results may help you find the problem (or, if I got something wrong, the bug in my own solution!)

Not sure if it's only me, but I can't get the zip file

Trevor Arashiro - 5 years, 4 months ago

Whoops - typo'd the link, good catch. Should be fixed now.

Daniel Ploch - 5 years, 4 months ago

Just curious, how did you make all these graphs? I'm guessing you didn't hand type them. Also, very nice problem.

Trevor Arashiro - 5 years, 4 months ago

@Trevor Arashiro – Thanks! And yes, all the problem sets are generated. I basically just figure out a handful of 'templates' that cover different edge cases and test algorithm performance, spruce in some RNG, then loop 1000 times, choosing a random template each time to create them.

I'll probably upload the code somewhere once I finish the set (my goal is 15 problems, 5 at each difficulty tier). It keeps changing right now because I'm introducing new graph algorithm methods and refactoring things all the time :)

Daniel Ploch - 5 years, 4 months ago

The Lost Woods - Evil Trees

This is a part of the collection entitled The Lost Woods , a series of problems in CS / Graph Theory.

Image Credit: http://www.sodahead.com/fun/do-you-like-to-walk-in-the-woods/question-2168027/

The answer is 5843193.

1 solution

0 pending reports