Programming Task 4: Championship Dungeon Rats

Reviewed December 3rd, 2014


Guidelines


The environment for this assignment:


In this task you must genetically engineer a dungeon rat. You have control over its 290 character genome which is used to dictate the personality of a rat. You have until presentation day to optimize the genome of your rat and present your champion rat to me along with an idea of how you went about optimizing its performance (the number of turns before the rat dies). The format of your genome will be a 290 length ascii string with ascii values between '*' and 'z' (42 to 122).

FORMATING NOTE '\' is an allowable character which can be interpreted as an escape character in your code, so in C++11 you can use a Raw string format (e.g. R"(yourstringhere)") or escape '\' by using '\\' instead. In the javascript simulator I provide the input box can accept '\' without problems, but if you hard code your javascript you'll want to use '\\' instead of '\'.

The rat will be placed in a default dungeon at coordinates 12,12. Extra points if your rat does well in a variety of randomly generated maps. The default map is available to you in the code I provide below, its seed is: 25:25:..$.$.X.............X....$X.X*..X$..X...*X$..$...X$.$......X.$.X...XX.$.X*.*.*..X..X.**.......X..$$$...........XX.....................$...X...*.$..X..$X..........$.*..X.....$.X..$*.$X......$...X.*X$......$.**.X.X..XX$X..*....*..X.X....$...X...X........$.X....$...*...X$*........X..$*$$......$$...$*..X.$.$......$.$.$...$..X.*.....X..$......$.XX*..X.$.X......X$*.**.....X*...$..XX..X.....$....X....X...X....X.$X$..X..........$...*.X$..X...$*...........*....XXX$$.$.$..*$XX..XX..*.....$......X.XX$..$$..X$.XX.$$..X.*..*......X......$..$.$$..*...X.........$X....$X.$$.*.$.$.$..**.....X.$.$X.*.$.........$**..X.X.X$X.$.*X.X*..$*.. The dungeon is a 25 by 25 square grid composed of four distinct types of tiles. A food tile (marked $), an obstacle tile (* (think "fire")), a pit (marked X), and an empty tile (marked .).

The rat begins with 30 energy and loses 1 energy per turn, and if it has 0 or less energy it is considered dead. At that point the number of turns that it survived is the rat's score. Now, every time it encounters a food tile its energy is increased by 10, and the food tile is emptied. Every time it runs into an obstacle it does not move and its energy is decreased by 10. If the rat enters a pit tile then it is instantly killed.

To help you in your search I have provided you with a rat simulation library in two programming languages. The simulators should give the same results given the same inputs, they might be subject to bugs but they are likely to be the same bugs. The library which I intend for you to primarily work with is in C++ (for optimizing your rat using higher speed code) and is available on IDEONE AT THIS URL or on as a file on this server at this URL.

The other simulation library is in Javascript so that you can input the genome of a rat, the seed of a dungeon, and starting coordinates and watch the rat's dungeon run. Here is a JSFIDDLE with the animated dungeon run.

The javascript version is designed for visualizing the run of a specific rat (the code uses EaselJS and BackboneJS to create a Model and a html canvas animated View, to learn more sign up for 474 Advanced Web Technologies in the spring).

You do not need to understand the rat simulator to do well at this project but it might help. In theory you can optimize the genome via any number of methods using the simulation as a black-box. I leave it to you to find a method that works for this optimization problem.

About the simulator

The genome that you provide is used to set the weights of a multi-layer perceptron (a simple artificial neural net). Each turn the rat is going to be made aware of 25 input boolean values which his brain will translate into four parallel decisions: should I go down a row, up a row, right a row, left a row?

The rat "perceives" a 7 by 7 grid centered at its current location. If the rat were to be at position 0,0 then there are four quadrants the NW, the NE, the SW, and the SE. Each quadrant is split into close and far zones, where close zones can be reached in one move, far zones can be reached in 2 or 3 moves. Below is a picture of the NW far zone, the NW close zone, and the six squares which are in multiple zones.

The 25 input values given to the rat's brain in each turn are as follows:

  1. Am I in pain? (Did it hit an obstacle on the previous turn?)
  2. Is there an obstacle in the close NW zone?
  3. Is there an obstacle in the close NE zone?
  4. Is there an obstacle in the close SW zone?
  5. Is there an obstacle in the close SE zone?
  6. Is there an obstacle in the far NW zone?
  7. Is there an obstacle in the far NE zone?
  8. Is there an obstacle in the far SW zone?
  9. Is there an obstacle in the far SE zone?
  10. Is there food in the close NW zone?
  11. Is there food in the close NE zone?
  12. Is there food in the close SW zone?
  13. Is there food in the close SE zone?
  14. Is there food in the far NW zone?
  15. Is there food in the far NE zone?
  16. Is there food in the far SW zone?
  17. Is there food in the far SE zone?
  18. Is there a pit in the close NW zone?
  19. Is there a pit in the close NE zone?
  20. Is there a pit in the close SW zone?
  21. Is there a pit in the close SE zone?
  22. Is there a pit in the far NW zone?
  23. Is there a pit in the far NE zone?
  24. Is there a pit in the far SW zone?
  25. Is there a pit in the far SE zone?

The values of your genome are mapped from ASCII to a double value between -1 and 1 by subtracting 82 and dividing by 40 ('*' maps to -1 and 'z' maps to 1). The first 250 values are used to fill a 25 x 10 matrix, W_ih. The input vector of 25 0-1 values is treated as a 1 x 25 matrix and left multiplied onto W_ih (dot product with each column). This gives a 1 x 10 matrix which is the sum of the activated inputs effect on the hidden neurons. This is mapped to a boolean 1x10 matrix by deciding if each value is positive and if so making it a 1, otherwise a 0. Then the remaining 40 genome characters are used to fill a 10x4 matrix W_ho and the hidden neurons are fired in the same way as above to get the final 0/1 output matrix which has dimensions 1x4.

The above argument can be reverse engineered to know which of the genome characters listens to which neurons and impacts which other neurons. This might be a useful insight for some optimization techniques. You may also consider 'training' the neural net directly by using the C++ neural net class I provide and adjusting weights until on certain input vectors it gives your desired output vectors. (Google "training a neural network" if you are interested.) The NeuralNet class I provide has utility functions for translating between a vector of doubles and the ascii genome format I specify above.

A sensible training choice might be: if there is a pit in 3 of the close zones train the rat to chose to walk toward the opposite safe area.

Of course many other ways exist to optimize for this problem.