Deprecated: Assigning the return value of new by reference is deprecated in /f2/blendedtechnologies/public/wp-content/plugins/pmetrics.php on line 1192
Blended Technologies » Blog Archive » Machine Learning and Dragons - a Game

Machine Learning and Dragons - a Game

Here is a fun little game I wrote using Python and pygame.

The Story:


You’re a knight and your job is to kill as many dragons as you can. The twist is that the dragons use Genetic Programming to learn from every encounter. (You can optionally have them use Reinforcement learning instead too.)

Dragon Fighter Pygame - Machine Learning - Genetic Programming

You can download the source code and all files needed to run the game here (8.6 MB, GPL).

To run it, you just need Python and pygame installed.

Game Play:
Kill as many dragons as fast as you can.
The arrow keys move ye.
The space bar swings your sword.
ctrl+c to quit.


When using Genetic Programming, the dragons basically learn like this. Every dragon gets assigned a randomly created program tree. Here’s an example in LISP like syntax:


Which means this:

if knight is facing me:
    if 20< =distance to knight:
        Breath Fire
        move away from knight
   move towards knight

Dragons get points the longer they survive in a level, and they get a whole lot of points for killing the knight. After the end of a level, the dragons’ program trees are mixed up with each other to create new programs. program trees associated with the most points, get chosen more frequently to reproduce.

Sadly this scheme turned out not to work too well, though it does work once in a while. If you play say 20 games (not levels), you’ll probably see an instance of the dragons learning to chase you. But normally they don’t learn much of anything.

Here are some of the reasons I think the learning doesn’t go better:
1. The population is too small for the search space.
2. Even good dragons can get killed, just by standing too close to another dragon. They have no means to learn not to do this.
3. Fitness isn’t fair. If a dragon starts out far away from the knight (placement is random) he will probably live longer than a dragon closer to the knight.

If anyone wants to fix these things, please send me your patches.

I also tried Reinforcement learning for the dragons. This actually made them good at turning around when the knight comes near, and shooting fire at him. But they could never learn how to move.

Please try it out and let me know your feedback. Do you have any ideas on how to scale up this idea and make it into a fun game? Any ideas how to improve the AI? |  Digg |  FURL |  Yahoo! My Web 2.0 |  Reddit

18 Responses to “Machine Learning and Dragons - a Game”

  1. Alex Says:

    I think one of the contributing problems might be the coarse grain of actions you’re allowing the dragons. If you gave them direction vectors for movement and amounts of fire rather than an on/off state you might start to evolve more interesting and organic behaviors.

  2. Alex Says:

    I’m assuming that the number of iterations can make a huge difference, and a human will get bored before really interesting algorithms show up.

    This sounds like a perfect web project. I know it’s a lot more work, but did you consider setting up a web database to collect and redistribute the results?

  3. Simon Wittber Says:

    You might want to change your bz2 file to unpack its contents into a subdirectory, rather than splat all over the current directory. It’s very annoying when that happens.

  4. Harrison Says:

    Evaluating dragons by the time that they survived does not seem like a good idea to me, because there are too many arbitrary factors that affect it. For example, the order in which the player kills the dragons does not reflect at all the effectiveness of the dragons in their environment, and yet it affects the process greatly.

    Have you thought of creating a completely mechanized environment and just letting the user watch? An army of knights vs an army of dragons would be interesting. The evolutionary process can then be based on a joint time-survived/soldiers-killed algorithm. Of course, the knight AI should just be a generic algorithm, since the knights would really be just another environmental factor. Results from that would be interesting.

    And I agree with Alex, using directional vectors to represent the actions of the dragon would be a good idea.

  5. Greg - CEO/Founder Says:

    Excellent ideas guys.

    Sorry about the bz2 setup. I’ll fix that.

    That’s a great idea to make it a web project. I’ll put it on my list of potential web projects.
    I figured the dragons would learn faster with a small set of actions, actions that already do useful things like move toward knight, so they don’t have to learn that. But directional actions might be worth a try.

    You are right about the survival time. I think that is messing things up. When I designed it like that, I figured it would just be an instance of learning in a noisy environment, but perhaps it’s too much noise. Great idea about having an army of knights be run by the computer. Another idea would be to have them evolve against each other which would be pretty cool.

  6. Chuck Says:

    Agreed with Harrison — a human player is going to learn how to “game” the limited set of responses that the dragons have, which means the set of possible responses has to expand much more widely. If the repertoire of the dragons was expanded to allow them different movement directions (absolute and relative to other objects, including terrain and other dragons) and more importantly, to become aware of the actions of other dragons, then you could end up teaching them to gang up on the hapless knight. At which time you start introducing more knights :)

    Individual lifespan isn’t necessarily the best fitness criteria either, since it would reward the most cowardly at the expense of the population. If the dragons actually had to meet and mate to produce offspring, their fitness is also tied to their ability to defend at least one other dragon.

  7. Greg - CEO/Founder Says:

    Chuck, those are fascinating ideas. I can’t wait until I have time to work on this game again. I’m going to take everyone’s advice here. Volunteer to help now :-)

  8. Mike J Says:

    Cowardice is actually a valid survival mechanism too. Evolving a species that survives by running from danger is just a realistic as one that seeks it out to try and kill it.

    But yes, in genetic algorithms ( as in neural nets ) the number of iterations is the key - repetition works. You can’t find a semi-optimal solution to the TSP in only 5-10 generations, especially if not all the cities were getting the feedback from the system ( ie, not all dragons will have a chance to choose fight or flight ).

    In addition, you could work in some steering behaviors for the dragons to give them a larger palette of things to choose from, so they could “wander” if their fight / flight responses aren’t triggered. In addition, you may want to make your rulesets a bit different, so that there are things that become tradeoffs. For instance, a dragon who learns to move and turn faster may not be able to shoot fire as far or for as long - it would give some more tangible benefits in this type of setup that you could see how a particular population would survive.

  9. Greg - CEO/Founder Says:

    Good points, Mike J. It is neat to see cowardice evolve, though it’s unbecoming of dragons :-)

    There definately are not enough iterations to make much of a dent in the search space of behaviors as you say. I’ll look into modifying the available actions on the next update to the program.

  10. Mihai Campean Says:

    Perhaps it would be a good idea to add more variables to the genetic algorithm, besides lifespan. So if you imprint other information in a dragon’s genes, such as speed, firepower, lifespan and other things and when randomly creating dragons you give some default values to the genes, it will surely be more interesting to see ho things will play out. Also ideas like mating and awareness of other dragons’ actions, would surely add some spice in the mix.

  11. Greg - CEO/Founder Says:

    Neat ideas, Mihai. there’s a lot of fun to be had here.

  12. Mihai Campean Says:

    Well, the general idea is that a behavior is as complex as you make it. It is the same as with any other problem, it depends on how well it’s domain is defined. If the domain of the problem is well captured and contains all the important aspects, you should be able to obtain fairly good results in implementing an algorithm for solving the problem.

  13. Harrison Says:

    I was sitting there reading some Hofstadter book when I suddenly realized the problem with your model. I’m quite sure now that the ineffectiveness of the genetic programming lies mainly in using survival time as a factor of evolution. I’ll show you why, and forgive me for using this analogy: If two humans were fighting to the death - let’s assume that they have no weapons - the fight is a long, drawn out process. Much timing, physical reaction speed, and eyesight coordination is involved. When one wins and the other loses, you can at least tell that it was a “close fight”, and that the loser “put up a good fight” by looking at how long the fight went on. In short, fighting in real life is a process, and thats the assumption you are making.

    But in our game model, fighting is much simpler, There is no “process” involved. You press ’space’, guy raises his sword, dragon dies in one hit. There is no way the dragon can defend from the attack. The dragon cannot even run away fast enough during the short interval in which the knight raises his sword and slashes down. The only way for the dragon to surive longer is cowardice (running away beforehand), the action of which is part of a completely different purpose (survival, versus killing the knights (which I am assuming is what you want)). Basically what this means is that survival time has nothing to do with how well the dragon ‘fights’, since there is no fighting going on.. just killing; therefore using it as a factor will not help kill the knight.

    There are a few ways I have in mind which you go about this problem: 1) make fighting a drawn out process with attacks, counterattacks, counter-counter attacks, blocks, etc ; 2) dont use survival time as a factor; 3) create an automated ‘knight army’ with generic (not genetic) algorithm so that more kills can be awarded to the dragons, which can then be used as factor.

    So thats what I was thinking… I am not sure of what your exact purpose is in creating this game. An AI simulation perhaps, or for viewer entertainment? But you might like to fix things accordingly. But as a simulation, things should be kept simple, in my opinion.

  14. Greg - CEO/Founder Says:

    Harrison, neat ideas. Where does Hofstadter come into this? :-)

    That’s an excellent point about survival time only being helped by cowardess, and luck (starting distance from knight). Maybe some sort of health measure would help with that. So the knight has to hit a dragon multiple times, and vice versa. The knight army is a neat idea too.

    The thing I don’t like about having only killing the knight giving a dragon fitness in the current setup is that it makes the fitness landscape very non-smooth. I.e, a dragon can be very aggressive and have excellent strategies but if another dragon kills the knight first, he doesn’t get rewarded. (Of course survival time probably doesn’t help with that either and probably actually discourages aggressive behavior.) So I’m not sure actually.

    I’m not sure what my purpose in creating the game was. I think to see machine learning algortihms in an interactive environment. And secondly to make an entertaining game.

  15. davidg Says:

    I might download the source … but I was wondering how you seeded your dragon population initially?

    Did you give them an initial traits? Such as “forage for food / look for opponents / sleep / drink”. Initial seeding, at random, of the population to give them “some” intelligence might be useful to kickstart the evolution process … you are God anyways in this small world.

    Then you might be able to gauge the effectiveness of traits a dragon has in its lifetime, instead of the raw length of their lifetime.

    I haven’t written GPs before, but I have dabbled in GAs (particulary the N-Queens problem in back when I was in school) and I know the paradigms are siblings.

    I also agree with the post above that you could make this a web-app. You could distribute many worlds and have newcomers face evovle(ing) generations of dragons trained from both AI bred breeding arenas and human-vs-dragons bred breeding arenas :)


  16. Greg - CEO/Founder Says:

    Davidg, go ahead and download the source, it will make things much clearer.

    The only actions the dragons have in their repetriore (sp) are attack, move toward knight, and move away from Knight. Then they also have some conditional expressions to apply to those actions, e.g., is knight facing me, how far am I from knight, etc.

    The intial programs that control the dragons’ bevhaviors, made up of those actions and conditions, are generated randomly to start off with. Hopefully having a look at the source will make the idea clearer to you.

  17. » Blog Archive » Machine Learning and Dragons Says:

    […] October 24th, 2007 in Links Here’s a short article about a machine learning project written in pygame. It’s an interesting approach with some intriguing possibilities (yes, I’m tempted). […]

  18. Brian Says:

    Hi Greg, I like the game. Maybe you should add some blue fire, because its hotter.