Project One

Design due 10 June and Program Due 15 June

You are going to write a program that explores the N-bandits problem. In this problem, there are N one-armed bandits. An AI player is given X pulls, which it may spend on any combination of the machines. The goal is to write an AI player that, on average, maximizes the amount of money it holds after X pulls.

The basic one-armed bandit has a probability p (between 1 and 100) that it will pay off. Each pull of an arm costs some money; for a basic bandit, this is 1 dollar. When an arm is pulled, the bandit generates a random number between 1 and 100. If this number is less than or equal to the pay off probability, the player wins and gets 4 dollars. Each bandit has its own p, a random number between p_min and p_max (see below).
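
To make the rule above concrete, here is a minimal sketch of a single pull of a basic bandit. The assignment does not name a language, so Python is an assumption here, and the names basic_pull and expected_net are hypothetical helpers, not required parts of your design. The sketch also assumes the 1 dollar cost is paid on every pull, so a win nets 3 dollars.

```python
import random


def basic_pull(p, rng):
    """Simulate one pull of a basic bandit; return the net change in dollars."""
    cost = 1                          # a basic bandit costs 1 dollar per pull
    roll = rng.randint(1, 100)        # random number from 1 to 100, inclusive
    winnings = 4 if roll <= p else 0  # pays 4 dollars when roll <= p
    return winnings - cost


def expected_net(p):
    """Average net dollars per pull for a basic bandit with pay off probability p."""
    return 4 * p / 100 - 1


rng = random.Random(42)               # seeded so repeated runs match
print(basic_pull(35, rng))            # either -1 (loss) or 3 (win)
print(expected_net(35))
```

Under these assumptions a basic bandit breaks even at p = 25, since 4 * 25 / 100 - 1 = 0. Any basic bandit with a larger p makes money on average.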

Your job is to write a program that runs the N-bandit simulation with one of three AI players that you design. The student whose AI player consistently performs the best in my tests will receive extra credit. In fact, the top 3 students will get extra credit.

Program Details

  1. Your program should prompt the user for the following information:
    1. N, the number of bandits
    2. X, the number of pulls you get
    3. p_min, the min pay off probability
    4. p_max, the max pay off probability
    5. P, the id of the player you want to use (1, 2, or 3)
    6. M, the starting dollar amount
  2. You must use objects, inheritance, one abstract class, and no object should have a default constructor! (Remember a default constructor is a constructor with no parameters.)
  3. The agent is allowed to have negative dollars (i.e., it is allowed to go into debt).
  4. The agent takes its pulls one at a time. That is, the agent selects which lever to pull and pays the cost of that pull; if there is a pay off, the agent gets the money. It then gets a chance to decide which lever to pull next.
  5. You should write three agents to play this game.
  6. There are three types of bandits in this game. When you generate your array of N bandits, populate each element of the array with a bandit of a randomly chosen type. Each bandit gets a pay off probability p between p_min and p_max.
    1. The basic bandit. Each pull costs 1 dollar, and the pay off is 4 dollars.
    2. The better bandit. Each pull costs 2 dollars, and the pay off is 4 dollars. However, if the random number is below half the pay off probability, then the user gets 15 dollars.
    3. The jackpot bandit. The jackpot bandit only pays off if its random number is below 1/10 of the pay off probability. In this case, the jackpot bandit pays off 150 dollars. It costs 1 dollar to play the jackpot bandit.
  7. Main should be small. No function should be larger than one page.
  8. Comment!!!!
  9. All members should either be private or protected. NO PUBLIC MEMBERS!!! That means your objects must be encapsulated!!!
  10. In your design document, indicate which objects are children of which other objects. For child objects, only specify those members or methods that are different from the parent.
  11. For extra credit, I will hold a competition. I will have a run of 33 bandits. The pay off probability range will be medium (between 30 and 40 percent). The agents that consistently perform the best will win extra credit. I will start each agent with 100 dollars and give it 1,000,000 pulls.
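
As an illustration of items 2 and 6, here is one possible sketch of the three bandit types built on a single abstract class. It is not the required design. Python is an assumption, since the assignment does not name a language, and every class and function name here (Bandit, BasicBandit, BetterBandit, JackpotBandit, make_bandits, expected_net) is hypothetical. Leading underscores stand in for private or protected members. The sketch also assumes that p is a whole number, that "below" means strictly less than, and that the better bandit's 15 dollars replaces the 4 dollars rather than adding to it.

```python
import random
from abc import ABC, abstractmethod


class Bandit(ABC):
    """Abstract parent: holds the pay off probability p and the cost per pull."""

    def __init__(self, p, cost):      # no default constructor: both are required
        self._p = p
        self._cost = cost

    @abstractmethod
    def _payout(self, roll):
        """Dollars paid out for a roll between 1 and 100."""

    def pull(self, rng):
        """Roll once and return the net change in dollars (payout minus cost)."""
        return self._payout(rng.randint(1, 100)) - self._cost

    def expected_net(self):
        """Average net dollars per pull, found by checking all 100 rolls."""
        return sum(self._payout(r) for r in range(1, 101)) / 100 - self._cost


class BasicBandit(Bandit):
    def __init__(self, p):
        super().__init__(p, cost=1)

    def _payout(self, roll):
        return 4 if roll <= self._p else 0


class BetterBandit(Bandit):
    def __init__(self, p):
        super().__init__(p, cost=2)

    def _payout(self, roll):
        if roll < self._p / 2:
            return 15                 # assumed to replace the 4 dollar pay off
        return 4 if roll <= self._p else 0


class JackpotBandit(Bandit):
    def __init__(self, p):
        super().__init__(p, cost=1)

    def _payout(self, roll):
        return 150 if roll < self._p / 10 else 0


def make_bandits(n, p_min, p_max, rng):
    """Build n bandits, each of a random type with a random p in [p_min, p_max]."""
    kinds = (BasicBandit, BetterBandit, JackpotBandit)
    return [rng.choice(kinds)(rng.randint(p_min, p_max)) for _ in range(n)]


for kind in (BasicBandit, BetterBandit, JackpotBandit):
    print(kind.__name__, kind(40).expected_net())
```

Under these assumptions, with p = 40 the average net per pull works out to 0.60 dollars for the basic bandit, 1.69 for the better bandit, and 3.50 for the jackpot bandit. The three types are far from equal, so an agent has to learn about more than p alone.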
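
The pull-by-pull loop in item 4 could look something like the sketch below. It is only an example of the shape, written in Python as an assumed language. The agent shown uses epsilon-greedy, a common textbook strategy chosen here purely for illustration; it is not a strategy this assignment prescribes, and it is unlikely to be the best one. StandInBandit, Agent, EpsilonGreedyAgent, and run_simulation are all hypothetical names, and StandInBandit is a simplified placeholder for whatever bandit classes you design.

```python
import random
from abc import ABC, abstractmethod


class StandInBandit:
    """Placeholder for a basic bandit: costs 1, pays 4 when roll <= p."""

    def __init__(self, p):
        self._p = p

    def pull(self, rng):
        """Return the net change in dollars for one pull."""
        return (4 if rng.randint(1, 100) <= self._p else 0) - 1


class Agent(ABC):
    """Abstract player: picks a lever, then learns what that pull earned."""

    def __init__(self, n_bandits):
        self._n = n_bandits

    @abstractmethod
    def choose(self):
        """Return the index of the bandit to pull next."""

    @abstractmethod
    def observe(self, index, net):
        """Record the net result of pulling bandit number index."""


class EpsilonGreedyAgent(Agent):
    """Usually pulls the best lever seen so far; explores with chance epsilon."""

    def __init__(self, n_bandits, epsilon, rng):
        super().__init__(n_bandits)
        self._epsilon = epsilon
        self._rng = rng
        self._pulls = [0] * n_bandits      # pulls made on each lever
        self._totals = [0.0] * n_bandits   # net dollars earned on each lever

    def choose(self):
        # Try every lever once before trusting the averages.
        for i in range(self._n):
            if self._pulls[i] == 0:
                return i
        if self._rng.random() < self._epsilon:
            return self._rng.randrange(self._n)   # explore a random lever
        averages = [self._totals[i] / self._pulls[i] for i in range(self._n)]
        return averages.index(max(averages))      # exploit the best average

    def observe(self, index, net):
        self._pulls[index] += 1
        self._totals[index] += net


def run_simulation(bandits, agent, pulls, money, rng):
    """Play the pulls one at a time and return the final dollar amount."""
    for _ in range(pulls):
        index = agent.choose()
        net = bandits[index].pull(rng)
        money += net                  # may go negative: debt is allowed
        agent.observe(index, net)
    return money


rng = random.Random(1)
bandits = [StandInBandit(p) for p in (30, 35, 40)]
agent = EpsilonGreedyAgent(len(bandits), 0.1, rng)
print(run_simulation(bandits, agent, 1000, 100, rng))
```

Keeping choose and observe as separate methods means the simulation loop never needs to know how a given agent makes its decision, so a second or third agent only has to override those two methods.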