The main idea for our project revolves around archery. More specifically, we would like to train an agent, via reinforcement learning, to accurately hit a target in Minecraft using the bow and arrow. When shooting an arrow in Minecraft, the player controls the angle at which the arrow is aimed and the amount of force applied to it (via the game's “charging” mechanism). Much as a real-world player would, we want the agent to be able to “see” the target in front of it, so we will feed the agent the target's location relative to itself as input. However, the agent will have to gradually learn which combinations of angle and force hit a specified target, and will output the best shot it has found.
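As a rough illustration of what the agent would work with, the sketch below shows one possible way to represent the observation and the shot parameters. All of the names and fields here are our own placeholders for this proposal, not part of any Minecraft or Malmo API.

```python
# A minimal sketch of the agent's observation and action, assuming a
# discretized "one shot per decision" setup. ShotObservation and ShotAction
# are illustrative placeholders, not real game/Malmo structures.
from dataclasses import dataclass

@dataclass
class ShotObservation:
    """Target location relative to the agent."""
    distance: float     # horizontal distance to the target, in blocks
    yaw_offset: float   # horizontal angle from the agent's facing, in degrees
    height_diff: float  # vertical offset of the target, in blocks

@dataclass
class ShotAction:
    """One shot: where to aim and how long to charge the bow."""
    pitch: float        # vertical aim angle, in degrees
    yaw: float          # horizontal aim angle, in degrees
    charge_time: float  # seconds the bow is drawn (caps at full charge)
```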
As we progress through this project, we plan on introducing the agent to new (and more difficult) obstacles. For instance, though our primary goal is to have the agent hit a stationary target (at various distances and heights), we would like to use moving targets as we move forward. In the later stages of the project, we also aim to introduce environmental obstacles that affect the arrow's flight (such as a wall of water, since the drag on an arrow increases greatly while it travels through water). As a final goal, we hope to have the agent eventually progress towards hitting hostile targets that are moving towards it.
We plan to implement reinforcement learning through Q-learning with a neural network as a function approximator (and may experiment with SARSA as well if we present the agent with hostile targets in later stages of the project).
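To make this concrete, the sketch below outlines the kind of Q-learning update we have in mind, using a small PyTorch network as the function approximator. The discretization of aim angles and charge levels, the layer sizes, and the hyperparameters are all illustrative assumptions that we expect to revise as we experiment.

```python
# A rough sketch of Q-learning with a neural-network function approximator.
# Angles and charge levels are discretized into a finite action set so that
# the network can output one Q-value per candidate shot.
import random
import torch
import torch.nn as nn

N_PITCH, N_YAW, N_CHARGE = 9, 9, 5      # assumed discretization of the shot
N_ACTIONS = N_PITCH * N_YAW * N_CHARGE  # one Q-value per (pitch, yaw, charge)
STATE_DIM = 3                           # (distance, yaw offset, height diff)

q_net = nn.Sequential(
    nn.Linear(STATE_DIM, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)

def select_action(state: torch.Tensor, epsilon: float) -> int:
    """Epsilon-greedy choice over the discretized shot actions."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(state).argmax())

def q_update(state, action, reward, next_state, done, gamma=0.99):
    """One Q-learning (temporal-difference) step on a single transition."""
    q_pred = q_net(state)[action]
    with torch.no_grad():
        bootstrap = 0.0 if done else gamma * float(q_net(next_state).max())
    target = reward + bootstrap
    loss = (q_pred - target) ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Since each round below is a single shot, every transition would end immediately (done=True), making the early stages close to a contextual bandit; we keep the full Q-learning update so the same code can carry over to the multi-shot, moving-target stages later on.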
The basic metrics by which we will evaluate success are accuracy (how close the agent's final shot lands to the actual target) and precision (how tightly the shots are grouped together); for instance, we will analyze whether the final shots were scattered at random or closely clustered. We will also keep track of how many different combinations of actions the agent must perform before these two metrics begin to gradually improve. A round will consist of the agent attempting to shoot a target. If our agent hits the desired target, we will provide it with a reward; if it misses, the agent will receive a smaller reward or none at all, depending on how far the arrow landed from the target. So, by utilizing reinforcement learning, we expect the metrics to appear somewhat random at first (since the agent is exploring its options in an attempt to receive a reward), but once the agent begins to converge towards an optimal policy and successfully hits a target, we believe its accuracy and precision will steadily improve from then on.
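The sketch below shows how we might compute these two metrics and the distance-based reward. The exact reward shape (a linear falloff out to a cutoff radius) is a placeholder that we expect to tune rather than a fixed design.

```python
# A sketch of the accuracy and precision metrics, plus the distance-based
# reward described above. Hit threshold and cutoff radius are assumptions.
import math
import statistics

def miss_distance(arrow_pos, target_pos):
    """Accuracy metric: Euclidean distance from the arrow's landing point
    to the target."""
    return math.dist(arrow_pos, target_pos)

def shot_spread(landing_points):
    """Precision metric: mean distance of recent shots from their centroid,
    so a smaller value means a more tightly grouped set of shots."""
    centroid = [statistics.mean(coord) for coord in zip(*landing_points)]
    return statistics.mean(math.dist(p, centroid) for p in landing_points)

def shot_reward(dist, hit_reward=10.0, cutoff=8.0):
    """Full reward on a hit; a smaller reward that shrinks with miss
    distance; nothing beyond the cutoff radius."""
    if dist <= 0.5:  # landing within the target block counts as a hit
        return hit_reward
    return max(0.0, hit_reward * (1.0 - dist / cutoff))
```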
Visually, we will be able to observe the game and see whether the agent is consistent in its actions. For instance, if our agent fires arrows randomly at first but gradually begins to land its arrows near the desired target, we can verify that the internals of our algorithm work. However, if the agent remains inconsistent, firing arrows at random and hitting the target only occasionally without making any real effort to aim at it from one shot to the next, then we will know something is most likely wrong with the algorithm. As a dream case, we would love for our agent to efficiently fend off approaching enemies while, perhaps, contending with multiple influencing factors like water, height, etc.
Overall, since some of the mechanics our agent will be utilizing are not as easily perceived by humans (e.g., a real-world player may not be able to charge a bow by the desired amount as efficiently, or may become visually distracted by other objects in the scene when aiming), we believe our agent will make for a very intriguing artificial player.