This exercise models a sequential decision-making problem. We consider the Taxi
Domain, in which a taxi agent picks up and drops off passengers in a
grid world. We first implement dynamic programming algorithms that compute a good policy
offline, and then implement basic reinforcement learning methods for online learning. For more details about the problem statement, see Problem_statement.pdf. For the results and analysis, see Taxi_Planning.pdf.
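
To make the offline dynamic programming step concrete, here is a minimal sketch of value iteration on a toy three-state problem. It is not the code in A3.py: the `value_iteration` function, the `toy` transition table, the reward values, and the discount factor are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical format: transitions[state][action] = [(probability, next_state, reward), ...]
# A state with no actions is treated as terminal.
Transitions = Dict[int, Dict[str, List[Tuple[float, int, float]]]]


def q_value(outcomes, values, gamma):
    """Expected return of one action given the current value estimates."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9, tol: float = 1e-8):
    """Repeat Bellman optimality backups until the largest change is below tol."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            if not actions:  # terminal state keeps value 0
                continue
            best = max(q_value(o, values, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best  # in-place update
        if delta < tol:
            break
    # Extract the greedy policy with respect to the converged values.
    policy = {
        s: max(actions, key=lambda a: q_value(actions[a], values, gamma))
        for s, actions in transitions.items()
        if actions
    }
    return values, policy


# Toy corridor (assumed numbers): each step costs 1, reaching state 2 pays 20.
toy: Transitions = {
    0: {"right": [(1.0, 1, -1.0)], "stay": [(1.0, 0, -1.0)]},
    1: {"right": [(1.0, 2, 20.0)], "left": [(1.0, 0, -1.0)]},
    2: {},
}

values, policy = value_iteration(toy)
print(values, policy)
```

In the actual Taxi Domain the state would also have to encode the taxi position, the passenger status, and the destination; the backup itself stays the same.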
- The simple command `python A3.py` can be used to run the file and see the simulations.
- The progress of the game is displayed with 't' for the taxi without a passenger, 'T' for the taxi with a passenger, 'S' for the passenger's location, 'D' for the passenger's destination, and '0' for every other location. Walls are denoted by '|'.
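
The display legend above can be sketched as a small rendering function. This is an illustrative sketch, not the rendering code in A3.py: the `render` function, its parameters, and the convention that `(row, col)` in `walls` means a wall on the right side of that cell are assumptions.

```python
def render(rows, cols, taxi, passenger, destination,
           has_passenger=False, walls=frozenset()):
    """Return the grid as text using the symbols t, T, S, D, 0 and |.

    Positions are (row, col) tuples. `walls` is an assumed encoding: a pair
    (row, col) places a wall between that cell and the cell to its right.
    """
    lines = []
    for r in range(rows):
        parts = []
        for c in range(cols):
            if (r, c) == taxi:
                symbol = "T" if has_passenger else "t"
            elif (r, c) == passenger and not has_passenger:
                symbol = "S"  # passenger still waiting at the pickup location
            elif (r, c) == destination:
                symbol = "D"
            else:
                symbol = "0"
            parts.append(symbol)
            if c < cols - 1:
                # Separator between neighbouring cells: wall or blank.
                parts.append("|" if (r, c) in walls else " ")
        lines.append("".join(parts))
    return "\n".join(lines)


# Empty taxi in the top-left corner, passenger waiting in the bottom-right.
before_pickup = render(2, 3, taxi=(0, 0), passenger=(1, 2),
                       destination=(0, 2), walls={(0, 1)})
# Taxi has reached the passenger's cell and picked the passenger up.
after_pickup = render(2, 3, taxi=(1, 2), passenger=(1, 2),
                      destination=(0, 2), has_passenger=True, walls={(0, 1)})
print(before_pickup)
print()
print(after_pickup)
```

The taxi symbol takes priority over 'S' and 'D' when it shares a cell with them, which is one possible choice; the actual program may resolve overlaps differently.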