reinforcement-learning
QLearning and never-ending episodes
Let's imagine we have an (x,y) plane where a robot can move. Now we define the middle of our world as the goal state, which means that we are going to give a reward of 100 to our robot once it reache... [Details]
2022-12-13 06:29 Category: Q&A
Alpha and Gamma parameters in QLearning
What difference does it make to the algorithm to have a big or small gamma value? As I see it, as long as it is neither 0 nor 1, it should work exactly the same. On the other hand, whatev... [Details]
2022-12-13 04:19 Category: Q&A
Negative rewards in QLearning
Let's assume we're in a room where our agent can move along the x and y axes. At each point it can move up, down, right and left. So our state space can be defined by (x, y) and our actions at eac... [Details]
2022-12-13 01:02 Category: Q&A
What are the uses of recurrent neural networks when using them with Reinforcement Learning?
I know that feedforward multi-layer neural networks with backprop are used with Reinforcement Learning to help it generalize the actions our agent takes. That is, if we have a big state space, we... [Details]
2022-12-12 01:19 Category: Q&A
Improving Q-Learning
I am currently using Q-Learning to try to teach a bot how to move in a room filled with walls/obstacles. It must start in any place in the room and get to the goal state (this might be, to the... [Details]
2022-12-11 04:13 Category: Q&A
Generalization functions for Q-Learning
I have to do some work with Q Learning, about a guy that has to move furniture around a house (it's basically that). If the house is small enough, I can just have a matrix that represents actions/rew... [Details]
2022-12-08 06:50 Category: Q&A
Why does AlphaZero perform better than vanilla MCTS? [closed]
Closed. This question is not about programming or software development. It is not currently accepting answers. [Details]
2022-12-07 18:04 Category: Q&A