Highest Voted 'value-iteration' Questions

112

votes

5 answers

What is the difference between value iteration and policy iteration?

In reinforcement learning, what is the difference between policy iteration and value iteration? As far as I understand, in value iteration, you use the Bellman equation to solve for the optimal policy, whereas, in policy iteration, you randomly…

asked May 22 '16 at 02:43

Arslán

1,411
2
9
14
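
As a rough illustration of the distinction this question asks about, here is a minimal sketch of both algorithms on a tiny MDP. The two-state, two-action transition table `P`, the rewards `R`, the discount `GAMMA`, and all function names are hypothetical, invented only for this example. Value iteration backs up the value function with a max over actions until it stops changing; policy iteration alternates evaluating a fixed policy with improving it greedily.

```python
# Hypothetical 2-state, 2-action MDP, invented for illustration only.
# P[a][s][t] = probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.1, 0.9], [0.7, 0.3]],  # action 1
]
# R[a][s] = expected immediate reward for taking action a in state s.
R = [
    [1.0, 0.0],
    [0.0, 2.0],
]
GAMMA = 0.9
STATES = range(2)
ACTIONS = range(2)


def q_value(V, s, a):
    """One-step lookahead: R[a][s] + gamma * sum_t P[a][s][t] * V[t]."""
    return R[a][s] + GAMMA * sum(P[a][s][t] * V[t] for t in STATES)


def greedy_policy(V):
    """Pick, in every state, the action with the largest one-step lookahead."""
    return [max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in STATES]


def value_iteration(tol=1e-10):
    """Apply the Bellman optimality backup until V stops changing."""
    V = [0.0 for _ in STATES]
    while True:
        V_new = [max(q_value(V, s, a) for a in ACTIONS) for s in STATES]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new, greedy_policy(V_new)
        V = V_new


def evaluate_policy(policy, tol=1e-10):
    """Iterative policy evaluation: back up V under a fixed policy."""
    V = [0.0 for _ in STATES]
    while True:
        V_new = [q_value(V, s, policy[s]) for s in STATES]
        if max(abs(x - y) for x, y in zip(V_new, V)) < tol:
            return V_new
        V = V_new


def policy_iteration():
    """Alternate policy evaluation with greedy policy improvement."""
    policy = [0 for _ in STATES]  # arbitrary starting policy
    while True:
        V = evaluate_policy(policy)
        improved = greedy_policy(V)
        if improved == policy:
            return V, policy
        policy = improved


V_vi, pi_vi = value_iteration()
V_pi, pi_pi = policy_iteration()
print("value iteration: ", pi_vi, V_vi)
print("policy iteration:", pi_pi, V_pi)
```

On this toy problem both routines should settle on the same policy and (up to the tolerance) the same values; they differ in how the work is split between evaluation and improvement.
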

4

votes

1 answer

Dynamic Programming of Markov Decision Process with Value Iteration

I am learning about MDPs and value iteration in self-study and I hope someone can improve my understanding. Consider the problem of a 3-sided die having numbers 1, 2, 3. If you roll a 1 or a 2 you get that value in $ but if you roll a 3 you lose…

algorithm reinforcement-learning markov-decision-process value-iteration

asked Aug 26 '17 at 02:24

Sam Hammamy

10,125
8
43
84

2

votes

1 answer

Is there a clever way to get rid of these loops using numpy?

I'm reaching the maximum recursion depth, and I've been trying to use np.tensordot(), but I couldn't really get an insight into how to use it in this case. def stopping_condtion(a,V,V_old,eps): return np.max(la.norm(V - V_old)) < ((1 - a) * eps) /…

python numpy tensor operations-research value-iteration

asked May 12 '21 at 17:35

Max

425
1
3
7
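
One possible way to drop the inner loops, sketched under assumptions: the array layout (`P[a, s, t]` for transition probabilities, `R[a, s]` for rewards), the function names, and the simple max-difference stopping rule are all hypothetical and are not taken from the question, whose own stopping condition is truncated above. `np.tensordot` contracts the next-state axis of `P` with `V`, and a plain `for` loop replaces recursion.

```python
import numpy as np


def bellman_backup_loops(P, R, V, gamma):
    """Reference version with explicit loops over actions and states."""
    n_actions, n_states, _ = P.shape
    Q = np.zeros((n_actions, n_states))
    for a in range(n_actions):
        for s in range(n_states):
            expected_next = 0.0
            for t in range(n_states):
                expected_next += P[a, s, t] * V[t]
            Q[a, s] = R[a, s] + gamma * expected_next
    return Q.max(axis=0)


def bellman_backup_vectorized(P, R, V, gamma):
    """Same backup; tensordot contracts the next-state axis of P with V."""
    Q = R + gamma * np.tensordot(P, V, axes=([2], [0]))  # shape (A, S)
    return Q.max(axis=0)


def value_iteration(P, R, gamma, tol=1e-8, max_iter=10_000):
    """Iterative loop instead of recursion, so recursion depth is not a concern."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        V_new = bellman_backup_vectorized(P, R, V, gamma)
        if np.max(np.abs(V_new - V)) < tol:  # assumed stopping rule
            return V_new
        V = V_new
    return V


# Random hypothetical MDP with 3 actions and 5 states.
rng = np.random.default_rng(0)
P = rng.random((3, 5, 5))
P /= P.sum(axis=2, keepdims=True)  # normalize each row into a distribution
R = rng.random((3, 5))
V0 = rng.random(5)
V_star = value_iteration(P, R, gamma=0.9)
```
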

2

votes

1 answer

Why is Policy Iteration faster than Value Iteration?

We know that policy iteration gives us the policy directly and hence is faster. But can anyone explain it with some examples?

value-iteration

asked Nov 24 '19 at 23:33

shmi

23
4

2

votes

1 answer

Is Monte Carlo learning policy or value iteration (or something else)?

I am taking a Reinforcement Learning class and I didn’t understand how to combine the concepts of policy iteration/value iteration with Monte Carlo (and also TD/SARSA/Q-learning). In the table below, how can the empty cells be filled: Should/can it…

reinforcement-learning q-learning temporal-difference monte-carlo-tree-search value-iteration

asked May 07 '18 at 18:28

Johan

703
1
11
24

2

votes

2 answers

How to solve reinforcement learning gridworld examples using value iteration?

I find either theory or Python examples, neither of which is satisfactory for a beginner. I just need a simple example to understand the step-by-step iterations. Could anyone please show me the 1st and 2nd iterations for the image that I…

reinforcement-learning value-iteration

asked Mar 03 '18 at 12:15

Ahasan Ratul

35
1
10

2

votes

0 answers

Modelling profitability of credit card by Markov Decision Process

This is with reference to a paper published on modelling the profitability of credit cards by Markov Decision Processes. I am trying to implement the same in Python using Mdptoolbox but am not getting the output in the expected format. My states are the…

credit-card markov-decision-process value-iteration

asked Nov 21 '17 at 07:11

AnkitPandey

21
2

1

vote

2 answers

Population growth math issue in C

I have looked this over and am wondering where my math issue is. I believe that it should be calculating correctly, but the floats do not round up (e.g. .75 to 1) to add to the count for births/deaths. I am a novice at C. Here is the code I have so…

c math value-iteration

asked Mar 30 '21 at 06:48

Lee

11
3

1

vote

1 answer

Why are policy-iteration and value-iteration methods giving different results for optimal values and optimal policy?

I am currently studying dynamic programming in reinforcement learning, in which I came across two concepts: Value-Iteration and Policy-Iteration. To understand them, I am implementing the gridworld example from Sutton, which says: The…

python dynamic-programming reinforcement-learning policy value-iteration

asked Sep 08 '19 at 18:37

POOJA GUPTA

2,049
6
25
50

1

vote

0 answers

Faster accessing 2D numpy/array or Large 1D numpy/array

I am performing prioritized sweeping for which I have a matrix which has 1000*1000 cells (gridworld) whose cells I have to access repeatedly in a while true loop for assignment (I am not essentially iterating over the list but all cells are accessed…

python numpy value-iteration

asked Apr 16 '18 at 17:35

SH_V95

151
1
3
11

0

votes

2 answers

Declare a javascript object between brackets to choose only the element corresponding to its index

I found this sample in a book and this is the first time that I see this notation. Obviously it's a thousand times shorter than making a switch; but what is it? When I do typeof(status) it returns undefined. I would like to understand what it is so…

javascript ecmascript-6 iteration value-iteration

asked Mar 25 '21 at 09:35

Karleen-Bx

35
7

0

votes

0 answers

RL value iteration, gridworld multi action problem

I am just starting to study reinforcement learning and trying to get my head around the basics. I understand policy eval, policy and value iteration algorithms and can solve a simple gridworld optimisation problem with two terminal states -5 or +5.…

python reinforcement-learning gridworld value-iteration

asked Feb 21 '21 at 18:54

student200

11
2

0

votes

1 answer

Are these two different formulas for Value-Iteration update equivalent?

While studying MDP via different sources, I came across two different formulas for the Value update in Value-Iteration algorithm. The first one is (the one on Wikipedia and a couple of books): . And the second one is (in some questions here on…

formula mdp value-iteration

asked Dec 10 '19 at 01:17

jaja360

13
1
3

0

votes

6 answers

Iterate through all distinct dictionary values in a list of dictionaries

Assuming a list of dictionaries, the goal is to iterate through all the distinct values in all the dictionaries. Example: d1={'a':1, 'c':3, 'e':5} d2={'b':2, 'e':5, 'f':6} l=[d1,d2] The iteration should be over 1,2,3,5,6, does not matter if it is a…

python dictionary value-iteration

asked Jul 12 '18 at 10:24

Krzysztof Słowiński

2,959
5
27
47
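
A minimal sketch of one way to do this, using the `d1`, `d2`, and `l` from the excerpt; the helper name `distinct_values` is hypothetical, and the sketch assumes every value is hashable. The generator keeps first-seen order, while the set union is a shorter option when order does not matter.

```python
d1 = {'a': 1, 'c': 3, 'e': 5}
d2 = {'b': 2, 'e': 5, 'f': 6}
l = [d1, d2]


def distinct_values(dicts):
    """Yield each value once, in first-seen order, across all dictionaries."""
    seen = set()
    for d in dicts:
        for value in d.values():
            if value not in seen:
                seen.add(value)
                yield value


# Order-insensitive alternative: union the value views into one set.
unique = set().union(*(d.values() for d in l))

for value in distinct_values(l):
    print(value)
```
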

0

votes

4 answers

How to avoid creating unnecessary lists?

I keep coming across situations where I pull some information from a file or wherever, then have to massage the data to the final desired form through several steps. For example: def insight_pull(file): with open(file) as in_f: lines =…

python file list-comprehension string-iteration value-iteration

asked Dec 08 '17 at 20:38

blackmore5

73
1
6
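
One common pattern for this kind of multi-step massaging is to chain generators so that no intermediate list is built. The sketch below is an assumption-laden illustration: the question's own `insight_pull` is truncated above, so the step functions (`clean_lines`, `parse_fields`, `pull_records`) and the comma-separated input format are hypothetical.

```python
import io


def clean_lines(lines):
    """Lazily strip whitespace and skip blank lines."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def parse_fields(lines, sep=","):
    """Lazily split each line into a list of trimmed fields."""
    for line in lines:
        yield [field.strip() for field in line.split(sep)]


def pull_records(source):
    """Chain the lazy steps; only the final result is materialized."""
    return list(parse_fields(clean_lines(source)))


# A file object from open(...) could be passed the same way;
# io.StringIO stands in for it here.
sample = io.StringIO("a, 1\n\nb, 2\n")
records = pull_records(sample)
print(records)
```

Each step consumes items one at a time from the previous step, so memory use stays flat until the final `list(...)` call.
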
