Chapter 19 Probabilistic Dynamic Programming


Slide 1

Chapter 19 Probabilistic Dynamic Programming, to go with Operations Research: Applications and Algorithms, 4th edition, by Wayne L. Winston. Copyright (c) 2004 Brooks/Cole, a division of Thomson Learning, Inc.

Slide 2

Description
In deterministic dynamic programming, a specification of the current state and current decision was enough to tell us with certainty the new state and the costs during the current stage. In many practical problems, these factors may not be known with certainty, even if the current state and decision are known. This chapter explains how to use dynamic programming to solve problems in which the current period's cost or the next period's state is random. We call these problems probabilistic dynamic programming problems (or PDPs).

Slide 3

19.1 When Current Stage Costs Are Uncertain, but the Next Period's State Is Certain
For problems in this section, the next period's state is known with certainty, but the reward earned during the current stage is not known with certainty. For the price of $1/gallon, the Safeco Supermarket chain has purchased 6 gallons of milk from a local dairy. Each gallon of milk is sold in the chain's three stores for $2/gallon. The dairy must buy back for 50¢/gallon any milk that is left at the end of the day.

Slide 4

Unfortunately for Safeco, demand at each of the chain's three stores is uncertain. Past data indicate the daily demand distribution at each store; the table is shown in the book. Safeco wants to allocate the 6 gallons of milk to the three stores so as to maximize the expected net daily profit earned from milk. Use dynamic programming to determine how Safeco should allocate the 6 gallons of milk among the three stores.

Slide 5

Except for the fact that demand is uncertain, this problem is very similar to the resource allocation problem. Observe that since Safeco's daily purchase costs are always $6, we may concentrate our attention on the problem of allocating the milk to maximize daily revenue earned from the 6 gallons. Define
r_t(g_t) = expected revenue earned from g_t gallons assigned to store t
f_t(x) = maximum expected revenue earned from x gallons assigned to stores t, t+1, …, 3

Slide 6

For t = 1, 2, we may write

f_t(x) = max_{g_t} { r_t(g_t) + f_{t+1}(x - g_t) }

where g_t must be a member of {0, 1, …, x}. We begin by computing the r_t(g_t)'s. Complete computations can be found in the book.
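The allocation recursion can be sketched in Python as follows. This is a minimal illustration, not the book's worked solution: the demand table `DEMAND` is a placeholder assumption (the actual table is only in the book), the boundary condition f_3(x) = r_3(x) assumes every gallon is assigned to some store, and the names `expected_revenue` and `allocate` are hypothetical.

```python
# ASSUMED demand distributions (placeholders; the real table is in the book).
# DEMAND[t][d] = probability that store t's daily demand is d gallons.
DEMAND = {
    1: {1: 0.6, 2: 0.0, 3: 0.4},
    2: {1: 0.5, 2: 0.1, 3: 0.4},
    3: {1: 0.4, 2: 0.3, 3: 0.3},
}
PRICE = 2.0     # selling price per gallon (from the problem statement)
BUYBACK = 0.5   # dairy's buy-back price per unsold gallon
TOTAL = 6       # gallons to allocate


def expected_revenue(t, g):
    """r_t(g): expected revenue when g gallons are assigned to store t."""
    total = 0.0
    for d, prob in DEMAND[t].items():
        sold = min(g, d)
        total += prob * (PRICE * sold + BUYBACK * (g - sold))
    return total


def allocate(total=TOTAL, stores=(1, 2, 3)):
    """Backward recursion; returns (f_1(total), one optimal allocation)."""
    last = stores[-1]
    # f[t][x] = max expected revenue from x gallons assigned to stores t..last
    f = {last: [expected_revenue(last, x) for x in range(total + 1)]}
    best = {last: list(range(total + 1))}  # the last store takes what is left
    for idx in range(len(stores) - 2, -1, -1):
        t, nxt = stores[idx], stores[idx + 1]
        f[t], best[t] = [], []
        for x in range(total + 1):
            value, g = max(
                (expected_revenue(t, g) + f[nxt][x - g], g) for g in range(x + 1)
            )
            f[t].append(value)
            best[t].append(g)
    # Walk forward through the stored decisions to recover an allocation.
    x, plan = total, {}
    for t in stores:
        plan[t] = best[t][x]
        x -= plan[t]
    return f[stores[0]][total], plan


revenue, plan = allocate()
print(revenue, plan)
```

With these placeholder probabilities the sketch reports a maximum expected revenue of about 9.75, which is an expected net profit of about 3.75 after the $6 purchase cost. More than one allocation attains that value, so the reported plan depends on tie-breaking.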

Slide 7

19.2 A Probabilistic Inventory Model
In this section, we modify the inventory model from Chapter 6 to allow for uncertain demand. This will illustrate the difficulties involved in solving a PDP for which the state during the next period is uncertain.

Slide 8

Each period's demand is equally likely to be 1 or 2 units. After meeting the current period's demand out of current production and inventory, the firm's end-of-period inventory is evaluated, and a holding cost of $1 per unit is assessed. Because of limited capacity, the inventory at the end of each period cannot exceed 3 units. It is required that all demand be met on time. Any inventory on hand at the end of period 3 can be sold at $2 per unit.

Slide 9

At the beginning of period 1, the firm has 1 unit of inventory. Use dynamic programming to determine a production policy that minimizes the expected net cost incurred during the three periods.

Slide 10

Solution
Define f_t(i) to be the minimum expected net cost incurred during periods t, t+1, …, 3 when the inventory at the beginning of period t is i units. Then

f_3(i) = min_x { c(x) + (½)(i + x - 1) + (½)(i + x - 2) - (½)(2)(i + x - 1) - (½)(2)(i + x - 2) }

where x must be a member of {0, 1, 2, 3, 4} and x must satisfy (2 - i) ≤ x ≤ (4 - i).

Slide 11

For t = 1, 2, we can derive the recursive relation for f_t(i) by noting that for any period t production level x, the expected costs incurred during periods t, t+1, …, 3 are the sum of the expected costs incurred during period t and the expected costs incurred during periods t+1, t+2, …, 3. As before, if x units are produced during period t, the expected cost during period t will be c(x) + (½)(i + x - 1) + (½)(i + x - 2). If x units are produced during period t, the expected cost during periods t+1, t+2, …, 3 is computed as follows.

Slide 12

Half of the time, the demand during period t will be 1 unit, and the inventory at the beginning of period t+1 will be i + x - 1. In this situation, the expected cost incurred during periods t+1, t+2, …, 3 is f_{t+1}(i + x - 1). Similarly, there is a ½ chance that the inventory at the beginning of period t+1 will be i + x - 2. In this case, the expected cost incurred during periods t+1, t+2, …, 3 will be f_{t+1}(i + x - 2). In summary, the expected cost during periods t+1, t+2, …, 3 will be (½) f_{t+1}(i + x - 1) + (½) f_{t+1}(i + x - 2).

Slide 13

With this in mind, we may write for t = 1, 2

f_t(i) = min_x { c(x) + (½)(i + x - 1) + (½)(i + x - 2) + (½) f_{t+1}(i + x - 1) + (½) f_{t+1}(i + x - 2) }    (3)

where x must be a member of {0, 1, 2, 3, 4} and x must satisfy (2 - i) ≤ x ≤ (4 - i). Generalizing the reasoning that led to the above equation yields the following important observation concerning the formulation of PDPs.

Slide 14

Suppose the possible states during period t+1 are s_1, s_2, …, s_n and the probability that the period t+1 state will be s_i is p_i. Then the minimum expected cost incurred during periods t+1, t+2, …, end of the problem is

Σ_{i=1}^{n} p_i f_{t+1}(s_i)

where f_{t+1}(s_i) is the minimum expected cost incurred from period t+1 to the end of the problem, given that the state during period t+1 is s_i.
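As a small illustration of this observation, the expected cost-to-go is a probability-weighted sum over next-period states. The helper name `expected_cost_to_go` and the numbers in the usage line are hypothetical, chosen only to show the arithmetic.

```python
def expected_cost_to_go(next_state_probs, f_next):
    """Return the sum over i of p_i * f_{t+1}(s_i).

    next_state_probs: dict mapping each possible next state s_i to p_i.
    f_next: dict mapping each state s_i to f_{t+1}(s_i).
    """
    total_prob = sum(next_state_probs.values())
    if abs(total_prob - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * f_next[s] for s, p in next_state_probs.items())


# Hypothetical values: next state is 1 or 0, each with probability 1/2.
f_next = {0: 6.5, 1: 4.5}
print(expected_cost_to_go({1: 0.5, 0: 0.5}, f_next))  # 5.5
```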

Slide 15

We define x_t(i) to be a period t production level attaining the minimum in (3) for f_t(i). We now work backward until f_1(1) is determined. The relevant computations are summarized in the tables in the book. Since each period's ending inventory must be nonnegative and cannot exceed 3 units, the state during each period must be 0, 1, 2, or 3.
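The backward pass can be sketched in Python. The slides use the production cost c(x) without defining it, so the cost function below (no cost for x = 0, otherwise a setup cost of 3 plus 2 per unit) is an assumption made only to get a runnable example. The function names `c`, `feasible`, and `f` are likewise illustrative.

```python
from functools import lru_cache

PERIODS = 3
MAX_INVENTORY = 3            # end-of-period inventory cap
MAX_PRODUCTION = 4           # x must be in {0, 1, 2, 3, 4}
DEMANDS = {1: 0.5, 2: 0.5}   # demand is equally likely to be 1 or 2
HOLDING = 1.0                # holding cost per unit of ending inventory
SALVAGE = 2.0                # sale price per unit left after period 3


def c(x):
    """ASSUMED production cost; not given in the slides."""
    return 0.0 if x == 0 else 3.0 + 2.0 * x


def feasible(i):
    """Production levels satisfying (2 - i) <= x <= (4 - i), 0 <= x <= 4."""
    lo = max(0, max(DEMANDS) - i)
    hi = min(MAX_PRODUCTION, MAX_INVENTORY + min(DEMANDS) - i)
    return range(lo, hi + 1)


@lru_cache(maxsize=None)
def f(t, i):
    """Return (f_t(i), x_t(i)): minimum expected net cost and a minimizer."""
    best = None
    for x in feasible(i):
        cost = c(x)
        for d, prob in DEMANDS.items():
            ending = i + x - d
            cost += prob * HOLDING * ending
            if t == PERIODS:
                cost -= prob * SALVAGE * ending      # sell leftover stock
            else:
                cost += prob * f(t + 1, ending)[0]   # expected cost-to-go
        if best is None or cost < best[0]:
            best = (cost, x)
    return best


value, first_x = f(1, 1)
print(value, first_x)
```

Under the assumed cost function this gives f_1(1) = 16.25 with 3 units produced in period 1. A different c(x) would change both numbers.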

Slide 16

(s, S) Policies
Consider the following modification of the dynamic lot-size model from Chapter 6, for which there exists an optimal production policy called an (s, S) inventory policy: The cost of producing x > 0 units during a period consists of a fixed cost K and a per-unit variable production cost c. With probability p(x), the demand during a given period will be x. A holding cost of h per unit is assessed on each period's ending inventory. If we are short, a per-unit shortage cost of d is incurred.

Slide 17

The goal is to minimize the total expected cost incurred during periods 1, 2, …, T. All demands must be met by the end of period T. For such an inventory problem, Scarf used dynamic programming to prove that there exists an optimal production policy of the following form: For each t (t = 1, 2, …, T) there exists a pair of numbers (s_t, S_t) such that if i_{t-1}, the entering inventory for period t, is less than s_t, then an amount S_t - i_{t-1} is produced; if i_{t-1} ≥ s_t, then it is optimal not to produce during period t. Such a policy is called an (s, S) policy.
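The decision rule itself is short enough to state as code. This sketch only applies a given pair (s_t, S_t); it does not compute the optimal thresholds. The function name and the threshold values in the usage line are hypothetical.

```python
def production_quantity(entering_inventory, s, S):
    """(s, S) rule: produce up to S when inventory is below s, else nothing."""
    if s > S:
        raise ValueError("expected s <= S")
    if entering_inventory < s:
        return S - entering_inventory
    return 0


# Hypothetical thresholds for one period: s_t = 2, S_t = 5.
print([production_quantity(i, s=2, S=5) for i in range(4)])  # [5, 4, 0, 0]
```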

Slide 18

19.3 How to Maximize the Probability of a Favorable Event Occurring
There are many occasions on which the decision maker's goal is to maximize the probability of a favorable event occurring. To solve such a problem, we assign a reward of 1 if the favorable event occurs and a reward of 0 if it does not occur. Then the maximization of expected reward will be equivalent to maximizing the probability that the favorable event will occur.
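To make the 0/1 reward idea concrete, here is a sketch on a hypothetical betting game that does not appear in these slides. A gambler bets whole dollars on each play, wins the amount bet with probability `p_win`, and otherwise loses it. The favorable event is holding at least `target` dollars after a fixed number of plays. All names and parameter values are assumptions for illustration.

```python
def max_success_probability(start, target, plays, p_win):
    """Maximum probability of holding at least `target` after `plays` bets."""
    # Terminal reward: 1 if the favorable event occurred, 0 otherwise.
    # Wealth is capped at `target`; once there, betting 0 keeps the reward.
    f = [1.0 if d >= target else 0.0 for d in range(target + 1)]
    for _ in range(plays):
        nxt = f
        f = []
        for d in range(target + 1):
            # Choose the bet b in {0, ..., d} that maximizes expected reward.
            f.append(max(
                p_win * nxt[min(d + b, target)] + (1 - p_win) * nxt[d - b]
                for b in range(d + 1)
            ))
    return f[min(start, target)]


print(max_success_probability(start=2, target=6, plays=4, p_win=0.4))
```

Because the terminal reward is an indicator of the favorable event, the expected reward returned by the recursion equals the probability of that event under the best betting strategy. For the parameters shown the result is about 0.1984.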

Slide 19

Martina's optimal strategy depends on the values of the parameters defining the problem. This is a kind of sensitivity analysis reminiscent of the sensitivity analysis that we applied to linear programming problems.

Slide 20

19.4 Further Examples of Probabilistic Dynamic Programming Formulations
Many probabilistic dynamic programming problems can be solved using recursions of the following form (for max problems):

Slide 21

Probabilistic Dynamic Programming Example
When Sally Mutton arrives at the bank, 30 minutes remain in her lunch break. If Sally makes it to the head of the line and enters service before the end of her lunch break, she earns reward r. However, Sally does not enjoy waiting in lines, so to reflect her dislike for waiting in line, she incurs a cost of c for each minute she waits. During a minute in which n people are ahead of Sally, there is a probability p(x|n) that x people will complete their transactions.

Slide 22

Suppose that when Sally arrives, 20 people are ahead of her in line. Use dynamic programming to determine a strategy for Sally that will maximize her expected net revenue (reward less waiting costs).

Slide 23

Solution
When Sally arrives at the bank, she must decide whether to join the line or give up and leave. At any later time, she may also decide to leave if it is unlikely that she will be served by the end of her lunch break. We can work backward to solve the problem. We define f_t(n) to be the maximum expected net reward that Sally can receive from time t to the end of her lunch break if at time t, n people are ahead of her.

Slide 24

We let t = 0 be the present and t = 30 be the end of the problem. Since t = 29 is the beginning of the last minute of the problem, we write

f_29(n) = max { 0 (leave), r p(n|n) - c (stay) }

This follows because if Sally leaves at time 29, she earns no reward and incurs no more costs.

Slide 25

On the other hand, if she stays at time 29, she will incur a waiting cost of c (a revenue of -c) and with probability p(n|n) will enter service and receive a reward r. Thus, if Sally stays, her expected net reward is r p(n|n) - c. For t < 29, we write

f_t(n) = max { 0 (leave), r p(n|n) - c + Σ_{k<n} p(k|n) f_{t+1}(n - k) (stay) }

Slide 26

The last recursion follows because if Sally stays, she will earn an expected reward (as in the t = 29 case) of r p(n|n) - c during the current minute, and with probability p(k|n), there will be n - k people ahead of her; in this case, her expected net reward from time t+1 to time 30 will be f_{t+1}(n - k).
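A sketch of Sally's recursion in Python follows. The slides give no numbers for r, c, or p(x|n), so the values of `R`, `C`, and `Q` and the service model in `p` (at most one person finishes per minute) are assumptions chosen only so the code runs. The table is filled backward from f_30(n) = 0, which reproduces the t = 29 case.

```python
R = 10.0      # ASSUMED reward r for entering service in time
C = 0.1       # ASSUMED waiting cost c per minute
Q = 0.8       # ASSUMED chance that one customer finishes in a given minute
HORIZON = 30  # minutes in the lunch break
AHEAD = 20    # people ahead of Sally on arrival


def p(k, n, q=Q):
    """ASSUMED p(k|n): at most one of the n people ahead finishes per minute."""
    if k == 1:
        return q
    if k == 0:
        return 1.0 - q
    return 0.0


def solve(r=R, c=C, prob=p, horizon=HORIZON, ahead=AHEAD):
    """Return tables f[t][n] and stay[t][n] for t = 0..horizon, n = 1..ahead."""
    f = [[0.0] * (ahead + 1) for _ in range(horizon + 1)]  # f[horizon][n] = 0
    stay = [[False] * (ahead + 1) for _ in range(horizon + 1)]
    for t in range(horizon - 1, -1, -1):
        for n in range(1, ahead + 1):
            # Staying: pay c, enter service with probability p(n|n),
            # otherwise continue with n - k people ahead.
            stay_value = r * prob(n, n) - c + sum(
                prob(k, n) * f[t + 1][n - k] for k in range(n)
            )
            f[t][n] = max(0.0, stay_value)  # leaving is worth 0
            stay[t][n] = stay_value > 0.0
    return f, stay


f, stay = solve()
print(f[0][AHEAD], stay[0][AHEAD])
```

Reading `stay[t][n]` gives the policy: Sally stays in line at time t with n people ahead exactly when the stay value is positive, and leaves otherwise.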
