Chapter 3

STATISTICAL DECISION THEORY

3.1. Problem presentation

Real situations require decisions that must be determined using various criteria. Unlike classical mathematical statistics, which develops theories and inference techniques for the parameter θ using only sample information, statistical decision theory combines the sample information with two other important ingredients: the possible consequences of each decision, and the a priori information about the parameter θ. Knowing the possible consequences of adopting different decisions presupposes a quantitative description of the gain or loss produced by each possible decision, for the various possible values of the parameter θ. This function, which depends on the decision adopted and on the parameter θ, appears in the technical literature under different names, such as gain function or utility function. The prior information about the parameter θ is obtained from sources other than the statistical ones involved in the decision problem at hand. It is information that can be gathered from previous experience with similar situations involving the parameter θ. This a priori information is quantified by a probability distribution over the parameter θ. Whether such a distribution actually exists is a much disputed question. Using decision theory with an a priori distribution corresponds to a Bayesian approach. Decision-theoretic problems can be placed inside the domain of game theory, which justifies placing this chapter after the one on game theory. The user may or may not have such prior experience available before taking the decision.
In this case we distinguish: decisions without prior experiment; decisions with a single experiment (or decisions with a given sample size, in which the decision is made after all observations have been collected); and sequential decisions (in which, after each observation, we can decide either to make a new observation or to take a terminal decision). Sections 3.2 and 3.3 of this chapter discuss decision-making algorithms for the situation in which there is no prior experiment and for the situation of an experiment of fixed size. Decisions can also be taken under uncertainty; Section 3.4 presents a series of criteria used for choosing optimal decisions in the case of uncertainty.

3.2. Bayes strategies and minimax strategies

Consider that the set of states of the unknown parameter θ is finite, Θ = {θ1, ..., θm}. Assume that we have a priori information about θ, given by the a priori probability distribution ξ(θ); this distribution can be regarded as a mixed strategy of nature. Assume that the following pure strategies are available: A = {a1, ..., an}. We denote by L(θ, a) the value of the loss function L : Θ × A → R+, that is, the loss incurred if the decision a is taken and the state of the parameter is θ. The expected loss is defined as

L(ξ, a) = M[L(θ, a)] = Σ_{θ∈Θ} L(θ, a) ξ(θ), for every a ∈ A.

We call a Bayes strategy an action a* that minimizes the expected loss:

L(ξ, a*) = min_{a∈A} L(ξ, a).

We call a minimax strategy an action a* that minimizes the maximum loss:

max_{θ∈Θ} L(θ, a*) = min_{a∈A} max_{θ∈Θ} L(θ, a).

If we do not limit ourselves to pure strategies, then we use a combination of pure strategies chosen according to a probability law; such a strategy is called a mixed strategy. Consider η(a) = (η(a1), ..., η(an)) the probability distribution that gives the probabilities with which the pure strategies a1, ..., an are used.
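As a small illustration of the two selection rules just defined, here is a Python sketch for pure strategies; the loss table and the prior below are hypothetical values chosen only for the example:

```python
# Sketch of Bayes and minimax selection over pure strategies.
# The loss table and the prior are hypothetical illustration values.

# losses[theta][a]: loss L(theta, a) when the state is theta and action a is taken
losses = {
    "theta1": {"a1": 0.0, "a2": 1.0},
    "theta2": {"a1": 2.0, "a2": 0.5},
}
prior = {"theta1": 0.7, "theta2": 0.3}  # a priori distribution xi(theta)
actions = ["a1", "a2"]

def expected_loss(a):
    """L(xi, a) = sum over theta of L(theta, a) * xi(theta)."""
    return sum(losses[t][a] * prior[t] for t in prior)

# Bayes strategy: minimize the expected loss
bayes_action = min(actions, key=expected_loss)

# Minimax strategy: minimize the worst-case loss over the states
minimax_action = min(actions, key=lambda a: max(losses[t][a] for t in losses))

print(bayes_action, minimax_action)
```

Note that the Bayes strategy uses the prior ξ(θ), while the minimax strategy ignores it; with the values above the two rules select different actions.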
In general, we have the mixed strategies H = {η1(a), ..., ηp(a)}. Then the expected loss is given by

L(ξ, η) = M_{θ,a}[L(θ, a)] = Σ_{θ∈Θ} Σ_{a∈A} L(θ, a) ξ(θ) η(a).

In this case, we must find a mixed strategy η* ∈ H for which the expected loss is minimal,

L(ξ, η*) = min_{η∈H} L(ξ, η),

or a mixed strategy η* ∈ H for which the maximum loss is minimal,

max_{θ∈Θ} L(θ, η*) = min_{η∈H} max_{θ∈Θ} L(θ, η).

3.3. The case of conducting a specific experiment

Suppose that the decision maker, in order to broaden his knowledge of the state of nature, chooses to perform a single experiment (for example, estimating the influence of a certain type of drug on a category of patients involves measuring daily, over a period of several months, the concentration of a particular protein compound for m patients from that category). Consider Z = {z1, ..., zl} the result space of the experiment. To each result z ∈ Z obtained when the state is θ ∈ Θ there corresponds a determined probability p(z|θ), which satisfies the relations

p(z|θ) ≥ 0, for every z ∈ Z, and Σ_{z∈Z} p(z|θ) = 1.

Definition 3.3.1. The triplet formed by the result space Z, the space of states of nature Θ and the conditional distribution p(z|θ) defined over Z, for each θ ∈ Θ, is called a sampling space. We denote it by E = (Z, Θ, p).

Definition 3.3.2. A decision function is a function d : Z → A that associates to each result zk ∈ Z an action aj ∈ A, j = 1, ..., n.

The loss suffered in the case in which the state of the parameter is θi, i = 1, ..., m, is given by

L(θi, aj) = L(θi, d(zk)) = L_{zk}(θi, d).

For a given θ, the result z of the experiment is a random variable governed by the conditional probability p(z|θ), so the loss L_z(θ, d) is also a random variable, realized with the same probability.

Definition 3.3.3.
We call a risk function the function ρ : Θ × D → R given by

ρ(θ, d) = M[L_z(θ, d)] = Σ_{z∈Z} L_z(θ, d) p(z|θ),

which represents the expected loss over the result space Z. We conclude that, in the strategy problem with a single experiment, the decision space D plays the same role as the space A in the strategy problem without experiment, so the solution methods for the two types of problems are the same. Mixed strategies η(d) defined over D can also be used. In that case the risk function becomes

ρ(θ, η) = M[ρ(θ, d)] = Σ_{d∈D} ρ(θ, d) η(d) = Σ_{d∈D} Σ_{z∈Z} L_z(θ, d) p(z|θ) η(d).

The minimax principle consists in choosing the strategy η*(d) for which the expected risk is smallest when the state of the parameter θ is the most unfavorable. The strategy η*(d) is chosen so that

max_{θ∈Θ} ρ(θ, η*) = min_{η} max_{θ∈Θ} ρ(θ, η),

and in this case the corresponding value of the risk is called the minimax risk. The Bayes principle minimizes the expected risk defined by

ρ(ξ, d) = Σ_{θ∈Θ} ρ(θ, d) ξ(θ).

The pure strategy d* is chosen so that

ρ(ξ, d*) = min_{d∈D} ρ(ξ, d).

3.4. Optimal decisions in uncertainty cases

Until now we have presented decision making under risk, meaning that the probabilities attached to the parameter states θ ∈ Θ are known (or can be determined). In the following paragraphs we deal with decision making under uncertainty, when the probabilities associated with θ are not known. Because the attitude toward a decision is subjective, within decision theory we do not always find a universally applicable criterion. We present some of the choice criteria for the case of uncertainty, with the mention that their use can lead to different outcomes. One way of choosing a decision is to select the strategy indicated by the result of applying several criteria.
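The choice criteria presented below admit a compact computational form. As a sketch, the following Python functions implement their pure-strategy versions on a hypothetical gain matrix q (rows = actions, columns = states):

```python
# Pure-strategy forms of four uncertainty criteria for a gain matrix q,
# where q[i][j] is the gain of action i under state j.
# The demonstration matrix is hypothetical.

def hurwicz(q, eps):
    """Maximize eps * max_j q_ij + (1 - eps) * min_j q_ij."""
    return max(range(len(q)), key=lambda i: eps * max(q[i]) + (1 - eps) * min(q[i]))

def savage(q):
    """Minimize the maximum regret b_ij = max_k q_kj - q_ij."""
    cols = list(zip(*q))
    regret = [[max(col) - row[j] for j, col in enumerate(cols)] for row in q]
    return min(range(len(q)), key=lambda i: max(regret[i]))

def laplace(q):
    """Maximize the average gain (all states equally probable)."""
    return max(range(len(q)), key=lambda i: sum(q[i]) / len(q[i]))

def wald(q):
    """Maximin: maximize the worst-case gain."""
    return max(range(len(q)), key=lambda i: min(q[i]))

q = [[12, -4], [5, 4], [8, 0]]  # hypothetical gains: 3 actions, 2 states
print(hurwicz(q, 0.9), savage(q), laplace(q), wald(q))
```

On this matrix an optimistic Hurwicz chooser (ε = 0.9) picks the first action, Savage picks the third, while Bayes-Laplace and Wald pick the second, illustrating the remark above that different criteria can lead to different outcomes.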
Consider the situation in which we use m pure strategies a1, ..., am, and for the parameter θ we consider n possible states θ1, ..., θn. The element qij represents the gain obtained if the action ai is adopted and the state of the parameter is θj, i = 1, ..., m, j = 1, ..., n.

3.4.1. Hurwicz criterion

The Hurwicz criterion, or optimism criterion, chooses as optimal strategy the action ai which corresponds to

max_i [ε Qi + (1 − ε) qi],

where ε ∈ [0, 1] is called the optimism coefficient of the decision maker, qi = min_j qij and Qi = max_j qij.

3.4.2. Savage criterion

The Savage criterion, or regret criterion, chooses as optimal strategy the action ai which achieves

min_i max_j bij = max_j min_i bij

(if a pure strategy cannot be chosen, meaning that the regret matrix does not have a saddle point, an optimal mixed strategy is used), where the regret matrix (the difference between the gain that could have been obtained had the state of nature been known and the gain actually obtained by deciding without knowing it) is defined as

bij = max_k qkj − qij, i = 1, ..., m, j = 1, ..., n.

3.4.3. Bayes-Laplace criterion

The Bayes-Laplace criterion chooses as optimal strategy the action ai which corresponds to

max_i [(1/n) Σ_{j=1}^{n} qij],

that is, it is assumed that all the states of nature have the same probability.

3.4.4. Wald criterion

The Wald criterion chooses, if the game has a saddle point, as optimal strategy the action ai which corresponds to the minimax principle. If the game does not have a saddle point, then the optimal mixed strategy is determined by the probabilities x1, ..., xm that maximize

min_j Σ_{i=1}^{m} qij xi.

3.5. Applications

Exercise 3.5.1. A cement factory line can use as raw material three types of sand, whose parameters are θ1, θ2 and θ3. We know that the production line can use about 60% of the first type, 30% of the second type and 10% of the third type, and can function in three modes a1, a2 and a3.
The losses are given in Table 3.1.

TABLE 3.1
Θ      θ1    θ2    θ3
ξ(θ)   0.6   0.3   0.1
a1     0     0.5   1
a2     α     2α    α
a3     2     1     3

Compute the expected loss with respect to the a priori distribution. What is the optimal strategy in the Bayes sense?

Solution. The expected loss corresponding to the a priori probability is:

L(ξ, a1) = 0 · 0.6 + 0.5 · 0.3 + 1 · 0.1 = 0.25
L(ξ, a2) = α · 0.6 + 2α · 0.3 + α · 0.1 = 1.3 α
L(ξ, a3) = 2 · 0.6 + 1 · 0.3 + 3 · 0.1 = 1.8.

The Bayes strategy is the one that minimizes the expected loss, namely the strategy a1 if α ≥ 5/26 or the strategy a2 if α ≤ 5/26. Note that for α = 5/26 we have two optimal strategies, a1 and a2.

Exercise 3.5.2. The same problem as 3.5.1, only this time with a mixed strategy with the frequencies 20%, 30% and 50%.

Exercise 3.5.3. Consider the game against nature, in matrix form:

TABLE 3.2
A/B   θ1   θ2   θ3   θ4   θ5
a1     3   −5    4    2    1
a2     2    3    1    0   −1
a3     4   −1    1    0    3
a4     5    0    3    2    4

The game matrix is written from the point of view of the statistical player A (the maximizing player) and represents the gain obtained by him. Apply the Hurwicz, Savage, Bayes-Laplace and Wald criteria for choosing the optimal decision.

Solution. i) For applying the Hurwicz criterion:

TABLE 3.3
      θ1   θ2   θ3   θ4   θ5   Qi   qi   εQi + (1 − ε)qi
a1     3   −5    4    2    1    4   −5   9ε − 5
a2     2    3    1    0   −1    3   −1   4ε − 1
a3     4   −1    1    0    3    4   −1   5ε − 1
a4     5    0    3    2    4    5    0   5ε

we determine max_i [εQi + (1 − ε)qi] = 5ε for every ε ∈ [0, 1]. This means that the optimal strategy is a4.

ii) For applying the Savage criterion we determine the regret matrix:

TABLE 3.4
(bij)      θ1   θ2   θ3   θ4   θ5   max_j bij
a1          2    8    0    0    3    8
a2          3    0    3    2    5    5
a3          1    4    3    2    1    4
a4          0    3    1    0    0    3
min_i bij   0    0    0    0    0

Since min_i max_j bij = 3 ≠ 0 = max_j min_i bij, the regret matrix does not have a saddle point, and we will determine an optimal mixed strategy using the simplex algorithm.
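As a cross-check, the regret matrix and the saddle-point test for this exercise can be reproduced with a short Python sketch:

```python
# Regret matrix b_ij = max_k q_kj - q_ij for the game of Exercise 3.5.3,
# and the saddle-point check for the regret-minimizing player.

q = [
    [3, -5, 4, 2, 1],   # a1
    [2,  3, 1, 0, -1],  # a2
    [4, -1, 1, 0, 3],   # a3
    [5,  0, 3, 2, 4],   # a4
]

col_max = [max(col) for col in zip(*q)]
b = [[col_max[j] - row[j] for j in range(len(row))] for row in q]
print(b)  # regret matrix

# The statistical player minimizes regret: a saddle point would require
# min_i max_j b_ij == max_j min_i b_ij.
upper = min(max(row) for row in b)        # min over rows of the row maxima
lower = max(min(col) for col in zip(*b))  # max over columns of the column minima
print(upper, lower)
```

The two printed values differ (3 versus 0), confirming that the regret matrix has no saddle point and a mixed strategy is needed.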
This problem can be solved with the MAPLE software. Since the statistical player wants to minimize the maximum regret, the corresponding linear program maximizes the sum of the (unnormalized) probabilities subject to the constraints given by the columns of the regret matrix:

> with(simplex):
> maximize(x + y + z + w, {2*x + 3*y + z <= 1, 8*x + 4*z + 3*w <= 1, 3*y + 3*z + w <= 1, 2*y + 2*z <= 1, 3*x + 5*y + z <= 1}, NONNEGATIVE);

with the solution x = 0, y = 1/5, z = 0, w = 1/3. The value of the regret game is 1/(1/5 + 1/3) = 15/8, and dividing the solution by this sum leads to the optimal mixed strategy:

x1 = 0, x2 = 3/8, x3 = 0, x4 = 5/8.

iii) For applying the Bayes-Laplace criterion we consider that the states θj have the same probability:

TABLE 3.5
      θ1   θ2   θ3   θ4   θ5   (1/n) Σ_j qij
a1     3   −5    4    2    1    1
a2     2    3    1    0   −1    1
a3     4   −1    1    0    3    7/5
a4     5    0    3    2    4    14/5

meaning that we choose the strategy a4 because it maximizes (1/n) Σ_{j=1}^{n} qij.

iv) For applying the Wald criterion, because the game does not have a saddle point, we choose a mixed strategy with the probabilities x1, ..., xm which maximize min_j Σ_{i=1}^{m} qij xi. The optimal strategy is determined using the simplex algorithm. The MAPLE syntax for solving this linear programming problem is:

> with(simplex):
> minimize(x + y + z + w, {3*x + 2*y + 4*z + 5*w >= 1, -5*x + 3*y - z >= 1, 4*x + y + z + 3*w >= 1, 2*x + 2*w >= 1, x - y + 3*z + 4*w >= 1}, NONNEGATIVE);

and the solution is x = 0, y = 1/3, z = 0, w = 1/2. The value of the game is 1/(1/3 + 1/2) = 6/5, and dividing the solution by this sum leads to the optimal strategy:

x1 = 0, x2 = 2/5, x3 = 0, x4 = 3/5;

the value x3 = 0 is explained by the fact that the strategy a3 is dominated by a4 (a3 ≺ a4).

Exercise 3.5.4. Consider the game against nature, in matrix form:

TABLE 3.6
A/B   θ1   θ2   θ3   θ4   θ5
a1     3   −5    4    2    1
a2     2    3    1    0   −1
a3     4   −1    1    0    3
a4     5    0    3    2    4

The game matrix is written from the point of view of player A (the maximizing player) and represents the gain obtained by him. Over the parameter θ we have the a priori distribution (0.2; 0.1; 0.1; 0.1; 0.6). What is the expected gain for this a priori distribution? What is the optimal Bayes action?

Exercise 3.5.5. (Two envelopes paradox.)
Consider the following situation: you are asked to select at random one of two identical envelopes, each containing a certain sum of money. One envelope contains twice as much as the other. After you pick one of the envelopes at random, you may keep it or you are given the possibility to take the other envelope instead. At this point you reason as follows:

1. Denote by X the amount of money in the selected envelope;
2. The probability that X is the smaller amount is 1/2, and the probability that it is the larger amount is also 1/2;
3. The other envelope may contain either 2X or X/2;
4. If X is the smaller amount, then the other envelope contains 2X;
5. If X is the larger amount, then the other envelope contains X/2;
6. Thus the other envelope contains 2X with probability 1/2 and X/2 with probability 1/2;
7. So the expected value of the money in the other envelope is
(1/2) · 2X + (1/2) · (X/2) = (5/4) X;
8. This is greater than X, so I gain on average by swapping;
9. After the switch, I can denote the content of the new envelope by Y and reason in exactly the same manner as above;
10. I will conclude that the most rational thing to do is to swap back again;
11. To be rational, I will thus end up swapping envelopes indefinitely;
12. As it seems more rational to open just any envelope than to swap indefinitely, we have a contradiction.

Find the flaw in the reasoning above.

Solution. X stands for different things at different places in the expected value calculation of step 7: in the first term X is the smaller amount, while in the second term X is the larger amount. Mixing different instances of a variable in the same formula like this is illegitimate, so step 7 is incorrect, and this is the cause of the paradox. Denoting by X the smaller amount, the corrected calculation is

(1/2) · X + (1/2) · 2X = (3/2) X.

According to the correct calculation there is no gain from swapping envelopes, and certainly no need to swap indefinitely.

Exercise 3.5.6. (Monty Hall problem.)
In the Monty Hall problem a prize (a trip to Las Vegas) is placed by the organizer of the game into a box. The organizer also places, into two similar boxes, two consolation prizes: a pencil each. We are asked to select one box (without opening it yet). After the selection, the organizer opens one of the remaining boxes which has a pencil inside. At this point the organizer gives us two options: changing the decision (selecting the other closed box) or keeping the initial box. What should we do to increase our chances: change or not?

Solution. If we change our choice we double the probability of winning, from 1/3 to 2/3. Indeed, the initially selected box contains the prize with probability 1/3, and switching wins exactly when the initial choice was wrong, which happens with probability 2/3.

Exercise 3.5.7. (Birthday paradox.) You are the teacher of a class of 23 students. Compute:
a) the probability that at least two students have the same birthday;
b) the probability that at least one student was born on the same day as you.

Solution.
a) P = 1 − (365 choose 23) · 23! / 365^23 = 1 − (365 · 364 · ... · 343)/365^23 ≈ 0.507
b) P = 1 − (364/365)^23 ≈ 0.061
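The two birthday probabilities can be verified numerically, for instance with this minimal Python sketch of the closed-form counts:

```python
from math import perm

n = 23  # number of students

# a) P(at least two of the 23 students share a birthday)
#    = 1 - 365 * 364 * ... * 343 / 365^23
p_shared = 1 - perm(365, n) / 365**n
print(round(p_shared, 3))

# b) P(at least one student is born on the teacher's birthday)
#    = 1 - (364/365)^23
p_teacher = 1 - (364 / 365)**n
print(round(p_teacher, 3))
```

The first probability exceeds 1/2 already at 23 students, while the second (which fixes one particular day) is only about 6%; this gap is exactly what makes the result feel paradoxical.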