
5.2 Decision-making Strategies

5.2.3 Decomposition Algorithm

Although the Nash equilibrium solves the problem of selecting the optimal joint decisions for all the players in a game $G_j$, the computation remains very expensive because of the large number of players and of possible decisions per player. Specifically, the space complexity of this game, $O(G_j)$, is determined by the actual number of players, $N_j$, and the number of decisions of each player, $|d_{i,j}|$, for all $i \in \Omega_{N_j} = \{1, 2, \cdots, N_j\} \subset \Omega_N$. The space complexity of each game $G_j$ and of the lane-changing system can be expressed as in (5.11) and (5.12), respectively:

$$O(G_j) = |d_{1,j}| \cdot |d_{2,j}| \cdots |d_{N_j,j}| = |d_{i,j}|^{\sum_{i=1}^{n} Cel_{i,j}}, \quad \forall i \in \Omega_{N_j}, \forall j \in \Omega_J \tag{5.11}$$

$$O(Sys) = \sum_{j=1}^{J} O(G_j) = \sum_{j=1}^{J} |d_{i,j}|^{\sum_{i=1}^{n} Cel_{i,j}}, \quad \forall i \in \Omega_{N_j}, \forall j \in \Omega_J \tag{5.12}$$

where $|d_{i,j}|$ is the number of decisions, which can be obtained from Table 5.1 (i.e., 7). The number of decisions is therefore the same for every player, i.e., $|d_{1,j}| = |d_{2,j}| = \cdots = |d_{N_j,j}| = |d_{i,j}|$.

Since the number of players in each game is dynamic, $N_j$ is determined by the sum of the existing players of game $G_j$, i.e., $\sum_{i=1}^{n} Cel_{i,j}$. As shown in (5.11), the space complexity grows exponentially as the number of players increases. The left side of Fig. 5.3 shows the connections among the players in this system as a graph.

Nodes represent players, corresponding to the vehicles in the same game. An edge between two players means that the decision made by one player influences the other's utility function. The space complexity can therefore be calculated by (5.11) as $O(G_j) = |d_{1,j}| \cdot |d_{2,j}| \cdots |d_{5,j}| = 7^5 = 16807$. Searching such a large space costs a lot of computational time. However, the time complexity is difficult to state, since the time needed to find an equilibrium solution is unknown. The time complexity of finding an equilibrium solution remains an important and long-standing open problem, and it depends on many factors such as the game type, the number of players and the decision combinations (Daskalakis et al., 2009).
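To make the size in (5.11) concrete, the following minimal Python sketch counts the joint decision combinations of one game. The helper name `joint_decision_space` is a hypothetical name chosen for illustration; the values 7 and 5 are taken from Table 5.1 and Fig. 5.3.

```python
from math import prod

def joint_decision_space(decision_counts):
    """Size of the joint decision space in (5.11):
    the product of |d_{i,j}| over all players of one game."""
    return prod(decision_counts)

# Five players (Fig. 5.3), each with 7 possible decisions (Table 5.1).
print(joint_decision_space([7] * 5))  # 16807
```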

Thus, to reduce the complexity and improve the computational efficiency, the game $G_j$ can be decomposed into several smaller subgames when the number of players exceeds a certain number. Similarly to the whole game, each subgame of $G_j$ can be formulated as a 3-tuple $SG_{k,j} = \langle SP_{k,j}, SD_{k,j}, SU_{k,j} \rangle$, where $SP_{k,j} = \{\cdots, p^k_{i,j}, \cdots\} \subset P_j$ is the finite set of players in the $k$-th subgame $SG_{k,j}$, $SD_{k,j} = (\cdots, d^k_{i,j}, \cdots)$ is the vector of joint decisions of all the players in $SG_{k,j}$, and $SU_{k,j} = (\cdots, u^k_{i,j}, \cdots)$ is the utility vector of all the players in $SG_{k,j}$, with $i \in \Omega_{N_j^k} \subset \Omega_{N_j}$ and $k \in \Omega_{K_j} = \{1, 2, \cdots, K_j\}$. According to the definitions in (Iwase and Shiga, 2016), the following conditions should be satisfied:

$$d_{i,j} = d^k_{i,j}; \quad u_{i,j}(SD_{k,j}) = u^k_{i,j}(SD_{k,j}), \quad \forall i \in \Omega_{N_j^k} \subset \Omega_{N_j}, \forall j \in \Omega_J, \forall k \in \Omega_{K_j} \tag{5.13}$$

$$P_j = SP_{1,j} \cup SP_{2,j} \cup \cdots \cup SP_{K_j,j} \tag{5.14}$$

where $u^k_{i,j}(SD_{k,j})$ is the utility value of player $p^k_{i,j}$ under the joint decision $SD_{k,j}$. All the subgames $\{SG_{1,j}, SG_{2,j}, \cdots, SG_{K_j,j}\}$ together constitute the whole game $G_j$.

As shown on the right side of Fig. 5.3, the whole game $G_j$ is decomposed into three subgames $\{SG_{1,j}, SG_{2,j}, SG_{3,j}\}$, and each subgame contains three players, i.e., $SP_{1,j} = \{p^1_{1,j}, p^1_{2,j}, p^1_{3,j}\}$, $SP_{2,j} = \{p^2_{2,j}, p^2_{3,j}, p^2_{4,j}\}$ and $SP_{3,j} = \{p^3_{3,j}, p^3_{4,j}, p^3_{5,j}\}$, respectively.

Thus, the total space complexity can be obtained directly from the decomposed subgames and (5.11), i.e.,

$$O(G_j) = \sum_{k=1}^{K_j} \Big( \prod_{i \in \Omega_{N_j^k}} |d^k_{i,j}| \Big), \quad \forall i \in \Omega_{N_j^k} \subset \Omega_{N_j}, \forall j \in \Omega_J, \forall k \in \Omega_{K_j} \tag{5.15}$$

In this system, the vehicles, as players with the same properties, are connected to each other in the whole game, so the subgames in the decomposition are equal and symmetrical.

Owing to this property, the number of players $m_k$ in each decomposed subgame of game $G_j$ is equal, i.e., $m_1 = m_2 = \cdots = m_k = \cdots = m_{K_j} = M$, where $M$ is a constant. $K_j$ is then determined by $M$, i.e., $K_j = N_j - M + 1$. Therefore, the computational complexity in (5.15) can be simplified to:

$$O(G_j) = |d^k_{i,j}|^{M} \ast K_j = |d_{i,j}|^{M} \ast \Big( \sum_{i=1}^{n} Cel_{i,j} - M + 1 \Big) \tag{5.16}$$

where $N_j = \sum_{i=1}^{n} Cel_{i,j}$ as mentioned above and $d_{i,j} = d^k_{i,j}$ according to (5.13). For the case shown in Fig. 5.3, substituting $|d_{i,j}| = 7$ and $M = 3$ into (5.16) gives $O(G_j) = 7^3 \ast 3 = 1029$, which is dramatically smaller than the space complexity $7^5 = 16807$ of the single game $G_j$ in (5.11). The space complexity of the system is reduced accordingly when (5.16) is substituted into (5.12).
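The decomposition and the resulting saving can be illustrated with a short Python sketch. It assumes sliding-window subgames (subgame $k$ holds players $k, \cdots, k+M-1$), which matches the example in Fig. 5.3; the function names `decompose` and `decomposed_size` are hypothetical.

```python
def decompose(num_players, m):
    """Sliding-window subgames: subgame k holds players k, ..., k+m-1 (1-indexed).
    Games with at most m players are left undecomposed."""
    if num_players <= m:
        return [list(range(1, num_players + 1))]
    k_j = num_players - m + 1  # K_j = N_j - M + 1
    return [list(range(k, k + m)) for k in range(1, k_j + 1)]

def decomposed_size(num_players, m, num_decisions):
    """Space complexity after decomposition, as in (5.15)/(5.16):
    the sum over subgames of |d|^(players in the subgame)."""
    return sum(num_decisions ** len(sg) for sg in decompose(num_players, m))

print(decompose(5, 3))                   # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(7 ** 5, decomposed_size(5, 3, 7))  # 16807 1029
```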

Fig. 5.3: The graphic process of decomposing games.

Fig. 5.4: The schematic of decomposing games in the lane-changing system.

Selecting the decisions from the subgames and combining them into a whole decision combination for all the players in the whole game is a challenging task in the decomposition. The decomposition happens when the number of players in a whole game exceeds a certain number, i.e., 3 in this case. As shown in Fig. 5.4, the whole game with four players is decomposed into two subgames (i.e., $SG_{1,j}$ and $SG_{2,j}$), and the whole game with five players is decomposed into three subgames (i.e., $SG_{1,j}$, $SG_{2,j}$ and $SG_{3,j}$). The Nash equilibrium solution of each subgame $SG_{k,j}$ can be obtained from (5.9) and (5.10); it is denoted $(\cdots, d^{*,k}_{i,j}, \cdots)$ and is obtained by solving:

$$u^k_{i,j}(\cdots, d^{*,k}_{i,j}, \cdots) \geq u^k_{i,j}(\cdots, d^{k}_{i,j}, \cdots), \quad \forall i \in \Omega_{N_j^k} \subset \Omega_{N_j}, \forall j \in \Omega_J, \forall k \in \Omega_{K_j} \tag{5.17}$$

where $(\cdots, d^{*,k}_{i,j}, \cdots)$ is the Nash equilibrium solution of the $k$-th subgame. The system is nonlinear because of the uncertainty of the collision constraints, so the situation is discussed in three cases: (a) completely consistent equilibrium, (b) partially consistent equilibrium, and (c) inconsistent equilibrium.
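As an illustration of condition (5.17), the sketch below enumerates the pure-strategy Nash equilibria of one small subgame by brute force. It is only a sketch under stated assumptions: the utility functions of (5.9) and (5.10) are not reproduced here, so `toy_utility` is a hypothetical stand-in (a gain for changing lanes, a large penalty when a neighbouring player changes at the same time), and the two decisions `"keep"` and `"change"` are placeholders for the seven decisions of Table 5.1.

```python
from itertools import product

def pure_nash_equilibria(players, decisions, utility):
    """Keep every joint decision in which no player can raise its own
    utility by deviating unilaterally, i.e. condition (5.17)."""
    equilibria = []
    for joint in product(decisions, repeat=len(players)):
        profile = dict(zip(players, joint))
        if all(
            utility(p, profile) >= utility(p, {**profile, p: alt})
            for p in players
            for alt in decisions
        ):
            equilibria.append(profile)
    return equilibria

def toy_utility(player, profile):
    """Hypothetical utility: +1 for changing lanes, -10 if a neighbour also changes."""
    if profile[player] != "change":
        return 0
    clash = any(profile.get(q) == "change" for q in (player - 1, player + 1))
    return 1 - (10 if clash else 0)

for eq in pure_nash_equilibria([1, 2, 3], ("keep", "change"), toy_utility):
    print(eq)
```

With this toy utility the subgame has two equilibria: either the middle player changes lanes alone, or the two outer players change while the middle one keeps its lane. A subgame can therefore have more than one solution, which is one reason the consistency cases below are needed.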

Definition 5.1 [Completely consistent equilibrium]. Let $\{SG_{1,j}, SG_{2,j}, \cdots, SG_{K_j,j}\}$ be the decomposition of game $G_j$, and let $SD^*_{k,j} = (\cdots, d^{*,k}_{i,j}, \cdots)$ be the Nash equilibrium of the decomposed subgame $SG_{k,j}$ for all $k \in \Omega_{K_j}$. A Nash equilibrium combination $(SD^*_{1,j} \cup SD^*_{2,j} \cup \cdots \cup SD^*_{K_j,j}) = (d^{*,1}_{1,j}, d^{*,2}_{2,j}, \cdots, d^{*,K_j}_{K_j,j}, d^{*,K_j}_{K_j+1,j}, \cdots, d^{*,K_j}_{K_j+M-1=N_j,j})$ of all decomposed subgames $\{SG_{1,j}, SG_{2,j}, \cdots, SG_{K_j,j}\}$ is a Completely Consistent Equilibrium (CCE) iff

$$d^{*,1}_{i,j} = d^{*,2}_{i,j} = \cdots = d^{*,k}_{i,j} = \cdots = d^{*,K_j}_{i,j}, \quad \forall i \in \Omega_{N_j}, \forall j \in \Omega_J, \forall k \in \Omega_{K_j} \tag{5.18}$$

Theorem 5.1 [CCE theorem]. If a CCE of all decomposed subgames $\{SG_{1,j}, SG_{2,j}, \cdots, SG_{K_j,j}\}$ exists, then the CCE is also a Nash equilibrium of the whole game $G_j$, i.e., $(d^{*}_{1,j}, d^{*}_{2,j}, \cdots, d^{*}_{N_j,j}) = (d^{*,1}_{1,j}, d^{*,2}_{2,j}, \cdots, d^{*,K_j}_{K_j,j}, d^{*,K_j}_{K_j+1,j}, \cdots, d^{*,K_j}_{K_j+M-1=N_j,j})$.

Proof. Since $SD^*_{k,j} = (\cdots, d^{*,k}_{i,j}, \cdots)$ is a Nash equilibrium of the subgame $SG_{k,j}$, $SD^*_{k,j}$ maximizes the utility value $u^k_{i,j}$ of all the players in each subgame as in (5.17), $\forall i \in \Omega_{N_j^k}, \forall k \in \Omega_{K_j}$. Thus, the CCE is a Nash equilibrium combination of all the subgames that also maximizes the utility values of all the players in game $G_j$ if the condition of Definition 5.1 is satisfied, i.e.,

$$u_{i,j}(d^{*}_{1,j}, d^{*}_{2,j}, \cdots, d_{i,j}, \cdots, d^{*}_{N_j,j}) \leq u_{i,j}(d^{*,1}_{1,j}, d^{*,2}_{2,j}, \cdots, d^{*,K_j}_{K_j,j}, d^{*,K_j}_{K_j+1,j}, \cdots, d^{*,K_j}_{K_j+M-1=N_j,j}), \quad \forall i \in \Omega_{N_j}, \forall j \in \Omega_J \tag{5.19}$$

Fig. 5.4 shows the example of a game with 5 players in this lane-changing system, in which the game $G_j$ is decomposed into three subgames (i.e., $SG_{1,j}$, $SG_{2,j}$ and $SG_{3,j}$). Each subgame $SG_{k,j}$, $k = 1, 2, 3$, has a Nash equilibrium solution, i.e., $(d^{*,1}_{1,j}, d^{*,1}_{2,j}, d^{*,1}_{3,j})$, $(d^{*,2}_{2,j}, d^{*,2}_{3,j}, d^{*,2}_{4,j})$ and $(d^{*,3}_{3,j}, d^{*,3}_{4,j}, d^{*,3}_{5,j})$ for $SG_{1,j}$, $SG_{2,j}$ and $SG_{3,j}$, respectively. According to Definition 5.1, $(d^{*,1}_{1,j}, d^{*,2}_{2,j}, d^{*,3}_{3,j}, d^{*,3}_{4,j}, d^{*,3}_{5,j})$ is a CCE of game $G_j$ iff

$$d^{*,1}_{1,j}, \quad d^{*,1}_{2,j} = d^{*,2}_{2,j}, \quad d^{*,1}_{3,j} = d^{*,2}_{3,j} = d^{*,3}_{3,j}, \quad d^{*,2}_{4,j} = d^{*,3}_{4,j}, \quad d^{*,3}_{5,j} \tag{5.20}$$
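The consistency condition (5.18) amounts to checking that every player shared by several subgames receives the same decision in each of them. A minimal Python sketch of this check follows; the function name `merge_if_cce` and the decision labels are hypothetical and only serve the illustration.

```python
def merge_if_cce(subgame_solutions):
    """subgame_solutions: one dict {player index: decision} per subgame.
    Returns the merged whole-game decisions if (5.18) holds, else None."""
    merged = {}
    for solution in subgame_solutions:
        for player, decision in solution.items():
            # setdefault stores the first decision seen for this player
            if merged.setdefault(player, decision) != decision:
                return None  # a shared player received conflicting decisions
    return merged

# Five players, three subgames of three players each (layout of Fig. 5.4).
sg1 = {1: "keep", 2: "left", 3: "keep"}
sg2 = {2: "left", 3: "keep", 4: "right"}
sg3 = {3: "keep", 4: "right", 5: "keep"}
print(merge_if_cce([sg1, sg2, sg3]))
# {1: 'keep', 2: 'left', 3: 'keep', 4: 'right', 5: 'keep'}
```

If any overlapping player disagrees, the function returns `None`, which corresponds to the PCE or ICE cases rather than a CCE.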

Definition 5.2 [Partially consistent equilibrium]. Let $\{SG_{1,j}, SG_{2,j}, \cdots, SG_{K_j,j}\}$ be the decomposition of game $G_j$, and let $k_s \in \Omega_{K_j}$ and $k_e \in \Omega_{K_j}$ be the start index (the first) and the end index (the last) of the partial subgames, respectively. A Nash equilibrium combination $(SD^*_{k_s,j} \cup \cdots \cup SD^*_{k,j} \cup \cdots \cup SD^*_{k_e,j}) = (d^{*,k_s}_{k_s,j}, \cdots, d^{*,k}_{i,j}, \cdots, d^{*,k_e}_{k_e+M-1,j})$ of the partial decomposed subgames $\{SG_{k_s,j}, \cdots, SG_{k,j}, \cdots, SG_{k_e,j}\}$ is a Partially Consistent Equilibrium (PCE) of the whole game $G_j$ iff

$$d^{*,k_s}_{i,j} = \cdots = d^{*,k}_{i,j} = \cdots = d^{*,k_e}_{i,j} \quad \text{and} \quad 1 \leq k_e - k_s < K_j,$$
$$\forall i \in \Omega_{N_j}, k_s \leq i \leq k_e + M - 1, \forall j \in \Omega_J, \forall k \in \Omega_{K_j}, k_s \leq k \leq k_e \tag{5.21}$$

Fig. 5.5: The schematic of the updating process of PCE. (Legend: the updating directions and the subgames of $E_{pc}^{1}$, $E_{pc}^{2}$ and $E_{pc}^{3}$.)

A PCE determines the decisions of the players in the partial subgames. These decisions maximize the utility values in the partial subgames. The combination of these decisions also maximizes the utility values of the corresponding players in the game $G_j$, which follows from Theorem 5.1. Because the system is symmetric, there can be multiple PCE solutions, as shown in Fig. 5.5. $E_{pc}^{*}$ is defined as the optimal PCE, i.e., the one that includes the most subgames ($k_e - k_s = \max$). There are three categories of PCE, classified by the starting point of the subgames and denoted $E_{pc}^{y}$, $y = 1, 2, 3$. For example, the procedure of updating $E_{pc}^{1}$ (i.e., the first type of PCE) starts from the first subgame, i.e., $k_s = 1$, and satisfies (5.21), so $E_{pc}^{1}$ can be represented by $(d^{*,1}_{1,j}, \cdots, d^{*,k}_{i,j}, \cdots, d^{*,k_e}_{k_e+M-1,j})$, $\forall k \in \Omega_{K_j}$, $2 \leq k_e < K_j$. Generally, the optimal $E_{pc}^{1}$, written $E_{pc}^{*,1}$, with the maximum $k_e$ is chosen to start the procedure. These decisions are selected to dominate the whole game $G_j$ since they maximize the utility values of the players $p_1, p_2, \cdots, p_{k_e+M-1}$. The optimal decisions of the remaining players in game $G_j$ are then selected in a certain direction based on $E_{pc}^{*,1}$, which means that each neighbouring player updates its decision under the collision constraints propagated by the previous subgames. The whole decision updating process can be expressed as follows:

$$\begin{cases}
d^{*,K_j}_{N_j,j} = \arg\max u^{K_j}_{N_j,j}(d^{K_j}_{N_j,j} \mid E_{pc}^{*,1}), & \text{if } k_e = K_j - 1 & \text{(5.22a)} \\
(d^{*,k_e+1}_{k_e+M,j}, \cdots, d^{*,K_j}_{N_j,j}) = NashE(u^{k_e+1}_{k_e+M,j}, \cdots, u^{K_j}_{N_j,j} \mid E_{pc}^{*,1}), & \text{if } K_j - M \leq k_e < K_j - 1 & \text{(5.22b)} \\
\text{see (5.23)}, & \text{if } 2 \leq k_e < K_j - M & \text{(5.22c)}
\end{cases}$$

where $NashE(\alpha \mid \beta)$ is defined as the function that computes a conditional Nash equilibrium from two parts, $\alpha$ and $\beta$. This function finds the Nash equilibrium among the remaining players for the utility combination given as the first part $\alpha$, based on (5.9), given that another event $\beta$ has occurred, which is $E_{pc}^{*,1}$ in this case. The operation $\arg\max u^{K_j}_{N_j,j}(d^{K_j}_{N_j,j} \mid E_{pc}^{*,1})$ is also a conditional operation, similar to the computation of the conditional Nash equilibrium. If only the last player remains (i.e., $k_e = K_j - 1$), its decision is updated by (5.22a) to maximize its utility given the determined $E_{pc}^{*,1}$. If the number of remaining players who have not updated their decisions lies in the range $[2, M]$ (i.e., $K_j - M \leq k_e < K_j - 1$), a conditional Nash equilibrium among these players given $E_{pc}^{*,1}$ is computed by (5.22b). The last case (5.22c) updates the decisions when more than $M$ players remain, and each step updates the decisions of at most $M$ players (i.e., the unit number of players in a subgame).
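The forward updating process of (5.22) can be sketched in Python as follows. This is an illustrative sketch under several assumptions, not the implementation used in this work: `conditional_nash` is a hypothetical brute-force stand-in for $NashE(\alpha \mid \beta)$ restricted to pure strategies (it returns the first equilibrium found), `toy_utility` is a made-up utility replacing (5.9), and the leader decisions play the role of $E_{pc}^{*,1}$.

```python
from itertools import product

def conditional_nash(free_players, fixed, decisions, utility):
    """Pure Nash equilibrium among free_players (alpha) while the
    decisions in `fixed` (beta) stay unchanged. Returns None if none exists."""
    for joint in product(decisions, repeat=len(free_players)):
        profile = {**fixed, **dict(zip(free_players, joint))}
        if all(
            utility(p, profile) >= utility(p, {**profile, p: alt})
            for p in free_players
            for alt in decisions
        ):
            return {p: profile[p] for p in free_players}
    return None

def extend_forward(leader, num_players, m, decisions, utility):
    """Update the players after the leader block in chunks of at most m,
    each chunk conditioned on everything decided so far."""
    decided = dict(leader)
    next_player = max(decided) + 1
    while next_player <= num_players:
        chunk = list(range(next_player, min(next_player + m, num_players + 1)))
        result = conditional_nash(chunk, decided, decisions, utility)
        if result is None:
            raise ValueError(f"no pure conditional equilibrium for players {chunk}")
        decided.update(result)
        next_player = chunk[-1] + 1
    return decided

def toy_utility(player, profile):
    """Hypothetical utility: +1 for changing lanes, -10 if a neighbour also changes."""
    if profile[player] != "change":
        return 0
    clash = any(profile.get(q) == "change" for q in (player - 1, player + 1))
    return 1 - (10 if clash else 0)

leader = {1: "change", 2: "keep", 3: "change"}  # stands in for E_pc^{*,1}
print(extend_forward(leader, 6, 3, ("keep", "change"), toy_utility))
```

A chunk containing a single player reduces to the conditional arg max of (5.22a), a chunk of 2 to $M$ players corresponds to (5.22b), and repeated full chunks of $M$ players correspond to the recursive case (5.22c).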

Theorem 5.2 [PCE theorem]. If $E_{pc}^{1} = (SD^*_{k_s=1,j} \cup \cdots \cup SD^*_{k,j} \cup \cdots \cup SD^*_{k_e,j})$ of partial decomposed subgames exists, then the decision updating process of the remaining players, whose number exceeds $M$, satisfies the recursion property iff $2 \leq k_e < K_j - M$ and, for every integer $x$ with $0 < x < (K_j - k_e)/M$,

$$(d^{*,k_e+(x-1)M+1}_{k_e+xM,j}, \cdots, d^{*,k_e+xM}_{k_e+(x+1)M-1,j}) = NashE(u^{k_e+(x-1)M+1}_{k_e+xM,j}, \cdots, u^{k_e+xM}_{k_e+(x+1)M-1,j} \mid (E_{pc}^{*,1}, NashE(u^{k_e+(x-2)M+1}_{k_e+(x-1)M,j}, \cdots, u^{k_e+(x-1)M}_{k_e+xM-1,j} \mid (E_{pc}^{*,1}, NashE(\cdots \mid E_{pc}^{*,1}))))) \tag{5.23}$$

Proof. Referring to Fig. 5.5, the first step in (5.24a) updates $M$ decisions based on $E_{pc}^{*,1}$. Following the updating direction of $E_{pc}^{*,1}$ in Fig. 5.5, the second step in (5.24b) updates another $M$ decisions based on $E_{pc}^{*,1}$ and the result of the first step. Each subsequent step is updated on the basis of the previous step, and this updating process continues until (5.24d). Thus, the expression (5.23) is obtained by substituting (5.24a) into (5.24b), (5.24b) into (5.24c), and so on, until (5.24d).

Finally, the number of remaining players who have not updated their decisions decreases as (5.22c) is applied, and (5.22a) or (5.22b) is used once the number of remaining players is small enough to satisfy the corresponding condition.

$$(d^{*,k_e+1}_{k_e+M,j}, \cdots, d^{*,k_e+M}_{k_e+2M-1,j}) = NashE(u^{k_e+1}_{k_e+M,j}, \cdots, u^{k_e+M}_{k_e+2M-1,j} \mid E_{pc}^{*,1}) \tag{5.24a}$$

$$(d^{*,k_e+M+1}_{k_e+2M,j}, \cdots, d^{*,k_e+2M}_{k_e+3M-1,j}) = NashE(u^{k_e+M+1}_{k_e+2M,j}, \cdots, u^{k_e+2M}_{k_e+3M-1,j} \mid (E_{pc}^{*,1}, (d^{*,k_e+1}_{k_e+M,j}, \cdots, d^{*,k_e+M}_{k_e+2M-1,j}))) \tag{5.24b}$$

$$\cdots \tag{5.24c}$$

$$(d^{*,k_e+(x-1)M+1}_{k_e+xM,j}, \cdots, d^{*,k_e+xM}_{k_e+(x+1)M-1,j}) = NashE(u^{k_e+(x-1)M+1}_{k_e+xM,j}, \cdots, u^{k_e+xM}_{k_e+(x+1)M-1,j} \mid (E_{pc}^{*,1}, (d^{*,k_e+(x-2)M+1}_{k_e+(x-1)M,j}, \cdots, d^{*,k_e+(x-1)M}_{k_e+xM-1,j}))) \tag{5.24d}$$

Corollary 5.1 [PCE corollary]. If $E_{pc}^{2}$ of partial decomposed subgames exists, i.e., $(SD^*_{k_s,j} \cup \cdots \cup SD^*_{k,j} \cup \cdots \cup SD^*_{k_e=K_j,j})$, then the decision updating process of the remaining players, whose number exceeds $M$, satisfies the recursion property iff $M + 1 < k_s \leq K_j - 1$ and, for every integer $x$ with $0 < x < (k_s - 1)/M$,

$$(d^{*,k_s-(x-1)M-1}_{k_s-(x-1)M-1,j}, \cdots, d^{*,k_s-xM}_{k_s-xM,j}) = NashE(u^{k_s-(x-1)M-1}_{k_s-(x-1)M-1,j}, \cdots, u^{k_s-xM}_{k_s-xM,j} \mid (E_{pc}^{*,2}, NashE(u^{k_s-(x-2)M-1}_{k_s-(x-2)M-1,j}, \cdots, u^{k_s-(x-1)M}_{k_s-(x-1)M,j} \mid (E_{pc}^{*,2}, NashE(\cdots \mid E_{pc}^{*,2}))))) \tag{5.25}$$

Proof. As shown in Fig. 5.5, $E_{pc}^{*,2}$ is the inverse of $E_{pc}^{*,1}$ if both include the same number of subgames (i.e., $k_e = K_j - k_s + 1$), so (5.25) can be deduced from (5.23) and (5.24).

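Meanwhile, the whole decision updating process can also be deduced from (5.22) and is expressed as follows:

$$\begin{cases}
d^{*,1}_{1,j} = \arg\max u^{1}_{1,j}(d^{1}_{1,j} \mid E_{pc}^{*,2}), & \text{if } k_s = 2 & \text{(5.26a)} \\
(d^{*,1}_{1,j}, \cdots, d^{*,k_s-1}_{k_s-1,j}) = NashE(u^{1}_{1,j}, \cdots, u^{k_s-1}_{k_s-1,j} \mid E_{pc}^{*,2}), & \text{if } 2 < k_s \leq M + 1 & \text{(5.26b)} \\
\text{see (5.25)}, & \text{if } M + 1 < k_s \leq K_j - 1 & \text{(5.26c)}
\end{cases}$$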

Similarly, Fig. 5.5 shows that the updating process of $E_{pc}^{*,3}$ is a combination of the updating processes of $E_{pc}^{*,1}$ and $E_{pc}^{*,2}$, so it follows (5.22) and (5.26).

The next problem is to select the $E_{pc}^{*,y}$ that brings the most advantage, i.e., the one that contributes the largest payoff to the players. Thus, the $E_{pc}^{*,y}$ with the maximum sum of utility values is selected, expressed as follows:

$$y^{*} = \arg\max_{y} \sum_{i=k_s}^{k_e+M-1} u^{k}_{i,j}(E_{pc}^{*,y}), \quad \forall i \in \Omega_{N_j}, \forall j \in \Omega_J, \forall k \in \Omega_{K_j} \tag{5.27}$$

where $y^{*}$ is the optimal index that determines the optimal $E_{pc}^{*,y}$, i.e., $E_{pc}^{*,y^{*}}$. Once $E_{pc}^{*,y^{*}}$ is decided, the direction of updating the decisions is also determined, which completes the selection.

For instance, Fig. 5.4 shows a specific case in this lane-changing system. According to Definition 5.2, $(d^{*,1}_{1,j}, d^{*,2}_{2,j}, d^{*,2}_{3,j}, d^{*,2}_{4,j})$ is a PCE of game $G_j$, specifically $E_{pc}^{*,1}$, iff

$$d^{*,1}_{1,j}, \quad d^{*,1}_{2,j} = d^{*,2}_{2,j}, \quad d^{*,1}_{3,j} = d^{*,2}_{3,j}, \quad d^{*,2}_{4,j} \tag{5.28}$$

In the same way, $E_{pc}^{*,2}$ can be obtained if it exists. Next, the best $E_{pc}^{*,y}$ (i.e., $E_{pc}^{*,y^{*}}$) is selected based on (5.27). Finally, the remaining players update their decisions based on (5.22) or (5.26), depending on which PCE was selected.
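Finding the PCE that covers the most subgames ($k_e - k_s = \max$) can be sketched as a search for the longest run of adjacent subgames whose equilibria agree. The sketch assumes sliding-window subgames, for which agreement between each adjacent pair implies agreement across the whole run, because any player shared by two subgames also belongs to every subgame between them. The function name `longest_consistent_run` and the decision labels are hypothetical.

```python
def longest_consistent_run(subgame_solutions):
    """Return (k_s, k_e), 1-indexed, of the longest run of adjacent subgames
    whose equilibria agree on every shared player, or None if no two agree.
    A run covering all subgames corresponds to a CCE rather than a PCE."""
    def agree(a, b):
        return all(a[p] == b[p] for p in a.keys() & b.keys())

    n = len(subgame_solutions)
    best = None
    start = 0  # 0-indexed start of the current run
    for k in range(1, n + 1):
        # close the current run at the end or when the next pair disagrees
        if k == n or not agree(subgame_solutions[k - 1], subgame_solutions[k]):
            if k - 1 > start and (best is None or k - 1 - start > best[1] - best[0]):
                best = (start + 1, k)
            start = k
    return best

sg1 = {1: "keep", 2: "left", 3: "keep"}
sg2 = {2: "left", 3: "keep", 4: "right"}
sg3 = {3: "left", 4: "right", 5: "keep"}  # disagrees with sg2 on player 3
print(longest_consistent_run([sg1, sg2, sg3]))  # (1, 2)
```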

Definition 5.3 [Inconsistent equilibrium]. Let $\{SG_{1,j}, SG_{2,j}, \cdots, SG_{K_j,j}\}$ be the decomposition of game $G_j$. A single Nash equilibrium $SD^*_{k,j} = (d^{*,k}_{k,j}, d^{*,k}_{k+1,j}, \cdots, d^{*,k}_{k+M-1,j})$ of each decomposed subgame $\{SG_{k_s,j}, \cdots, SG_{k,j}, \cdots, SG_{k_e,j}\}$ is an Inconsistent Equilibrium (ICE) of the whole game $G_j$ iff no single Nash equilibrium $SD^*_{k,j}$ can form a PCE with the Nash equilibrium of an adjacent decomposed subgame, $SD^*_{k\pm1,j}$, i.e.,

$$(d^{*,k}_{k+1,j} \neq d^{*,k+1}_{k+1,j} \parallel d^{*,k}_{k+2,j} \neq d^{*,k+1}_{k+2,j} \parallel \cdots \parallel d^{*,k}_{k+M-1,j} \neq d^{*,k+1}_{k+M-1,j}) \;\&\&\;$$
$$(d^{*,k}_{k,j} \neq d^{*,k-1}_{k,j} \parallel d^{*,k}_{k+1,j} \neq d^{*,k-1}_{k+1,j} \parallel \cdots \parallel d^{*,k}_{k+M-2,j} \neq d^{*,k-1}_{k+M-2,j}), \quad \forall j \in \Omega_J, \forall k \in \Omega_{K_j} \tag{5.29}$$

where $\&\&$ is the logical operator "and", and $\parallel$ is the logical operator "or". Unlike a PCE, the $k$-th ICE of the game $G_j$, defined as $E_{ic}^{k} = SD^*_{k,j}$, $k \in \Omega_{K_j}$, is a decision vector corresponding to the single subgame $SG_{k,j}$. The best $E_{ic}^{k^{*}}$ is selected by (5.30), analogously to (5.27):

$$k^{*} = \arg\max_{k} \sum_{i=k}^{k+M-1} u^{k}_{i,j}(E_{ic}^{k}), \quad \forall i \in \Omega_{N_j}, \forall j \in \Omega_J, \forall k \in \Omega_{K_j} \tag{5.30}$$
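The selection in (5.30) can be sketched in Python as picking the subgame whose own equilibrium yields the largest summed utility. The function name `best_subgame_by_utility`, the payoff table and the decision labels are hypothetical placeholders for the utilities defined in (5.9); ties are resolved in favour of the smallest $k$, which is a choice made only for this sketch.

```python
def best_subgame_by_utility(subgame_solutions, utility):
    """Return the 1-indexed k* of (5.30): the subgame whose equilibrium
    gives the largest sum of utilities over its own players."""
    totals = [
        sum(utility(player, solution) for player in solution)
        for solution in subgame_solutions
    ]
    return max(range(len(totals)), key=totals.__getitem__) + 1

# Hypothetical payoff per decision, standing in for the real utility function.
payoff = {"keep": 1, "left": 3, "right": 2}

def table_utility(player, solution):
    return payoff[solution[player]]

# Three mutually inconsistent subgame equilibria (an ICE situation).
sg1 = {1: "keep", 2: "keep", 3: "left"}     # total 5
sg2 = {2: "right", 3: "keep", 4: "right"}   # total 5
sg3 = {3: "right", 4: "left", 5: "left"}    # total 8
print(best_subgame_by_utility([sg1, sg2, sg3], table_utility))  # 3
```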

Algorithm 5.2 Game theory-based decomposition algorithm

Initialization: Initialize the number of lanes $n$, the number of players $N_j$ in the game $G_j$, the number of games $J$, and the constant number of players $M$ in each subgame;

for each game $G_j$, $\forall j \in \Omega_J = \{1, 2, \cdots, J\}$ do
    Calculate the number of players $N_j = \sum_{i=1}^{n} Cel_{i,j}$;
    Decompose game $G_j$ into $K_j = N_j - M + 1$ subgames $\{SG_{1,j}, SG_{2,j}, \cdots, SG_{k,j}, \cdots, SG_{K_j,j}\}$;
    for each subgame $SG_{k,j}$, $\forall k \in \Omega_{K_j} = \{1, 2, \cdots, K_j\}$ do
        Calculate $SD^*_{k,j} = NashE(SG_{k,j})$;
    end
    if $(SD^*_{1,j} \cup SD^*_{2,j} \cup \cdots \cup SD^*_{K_j,j})$ == CCE (see Definition 5.1) then
        $(d^{*}_{1,j}, d^{*}_{2,j}, \cdots, d^{*}_{N_j,j}) = (d^{*,1}_{1,j}, d^{*,2}_{2,j}, \cdots, d^{*,K_j}_{K_j,j}, d^{*,K_j}_{K_j+1,j}, \cdots, d^{*,K_j}_{K_j+M-1=N_j,j})$;
    else if $(SD^*_{k_s,j} \cup \cdots \cup SD^*_{k,j} \cup \cdots \cup SD^*_{k_e,j})$ == PCE (see Definition 5.2) then
        Choose $E_{pc}^{*} = (d^{*,k_s}_{k_s,j}, \cdots, d^{*,k}_{i,j}, \cdots, d^{*,k_e}_{k_e+M-1,j})$ when $k_e - k_s = \max$;
        Determine $E_{pc}^{*,y^{*}}$ from (5.27), $y^{*} \in \{1, 2, 3\}$;
        Update the other decisions $(d^{*,1}_{1,j}, \cdots, d^{*,k_s-1}_{k_s-1,j})$ and $(d^{*,k_e+1}_{k_e+M,j}, \cdots, d^{*,K_j}_{N_j,j})$ based on the updating process of the conditional Nash equilibrium in (5.22) and (5.26);
    else if $SD^*_{k,j}$ == ICE (see Definition 5.3), $\forall k \in \Omega_{K_j} = \{1, 2, \cdots, K_j\}$ then
        Choose the best $E_{ic}^{k^{*}}$ in (5.30);
        Update the other decisions following the decision updating process of PCE in (5.22) and (5.26), replacing only $E_{pc}^{*,y^{*}}$ with $E_{ic}^{k^{*}}$;
    end
    Combine the joint decisions from the decomposed subgames and assign them to game $G_j$;
end

The process of updating the decisions of an ICE is almost the same as the decision updating process of a PCE: it follows (5.22) and (5.26), replacing only $E_{pc}^{*,y^{*}}$ with $E_{ic}^{k^{*}}$. The only difference between $E_{pc}^{*,y^{*}}$ and $E_{ic}^{k^{*}}$ is the number of associated subgames, i.e., at least two subgames ($k_e - k_s \geq 1$, consistent with (5.21)) for $E_{pc}^{*,y^{*}}$ and a single subgame ($k_s = k_e$) for $E_{ic}^{k^{*}}$.

In summary, the theoretical analysis provides the foundation for constructing a heuristic decomposition algorithm, which is verified in the experimental results of the following section by comparing different methods. The degree of suboptimality is difficult to quantify, since it is affected by the irregular constraints (i.e., collision cases) of the nonlinear lane-changing system. Specifically, a collision that occurs in game $G_j$ under a given combination of decisions may not occur in the decomposed subgames $SG_{k,j}$ under the same combination, since the decomposed games have fewer constraints because they contain fewer players. This directly causes inconsistent Nash equilibrium solutions between the decomposed games $SG_{k,j}$ and the game $G_j$ (i.e., the PCE and ICE cases). The collision cases are therefore compatible between the decomposed games $SG_{k,j}$ and the game $G_j$ only in the CCE case. The incompatible collision cases account for a relatively small proportion of all collision cases, so CCE cases occur more often than the other two.

A hierarchical concept is introduced to update the PCE and ICE cases: the part of the consistent equilibrium with the largest utility value is treated as the leader, and the rest, treated as followers, are updated based on the selection of the leader. Thus, the actions (i.e., decisions) of the vehicles (i.e., players) in a lane-changing system can be selected by the game theory-based decomposition algorithm (i.e., Algorithm 5.2) according to the surrounding information. The decisions generated are either optimal (the CCE case) or close to optimal (the PCE or ICE case).

In document Multi-agent traffic control using game theory and reinforcement learning (pages 82-91)