Truel problem analysis

Author: gmnenad

Summary

Abstract

The truel problem is an old problem that can be summarized as:
Several people fight a multi-way duel.
Given their probabilities to hit, what is the probability of each of them winning, and whom should they choose as their optimal initial target?

This document presents solutions and analysis of different scenarios for the "classical truel" problem - one where players can have different probabilities to hit, shoot sequentially with an unlimited number of bullets, and where intentional missing is allowed. It contains several novel elements:

The novel part of the solution is that it is fast enough to be interactive even for visualization (where millions of individual pixels/solutions need to be calculated in a short time to present an AB graph with all possible hit probabilities for players A and B). That speed was achieved by implementing the default solution using mathematical calculation, while still providing the option of a conventional solution using simulation. The solutions support not only finding win probabilities for players with predefined initial targets, but also finding the actual optimal targets - another novel element not found in most previous truel-related papers. The mathematical solution is around 400 times faster than simulation when solving for predefined targets, and over 2000 times faster when finding optimal targets in truel scenarios. Another novel aspect of the solution is that it supports more than three players, both for solving and for visualization (AB graphs still show all hit probabilities for players A and B, with fixed but configurable hit probabilities for players C, D, ...) - although such variants are not technically "truels". To further improve speed and allow interactivity, Numba-accelerated Python functions are used, along with 'smart' optimizations for the AB graph that use dynamic interpolation/calculation.

The novel part of the truel analysis is that the interactive solution was used to demonstrate situations where it is optimal for some players to choose a "random" target, as opposed to targeting other players or intentionally missing. The document also demonstrates that self-targeting is not an optimal choice in any situation, and confirms that intentionally missing is the optimal choice in a large number of cases. While analyses of intentional misses and even self-targeting have been done previously, they generally lacked proof or demonstration of the situations in which those options would be the optimal choice - because most previous analyses did not solve for optimal targets.

Code for this document is available in the GitHub repository, or as a ZIP file on the gmnenad.com site (which also includes a precalculated cache file, saving around 45 minutes of initial execution).



Problem definition

This document solves a generalized version of the truel problem:

Original problem:
Three players fight a truel, where each can choose whom to try to shoot; then the next player shoots, and so on until only one winner is left. If the first player has a 1/3 chance to hit, the second 2/3, and the last a 100% chance to hit - what is the chance of the first player winning, and whom should he target at the start? (the optimal solution uses target = Miss)

Alternative problem:
If the first player has a 75% chance to hit, the second 55%, and the last a 100% chance to hit - what is the chance of the first player winning, and whom should he target at the start? (the optimal solution uses target = Random)

Assumptions for problems:

1. Solutions

There are two types of solutions to the truel problem in this document: exact mathematical calculation and simulation. Both support solving for win probabilities and for the optimal target order, and both support advanced targeting options like Miss, Self or Random targets. Simulation is slower to execute by a factor of 200-3000x, but even the mathematical calculation can be slow for a very large number of points (in 2D graphs) or a very large range of targets (especially for high random levels).

While different functions will be used (simulation, mathematical calculation, AB graphs), they all share certain input parameters - the same way of defining the problem:

function ( pHits, targets, search, ... )

Explanation of the main shared parameters:

Possible target choices

Options used when defining predefined targets, or possible targets in optimal searches. Each constant is usable as tg.Name, or directly by name. There are three places where these constants are applicable:

Constants defining possible targets are listed below. They can be used as members of the tg class, or as direct variables (so tg.Miss == Miss, tg.Optimal == Optimal, etc.):

1.1 Simulation as approximate solution

A Python function that simulates the truel problem for an arbitrary number of players, and finds not only the winning chances for each player but also the optimal targets.


If we have N players with given targets and hit probabilities for each player, simulate until one player is eliminated, and then:

The complexity of the solution differs depending on the supplied targets:

Finding whom to target (optimal targets):

A search for optimal targets is needed if any of the initial 'targets' was 'Optimal' (for example targets=[Optimal, C, B], asking the solution to find the optimal target for player A; or targets=[Optimal, Optimal, Optimal] to find optimal targets for every player), or, in the case of 4+ players, when we want to keep searching for optimal targets even after some players are eliminated (specified in the 'targetLater' parameter, which defaults to 'Optimal', although other options like 'Strongest' would not require additional time).

Finding the optimal solution with simulations requires simulating every specific combination of targets (i.e. Random, Miss, A, B, C, ...). If we have N players and all possible targets are allowed, i.e. search=[Others, Self, Random, Miss], then each player has (N+2) possible targets, and the total number of target combinations is $ nTargetCombinations = (N+2)^N $. For 3 players that amounts to 5^3 = 125 combinations, which all need to be simulated before the algorithm for the optimality search can be applied (explained in section 1.3). Therefore, solving for targets=[Optimal, Optimal, Optimal] takes 125 times longer than solving for the specific targets=[Miss, C, B]. For 4+ players, the time for the optimality search increases exponentially.
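As a quick sanity check of that combinatorial count, here is a minimal illustrative sketch (the real enumeration lives in the allTargetOptions function described later; encoding the options as plain integers is an assumption of this sketch):

```python
from itertools import product

# With N=3 players and search=[Others, Self, Random, Miss], each player can pick
# one of (N-1) opponents + Self + Random + Miss = N+2 = 5 options.
n_players = 3
options_per_player = (n_players - 1) + 3      # opponents + Self + Random + Miss
combos = list(product(range(options_per_player), repeat=n_players))
print(len(combos))    # 125 target combinations the optimal search must evaluate
```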


There is one limitation of the simulation solution: when used for 4+ players with targets=[Optimal, ...] to find optimal targets, it does not propagate the optimal search to the cases where players have been eliminated - instead, it defaults to targeting the 'Strongest' opponent in those reduced cases. For example, when one player in a 4-player game is eliminated and it becomes a 3-player truel, the remaining 3 players will not use their true optimal targets. While targeting the 'Strongest' opponent is optimal in many cases, it is not always - which means the mathematical calculation, rather than simulation, should be used to find optimal targets in 4+ player cases.

The nb_truel_sim() function calculates win probabilities using simulations. It selects either an optimized function for 3 players, or a slower generic simulation for N players.
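A minimal stand-alone sketch of the simulation idea follows. It is an assumption-level simplification (plain Python, no Numba, fixed initial targets only, and a 'Strongest' fallback after an elimination), not the actual nb_truel_sim code:

```python
import random

def simulate(p_hit, targets, iterations=100_000, seed=0):
    """Monte-Carlo estimate of win probabilities.
    p_hit   : hit probability per player, in shooting order
    targets : opponent index each player aims at while everyone is alive,
              or None for an intentional miss (at least one real target assumed)
    After an elimination, everyone aims at the strongest surviving opponent."""
    rng = random.Random(seed)
    n = len(p_hit)
    wins = [0] * n
    for _ in range(iterations):
        alive = list(range(n))
        shooter = 0
        while len(alive) > 1:
            if shooter in alive:
                if len(alive) == n:          # initial phase: predefined targets
                    tgt = targets[shooter]
                else:                        # later phase: strongest opponent
                    tgt = max((i for i in alive if i != shooter), key=lambda i: p_hit[i])
                if tgt is not None and rng.random() < p_hit[shooter]:
                    alive.remove(tgt)
            shooter = (shooter + 1) % n      # next player in cyclic order
        wins[alive[0]] += 1
    return [w / iterations for w in wins]

# Original problem with predefined targets [Miss, C, B]; expect roughly [0.40, 0.38, 0.22]
print(simulate([1/3, 2/3, 1.0], [None, 2, 1]))
```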

Starting parameters:

Result:

1.2 Mathematical calculation for exact solutions

A Python function that solves the truel problem for an arbitrary number of players, and finds not only the winning chances for each player but also the optimal targets.


Solution reasoning:

If we have N players, with the hit probability of the i-th player (i = 0..N-1) in pHit[i], then when the i-th player is on the move and targets T:
    pwin(i first, out of N) = pHit[i] * pwin(i+1 first, out of N-1, T removed) + (1-pHit[i]) * pwin(i+1 first, out of N)
Assuming we can get the exact result for the reduced number of players (N-1), we can express the solution for N players with the i-th player on the move as a function of the solution with the (i+1)-th player on the move:
     pwin(i) = v[] + k*pwin(i+1), where v[] = a constant vector with N elements = pHit[i] * pwin(i+1 first, out of N-1, T removed), and k = 1-pHit[i] ... so we can express this as the tuple pwn = (v[], k), such that pwin(i) = pwn * pwin(i+1)

Infinite loop:
Recursively calling pwin(i+1) will eventually reach the last player, who would have to rely on the first player's result by calling pwin(0) - which would result in an infinite loop.
That infinite loop can be avoided by passing along a function that expresses the win probabilities when the initial player is on the move as a function of the win probabilities when the (i+1)-th player is on the move: pwin(0) = pw0 = f(pwin(i+1)) = pwn * pwin(i+1), with pwn = (v[], k).
That pw0 is modified by each successive player, and when we get to the last player and he recalculates pw0, he will have pw0 = f(pwin(N)) = f(pwin(0)), i.e. pw0 = f(pw0), which can be solved directly, avoiding the infinite loop.
Given pwn = (v[], k) and pw0 = pwn * pw0 = v[] + k*pw0  ->  pw0*(1-k) = v[]  ->  pw0[] = v[]/(1-k) ... resulting in the probability to win for each player, when the initial player is first to move.
That exact result for pw0 can now be propagated back to exactly calculate pwin(N-1), pwin(N-2), ... all the way back to pwin(0), which gives the same result as pw0.
Finding whom to target (optimal targets):
The above calculation assumes that each player knows in advance which target (T) is best for him to shoot at. If he does not know that, he needs to apply the same calculation to every possible target (including no target) and select the one which gives him the best chance to win.
When 'no target' is selected, his chance to win follows a slightly different formula: pwin(i) = pwin(i+1) - or, presented in the same format as before (pwin(i) = v[] + k*pwin(i+1)), v[] = [0,0,...,0] and k = 1, i.e. the tuple ([0,0,...,0], 1.0).
A special case is when everyone decides to skip (no target), which would lead to an infinite loop, so it needs to be forbidden in some way - for example, the last player is not allowed to skip if everyone else did.
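Below is a minimal stand-alone sketch of this affine fixed-point idea for fixed initial targets (an assumption-level reimplementation, not the actual nb_truel_math/truelC code; after an elimination it falls back to targeting the strongest surviving opponent, which for three players is equivalent to the optimal duel behaviour):

```python
import numpy as np

def win_probs(p_hit, targets):
    """Exact win probabilities for fixed initial targets.
    targets[i] = index the i-th player shoots at while all players are alive,
    or None for an intentional miss."""
    n = len(p_hit)

    def next_alive(shooter, alive_set):
        for step in range(1, n + 1):                 # next living player in cyclic order
            cand = (shooter + step) % n
            if cand in alive_set:
                return cand

    def solve(alive, first):
        if len(alive) == 1:
            v = np.zeros(n); v[alive[0]] = 1.0
            return v
        order = list(alive)
        order = order[order.index(first):] + order[:order.index(first)]
        V, K = np.zeros(n), 1.0                      # compose pwin(i) = v + k*pwin(i+1) around one cycle
        for shooter in order:
            if len(alive) == n:
                tgt = targets[shooter]               # initial phase: predefined target
            else:                                    # reduced game: strongest surviving opponent
                tgt = max((i for i in alive if i != shooter), key=lambda i: p_hit[i])
            h = p_hit[shooter]
            if tgt is None or h == 0.0:              # intentional miss: v = 0, k = 1
                v, k = np.zeros(n), 1.0
            else:
                reduced = tuple(a for a in alive if a != tgt)
                v = h * solve(reduced, next_alive(shooter, reduced))
                k = 1.0 - h
            V, K = V + K * v, K * k
        return np.zeros(n) if K >= 1.0 else V / (1.0 - K)   # pw0 = V/(1-K); K=1 means everyone misses forever

    return solve(tuple(range(n)), 0)

# Original problem, predefined targets [Miss, C, B]: expect ~[0.397, 0.381, 0.222]
print(win_probs([1/3, 2/3, 1.0], [None, 2, 1]))
```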

The nb_truel_math() function calculates win probabilities using the mathematical solution (as opposed to simulations). It initializes a dictionary and checks the parameters before calling the recursive truelC.
The truelC function is the main mathematical solution, which recursively finds the probability to win and the optimal targets for each player.

Starting parameters:

Result:

1.3 Optimal targets

Finding optimal targets together with win probabilities is a more complex problem than just finding win probabilities for predefined initial targets.

What are "predefined targets" ?

Let's say that we have three players, customarily labeled A (first to shoot), B (second to shoot), and C (third to shoot). And let's use the hit probabilities from the original problem, so pA=1/3, pB=2/3 and pC=1.
That can be written as pHits=[1/3, 2/3, 1], where the size of the list determines the number of players (here three), and players shoot in the order of the list (so player A is represented by pHits[0], player B by pHits[1], etc.).

Now let's say we want to solve the problem where all players always shoot at the highest threat (which is often the optimal strategy). In this case it means A and B will target C, and C will target B. That can be written as targets=[C, C, B], and that 'targets' list represents the "predefined targets" for this problem setup.
But it is important to understand how and where those targets are used. Using them in the initial phase of the game ("game" refers to one duel/truel match until the final winner is known) is straightforward - and "initial phase" means that all players are still in the game.
In the initial phase, player A targets C; if he misses, player B targets C; if B misses, player C targets B; and when/if player C misses (not possible when pC=100%, but possible in general), player A again targets player C, and so on.
What happens when one player is eliminated? Let's say player A hits player C, and only players A and B remain, with player B on the move. Obviously player B cannot follow 'targets' any more, since player C is eliminated.
But what if a player could follow 'targets' in the next phase (after one player is eliminated)? Let's say our predefined targets were [Miss, C, B], meaning player A intentionally misses while players B and C shoot at each other. Assuming player B hits player C, should player A still shoot to miss? Obviously not, if player A wants to win.

The conclusion is that:

'targets' only defines what players shoot at in the initial phase of the game, while all players are still in the game. As soon as one player is eliminated, 'targets' is no longer applicable.

Note that this holds true both for the problem input (predefined targets) and for the problem result (optimal targets) - since "optimal targets" (as the solution result) can be viewed as the "predefined targets" that give the best chance to win to all players.

This is often glossed over in truel analyses, since for three players it is obvious that, when reduced to two players, the optimal strategy is to shoot each other. Therefore specifying "targets" for a truel completely defines how each player will behave during the entire game.
But that is not so for more players. If we have four players with targets=[D, D, D, C] and player D is eliminated, what will the remaining three players target in the next phase? Clearly, the old targets are not applicable, but neither is a simple 'shoot each other' as in the duel case.
Therefore, the predefined 'targets' parameter does not define all player choices in games with four or more players. To fully specify how four players should play, we would need the initial 'targets' while all are in the game, and at least four additional 'targets' lists for the four possible three-player cases - and possibly even more than four. If player D is eliminated by player A (so now player B is on the move), the new targets could be [Miss, C, B, -]; but if player D was eliminated by player B instead (so now player C is on the move), it is possible that players would choose different targets, for example [C, C, B, -]. At maximum that means 12 different possible 'targets' choices for the three remaining players, depending on which player was eliminated and by whom. This only gets more complex with 5+ players.

So initial targets alone cannot uniquely define the rest of the game, unless we assume that, as soon as one player is eliminated, players select targets for the next phase in some predefined way. Such a predefined way could be "select the highest threat", or "keep the old target if alive, otherwise Miss", or even "all suicide", etc. - but most of these would not be logical choices, since players tend to select targets that maximize their chance of victory. In any case, the choice for targetLater - the targets to apply after one player is eliminated - must be something applicable to all remaining players regardless of their number and of which players were eliminated. So it cannot be "target player C" (C could have been eliminated), or "Miss" (since the game is pointless if all players choose to miss).

Valid options for targetLater would be:

It follows that:

the initial predefined 'targets' can define the entire game only if we assume that, after one player is eliminated, the remaining players all select the same type of target, specified in 'targetLater' - with 'Optimal' being the logical choice

What are **"optimal targets"** ?

From the above discussion, it follows that:

"optimal targets" are the predefined targets that maximize the victory chance of all players.

Since we have shown that 'predefined targets' only define the initial phase (while all players are still in the game), the same holds true for 'optimal targets'. Similarly, if new 'optimal targets' are selected once one player is eliminated (for example, going from 4 to 3 players), those new optimal targets are valid only until the next player is eliminated.

How to "maximize the victory chance of all players" ?

A truel game is inherently a zero-sum game: if one player increases their chance to win, it must reduce some other player's chance. Ultimately, the sum of all players' chances to win must be one:

$$ \sum_{player} pWin\, (player) = 1 $$

That makes it problematic to maximize the victory chance of all players - if we maximize for player A, it will not be optimal for player B. But it is still possible to determine "optimal" targets if we make the following assumptions:

We need some way to reduce the complex problem of 'find optimal targets' to the simpler problem of 'find win probabilities for predefined targets', using the above assumptions.

Finding optimal targets can be solved using a form of **minimax approach**:

The last player to shoot (player C in the three-player example) will see/know the targets of A and B (as per the assumptions), and can calculate his optimal target by computing win probabilities for a series of predefined targets, where those predefined targets always keep the same targets selected by A and B, while his own target runs over every possible target option. Let's say that A selected tgA=B, B selected tgB=A, and the possible targets for C are A/B/Miss. Player C will calculate three times:

Player C will obviously select the target option where his pwC is maximal - in this example, the one marked with 'C+' where pwC=100% - so he will Miss, and the full targets are [B, A, Miss]. He needed 3 calculations for that.

How does that influence the selection process of player B?
When player B is on the move, he knows the target of player A (tgA=B in the above example), and he can also analyse each of his possible choices. But for each of his choices, he needs to analyse the potential response of player C, in the same way that C reasons above. Player B will need 9 calculations (3x3):

The same logic now extends to player A. He needs to evaluate each of his target options tgA in {Miss, B, C}, and for each of them repeat the logic that B would apply, so he needs to do 27 calculations (3x3x3).
After determining which targets B and C would choose for each of his possible targets, he selects the one where he has the maximal win probability. Note that it may not (and in many cases will not) be the situation where player A has the best win probability of all players - it is merely the one of his three options where his own probability is largest. In the above example, player A's options are:

What if different targeting options are available ?

It does not change the logic, it only changes the number of available target options for each player. For example, if players can target themselves (the Self option), they have 4 target options at each step, so player C will examine 4 cases, player B 4x4, and player A 4x4x4.
In general, there will be $ T^N = TargetChoices ^ {Players} $ cases.

If the 'Random' option is available, where a player randomly shoots at any other player each round, it can be treated as just another option - as long as we know how to calculate win probabilities when Random is included in the predefined targets, for example targets=[Random, C, B].

The same approach can be applied in cases where only some players are allowed to search for an optimal target, while the other players are fixed to predefined targets. Let's say that we have:

In this case, players A and C can choose any target, but player B must shoot at player C. It only changes the step for player B (who has nothing to choose), and player A needs a smaller total number of calculations to find the optimum (3x1x3=9).
Basically, different options change the total number of possible 'predefined targets', but the logic of choosing optimal targets after that is the same application of minimax.

Conclusion:

Optimal targets can be found by applying a minimax approach to all possible combinations of specific predefined targets.
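A minimal sketch of that backward-induction selection follows (an illustrative assumption: it reuses the win_probs() sketch from section 1.2 as the 'solve one specific combination' step, with None standing for Miss; the real code uses allTargetOptions and selectOptimalResult, described later):

```python
def optimal_targets(options, win_for):
    """options[i]  : allowed target choices of player i (opponent indices or None = Miss)
    win_for(combo) : win-probability vector for one fully specified target combination.
    Each player, in shooting order, picks the choice that maximizes his own win
    probability, knowing that the players after him will react the same way."""
    n = len(options)

    def choose(prefix):
        i = len(prefix)
        if i == n:                                  # all targets fixed -> evaluate once
            return tuple(prefix), win_for(tuple(prefix))
        best = None
        for opt in options[i]:
            combo, probs = choose(prefix + [opt])   # later players respond optimally
            if best is None or probs[i] > best[1][i]:
                best = (combo, probs)
        return best

    return choose([])

# Original problem: every player may shoot an opponent or intentionally miss (None).
p = [1/3, 2/3, 1.0]
opts = [[1, 2, None], [0, 2, None], [0, 1, None]]
combo, probs = optimal_targets(opts, lambda t: win_probs(p, list(t)))
print(combo, probs)    # expect (None, 2, 1), i.e. [Miss, C, B], with A at ~39.7%
```

With three players and three options each, this evaluates at most 3^3 = 27 specific combinations, matching the count above.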


Is the same approach for finding optimal targets applicable both to mathematical calculations and to simulations?

In essence, yes.
Simulations need to use this approach exactly as explained, by simulating all possible predefined target combinations (27 in the above example) and then applying the approach to find the optimal targets.
The mathematical calculation can do the same, but it also has an option to optimize the calculation and solve for optimal and for specific targets at the same time, as explained in a later section.

So simulating for targets=[Optimal, Optimal, Optimal] generally takes 27x more time than for the specific predefined targets=[Miss, C, B] - and even more with more target options. For example, if the target options are {Others, Self, Miss, Random}, which means 5 target options (A, B, C, Miss, Random), then finding the optimal solution using simulations needs 5^3 = 125 times more calculations than finding one specific solution. Using the optimized mathematical calculation instead of simulation can avoid calculating some of those Targets^Players combinations, but it still needs more time than the single calculation required for predefined specific targets.

allTargetOptions is a function that generates all possible specific targets, based on the predefined 'targets', the 'search' options for valid targets, and the hit probability of each player.

The selectOptimalResult function is used to select the optimal targets among all possible ones.

The truel() function is the main entry point for calculating win probabilities. It enumerates all possible target combinations based on the inputs, and then calls a Numba function to actually produce the result. The nb_truel function is the Numba-accelerated version, which calls either the simulation or the math calculation, depending on the specified algorithm.

Starting parameters:

Result:

2. Visualization

Multiple visualization procedures use similar parameters to define one 'test case', with a given number of players, their hit probabilities, etc. The Test class contains all those parameters:

2.1 ptruel - to find and print result for given hit probabilities and predefined or optimal targets


The helper function ptruel(pHits <, targets, search, targetLater, searchLater, ifEQ, alg, iterations, timeit>) is used to invoke a single truel() calculation and pretty-print the result.
It accepts the same starting parameters as truel(), and returns the same result (but only for a single case, i.e. a single set of probabilities). In addition, it displays the result as: pHit%= [,,..] pWin%= [,,..] targets= [,,..] tm= h:mm:ss.ms ; win probabilities are rounded to one decimal, and targets are the final/optimal targets for each player (tw[]), zero-based. An asterisk (*) denotes that the target was forced as a parameter.


Solution to the original problem, where the first player hits in 1/3 of cases, the second player in 2/3 of cases, and the last player always hits - with the program allowed to find optimal targets for each player:

  • best for the first player is to intentionally miss ('m' as the first value in targets)
  • his probability to win is 39.7% (the first value in pWin%[]) ... in fact, the first player has the highest win probability of all 3 players
  • the best targets for the other two players are each other (the second player shoots the third and vice versa)



An example that demonstrates how truel() can find the optimal solution for the generalized problem with N players - here with 4 players, each a 25% better shot than the previous one.

  • best for the first player is to shoot the last player, and he has a 22.2% chance to win
  • best for the second player is to shoot the first player, and he has a 41.8% chance to win - the most of all players
  • best for the third player is to shoot the last player, and he has a 27.8% chance to win
  • best for the last player is to shoot the third player, and he has only an 8.2% chance to win - the least of all players, even though he is the best shot


The same example as above, but we force everyone to shoot at the best shot, so B and C shoot D, and D shoots C, allowing only the first player A to find his optimal target. We also do not search for optimal targets when 3 players remain, instead forcing them to shoot at the strongest.
Now the first player does better (a 32% chance to win), but the second player does worse, and the best chance to win now belongs to the third player, with 35%. Note how much slower the simulation is in this case, even though the optimal search was done for only one player.

2.2 ptruelWhy - to demonstrate "why" the target list returned in the result is really 'optimal'

The ptruelWhy(pHits <, search, targets, ifEQ, alg>) function generates all possible target combinations and selects the optimal targets among them, while printing all options and the reasoning behind the choices.

Example of minimax selection process for Optimal result

2.3 testAB - to display results as colored graph image for any hit probability of first two players (A and B) on Y and X axis

The functions in this area are intended to create an AB image/graph showing how win probabilities or optimal targets change over all possible combinations of hit probabilities for the first two players, A and B (hence 'AB' in the names).
They work for any number of players, using fixed hit probabilities for player C and above (players 3+), and exploring the entire [0..1] range of probabilities for the first two players A and B. On all images the Y-axis represents every possible hit probability 0%..100% for player A (0% at the bottom, 100% at the top), and the X-axis represents player B's chance to hit (0% on the left, 100% on the right).
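As a rough illustration of the AB-graph idea, here is an assumption-level sketch that reuses the win_probs() and optimal_targets() sketches from section 1 (it is far slower than the document's Numba-accelerated makeMatAB, so the resolution is kept very low and it may still take a minute or two):

```python
import numpy as np
import matplotlib.pyplot as plt

size = 30                                            # low resolution to keep plain Python feasible
img = np.zeros((size, size, 3))
opts = [[1, 2, None], [0, 2, None], [0, 1, None]]    # allowed targets: other players + Miss
for iy, pA in enumerate(np.linspace(0.01, 1.0, size)):
    for ix, pB in enumerate(np.linspace(0.01, 1.0, size)):
        _, probs = optimal_targets(opts, lambda t: win_probs([pA, pB, 1.0], list(t)))
        img[size - 1 - iy, ix, int(np.argmax(probs))] = 1.0   # R/G/B channel = winner A/B/C
plt.imshow(img)                                      # pA grows bottom->top, pB left->right
plt.xlabel("pHit B"); plt.ylabel("pHit A")
plt.title("Player with highest win probability, pC = 100%")
plt.show()
```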
Most functions share the same parameters, which allow different data to be shown. The main AB functions are:

Common parameters, either passed directly or as members of the typed class Test:

The first three parameters are standard parameters passed to the truel() function:

The next three parameters are also passed to truel(), but are rarely changed from the defaults. They define the search/target behaviour after one player is eliminated:

The following are ImageAB-specific parameters:

Calculate AB result matrix : makeMatAB

A function that calculates the result matrix [points, points, 2, N], with win probabilities and target orders for every mat[x,y] point:

makeMatAB : creates the full matrix (size x size) with probabilities and targets for the entire AB image area (where the Y axis represents player A's hit probability, from 0.0 to 1.0, and the X axis is the same for player B), using one of:

Make AB result image : makeImageAB

Functions that create an image from the result matrix:

Main AB functions : testAB and diffAB

Functions that allow the user to calculate and display test results in AB image format:

Additional formats for the above functions, and useful helper functions:

For each test case, the parameter colors can also be a list, again generating multiple side-by-side images with different coloring schemes.

3. Analysis

3.1 Comparison between "smart" and "normal" AB methods

There are two methods available for every 'AB' function:

The performance advantage ranges from moderate for small images (200 x 200) to dozens or hundreds of times for larger images (1000 x 1000) - depending not only on the size of the image, but also on the parameters of the AB function, like the search options.
But any optimization has the potential to introduce errors, so here we use those same AB functions to analyze the difference between the 'smart' and 'normal' methods. It will show that the smart method is an acceptable approach with a very small difference in results.

The above comparison is for a standard AB image that covers any hit probability for A and B (0 <= pA, pB <= pC; pA is on the Y axis and pB on the X axis), while pC is fixed at 100%.
It also uses the standard parameters for initial targets [Optimal, Optimal, Optimal], meaning the solution finds optimal targets at every point, among the default options [Others, Miss] (so players can shoot at other players or intentionally miss).

The same pattern is visible in the first two AB DIFF images in the third row:

Conclusion:

The smart AB calculation is an acceptable alternative to the full size x size calculation. It has an order of magnitude smaller error margin (compared to the exact mathematical solution) than the simulation solution with 1 million rounds per point, which itself is an acceptable approach used in many analyses.

3.2 Comparing simulations at different iteration counts

Simulations are inherently approximate solutions, with variability in results even when run twice with the same input parameters. That variation, and the related simulation errors, are reduced when the number of iterations for each result (each pixel in AB images) is increased. The default number of iterations here is 1 million simulations per pixel. The precision of a simulation result can be estimated either by analysing that variability of results over the same inputs, or by the difference in results relative to exact solutions (when available). In this case, both approaches are used:

Another area of interest related to simulation is its performance - in other words, execution time. The execution time of a simulation should increase linearly with 'iterations' (thus a simulation with 100k iterations should be 10x faster than one with 1 million iterations). When a simulation is also searching for optimal targets, it needs to repeat the simulation for every valid target combination. For example, if players can only choose each other as targets, then each of the three players has just 2 target options initially. But if players can also choose to Miss, then nTargetOptions=3; if they can also shoot at Self, then nTargetOptions=4; and if they can also choose a Random target, then the maximal nTargetOptions=5. And if the solution is searching for the optimal target of every player, the number of possible target combinations is nTargetOptions ^ nPlayers, or 5^3 = 125 in this worst case for three players. Therefore the execution time of the simulation can increase hundreds of times depending on the optimal search options.
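A rough check of that linear scaling with iterations, reusing the simulate() sketch from section 1.1 (absolute times are only indicative and will be far slower than the Numba-accelerated functions):

```python
import time

for iters in (100_000, 1_000_000):
    t0 = time.perf_counter()
    simulate([1/3, 2/3, 1.0], [None, 2, 1], iterations=iters)
    print(f"{iters:>9} iterations: {time.perf_counter() - t0:.2f} s")   # expect roughly 10x between rows
```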

The comparison below uses a predetermined (fixed) target option without a search for optimal targets, targets=[Miss, C, B], to compare the variability of simulation results:

Random calculations have small variations in results when rerun with the same parameters (as expected). This can be seen in the "Sim_1M vs Sim_1M" images above, and it amounts to a 0.03% average difference in the win probabilities of players (second row, first image, pWin average DIFF). Those small differences can, in rare cases, even result in a different 'win order' (the order of players by their win chances), as seen in the third row, where every point with a different win order is colored (red when even the best player has changed, purple for the 2nd, orange for the 3rd). That naturally happens on the borders between areas with different win orders, as seen in the first row of images depicting the 'win order' in different colors.

A smaller number of simulation iterations (100,000 in Sim_100k compared to 1,000,000 in Sim_1M) results in a higher average difference in win probabilities (again, as expected). In this case it amounts to a 0.07% average difference in the win probabilities of players, or over 2x more than the natural variation in randomness (i.e. compared to just rerunning Sim_1M again). But that is still acceptably small, and its difference in win order is similar in scale and position (along the borders) to that of Sim_1M.

Compared to exact mathematical calculations, Sim_1M has a similar level of difference (~0.02% average difference in win probabilities between Sim_1M and Math_calc) as when compared to its own random variation (~0.03% when Sim_1M is compared to another Sim_1M rerun). That further confirms the suitability of Sim_1M for analysis.

The execution time for the simulation in the above example is kept low due to the use of a specific target (targets=[Miss, C, B]) - therefore the simulation did not need to calculate multiple cases for each pixel, and in general Sim_1M was 10x slower than Sim_100k (as expected), while the mathematical calculation was over 400x faster than Sim_1M. But asking the simulation to find optimal targets (targets=[Optimal, Optimal, Optimal]) increases its time by the factor nTargets ^ nPlayers. If the allowed targets are the other players and Miss, then nTargets=3 for each player, so the expected time increase would be 3^3 = 27. But if we also allow self-targeting and random targets (5 options in total), that factor is at least 5^3 = 125. That is confirmed in the graphs below, where all 3 players are searching for the optimal target among all 5 possible target options (2 other players, self, miss, random):

Comparing the execution times of the simulation with fixed targets (top left image) and when searching for optimal targets (top right image) shows that the optimal search is over 170x slower, which is in line with "at least 125". On the other hand, the mathematical calculation for optimal results is just 15-60x slower than calculating for fixed results, which is much less than the factor of 125. Therefore the performance advantage of the math calculation increases from ~400x with fixed targets to over 3000x with a search for all optimal targets.

The above comparisons are done with only 15x15 points, in order to limit the execution time of the simulations. Below is that same image at a higher 300x300 resolution, using only the math calculation:

Conclusion:

3.3 Investigate for three players: A, B and C

Analyze every combination of hit chances for the first two players (A and B), so pA=[0..1] and pB=[0..1], with a fixed hit probability of the third player C (usually pC=100%). This results in a square picture.
In some cases a more constrained scenario is analyzed, where 0 <= pA <= pB <= pC and pC=1. In other words, the first player to shoot (A) has the lowest chance to hit, and the last player (C) always hits. This results in a triangular picture due to pA <= pB.

The plot shows pA on the Y-axis (0% at the bottom .. 100% at the top) and pB on the X-axis (0% left .. 100% right), while pC is constant at 100%.
The standard colors are darker for a higher chance to win and lighter for a lower chance to win, and the RGB color indicates the person who has the highest chance to win (the "winner" at that point):

Above is the solution to the original problem with fixed hit probabilities pA=1/3, pB=2/3 and pC=100%, which finds:

Above is the default image comparison for different hit probabilities of players A and B (Y and X axes), while player C has a fixed hit probability of 100%:

It does not show the optimal targets for the players at each point (that would be the color="p" option), nor the win probabilities, but it shows that player A has the highest chance to win in most cases under the above conditions (the red color wins in almost 42% of cases).

We can also investigate cases where players choose fixed targets instead of "optimal" ones.

Above is the case when all players shoot at the highest threat - so A and B target C, and C targets B. We see that player A can win in a very small number of cases (4.4%; the red color for player A is dominant only in a small area around pA ~ 40% and pB ~ 40%).

It also shows the win probabilities for the original problem (pHitA=1/3, pHitB=2/3 and pHitC=1) when all players target the strongest player (so targets=[C, C, B]): player A wins in 31% of cases, which is less than in the 'optimal' solution where he chooses to Miss, and also less than player B's win chances.


Above is the case when B and C shoot at the highest threat (each other) and A always intentionally misses:
Here player A can win in many more cases - in fact, the red area is the largest at 41.1%, meaning he wins more often than B or C.


Above is the case when B and C shoot at the highest threat (each other) and A chooses his optimal target.
This does not improve things much, just by 0.3% - obviously shooting to miss was the optimal tactic for A in most cases anyway. The only place it improves is a small additional red area around pA ~ 30% and pB ~ 30%.

To demonstrate that small improvement, here is the calculation at pA=pB=30%, when A must miss (then C has the highest chance to win, not A) and when A can choose his optimal target (targeting C, in which case A has the highest chance to win).


Above is again the situation shown initially, when all players are allowed to select their optimal target:
This is exactly the same as when only A was allowed to select his optimal target, indicating that for B and C it is always optimal to shoot at each other.


But what if we look at the entire area, including cases where pA > pB?


Will "B and C target each other and A always misses" still be similar to "all players choose their optimal target"?

This is the case "B and C target each other and A always misses": player A wins in the vast majority of cases (61.7%) - because the others ignore him even when pA > pB.


This is case when "all players choosing optimal target" : player A now wins much less (38.9%), almost tied with B (36.7%) - since now all choose optimal target, and thus B and C fare bit better
Another effect is that area is not so uniformly separated - now there are patches where different players have highest chance to win.

Above shows that, while optimal targets result in very similar winners to fixed [Miss, C, B] targets in lower right triangle ( where pHitA< pHitB ), graph with optimal targets significantly differ from fixed one for cases when pHitA > pHitB ( upper left triangle) - which was expected, since in that area we would expect player B to start behaving like player A , meaning we would expect graph in upper left side to be similar ( almost symmetric ) to one in bottom right side, only with win colors swapped between player A (red) and player B (green).

Indeed, those two triangles are almost symmetrical, with swapped red/green colors. Reason why they are not fully symmetrical is due to the fact that shooting order did not change and player A stil shoots before player B. That result in visible asymmetry, mainly in areas where pHitA is close to pHitB, especially around pHitA ~ pHitB ~ 30%.

3.4 Original problem - hit probabilities [ 1/3, 2/3, 1 ], and allowed targets Others/Miss

With the above tools, it is easy to solve the original problem:

Three players fight a truel, where each can choose whom to try to shoot; then the next player shoots, and so on until only one winner is left.
If the first player has a 1/3 chance to hit, the second 2/3, and the last a 100% chance to hit - what is the chance of the first player winning, and whom should he target at the start?

The specific solution for the first player is: player A should shoot to Miss at the start, and his win chance is 39.7%.

The solution by default searches for optimal targets, and it found here that targets=[m C B], which means it is optimal for player A to intentionally miss, for player B to target player C, and for player C to target player B - until one of them is eliminated, after which it is optimal for both remaining players to target each other. Another default in the solution is that the allowed targets (examined by the optimal search) are Others (other players) and Miss. Targets that were not allowed in the default search are Self (self-targeting, suicide) and Random.

We can also demonstrate why those targets are optimal in this case:

When it comes to player C's turn, he will always select the target option marked "C+", depending on the target choices made by players A and B before him. So in the BA case (A was targeting B, and B was targeting A, rows 1-3), the best win probability for player C is if he intentionally misses (row #3, with a 100% win chance). In the BC case (rows 4-6), the best for player C is to target player B (row #5, with a 48.1% win chance). In the mC case (when player A selects to miss, and player B targets player C), the best choice for player C is again to target player B (row 23, with a 22.2% win chance).

When it is player B's turn, he applies the same logic - he will always select the target option marked "B+", depending on the target choice made by A, and on what he knows player C will decide (as described above). So if player A chooses to miss, player B can choose to target player A, player C, or to miss himself. If he chooses player A (mA case), he knows player C will choose to miss (mAm case, row #21), and so his own win chance would be 0%. If he chooses player C as target (mC case), he knows that C will choose B as target (mCB case, row 23), and his own win chance would be 38.1%. And lastly, if player B chooses to miss (mm case), he knows player C would target B (mmB, row 26), where player B again has a 0% win chance. Logically, if player A selected to miss, player B will select to target player C - since that gives him the highest win chance, 38.1% (among rows 21, 23 and 26, where his win chances were 0%, 38% and 0% respectively). So row 23 is marked "B+", indicating it would be the choice of player B (and player C after him) if player A selects 'miss' as his target. The same logic applies if player A selects C as his target, resulting in player B also targeting player C (row 14) ... and the same logic leads to row 5 as player B's choice if player A targets player B.

So, when it is player A's turn, he knows which options players B (and C) would select depending on his target choice - the rows marked with "B+". So if he targets B, it will be row 5 (where his own, player A, win chance is 0%); if he targets C, it will be row 14 (where his win chance is 31.2%); and if he intentionally misses, it will be row 23 (where his win chance is 39.7%). Logically, he will choose to Miss, since it results in the largest win chance for him (39.7%). That results in player B choosing to target player C, and player C choosing to target player B (row 23).

Note that this optimality selection assumes that ALL players were allowed to select the same targets, so players B or C could also have selected to Miss - which is visible in rows 3, 9 and 21, where player C would choose to miss.

We can demonstrate the same result using simulation instead of the mathematical calculation:

Although the simulation is some 250 times slower than the calculation (194 ms vs 0.75 ms), it results in a practically identical solution to the mathematical calculation - as expected, with only minimal variations due to the simulation being approximate.

It is notable that the optimal strategy for player A was to miss, which not only improved his win chances compared to selecting a regular target like "shoot at player C", but also resulted in player A having the highest chance to win (39.7% vs 38.1% for player B or just 22.2% for player C) - their win chances are in reverse order compared to their hit chances.

It is easy to see how it would all look if players were not allowed to miss:

There is a much smaller number of target combinations (2^3 vs 3^3 when Miss is allowed), but the same logic for selecting optimal targets can be applied, and it results in "Player A should target C at the start, and his win chance is 33.3%". It is notable that not only does player A have a lower win chance without the Miss option (33% vs 40%), he also no longer has the highest win chance (that would be player B here, with 54%).

3.5 MISS target option - in which cases it is the optimal strategy

As we could see in the solution to the original problem above, it can be the optimal strategy for some players to intentionally miss in some situations. That does not necessarily mean that selecting Miss as the initial target will result in them having the highest chance to win (although in the original problem, that is exactly what happened to player A when he selected Miss). But "Miss" is an optimal choice for a player even if it does not make him the top winner - as long as it increases his win probability.

It would be interesting to see how often (and in which situations) selecting "Miss" is an optimal strategy. To do that, we will use the "interesting targets" visualization option:

The third graph (unusual targets) demonstrates that "Miss" was the optimal target for at least one player in almost 84% of cases (cases being all combinations of hit probabilities for players A and B, with player C fixed at 100%). The rightmost graph (unusual targets for the winner, where the 'winner' is the person with the highest win probability at that point) shows that 'Miss' was the optimal target for winners in a smaller percentage of cases (41% vs 84%) than when it was optimal for anyone - which was expected. But it is interesting that in almost half of the cases (41%) the optimal strategy for the winner was to select 'Miss'.

We can make a similar comparison, but allowing player C to have less than ideal hit chances: 10%, 25%, 50% and 100%.

"Unusual targets" (first row of graphs above) shows only the most unusual target selected for a given pixel/scenario, in this order: green=Random, red=Self, blue=Miss, black=low threat, grey=rest. Since we only allowed [Others, Miss] as targets, there can be no green (Random) or red (Self) targets here. In cases when both a "low threat" target (someone who is not the strongest) and a "Miss" target are optimal for some players, the graph shows the blue (Miss) color, as Miss is considered more "unusual" than targeting a weaker player. The bottom line is that, on these graphs, whenever the optimal choice for some player was "Miss", it is represented with blue color.

We can see that "Miss" is an optimal target for at least one player in most cases (over 50% of cases), across all combinations of hit probabilities - not only all player A and B hit chances (represented by the Y and X axes on the graphs), but also across all player C hit chances (the leftmost graph above is for pHitC=10%, followed by 25%, 50%, and the rightmost picture for the standard pHitC=100%).

But looking at "Unusual targets for winner" (second row of graphs), it shows that winners choose 'Miss' as the optimal choice in a significantly smaller percentage of cases as we reduce the hit probability of player C.

3.6 Alternative problem - hit probabilities [ 75%, 55%, 100% ], and allowed Random targets

The alternative problem is almost the same as the original, the only difference being the hit probabilities of players A and B (while player C remains at a 100% hit chance):

If the first player has a 75% chance to hit, the second 55%, and the last a 100% chance to hit - what is the chance of the first player winning, and whom should he target at the start?

If we allow the same targets as for the original problem (allowed targets [Others, Miss]), the solution is obtained in a similar way:

It would appear that the best win probability for player A is 28.5%, and that his optimal target is player C.

But player A can do better - while using the same allowed targets (B, C, Miss) - by introducing a novel approach: random target selection.

Essentially, he will randomly select, with equal probability, among the allowed targets - in this case among player B, player C and Miss, each with a 1/3 chance. Including 'Random' in the target options results in:

We can see that player A has improved his win chance to 32.7% (compared to 28.5% before), and that his optimal target is 'r' (random).

The logic behind the optimal selection can be demonstrated with:

Since the assumption of the problem was that every player is allowed the same target choices, it means that players B and C could also select 'Random' as a target if it is optimal for them. That increases the target options to 4 for each player (2 other players, Miss and Random), and the total number of targeting combinations to 4^3 = 64.

But exactly the same optimal target selection logic is applicable (as explained for the original problem, and in the optimal search section). It means that:
The solution to the alternative problem is: player A can win with 32.7% probability, and his optimal initial target is 'Random' (to randomly select, with equal chance, between the allowed targets B, C and Miss).

In the initial reasoning we said that the 'Random' option can be used since it does not introduce new actual targets - the allowed targets are still the other players (B and C for player A) and 'Miss' (to intentionally miss). We only assumed that a player can choose to randomly select among the valid targets. Probably the only visible change to the problem would be that players now have to announce their selection explicitly, to satisfy the assumption that "players know which target choice was made by the players before them". Until now, we implicitly assumed that player C can deduce the choices of players A and B: first, whatever their optimal target was, it remains the same in all future rounds while all 3 players are still in the game; and second, it is feasible that targets could be deduced just by watching where players were aiming. But that was already questionable for the 'Miss' target (did player A accidentally or intentionally miss?), so we mentioned that we can even assume players must announce their targets to avoid ambiguity. For 'Random' targets that is even more pronounced, but it is not a problem if player A announces "I will randomly select between players B, C and Miss in all my truel rounds".

But what if we go even further? What if player A announces "I will randomly select between player B and Miss"? Or "I will shoot at B in 25% of cases and at C in 50% of cases, and miss the rest"?
In other words, what if we allow any random distribution?

Luckily, the Python solvers here already support any random distribution, both when specifying predefined targets and when searching for optimal targets. Let's see how that differs when we use predefined targets:

Now let's slightly change the random distribution for player A:

It obviously changes the result, and it appears that the last distribution is best for player A - but we must not forget that here we fixed the target selection, so players B and C could not select potentially better options.

In order to really find the best target options with any random distribution, we have to allow the optimal search to use any random distribution. By default, if we include Random in the search (like we did above with search=[Others, Miss, Random]) and ask the program to find optimal solutions (like we did when we did not specify targets, defaulting to targets=[Optimal, Optimal, Optimal]), the program will always use the "evenly random" option, as shown in the ptruelWhy listing above.

But there is another parameter that can instruct the solver to try 'all' random distributions. That parameter is optRndLevel, expressed as a list with an integer value for each player, defining how finely grained the random distribution will be:

While the 'ptruel' function still displays the optimal target as 'r', in this case it is not clear which distribution that 'r' represents - to see that, use the result of the actual calculation function 'truel', which returns multiple rows of 3 numbers (for 3 players):

Since 'all distributions' would mean an infinite number of combinations, it must be limited somehow. So the parameter accepts a number of 'random levels' for each player. In the above example, 5 means that there are 5 random probability steps for each player, so examples of allowed random distributions would be: Atarget=[0,0,0] - meaning miss; Atarget=[0,1/5,0] - meaning A will target B in 1/5 of cases and otherwise miss; Atarget=[0,3/5,1/5] - meaning A will target B in 3/5 of cases, C in 1/5 of cases, and miss in the remaining 1/5 of cases; etc. Essentially, if optRndLevel=[5,5,5] is used, all random distribution values are multiples of 1/5, i.e. 20% steps (so 0%, 20%, 40%, 60%, 80% and 100%), but there would be no target=[0, 15%, 0], for example. Obviously, using optRndLevel=[20,20,20] covers all 5% steps, but it is much slower to calculate.
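A small sketch of how such a discrete set of random distributions can be enumerated for one player (an illustrative assumption mirroring the idea of optRndLevel, not the actual solver code):

```python
from itertools import product

def random_distributions(level, n_opponents=2):
    """All shooting distributions over n_opponents opponents in steps of 1/level;
    whatever probability is left over goes to the intentional miss."""
    dists = []
    for steps in product(range(level + 1), repeat=n_opponents):
        if sum(steps) <= level:
            probs = [s / level for s in steps]
            dists.append(probs + [1.0 - sum(probs)])    # last entry = probability to miss
    return dists

print(len(random_distributions(5)))     # 21 distributions per player at optRndLevel=5
print(len(random_distributions(20)))    # 231 at optRndLevel=20, hence the slowdown
print(random_distributions(5)[:3])      # e.g. [0.0, 0.0, 1.0] = always miss, ...
```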

An example of that random search with 20 steps instead of 5 (more fine-grained, but slower):

Just by increasing how fine-grained the random distribution can be, we changed the win probability of player A from 34% to 44%, and his optimal target from "randomly B in 60% of cases, otherwise miss" to "randomly B in 45% of cases, otherwise miss".

This means that finding the exact optimal solution when any random distribution is allowed may not always be computationally feasible. But finding a close approximate optimal solution is doable - especially if the solution is not equally sensitive to random levels across all players. In this example we saw that only player A was selecting random options, regardless of 5 or 20 levels, so we can keep the levels for players B and C lower. For example:

The above used optRndLevel=[20,5,5] but resulted in exactly the same solution as optRndLevel=[20,20,20] - while being 100x faster. That is because players B and C did not have optimal random choices, so more random levels in their case were not needed.

If we use that to further increase the random levels for player A only, we get:

We can see that we get the same results for [20,2,2] as for [20,20,20], further confirming that players B and C did not have optimal random choices. Furthermore, we see that the optimal result only slightly improved when increasing from 20 to 50 levels for player A ([50,2,2]), and did not improve at all when increasing to 100 random levels.

That indicates we already found the optimal result at 50 random levels, and we can assume it would remain the optimal solution when any player can select any random distribution. Therefore we can state that:

The solution to the alternative problem when ANY random target distribution is allowed is: player A can win with 46% probability, and his optimal target is to randomly choose between player B in 42% of cases and Miss in 58% of cases.

Moving from 'can miss' to 'can randomly target with equal chance' improved player A's win chance from 28.5% to 32.7%. But moving from 'equal random' to 'any random distribution' not only further improved player A's win chance to 46%, it made him the player with the highest win chance.

3.7 Random target option - where and why it changes the optimal target solutions

As we could see in the solution to the alternative problem above, it can be the optimal strategy for some players to randomly select targets in some situations. Just as in the 'Miss' target analysis, that does not necessarily mean that selecting Random as the initial target will result in them having the highest chance to win - but Random can be an optimal choice for a player as long as it increases their win probability.

Both the mathematical solver and the simulation solver in this document were enhanced to support random targets. While it was a very easy and natural extension of the simulation (which is already based on random hit/miss results), it was considerably harder to incorporate into the recursive mathematical solver in a way that would not significantly increase the computational time. It was eventually done both for 'equal Random', where players select randomly with equal chance among the valid targets, and for 'unequal Random', where player A can, for example, "randomly shoot at B in 25% of cases and at C in 50% of cases" (so the probabilities of shooting at B, shooting at C, and intentionally missing can all be different). This enhancement covers both the ability to calculate win probabilities when random targeting is involved, and the ability to find optimal targets when random targets are allowed.
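A short sketch of how an 'unequal Random' target fits the affine trick from section 1.2 (my assumption-level extension of the win_probs() sketch, not the actual solver): if player i hits with probability h and aims at opponent t with probability q[t] (intentionally missing with the remainder), then his turn contributes $ v = h \sum_t q_t \, pwin(\text{game without } t) $ and $ k = 1 - h \sum_t q_t $, so the fixed-point solution pw0 = V/(1-K) is unchanged.

```python
import numpy as np

def affine_for_mixed_target(h, q, reduced_solutions):
    """h                 : shooter's hit probability
    q                 : dict {target index: probability of aiming at that target}
                        (probabilities may sum to < 1; the rest is an intentional miss)
    reduced_solutions : dict {target index: win-probability vector of the game
                        where that target has already been eliminated}
    Returns the (v, k) pair used in pwin(i) = v + k * pwin(i+1)."""
    v = np.zeros(len(next(iter(reduced_solutions.values()))))
    for t, qt in q.items():
        v += h * qt * reduced_solutions[t]
    k = 1.0 - h * sum(q.values())
    return v, k

# e.g. player A (h=0.75) shooting at B 45% of the time and intentionally missing otherwise
# (win_probs_without_B is a placeholder for the already-solved reduced game):
# v, k = affine_for_mixed_target(0.75, {1: 0.45}, {1: win_probs_without_B})
```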

To see how often (and in which situations) selecting "Random" would be an optimal strategy, we can again use the "interesting targets" visualization option. Note that "Random" by default means a random selection among valid targets with equal chance. In this case the 'valid targets' are Others and Miss, which means every player has 3 valid targets (for example, player A can shoot at B, shoot at C, or Miss), and when player A selects "Random" as his targeting option it means he will randomly select to shoot at B in 1/3 of cases, at C in 1/3 of cases, and to intentionally Miss in 1/3 of cases.

The unusual targets graph (3rd one) is similar in shape to the graph where we only allowed Miss, without Random. The most important difference is the green area around pHitA=70% and pHitB=60% - that green color indicates that at least one player had 'Random' as his optimal initial target.

The unusual targets for winner graph (last one) is again similar to the corresponding graph with just [Others, Miss] allowed, with a difference in the same area (where pHit ~ 60%) - only this time that area is not green, indicating that Random was not an optimal target for winners, even though it was an optimal target for other players in that area.

We can make a similar comparison, but allowing player C to have less than ideal hit chances: 10%, 25%, 50% and 100%.

The top row with "unusual targets" demonstrates that 'Random' is the optimal option for some players in a noticeable number of cases (1-3% of cases). The bottom row with "unusual targets for winners" shows a similar pattern as in the Miss-only case, namely a reduced number of 'unusual' targets in general as we reduce the hit chance of player C.

But, more notably, we can see some cases where Random is the optimal choice even for winners, when player C is not at a 100% hit chance (for example, the 3rd graph, where pHitC=50%). An example of one specific point:

In this specific case, the standard optimal solution (first line, when only other players and Miss are allowed as targets) shows that player A has around a 27% win chance, far behind player C at 63%.

But when we allow 'Random' targeting, player A not only gets a significantly higher win chance (44% vs 27%), he also becomes the "winner" - the player with the highest chance to win in this particular example.

We can look at the reasoning behind the selection of optimal targets in this case:

What if we allow different random distributions, in addition to just "equal random" ?

Above analysis is related to default 'Random' option, which is "select randomly among valid targets ( other players and miss) with equal probability". As we demonstrated in alternative problem solution, it is possible to have even more optimal solution when 'any random distribution' is allowed. But such distribution require significantly more calculations ( depending on granularity parameter optRndLevel ), and while even high granularity was feasible for single case calculation in alternative problem ( with optRndLevel=100 ), it could be prohibitively slower for graphs with hundreds of thousands of pixels/calculations .

But in almost all cases if any player have random target as optimal option, it will be visible even at lower number of random levels, as it was demonstrated in alternative problem section. So even 2 random levels are enough to indicate which areas would have optimal targets 'random with any distribution':

We can see that the number of cases where the optimal target is 'any random distribution' is noticeably larger than the number of cases where 'equal random distribution' is the optimal target.

To test whether further increasing random detail ( optRndLevel>2 ) significantly changes this picture, we must reduce the resolution:

The above graphs confirm that further increasing random detail ( above optRndLevel=2 ) does not significantly change the distribution of cases where distributed random is the optimal target.

Therefore we can present the comparison below between random optimal targets ( green color ) in cases when 'no random', 'equal random' and 'any random' are allowed targets respectively:

We can look at the same graphs, but for "unusual targets for winners":

The most interesting thing about the above graph is that it shows how the "any random" option can result in Random being optimal even for winners in many cases ( green color on the rightmost picture ), while it is not optimal anywhere for "equal random" ( middle picture ).

It should be noted that the above graph is for the usual pHitC=100%, but we already showed that when we reduce pHitC ( for example, to 50% ), there can be cases where even standard 'Random' is the optimal choice for some winners. For example:

When player C has his hit chance fixed at 50% ( instead of 100% ), there are more cases where 'Random' is the optimal choice: not only more cases when "any random" is used ( 3rd image above, with more green area than previously ), but 'Random' is now also the optimal choice in some "equal random" cases ( 2nd image above, which previously did not have any green areas ).

General conclusions about the following questions:

When is the 'equal Random' target optimal ? ( 'equal Random' means equal probability to select any valid target )

When is the 'unequal Random' target optimal ? ( 'unequal Random' means that any random distribution is allowed, in steps defined by the optRndLevel parameter )

In other words, the standard 'Random' option is optimal in many cases for at least one player, and can be optimal even for winners if player C's hit chance is reduced. If unequal Random is allowed, it is optimal in more cases than standard Random.

3.8 SELF target option - is it ever an optimal target choice

The original truel problem does not assume that players can select themselves as a target - that would essentially be a suicide attempt.

Likewise, the default allowed targets in the solutions here do not include that option when searching for optimal solutions. But it is possible to specify 'Self' as a target option.

For example, if A always shoots at himself ( while all three players are alive ), and B/C shoot at each other, we can specify that like this:
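Since the exact call is in the repository code, below is only a minimal self-contained simulation sketch of that setup for one example point ( hypothetical names, pHitC fixed at 100% as in the original problem ):

```python
import random

def simulate_a_self_targets(p_hit, n_games=100_000):
    # sketch only : while all three players are alive, A shoots at himself and
    # B/C shoot at each other ; once reduced to a duel, the survivors shoot each other
    wins = [0, 0, 0]
    for _ in range(n_games):
        alive = [True, True, True]
        shooter = 0
        while sum(alive) > 1:
            if alive[shooter]:
                if sum(alive) == 3:
                    target = {0: 0, 1: 2, 2: 1}[shooter]   # A->Self, B->C, C->B
                else:
                    target = next(i for i in range(3) if alive[i] and i != shooter)
                if random.random() < p_hit[shooter]:
                    alive[target] = False
            shooter = (shooter + 1) % 3
        wins[alive.index(True)] += 1
    return [w / n_games for w in wins]

# one example point with pHitA=50%, pHitB=70%, pHitC=100%
print(simulate_a_self_targets([0.5, 0.7, 1.0]))   # estimated win chances of A, B, C
```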

It shows that 'Self' is not a good option for player A, since he never ends up as the winner for any hit probability combination of players A/B.

But we can also extend the default search for optimal targets to allow 'Self', and see whether 'Self' is the optimal target for any player in any hit probability case:

The above graphs look identical to the situation when 'Self' is not allowed, shown in the 'Miss' target analysis - which indicates that Self is never an optimal target.

To further demonstrate that, we can compare 'unusual targets' graphs when the optimal search includes [Others, Miss] vs [Others, Miss, Self] vs [Others, Miss, Self, Random]:

If 'Self' were the optimal target for any player at any point in the above graphs, it would be colored red. Therefore the above graphs demonstrate that 'Self' targeting is not an optimal strategy at any point.

3.9 Four players - how different options influence the four-player version of truel

The original truel problem got its name from having "three participants", and that is the focus of the majority of analyses in this document, as well as of the majority of other/previous 'truel' documents.

But the Python functions presented in this document are able to support more than three players, both in the calculation part ( where both the mathematical calculation and the simulation can support N>3 players and find both win probabilities and optimal targets for them ) and in the visualization part ( where AB graphs can display it ). So we can explore the truel for more players, and in this section we will do so for the "four players" truel ( quadruel? ).

There are certain limitations of the solvers used here, and those limitations are more pronounced for N>3 players:

While the 'Strongest' option is rarely used anyway, and it would affect the regular truel too, the simulation limitation affects 'N>3' analysis more when finding optimal targets. Namely, when the simulation tries to find the optimal target and one player is eliminated ( so the quadruel becomes a truel ), it would have to solve for the optimal targets of that truel subcase as well - and that would be prohibitively slow to compute. For example, if we have 4 players with allowed targets [Others,Miss], each player has 3 target options and all 4 players together have 3^4 = 81 possible target combinations - so each simulation that searches for optimal targets actually needs to perform 81 simulations, which already slows things down. But if, in addition to that, we want to truly simulate for optimal targets, we would need to apply the same logic every time the quadruel is reduced to a truel - we would need 3^3 = 27 full sub-simulations to determine which target is optimal for those truel cases, which would further slow down the simulation 27 times. All those slowdowns increase exponentially with a larger number of allowed targets - so if [Others,Miss,Self,Random] is allowed, each player has 5 possible target choices and a full simulation would need 5^4 * 5^3 sub-simulations, or roughly 78 thousand(!) times longer execution than a simple simulation with predefined targets.
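The arithmetic behind those numbers, using the per-player option counts quoted above:

```python
# per-player option counts as used in the text : 3 with [Others,Miss], 5 with [Others,Miss,Self,Random]
print(3 ** 4)            # 81  target combinations while all 4 players are alive
print(3 ** 3)            # 27  extra sub-simulations each time the quadruel is reduced to a truel
print(5 ** 4 * 5 ** 3)   # 78125 , i.e. ~78 thousand times more work than a predefined-target simulation
```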

To reduce that slowdown, the simulation procedures here skip that optimal target search when a player is eliminated - they only check those 81 options while all players are still alive, but do not multiply by the extra 27 when reduced to a truel. Instead, they assume players will target the 'Strongest' opponent once reduced to a truel. That is a reasonable assumption, and the sections above show that in most cases 'target Strongest' is the optimal choice. But not in all cases, so this assumption means the simulation cannot truly find optimal targets without increasing the simulation time 27 times ( or 125 times when more targets are allowed ). This cost would increase exponentially for 5 players and beyond. Note that this limitation does not affect regular "truel" simulations, since there the reduction is to 2 players, and for 2 players it is always optimal to shoot each other.
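( As a quick justification of that last statement: in a duel where the player on turn has hit chance p and his opponent has hit chance q, shooting the opponent gives the player on turn a win probability W satisfying W = p + (1-p)(1-q)W, i.e. W = p / (p + q - p*q), while any other choice only forfeits a free shot and hands the turn over - so for 2 players "shoot the other player" is trivially the optimal target. )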

Luckily, the mathematical calculation functions from this document are able to find true optimal targets even for 4+ players - when the number of players is reduced from 4 to 3, those functions solve for optimal targets for every combination of the 3 remaining players as well.

Another question when analyzing 4+ players is "what do we show on 2D AB graphs" ?

Even for the truel it was obvious that if we use each axis for the 'hit probability' range of one player ( so the Y axis = 0%..100% hit chance of player A, and the X axis the same for player B ), we must fix the hit probability of player C to some value ( the default being pHitC=100%, which matches the hit probability of player C from the original problem ). Both the solvers and visualizers here support any pHitC, but the fact remains that any AB graph shows the picture for the entire player A/B range and some fixed player C hit probability. For 4+ players that same logic is extended:

A 2D AB graph will show results for fixed player C, D, E+ hit probabilities ( supplied in the pHits parameter ), while the hit probabilities of players A and B will be on the Y/X axis respectively.
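Schematically, producing the data for one such AB graph boils down to a double loop like the sketch below ( 'solve' stands for any of the solver functions; names and signature are illustrative only, not the repository API ):

```python
import numpy as np

def ab_grid(solve, p_hits_rest, resolution=100):
    # win probabilities over the full pHitA x pHitB range ( Y/X axis ),
    # with hit chances of players C, D, ... fixed to the values in p_hits_rest
    n_players = 2 + len(p_hits_rest)
    grid = np.zeros((resolution, resolution, n_players))
    for i, p_a in enumerate(np.linspace(0.0, 1.0, resolution)):      # Y axis : player A
        for j, p_b in enumerate(np.linspace(0.0, 1.0, resolution)):  # X axis : player B
            grid[i, j] = solve([p_a, p_b] + list(p_hits_rest))
    return grid

# e.g. for a quadruel graph with pHitC=50% and pHitD=100% :
# win_grid = ab_grid(my_solver, p_hits_rest=[0.5, 1.0])
```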

So the question changes to: what values should we fix for the hit probabilities of players C and D ? Using the original truel logic we can set pHitD=100%, but that leaves pHitC open.
Below is an example of winners with optimal probabilities in three cases with pHitD=100% ( pHitC=25%, pHitC=50% and pHitC=100% ), while the last case has pHitC=pHitD=50% :

Those look quite different, which is not so surprising considering that they represent different hit probabilities for 2 out of 4 players. But if one should be selected as 'representative', it should probably be the 2nd one, with pC=50% and pD=100% hit chances.

'Miss' as optimal target

Let's see the usual graphs ( with win orders, unusual targets for any player and unusual targets for winners ) in the situation where we only allow the classical [Others, Miss] targets:

We can see that 'Miss' is indeed the optimal target in around 10% of cases for 'at least one player' ( 3rd picture above ), and even in a noticeable 5.6% of cases for 'winners' ( 4th picture ). It is interesting that 'Miss' is optimal in a smaller percentage of cases than for the classic truel (N=3).

But since we mentioned that the selection of fixed pC probabilities can significantly change results, let's see how often 'Miss' is the optimal target for winners in different pHitC scenarios:

It shows that 'Miss' is the optimal target for winners most often for lower pHitC probabilities, so more often at pC=25% than at pC=50%. But the important thing is that Miss can be the optimal target not only for some players, but also for winners - in certain cases.

The general observation is that 'Miss' is often the optimal target both for winners and for other players.



'Random' as optimal target

If we allow 'Random' as a target option in the 4-player quadruel, which by default means 'equal Random', we get the results shown below:

The above graphs indicate that there are no cases where 'equal Random' is the optimal target in the quadruel. That single green spot is likely a boundary exception ( a case where selecting random or some other choice results in equal win probabilities ).

We can also check whether 'equal Random' will be the optimal target for any player in different pHitC and pHitD scenarios:

The above graph demonstrates that 'equal Random' can be an optimal target in some cases for the 4-player quadruel - for example, in a noticeable number of cases when pHitC=pHitD=50%. But since 'Random' can often be an option in edge cases ( when a player gets the same win chance from different target choices ), and since those edge cases are more probable when pC equals pD, the last picture shows the difference when pD at 51% is just slightly larger than pC at 50%. Even there, 'equal Random' is sometimes the optimal target, but in a smaller number of cases and not for winners.

When looking only at optimal targets for winners, 'equal Random' appears less often, as seen in the 2nd row above. If we ignore the pC=pD=50% case, due to the high likelihood of edge cases, we see that 'equal Random' is the optimal choice for winners only in a very small number of cases.


The graph below shows the situation if we allow "unequal Random", limited to optRndLevel=2, meaning all random distributions in 50% steps ( e.g. [50% 0 50%], [0 0 50%], [100% 0 0], [0 0 0]... ):

While the above graph is limited to only 2 random levels (50% steps), previous scenarios demonstrated that even such a low resolution is good enough to indicate areas where 'Random' is the optimal target. We can see that 'unequal Random' is the optimal target for at least some players in certain cases ( 3rd image above ), but it is very rarely the optimal target for winners - only in a few edge cases shown as green dots on the 4th image.

To further check whether winners can use 'unequal Random' at some points, for different pHitC and pHitD and with slightly higher random granularity ( in 1/3 steps ), we can again use the 'uw' color option for 'unusual targets for winners'. In order to improve performance ( since optRndLevel=3 significantly slows things down ), the images are generated in transposed order compared to those for 'equal Random' - meaning that rows represent the hit percentage choice, while columns represent the image type ( unusual targets for all, for winners... ). That way the calculations are done only once for all images in one row, so we can include win order and win percentage images at no additional computational cost:

The first thing to note in the first row of the above images ( for pC=50%, pD=100%, with random level 3 ) is that the result looks similar to the previous graphs with random level 2 - unusual targets appear in roughly the same area. But the higher granularity of random levels appears to slightly increase the number of unusual targets, and the number of cases where 'unequal Random' is the optimal target - especially for winners.

Conclusions regarding 'Random' as the optimal target in the 4-player quadruel:



3.10 Selecting optimal targets when multiple targets result in the same win probability

Finding optimal targets out of all possible targets is not always a simple task, as explained in section 1.3 Optimal targets search, but it is mostly straightforward. However, there is one question that was not mentioned in detail in that section, and which can influence the optimal result:

What target should a player select if not just one, but several of his target choices result in the same best win probability for him ?

This question does not have such a straightforward answer as the rest of the optimal search, and the choice can be almost subjective. Some example approaches are:
1) select the most 'simple' target ( so single target > Miss > Random ), and if equally simple, prefer to target the player with the higher hit probability [Default]
2) always prefer to target the player with the higher hit probability ( so Miss is last )
3) prefer to target the player with the highest win chance ( as opposed to the highest hit chance )
4) single player > Random > Miss, and if the same, prefer A > B > C...

The solvers in this document support all the options listed above through the ifEQ parameter ( with the same values as above, and default value +1 ). The default option prefers the 'simplest' targets, and when two targets are equally simple ( for example, player A gets the same win probability whether he targets player B or player C, and both are equally 'simple' targets - neither Random nor Miss ), it prefers to target the player with the higher hit probability ( so in the previous example, target C if player C has the higher hit chance ).
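A sketch of how such a tie-break could look ( the semantics follow the four options listed above; function and variable names are illustrative, not the repository code ):

```python
def break_tie(candidates, if_eq, p_hit, p_win):
    # candidates : target options that give the shooter the same best win probability ;
    # each candidate is a player index (int), "Miss" or "Random"
    players = [t for t in candidates if isinstance(t, int)]
    if if_eq == 2 and players:   # 2) always prefer the player with the higher hit probability, Miss last
        return max(players, key=lambda t: p_hit[t])
    if if_eq == 3 and players:   # 3) prefer the player with the highest win chance
        return max(players, key=lambda t: p_win[t])
    if if_eq == 4:               # 4) single player > Random > Miss, ties broken in shooting order A > B > C
        rank = lambda t: (0, t) if isinstance(t, int) else ((1, 0) if t == "Random" else (2, 0))
        return min(candidates, key=rank)
    # 1) default : simplest target first ( single player > Miss > Random ), and among
    # equally simple player targets, the one with the higher hit probability
    rank = lambda t: (0, -p_hit[t]) if isinstance(t, int) else ((1, 0) if t == "Miss" else (2, 0))
    return min(candidates, key=rank)

# example : player A ties between shooting B, shooting C and Miss, with pHitB=0.7, pHitC=1.0
print(break_tie([1, 2, "Miss"], if_eq=1, p_hit=[0.5, 0.7, 1.0], p_win=[0.3, 0.3, 0.4]))  # -> 2 ( player C )
```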

Situations where this parameter influences the optimal solution are rare, and mostly occur in boundary/edge cases where two or more players have the same hit probability - so on standard AB graphs, that is mostly the diagonal where pA ~ pB. But if we set the hit probability of player C lower, at 50% for example, then edge cases also appear on the horizontal line at pA=50% and the vertical line at pB=50%. It is also possible to get these rare edge cases at other hit probabilities. On AB graphs, such edge cases usually appear as "glitches" - individual pixels or lines whose result color differs from the surrounding area. Note that they are not errors - those are the actual optimal results at those positions, under the selected ifEQ interpretation.

Probably the best case for comparing how different ifEQ choices can change optimal results is the 4-player case: not only due to the larger complexity and number of players, but also because there are more cases when two players have the same hit probability, and more chances that multiple players will have the same win probabilities. For example, we can look at the case presented in the 4-players section, where the player C hit chance was fixed at 50% and the player D hit chance was fixed at 100% - and, in order to make those edge cases more visible, with a lower resolution of 100x100 points:

We can see 'glitches' near the horizontal/vertical middle lines, which is expected since pC=50% and those lines are the places where either pA ~ pC ( horizontal middle line ) or pB ~ pC ( vertical middle line ). But we can also see a few edge cases at other positions, for example the single vertical line in the blue "unusual targets" graph.

If we make the same graph, but with the #2 option for ifEQ ( always prefer to target the player with the higher hit probability ), we get the results below:

The above graph looks practically identical to the previous ( default ) option, which confirms that those edge cases are actual optimal solutions under different ifEQ assumptions.

Below is the analysis with the #3 option for ifEQ ( prefer to target the player with the highest win chance ):

This time we can see more edge cases ( more "glitches" ) - preferring to target players with a higher win chance results in more edge cases. This can be explained by the fact that targeting those players reduces their win probability and makes the win probabilities of all players closer in general. And that increases the chance of edge cases, because it increases the chance that two different targets result in the same win probability.

Below is the analysis for the ifEQ=4 case. This ifEQ option prefers the most deterministic targets, so a single target before Random before Miss - and if two targets rank the same, it prefers players in shooting order ( A > B > C... ):

It looks very similar to the ifEQ=3 case, only with slightly more edge cases ( mostly green dots, indicating 'Random' as the optimal target ) on the diagonal and around pA ~ 50% ( the blue line indicating Miss as the optimal target on the "unusual targets" images ). But it also has slightly fewer glitches on the upper right edge ( where ifEQ=3 has a green line indicating 'Random' as the optimal target ).

In general, ifEQ options #1 (default) and #2 look similar and exhibit the fewest edge cases, while ifEQ options #3 and #4 have only slightly more. Overall, edge cases ("glitches") that change depending on the ifEQ parameter are a small minority, and most glitches visible on the graphs persist across all interpretations of the "what to select if two targets lead to the same win probability" question.




Copyright notice

Copyright (c) 2021 gmnenad

This document or its elements, including source code and Truel analysis results, are free to use and redistribute - as long as this article at gmnenad.com is referenced when appropriate.