It's Friday night, and you're at a party with a hundred other people. At some point, a strange-looking being walks in. The being calls himself Omega and claims he is very good at predicting future events. Omega has a lot of money (which is unsurprising if his claim is actually true). He also likes to play a game, whose rules he explains as follows:

"Here are two boxes, A and B. A is transparent and contains €1000. B is opaque and contains either €1.000.000 or nothing. You can choose to take both boxes, or only box B. The content of box B was fixed before the game started: if and only if I have predicted that you will take only box B, it contains €1.000.000. Otherwise, it is empty."

Omega plays the game with everyone at the party before he comes to you. About half the people won €1.000.000, and the others won €1000. Interestingly, it turns out Omega correctly predicted the choice of everyone at the party. Therefore, you know Omega is an (unreasonably) great predictor. But now it's your turn to play. Do you choose to take both boxes ("two-boxing"), or only box B ("one-boxing")?

Dominant strategy

One way to look at this problem is to look for a dominant strategy: a strategy that works better than any other strategy (here, there is only one other), independent of what Omega has done with box B. It turns out such a strategy exists: two-boxing! You see, if Omega put nothing in box B, then two-boxing is better than one-boxing because it gets you €1000 instead of nothing. But if Omega put a million in box B, then two-boxing still gets you €1000 more than one-boxing: it gets you €1.001.000 instead of €1.000.000. So no matter what Omega has done, it seems two-boxing is the better choice.
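To make the dominance argument concrete, here is a minimal sketch in Python (the names and layout are my own) that compares the payoff of each choice for both possible contents of box B:

```python
# Minimal sketch of the dominance argument: for each possible content of
# box B, compare the payoff of one-boxing with the payoff of two-boxing.

BOX_A = 1_000  # the transparent box always contains €1000

for box_b in (0, 1_000_000):       # what Omega may have put in box B
    one_boxing = box_b             # take only box B
    two_boxing = box_b + BOX_A     # take both boxes
    print(f"Box B holds €{box_b}: one-boxing €{one_boxing}, two-boxing €{two_boxing}")

# In both cases two-boxing yields exactly €1000 more; that is what makes it
# dominant, as long as the content of box B is treated as fixed regardless
# of your choice.
```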

Problem solved?

Well… It turns out things are a bit more difficult than this. Two-boxing may be the dominant strategy, but it has a clear problem: two-boxers almost always end up with a thousand euros, since Omega will almost always have correctly predicted their strategy. One-boxers, on the other hand, almost always get a million. Can we create some kind of theory that reliably tells us which action to choose for any problem we may face?

Well, Decision theory (more specifically Normative decision theory) is a field that attempts just that. Taken from LessWrong:

"Decision theory is the study of principles and algorithms for making correct decisions — that is, decisions that allow an agent to achieve better outcomes with respect to its goals."

An agent here can be anything that takes actions to achieve its goals. It can be a human, a mouse, a robot, etc. Goals can be anything; in the problem described above (known as Newcomb's problem), the goal is getting as much money from Omega as possible.

Expected utility maximization

An important concept in Decision theory is Expected utility maximization, which sounds complicated but really isn't. Taken from LessWrong:

"Utility is how much a certain outcome satisfies an agent's preferences."

So, for example, for most humans, being in a car accident has very low utility, whereas being on a nice vacation has very high utility. In Newcomb's problem, utility is strongly correlated with (or simply equals) the amount of money made. What is expected utility? Again LessWrong has a nice definition:

"Expected utility is the expected value in terms of the utility produced by an action. It is the sum of the utility of each of its possible consequences, individually weighted by their respective probability of occurrence."

For example, say I have a one in a million chance of getting killed when crossing the street. Getting killed while crossing the street has -10.000 "utility points" for me, and crossing the street safely gives me 100 utility points. Multiplying both utilities by their corresponding probabilities (1 in 1.000.000 and 999.999 in 1.000.000, respectively) and then summing the results gives me roughly 99.99 expected utility points. This result can be compared to the expected utility scores of other actions in order for me to choose which action to take. The reader should note that this example is simplistic: in the real world, there are often many more factors to consider.
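For the curious, the street-crossing calculation can be written out as a small Python sketch; the utilities and probabilities are the made-up numbers from the example above:

```python
# Expected utility of an action: sum over its possible outcomes of
# utility * probability.

def expected_utility(outcomes):
    return sum(utility * probability for utility, probability in outcomes)

cross_street = [
    (-10_000, 1 / 1_000_000),        # getting killed while crossing
    (100, 999_999 / 1_000_000),      # crossing safely
]

print(expected_utility(cross_street))  # roughly 99.99 (99.9899)
```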

Expected utility maximization, then, is simply choosing the action which leads to the outcome with the highest expected utility.

But of course, it's not that simple: it turns out to be quite difficult to determine which actions (e.g. crossing the street) lead to which outcomes (e.g. getting across the street safely).
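As a rough illustration of the maximization step itself, here is a sketch that picks the action with the highest expected utility, assuming we already know each action's possible outcomes and their probabilities (which, as just noted, is the hard part). The second action and its numbers are purely illustrative assumptions of mine:

```python
def expected_utility(outcomes):
    # outcomes: list of (utility, probability) pairs for one action
    return sum(utility * probability for utility, probability in outcomes)

actions = {
    "cross the street": [(-10_000, 1 / 1_000_000), (100, 999_999 / 1_000_000)],
    "stay home":        [(10, 1.0)],   # hypothetical alternative action
}

# Expected utility maximization: choose the action whose expected utility is highest.
best_action = max(actions, key=lambda action: expected_utility(actions[action]))
print(best_action)  # "cross the street", since roughly 99.99 > 10
```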

Different Decision Theories

The field of Decision theory has so far produced a number of, well, Decision theories (yes, the name of the field is unfortunate). One of the "standard" theories is Causal Decision Theory (CDT). Informally speaking, CDT states that the agent should consider the causal effects of her actions: in order to determine which action to choose, she looks at how the actions available to her causally relate to different outcomes.

Evidential Decision Theory (EDT), on the other hand, says an agent should look at how likely certain outcomes are given her actions (and observations), regardless of causality.

Let's look at how CDT and EDT handle Newcomb's problem (described at the top of this post) to better understand both theories and their differences.

An agent using CDT argues that at the moment she is making the decision to one-box or two-box, Omega has already either put a million euros or nothing in box B. Her own decision now can't change Omega's earlier decision; she can't cause the past to be different. Furthermore, no matter what the content of box B actually is, two-boxing gives an extra thousand euros. The CDT agent therefore two-boxes.

Now, consider an EDT agent facing Newcomb's problem. She argues as follows: "If I two-box, Omega will almost certainly have predicted this. Future-me two-boxing would therefore be strong evidence that box B is empty. If I one-box, Omega will almost certainly have predicted that too, so future-me one-boxing would be strong evidence that box B contains a million euros." Following this line of reasoning, the EDT agent, in contrast to the CDT agent, one-boxes.

It's important to note once again that Omega is an extremely good predictor. Therefore, he will almost certainly have correctly predicted both the CDT agent's and EDT agent's choice in advance. So, Omega predicted the CDT agent would two-box; therefore, Omega put nothing in box B and the CDT agent ends up with a thousand euros. The EDT agent ends up with a million, since Omega predicted she would one-box and put a million in box B. EDT is clearly the better Expected utility maximizer here.
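One way to see the disagreement numerically is the sketch below. The 0.99 prediction accuracy and the CDT agent's 50/50 prior over the content of box B are illustrative assumptions of mine, not part of the problem statement:

```python
BOX_A, BOX_B = 1_000, 1_000_000
ACCURACY = 0.99  # assumed probability that Omega predicts correctly

def edt_value(action):
    # EDT: the action itself is evidence about what Omega predicted,
    # so the probability of a full box B depends on the action.
    p_full = ACCURACY if action == "one-box" else 1 - ACCURACY
    return BOX_B * p_full + (BOX_A if action == "two-box" else 0)

def cdt_value(action, p_full=0.5):
    # CDT: box B is already filled or empty and my choice can't change that,
    # so the same probability of a full box is used for both actions.
    return BOX_B * p_full + (BOX_A if action == "two-box" else 0)

for action in ("one-box", "two-box"):
    print(action, "EDT:", edt_value(action), "CDT:", cdt_value(action))

# EDT ranks one-boxing far higher (990000 vs about 11000); CDT ranks two-boxing
# higher for any fixed p_full, mirroring the reasoning described above.
```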

So, EDT wins, right? Well, on this problem, yes. But behold…

The Smoking lesion problem

Taken from LessWrong:

"Smoking is strongly correlated with lung cancer, but in the world of the Smoker's Lesion this correlation is understood to be the result of a common cause: a genetic lesion that tends to cause both smoking and cancer. Once we fix the presence or absence of the lesion, there is no additional correlation between smoking and cancer.

Suppose you prefer smoking without cancer to not smoking without cancer, and prefer smoking with cancer to not smoking with cancer. Should you smoke?"

Here, the CDT agent chooses smoking: she reasons that smoking doesn't cause lung cancer in this imaginary world, and since she (as the problem states) prefers smoking to not smoking whether or not she gets cancer, she smokes. The EDT agent, however, reasons that if she were to smoke, that would be evidence of her having the genetic lesion, which causes lung cancer. She therefore doesn't smoke.

Note that here, the CDT agent does better than the EDT agent. The CDT agent smokes and enjoys it, while her choice has no effect on whether she gets cancer (only the lesion does); the EDT agent gives up the utility of smoking without actually lowering her chance of cancer.
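A similar sketch shows the two theories coming apart in the opposite direction on the Smoking lesion problem. All probabilities and utilities below are illustrative assumptions of mine; the only structural facts taken from the problem are that the lesion raises the probability of both smoking and cancer, and that smoking itself does not cause cancer:

```python
P_LESION = 0.2                  # prior probability of having the lesion
P_SMOKE_GIVEN_LESION = 0.9
P_SMOKE_GIVEN_NO_LESION = 0.1
P_CANCER_GIVEN_LESION = 0.8
P_CANCER_GIVEN_NO_LESION = 0.01
U_SMOKE = 100                   # utility of enjoying smoking
U_CANCER = -10_000              # utility of getting cancer

def cdt_value(smoke):
    # CDT: smoking has no causal effect on cancer, so the cancer term is the
    # same for both actions; only the smoking utility differs.
    p_cancer = (P_LESION * P_CANCER_GIVEN_LESION
                + (1 - P_LESION) * P_CANCER_GIVEN_NO_LESION)
    return (U_SMOKE if smoke else 0) + U_CANCER * p_cancer

def edt_value(smoke):
    # EDT: condition on the action. Smoking is evidence for the lesion
    # (Bayes' rule), which in turn raises the probability of cancer.
    p_smoke = (P_LESION * P_SMOKE_GIVEN_LESION
               + (1 - P_LESION) * P_SMOKE_GIVEN_NO_LESION)
    if smoke:
        p_lesion = P_LESION * P_SMOKE_GIVEN_LESION / p_smoke
    else:
        p_lesion = P_LESION * (1 - P_SMOKE_GIVEN_LESION) / (1 - p_smoke)
    p_cancer = (p_lesion * P_CANCER_GIVEN_LESION
                + (1 - p_lesion) * P_CANCER_GIVEN_NO_LESION)
    return (U_SMOKE if smoke else 0) + U_CANCER * p_cancer

for smoke in (True, False):
    label = "smoke" if smoke else "don't smoke"
    print(label, "CDT:", round(cdt_value(smoke)), "EDT:", round(edt_value(smoke)))

# CDT prefers smoking (-1580 vs -1680); EDT prefers not smoking (about -314
# vs -5469), matching how the two agents reason in the text.
```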

So, what can we conclude? Given how CDT and EDT handle Newcomb's problem and the Smoking lesion problem, it seems neither theory is perfect for maximizing Expected utility. In follow-up posts, we'll discuss more problems like Newcomb's problem and the Smoking lesion problem; we'll also take a look at more decision theories (for example Functional Decision Theory).

This post was partially based on this article by Eliezer Yudkowsky and Nate Soares.