A wild-goose chase after a wild guess

In this post we examine the role of a “wild guess” in decision-making, with a view to demonstrating that methodologies that make a “wild guess” a key ingredient of their analysis can easily turn out to be voodoo methodologies, namely exponents of a voodoo decision theory. Our main objective is to bring to light the pitfalls lurking in this proposition, because these may not be easily discernible to all those who come across a methodology proposing a “wild guess” as a key ingredient of its analysis. For, as experience has shown, the rhetoric accompanying such methodologies can easily conceal the plain facts about what these methodologies actually do, so that even experts can be fooled by them.

It ought to be pointed out, though, that the voodoo methodologies that are of concern to us in this discussion can be identified for what they are by universally accepted “tests”. It is therefore most unfortunate that experts who fail in their duty to apply these simple, well-known tests are instrumental in legitimizing the use and promotion of voodoo decision making.

To illustrate this point we examine info-gap decision theory’s alleged great prowess to obtain reliable results, under conditions of severe uncertainty, on the basis of a local analysis in the neighborhood of a poor point estimate that can be substantially wrong. For the benefit of those who are not familiar with this theory, it should be pointed out that info-gap decision theory is hailed (in the info-gap literature) as a theory offering a reliable methodology that seeks robust decisions for decision problems where a key parameter is subject to a severe uncertainty. The severity of the uncertainty is characterized by:

  • a vast (e.g. unbounded) uncertainty space,
  • a poor estimate of the parameter of interest that can be substantially wrong, and
  • a likelihood-free uncertainty model.

To simplify matters, assume that the parameter of interest is a real number u and that the uncertainty space is \mathbf{U}=(-\infty,\infty). This means that the true value of u can be any real number and what is more, that we are in no position to specify any likelihood structure on \mathbf{U} relating to the true value of u.

No doubt, the uncertainty in the true value of u, as stipulated by info-gap decision theory, is indeed severe.

So, how does info-gap decision theory propose to deal with such a situation?

The first question that immediately comes to mind is this: what would happen in situations where we would be unable to put forward an estimate of the true value of u? After all, there are situations where, for various reasons, this piece of information is not available. In such cases all we would be able to say is that the true value of u is an element of some set \mathbf{U}, call it the uncertainty space, and that this set consists of all the possible/plausible true values of u under consideration. In other words, the uncertainty under consideration would be characterized by these two properties:

  • The uncertainty space \mathbf{U} is vast (e.g. unbounded).
  • The uncertainty is likelihood-free.

It is important to note that the second property entails that there are no grounds to assume that the true value of the parameter of interest is more/less likely to be in the neighborhood of a given value of u, say u', than in the neighborhood of any other value of u, say u''.

The question is then what would be the implications of this state of affairs, where no estimate of the true value of u would be available, for info-gap decision theory?

It is important to appreciate that this question goes to the heart of info-gap decision theory. This is so because its application hangs on the availability of a point estimate of the true value of u. To wit, the point estimate is a key ingredient of info-gap’s uncertainty model, so that by necessity it is the fulcrum of its robustness model and its decision model. In other words, you cannot even contemplate using the info-gap methodology unless you put forward a point estimate of the true value of the parameter of interest.

It goes without saying that the results generated by info-gap’s robustness analysis are affected by the value of the point estimate. But what is more, the “quality” of the results generated by the analysis is contingent on the “quality” of the estimate used in the analysis.

Considering then the pivotal role of the estimate in info-gap decision theory’s methodology, the question arising is this: what should/can be done to enable implementing it in “pathologic” cases where no estimate is available?

As this question opens a “Pandora’s box” of complicated issues, we shall not go into it here. Instead, let us examine two possible practical solutions:

  • We nominate a completely arbitrary element of \mathbf{U} to serve as the point estimate. Namely, we let the estimate be a wild guess!
  • We decline to use the theory.

Still, neither option implies smooth sailing. So let us examine what each option entails.

Chasing a wild  GOOSE  guess

In many applications, nominating “a wild  GOOSE  guess” as an estimate of the (unknown) true value of the parameter of interest is done as a matter of course. So, in this sense, info-gap decision theory is not unique. What makes it unique, though, is its treatment of this “wild guess”. Namely, the role and place that it assigns to the “estimate” in the analysis that it prescribes performing to define and identify robust decisions. In fact, this is what makes the theory a voodoo decision theory par excellence.

And to see why this is so take note that:

  • All that the theory prescribes doing to this end is to conduct a local robustness analysis in the neighborhood of the (poor) estimate. More bluntly, it makes do with a local robustness analysis in the neighborhood of the (poor) estimate.
  • It prescribes no sensitivity analysis whatsoever for this point estimate.

Consider then the consequences of this methodology.

Local vs Global robustness

Keeping in mind the above characterization of the likelihood-free property, it is clear that the immediate implication of info-gap’s uncertainty model being likelihood-free is that there are no grounds whatsoever to assume that the true value of the parameter is more/less likely to be in the neighborhood of the estimate than in the neighborhood of any other value of the parameter. This implies that there are no grounds to focus the robustness analysis on the neighborhood of the estimate. Because, not only methodologically, but practically as well, there are no grounds to assume that the local robustness of a decision in the neighborhood of the estimate would be a good, or, for that matter, a bad indication of its robustness to severe uncertainty.

Indeed, to determine the robustness of a decision against the severe uncertainty in the parameter’s true value, it is imperative to examine the decision’s performance over the entire uncertainty space. This means that the robustness analysis must be global in nature. Namely, it must be based on a suitable definition of global robustness that seeks to take adequate account of the entire uncertainty space.
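To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration only: the bounded interval standing in for \mathbf{U}, the estimate, the two performance functions perf_a and perf_b, the required level R_STAR, and the “coverage” measure, which is just one of several conceivable definitions of global robustness (it counts grid points and attaches no likelihood to them). The sketch shows a decision that looks best locally and yet fares worst over the uncertainty space as a whole.

```python
import numpy as np

R_STAR = 0.0                                   # required performance level (assumed)
ESTIMATE = 0.0                                 # point estimate (assumed)
U_GRID = np.linspace(-100.0, 100.0, 200_001)   # bounded stand-in for the uncertainty space

def perf_a(u):
    # Hypothetical decision a: meets the requirement only when |u| <= 5.
    return 5.0 - np.abs(u)

def perf_b(u):
    # Hypothetical decision b: meets the requirement everywhere except 1 < u < 3.
    return np.abs(u - 2.0) - 1.0

def local_radius(perf):
    """Distance from ESTIMATE to the nearest grid point violating R_STAR <= perf(u)."""
    if perf(ESTIMATE) < R_STAR:
        return 0.0
    violating = U_GRID[perf(U_GRID) < R_STAR]
    if violating.size == 0:
        return float("inf")
    return float(np.min(np.abs(violating - ESTIMATE)))

def coverage(perf):
    """Fraction of grid points of the whole space at which the requirement is met."""
    return float(np.mean(perf(U_GRID) >= R_STAR))

for name, perf in (("a", perf_a), ("b", perf_b)):
    print(f"decision {name}: local radius = {local_radius(perf):.3f}, "
          f"coverage = {coverage(perf):.3f}")
```

Under these assumptions the local analysis around the estimate ranks decision a well ahead of decision b (a radius of about 5 against about 1), whereas decision b meets the requirement over roughly 99% of the stand-in space and decision a over roughly 5% of it.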

All this goes to show that info-gap decision theory’s prescription for robustness to uncertainty is reminiscent of the Lamppost Trick. Clearly, in the framework of info-gap decision theory’s methodology, the wild guess is assigned the role of the lamp in the Lamppost Trick.

And what is so remarkable in all this is that the estimate, which is admitted to be no more than a “wild guess”, is not even subjected to a sensitivity analysis.

Sensitivity analysis

For many years now, sensitivity analysis has been a vital component of the analysis of quantitative models, so much so that the proposition that a model’s key parameters be submitted to a sensitivity analysis is taken for granted. Thus, consider the following statements:

1. Introduction. A parameter sensitivity analysis (SA) is considered to be so important in any modeling activity that it has become a routine exercise that is expected of any modeling project.

Hearne, J. (2010, p. 107)
An automated method for extending sensitivity analysis to model functions
Natural Resource Modeling, 23(2), 107-120, 2010

6. Conclusion. Although parameter SA is expected of all models, an analysis of the functions used in a model is performed less frequently.

Hearne, J. (2010, p. 119)

But, in info-gap decision theory, which claims to offer a reliable methodology for decision under a “truly” severe uncertainty, the estimate \tilde{u} of its robustness model’s key parameter is not put to the test of a sensitivity analysis (stress test). And this in spite of the fact that the point estimate of this key parameter is assumed to be a poor indication of the parameter’s true value that can turn out to be substantially wrong.

It is important to realize that the local info-gap robustness analysis that info-gap decision theory conducts with respect to the parameter u is no substitute for a sensitivity analysis with respect to the point estimate \tilde{u}. The point to note here is that the info-gap robustness of a decision, that is obtained on grounds of \tilde{u}, can (and usually does) vary with the value of the estimate \tilde{u}.

To illustrate this point, consider the following figure. It displays the info-gap robustness analysis of a decision for two values of the estimate, namely \tilde{u}' and \tilde{u}''. The rectangle represents the uncertainty space \mathbf{U}.

The radii of the two blue circles represent the info-gap robustness of the decision under consideration relative to the two point estimates. Clearly, the decision under consideration is far more info-gap robust at \tilde{u}' than it is at \tilde{u}''.

However, in view of the fact that the uncertainty space of info-gap’s model is likelihood-free, the question arising is which of the two analyses, if indeed any of them, properly reflects the decision’s robustness against the severe uncertainty in the true value of the parameter. Which means of course that the estimate yielding such results ought to be put to the test. And yet, info-gap decision theory is utterly oblivious to this basic issue.
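The dependence of the result on the estimate can be seen in a minimal sketch. The setting is hypothetical: a one-dimensional parameter, interval neighborhoods N(\rho,\tilde{u})=[\tilde{u}-\rho,\tilde{u}+\rho], and a decision assumed to meet its performance requirement exactly when u lies in [LO, HI]. The function name and the numbers are chosen only for illustration. Sweeping the estimate, which is what a sensitivity analysis of \tilde{u} would amount to in this toy case, yields a different robustness for each value.

```python
def info_gap_robustness(estimate, lo, hi):
    """Radius of the largest interval centered at `estimate` that fits inside [lo, hi]."""
    if not lo <= estimate <= hi:
        return 0.0   # the requirement already fails at the estimate itself
    return min(estimate - lo, hi - estimate)

# Hypothetical setting: the decision meets its requirement exactly when -3 <= u <= 3.
LO, HI = -3.0, 3.0

# Sweep over candidate estimates (all of them equally "wild" guesses).
sweep = {est: info_gap_robustness(est, LO, HI) for est in (0.0, 1.0, 2.5, 5.0)}

for est, rho in sweep.items():
    print(f"estimate = {est:4.1f}  ->  robustness = {rho:.1f}")
```

In this toy setting the same decision has robustness 3.0 at the estimate 0.0, 0.5 at the estimate 2.5, and 0.0 at the estimate 5.0, and the likelihood-free model offers no grounds for preferring any one of these estimates.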

In sum, the theory seems utterly unconcerned about the fact that the results yielded by the analysis that it prescribes are based on a “wild guess” and that they may therefore be “highly suspect”. To the contrary, the results obtained from this analysis are given official sanction by the theory to the effect that the theory seeks decisions that are robust to severe uncertainty.

Universally accepted tests

Of course, the promulgation of unsubstantiated theories/methods is not a new phenomenon. So, over the years, common-sense tests have been put forward to enable the diagnosis of such theories. These tests seek to pinpoint the flaws in methods/theories that render them suspect. We illustrate how two such tests would be used to identify the flaws that undermine info-gap decision theory.

The “no free lunch” test

The no free lunch metaphor seeks to highlight the point made by the so-called No Free Lunch theorems in a range of areas of expertise. Broadly speaking, the idea here is that problems have certain inherent difficulties that must be dealt with directly, that is in a manner that takes on the specific issues that these difficulties give rise to, because otherwise these problems cannot be considered “solved”. This means that if these difficulties are not properly reckoned with in the analysis, they are certain to resurface at a later stage, when one would discover that the results yielded by the analysis in fact … failed to address the issues that one had set out to resolve in the first place.

The no free lunch effect is illustrated perfectly in the case of info-gap decision theory.

To explain, because an info-gap methodology (as we saw above) effectively ignores the severity of the uncertainty that info-gap decision theory claims to address, users of this theory are bound to discover that … having completed the analysis prescribed by this theory, they need to go back to … the difficulties presented by the severity of the uncertainty to deal with them themselves. This fact is stated eloquently in this observation:

Analysts who were attracted to IGT because they are very uncertain, and hence reluctant to specify a probability distribution for a model’s parameters, may be disappointed to find that they need to specify the plausibility of possible parameter values in order to identify a robust management strategy.

Hayes, K. (2011, p. 88, emphasis added)
Uncertainty and Uncertainty Analysis Methods
Final report for the Australian Centre of Excellence for Risk Analysis (ACERA)
CSIRO Division of Mathematics, Informatics and Statistics, Hobart, Australia
130 pp.

The point made by this observation is that if one sets out to tackle a problem that is subject to severe uncertainty, then sooner or later one will have to deal with the specific issues arising from the … uncertainty being severe.

So, in the case of info-gap decision theory, because its local likelihood-free robustness model cannot possibly cope with the issues arising from the severity of the uncertainty that info-gap decision theory postulates, users of this model sooner or later discover that they must come up with their own measures to deal with this uncertainty.

The inference therefore is that one would be well-advised to examine carefully any decision theory which nominates a poor point estimate as the basis of the analysis, to make sure that it does not offer a “free-lunch”. Namely, one had better make sure that the theory faces up to the estimate’s poor quality and that it takes the appropriate measures to meet this fact. Otherwise, as indicated by Hayes (2011), one might be disappointed to learn that it is impossible to justify, indeed verify, the validity of the results generated by the theory, without dealing properly with the … poor quality of the estimate.

Another means that should immediately reveal whether a methodology centered on a poor estimate is offering a “free lunch” is an appeal to the following maxim.

GIGO

Clearly, the well-known and frequently invoked Garbage In, Garbage Out (GIGO) maxim requires no commentary. Still, it is worth noting that keeping this maxim in mind can save one from the embarrassment of falling into the trap of a “free lunch methodology”. For the instruction it gives should remind one that the default assumption about the quality of the results generated by a model or an analysis fed with poor quality (garbage) input is that … the output can be expected to be on a par: garbage. Schematically,

\textrm{{\bf Garbage}\ In} \rightarrow \fbox{\raisebox{0.85cm}{\ }\ Model/Analysis\ \raisebox{-0.5cm}{\ }} \rightarrow \textrm{{\bf Garbage}\ Out}

Corollary:
The results of an analysis are only as good as the estimates on which they are based.

Hence,

\textrm{{\bf Poor}\ Estimate} \rightarrow \fbox{\raisebox{0.85cm}{\ }\ Model/Analysis\ \raisebox{-0.5cm}{\ }} \rightarrow \textrm{{\bf Poor}\ Results}

It is important to realize that the real trouble with voodoo decision-making methodologies is not that their results are obtained from analyses based on poor estimates, but that their interpretation of the results contravenes the GIGO maxim and its many corollaries. The picture is then this:

\textrm{{\bf Garbage}\ In} \rightarrow \fbox{\raisebox{0.85cm}{\ }\ Voodoo\  Analysis\ \raisebox{-0.5cm}{\ }} \rightarrow \textrm{{\bf Gold}\ Out}

Thus, given the severe uncertainty postulated by info-gap decision theory, and hence the poor quality of the estimate on which its analysis is based, the GIGO maxim implies the following obvious conclusions:

  • The inherently local orientation of info-gap’s robustness model, and the assumed poor quality of the estimate, mean that the results would be commensurate: of poor quality.
  • To meet this problem, it is imperative to adopt a global approach to robustness.

Methodologies that are based on decision theories that do not address these issues should be suspected of being voodoo methodologies.

Declining the use of a theory

Just as it is vital to establish whether a theory has the capabilities to deliver on what it claims to deliver, it is important to be able to determine when a theory should be rejected on the grounds that it is unsuitable for application in a case under consideration. Thus, when it comes to info-gap decision theory, the situation is straightforward. It is eminently clear that this theory is utterly unsuitable for the treatment of severe uncertainty of the type that it stipulates because it clearly lacks the capabilities required for this task.

That this is so is evidenced by the fact that the robustness model that it puts forward for the pursuit of robustness to uncertainty is a model of local robustness, to be precise a radius of stability model. As such, this model is designed to seek decisions that are robust against small perturbations in the nominal value of a parameter, meaning that, by definition, this is the only task it ought to be used for. However, as the pursuit of robustness to severe uncertainty (of the type postulated by it) requires the employment of a model that is designed to seek global robustness, it is clear that info-gap decision theory must be rejected on the grounds that its robustness model is unsuitable for this task.

That said, it is important to point out that instruction on how to properly approach the problem of robust decision-making under severe uncertainty can be found in the vast literature on robust optimization. It is also important to point out to the followers of info-gap decision theory that info-gap’s robustness model is in fact an extremely simple robust optimization model.

The bottom line

A decision theory claiming to deal with severe uncertainty must demonstrate that it properly addresses the … severity of the uncertainty under consideration. This requirement is particularly binding on any decision theory nominating a poor point estimate of the true value of the parameter of interest as the key ingredient of the robustness analysis. Such a theory must show that it properly addresses the … poor quality of the point estimate under consideration.

Universally accepted maxims such as there is no free lunch and garbage in — garbage out can easily identify/detect voodoo decision methodologies that are based on poor estimates but … ignore the poor quality of the estimate.

One can also watch out for the too good to be true signals.

Viva la Voodoo!

Moshe

Explore or Ignore?

The objective of this post is to set out a detailed explanation of the fact that radius of stability models of robustness are, by definition, models of local robustness and are therefore unsuitable for the management of a severe uncertainty that is characterized by a vast uncertainty space.

Some readers may wonder whether going to such great lengths to elaborate a fact that seems so patently obvious is in fact necessary. After all, the situation is crystal clear. The fact that radius of stability models are models of local robustness means that they are not designed to seek global robustness. From this it follows that they are unsuitable to take on a severe uncertainty that is manifested in a vast uncertainty space, because such an uncertainty must be handled by means of a global robustness analysis.

This observation is of course perfectly legitimate. Still, I submit that a detailed explanation of this issue is very much required, because it is important to make it clear to those who are not at home in this area of expertise that a number of (experienced) risk analysts/scholars do indeed propose to use radius of stability models (read: info-gap robustness models) to tackle problems that are subject to a severe uncertainty of this type. And what is more, it is important to make it clear that such a proposition is based on a misguided attribution of properties and capabilities of models of global robustness to models of local robustness. It is important to call attention to this error and to elucidate it because it is generally buried (in info-gap publications) under a pile of misleading rhetoric.

The discussion that follows makes this contention abundantly clear.

Consider then the following three clearly distinct (but related) sets that are associated with a parameter u whose true value, call it u^{*}, is unknown, indeed is subject to severe uncertainty:

  • The uncertainty space
    Let \mathbf{U} denote the set of all the possible/plausible values of u. The basic assumption is that u^{*} is an element of this set. Hence, the idea is to determine the robustness of the system considered against variations in the value of u over \mathbf{U}.
  • The active uncertainty set
    Let \mathbf{A} denote the subset of \mathbf{U} that effectively takes part in the robustness analysis. In other words, the set \mathbf{A} is that subset of \mathbf{U}, all of whose elements are reckoned with in the analysis so that they affect, or can affect, the results of the robustness analysis. We shall refer to the elements of \mathbf{A} as the active values of u.
  • The inactive uncertainty set
    Let \overline{\mathbf{A}}=\mathbf{U}\setminus\mathbf{A} denote the complement of \mathbf{A}. This is that subset of the uncertainty space \mathbf{U} whose elements are not reckoned with in the analysis so that they have no impact, indeed can have no impact, on the results generated by this analysis. We shall refer to the elements of \overline{\mathbf{A}} as the inactive values of u.

Example 1: global approaches

There are approaches to dealing with severe uncertainty that take the active uncertainty set \mathbf{A} to be equal to the uncertainty space \mathbf{U}. Obviously, such approaches hold that every possible/plausible value of u counts and must therefore be considered in the analysis.

The picture is this:


Figure 1

This picture speaks for itself so that no further comment on it is required.

Example 2: scenario generation

It is common practice in many applications to base the uncertainty/robustness analysis on a relatively small number of possible realizations (scenarios) of the parameter of interest, rather than on the entire uncertainty space \mathbf{U}. In fact, in some applications, the uncertainty space \mathbf{U} may consist of infinitely many elements, but the analysis would take into account no more than three values of u: an “optimistic” value of u, a “pessimistic” value of u, and an “average” value of u. In such cases, the active uncertainty set \mathbf{A} consists of no more than three elements.

The generic situation is depicted in the figure below:


Figure 2

This picture speaks for itself so that no further comment on it is required.

In cases where the objective is to test the system’s behavior over the uncertainty space \mathbf{U}, it may prove necessary to generate an extremely large number of scenarios so as to ensure that one adequately represents the variation of the value of u over \mathbf{U}. See for example the report Enhancing strategic planning with massive scenario generation: theory and experiments (2007), notably the following extract from the Preface:

As indicated by the title, this report describes experiments with new methods for strategic planning based on generating a very wide range of futures and then drawing insights from the results. The emphasis is not so much on “massive scenario generation” per se as on thinking broadly and open-mindedly about what may lie ahead. The report is intended primarily for a technical audience, but the summary should be of interest to anyone curious about modern methods for improving strategic planning under uncertainty.

That said, consider now the following extract from Wikipedia (January 10, 2012, emphasis added):

Scenarios planning starts by dividing our knowledge into two broad domains: (1) things we believe we know something about and (2) elements we consider uncertain or unknowable. The first component — trends — casts the past forward, recognizing that our world possesses considerable momentum and continuity. For example, we can safely make assumptions about demographic shifts and, perhaps, substitution effects for certain new technologies. The second component — true uncertainties — involve indeterminables such as future interest rates, outcomes of political elections, rates of innovation, fads and fashions in markets, and so on. The art of scenario planning lies in blending the known and the unknown into a limited number of internally consistent views of the future that span a very wide range of possibilities.

It is important to take note then that scenario planning, as suggested here by the term “art”, is generally an extremely difficult task. This is due to the difficulties that one would face in quantifying and implementing the many qualitative guidelines offered by the vast literature on scenario generation.
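The difference between a three-element active set and a massive one can be sketched in a few lines. The performance function, the required level, the three scenario values and the bounded grid standing in for \mathbf{U} are all hypothetical, chosen only to show how a small active set can miss what a dense one reveals.

```python
import numpy as np

R_STAR = 0.0   # required performance level (assumed)

def performance(u):
    # Hypothetical performance of one fixed decision; it degrades as u grows.
    return 50.0 - u

# Active set with three scenarios (assumed "pessimistic", "average", "optimistic" values).
three_scenarios = np.array([10.0, 0.0, -10.0])

# Dense set of scenarios spread over a bounded stand-in for the uncertainty space.
massive_scenarios = np.linspace(-100.0, 100.0, 2_001)

def meets_requirement_on(active_set):
    """True if the requirement R_STAR <= performance(u) holds for every active value of u."""
    return bool(np.all(performance(active_set) >= R_STAR))

print(meets_requirement_on(three_scenarios))     # True
print(meets_requirement_on(massive_scenarios))   # False
```

The decision passes on the three scenarios and fails on the dense set, simply because the values of u at which it fails (here u > 50) were never active in the first analysis.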

Example 3: local analysis

There are many applications where the object of interest is the behavior of a system in a small neighborhood of the uncertainty space \mathbf{U} rather than in the entire set \mathbf{U}. In such cases, the active set \mathbf{A} can be a neighborhood, call it N(\rho,\tilde{u}), of radius \rho around some element \tilde{u} of \mathbf{U}, where \tilde{u} is given a priori and \rho is determined according to some given recipe.

The generic situation is depicted in the figure below:


Figure 3

This picture speaks for itself so that no further comment on it is required.

Obviously, in cases such as these, one had better be able to provide a cogent argument explaining why the analysis is conducted not on the uncertainty space \mathbf{U} itself, but rather on a relatively small neighborhood N(\rho,\tilde{u}) thereof, where \tilde{u} denotes the center point of the neighborhood and \rho denotes its size (radius). The point is that experience has shown that there could be “good” and “bad” reasons for prescribing a local analysis. It is important therefore that the proposition to employ a local analysis be fully verifiable to allow prospective users to ascertain (in each case) whether such a proposition is indeed sound for the case considered.

It goes without saying that such a justification would be imperative in cases where the uncertainty analysis is claimed to deal with the full spectrum of the variability of u, including values representing “rare events”, “surprises”, “catastrophes”, and so on. The onus would then be on anyone making such claims (e.g. info-gap scholars) to explain, indeed justify, how a local analysis in a small neighborhood N(\rho,\tilde{u}) of the large space \mathbf{U} can possibly do the job it is claimed to do.

Discussion

Having clarified the distinctive characteristics of the sets that might figure in an uncertainty analysis, my next task is to call attention to the fact that, surprising though it may sound, some scholars/analysts seem to confuse the following two distinctly different concepts/objects:

  • The set of possible/plausible values of a parameter, denoted above by \mathbf{U}.
  • The set of values of the parameter that actively participate in an analysis, denoted above by \mathbf{A}.

It is important to be aware of this fact as it is a prerequisite for a correct assessment of the results reported on in certain publications dealing with the management of severe uncertainty. For, as might be expected, those scholars/analysts who confuse the two concepts/objects effectively misconstrue the results yielded by the analysis that they perform. The following example illustrates this point.

Example: info-gap decision theory

The robustness analysis that is prescribed by info-gap decision theory is manifestly a local robustness analysis, namely the robustness of a decision is determined/defined by a model of local robustness. The implication therefore is that, unless proven otherwise, the decision in question is/can be only locally robust/fragile. But the rhetoric in this literature depicts the results yielded by the robustness analysis as though they were yielded by a global analysis. So much so that some info-gap scholars contend that info-gap’s robustness analysis yields a decision that generates satisfactory outcomes under the widest set of possible values of the parameter of interest.

So, the question that those asserting this claim must answer is this:

How can a model of local robustness possibly yield a decision that generates satisfactory outcomes under the widest set of possible values of the parameter of interest?

If you take the trouble to look into the issues raised by this question, you immediately see that info-gap’s robustness analysis does not seek such a decision because:

  • The robustness analysis prescribed by info-gap decision theory does not seek a decision that generates satisfactory outcomes under the widest set of possible values of the parameter of interest.
  • All that this robustness analysis prescribes doing is to seek a decision that generates satisfactory outcomes over the largest neighborhood N(\rho,\tilde{u}) of a given nominal value \tilde{u} of the parameter.

And to see that this is so, keep in mind the above distinction between the uncertainty space \mathbf{U} and the active uncertainty set \mathbf{A}. Take note then that in the case of info-gap decision theory, insofar as a decision x\in X is concerned, the active uncertainty set is the following:

\displaystyle \mathbf{A} = N(\rho^{*},\tilde{u})

where \rho^{*} can be any real number that is larger than

\displaystyle \hat{\rho}(x,\tilde{u}):= \max_{\rho\ge 0}\ \{\rho: r^{*} \le r(x,u), \forall u\in N(\rho,\tilde{u})\}

recalling that \hat{\rho}(x,\tilde{u}) denotes the robustness of decision x.

This is shown schematically in the figure above, where the uncertainty space \mathbf{U} is represented by the large rectangle and the active set \mathbf{A}=N(\rho^{*},\tilde{u}) is represented by the small yellow circle.
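The role of the active set can be checked numerically. The sketch below is a minimal illustration under assumptions of my own: a one-dimensional u on a bounded grid, interval neighborhoods N(\rho,\tilde{u}), a toy performance function r(x,u)=x-u^2 with r^{*}=0, and two variants that differ from it only outside N(\rho^{*},\tilde{u}) with \rho^{*}=3. It evaluates \hat{\rho}(x,\tilde{u}) by expanding \rho until the requirement is first violated, and shows that whatever happens in the inactive set leaves the result unchanged.

```python
import numpy as np

U_GRID = np.linspace(-50.0, 50.0, 10_001)   # bounded stand-in for the uncertainty space
RHOS = np.linspace(0.0, 50.0, 5_001)        # candidate radii, in increasing order

def robustness(perf, x, estimate, r_star):
    """Largest rho in RHOS with r_star <= perf(x, u) for every grid u in N(rho, estimate)."""
    best = 0.0
    for rho in RHOS:
        inside = U_GRID[np.abs(U_GRID - estimate) <= rho]   # grid points of N(rho, estimate)
        if np.all(perf(x, inside) >= r_star):
            best = float(rho)
        else:
            break   # neighborhoods are nested, so a larger rho cannot restore feasibility
    return best

def perf(x, u):
    return x - u**2   # toy r(x, u)

def perf_good_far_away(x, u):
    # Same as perf inside N(3, 0), excellent performance everywhere else.
    return np.where(np.abs(u) <= 3.0, x - u**2, 1000.0)

def perf_bad_far_away(x, u):
    # Same as perf inside N(3, 0), catastrophic performance everywhere else.
    return np.where(np.abs(u) <= 3.0, x - u**2, -1000.0)

results = [robustness(f, x=4.0, estimate=0.0, r_star=0.0)
           for f in (perf, perf_good_far_away, perf_bad_far_away)]
print(results)
```

The three results coincide, at approximately 2 (up to the grid resolution), even though the three performance functions differ drastically over most of the stand-in space: values of u outside N(\rho^{*},\tilde{u}) simply never enter the calculation.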

Of course, the more basic issue that this figure brings into sharp focus is info-gap decision theory’s “unique” approach to severe uncertainty. And to explain this point let us examine the following question:

What measures does info-gap decision theory take, more precisely, what measures does info-gap’s robustness model put in place in order:

  • to deal adequately with the uncertainty being severe, namely
  • to ensure that the active uncertainty set \mathbf{A} properly represents the variability of u over \mathbf{U}?

And the answer to this question is this:

The measure that info-gap decision theory takes to deal with the difficulties arising from the uncertainty being severe is to … ignore the severity of the uncertainty. This fact is brought out forcefully by the above figure, which illustrates the profound (one might say comical) incongruity between the huge challenge posed by the severity of the uncertainty that the theory claims to address, and the localized robustness analysis that it prescribes to meet this challenge.

More specifically, while the theory claims to take on a severe uncertainty that is manifested in

  • a vast (e.g. unbounded) uncertainty space and a poor estimate that can turn out to be substantially wrong,

the weapon that it proposes to deal with this uncertainty is

  • a robustness analysis that makes do with establishing the size of the smallest perturbation in the estimate that can cause a violation of the performance requirement.

And if this were not enough, this profound incongruity is further exacerbated by declarations in the info-gap literature that this type of analysis puts at the analyst’s disposal a reliable methodology for dealing with “rare events”, “surprises”, “catastrophes”, etc.

I refer to this incongruity as the explore but ignore effect, to wit:

  • Explore: By expressly positing a vast (e.g. unbounded) uncertainty space, info-gap decision theory presumably gives notice that it proposes to explore in depth the possible/plausible variations in the value of u over its entire uncertainty space.

However,

  • Ignore: By employing a radius of stability robustness analysis, the theory effectively confines the search to decisions that are robust against small perturbations in the value of the estimate. It therefore ignores the performance of decisions over the bulk of the uncertainty space (No Man’s Land).

And what is so remarkable in all this is that info-gap scholars seem to have no clue of this incongruity. For how else can one explain the big fuss that is made in the info-gap literature about the supposed capability of info-gap’s robustness analysis to explore the entire uncertainty space \mathbf{U}?

The (misguided) argument that info-gap scholars put forward to substantiate this claim runs as follows:

  • The nested neighborhoods N(\rho,\tilde{u})\subseteq \mathbf{U}, \rho\ge 0, in info-gap’s uncertainty model expand as \rho increases.
  • Furthermore, these neighborhoods are constructed/defined so that as \rho \rightarrow \infty the neighborhood N(\rho,\tilde{u}) approaches \mathbf{U}.
  • So for a sufficiently large \rho, the neighborhood N(\rho,\tilde{u}) is sufficiently similar to \mathbf{U}.

In short, the family of nested neighborhoods N(\rho,\tilde{u}), \rho\ge 0, spans the uncertainty space \mathbf{U}.

This is illustrated in the picture below:


Figure 4

The neighborhoods N(\rho,\tilde{u}), \rho\ge 0, are represented by the gray circles centered at \tilde{u} and the uncertainty space \mathbf{U} is represented by the largest (blue) circle.

However!

While it is no doubt true that these neighborhoods do indeed span the uncertainty space, the important point to note here is that the key factor that drives info-gap’s robustness analysis and directly determines the results yielded by it is the performance level that the decisions are required to meet. In other words, this valid argument does not address at all the extent to which, if any, info-gap’s robustness analysis takes into consideration the performance levels in areas of the uncertainty space that are distant from the estimate \tilde{u}.

Differently put, the info-gap robustness of decision x is determined in total disregard of the performance of x in relation to values of u that are outside the neighborhood N(\rho^{*},\tilde{u}), where \rho^{*} is any real number greater than \hat{\rho}(x,\tilde{u}).

For this reason I refer to the set \mathbf{U}\setminus N(\rho^{*},\tilde{u}) as the No Man’s Land of decision x at \tilde{u}.

What this metaphor brings out is that it is immaterial whether your algorithm for computing the value of \hat{\rho}(x,\tilde{u}) is capable of exploring the entire uncertainty space \mathbf{U}. The key element here is that this value is not affected by the performance of x over the No Man’s Land \mathbf{U}\setminus N(\rho^{*},\tilde{u}). The inference therefore is that if it is unnecessary to explore the No Man’s Land \mathbf{U}\setminus N(\rho^{*},\tilde{u}) in order to determine the value of \hat{\rho}(x,\tilde{u}), then exploring the entire uncertainty space is in fact wasteful.

It goes without saying that some algorithms for determining the radius of stability of systems exploit this fact.
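
To make this concrete, here is a minimal Python sketch. It is not taken from the info-gap literature. It assumes a one-dimensional uncertainty space, interval neighborhoods N(\rho,\tilde{u})=[\tilde{u}-\rho,\tilde{u}+\rho], and a hypothetical grid-based routine called radius_of_stability. The two performance functions are made up for illustration: they agree for |u| ≤ 3 and are complete opposites beyond that.

```python
def radius_of_stability(r, r_star, u_est, step=0.125, rho_max=1000.0):
    """Grid approximation of the largest rho such that r(u) >= r_star
    at both end points of every interval [u_est - rho, u_est + rho]
    scanned so far.

    The scan moves outward from the estimate and stops at the first
    violation, so points beyond that violation are never evaluated.
    """
    k = 0
    while k * step <= rho_max:
        rho = k * step
        if r(u_est - rho) < r_star or r(u_est + rho) < r_star:
            return max(k - 1, 0) * step  # last radius without a violation
        k += 1
    return rho_max


def r_good_far(u):
    # Hypothetical decision: fails only on 2 < |u| <= 3, fine everywhere else.
    return -1.0 if 2.0 < abs(u) <= 3.0 else 1.0


def r_bad_far(u):
    # Hypothetical decision: fine on |u| <= 2, fails everywhere else.
    return 1.0 if abs(u) <= 2.0 else -1.0


rho_good = radius_of_stability(r_good_far, r_star=0.0, u_est=0.0)
rho_bad = radius_of_stability(r_bad_far, r_star=0.0, u_est=0.0)
print(rho_good, rho_bad)  # the two radii are identical
```

The two decisions differ on practically the whole of the uncertainty space, yet the computed radius is the same for both, because nothing beyond the first violation enters the calculation.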

The following sequence of pictures is designed to make vivid the errors in the argument that info-gap decision theory seeks decisions that are robust against severe uncertainty.

The first picture calls attention to the fact that an info-gap analysis presupposes a distinction between “acceptable” and “unacceptable” values of u. The shaded area represents the set of “acceptable” values of u.


Figure 5

Next, only neighborhoods that are contained in the region of acceptable values of u are admissible in an info-gap robustness analysis. This is illustrated in the picture below.


Figure 6

Hence, the info-gap robustness of the decision depicted in this picture is equal to the radius of the largest circle (neighborhood) contained in the shaded area. The larger circles (neighborhoods) are not admissible.

So the result of info-gap’s robustness analysis can be summarized by the following picture:


Figure 7

The info-gap robustness of decision x, denoted \hat{\rho}(x,\tilde{u}), is equal to the radius of the largest (green) circle centered at \tilde{u} that is contained in the shaded area. To be precise, the radius of this circle, namely the info-gap robustness of the decision under consideration, takes no account whatsoever of the decision’s performance (the shape of the shaded area) outside a slightly larger circle N(\rho^{*},\tilde{u}), where \rho^{*} is slightly larger than \hat{\rho}(x,\tilde{u}).
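
In symbols, this picture corresponds to the familiar radius of stability expression. The following is a sketch in the notation of this post, where r(x,u) and r^{*} are assumed to denote the performance function and the required performance level, so that the shaded area is the set of values of u satisfying r^{*}\le r(x,u):

\displaystyle \hat{\rho}(x,\tilde{u}) := \max\ \{\rho\ge 0: r^{*}\le r(x,u), \forall u\in N(\rho,\tilde{u})\}

Since the performance requirement is imposed only on u\in N(\rho,\tilde{u}), values of u outside N(\rho^{*},\tilde{u}) play no role in this expression for any \rho^{*} > \hat{\rho}(x,\tilde{u}).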

This is illustrated in the following picture that depicts the No Man’s Land of info-gap’s robustness analysis.


Figure 8

Now.

To the best of my knowledge, most countries do not ban analysts from conducting their robustness analysis in the No Man’s Land.

Still.

The whole point of the No Man’s Land metaphor is to illustrate a situation where the performance of a decision over its No Man’s Land is not taken into account in the robustness analysis. The message of this illustration is of course that in such a situation (as in the case of info-gap’s robustness analysis) the robustness of such a decision cannot be claimed to represent the decision’s performance over the No Man’s Land. So, if the No Man’s Land is vast and/or the performance of the decision over this region of the uncertainty space is a determinant factor in the analysis, then the robustness analysis cannot be claimed to reflect the robustness of the decision over the uncertainty space.

Size of the No Man’s Land

I have been accused, on a number of occasions, of deliberately misrepresenting the implications of info-gap’s robustness analysis by drawing the region covered by this analysis as being far smaller than the No Man’s Land, let alone the uncertainty space.

Of course, given my discussion so far, I could have dismissed these accusations outright as lacking in any merit. However, because a reply to them brings out the full dimensions of the fundamental flaw in info-gap’s robustness analysis, I think it important to take them up.

My reply to these accusations is that far from misrepresenting the facts about info-gap’s robustness analysis, my depiction of the No Man’s Land effect in the context of info-gap’s robustness analysis is in fact extremely charitable. Indeed, my depiction of info-gap’s No Man’s Land is greatly in info-gap’s “favor” because its size in this picture is immeasurably smaller than what it ought to be.

This is so because, according to the Father of info-gap decision theory (Ben-Haim 2001, p. 208; 2006, p. 210; emphasis added):

Most of the commonly encountered info-gap models are unbounded.

This means that in the case of the “most commonly encountered” models, the No Man’s Land would typically be unbounded. And the implication is that my depiction of the region covered by info-gap’s robustness analysis vastly exaggerates its size: the region covered by info-gap’s robustness is infinitesimally small compared to the No Man’s Land.
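
A back-of-the-envelope calculation makes this concrete. Assume, purely for illustration, a one-dimensional bounded uncertainty space \mathbf{U}=[\tilde{u}-M,\tilde{u}+M] and interval neighborhoods N(\rho,\tilde{u})=[\tilde{u}-\rho,\tilde{u}+\rho], with \rho^{*}\le M. Both are assumptions of this sketch, not part of Ben-Haim’s statement. Then the fraction of the uncertainty space covered by the analysis is

\displaystyle \frac{|N(\rho^{*},\tilde{u})|}{|\mathbf{U}|} = \frac{2\rho^{*}}{2M} = \frac{\rho^{*}}{M} \rightarrow 0 \ \ \text{as}\ \ M\rightarrow \infty

for any fixed \rho^{*}. So the larger the uncertainty space, the smaller the share of it that the analysis covers, and in the unbounded case this share vanishes altogether.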

The Explore and/or Ignore quandary

My experience of the past eight years has shown that those who have fallen under the spell of info-gap decision theory have no clue of the basic contradiction that lies at its core. This is a contradiction between what this theory claims to do (seek decisions that are robust against severe uncertainty) and what it actually does (seek decisions that are robust against small perturbations in a given value of the parameter of interest).

Based on numerous discussions that I have had over the past eight years with info-gap scholars, I can confidently state that their failure to detect this contradiction is due to a more basic incomprehension of the difference between local and global robustness. Info-gap users therefore have no qualms about asserting that info-gap’s robustness analysis explores the uncertainty space \mathbf{U} in depth, not realizing that because this analysis is confined to the neighborhood of a point estimate (which is assumed to be poor and can be substantially wrong), it effectively ignores the bulk of the uncertainty space \mathbf{U}, hence the severity of the uncertainty under consideration.

Hence, the quandary that info-gap scholars find themselves in is this. If they advocate info-gap decision theory as a tool for dealing with a severe uncertainty of the type it claims to address, they must explain how this theory, which in fact ignores the severity of the uncertainty, can properly be advocated for this task. If on the other hand, they recognize that as a model of local robustness info-gap’s robustness model is not designed to explore in depth the uncertainty space, then they must explain how this fact squares with the rhetoric in the info-gap literature which hails info-gap decision theory as particularly suitable for the management of severe uncertainty.

That said, I should point out that these matters are not discussed in the info-gap literature. As a result, the already entrenched misconceptions about the purported capability of info-gap’s robustness model to explore the uncertainty space become further exacerbated, as illustrated for instance by a recent peer-reviewed article which proposes info-gap decision theory as a suitable method for handling not only Black Swans, but also unknown unknowns (see Review 17).

Viva la Voodoo!

Moshe

Voodoo decision theories

Some readers seem to have difficulty finding my definition of the key term Voodoo Decision Theory on this site. Take note then that this term is discussed/explained on the Voodoo Science page. For your convenience, here is a copy of this page:

The terms “voodoo economics”, “voodoo mathematics”, “voodoo statistics”, “voodoo ecology” and so on, seem to have been coined with the same object in mind: to put across the idea captured in point 4 of the definition given in the old ENCARTA dictionary (color added):

Voodoo n

  1. A religion practiced throughout Caribbean countries, especially Haiti, that is a combination of Roman Catholic rituals and animistic beliefs of Dahomean enslaved laborers, involving magic communication with ancestors.
  2. Somebody who practices voodoo.
  3. A charm, spell, or fetish regarded by those who practice voodoo as having magical powers.
  4. A belief, theory, or method that lacks sufficient evidence or proof.

It should be pointed out, therefore, that the term “voodoo theory” is used in this blog to convey the thinking summed up in point 4 of the above definition. Hence, in this discussion a voodoo theory designates a theory that lacks sufficient evidence or proof, and/or is based on utterly unrealistic and/or contradictory assumptions, spurious correlations, and so on.

I should point out, though, that the term "Voodoo Decision Theory" is not my coinage (what a pity!):

The behavior of Kropotkin’s cooperators is something like that of decision makers using the Jeffrey expected utility model in the Max and Moritz situation. Are ground squirrels and vampires using voodoo decision theory?

Brian Skyrms (1996, p. 51)
Evolution of the Social Contract
Cambridge University Press.

To illustrate a voodoo decision theory in action, consider this.

Example

Suppose that your task is to determine how a given function, f=f(x), behaves on the interval X=[-1000,1000]. For instance, assume that the issue is the constraint f(x) ≥ 0. That is, assume that you want to know how robust this constraint is over the interval X=[-1000,1000].

Also, assume that evaluating function f on X is difficult and/or costly.

Then, a voodoo decision theory would come to the rescue as follows: instead of examining the constraint f(x) ≥ 0 over X, it would prescribe testing it only over a small subset of X, say X’=[-1,1].

Now suppose that the constraint f(x) ≥ 0 performs well on X’=[-1,1]. What can we say about the performance of this constraint on X=[-1000,1000]?

[Figure: the interval X=[-1000,1000], with X’=[-1,1] in the middle and No Man’s Land on either side of it]

Well, if you espouse voodoo decision theory, you would argue that the performance of f(x) ≥ 0 on X’=[-1,1] provides a good indication of the performance of f(x) ≥ 0 on X=[-1000,1000], and therefore the constraint f(x) ≥ 0 performs well on X. In other words, you would argue that the performance on X’=[-1,1] is representative of the performance on X=[-1000,1000].

You can save a lot of $$$$$$$ this way: instead of evaluating the performance of a system over the required large space, you quickly evaluate its performance only on a relatively small subset of the required space.

However, seeing through the nonsensical argument made by voodoo decision theory, you would argue that this is absurd because:

  • X’=[-1,1] constitutes a tiny part of X=[-1000,1000], in fact only 0.1 percent of it.
  • All the points in X’ are in the same neighborhood.
  • Therefore X’ is not representative of X insofar as the performance of f(x) ≥ 0 is concerned.
  • Therefore, hardly anything can be deduced about the performance of f(x) ≥ 0 on X from the performance of f(x) ≥ 0 on X’.
  • All we can say is that f(x) ≥ 0 performs well on X’.

Got the drift?
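
For those who prefer numbers, here is a minimal Python sketch of this example. The function f below is hypothetical, chosen only for illustration (it is nonnegative exactly when |x| ≤ 10), and the helper fraction_satisfying is likewise an assumption of this sketch. It tests the constraint on an evenly spaced grid.

```python
def f(x):
    # Hypothetical function, for illustration only:
    # f(x) >= 0 exactly when |x| <= 10.
    return 1.0 - (x / 10.0) ** 2


def fraction_satisfying(lo, hi, n=2001):
    """Fraction of n evenly spaced points in [lo, hi] with f(x) >= 0."""
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return sum(f(x) >= 0 for x in xs) / n


local_fraction = fraction_satisfying(-1, 1)          # performance on X'
global_fraction = fraction_satisfying(-1000, 1000)   # performance on X
share_of_X = (1 - (-1)) / (1000 - (-1000))           # length of X' / length of X

print(f"X' is {share_of_X:.1%} of X")
print(f"constraint holds on {local_fraction:.1%} of X'")
print(f"constraint holds on {global_fraction:.1%} of X")
```

For this particular f the constraint holds at every tested point of X’ and at only about one percent of the tested points of X. A different f could of course behave well on all of X. The point is that the test on X’ cannot tell the two cases apart.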

Of course, some readers may question the significance of this example, arguing that it is hyperbolic. After all, who would so much as contemplate suggesting that X’=[-1,1] is representative of X=[-1000,1000] with regard to the constraint f(x) ≥ 0 (unless f has some very special properties)?

My answer to this is that, as we shall see, this example is not an exaggeration. It is implicit in the type of argument used by experienced senior analysts in academia and business/industry (e.g. banks) to justify the application of the methodology that they propose/develop (see the discussion on the No Man’s Land syndrome associated with info-gap decision theory).

Viva la Voodoo!

Moshe

Info-gap decision theory: Reality Check

Info-gap decision theory’s primary texts, namely the three books by Ben-Haim (2001, 2006, 2010), present the theory as though it constitutes a major breakthrough in the quantification of uncertainty and in decision-making under severe uncertainty.

Not only are these claims uncorroborated, but the central model deployed by info-gap decision theory, namely its robustness model, is in fact a reinvention of a staple model of robustness, known universally as radius of stability (circa 1960).

But more than this, from the standpoint of decision theory, this model is a very simple instance of Wald’s famous maximin model (circa 1940), the foremost model for dealing with a non-probabilistic uncertainty used in decision theory, robust optimization, robust control, robust statistics, etc.

Adherents of info-gap decision theory would therefore be well advised to take a hard critical look at the following statements, which they apparently take seriously:

Info-gap decision theory is radically different from all current theories of decision under uncertainty. The difference originates in the modeling of uncertainty as an information gap rather than as a probability. The need for info-gap modeling and management of uncertainty arises in dealing with severe lack of information and highly unstructured uncertainty.

Ben-Haim (2001, 2006, p. xii)

In this book we concentrate on the fairly new concept of information-gap uncertainty, whose differences from more classical approaches to uncertainty are real and deep. Despite the power of classical decision theories, in many areas such as engineering, economics, management, medicine and public policy, a need has arisen for a different format for decisions based on severely uncertain evidence.

Ben-Haim (2001, 2006, p. 11)

Probability and info-gap modelling each emerged as a struggle between rival intellectual schools. Some philosophers of science tended to evaluate the info-gap approach in terms of how it would serve physical science in place of probability. This is like asking how probability would have served scholastic demonstrative reasoning in the place of Aristotelian logic; the answer: not at all. But then, probability arose from challenges different from those faced the scholastics, just as the info-gap decision theory which we will develop in this book aims to meet new challenges.

Ben-Haim (2001 and 2006, p. 12)

The emergence of info-gap decision theory as a viable alternative to probabilistic methods helps to reconcile Knight’s dichotomy between risk and uncertainty. But more than that, while info-gap models of severe lack of information serve to quantify Knight’s ‘unmeasurable uncertainty’, they also provide new insight into risk, gambling, and the entire pantheon of classical probabilistic explanations. We realize the full potential of the new theory when we see that it provides new ways of thinking about old problems.

Ben-Haim (2001 p. 304; 2006, p. 342)

Info-gap decision theory clearly presents a ‘replacement theory’ with which we can more fully understand the relation between classical theories of uncertainty and uncertain phenomena themselves.

Ben-Haim (2001 p. 305; 2006, p. 343)

The management of surprises is central to the “economic problem”, and info-gap theory is a response to this challenge. This book is about how to formulate and evaluate economic decisions under severe uncertainty. The book demonstrates, through numerous examples, the info-gap methodology for reliably managing uncertainty in economics policy analysis and decision making.

Ben-Haim (2010, p. x)

Info-gap scholars would also do well to consider the following two very simple questions:

  • Given that info-gap’s robustness model and info-gap’s decision model for robustness are simple robust optimization models, how is it that not a single reference can be found in Ben-Haim (2001, 2006, 2010) to the vast literature on robust optimization?
  • How is it that there is not a single reference in Info-gap Economics: An Operational Introduction (Ben-Haim 2010) to Hansen and Sargent’s extensive work (e.g. their 2008 book Robustness) on the use of robustness models in economics?

    For the benefit of info-gap scholars, here is the publisher’s description of the book:

    The standard theory of decision making under uncertainty advises the decision maker to form a statistical model linking outcomes to decisions and then to choose the optimal distribution of outcomes. This assumes that the decision maker trusts the model completely. But what should a decision maker do if the model cannot be trusted?

    Lars Hansen and Thomas Sargent, two leading macroeconomists, push the field forward as they set about answering this question. They adapt robust control techniques and apply them to economics. By using this theory to let decision makers acknowledge misspecification in economic modeling, the authors develop applications to a variety of problems in dynamic macroeconomics.

    Technical, rigorous, and self-contained, this book will be useful for macroeconomists who seek to improve the robustness of decision-making processes.

In short, info-gap decision theory is in dire need of a serious reality check.

PostScript:

All that is “new” and “revolutionary” in info-gap decision theory is the proposition to use a model of local robustness (radius of stability) to manage a severe uncertainty expressed in terms of a vast (e.g. unbounded) uncertainty space. It is precisely this proposition that makes info-gap decision theory a voodoo decision theory par excellence.

Viva la Voodoo!

Maximin without the min

Experience has shown that many scholars/analysts who are not at home with the modeling aspects of the maximin paradigm are surprised to learn that true-blue maximin models such as this

\displaystyle z^{*}:= \max_{y\in Y}\min_{u\in \Pi(y)}\ \{g(y,u): C(y,u),\forall u\in \Pi(y)\}

can have an equivalent phrasing, without the iconic inner \displaystyle \min_{u\in \Pi(y)} operation. Here C(y,u) represents a list of constraints on (y,u) pairs.

The following result, that is used widely in game theory and robust optimization, explains how to dispose of this iconic operation:

\textsc{Theorem}.

\displaystyle \min_{a\in A} h(a) \equiv \max_{v\in \mathbb{R}} \{v: v\le h(a), \forall a\in A\}

assuming that the \min is attained on A, where \mathbb{R} denotes the real line.

The \equiv sign indicates that the two optimization problems are equivalent in the sense that they yield the same optimal value: the optimal value of v is equal to the optimal value of h(a). Moreover, at the optimal value of v the constraint v\le h(a) is binding precisely for the value(s) of a that minimize h over A.

The following is then an immediate implication of the theorem:

\textsc{Corollary}.

\begin{array}{rcl}      \displaystyle z^{*} &  \displaystyle := &  \displaystyle \max_{y\in Y}\min_{u\in \Pi(y)}\ \{g(y,u): C(y,u),\forall u\in \Pi(y)\}\\    & \displaystyle  \equiv & \displaystyle \max_{\substack{y\in Y\\ v\in \mathbb{R}}}\ \{v: v\le g(y,u), C(y,u),\forall u\in \Pi(y)\}\\        & \displaystyle \equiv & \displaystyle \max_{\substack{y\in Y\\ v\in \mathbb{R}}}\ \{v: CC(y,v,u),\forall u\in \Pi(y)\} \end{array}

where CC(y,v,u) consists of all the constraints in C(y,u), as well as the additional constraint v\le g(y,u).

Such formulations of maximin models are often called Mathematical Programming (MP) formulations. If you use maximin models often, you would do well to develop the skills to switch easily from the iconic formulations to the MP formulations and back.
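
As a quick sanity check of the switch between the two formulations, here is a minimal Python sketch on a tiny finite problem. Everything in it is an assumption made for illustration: the sets Y and Pi(y), the function g, the absence of additional constraints C(y,u), and the helper names maximin_iconic and maximin_mp. It relies on brute-force enumeration, not on a mathematical programming solver.

```python
def maximin_iconic(Y, Pi, g):
    """Iconic form: max over y in Y of min over u in Pi(y) of g(y, u)."""
    return max(min(g(y, u) for u in Pi(y)) for y in Y)


def maximin_mp(Y, Pi, g):
    """MP form: max v over pairs (y, v) s.t. v <= g(y, u) for all u in Pi(y).

    v is restricted to the finitely many attained values of g. This is
    enough here because the optimal v always equals some g(y, u).
    """
    candidates = {g(y, u) for y in Y for u in Pi(y)}
    feasible = [v for y in Y for v in candidates
                if all(v <= g(y, u) for u in Pi(y))]
    return max(feasible)


# Hypothetical data, for illustration only.
Y = [0, 1, 2, 3]


def Pi(y):
    return [-1, 0, 1]


def g(y, u):
    return -(y - 2) ** 2 - u * y


z_iconic = maximin_iconic(Y, Pi, g)
z_mp = maximin_mp(Y, Pi, g)
print(z_iconic, z_mp)  # the two formulations yield the same optimal value
```

Note that in maximin_mp the inner min never appears: it is replaced by the requirement that v\le g(y,u) hold for all u\in \Pi(y).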


Counter Examples

The fundamental flaw in info-gap decision theory is so obvious that it is straightforward to construct counterexamples to its central proposition which is that … info-gap decision theory seeks decisions that are robust against a severe uncertainty of the type that it postulates.

The point is that info-gap’s robustness model is a model of local robustness of the radius of stability type. This means that, by construction/definition, info-gap’s robustness model seeks to measure the robustness of decisions to small perturbations in a nominal value of the parameter of interest. It does not seek to measure the robustness of decisions against large variations in the given nominal value. The implication therefore is that it does not seek to measure the robustness of decisions against the variation of the parameter over the uncertainty space under consideration.

So, all it takes to come up with a convincing counterexample is to draw a picture displaying the acceptable regions of two decisions:

  • One that is robust globally (over the uncertainty space) but is fragile locally (in the neighborhood of the point estimate)
  • One that is fragile globally (over the uncertainty space) but is robust locally (in the neighborhood of the point estimate)

For example, consider this:

where robustness is sought with respect to the constraint r^{*}\le r(x,u) over the uncertainty space \mathbf{U}. The shaded areas represent the points in \mathbf{U} that satisfy this constraint for the respective values of decision x.

Clearly, decision x' is far more robust than decision x'' globally over the uncertainty space \mathbf{U}. Yet, according to the precepts of info-gap decision theory, the info-gap robustness of decision x'' is clearly much higher (larger) than the info-gap robustness of decision x'.
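
A one-dimensional analogue of such a counterexample can be sketched in a few lines of Python. All the specifics below are assumptions made for illustration and are not read off the picture: the bounded uncertainty space [-100,100], the estimate at 0, the two acceptable regions, and the helper names. Here acceptable_x1 plays the role of x' and acceptable_x2 plays the role of x''.

```python
STEP = 0.125  # grid spacing (a power of two, so grid points are exact floats)


def acceptable_x1(u):
    # Hypothetical x': acceptable everywhere except on the rings 0.5 < |u| < 1.
    return not (0.5 < abs(u) < 1.0)


def acceptable_x2(u):
    # Hypothetical x'': acceptable only close to the estimate, |u| <= 5.
    return abs(u) <= 5.0


def local_robustness(acceptable, u_est, rho_max):
    """Largest grid radius around u_est reached before the first violation."""
    k = 0
    while k * STEP <= rho_max:
        rho = k * STEP
        if not (acceptable(u_est - rho) and acceptable(u_est + rho)):
            return max(k - 1, 0) * STEP
        k += 1
    return rho_max


def global_coverage(acceptable, lo, hi):
    """Fraction of grid points in [lo, hi] that are acceptable."""
    n = int((hi - lo) / STEP) + 1
    return sum(acceptable(lo + i * STEP) for i in range(n)) / n


rho_x1 = local_robustness(acceptable_x1, 0.0, 100.0)
rho_x2 = local_robustness(acceptable_x2, 0.0, 100.0)
cov_x1 = global_coverage(acceptable_x1, -100.0, 100.0)
cov_x2 = global_coverage(acceptable_x2, -100.0, 100.0)

print(f"x' : local robustness {rho_x1}, global coverage {cov_x1:.1%}")
print(f"x'': local robustness {rho_x2}, global coverage {cov_x2:.1%}")
```

In this sketch the decision that is acceptable over almost the entire uncertainty space receives a local robustness ten times smaller than the decision that is acceptable only in a small neighborhood of the estimate.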

It goes without saying that it is as easy to construct counterexamples that are far more extreme.


Info-gap’s uncertainty model

Info-gap’s uncertainty model — designed to give expression to the uncertainty conditions that the theory deals with — is precisely the uncertainty model underlying radius of stability models, except that info-gap’s terminology is slightly different. In other words, info-gap’s uncertainty model consists of the following objects that are associated with a parameter of interest, call it u:

  • An uncertainty space, \mathscr{U}, that is a set consisting of all the possible/plausible values of u.
  • A point estimate of the true value of u, call it \tilde{u}.

As in the case of radius of stability models, info-gap decision theory imposes a neighborhood structure on \mathscr{U}. That is, a fundamental assumption of this theory is that there is a family of nested sets N(\rho,\tilde{u}),\rho\ge 0, centered at \tilde{u} where N(\rho,\tilde{u})\subseteq \mathscr{U} denotes a neighborhood of size (radius) \rho around \tilde{u}. These neighborhoods are assumed to have the following two basic properties:

  • N(0,\tilde{u}) = \{\tilde{u}\} (contraction)
  • N(\rho,\tilde{u}) \subseteq  N(\rho + \varepsilon,\tilde{u}), \forall \rho,\varepsilon\ge 0 (nesting)

The parameter \rho representing the size (radius) of the neighborhoods is called the horizon of uncertainty.
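
For readers who like to see such objects in code, here is a minimal Python sketch of one possible neighborhood structure. It assumes a real-valued parameter and interval neighborhoods N(\rho,\tilde{u})=[\tilde{u}-\rho,\tilde{u}+\rho]. This is only one of many structures satisfying the two properties above, and the function names are hypothetical.

```python
def neighborhood(rho, u_est):
    """Interval N(rho, u_est) = [u_est - rho, u_est + rho], as (low, high)."""
    if rho < 0:
        raise ValueError("the horizon of uncertainty rho must be nonnegative")
    return (u_est - rho, u_est + rho)


def is_subset(inner, outer):
    """True if the interval inner is contained in the interval outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


u_est = 3.0

# Contraction: the neighborhood of radius 0 is the estimate itself.
contraction_ok = neighborhood(0.0, u_est) == (u_est, u_est)

# Nesting: enlarging the horizon of uncertainty never loses points.
nesting_ok = all(
    is_subset(neighborhood(rho, u_est), neighborhood(rho + eps, u_est))
    for rho in [0.0, 0.5, 1.0, 10.0]
    for eps in [0.0, 0.25, 5.0]
)

print(contraction_ok, nesting_ok)
```

Observe that nothing in this structure says anything about where the true value of u is likely to be. The neighborhoods only measure distance from the estimate.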

The severity of the uncertainty under consideration is manifested in these three characteristics:

  • The uncertainty space \mathscr{U} can be vast (e.g. unbounded)
  • The point estimate \tilde{u} is poor and can be substantially wrong.
  • The uncertainty is likelihood-free.

The last means, among other things, that there are no grounds to assume that the true value of u is more/less likely to be in the neighborhood of any particular value of u\in \mathscr{U}. Specifically, there are no grounds to assume that the true value of u is more/less likely to be in the neighborhood of the point estimate \tilde{u} than in the neighborhood of any other u\in \mathscr{U}.
