12
$\begingroup$

During some research into martingale optimal transport, I encountered the following strange “mirrored” entropy functional. I will state the functional first, then pose some questions, which are conversational and intuition-seeking in style. I will also provide the context in which it arose, but please feel free to skip the context if it is too lengthy.

The mirror entropy functional:

Let $Z$ be a discrete random variable taking $n$ distinct values $z_1, \dots, z_n$ with probabilities $p_1, \dots, p_n$ respectively. The classical Shannon entropy of $Z$ is given by

$$\mathcal H(Z) := -\sum_{i=1}^n p_i \log p_i$$

and represents, vaguely, the amount of uncertainty or information contained in $Z$.

The functional I encountered was instead the following mirrored version:

$$\mathcal H^{\circ} (Z) := - \sum_{i=1}^n (1-p_i) \log (1-p_i).$$

It has exactly the same form as the entropy, except that the probabilities are replaced by the probabilities of the complementary events!
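For readers who want to experiment numerically, here is a minimal Python sketch of the two functionals. The function names `shannon_entropy` and `mirrored_entropy` are hypothetical, and the sketch assumes the natural logarithm and the convention $0 \log 0 = 0$, neither of which is fixed by the definitions above.

```python
import math

def shannon_entropy(p):
    """-sum_i p_i log p_i (natural log assumed), with 0 log 0 treated as 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mirrored_entropy(p):
    """-sum_i (1 - p_i) log(1 - p_i), same conventions as above."""
    return -sum((1 - x) * math.log(1 - x) for x in p if x < 1)

# Illustrative distribution on three points (not taken from the question).
p = [0.5, 0.3, 0.2]
print("Shannon entropy: ", shannon_entropy(p))
print("Mirrored entropy:", mirrored_entropy(p))
```

Both functions take any list of nonnegative numbers summing to $1$; no validation of the input is attempted.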

Questions:

  • Has anyone seen this functional arise before, or anything close to it?

  • Is there a natural information theoretic interpretation?

  • Can it be expressed in terms of familiar canonical information theoretic functionals?

How it arose (long, optional!):

I was studying the following belief dynamics problem: given fair, continuous belief updates, how far does the belief ever get from the truth?

Let $Z$ be a discrete random variable as above. Suppose $\mathcal F_t$ is some filtration, that is, an increasing family of sigma-algebras on the probability space, and suppose $Z$ is $\mathcal F_\infty$-measurable.

For a concrete example, $\mathcal F_t$ might be the natural filtration of a Brownian motion, and $Z$ some functional depending on the paths of the Brownian motion. For each $t$, $\mathcal F_t$ is intuitively the information obtained by observing the paths of the Brownian motion up to time $t$.

Let $\Pi_t$ be the posterior belief about the distribution of $Z$ given the information up to time $t$,

$$\Pi_t := (\Pi^1_t, \dots, \Pi^n_t), \, \text{ where }\, \Pi^i_t := \mathbb P(Z = z_i \, | \, \mathcal F_t).$$

So $\Pi_t$ is a vector representation of the conditional distribution of $Z$, and thus is a martingale with components that sum to $1$ at all times. We have $\Pi_\infty^i = \mathbf 1_{Z = z_i}$, that is, our belief becomes certain at the terminal time. Further, by the martingale convergence theorem, $\Pi_t \to \Pi_\infty$ almost surely.

We are interested in the following “false confidence” functional

$$\sup_t d_{TV}(\Pi_t,\Pi_\infty)$$

which is the peak discrepancy between the belief and the actual truth, as measured by the total variation metric on probability measures.

A priori, even if we fix the distribution of $Z$, the peak false confidence is a pathwise defined object, and so depends crucially on the choice of information flow $\mathcal F_t$. This is why the following identity comes as a surprise:

Assume the information flow $\mathcal F_t$ is such that $\Pi_t$ is continuous. Then we have $$\mathbb E \sup_t d_{TV}(\Pi_t,\Pi_\infty) = \mathcal H^{\circ} (Z).$$

In particular, the expected peak false confidence depends only on the law of $Z$, as long as beliefs are updated continuously. Continuity holds, for example, for the Brownian filtration and any random variable measurable with respect to it. Further, the expectation is always equal to the strange mirrored entropy functional.
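A rough sketch of where the formula comes from, under assumptions not stated above: $d_{TV}$ is normalized as half the $\ell^1$ distance, $\log$ is the natural logarithm, and $\mathcal F_0$ is trivial so that $\Pi_0 = (p_1, \dots, p_n)$. On the event $\{Z = z_i\}$ we have $d_{TV}(\Pi_t, \Pi_\infty) = 1 - \Pi^i_t$, so the supremum equals $1 - \inf_t \Pi^i_t$. For $0 \le x \le p_i$, continuity forces $\Pi^i$ to hit the level $x$ exactly if it ever goes below it, and optional stopping at that hitting time gives

$$\mathbb P\Big(\inf_t \Pi^i_t \le x, \, Z = z_i\Big) = \frac{x \, (1-p_i)}{1-x}.$$

Integrating the complementary probability over $x \in [0, p_i]$ then yields

$$\mathbb E\Big[\big(1 - \inf_t \Pi^i_t\big) \mathbf 1_{Z = z_i}\Big] = p_i - \int_0^{p_i} \left(p_i - \frac{x \, (1-p_i)}{1-x}\right) dx = -(1-p_i)\log(1-p_i),$$

and summing over $i$ gives $\mathcal H^{\circ}(Z)$.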

$\endgroup$

2 Answers

16
$\begingroup$

Yes, this is known as "extropy", and has been studied quite extensively:

$\endgroup$
1
  • 1
    $\begingroup$ I am very pleasantly surprised. Thanks for the references! $\endgroup$ Commented May 30 at 12:11
5
$\begingroup$

In addition to the excellent references of Carlo, here is one work related to extropic mirror descent: DeGroot–Friedkin Map in Opinion Dynamics Is Mirror Descent. Version without paywall.

What I find particularly interesting is that, like entropy, extropy is permutation invariant, attains its maximum at the uniform distribution, and attains its minimum at the vertices of the simplex. Entropy and extropy coincide for $n=2$ but differ for $n\geq 3$. A counter-intuitive result is that $H(1-x)-H(x)$ is simplex-convex but not convex in $[0,1]^{n}$. Much of this is discussed in Carlo's first reference.
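The first few properties can be spot-checked numerically. The following is only a sanity check on random samples, not a proof; the helper names (`entropy`, `extropy`, `random_simplex_point`), the choice $n = 4$, the sample size, and the use of the natural logarithm are all assumptions made for illustration.

```python
import math
import random

def entropy(p):
    """-sum_i p_i log p_i, with 0 log 0 treated as 0."""
    return -sum(x * math.log(x) for x in p if x > 0)

def extropy(p):
    """-sum_i (1 - p_i) log(1 - p_i), with 0 log 0 treated as 0."""
    return -sum((1 - x) * math.log(1 - x) for x in p if x < 1)

def random_simplex_point(n, rng):
    """Draw a point of the probability simplex by normalizing exponential samples."""
    w = [rng.expovariate(1.0) for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]

rng = random.Random(0)  # fixed seed so the check is reproducible
n = 4
uniform = [1 / n] * n
vertex = [1.0] + [0.0] * (n - 1)
samples = [random_simplex_point(n, rng) for _ in range(1000)]

# No sampled point exceeds the uniform distribution or falls below a vertex.
max_at_uniform = all(extropy(p) <= extropy(uniform) + 1e-12 for p in samples)
min_at_vertex = all(extropy(p) >= extropy(vertex) for p in samples)

# Reordering the probabilities leaves the value unchanged (up to rounding).
perm_invariant = all(math.isclose(extropy(p), extropy(sorted(p))) for p in samples)

# Entropy and extropy agree for n = 2 and disagree at the uniform law for n = 3.
agree_n2 = all(math.isclose(entropy([q, 1 - q]), extropy([q, 1 - q]))
               for q in (0.1, 0.3, 0.5))
differ_n3 = not math.isclose(entropy([1 / 3] * 3), extropy([1 / 3] * 3))

print(max_at_uniform, min_at_vertex, perm_invariant, agree_n2, differ_n3)
```

The simplex-convexity claim is not checked here.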

$\endgroup$
1
  • 1
    $\begingroup$ Thanks for the nice paper! It is quite the fascinating functional, which i will hopefully understand more after reading the references. $\endgroup$ Commented May 30 at 17:22
