Geometry of relative plausibility and belief of singletons

Fabio Cuzzolin

Perception project, INRIA Rhône-Alpes, 655, avenue de l’Europe, 38334 SAINT ISMIER CEDEX, France (Fabio.Cuzzolin@inrialpes.fr).

February 19, 2007

Abstract. The study of the interplay between belief and probability has recently been posed in a geometric framework, in which belief and plausibility functions are represented as points of simplices in a Cartesian space. All Bayesian approximations of a belief function b form two homogeneous groups, which we call “affine” and “epistemic” families. In this paper, in particular, we focus on relative plausibility and belief of singletons and show that they form, together with a new Bayesian function called “non-Bayesianity flag”, a homogeneous family of Bayesian functions related to b, in terms of both their geometry and their behavior with respect to Dempster’s rule of combination. We investigate their geometry, which turns out to be described in terms of three planes and angles. We combine algebraic and geometric properties of the relative plausibility function to conjecture its new interpretation as the solution of the probabilistic approximation problem formulated in terms of the rule of combination. Finally, drawing inspiration from the binary case, we prove that all Bayesian approximations of both families coincide when belief functions assign the same mass to events of the same size.

Keywords: Theory of evidence, geometric approach, relative plausibility and belief of singletons, Dempster’s combination, Bayesian approximation.

1. Introduction

In the last decades a number of different uncertainty measures (Vakili, 1993) have been proposed, as either alternatives to or extensions of classical probability theory. The theory of evidence (ToE) is one of the most popular formalisms, extending quite naturally probabilities on finite spaces through the notion of belief function (Shafer, 1976), which assigns probability values to sets of possibilities rather than single events.

Naturally enough, the connection between belief and probability plays a major role in the theory of evidence (Daniel, ), and is the foundation of a popular approach to evidential reasoning, Smets’ pignistic model (Smets, 1988), in which beliefs are represented at credal level (as convex sets of probabilities), while decisions are made by resorting to a Bayesian belief function called pignistic transformation.

© 2007 Kluwer Academic Publishers. Printed in the Netherlands.

AMAI07georelative.tex; 25/05/2007; 11:29; p.1

In fact, the problem of finding correct probabilistic and possibilistic (Dubois and Prade, 1990) approximations of belief functions has been widely studied (Kramosil, 1995), and a number of papers have been published on this issue (Yaghlane et al., 2001; Denoeux, 2001; Denoeux and Yaghlane, 2002; Haenni and Lehmann, 2002) (see (Bauer, 1997) for a review), mainly in order to find efficient implementations of the rule of combination. Tessem (Tessem, 1993), for instance, incorporated only the highest-valued focal elements in his mklx approximation; a similar approach inspired the summarization technique formulated by Lowrance et al. (Lowrance et al., 1986). On his side, in his 1989 paper (Voorbraak, 1989) F. Voorbraak proposed to adopt the so-called relative plausibility of singletons (which we denote with $\tilde{pl}_b$), the unique probability that, given a belief function $b$ with plausibility $pl_b$, assigns to each singleton its normalized plausibility. He proved that $\tilde{pl}_b$ is a perfect representative of $b$ when combined with other probabilities $p$:

$$\tilde{pl}_b \oplus p = b \oplus p.$$

Cobb and Shenoy (Cobb and Shenoy, 2003a; Cobb and Shenoy, 2003b; Cobb and Shenoy, 2003c) also described the properties of the relative plausibility of singletons and discussed its nature of probability function that is equivalent to the original belief function.

The study of belief functions and their interplay with probability functions has recently been posed in a geometric setup. For instance, Ha and Haddawy (Ha and Haddawy, 1996) exploited methods of convex geometry to represent probability intervals in a computationally efficient fashion, by means of a data structure called pcc-tree. On his side, P. Black dedicated his doctoral thesis to the study of the geometry of belief functions (Black, 1996). An abstract of his results on belief functions and other monotone capacities can be found in (Black, 1997), where he uses shapes of geometric loci to give a direct visualization of the distinct classes of monotone capacities.

A geometric approach to the theory of evidence has been recently developed in which belief functions are represented by points of a convex space called belief space (Cuzzolin and Frezza, 2001) (Cuzzolin, 2007b). As a matter of fact, as a belief function $b : 2^\Theta \to [0,1]$ is completely specified by its $N \doteq 2^{|\Theta|} - 1$ belief values

$$\{b(A),\; A \subseteq \Theta,\; A \neq \emptyset\},$$

it can be represented as a point of $\mathbb{R}^N$ (Section 2).

In this framework we recently proved (Cuzzolin, 2007d) that each belief function is associated with three different geometric entities, namely the line $a(b, pl_b)$ joining $b$ and $pl_b$, the orthogonal complement $\mathcal{P}^\perp$ of the probabilistic region $\mathcal{P}$, and the simplex of probabilities $\mathcal{P}[b] = \{p \in \mathcal{P} : p(A) \geq b(A) \;\forall A \subseteq \Theta\}$ consistent with $b$. These in turn determine three different probabilities associated with $b$, i.e. the intersection probability $p[b]$, the orthogonal projection $\pi[b]$ of $b$ onto $\mathcal{P}$, and the pignistic function or barycenter of $\mathcal{P}[b]$, $BetP[b] = \overline{\mathcal{P}}[b]$. We showed that if the original belief function $b$ is 2-additive then all those three Bayesian functions coincide (Cuzzolin, 2007d).

However, the analysis of the simplest binary case (Section 3) suggests that the relative plausibility function does not fit in this picture, regardless of the additivity of the original belief function.

Indeed, all Bayesian approximations of belief functions can be classified in two distinct families. Functions of the first group ($p[b]$, $\pi[b]$, $BetP[b]$) commute (at least under certain conditions) with affine combination of points in the belief space (Cuzzolin, 2007d): we call this group the affine family. On their side, instead, relative plausibility and belief of singletons commute (Voorbraak, 1989; Cuzzolin, 2007a) with Dempster’s rule of combination $\oplus$ (Shafer, 1976; Dempster, 1968b; Dempster, 1968a), and meet a set of dual properties (Cobb and Shenoy, 2003c; Cuzzolin, 2007a) with respect to $\oplus$, in particular Voorbraak’s representation theorem: we call them the epistemic family. The geometry of the affine family has been investigated in (Cuzzolin, 2007d).

In this paper, instead, we focus on the “epistemic” family of Bayesian b.f. In Section 4 we will then describe the geometry of the pair $\tilde{pl}_b$, $\tilde{b}$ in the general case, as a function of two pseudo b.f. called “plausibility and belief of singletons” (4.1). We will point out how their geometry can be described in terms of three planes (4.3) and angles (4.4) in the belief space. Those are in turn related to another probability $R[b]$ which can be interpreted as a measure of the “non-Bayesianity” of $b$. As $\tilde{b}$ does not always exist, we need to discuss this singular case separately (4.5). In Section 5 we will show that geometric and algebraic properties of the relative plausibility can be used to formulate a conjecture on its nature of solution of the probabilistic approximation problem (formulated in terms of Dempster’s rule of combination). The basic tool will be provided by the study of the geometry of all b.f. which perfectly represent $b$ when combined with a probability (5.1).

We will close the paper (Section 6) by discussing the relationship between the two families of Bayesian functions introduced above. Abstracting from the binary case study we will provide a sufficient condition under which they all coincide, described in terms of equal distribution of masses, as a trait d’union for all Bayesian approximations of belief functions.

2. A geometric approach to the theory of evidence

In the theory of evidence (Shafer, 1976) degrees of (subjective) belief are represented as belief functions. A basic probability assignment (b.p.a.) over a finite set $\Theta$ (frame of discernment) is a function $m : 2^\Theta \to [0,1]$ such that

$$m(\emptyset) = 0, \qquad \sum_{A \subseteq \Theta} m(A) = 1, \qquad m(A) \geq 0 \;\; \forall A \subseteq \Theta.$$

The belief function (b.f.) $b : 2^\Theta \to [0,1]$ associated with a basic probability assignment $m$ is simply

$$b(A) = \sum_{B \subseteq A} m(B).$$

The basic probability assignment $m_b$ of a belief function $b$ can be uniquely recovered by means of the Moebius inversion formula

$$m_b(A) = \sum_{B \subseteq A} (-1)^{|A - B|}\, b(B).$$

In particular, a probability function or Bayesian b.f. is a special belief function which assigns mass to singletons only: $m_b(A) = 0$, $|A| > 1$.

A dual representation of the evidence encoded by a belief function $b$ is the plausibility function, whose value $pl_b(A)$ expresses the amount of evidence not against a proposition $A$:

$$pl_b(A) \doteq 1 - b(A^c) = \sum_{B \cap A \neq \emptyset} m_b(B) \geq b(A).$$

Belief functions are combined by means of Dempster’s orthogonal sum.

Definition 1. The orthogonal sum or Dempster’s sum of two belief functions $b_1$, $b_2$ is a new belief function $b_1 \oplus b_2$ with b.p.a.

$$m_{b_1 \oplus b_2}(A) = \frac{\sum_{B \cap C = A} m_{b_1}(B)\, m_{b_2}(C)}{\sum_{B \cap C \neq \emptyset} m_{b_1}(B)\, m_{b_2}(C)} \tag{1}$$

where $m_{b_i}$ denotes the b.p.a. associated with $b_i$.

We denote with $k(b_1, b_2)$ the denominator of Equation (1).

2.1. Belief and plausibility spaces

Given a frame of discernment $\Theta$, a belief function $b : 2^\Theta \to [0,1]$ is completely specified by its $N - 1$ belief values

$$\{b(A),\; A \subseteq \Theta,\; A \neq \emptyset\},$$

$N \doteq 2^n$ ($n \doteq |\Theta|$), and can then be represented as a point of $\mathbb{R}^{N-1}$. We can introduce an orthonormal reference frame $\{X_A : A \subseteq \Theta, A \neq \emptyset\}$ so that each vector $v = \sum_{A \subseteq \Theta, A \neq \emptyset} v_A X_A$ in $\mathbb{R}^{N-1}$ is potentially a belief function, in which each component $v_A$ measures the belief value of $A$: $v_A = b(A)$. We call belief space $\mathcal{B}$ the set of points of $\mathbb{R}^{N-1}$ corresponding to a belief function. It is not difficult to prove (Cuzzolin and Frezza, 2001) that $\mathcal{B}$ is convex. More precisely, if we call

$$b_A \doteq b \in \mathcal{B} \;\; \text{s.t.} \;\; m_b(A) = 1, \;\; m_b(B) = 0 \;\; \forall B \neq A$$

the unique belief function assigning all the mass to a single subset $A$ of $\Theta$, the belief space $\mathcal{B}$ is the convex closure $Cl$ of all those “basis” belief functions $\{b_A\}$:

$$\mathcal{B} = Cl(b_A,\; A \subseteq \Theta,\; A \neq \emptyset). \tag{2}$$

Here $Cl$ denotes the convex closure operator:

$$Cl(b_1, \dots, b_k) = \Big\{ b \in \mathcal{B} : b = \alpha_1 b_1 + \cdots + \alpha_k b_k, \;\; \sum_i \alpha_i = 1, \;\; \alpha_i \geq 0 \;\; \forall i \Big\}. \tag{3}$$

Each belief function $b \in \mathcal{B}$ can be written as a convex sum as

$$b = \sum_{\emptyset \subsetneq A \subseteq \Theta} m_b(A)\, b_A. \tag{4}$$

Figure 1. Belief $\mathcal{B}$ and plausibility $\mathcal{PL}$ space share the probabilistic subspace $\mathcal{P}$, with their vertices $b_A$, $pl_A$ representing the lower and upper probabilities induced by the same “certain” evidence $A$.

As a probability is a b.f. assigning non zero masses to singletons only, the set $\mathcal{P}$ of all Bayesian b.f. on $\Theta$ is the simplex determined by all basis functions associated with singletons¹

$$\mathcal{P} = Cl(b_x,\; x \in \Theta).$$
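The definitions above can be made concrete with a small numerical sketch. It assumes a representation that is not part of the paper: subsets of the frame are Python frozensets, a b.p.a. is a dictionary from focal elements to exact fractions, and the masses used are hypothetical values chosen only for illustration. The helper names (`belief`, `plausibility`, `moebius`) are likewise illustrative.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(frame):
    """All non-empty subsets of the frame, as frozensets."""
    items = sorted(frame)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def belief(m, A):
    """b(A): total mass of the focal elements contained in A."""
    return sum((v for B, v in m.items() if B <= A), Fraction(0))


def plausibility(m, A):
    """pl_b(A): total mass of the focal elements intersecting A."""
    return sum((v for B, v in m.items() if B & A), Fraction(0))


def moebius(b_values, A):
    """m_b(A), recovered from the belief values by Moebius inversion."""
    return sum(((-1) ** len(A - B) * v
                for B, v in b_values.items() if B <= A), Fraction(0))


# Hypothetical b.p.a. on the frame {x, y, z}, chosen only for illustration.
frame = frozenset("xyz")
m = {frozenset("x"): Fraction(1, 2),
     frozenset("xy"): Fraction(1, 5),
     frame: Fraction(3, 10)}

subsets = nonempty_subsets(frame)
b_vector = {A: belief(m, A) for A in subsets}           # coordinates of b
pl_vector = {A: plausibility(m, A) for A in subsets}    # coordinates of pl_b
recovered = {A: moebius(b_vector, A) for A in subsets}  # should reproduce m
```

Each dictionary lists one coordinate per non-empty subset, so `b_vector` and `pl_vector` play the role of the points $b$ and $pl_b$ of the Cartesian space, and `recovered` illustrates that the b.p.a. is uniquely determined by the belief values.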

Analogously, we can call plausibility space the region $\mathcal{PL}$ of $\mathbb{R}^N$ whose points correspond to plausibility functions. It can be proven (Cuzzolin, 2003) that this region is also a simplex $\mathcal{PL} = Cl(pl_A,\; \emptyset \subsetneq A \subseteq \Theta)$ (see Figure 1), whose vertices are the vectors

$$pl_A = -\sum_{\emptyset \subsetneq B \subseteq A} (-1)^{|B|}\, b_B. \tag{5}$$

The $A$-th vertex $pl_A$ of the plausibility space is the plausibility vector associated with the basis belief function $b_A$, $pl_A = pl_{b_A}$.

Each plausibility function $pl_b$ can be uniquely expressed as convex combination of all basis plausibility functions (5) as

$$pl_b = \sum_{\emptyset \subsetneq A \subseteq \Theta} m_b(A)\, pl_A. \tag{6}$$

Symmetrically, a plausibility vector $pl_b$ can be seen as an affine² combination of the basis belief functions $\{b_A : \emptyset \subsetneq A \subseteq \Theta\}$

$$pl_b = \sum_{\emptyset \subsetneq A \subseteq \Theta} \mu_b(A)\, b_A \tag{7}$$

where (see (Cuzzolin, 2003))

$$\mu_b(A) \doteq \sum_{B \subseteq A} (-1)^{|A \setminus B|}\, pl_b(B) = (-1)^{|A|+1} \sum_{B \supseteq A} m_b(B), \qquad A \neq \emptyset \tag{8}$$

is the Moebius inverse of the plausibility function, called basic plausibility assignment ($\mu_b(\emptyset) = 0$). A useful property of $\mu_b$ is that

$$\sum_{A \supseteq x} \mu_b(A) = m_b(x) \tag{9}$$

as

$$\sum_{A \supseteq x} \mu_b(A) = \sum_{A \supseteq x} (-1)^{|A|+1} \Big( \sum_{B \supseteq A} m_b(B) \Big) = -\sum_{B \supseteq x} m_b(B) \sum_{x \subseteq A \subseteq B} (-1)^{|A|}$$

where

$$\sum_{x \subseteq A \subseteq B} (-1)^{|A|} = \begin{cases} 0 & B \neq x \\ -1 & B = x \end{cases}$$

for Newton’s binomial: $\sum_{k=0}^{n} 1^{n-k} (-1)^k = 0$.

¹ With a slight abuse of notation we will write $m(x)$, $b(x)$, $pl_b(x)$ etc. instead of $m(\{x\})$, $b(\{x\})$, $pl_b(\{x\})$ for the values of the set functions of interest on a singleton.

² An affine combination of $k$ points $v_1, \dots, v_k \in \mathbb{R}^N$ is a sum $\alpha_1 v_1 + \cdots + \alpha_k v_k$ whose coefficients sum to one: $\sum_i \alpha_i = 1$. We will denote with $a(v_1, \dots, v_k)$ the affine subspace of $\mathbb{R}^N$ generated by the points $v_1, \dots, v_k \in \mathbb{R}^N$, i.e. the set $\{v \in \mathbb{R}^N : v = \alpha_1 v_1 + \cdots + \alpha_k v_k, \; \sum_i \alpha_i = 1\}$.

2.2. Pseudo belief functions

It may be confusing to think of belief and plausibility functions as points of the same Cartesian space. However, this is a simple consequence of the fact that both are defined on the same domain, the power set of $\Theta$. As $\Theta$ is finite they can both be seen as real-valued vectors with the same number $N = 2^{|\Theta|} - 1$ of components.

As belief and plausibility spaces do not exhaust the whole $\mathbb{R}^N$ it is natural to wonder whether points “outside” them have any meaningful interpretation in this framework (Cuzzolin, 2004). Indeed each vector $v = [v_1, \dots, v_A, \dots, v_\Theta]' \in \mathbb{R}^N$ (where $v'$ denotes the transpose of $v$) can be thought of as a function $\varsigma : 2^\Theta \setminus \emptyset \to \mathbb{R}$ s.t. $\varsigma(A) = v_A$. Each of these functions $\varsigma$ has a Moebius inversion $m_\varsigma : 2^\Theta \setminus \emptyset \to \mathbb{R}$ such that

$$\varsigma(A) = \sum_{B \subseteq A} m_\varsigma(B)$$

i.e. each vector $\varsigma$ of $\mathbb{R}^N$ can be thought of as a sum function (Aigner, 1979), whose masses $m_\varsigma(B)$, $B \subseteq \Theta$, are not necessarily non-negative. Sum functions meeting the normalization axiom,

$$\sum_{\emptyset \subsetneq A \subseteq \Theta} m_\varsigma(A) = 1,$$

or “normalized sum functions”, sometimes called pseudo belief functions (p.b.f.) (Smets, 1995), are then natural extensions of belief functions. We can then call Bayesian pseudo belief function a p.b.f. $\varsigma$ s.t.

$$\sum_{x \in \Theta} m_\varsigma(x) = 1.$$

Note that this implies $\sum_{|A|>1} m_\varsigma(A) = 0$, but not necessarily $m_\varsigma(A) = 0$ $\forall |A| > 1$.
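A plausibility function is a simple instance of a pseudo belief function: its Moebius inverse is the basic plausibility assignment of Equation (8), which is normalized but takes negative values. The sketch below checks this, together with property (9), on a hypothetical b.p.a. The frozenset representation, the helper names and the numerical masses are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import combinations


def nonempty_subsets(frame):
    """All non-empty subsets of the frame, as frozensets."""
    items = sorted(frame)
    return [frozenset(c)
            for r in range(1, len(items) + 1)
            for c in combinations(items, r)]


def plausibility(m, A):
    """pl_b(A): total mass of the focal elements intersecting A."""
    return sum((v for B, v in m.items() if B & A), Fraction(0))


def basic_plausibility(m, A):
    """mu_b(A) = (-1)^(|A|+1) * sum of m_b(B) over all B containing A."""
    containing = sum((v for B, v in m.items() if A <= B), Fraction(0))
    return (-1) ** (len(A) + 1) * containing


# Hypothetical b.p.a. on the frame {x, y, z}, chosen only for illustration.
frame = frozenset("xyz")
m = {frozenset("x"): Fraction(1, 2),
     frozenset("xy"): Fraction(1, 5),
     frame: Fraction(3, 10)}

subsets = nonempty_subsets(frame)
mu = {A: basic_plausibility(m, A) for A in subsets}

# pl_b rebuilt as the sum function of mu_b.
pl_from_mu = {A: sum(mu[B] for B in subsets if B <= A) for A in subsets}

# Left-hand side of Equation (9) for each singleton x.
singleton_sums = {x: sum(mu[A] for A in subsets if x in A) for x in frame}
```

Here `mu` sums to one while some of its entries are negative, so $pl_b$ is a normalized sum function but not a belief function, and `singleton_sums` returns the original singleton masses.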

3. A geometric interplay of belief and probability

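As $b(\Theta) = pl_b(\Theta) = 1$ for all b.f. $b$, we can neglect the coordinate $v_\Theta$ and think of $\mathcal{B}$ as a region of $\mathbb{R}^{N-1}$. It is easy to see that in this case $b_\Theta = \mathbf{0} = [0, \cdots, 0]'$ and $pl_\Theta = \mathbf{1} = [1, \cdots, 1]'$, as for the vacuous b.f. $b_\Theta(A) = 0$, $pl_{b_\Theta}(A) = 1$ for all $A \neq \Theta$.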


3.1. Belief and probability in the binary case

Figure 2 shows the geometry of belief B and plausibility PL spaces for a binary frame Θ2 = {x,y}, where belief and plausibility vectors are points of a plane with coordinates

b = [b(x) = mb(x), b(y) = mb(y)]′

plb = [plb(x) = 1 − mb(y), plb(y) = 1 − mb(x)]′

respectively. These two simplices

$$\mathcal{B} = Cl(b_\Theta = \mathbf{0}, b_x, b_y), \qquad \mathcal{PL} = Cl(pl_\Theta = \mathbf{1}, pl_x = b_x, pl_y = b_y)$$

are symmetric with respect to the probability simplex $\mathcal{P} = Cl(b_x, b_y)$. Each pair of functions $(b, pl_b)$ determines a line which is orthogonal to $\mathcal{P}$, on which $b$ and $pl_b$ lie in symmetric positions on the two sides of the Bayesian region.

Figure 2. Geometry of belief and plausibility functions in the bi-dimensional case $\Theta_2 = \{x, y\}$. $b$ and $pl_b$ are points of the simplices $\mathcal{B} = Cl(b_\Theta = \mathbf{0}, b_x, b_y)$, $\mathcal{PL} = Cl(pl_\Theta = \mathbf{1}, pl_x = b_x, pl_y = b_y)$ respectively. They lie in symmetric positions with respect to $\mathcal{P} = Cl(b_x, b_y)$. While pignistic function $BetP[b]$, orthogonal projection $\pi[b]$ of $b$ onto $\mathcal{P}$, and the intersection $p[b]$ of the segment $Cl(b, pl_b)$ with $\mathcal{P}$ all coincide, relative belief $\tilde{b}$ and plausibility $\tilde{pl}_b$ of singletons remain distinct.

3.2. Two families of Bayesian approximations

The connection between belief and probability plays a major role in the theory of evidence. The problem of finding meaningful probabilistic approximations of belief functions has been widely studied. The above binary example pictorially illustrates the different Bayesian belief functions associated with a given b.f. b.

3.2.1. The affine family

Looking at Figure 2 we realize that b is geometrically associated with three loci: the line a(b,plb) joining b and plb, the set of probabilities P[b] consistent with b

$$\mathcal{P}[b] \doteq \{p \in \mathcal{P} : p(A) \geq b(A) \;\; \forall A \subseteq \Theta\},$$

and the orthogonal complement P⊥ of P.

In (Cuzzolin, 2007d) we proved that $a(b, pl_b)$ is always orthogonal to $\mathcal{P}$, even in the general case. However, this line does not intersect the probabilistic subspace in general, but there exists a unique Bayesian pseudo belief function $\varsigma[b]$ such that

$$\varsigma[b] = \beta[b]\, pl_b + (1 - \beta[b])\, b \tag{10}$$

where

$$\beta[b] = \frac{\sum_{|A|>1} m_b(A)}{\sum_{|A|>1} m_b(A)\,|A|} \in [0, 1] \tag{11}$$

is a scalar function of $b$. Its values on singletons are

$$m_{\varsigma[b]}(x) = m_b(x) + \beta[b] \sum_{A \supsetneq x} m_b(A). \tag{12}$$

$\varsigma[b]$ is then naturally associated with a Bayesian belief function called intersection probability, assigning the mass (12) to each singleton, namely

$$p[b] \doteq \sum_{x \in \Theta} m_{\varsigma[b]}(x)\, b_x. \tag{13}$$

The intersection probability $p[b]$ is in general distinct from the orthogonal projection $\pi[b]$ of $b$ onto $\mathcal{P}$ (Cuzzolin, 2007d), and from the pignistic function

$$BetP[b] = \sum_{x \in \Theta} b_x \sum_{A \supseteq x} \frac{m_b(A)}{|A|} \tag{14}$$

which is well known (Chateauneuf and Jaffray, 1989) to be the center of mass $\overline{\mathcal{P}}[b]$ of the simplex of consistent probabilities $\mathcal{P}[b]$.

The three geometric loci arising from the analysis of the binary case are in conclusion each associated with a different Bayesian b.f.

$$a(b, pl_b) \leftrightarrow p[b], \qquad \mathcal{P}^\perp \leftrightarrow \pi[b], \qquad \mathcal{P}[b] \leftrightarrow BetP[b] = \overline{\mathcal{P}}[b].$$

In the binary case, all those Bayesian approximations of $b$ coincide: $\pi[b] = BetP[b] = \overline{\mathcal{P}}[b]$ (see Figure 2), suggesting that (Cuzzolin, 2007d)

Proposition 1. If $b$ is 2-additive ($m_b(A) = 0$, $|A| > 2$) then pignistic function, orthogonal projection and intersection probability coincide, $BetP[b] = \pi[b] = \varsigma[b] = p[b]$.

3.2.2. The epistemic family

Again, Figure 2 shows that another classical Bayesian approximation of $b$ does not fit in this picture, the relative plausibility of singletons (Voorbraak, 1989)

$$\tilde{pl}_b(x) = \frac{pl_b(x)}{\sum_{y \in \Theta} pl_b(y)}. \tag{15}$$

In the binary case $\tilde{pl}_b$ is just the intersection of the Bayesian simplex $\mathcal{P}$ with the segment joining the plausibility function $pl_b$ related to $b$ with the vacuous belief function $b_\Theta = \mathbf{0}$, as

$$\tilde{pl}_b(x) = \frac{pl_b(x)}{pl_b(x) + pl_b(y)}, \qquad \tilde{pl}_b(y) = \frac{pl_b(y)}{pl_b(x) + pl_b(y)}$$

so that

$$\tilde{pl}_b = [\tilde{pl}_b(x), \tilde{pl}_b(y)]' = \frac{1}{pl_b(x) + pl_b(y)}\, [pl_b(x), pl_b(y)]' = \frac{1}{pl_b(x) + pl_b(y)}\, pl_b + \Big( 1 - \frac{1}{pl_b(x) + pl_b(y)} \Big)\, \mathbf{0} \in Cl(pl_b, \mathbf{0}).$$

Some simple math shows that $\tilde{pl}_b$ is also the intersection of $\mathcal{P}$ with the line joining $b$ and $pl_\Theta = \mathbf{1}$ (Figure 2).

A dual set of lines points out the existence of another Bayesian b.f.

$$\tilde{b}(x) \doteq \frac{m_b(x)}{\sum_{y \in \Theta} m_b(y)} \tag{16}$$

which is geometrically the intersection of the line a(b, bΘ) joining b and bΘ with the probability simplex (or dually a(plb, plΘ) ∩ P).

It is natural to call (16) relative belief of singletons.
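The binary construction can be checked numerically. The following sketch computes relative plausibility (15) and relative belief (16) of singletons and verifies that the relative plausibility is the point where the line through $b$ and $pl_\Theta = [1, 1]'$ meets the probability simplex. The dictionary representation of the b.p.a., the function names and the masses are assumptions introduced for illustration.

```python
from fractions import Fraction


def singleton_plausibility(m, frame):
    """pl_b(x) for every singleton x of the frame."""
    return {x: sum((v for B, v in m.items() if x in B), Fraction(0))
            for x in frame}


def relative_plausibility(m, frame):
    """Relative plausibility of singletons, Equation (15)."""
    pl = singleton_plausibility(m, frame)
    total = sum(pl.values())
    return {x: pl[x] / total for x in frame}


def relative_belief(m, frame):
    """Relative belief of singletons, Equation (16)."""
    masses = {x: m.get(frozenset([x]), Fraction(0)) for x in frame}
    total = sum(masses.values())
    if total == 0:
        raise ValueError("relative belief undefined: no mass on singletons")
    return {x: masses[x] / total for x in frame}


# Hypothetical binary b.p.a., chosen only for illustration.
frame = ("x", "y")
m = {frozenset("x"): Fraction(1, 2),
     frozenset("y"): Fraction(1, 5),
     frozenset("xy"): Fraction(3, 10)}

pl_tilde = relative_plausibility(m, frame)
b_tilde = relative_belief(m, frame)

# Point b + t (1 - b) on the line through b and pl_Theta = [1, 1]',
# with t chosen so that the two coordinates sum to one.
b = (m[frozenset("x")], m[frozenset("y")])
t = (1 - sum(b)) / (2 - sum(b))
on_line = tuple(c + t * (1 - c) for c in b)
```

With these masses the two approximations are distinct points of the simplex, in agreement with Figure 2, and `on_line` coincides with `pl_tilde`.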

3.3. Operators associated with the two families of approximations

The two families of Bayesian b.f. highlighted by the geometry of the binary belief space turn out to be inherently associated with two different operators in B. On one side, both pignistic function and orthogonal projection (Cuzzolin, 2007d) commute with respect to affine combination.

Proposition 2. π[α1b1 + α2b2] = α1π[b1] + α2π[b2], BetP[α1b1 + α2b2] = α1BetP [b1] + α2BetP [b2] whenever α1 + α2 = 1 ∀b1, b2 ∈ B.

As for the intersection probability $p[b]$, even though it does not always commute with affine combination, it can be proven that it commutes with $Cl(\cdot)$ under certain conditions. It is natural to call the group $BetP[b]$, $\pi[b]$, $p[b]$ the “affine” family of Bayesian approximations of $b$.

On the other side, both $\tilde{pl}_b$ (Voorbraak, 1989) and $\tilde{b}$ (Cuzzolin, 2007a) commute with respect to Dempster’s rule of combination³.

Proposition 3. Both relative plausibility and belief of singletons commute with respect to Dempster’s combination, namely

$$\tilde{pl}[b_1 \oplus b_2] = \tilde{pl}[b_1] \oplus \tilde{pl}[b_2]; \qquad \tilde{b}[pl_1 \oplus pl_2] = \tilde{b}[pl_1] \oplus \tilde{b}[pl_2].$$

In fact, $\tilde{pl}_b$ and $\tilde{b}$ meet a number of dual properties (Cuzzolin, 2007a) which can be obtained from each other by switching the role of belief and plausibility: in particular, idempotence and Voorbraak’s representation theorem.

Proposition 4. If $b$ is idempotent with respect to Dempster’s rule, i.e. $b \oplus b = b$, then $\tilde{pl}_b$ is idempotent with respect to Bayes’ rule: $\tilde{pl}_b \oplus \tilde{pl}_b = \tilde{pl}_b$. If $pl_b$ is idempotent with respect to Dempster’s rule, i.e. $pl_b \oplus pl_b = pl_b$, then $\tilde{b}[pl_b]$ is idempotent with respect to Bayes’ rule: $\tilde{b}[pl_b] \oplus \tilde{b}[pl_b] = \tilde{b}[pl_b]$.

³ Dempster’s rule can be naturally extended to pseudo belief functions (Cuzzolin, 2007a) by applying Dempster’s rule (1) to the Moebius inverses $m_{\varsigma_1}$, $m_{\varsigma_2}$ of a pair of p.b.f. We can still denote the orthogonal sum of two p.b.f. $\varsigma_1, \varsigma_2$ by $\varsigma_1 \oplus \varsigma_2$. As plausibility functions are also pseudo belief functions, they admit Dempster’s combination.

Proposition 5. $\tilde{pl}_b$ is a perfect representative of $b$ in the probability space when combined through Dempster’s rule, i.e.

$$b \oplus p = \tilde{pl}_b \oplus p, \qquad \forall p \in \mathcal{P};$$

$\tilde{b}$ represents perfectly the corresponding plausibility function $pl_b$ when combined with any probability through (extended) Dempster’s rule:

$$\tilde{b} \oplus p = pl_b \oplus p \qquad \forall p \in \mathcal{P}.$$

Given the central role of Dempster’s rule in the theory of evidence and the strict relation between $\oplus$ and the group $\tilde{pl}_b$, $\tilde{b}$, it is natural to call the latter the “epistemic” family of Bayesian approximations. The different geometric behavior of the two groups of probabilities in Section 5.2 is then the reflection of a deeper intrinsic dissimilarity of their semantics.
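Propositions 3 and 5 lend themselves to a quick numerical check for the relative plausibility. The sketch below implements Dempster’s sum (1) on dictionaries of focal elements and verifies, on hypothetical masses, that $b \oplus p = \tilde{pl}_b \oplus p$ for a Bayesian $p$ and that the relative plausibility commutes with $\oplus$. The data representation, the function names and all numerical values are assumptions made for illustration, and a single example is of course no proof.

```python
from fractions import Fraction


def dempster(m1, m2):
    """Dempster's orthogonal sum of two b.p.a.s, Equation (1)."""
    joint = {}
    for B, v in m1.items():
        for C, w in m2.items():
            A = B & C
            if A:  # products falling on the empty set are discarded
                joint[A] = joint.get(A, Fraction(0)) + v * w
    k = sum(joint.values())  # denominator k(b1, b2) of Equation (1)
    if k == 0:
        raise ValueError("the two functions are not combinable")
    return {A: v / k for A, v in joint.items() if v != 0}


def relative_plausibility(m, frame):
    """b.p.a. of the relative plausibility of singletons, Equation (15)."""
    pl = {x: sum((v for B, v in m.items() if x in B), Fraction(0))
          for x in frame}
    total = sum(pl.values())
    return {frozenset([x]): pl[x] / total for x in frame if pl[x] > 0}


# Hypothetical belief functions and probability on {x, y, z}.
frame = ("x", "y", "z")
b1 = {frozenset("x"): Fraction(1, 2),
      frozenset("xy"): Fraction(1, 5),
      frozenset("xyz"): Fraction(3, 10)}
b2 = {frozenset("y"): Fraction(1, 4),
      frozenset("yz"): Fraction(1, 4),
      frozenset("xyz"): Fraction(1, 2)}
p = {frozenset("x"): Fraction(1, 6),
     frozenset("y"): Fraction(1, 3),
     frozenset("z"): Fraction(1, 2)}

# Voorbraak's representation theorem (Proposition 5, first part).
direct = dempster(b1, p)
via_relative_plausibility = dempster(relative_plausibility(b1, frame), p)

# Commutativity with Dempster's rule (Proposition 3, first part).
left = relative_plausibility(dempster(b1, b2), frame)
right = dempster(relative_plausibility(b1, frame),
                 relative_plausibility(b2, frame))
```

Exact fractions are used so that the two sides of each identity can be compared with plain equality rather than a numerical tolerance.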

3.4. A study of the epistemic family

The geometry of the family of affine Bayesian approximations has been studied in (Cuzzolin, 2007d). In this paper we will focus on the second (“epistemic”) family. In Section 4 we will then describe the geometry of the pair $\tilde{pl}_b$, $\tilde{b}$ in the general case, as a function of two pseudo b.f. called “plausibility and belief of singletons” (4.1). We will point out how their geometry can be described in terms of three planes (4.3) and angles (4.4) in the belief space. Those are in turn related to another probability $R[b]$ which can be interpreted as a measure of the “non-Bayesianity” of $b$. Note that $\tilde{b}$ does not always exist, namely when $\sum_x m_b(x) = 0$. The situation in this singular case then needs to be discussed separately (Section 4.5).

In Section 5 we will show how the geometric properties of the relative plausibility can be used to formulate a conjecture on its nature of solution of the probabilistic approximation problem (formulated in terms of Dempster’s rule of combination).

We will close the paper (Section 6) by discussing the relationship between the two families of Bayesian functions we just introduced. Abstracting from the binary case study we will provide a sufficient condition under which they all coincide, described in terms of equal distribution of masses.

4. Geometry of relative plausibility and belief of singletons

4.1. Plausibility of singletons and relative plausibility

Let us then study the geometry of relative plausibility and belief of singletons in the belief space. We need to introduce two other quantities. Let us call plausibility of singletons the vector

$$\bar{pl}_b \doteq \sum_{x \in \Theta} pl_b(x)\, b_x. \tag{17}$$

Since $b_\Theta = \mathbf{0}$ is the origin of the reference frame in $\mathbb{R}^{N-1}$ (see above), $\bar{pl}_b$ can also be written as

$$\bar{pl}_b = \sum_{x \in \Theta} pl_b(x)\, b_x + (1 - k_{pl_b})\, b_\Theta$$

where

$$k_{pl_b} \doteq \sum_{x \in \Theta} pl_b(x)$$

is the total plausibility of singletons.

Comparing this expression with Equation (4) we recognize that the plausibility of singletons has Moebius inverse $m_{\bar{pl}_b} : 2^\Theta \to \mathbb{R}$

$$m_{\bar{pl}_b}(x) = pl_b(x), \qquad m_{\bar{pl}_b}(\Theta) = 1 - \sum_x pl_b(x), \qquad m_{\bar{pl}_b}(A) = 0 \;\; \forall |A| \neq 1, n.$$

As $m_{\bar{pl}_b}$ meets the normalization constraint

$$\sum_{A \subseteq \Theta} m_{\bar{pl}_b}(A) = \sum_{x \in \Theta} pl_b(x) + \Big( 1 - \sum_{x \in \Theta} pl_b(x) \Big) = 1,$$

even though $1 - k_{pl_b} \leq 0$, $\bar{pl}_b$ is a pseudo belief function (Section 2.2).

Theorem 1. $\tilde{pl}_b$ is the intersection with the probability simplex of the line joining vacuous belief function $b_\Theta$ and plausibility of singletons $\bar{pl}_b$.

Proof. By definitions (15) (17) it follows that

$$\tilde{pl}_b = \sum_{x \in \Theta} \tilde{pl}_b(x)\, b_x = \bar{pl}_b / k_{pl_b}.$$

Since $b_\Theta = \mathbf{0}$ is the origin of the reference frame, $\tilde{pl}_b$ lies on the segment $Cl(\bar{pl}_b, b_\Theta)$. This in turn implies that

$$\tilde{pl}_b = Cl(\bar{pl}_b, b_\Theta) \cap \mathcal{P}.$$

The geometry of $\tilde{pl}_b$ will depend on that of $\bar{pl}_b$ through Theorem 1.

4.2. Belief, plausibility of singletons and intersection probability

Let us now recall the intersection probability we introduced in Section 3.2, as a member of the affine family. By definition of $p[b]$ (13)

$$p[b] = \sum_{x \in \Theta} m_b(x)\, b_x + \beta[b] \sum_{x \in \Theta} (pl_b(x) - m_b(x))\, b_x = (1 - \beta[b]) \sum_{x \in \Theta} m_b(x)\, b_x + \beta[b] \sum_{x \in \Theta} pl_b(x)\, b_x. \tag{18}$$

Analogously to what was done for the plausibility of singletons, we can define the belief function

$$\bar{b} \doteq \sum_{x \in \Theta} m_b(x)\, b_x = \sum_{x \in \Theta} m_b(x)\, b_x + (1 - k_{m_b})\, b_\Theta, \tag{19}$$

which assigns to $\Theta$ all the mass $b$ gives to non-singletons, where

$$k_{m_b} \doteq \sum_{x \in \Theta} m_b(x)$$

is the total mass of singletons. $\bar{b}$ is in other words a “discounted” probability (Shafer, 1976). Equation (18) can then be written as

$$p[b] = (1 - \beta[b])\, \bar{b} + \beta[b]\, \bar{pl}_b \tag{20}$$

where $\bar{b} \in Cl(b_\Theta, \mathcal{P}) \doteq \mathcal{D}$ and $\bar{pl}_b \in a(\mathcal{D})$, the affine space generated by the simplex $\mathcal{D}$ of discounted probabilities (belief functions assigning mass to singletons or $\Theta$ only).

Comparing (20) with (10) we recognize that the functions on the line $a(b, pl_b)$ are associated with other quantities lying on the line $a(\bar{b}, \bar{pl}_b)$ in the affine subspace $a(\mathcal{D})$ of discounted probabilities

$$b \leftrightarrow \bar{b}, \qquad pl_b \leftrightarrow \bar{pl}_b, \qquad \varsigma[b] \leftrightarrow p[b]$$

which are in the same relative positions.

In the binary case $\mathcal{D} = \mathcal{B}$, $\bar{b} = b$ and $\bar{pl}_b = pl_b$, so that the plausibility of singletons is a plausibility function (Figure 2), and the two lines coincide.
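Equations (11), (12) and (20) can be illustrated with a short computation of the intersection probability on singletons. The sketch assumes a non-Bayesian b.p.a. stored as a dictionary of frozensets, and uses the fact that the denominator of (11) equals $k_{pl_b} - k_{m_b}$. Names and masses are hypothetical.

```python
from fractions import Fraction


def beta_from_masses(m):
    """beta[b] computed directly from Equation (11)."""
    num = sum((v for A, v in m.items() if len(A) > 1), Fraction(0))
    den = sum((v * len(A) for A, v in m.items() if len(A) > 1), Fraction(0))
    return num / den


def intersection_probability(m, frame):
    """Return (beta[b], p[b]) with p[b] given on singletons via Equation (20)."""
    mass = {x: m.get(frozenset([x]), Fraction(0)) for x in frame}
    pl = {x: sum((v for B, v in m.items() if x in B), Fraction(0))
          for x in frame}
    k_m, k_pl = sum(mass.values()), sum(pl.values())
    if k_pl == k_m:
        # Assumption of this sketch: Bayesian inputs are rejected,
        # since beta[b] would be the indeterminate ratio 0/0.
        raise ValueError("b is Bayesian: beta[b] is undefined")
    beta = (1 - k_m) / (k_pl - k_m)
    p = {x: (1 - beta) * mass[x] + beta * pl[x] for x in frame}
    return beta, p


# Hypothetical b.p.a. on {x, y, z}, chosen only for illustration.
frame = ("x", "y", "z")
m = {frozenset("x"): Fraction(1, 2),
     frozenset("xy"): Fraction(1, 5),
     frozenset("xyz"): Fraction(3, 10)}

beta, p = intersection_probability(m, frame)
```

For this example `beta` agrees with `beta_from_masses(m)`, and the values of `p` sum to one, as expected of a Bayesian belief function.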

4.3. A three plane geometry

We need to reduce the geometry of relative plausibility and belief of singletons to that of $\bar{pl}_b$, $\bar{b}$. They correspond through Equation (6) to the two dual quantities

$$\hat{b} = \sum_{x \in \Theta} m_b(x)\, pl_x + (1 - k_{m_b})\, pl_\Theta = \bar{b} + (1 - k_{m_b})\, pl_\Theta$$

$$\hat{pl}_b = \sum_{x \in \Theta} pl_b(x)\, pl_x + (1 - k_{pl_b})\, pl_\Theta = \bar{pl}_b + (1 - k_{pl_b})\, pl_\Theta. \tag{21}$$

We can prove that (see Appendix)

Theorem 2. The line passing through the duals of plausibility of singletons (17) and belief of singletons (19) crosses $p[b]$ too, and

$$\beta[b]\,(\hat{pl}_b - \hat{b}) + \hat{b} = p[b] = \beta[b]\,(\bar{pl}_b - \bar{b}) + \bar{b}. \tag{22}$$
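Equation (22) can be verified numerically by writing all the quantities involved as vectors of the belief space. The sketch below neglects the coordinate of $\Theta$, as in Section 3, so that $b_\Theta$ is the origin and $pl_\Theta$ is the vector of ones, and uses $pl_x = b_x$ with $b_x(A) = 1$ iff $x \in A$. The representation, the helper name and the masses are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations

frame = ("x", "y", "z")
# Coordinates: one per non-empty proper subset of the frame.
coords = [frozenset(c)
          for r in range(1, len(frame))
          for c in combinations(frame, r)]


def singleton_combination(weights):
    """Vector sum_x weights[x] * b_x, where b_x(A) = 1 iff x belongs to A."""
    return [sum(w for x, w in weights.items() if x in A) for A in coords]


# Hypothetical b.p.a., chosen only for illustration.
m = {frozenset("x"): Fraction(1, 2),
     frozenset("xy"): Fraction(1, 5),
     frozenset("xyz"): Fraction(3, 10)}

mass = {x: m.get(frozenset([x]), Fraction(0)) for x in frame}
pl = {x: sum((v for B, v in m.items() if x in B), Fraction(0)) for x in frame}
k_m, k_pl = sum(mass.values()), sum(pl.values())
beta = (1 - k_m) / (k_pl - k_m)

b_bar = singleton_combination(mass)          # belief of singletons (19)
pl_bar = singleton_combination(pl)           # plausibility of singletons (17)
b_hat = [v + (1 - k_m) for v in b_bar]       # duals (21), pl_Theta = [1,...,1]'
pl_hat = [v + (1 - k_pl) for v in pl_bar]

p_vec = singleton_combination(
    {x: mass[x] + beta * (pl[x] - mass[x]) for x in frame})

lhs = [beta * (ph - bh) + bh for ph, bh in zip(pl_hat, b_hat)]
rhs = [beta * (pb - bb) + bb for pb, bb in zip(pl_bar, b_bar)]
```

Both `lhs` and `rhs` reproduce `p_vec`, i.e. the two lines meet at $p[b]$ at the same value $\beta[b]$ of the parameter.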

If $k_{m_b} \neq 0$ the geometry of relative plausibility and belief of singletons can then be described in terms of the three planes

$$a(\bar{pl}_b, p[b], \hat{pl}_b), \qquad a(b_\Theta, \tilde{pl}_b, pl_\Theta), \qquad a(b_\Theta, \tilde{b}, pl_\Theta)$$

(see Figure 3), where $\tilde{b} = \bar{b}/k_{m_b}$ is the relative belief of singletons. More precisely:

1. we have just seen that $p[b]$ is the intersection of both $a(\bar{b}, \bar{pl}_b)$ and $a(\hat{b}, \hat{pl}_b)$ and lies in the same relative position on the two lines (Section 4.1). Those two lines then span a plane which we can denote with

$$a(\bar{b}, p[b], \hat{b}) = a(\bar{pl}_b, p[b], \hat{pl}_b).$$

2. Furthermore, by definition

$$\tilde{pl}_b - b_\Theta = (\bar{pl}_b - b_\Theta)/k_{pl_b}$$

while (21) implies

$$\tilde{pl}_b = \bar{pl}_b / k_{pl_b} = (\hat{pl}_b - (1 - k_{pl_b})\, pl_\Theta)/k_{pl_b}$$

so that

$$\tilde{pl}_b - pl_\Theta = (\hat{pl}_b - pl_\Theta)/k_{pl_b}.$$

The relative plausibility function then lies in the same relative position on the two lines $a(b_\Theta, \bar{pl}_b)$ and $a(pl_\Theta, \hat{pl}_b)$, which intersect exactly in $\tilde{pl}_b$. $b_\Theta$, $pl_\Theta$, $\tilde{pl}_b$, $\bar{pl}_b$ and $\hat{pl}_b$ then determine another unique plane which we can denote with

$$a(b_\Theta, \tilde{pl}_b, pl_\Theta).$$

3. Analogously, by definition $\tilde{b} - b_\Theta = (\bar{b} - b_\Theta)/k_{m_b}$ while (21) yields $\tilde{b} - pl_\Theta = (\hat{b} - pl_\Theta)/k_{m_b}$, so that the relative belief of singletons lies in the same relative position on the two lines $a(b_\Theta, \bar{b})$ and $a(pl_\Theta, \hat{b})$, which intersect exactly in $\tilde{b}$. $b_\Theta$, $pl_\Theta$, $\tilde{b}$, $\bar{b}$ and $\hat{b}$ then determine a single plane denoted by

$$a(b_\Theta, \tilde{b}, pl_\Theta).$$

Figure 3. Planes and angles describing the geometry of relative plausibility and belief of singletons.

4.4. A geometry of three angles

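In the binary case, $\bar{b} = b = \hat{pl}_b$, $\bar{pl}_b = pl_b = \hat{b}$ and all these quantities are coplanar. This suggests a description of the geometry of $\tilde{pl}_b$, $\tilde{b}$ in terms of the three angles

$$\phi_1[b] \doteq \widehat{\tilde{pl}_b \; p[b] \; \bar{pl}_b}, \qquad \phi_2[b] \doteq \widehat{\bar{b} \; p[b] \; \hat{pl}_b}, \qquad \phi_3[b] \doteq \widehat{\tilde{b} \; b_\Theta \; \tilde{pl}_b}. \tag{23}$$

4.4.1. Orthogonality condition and non-Bayesianity flag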

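As a matter of fact, even though the line $a(b, pl_b)$ is always orthogonal to $\mathcal{P}$, $a(\bar{b}, \bar{pl}_b)$ is not in general orthogonal to the probabilistic subspace:

$$\exists\, y \neq x \in \Theta \;\; \text{s.t.} \;\; \langle \bar{pl}_b - \bar{b},\; b_y - b_x \rangle \neq 0, \tag{24}$$

where $\langle \cdot \rangle$ denotes the usual scalar product, and $\{b_y - b_x \;\forall y \neq x\}$ are the basis vectors of the affine space $a(\mathcal{P})$ generated by $\mathcal{P}$. $\phi_1$ is the angle between $a(\bar{b}, \bar{pl}_b)$ and a particular line $a(\tilde{b}, \tilde{pl}_b)$ lying on the probabilistic subspace. Its value has an interesting interpretation in terms of belief.

Theorem 3. $a(\bar{b}, \bar{pl}_b) \perp \mathcal{P}$ ($\phi_1[b] = \pi/2$) iff

$$\sum_{A \supsetneq x} m_b(A) = pl_b(x) - m_b(x) = \text{const} \quad \forall x \in \Theta.$$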

This can be in turn expressed by means of another probability function

$$R[b] \doteq \sum_{x \in \Theta} \frac{pl_b(x) - m_b(x)}{k_{pl_b} - k_{m_b}}\, b_x = \frac{\bar{pl}_b - \bar{b}}{k_{pl_b} - k_{m_b}} \tag{25}$$

which can be naturally interpreted as a flag of the non-Bayesianity of the belief function $b$. If $b$ is Bayesian, $pl_b(x) - m_b(x) = 0$ $\forall x \in \Theta$. If $b$ is not Bayesian, there exists at least a singleton $x$ such that $pl_b(x) - m_b(x) > 0$. This difference then measures how much $x$ contributes to the non-Bayesianity of $b$.

Corollary 1. The dual line $a(\bar{b}, \bar{pl}_b)$ is orthogonal to $\mathcal{P}$ iff the Bayesian b.f. $R[b]$ is the uninformative probability, $R[b] = \overline{\mathcal{P}} \doteq \sum_x b_x / n$.

In this case all singletons give the same relative contribution to the non-Bayesianity of $b$, and (recalling the definition of $p[b]$ (13))

$$p[b](x) = m_b(x) + \beta[b]\,(pl_b(x) - m_b(x)) = m_b(x) + \frac{1 - k_{m_b}}{\sum_{y \in \Theta} (pl_b(y) - m_b(y))}\,(pl_b(x) - m_b(x)) = m_b(x) + \frac{1 - k_{m_b}}{n}$$

the intersection probability assigns the mass originally given by $b$ to non-singletons to each singleton on equal basis.
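The role of the flag probability in Theorem 3 and Corollary 1 can be illustrated numerically. The sketch below computes $R[b]$ from Equation (25) and tests condition (24) by taking scalar products in the coordinates of the non-empty proper subsets of the frame, the same convention adopted in Section 3 where the coordinate of $\Theta$ is neglected. Function names, the two example b.p.a.s and the dictionary representation are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def non_singleton_plausibility(m, frame):
    """pl_b(x) - m_b(x): mass of the non-singleton focal elements containing x."""
    return {x: sum((v for B, v in m.items() if x in B and len(B) > 1),
                   Fraction(0))
            for x in frame}


def flag(m, frame):
    """Non-Bayesianity flag R[b] of Equation (25), given on singletons."""
    diff = non_singleton_plausibility(m, frame)
    total = sum(diff.values())  # equals k_pl - k_m
    if total == 0:
        raise ValueError("b is Bayesian: R[b] is undefined")
    return {x: d / total for x, d in diff.items()}


def dual_line_orthogonal(m, frame):
    """True iff <pl_bar - b_bar, b_y - b_x> = 0 for all pairs x, y (cf. (24))."""
    coords = [frozenset(c)
              for r in range(1, len(frame))
              for c in combinations(frame, r)]
    diff = non_singleton_plausibility(m, frame)
    v = [sum(d for x, d in diff.items() if x in A) for A in coords]
    for x, y in combinations(frame, 2):
        direction = [int(y in A) - int(x in A) for A in coords]
        if sum(a * b for a, b in zip(v, direction)) != 0:
            return False
    return True


frame = ("x", "y", "z")
# Hypothetical example 1: every singleton receives the same extra plausibility.
m_even = {frozenset("x"): Fraction(1, 2),
          frozenset("y"): Fraction(1, 5),
          frozenset("xyz"): Fraction(3, 10)}
# Hypothetical example 2: the extra plausibility is unevenly spread.
m_uneven = {frozenset("x"): Fraction(1, 2),
            frozenset("xy"): Fraction(1, 5),
            frozenset("xyz"): Fraction(3, 10)}

flag_even = flag(m_even, frame)
flag_uneven = flag(m_uneven, frame)
```

In the first example the flag is the uniform probability and the line is orthogonal to the probabilistic subspace, while in the second neither property holds.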

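On the other side,

Theorem 4. $\phi_2[b] \neq 0$ and the lines $a(\bar{b}, \bar{pl}_b)$, $a(\hat{b}, \hat{pl}_b)$ never coincide $\forall b \in \mathcal{B}$ when $|\Theta| > 2$; instead $\phi_2[b] = 0$ $\forall b \in \mathcal{B}$ when $|\Theta| \leq 2$.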

4.4.2. Geometric interpretation of the flag probability

$R[b]$ has a simple geometric interpretation, as Equation (25) implies

$$R[b]\,(k_{pl_b} - k_{m_b}) = \bar{pl}_b - \bar{b} = \tilde{pl}_b\, k_{pl_b} - \tilde{b}\, k_{m_b} = \tilde{pl}_b\, k_{pl_b} - \tilde{b}\, k_{m_b} + k_{pl_b}\, \tilde{b} - k_{pl_b}\, \tilde{b} = k_{pl_b}\,(\tilde{pl}_b - \tilde{b}) + \tilde{b}\,(k_{pl_b} - k_{m_b})$$

from which

$$R[b] = \tilde{b} + \frac{k_{pl_b}}{k_{pl_b} - k_{m_b}}\,(\tilde{pl}_b - \tilde{b}). \tag{26}$$

$R[b]$ lies on the line joining $\tilde{b}$ and $\tilde{pl}_b$, outside the segment they form. Figure 4 illustrates how its location depends on the ratio between total belief of singletons $k_{m_b}$ and total plausibility of singletons $k_{pl_b}$.

Figure 4. Location of $R[b]$ in $\mathcal{P}$.

4.4.3. Equality condition for the epistemic family

Finally, the last angle $\phi_3[b]$ is related to the condition under which relative plausibility and belief of singletons coincide: the analogue of 2-additivity for the “affine” family of Bayesian approximations.

As a matter of fact, it vanishes iff $\tilde{b} = \tilde{pl}_b$, which is equivalent to $m_b(x)/k_{m_b} = pl_b(x)/k_{pl_b}$ $\forall x \in \Theta$. In other words,

Theorem 5. Relative plausibility and belief of singletons coincide iff mass and plausibility are distributed homogeneously among singletons.

Again, a necessary and sufficient condition for $\phi_3[b] = 0$ can be expressed in terms of $R[b]$, as

$$R[b](x) = \frac{pl_b(x) - m_b(x)}{k_{pl_b} - k_{m_b}} = \frac{1}{k_{pl_b} - k_{m_b}} \Big( \frac{k_{pl_b}}{k_{m_b}}\, m_b(x) - m_b(x) \Big) = \frac{m_b(x)}{k_{m_b}} \quad \forall x,$$

i.e. $R[b] = \tilde{b}$, with $R[b]$ “squashing” $\tilde{pl}_b$ onto $\tilde{b}$ from the outside. In this case the quantities $\bar{pl}_b$, $\hat{pl}_b$, $\tilde{pl}_b$, $p[b]$, $\bar{b}$, $\hat{b}$, $\tilde{b}$ all lie in the same plane.

4.5. Singular case

We need to pay some attention to the singular case in which the relative belief of singletons simply does not exist, $k_{m_b} = \sum_x m_b(x) = 0$. We can note that, even in this case, the belief of singletons $\bar{b}$ still exists, and from Equation (19)

\[
\bar{b} = b_\Theta
\]

while $\hat{b} = pl_\Theta$ by duality. But then, remembering the description in terms of planes we gave in Section 4.3, the first two planes

\[
a(\hat{b}, p[b], \bar{b}) = a\big(a(\hat{b}, \hat{pl}_b),\, a(\bar{b}, \bar{pl}_b)\big) = a\big(a(b_\Theta, \bar{pl}_b),\, a(pl_\Theta, \hat{pl}_b)\big) = a(b_\Theta, \tilde{pl}_b, pl_\Theta)
\]

coincide, while the third one $a(b_\Theta, \tilde{b}, pl_\Theta)$ simply does not exist.

Figure 5. Geometry of relative plausibility and belief of singletons in the singular case when $k_{m_b} = \sum_x m_b(x) = 0$.

The geometry of relative belief and plausibility of singletons in the singular case then reduces to a planar geometry (see Figure 5) depending only on the angle φ2[b]. It is worth pointing out that, in this case,

\[
p[b](x) = m_b(x) + \frac{1 - k_{m_b}}{k_{pl_b} - k_{m_b}}\,(pl_b(x) - m_b(x)) = \frac{1}{k_{pl_b}}\, pl_b(x) = \tilde{pl}_b(x).
\]

Theorem 6. If a belief function b does not admit relative belief (as b assigns zero mass to all singletons) then its relative plausibility and intersection probability coincide.
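As a sanity check of Theorem 6, the sketch below evaluates the intersection probability, the relative plausibility and the flag R[b] on a belief function with no mass on singletons. It is a hedged Python illustration only: the function names and the example mass assignment are assumptions introduced here, and the formula used for p[b] is the one written in terms of β[b] in the text.

```python
def singleton_quantities(mass, frame):
    """Singleton masses m_b(x), plausibilities pl_b(x) and their totals."""
    m = {x: mass.get(frozenset([x]), 0.0) for x in frame}
    pl = {x: sum(v for A, v in mass.items() if x in A) for x in frame}
    return m, pl, sum(m.values()), sum(pl.values())


def intersection_probability(mass, frame):
    """p[b](x) = m_b(x) + beta[b] (pl_b(x) - m_b(x))."""
    m, pl, k_m, k_pl = singleton_quantities(mass, frame)
    beta = (1.0 - k_m) / (k_pl - k_m)
    return {x: m[x] + beta * (pl[x] - m[x]) for x in frame}


def relative_plausibility(mass, frame):
    """pl~_b(x) = pl_b(x) / k_pl."""
    _, pl, _, k_pl = singleton_quantities(mass, frame)
    return {x: pl[x] / k_pl for x in frame}


def flag(mass, frame):
    """R[b](x) = (pl_b(x) - m_b(x)) / (k_pl - k_m)."""
    m, pl, k_m, k_pl = singleton_quantities(mass, frame)
    return {x: (pl[x] - m[x]) / (k_pl - k_m) for x in frame}


frame = ["x", "y", "z"]
# Singular case: no mass on singletons, so k_m = 0 (hypothetical example).
singular = {frozenset("xy"): 0.5, frozenset("xyz"): 0.5}

p_int = intersection_probability(singular, frame)
rel_pl = relative_plausibility(singular, frame)
r_flag = flag(singular, frame)
```

With these masses the singleton plausibilities are 1, 1 and 0.5, so all three functions reduce to the same normalized plausibility vector.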

Also, in this case the non-Bayesianity flag coincides with the relative plausibility of singletons too: $R[b] = \tilde{pl}_b = p[b]$ (see Equation (25)).

5. Relative plausibility and probabilistic approximation

All Bayesian belief functions described in Section 3.2 are feasible candidate solutions for the probabilistic approximation problem. Given a belief function b we want to find the probability which is the “closest” to b, according to some criterion. For instance, the orthogonal projection π[b] of b onto P is by definition the unique Bayesian function which minimizes the standard Euclidean distance between b and P in the belief space:

\[
\pi[b] = \arg\min_{p \in \mathcal{P}} \|p - b\|_{L_2} = \arg\min_{p \in \mathcal{P}} \sum_{A \subseteq \Theta} |b(A) - p(A)|^2 .
\]

Many different optimization criteria can be proposed, yielding distinct approximation problems. However, the rule of combination is central in the theory of evidence: a belief function is useful only when it is combined with others in a reasoning process. It is natural to think that this should be taken into account. We can then formulate an optimization problem based on the “external” behavior of the desired approximation.

Criterion. A good approximation of a belief function, when combined with any other belief function, must produce results similar to those obtained by combining the original function.

Analytically, this can be expressed in the following way

\[
\hat{b} = \arg\min_{b'' \in \mathcal{C}} \int_{b' \in \mathcal{B}} dist(b \oplus b', b'' \oplus b')\, db' \qquad (27)
\]

where b is the original belief function to approximate, b′ ∈ B is an arbitrary belief function on the same frame, dist is a distance function, and C is the class of belief functions the approximation belongs to. Of course the role of ⊕ can be played by any other meaningful operator, like for instance the disjunctive rule of combination for unnormalized belief functions (Smets, 1992).

Let us consider here the class C = P of Bayesian belief functions.

5.1. The geometry of representation

The relative plausibility function possesses a peculiar property which makes it a candidate solution of the probabilistic approximation problem as posed in Equation (27): $\tilde{pl}_b$ represents b perfectly when combined with any Bayesian b.f. (Proposition 5). An immediate consequence is that the modified version of the approximation problem, in which the original b.f. is combined with all and only the Bayesian belief functions, is trivially solved by $\tilde{pl}_b$:

\[
\tilde{pl}_b = \arg\min_{p \in \mathcal{P}} \int_{p' \in \mathcal{P}} \|b \oplus p' - p \oplus p'\|\, dp' \qquad (28)
\]

whatever the norm we choose, as $b \oplus p' - \tilde{pl}_b \oplus p' = 0$ ∀p′. It is then natural to conjecture that the relative plausibility function could be the solution of the general approximation problem (27), too.
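The representation property behind Equation (28) is easy to verify on an example. The sketch below is an illustrative Python implementation of Dempster's rule on mass assignments stored as dictionaries; the function names and the numerical values are assumptions made for this example, not taken from the paper.

```python
def dempster(m1, m2):
    """Dempster's orthogonal sum of two mass assignments (frozenset -> mass)."""
    joint = {}
    for A, a in m1.items():
        for B, b in m2.items():
            if A & B:  # keep only non-empty intersections
                joint[A & B] = joint.get(A & B, 0.0) + a * b
    k = sum(joint.values())  # normalization factor k(b1, b2)
    if abs(k) < 1e-15:
        raise ValueError("the two functions are not combinable")
    return {C: v / k for C, v in joint.items()}


def relative_plausibility(mass):
    """Relative plausibility of singletons, returned as a Bayesian mass assignment."""
    frame = set().union(*mass)
    pl = {x: sum(v for A, v in mass.items() if x in A) for x in frame}
    k_pl = sum(pl.values())
    return {frozenset([x]): pl[x] / k_pl for x in frame}


# Hypothetical belief function b and Bayesian function p on {x, y, z}.
b = {frozenset("x"): 0.2, frozenset("xy"): 0.3, frozenset("xyz"): 0.5}
p = {frozenset("x"): 0.5, frozenset("y"): 0.3, frozenset("z"): 0.2}

lhs = dempster(b, p)                         # b combined with p
rhs = dempster(relative_plausibility(b), p)  # pl~_b combined with p
max_gap = max(abs(lhs.get(A, 0.0) - rhs.get(A, 0.0)) for A in set(lhs) | set(rhs))
```

Both combinations assign to each singleton x a mass proportional to p(x) pl_b(x), so `max_gap` is zero up to rounding.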

A first step towards the proof of such a conjecture is to notice that the representation property (Proposition 5) is inherited by all the points on the line joining the relative plausibility $\tilde{pl}_b$ with b. We just need to recall a useful result on Dempster's sum of affine combinations (Cuzzolin, 2004).

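Proposition 6. The orthogonal sum $b \oplus \sum_i \alpha_i b_i$, $\sum_i \alpha_i = 1$, of a b.f. b with any affine combination of belief functions⁴ can be written as

\[
b \oplus \sum_i \alpha_i b_i = \sum_i \gamma_i\, (b \oplus b_i) \qquad (29)
\]

where

\[
\gamma_i = \frac{\alpha_i\, k(b, b_i)}{\sum_j \alpha_j\, k(b, b_j)} \qquad (30)
\]

and k(b,bi) is the normalization factor of the combination between b and bi.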
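Equations (29) and (30) can be illustrated numerically. The following Python sketch is only an illustration under stated assumptions: mass assignments are dictionaries, an affine combination of belief functions is taken as the same affine combination of their masses, and the helper names and numbers are hypothetical.

```python
def combine(m1, m2):
    """Dempster's sum of two (pseudo) mass assignments.

    Returns the normalized result together with the normalization factor k.
    """
    joint = {}
    for A, a in m1.items():
        for B, b in m2.items():
            if A & B:
                joint[A & B] = joint.get(A & B, 0.0) + a * b
    k = sum(joint.values())
    if abs(k) < 1e-15:
        raise ValueError("the two functions are not combinable")
    return {C: v / k for C, v in joint.items()}, k


def affine(coeffs, masses):
    """Affine combination sum_i coeffs[i] * masses[i] of mass assignments."""
    out = {}
    for c, m in zip(coeffs, masses):
        for A, v in m.items():
            out[A] = out.get(A, 0.0) + c * v
    return out


b = {frozenset("x"): 0.5, frozenset("xyz"): 0.5}
b1 = {frozenset("y"): 0.3, frozenset("xy"): 0.7}
b2 = {frozenset("z"): 0.6, frozenset("xz"): 0.4}
alpha = [1.5, -0.5]  # affine coefficients: they sum to 1, one is negative

# Left-hand side of (29): combine b with the affine combination.
lhs, _ = combine(b, affine(alpha, [b1, b2]))

# Right-hand side of (29): affine combination of the partial sums, weights from (30).
parts, ks = zip(*(combine(b, bi) for bi in (b1, b2)))
denom = sum(a * k for a, k in zip(alpha, ks))
gamma = [a * k / denom for a, k in zip(alpha, ks)]
rhs = affine(gamma, parts)

max_gap = max(abs(lhs.get(A, 0.0) - rhs.get(A, 0.0)) for A in set(lhs) | set(rhs))
```

The coefficients `gamma` again sum to 1, so the right-hand side is itself an affine combination, here of the two partial sums b ⊕ b1 and b ⊕ b2.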

Proposition 6 can then be used to prove that

Theorem 7. All pseudo belief functions ς of the line $a(b, \tilde{pl}_b)$ are perfect representatives of b when combined with any Bayesian belief function through Dempster's rule:

\[
\varsigma \oplus p = b \oplus p \quad \forall p \in \mathcal{P},\ \forall \varsigma \in a(b, \tilde{pl}_b).
\]

In particular, all the belief functions on the segment $Cl(b, \tilde{pl}_b)$ are perfect representatives of b.
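Theorem 7 can also be checked on an example by sampling a few points of the line, including points outside the segment, where ς is only a pseudo belief function. This is a hedged Python sketch: the helpers, the values of λ and the masses are assumptions introduced for illustration.

```python
def dempster(m1, m2):
    """Dempster's sum of two (pseudo) mass assignments (frozenset -> mass)."""
    joint = {}
    for A, a in m1.items():
        for B, b in m2.items():
            if A & B:
                joint[A & B] = joint.get(A & B, 0.0) + a * b
    k = sum(joint.values())
    if abs(k) < 1e-15:
        raise ValueError("the two functions are not combinable")
    return {C: v / k for C, v in joint.items()}


def relative_plausibility(mass):
    """Relative plausibility of singletons as a Bayesian mass assignment."""
    frame = set().union(*mass)
    pl = {x: sum(v for A, v in mass.items() if x in A) for x in frame}
    k_pl = sum(pl.values())
    return {frozenset([x]): pl[x] / k_pl for x in frame}


def point_on_line(lam, m1, m2):
    """Mass assignment of lam * m1 + (1 - lam) * m2."""
    keys = set(m1) | set(m2)
    return {A: lam * m1.get(A, 0.0) + (1 - lam) * m2.get(A, 0.0) for A in keys}


b = {frozenset("x"): 0.2, frozenset("y"): 0.1,
     frozenset("xy"): 0.3, frozenset("xyz"): 0.4}
p = {frozenset("x"): 0.5, frozenset("y"): 0.3, frozenset("z"): 0.2}
target = dempster(b, p)
rel_pl = relative_plausibility(b)

gaps = []
for lam in (-0.5, 0.3, 1.0, 1.7):  # lam outside [0, 1] gives pseudo b.f.s
    sigma = point_on_line(lam, b, rel_pl)
    combined = dempster(sigma, p)
    keys = set(combined) | set(target)
    gaps.append(max(abs(combined.get(A, 0.0) - target.get(A, 0.0)) for A in keys))
```

Every entry of `gaps` is zero up to rounding, whichever point of the line is chosen.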

Again the proof can be found in the Appendix. Theorem 7 can in

turn be used to prove the conjecture in the case of binary frames.

5.2. Probabilistic approximation in the binary frame

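Let us then see how the general criterion (27) works for the binary frame. We can rewrite the integral in (27) in terms of the lines $a(b', \tilde{pl}_{b'})$, or better the segments $Cl(c, \tilde{pl}_{b'})$ (see Figure 6), by exploiting the representation property. After simplifying the notation for $\tilde{p} \doteq \tilde{pl}_{b'}$ we can write

\[
\hat{p} = \arg\min_{p \in \mathcal{P}} \int_{\tilde{p} \in \mathcal{P}} \int_{b' \in Cl(c, \tilde{p})} |b \oplus b' - p \oplus b'|\, db'\, d\tilde{p} = \arg\min_{p \in \mathcal{P}} \int_{\tilde{p} \in \mathcal{P}} \int_{b' \in Cl(c, \tilde{p})} |b \oplus b' - p \oplus \tilde{p}|\, db'\, d\tilde{p}
\]

since by Proposition 5 $p \oplus b' = p \oplus \tilde{p}$. Besides, as

\[
b \oplus b' = b \oplus [\lambda \tilde{p} + (1 - \lambda) c] = \nu(\lambda)\, b \oplus \tilde{p} + (1 - \nu(\lambda))\, b \oplus c
\]

we have that

\[
\hat{p} = \arg\min_{p \in \mathcal{P}} \int_{\tilde{p} \in \mathcal{P}} \int_{\lambda \in [0,1]} |\nu(\lambda)\, b \oplus \tilde{p} + (1 - \nu(\lambda))\, b \oplus c - p \oplus \tilde{p}|\, d\lambda\, d\tilde{p} \qquad (31)
\]

where, by (30),

\[
\nu(\lambda) = \frac{\lambda\, k(b, \tilde{p})}{\lambda\, k(b, \tilde{p}) + (1 - \lambda)\, k(b, c)}.
\]

4 In fact the collection {bi} is required to include at least one b.f. which is combinable with b (Cuzzolin, 2004).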

We can then recall that Dempster's rule maps a line passing through any pair of points to another line passing through the images of these points (Cuzzolin, 2004):

\[
\varsigma \oplus : a(\varsigma_1, \varsigma_2) \to a(\varsigma \oplus \varsigma_1, \varsigma \oplus \varsigma_2).
\]

Figure 6. Probabilistic approximation in B2. Dempster's rule maps lines passing through $pl_\Theta = [1, 1]'$ to lines still passing through the same point, as $b \oplus [1, 1]' = [1, 1]'$ ∀b. This implies that $b \oplus \tilde{pl}_{b'}$ is the relative plausibility of $b \oplus b'$ whatever $b' \in Cl(c, \tilde{p})$ is. $p = \tilde{pl}_b$ is the only Bayesian function for which $p \oplus \tilde{p} = b \oplus \tilde{p}$ and all the difference vectors $b \oplus b' - p \oplus \tilde{p}$ are parallel.

The consequence is that, since $b \oplus pl_\Theta = b \oplus [1,1]' = [1,1]' = pl_\Theta$ (a trivial consequence of the definition of ⊕), not only do c and $\tilde{p}$ belong to a line l passing through $pl_\Theta$, but the images $b \oplus c$ and $b \oplus \tilde{p}$ of c and $\tilde{p}$ through the map b ⊕ (.) also lie on a line passing through $pl_\Theta = [1, 1]'$. As in the binary case the relative plausibility of b is the intersection of P with the line joining b and $pl_\Theta$ (see Section 3), this fact implies that

\[
b \oplus \tilde{p} = \tilde{pl}_{b \oplus c}
\]

i.e. $b \oplus \tilde{p}$ is the relative plausibility of $b \oplus c$ (see Figure 6).

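Now, if $p = \tilde{pl}_b$

\[
\begin{aligned}
|\nu(\lambda)\, b \oplus \tilde{p} + (1 - \nu(\lambda))\, b \oplus c - p \oplus \tilde{p}| &= |\nu(\lambda)\, \tilde{pl}_b \oplus \tilde{p} + (1 - \nu(\lambda))\, b \oplus c - \tilde{pl}_b \oplus \tilde{p}| \\
&= |(1 - \nu(\lambda))\, b \oplus c - (1 - \nu(\lambda))\, \tilde{pl}_b \oplus \tilde{p}| = |1 - \nu(\lambda)|\, |b \oplus c - \tilde{pl}_b \oplus \tilde{p}|
\end{aligned}
\]

since $b \oplus \tilde{p} = \tilde{pl}_b \oplus \tilde{p}$, and the integral (31) to optimize becomes

\[
\int_{\tilde{p} \in \mathcal{P}} |b \oplus c - \tilde{pl}_b \oplus \tilde{p}| \int_{\lambda \in [0,1]} |1 - \nu(\lambda)|\, d\lambda\, d\tilde{p}.
\]

Geometrically this means that for all the belief functions $b' \in Cl(c, \tilde{p})$ all the difference vectors $b \oplus b' - \tilde{pl}_b \oplus \tilde{p}$ appearing in the integral are parallel to the line $a(b \oplus c, b \oplus \tilde{p})$.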

This property (which is not true for any other probability p ∈ P) is inherent to the relative plausibility of singletons, and seems to strongly support our conjecture.
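The fact used in the binary argument, namely that combining b with the relative plausibility of b′ yields the relative plausibility of b ⊕ b′, can be checked directly on a binary frame. The Python sketch below is illustrative only; the two belief functions and the helper names are hypothetical.

```python
def dempster(m1, m2):
    """Dempster's orthogonal sum of two mass assignments (frozenset -> mass)."""
    joint = {}
    for A, a in m1.items():
        for B, b in m2.items():
            if A & B:
                joint[A & B] = joint.get(A & B, 0.0) + a * b
    k = sum(joint.values())
    if abs(k) < 1e-15:
        raise ValueError("the two functions are not combinable")
    return {C: v / k for C, v in joint.items()}


def relative_plausibility(mass):
    """Relative plausibility of singletons as a Bayesian mass assignment."""
    frame = set().union(*mass)
    pl = {x: sum(v for A, v in mass.items() if x in A) for x in frame}
    k_pl = sum(pl.values())
    return {frozenset([x]): pl[x] / k_pl for x in frame}


# Two hypothetical belief functions on the binary frame {x, y}.
b = {frozenset("x"): 0.3, frozenset("y"): 0.2, frozenset("xy"): 0.5}
b_prime = {frozenset("x"): 0.1, frozenset("y"): 0.5, frozenset("xy"): 0.4}

lhs = relative_plausibility(dempster(b, b_prime))  # pl~ of (b + b')
rhs = dempster(b, relative_plausibility(b_prime))  # b + pl~ of b'
max_gap = max(abs(lhs[A] - rhs[A]) for A in lhs)
```

Here both sides give masses proportional to 0.4 and 0.63 on x and y, so the two Bayesian functions coincide.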

6. Equality conditions for both families of Bayesian approximations

The rich tapestry of results of Section 4 completes our knowledge of the geometry of the relation between belief functions and probability which started with the affine family (Cuzzolin, 2007d).

The epistemic family is formed by Bayesian b.f. which depend on the balance between the total plausibility $k_{pl_b}$ of the elements of the frame, and the total mass $k_{m_b}$ assigned to them. This measure of the “non-Bayesianity” of b is symbolized by the Bayesian function R[b]. Theorem 5 states the condition under which they coincide in terms of the flag R[b], and is the analogue of the 2-additivity condition in the affine family (Cuzzolin, 2007d) (Proposition 1).

After having stressed the different semantics of the two groups, however (see Section 3.3), it is worth understanding under which conditions functions of both families reduce to the same Bayesian b.f. Theorem 6 is a first step in this direction: when b does not admit relative belief, its relative plausibility $\tilde{pl}_b$ and intersection probability p[b] coincide.

In this last part of the paper we draw inspiration from the binary case (Section 3.1) to find a sufficient condition for all Bayesian b.f. to merge. This condition is expressed in terms of equal distribution of masses.

6.1. Equal plausibility and mass distribution in the first family

Let us start by focusing on functions of the affine family, and look for a more stringent condition for their unification. In particular, let us consider orthogonal projection and pignistic function. We first need to recall the general expression of the orthogonal projection of b onto P (Cuzzolin, 2007d):

\[
\pi[b](x) = \sum_{A \supseteq x} m_b(A)\, \frac{1 + |A^c|\, 2^{1-|A|}}{n} + \sum_{A \not\supset x} m_b(A)\, \frac{1 - |A|\, 2^{1-|A|}}{n}. \qquad (32)
\]

Remembering the general expression (14) of the pignistic function we can prove that

Lemma 1. The difference π[b](x) − BetP[b](x) between the probability values of orthogonal projection and pignistic function is

\[
\sum_{A \subseteq \Theta} m_b(A)\, \frac{1 - |A|\, 2^{1-|A|}}{n} - \sum_{A \supseteq x} m_b(A)\, \frac{1 - |A|\, 2^{1-|A|}}{|A|}. \qquad (33)
\]

An immediate consequence of (33) is that

Theorem 8. Orthogonal projection and pignistic function coincide iff

\[
\sum_{A \supseteq x} m_b(A)\, (1 - |A|\, 2^{1-|A|})\, \frac{|A^c|}{|A|} = \sum_{A \not\supset x} m_b(A)\, (1 - |A|\, 2^{1-|A|}) \qquad (34)
\]

for all x ∈ Θ.

Theorem 8 gives an exhaustive but rather arid description of the relation between π[b] and BetP[b]. On the other hand, the structure of Equation (34) tells us that it can be met if mb(A) is made a function of |A|. As a matter of fact, a number of sufficient conditions can be given with a more intuitive meaning in terms of belief.

Theorem 9. The following are sufficient conditions for BetP[b] = π[b]:

1. $m_b(A) = 0$ for all A with $|A| \neq 1, 2, n$;

2. mass is equally distributed among events of the same size k ≥ 3,

\[
m_b(A) = \frac{\sum_{|B|=k} m_b(B)}{\binom{n}{k}}, \quad \forall A : |A| = k,\ \forall k = 3, .., n;
\]

3. for all singletons x ∈ Θ, and ∀k = 3, .., n − 1

\[
pl_b(x; k) \doteq \sum_{A \supset x,\, |A| = k} m_b(A) = const = pl_b(\cdot\,; k). \qquad (35)
\]

Condition 2. states that if mass is equally distributed among higher-size events the orthogonal projection is the pignistic function: the probability closest to b (in Euclidean sense) is indeed the barycenter of the simplex P[b] of consistent probabilities.

The quantity plb(x;k), instead, represents the contribution of size k subsets to the plausibility of x ∈ Θ. Condition 3. then says that events of the same size contribute with the same amount to the plausibility of each singleton. All those conditions are clearly met by b.f. on binary frames, as |A| ≤ 2.
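Condition 2 of Theorem 9 can be illustrated on a frame of size n = 4, the smallest size for which the two functions may differ. The sketch below is a Python illustration under stated assumptions: the orthogonal projection is computed from the closed form (32), the pignistic function from its usual definition (mass of each event shared equally among its elements), and the function names and mass values are hypothetical.

```python
from itertools import combinations


def pignistic(mass, frame):
    """BetP[b](x): each event shares its mass equally among its elements."""
    return {x: sum(v / len(A) for A, v in mass.items() if x in A) for x in frame}


def orthogonal_projection(mass, frame):
    """pi[b](x) computed from the closed form of Equation (32)."""
    n = len(frame)
    out = {}
    for x in frame:
        total = 0.0
        for A, v in mass.items():
            w = 2.0 ** (1 - len(A))
            if x in A:
                total += v * (1 + (n - len(A)) * w) / n
            else:
                total += v * (1 - len(A) * w) / n
        out[x] = total
    return out


frame = [0, 1, 2, 3]
triples = [frozenset(c) for c in combinations(frame, 3)]

# Equal mass on all events of size 3: condition 2 holds.
equal = {frozenset([0]): 0.1, frozenset([1, 2]): 0.2, frozenset(frame): 0.3}
equal.update({A: 0.1 for A in triples})

# All size-3 mass on a single event: condition 2 fails.
unequal = {frozenset([0]): 0.1, frozenset([1, 2]): 0.2,
           frozenset([0, 1, 2]): 0.4, frozenset(frame): 0.3}


def max_gap(mass):
    bet, proj = pignistic(mass, frame), orthogonal_projection(mass, frame)
    return max(abs(bet[x] - proj[x]) for x in frame)


gap_equal, gap_unequal = max_gap(equal), max_gap(unequal)
```

In the first case the two probabilities agree up to rounding; in the second they differ, for instance on the element 3 that lies outside the only focal triple.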

6.2. Equal plausibility distribution as a general sufficient condition

The binary case again provides us with an intuition of how equal distribution of plausibility (Equation (35)) gives an equality condition for Bayesian approximations of b of both families. From Figure 2 we can appreciate how belief functions with mb(x) = mb(y) lie on the bisector of the first quadrant, which is orthogonal to P. Their relative plausibility is then equal to their orthogonal projection π[b].

Let us first note that Theorem 3 can indeed be interpreted in terms of equal distribution of plausibility among singletons. If Equation (35) is met for all k = 2,…,n − 1 (this is trivially true for k = n) then the non-Bayesian contribution $pl_b(x) - m_b(x) = \sum_{A \supset x,\, A \neq x} m_b(A)$ of each singleton x, to which R[b](x) is proportional by Equation (25), becomes

\[
\sum_{A \supset x,\, A \neq x} m_b(A) = \sum_{k=2}^{n}\ \sum_{|A| = k,\, A \supset x} m_b(A) = m_b(\Theta) + \sum_{k=2}^{n-1} pl_b(\cdot\,; k)
\]

which is constant ∀x ∈ Θ. In other words,

Corollary 2. If plb(x; k) = const for all x ∈ Θ, ∀k = 2, …, n − 1 then the dual line $a(\bar{b}, \bar{pl}_b)$ is orthogonal to P, and the non-Bayesianity flag becomes the non-informative probability $R[b] = \overline{\mathcal{P}}$.

The quantity plb(x;k) seems then to be connected to geometric orthogonality in the belief space. This is confirmed by the following remark. In (Cuzzolin, 2007d) we have shown that a belief function b ∈ B is orthogonal (as a vector of $\mathbb{R}^N$) to the probabilistic subspace P iff

\[
\sum_{A \supset y,\, A \not\supset x} m_b(A)\, 2^{1-|A|} = \sum_{A \supset x,\, A \not\supset y} m_b(A)\, 2^{1-|A|}. \qquad (36)
\]

Again, a sufficient condition for (36) can be given in terms of equal distribution of plausibility.

Theorem 10. If a belief function b is such that plb(x;k) = const = plb(·; k) for all k = 1, …, n − 1 then b is orthogonal to the probabilistic subspace, b⊥P.

In this case, confirming the intuition given by the binary case (Figure 2), all Bayesian approximations of b converge to the same probability:

Theorem 11. If plb(x;k) = const = plb(·;k) for all k = 1,…,n − 1 then b is orthogonal to P, and

\[
\tilde{pl}_b = R[b] = \pi[b] = BetP[b] = \overline{\mathcal{P}}.
\]

Theorem 11 gives then a sufficient condition under which all the probabilities associated with b merge.
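The merging of all the approximations can be illustrated on a belief function that assigns the same mass to all events of the same size. This is a minimal Python sketch under assumptions: the frame size, the per-size masses and the variable names are hypothetical, the orthogonal projection is computed from Equation (32), and the intersection probability from its expression in terms of β[b].

```python
from itertools import combinations

frame = [0, 1, 2, 3]
n = len(frame)

# Hypothetical b.f.: every event of size k receives the same mass size_mass[k].
size_mass = {1: 0.05, 2: 0.05, 3: 0.05, 4: 0.30}
mass = {frozenset(c): size_mass[k]
        for k in range(1, n + 1) for c in combinations(frame, k)}

m = {x: mass[frozenset([x])] for x in frame}
pl = {x: sum(v for A, v in mass.items() if x in A) for x in frame}
k_m, k_pl = sum(m.values()), sum(pl.values())

rel_bel = {x: m[x] / k_m for x in frame}                    # relative belief
rel_pl = {x: pl[x] / k_pl for x in frame}                   # relative plausibility
flag = {x: (pl[x] - m[x]) / (k_pl - k_m) for x in frame}    # R[b]
beta = (1.0 - k_m) / (k_pl - k_m)
p_int = {x: m[x] + beta * (pl[x] - m[x]) for x in frame}    # intersection probability
bet_p = {x: sum(v / len(A) for A, v in mass.items() if x in A) for x in frame}


def projection(x):
    """pi[b](x) from Equation (32)."""
    total = 0.0
    for A, v in mass.items():
        w = 2.0 ** (1 - len(A))
        if x in A:
            total += v * (1 + (n - len(A)) * w) / n
        else:
            total += v * (1 - len(A) * w) / n
    return total


proj = {x: projection(x) for x in frame}
all_functions = [rel_bel, rel_pl, flag, p_int, bet_p, proj]
worst = max(abs(f[x] - 1.0 / n) for f in all_functions for x in frame)
```

All six functions take the value 1/n on every singleton, so `worst` is zero up to rounding.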

In conclusion, if events of the same size contribute equally to the plausibility of each singleton (plb(x;k) = const) for certain values of |A| = k, we have the following consequences on the relation between all Bayesian b.f. and their geometry

k = 3,…,n ⊢ $BetP[b] = \pi[b]$

k = 2,…,n ⊢ $a(\bar{b}, \bar{pl}_b) \perp \mathcal{P}$

k = 1,…,n ⊢ $b \perp \mathcal{P}$, $\tilde{pl}_b = \tilde{b} = R[b] = \overline{\mathcal{P}} = BetP[b] = p[b] = \pi[b]$.

Less binding conditions may be harder to formulate, but are worth studying in the near future.

7. Conclusions

Each belief function is associated with two different families of Bayesian functions, distinguished by the operator with which they commute: affine combination or Dempster's rule. While the affine family p[b], π[b], BetP[b] is inherently related to the relative positions of belief and plausibility functions, and unifies under the assumption of 2-additivity, the epistemic family $\tilde{b}$, $\tilde{pl}_b$, R[b] has a geometry that can be described in terms of three planes and angles which depend on the non-Bayesian contribution of singletons measured by R[b]. The study conducted here completes the picture of the geometric behavior of all Bayesian b.f. related to b, started in (Cuzzolin, 2007d). As an application, we considered the probabilistic approximation problem posed in terms of Dempster's rule: the representation property of $\tilde{pl}_b$ can then be used together with its geometry to conjecture a new role of the relative plausibility function as solution of this optimization problem.

Finally, unifying conditions for all Bayesian functions of both families can be given by means of the notion of equal plausibility distribution. The results we provided here, nevertheless, give only point-wise information about the differences between distinct approximations. In (Cuzzolin, 2007d), (Cuzzolin, 2007c) we started working on a quantitative analysis of these differences as functions of the basic probability assignment of the original belief function: a complete, exhaustive quantitative comparison of all Bayesian approximations is the natural arrival point of this line of research.

Appendix

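Proof of Theorem 2

By Equation (21) we have that $\hat{b} - \bar{b} = (1 - k_{m_b})\, pl_\Theta$ and $\hat{pl}_b - \bar{pl}_b = (1 - k_{pl_b})\, pl_\Theta$. Hence $\beta[b](\hat{pl}_b - \hat{b}) + \hat{b}$ is equal to

\[
\begin{aligned}
& \beta[b] \big[ \bar{pl}_b + (1 - k_{pl_b})\, pl_\Theta - \bar{b} - (1 - k_{m_b})\, pl_\Theta \big] + \bar{b} + (1 - k_{m_b})\, pl_\Theta \\
&= \beta[b] \big[ \bar{pl}_b - \bar{b} + (k_{m_b} - k_{pl_b})\, pl_\Theta \big] + \bar{b} + (1 - k_{m_b})\, pl_\Theta \\
&= \bar{b} + \beta[b](\bar{pl}_b - \bar{b}) + pl_\Theta \big[ \beta[b](k_{m_b} - k_{pl_b}) + 1 - k_{m_b} \big]
\end{aligned}
\]

but

\[
\beta[b](k_{m_b} - k_{pl_b}) + 1 - k_{m_b} = \frac{1 - k_{m_b}}{k_{pl_b} - k_{m_b}}\,(k_{m_b} - k_{pl_b}) + 1 - k_{m_b} = 0
\]

by definition of β[b] (11), and (22) is met.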

Proof of Theorem 3

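The scalar product of interest can be written as

\[
\begin{aligned}
\langle \bar{pl}_b - \bar{b},\, b_y - b_x \rangle &= \Big\langle \sum_{z \in \Theta} (pl_b(z) - m_b(z))\, b_z,\ b_y - b_x \Big\rangle = \sum_{z \in \Theta} (pl_b(z) - m_b(z)) \big[ \langle b_z, b_y \rangle - \langle b_z, b_x \rangle \big] \\
&= \sum_{z \in \Theta} (pl_b(z) - m_b(z)) \big[ \langle b_{z \cup y}, b_{z \cup y} \rangle - \langle b_{z \cup x}, b_{z \cup x} \rangle \big]
\end{aligned}
\]

since it is easy to see that, by the definition of $b_A$, $\langle b_A, b_B \rangle = \langle b_{A \cup B}, b_{A \cup B} \rangle$. We can distinguish three cases:

− if $z \neq x, y$ then $|z \cup x| = |z \cup y| = 2$ and the difference $\|b_{z \cup y}\|^2 - \|b_{z \cup x}\|^2$ goes to zero;

− if z = x then $\|b_{z \cup y}\|^2 - \|b_{z \cup x}\|^2 = \|b_{x \cup y}\|^2 - \|b_x\|^2 = 2^{n-2} - 1 - (2^{n-1} - 1) = -2^{n-2}$, where $n \doteq |\Theta|$;

− if instead z = y then $\|b_{z \cup y}\|^2 - \|b_{z \cup x}\|^2 = \|b_y\|^2 - \|b_{x \cup y}\|^2 = 2^{n-2}$.

Hence

\[
\langle \bar{pl}_b - \bar{b},\, b_y - b_x \rangle = 2^{n-2}\,(pl_b(y) - m_b(y)) - 2^{n-2}\,(pl_b(x) - m_b(x))
\]

∀y ≠ x, and as $\sum_{A \supset x,\, A \neq x} m_b(A) = pl_b(x) - m_b(x)$ the thesis follows.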

Proof of Corollary 1

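As a matter of fact

\[
\sum_{x \in \Theta}\ \sum_{A \supset x,\, A \neq x} m_b(A) = \sum_{x \in \Theta} (pl_b(x) - m_b(x)) = k_{pl_b} - k_{m_b}
\]

so that the condition of Theorem 3 can be written as

\[
pl_b(x) - m_b(x) = \sum_{A \supset x,\, A \neq x} m_b(A) = \frac{k_{pl_b} - k_{m_b}}{n} \quad \forall x.
\]

Replacing this in (25) yields $R[b] = \sum_{x \in \Theta} \frac{1}{n}\, b_x$.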

Proof of Theorem 4

The value of φ2[b] also depends on the flag probability (25).

Lemma 2. φ2[b] is nil if and only if

\[
\langle 1, R[b] \rangle^2 = \|R[b]\|^2\, \langle 1, 1 \rangle \qquad (37)
\]

where $1 = pl_\Theta$ is the N-dim vector [1,..,1]′.

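Proof. By Equation (22) $p[b] = \bar{b} + \beta[b](\bar{pl}_b - \bar{b})$, and remembering that $\beta[b] = \frac{1 - k_{m_b}}{k_{pl_b} - k_{m_b}}$ we can write

\[
\bar{pl}_b - p[b] = \bar{pl}_b - \big[ \bar{b} + \beta[b](\bar{pl}_b - \bar{b}) \big] = (1 - \beta[b])(\bar{pl}_b - \bar{b}) = \frac{k_{pl_b} - 1}{k_{pl_b} - k_{m_b}}\,(\bar{pl}_b - \bar{b}) = (k_{pl_b} - 1)\, R[b] \qquad (38)
\]

by definition (25) of R[b]. On the other side, as

\[
\hat{pl}_b = \bar{pl}_b + (1 - k_{pl_b})\, pl_\Theta
\]

by Equation (21), we get

\[
\begin{aligned}
\hat{pl}_b - p[b] &= (\hat{pl}_b - \bar{pl}_b) + (\bar{pl}_b - p[b]) = \bar{pl}_b + (1 - k_{pl_b})\, pl_\Theta - \bar{pl}_b + (k_{pl_b} - 1)\, R[b] \\
&= (1 - k_{pl_b})\, pl_\Theta + (k_{pl_b} - 1)\, R[b] = (k_{pl_b} - 1)(R[b] - pl_\Theta).
\end{aligned} \qquad (39)
\]

Combining (39) and (38) then yields

\[
\begin{aligned}
\langle \hat{pl}_b - p[b],\, \bar{pl}_b - p[b] \rangle &= \langle (k_{pl_b} - 1)(R[b] - pl_\Theta),\ (k_{pl_b} - 1)\, R[b] \rangle = (k_{pl_b} - 1)^2\, \langle R[b] - pl_\Theta,\, R[b] \rangle \\
&= (k_{pl_b} - 1)^2 \big( \langle R[b], R[b] \rangle - \langle pl_\Theta, R[b] \rangle \big) = (k_{pl_b} - 1)^2 \big( \langle R[b], R[b] \rangle - \langle 1, R[b] \rangle \big).
\end{aligned}
\]

But now

\[
\cos(\pi - \phi_2) = \frac{\langle \hat{pl}_b - p[b],\, \bar{pl}_b - p[b] \rangle}{\|\hat{pl}_b - p[b]\|\, \|\bar{pl}_b - p[b]\|}
\]

where

\[
\|\hat{pl}_b - p[b]\| = \langle \hat{pl}_b - p[b],\, \hat{pl}_b - p[b] \rangle^{1/2} = (k_{pl_b} - 1)\, \langle R[b] - pl_\Theta,\, R[b] - pl_\Theta \rangle^{1/2} = (k_{pl_b} - 1) \big[ \langle R[b], R[b] \rangle + \langle pl_\Theta, pl_\Theta \rangle - 2 \langle R[b], pl_\Theta \rangle \big]^{1/2}
\]

and

\[
\|\bar{pl}_b - p[b]\| = (k_{pl_b} - 1)\, \|R[b]\|
\]

by Equation (38). Then, as φ2[b] = 0 iff cos(π − φ2[b]) = −1 we can write the desired condition as

\[
-1 = \frac{(k_{pl_b} - 1)^2 \big( \|R[b]\|^2 - \langle 1, R[b] \rangle \big)}{(k_{pl_b} - 1)^2\, \|R[b]\|\, \big[ \langle R[b], R[b] \rangle + \langle 1, 1 \rangle - 2 \langle R[b], 1 \rangle \big]^{1/2}}
\]

that is equivalent to (after squaring both numerator and denominator)

\[
\|R[b]\|^2 \big( \|R[b]\|^2 + \langle 1, 1 \rangle - 2 \langle R[b], 1 \rangle \big) = \|R[b]\|^4 + \langle 1, R[b] \rangle^2 - 2 \langle 1, R[b] \rangle\, \|R[b]\|^2
\]

and by erasing the common terms we obtain (37), as desired.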

Lemma 3. The angle φ2[b] is zero if and only if R[b] is parallel to $pl_\Theta = 1$.

Proof. Condition (37) has the form

\[
\langle A, B \rangle^2 = \|A\|^2\, \|B\|^2 \cos^2(\widehat{AB}) = \|A\|^2\, \|B\|^2
\]

i.e. $\cos^2(\widehat{AB}) = 1$, with $A = pl_\Theta$, $B = R[b]$. This yields $\cos(\widehat{R[b]\, pl_\Theta}) = 1$ or $\cos(\widehat{R[b]\, pl_\Theta}) = -1$, but as both R[b] and $pl_\Theta$ have all positive components it must be $\cos(\widehat{R[b]\, pl_\Theta}) = 1$, so that $\widehat{R[b]\, pl_\Theta} = 0$.

Now the condition of Lemma 3, $R[b] \parallel pl_\Theta$, means $R[b] = \alpha\, pl_\Theta$ for some scalar α, i.e.

\[
R[b] = -\alpha \sum_{A \subseteq \Theta} (-1)^{|A|}\, b_A
\]

(since $pl_\Theta = -\sum_{A \subseteq \Theta} (-1)^{|A|}\, b_A$ by Equation (5)). On the other side, $R[b] = \sum_{x \in \Theta} R[b](x)\, b_x$ is a probability (i.e. a linear combination of basis probabilities $b_x$ only) and the two conditions are never met together, unless |Θ| = 2.

Proof of Lemma 1

Using the form (32) of the orthogonal projection we get

\[
\pi[b](x) - BetP[b](x) = \sum_{A \supseteq x} m_b(A) \left[ \frac{1 + |A^c|\, 2^{1-|A|}}{n} - \frac{1}{|A|} \right] + \sum_{A \not\supset x} m_b(A)\, \frac{1 - |A|\, 2^{1-|A|}}{n}
\]

but

\[
\frac{1 + |A^c|\, 2^{1-|A|}}{n} - \frac{1}{|A|} = \frac{|A| + |A|(n - |A|)\, 2^{1-|A|} - n}{n|A|} = \frac{(|A| - n)(1 - |A|\, 2^{1-|A|})}{n|A|} = \left( \frac{1}{n} - \frac{1}{|A|} \right) (1 - |A|\, 2^{1-|A|})
\]

so that

\[
\pi[b](x) - BetP[b](x) = \sum_{A \supseteq x} m_b(A)\, (1 - |A|\, 2^{1-|A|}) \left( \frac{1}{n} - \frac{1}{|A|} \right) + \sum_{A \not\supset x} m_b(A)\, \frac{1 - |A|\, 2^{1-|A|}}{n} \qquad (40)
\]

or equivalently, Equation (33).

Proof of Theorem 7

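We can exploit Proposition 6 on Dempster's sums of affine combinations. Namely, each pseudo belief function ς on the line $a(b, \tilde{pl}_b)$ can be written as

\[
\varsigma = \lambda b + (1 - \lambda)\, \tilde{pl}_b.
\]

But then by Proposition 6

\[
p \oplus \big[ \lambda b + (1 - \lambda)\, \tilde{pl}_b \big] = \nu\, p \oplus b + (1 - \nu)\, p \oplus \tilde{pl}_b
\]

with

\[
\nu = \frac{\lambda\, k(b, p)}{\lambda\, k(b, p) + (1 - \lambda)\, k(\tilde{pl}_b, p)} = \frac{\lambda \sum_{x \in \Theta} pl_b(x)}{\lambda \sum_{x \in \Theta} pl_b(x) + 1 - \lambda}
\]

as

\[
k(b, p) = \sum_{x \in \Theta} p(x) \sum_{A \supseteq x} m_b(A) = \sum_{x \in \Theta} p(x)\, pl_b(x), \qquad
k(\tilde{pl}_b, p) = \sum_{x \in \Theta} p(x)\, \tilde{pl}_b(x) = \frac{\sum_{x \in \Theta} p(x)\, pl_b(x)}{\sum_{y \in \Theta} pl_b(y)} = \frac{k(b, p)}{\sum_{y \in \Theta} pl_b(y)}.
\]

Hence, since $b \oplus p = \tilde{pl}_b \oplus p$, we have that

\[
p \oplus \varsigma = p \oplus \big[ \lambda b + (1 - \lambda)\, \tilde{pl}_b \big] = \nu\, b \oplus p + (1 - \nu)\, b \oplus p = b \oplus p.
\]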

Proof of Theorem 9

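Let us consider all claims. Equation (34) can be expanded as follows

\[
\begin{aligned}
& \sum_{k=3}^{n-1} (1 - k\, 2^{1-k})\, \frac{n - k}{k} \sum_{A \supset x,\, |A| = k} m_b(A) = \sum_{k=3}^{n-1} (1 - k\, 2^{1-k}) \sum_{A \not\supset x,\, |A| = k} m_b(A) \\
\equiv\ & \sum_{k=3}^{n-1} (1 - k\, 2^{1-k}) \left[ \frac{n - k}{k} \sum_{A \supset x,\, |A| = k} m_b(A) - \sum_{A \not\supset x,\, |A| = k} m_b(A) \right] = 0 \\
\equiv\ & \sum_{k=3}^{n-1} (1 - k\, 2^{1-k}) \left[ \frac{n}{k} \sum_{A \supset x,\, |A| = k} m_b(A) - \sum_{|A| = k} m_b(A) \right] = 0
\end{aligned}
\]

after noticing that $1 - k\, 2^{1-k} = 0$ for k = 1,2 and the coefficient of $m_b(\Theta)$ in Equation (34) is zero, since $|\Theta^c| = |\emptyset| = 0$. The condition of Theorem 8 can then be rewritten as

\[
\sum_{k=3}^{n-1} \frac{n}{k}\, (1 - k\, 2^{1-k}) \sum_{A \supset x,\, |A| = k} m_b(A) = \sum_{k=3}^{n-1} (1 - k\, 2^{1-k}) \sum_{|A| = k} m_b(A) \qquad (41)
\]

for all x ∈ Θ.

Condition 1 is then immediate by (41).

Concerning 2., in this case the equation becomes

\[
\sum_{k=3}^{n-1} \frac{n}{k}\, (1 - k\, 2^{1-k}) \binom{n-1}{k-1} \frac{\sum_{|A| = k} m_b(A)}{\binom{n}{k}} = \sum_{k=3}^{n-1} (1 - k\, 2^{1-k}) \sum_{|A| = k} m_b(A)
\]

which is true since $\frac{n}{k} \binom{n-1}{k-1} = \binom{n}{k}$.

3. Under (35) the system of equations (41) reduces to a single equation

\[
\sum_{k=3}^{n-1} (1 - k\, 2^{1-k})\, \frac{n}{k}\, pl_b(\cdot\,; k) = \sum_{k=3}^{n-1} (1 - k\, 2^{1-k}) \sum_{|A| = k} m_b(A)
\]

which is verified if

\[
\sum_{|A| = k} m_b(A) = \frac{n}{k}\, pl_b(\cdot\,; k)
\]

which is in turn equivalent to

\[
n\, pl_b(\cdot\,; k) = k \sum_{|A| = k} m_b(A) \quad \forall k = 3, …, n - 1.
\]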

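But now

\[
k \sum_{|A| = k} m_b(A) = \sum_{|A| = k} m_b(A)\, |A| = \sum_{x \in \Theta}\ \sum_{A \supset x,\, |A| = k} m_b(A) = n \sum_{|A| = k,\, A \supset y} m_b(A) = n\, pl_b(\cdot\,; k)
\]

under the hypothesis.

Proof of Theorem 10

Condition (36) is equivalent to

\[
\begin{aligned}
& \sum_{A \supseteq y} m_b(A)\, 2^{1-|A|} = \sum_{A \supseteq x} m_b(A)\, 2^{1-|A|} \\
\equiv\ & \sum_{k=1}^{n-1} \frac{1}{2^k} \sum_{|A| = k,\, A \supset y} m_b(A) = \sum_{k=1}^{n-1} \frac{1}{2^k} \sum_{|A| = k,\, A \supset x} m_b(A) \\
\equiv\ & \sum_{k=1}^{n-1} \frac{1}{2^k}\, pl_b(y; k) = \sum_{k=1}^{n-1} \frac{1}{2^k}\, pl_b(x; k)
\end{aligned}
\]

for all y ≠ x, and if plb(x; k) = plb(y; k) ∀y ≠ x the condition is met.

Proof of Theorem 11

To see this let us rewrite the values of the pignistic function BetP[b](x) in terms of plb(x; k) as

\[
BetP[b](x) = \sum_{A \supseteq x} \frac{m_b(A)}{|A|} = \sum_{k=1}^{n}\ \sum_{A \supset x,\, |A| = k} \frac{m_b(A)}{k} = \sum_{k=1}^{n} \frac{pl_b(x; k)}{k}
\]

which is constant under the hypothesis, yielding $BetP[b] = \overline{\mathcal{P}}$. Also, as

\[
pl_b(x) = \sum_{A \supseteq x} m_b(A) = \sum_{k=1}^{n}\ \sum_{A \supset x,\, |A| = k} m_b(A) = \sum_{k=1}^{n} pl_b(x; k)
\]

we get

\[
\tilde{pl}_b(x) = \frac{pl_b(x)}{k_{pl_b}} = \frac{\sum_{k=1}^{n} pl_b(x; k)}{\sum_{x \in \Theta} \sum_{k=1}^{n} pl_b(x; k)}
\]

which is equal to 1/n if plb(x;k) = plb(·;k) ∀k,x. Finally, under the same condition,

\[
p[b](x) = m_b(x) + \beta[b]\,(pl_b(x) - m_b(x)) = pl_b(\cdot\,; 1) + \beta[b] \sum_{k=2}^{n} pl_b(\cdot\,; k) = \frac{1}{n},
\]

\[
\tilde{b}(x) = \frac{m_b(x)}{k_{m_b}} = \frac{pl_b(x; 1)}{\sum_{y \in \Theta} pl_b(y; 1)} = \frac{pl_b(\cdot\,; 1)}{n\, pl_b(\cdot\,; 1)} = \frac{1}{n}.
\]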

References

Aigner, M.: 1979, Combinatorial Theory. New York: Classics in Mathematics, Springer.

Bauer, M.: 1997, ‘Approximation Algorithms and Decision Making in the Dempster-Shafer Theory of Evidence–An Empirical Study’. International Journal of Approximate Reasoning 17, 217–237.

Black, P.: 1996, ‘An examination of belief functions and other monotone capacities’. PhD dissertation, Department of Statistics, Carnegie Mellon University. Pgh. PA 15213.

Black, P.: 1997, ‘Geometric Structure of Lower Probabilities’. In: Goutsias, Mahler, and Nguyen (eds.): Random Sets: Theory and Applications. Springer, pp. 361–383.

Chateauneuf, A. and J. Jaffray: 1989, ‘Some characterization of lower probabilities and other monotone capacities through the use of Möbius inversion’. Math. Soc. Sci. 17, 263–283.

Cobb, B. and P. Shenoy: February 2003a, ‘On transforming belief function models to probability models’. Technical report, University of Kansas, School of Business, Working Paper No. 293.

Cobb, B. R. and P. P. Shenoy: 2003b, ‘A comparison of Bayesian and belief function reasoning’. Information Systems Frontiers 5(4), 345–358.

Cobb, B. R. and P. P. Shenoy: July 2003c, ‘A comparison of methods for transforming belief function models to probability models’. In: Proceedings of ECSQARU’2003, Aalborg, Denmark. pp. 255–266.

Cuzzolin, F.: 2007a, ‘Dual properties of relative belief of singletons’. submitted to the IEEE Transactions on Fuzzy Systems.

Cuzzolin, F.: 2007b, ‘A geometric approach to the theory of evidence’. IEEE Transactions on Systems, Man and Cybernetics part C.

Cuzzolin, F.: 2007c, ‘On the Properties of the Intersection Probability’. submitted to Artificial Intelligence.

Cuzzolin, F.: 2007d, ‘Two new Bayesian approximations of belief functions based on convex geometry’. IEEE Transactions on Systems, Man, and Cybernetics – Part B.

Cuzzolin, F.: April 2004, ‘Geometry of Dempster’s rule of combination’. IEEE Transactions on Systems, Man and Cybernetics part B 34:2, 961–977.

Cuzzolin, F.: July 2003, ‘Geometry of Upper Probabilities’. In: Proceedings of the 3rd International Symposium on Imprecise Probabilities and Their Applications (ISIPTA’03).

Cuzzolin, F. and R. Frezza: 2001, ‘Geometric analysis of belief space and conditional subspaces’. In: Proceedings of ISIPTA’01, Cornell University, Ithaca, NY.

Daniel, M., ‘Transformations of belief functions to probabilities’. Technical report, Institute of Computer Science, Academy of Sciences of the Czech Republic.

Dempster, A.: 1968a, ‘A generalization of Bayesian inference’. Journal of the Royal Statistical Society, Series B 30, 205–247.

Dempster, A.: 1968b, ‘Upper and lower probabilities generated by a random closed interval’. Annals of Mathematical Statistics 39, 957–966.

Denoeux, T.: 2001, ‘Inner and outer approximation of belief structures using a hierarchical clustering approach’. Int. Journal of Uncertainty, Fuzziness and Knowledge-Based Systems 9(4), 437–460.

Denoeux, T. and A. B. Yaghlane: October 2002, ‘Approximating the Combination of Belief Functions using the Fast Moebius Transform in a coarsened frame’. International Journal of Approximate Reasoning 31(1-2), 77–101.

Dubois, D. and H. Prade: 1990, ‘Consonant approximations of belief functions’. International Journal of Approximate Reasoning 4, 419–449.

Ha, V. and P. Haddawy: August 1996, ‘Theoretical foundations for abstraction-based probabilistic planning’. In: Proc. of the 12th Conference on Uncertainty in Artificial Intelligence. pp. 291–298.

Haenni, R. and N. Lehmann: October 2002, ‘Resource bounded and anytime approximation of belief function computations’. International Journal of Approximate Reasoning 31(1-2), 103–154.

Kramosil, I.: 1995, ‘Approximations of Believeability Functions under Incomplete Identification of Sets of Compatible States’. Kybernetika 31, 425–450.

Lowrance, J. D., T. D. Garvey, and T. M. Strat: 1986, ‘A framework for evidential-reasoning systems’. In: A. A. for Artificial Intelligence (ed.): Proceedings of the National Conference on Artificial Intelligence. pp. 896–903.

Shafer, G.: 1976, A Mathematical Theory of Evidence. Princeton University Press.

Smets, P.: 1988, ‘Belief functions versus probability functions’. In: S. L. Bouchon B. and Y. R. (eds.): Uncertainty and Intelligent Systems. Springer Verlag, Berlin, pp. 17–24.

Smets, P.: 1992, ‘The Nature of the Unnormalized Beliefs Encountered in the Transferable Belief Model’. In: Proceedings of the 8th Annual Conference on Uncertainty in Artificial Intelligence (UAI-92). San Mateo, CA, pp. 292–29, Morgan Kaufmann.

Smets, P.: Montréal, Canada, 1995, ‘The Canonical Decomposition of a Weighted Belief’. In: Proceedings of the International Joint Conference on AI, IJCAI95. pp. 1896–1901.

Tessem, B.: 1993, ‘Approximations for Efficient Computation in the Theory of Evidence’. Artificial Intelligence 61:2, 315–329.

Vakili: 1993, ‘Approximation of hints’. Technical report, Institute for Automation and Operation Research, University of Fribourg, Switzerland, Tech. Report 209.

Voorbraak, F.: 1989, ‘A computationally efficient approximation of Dempster-Shafer theory’. International Journal on Man-Machine Studies 30, 525–536.

Yaghlane, A. B., T. Denoeux, and K. Mellouli: 2001, ‘Coarsening approximations of belief functions’. In: S. Benferhat and P. Besnard (eds.): Proceedings of ECSQARU’2001. pp. 362–373.