
Vol. 66 (2010) REPORTS ON MATHEMATICAL PHYSICS No.

DISTINGUISHABILITY MEASURES AND ENTROPIES FOR GENERAL PROBABILISTIC THEORIES

GEN KIMURA¹, KOJI NUIDA¹ AND HIDEKI IMAI¹,²

¹ Research Center for Information Security (RCIS),
National Institute of Advanced Industrial Science and Technology (AIST),
Daibiru building 1003, 1-18-13 Sotokanda, Chiyoda-ku, Tokyo 101-0021, Japan
² Graduate School of Science and Engineering, Chuo University,
1-13-27 Kasuga, Bunkyo-ku, Tokyo 112-8551, Japan
(e-mails: gen-kimura@aist.go.jp, k.nuida@aist.go.jp)

(Received October 8, 2009 - Revised July 9, 2010)

As a part of the construction of an information theory based on general probabilistic theories,
we propose and investigate several distinguishability measures and "entropies" in general probabilistic
theories. As applications, the no-cloning theorem and the information-disturbance theorem
are reformulated, and a bound on the accessible information is discussed, in general probabilistic
theories, without resorting to quantum theory. We also propose the principle of equality for pure
states, which makes general probabilistic theories more realistic, and we discuss the role of
entropies as measures of pureness.
Keywords: foundations of quantum theory, general probabilistic theories, distinguishability measures and entropies.

1. Introduction
Recent developments in quantum information theory have shown that information processing
and computation based on quantum physics can go far beyond what is possible with
classical physics. At its heart, this is because the potential of probability theory is
enlarged in passing from classical theory to quantum theory. Indeed, quantum theory can be
considered as a probabilistic theory which, in some sense, properly includes classical
probability theory (Kolmogorov's probability theory). However, this does not mean that
quantum theory is the most general theory of probability, even among the possible theories
which have operational meanings. So far, the most general theory of probability with
suitable operational meanings has been developed by many researchers [1-6, 9]. Following
the recent trend [9], we call such theories general probabilistic theories (or simply GPTs).
Just as quantum information theory has been constructed on the basis of quantum
theory, information theories can be constructed on the basis of any probabilistic
theory [4, 7-14]. There are several motivations for this line of research: First, this
is an attempt to find physical principles (axioms written in physical language)
for quantum theory [16-18]. Indeed, by considering the general framework which
encompasses quantum theory, we look for principles which determine the
position of quantum theory in this general framework. The development of
quantum information theory motivates us to find principles based on information
processing for the theory of quantum physics [19, 20, 10]. Second, the construction
of an information theory based on the most general theory of probability enables
us to understand logical connections among information processing tasks by resorting
not to the particular properties of classical or quantum theory, but only to
the essential properties which a suitable probability theory should possess. Third,
this is a preparation for a possible breakdown of quantum theory. For instance, one
can discuss secure key distribution in the general framework without assuming
quantum theory itself [21]. Finally, this might provide a classical information theory
under some restrictions on measurements, since any general probabilistic theory has
a classical interpretation based on such restrictions of measurements [5, 22].
In this paper we propose and give systematic discussions of several distinguishability
measures (especially, the Kolmogorov distance and the fidelity) and three quantities
related to entropies for general probabilistic theories. The corresponding measures and
entropies in classical and quantum theories have proved to be useful [25, 26],
and we give generalizations of them to any GPT and discuss their applications.
In particular, the no-cloning theorem and a simple information-disturbance theorem
in GPTs are reformulated using the fidelity, and a bound on the accessible information
is discussed based on one of the "entropies". Finally, we introduce the principle
of "equality of pure states", meaning that there are no special pure states. We call
such a GPT symmetric, and in symmetric GPTs a measure of pureness will be
discussed.

2. General probabilistic theories


In this section, we give a brief review of general probabilistic theories (see for
instance [1, 3, 5, 9] and the references therein for details). Although, in the end,
we are going to use mathematical notions such as convexity, affine functions, etc.,
it should be noticed that we do not assume any mathematical structure without
physical reasons.
The important ingredients of GPTs are the notions of state and measurement.
In any GPT, we have a physical law which determines a probability $p(a|M, s)$ for
obtaining an output $a$ in a measurement $M$ of an observable in a state $s$. In this
paper, for simplicity, we only consider measurements with finitely many outcomes.
Naturally, we assume the separating properties of both states and measurements: (A1)
The states $s_1$ and $s_2$ are identified if $p(a|M, s_1) = p(a|M, s_2)$ for any measurement
$M$ and measurement outcome $a$. (A2) The measurements $M_1$ and $M_2$ are identified
if $p(a|M_1, s) = p(a|M_2, s)$ for any measurement outcome $a$ in any state $s$. We
also assume the convex property of states: (A3) For any states $s_1, s_2$ and $q \in [0, 1]$,
there exists a state $s$ going into $s_1$ with probability $q$ and into $s_2$ with probability
$1 - q$; namely, it follows that $p(a|M, s) = q\, p(a|M, s_1) + (1 - q)\, p(a|M, s_2)$ for any
measurement. (A4) Further, we naturally assume that the dynamics preserves these
probabilistic mixtures. (A5) We introduce a natural topology on the state space, namely
the weakest topology such that $s \mapsto p(a|M, s)$ is continuous for any measurement;
finally, we assume (A6) there exists a joint state $\omega$ of the system $A + B$ defining
a joint probability for each pair of measurements $M_A$ and $M_B$ which satisfies the no-signaling
condition, i.e. the marginal probabilities for the outcomes of a measurement on $A$
do not depend on the measurements on $B$, and vice versa. Moreover, the joint state
is determined by the joint probabilities for all pairs of measurements on $A$ and $B$.
Based on these assumptions, one can show the following [1, 3, 5, 9]:
(a) There exists a locally convex topological vector space $V$ such that, in a suitable
representation, the state space $S$ is a convex subset of $V$, where $q s_1 + (1 - q) s_2$
corresponds to the state described in (A3) above. An extreme point of $S$ is called
a pure state. Moreover, without loss of generality, one can assume that $S$ is compact
with a natural topology [13]. Notice that by the famous Krein-Milman theorem (see,
for instance, Theorem 10.4 in [23]) the set of extreme points $S_{\mathrm{pure}}$ is nonempty and
$S$ is the closed convex hull of its extreme points. In particular, in finite-dimensional
cases, any state $s \in S$ has a convex decomposition into a finite number of pure states
(hereafter, a pure state decomposition): $s = \sum_x p_x s_x$ where $p_x \ge 0$, $\sum_x p_x = 1$,
$s_x \in S_{\mathrm{pure}}$ (see, for instance, Theorem 5.6 in [24]).
A map $f : S \to \mathbb{R}$ is called an affine functional if it satisfies
$$f(q s_1 + (1 - q) s_2) = q f(s_1) + (1 - q) f(s_2)$$
for any $q \in [0, 1]$, $s_1, s_2 \in S$. In particular, an affine functional $e : S \to \mathbb{R}$ is called
an effect if its range is contained in $[0, 1]$. We denote the sets of all the affine
functionals and of all the effects by $\mathcal{A}(S)$ and $\mathcal{E}(S)$, respectively. It is easy to see
that $\mathcal{E}(S)$ is a convex subset of the real vector space $\mathcal{A}(S)$. We call an extreme
effect a pure effect. The zero effect $o$ and the unit effect $u$, such that $o(s) = 0$ and
$u(s) = 1$ for all $s$, are trivially pure effects. It is easy to see that the effect $u - e$ is pure
iff the effect $e$ is pure. Moreover, we can introduce a natural topology on $\mathcal{E}(S)$,
namely the weakest topology such that the map $\mathcal{E}(S) \to \mathbb{R}$, $e \mapsto e(s)$, becomes
continuous for every $s \in S$. It is shown that $\mathcal{E}(S)$ is compact with respect to this
topology [13].
(b) It is often convenient to characterize a measurement without explicitly specifying
the measurement outcomes. In that case, any measurement $M$ is characterized
by a set of effects $m_i$ such that $p(a_i|M, s) = m_i(s)$ and $\sum_i m_i = u$. In the following,
we occasionally use the notation $M = (m_j)_j$ (implicitly assuming the conditions
$m_j \in \mathcal{E}(S)$ and $\sum_j m_j = u$) to denote a measurement on $S$, meaning that $m_j(s)$ is
the probability of obtaining the $j$th outcome (say $a_j$) in a measurement $M$ in a state $s$.
(c) Dynamics is described by an affine function on the state space. In general,
the initial state space $S$ and final state space $S'$ might be different. Then, a time
evolution map is given by an affine map from $S$ to $S'$. We denote by $\mathcal{A}(S, S')$
the set of all the affine maps from $S$ to $S'$.
(d) Joint systems are described by a convex set in a tensor product of the
corresponding vector spaces. A joint state $\omega$ on $A + B$ with state spaces $S_A$ and $S_B$
is described by a bi-affine map on $\mathcal{E}(S_A) \times \mathcal{E}(S_B)$. In particular, if $\omega$ is a joint state
on $A + B$, then the marginal state on $A$ is defined by $\omega_A(e) := \omega(e, u_B)$ ($e \in \mathcal{E}(S_A)$),
where $u_B$ is the unit effect on $S_B$. From the extreme property of pure states, it is
easy to see¹ that if the marginal state $\omega_A$ is pure, then the joint state $\omega$ is a state
with no correlations: $\omega(e_A, e_B) = \omega_A(e_A)\,\omega_B(e_B)$ ($e_A \in \mathcal{E}(S_A)$, $e_B \in \mathcal{E}(S_B)$).
It is important to notice that these mathematical structures are introduced not ad
hoc but appear naturally from the physical assumptions (A1)-(A6). It is also
possible to formulate the measurement process by considering the cone generated
by $0$ and $S$ in $V$ [3, 10, 15].
In this paper, for simplicity, we treat finite GPTs, where $V$ is finite-dimensional, but
most of the definitions and properties below hold in general with some topological
remarks [23]. (However, notice that in finite-dimensional cases there is essentially a
unique topology, and one can use another characterization of the natural topology,
for instance using the Kolmogorov distance below. In particular, the unique topology
is the Euclidean topology, and thus one can imagine the state space of a GPT as
a compact convex (or equivalently, closed bounded convex) subset of a Euclidean
space.) Moreover, we assume that any set of effects $m_i$ such that $\sum_i m_i = u$ has
a corresponding measurement. (It is also an easy exercise to reformulate what follows
without this assumption.)
Here, let us see some typical examples of finite GPTs.
[Finite classical systems] Let $\Omega = \{\omega_1, \ldots, \omega_d\}$ be a sample space. A state
is represented by a probability $p_i$ for each elementary event $\omega_i$. The state space is
given by $S_c := \{p \in \mathbb{R}^d \mid p_i \ge 0,\ \sum_{i=1}^d p_i = 1\}$. There are $d$ pure states, which are
the definite states where one of the elementary events occurs with probability 1:
namely, $p^{(\mu)} = (\delta_{\mu 1}, \ldots, \delta_{\mu d}) \in S_c$ ($\mu = 1, \ldots, d$). Notice that $S_c$ is a (standard)
simplex. In particular, any state $p = (p_1, \ldots, p_d) \in S_c$ has the unique pure state
decomposition $p = \sum_{\mu=1}^d p_\mu p^{(\mu)}$.
[Finite quantum systems] Let $\mathcal{H}$ be a $d$-dimensional complex Hilbert space.
A quantum state is represented by a density operator $\rho$, i.e. a positive operator on
$\mathcal{H}$ with unit trace. The state space is given by $S_q = \{\rho \in \mathcal{L}(\mathcal{H}) \mid \rho \ge 0,\ \operatorname{tr}\rho = 1\}$,
where $\mathcal{L}(\mathcal{H})$ is the real vector space of all the (Hermitian) operators on $\mathcal{H}$. Pure
states are characterized by one-dimensional projection operators. A quantum effect $e$ is
represented by an operator $E$ satisfying $0 \le E \le \mathbb{1}$, called a POVM (positive operator-valued
measure) element, through the correspondence $e(\rho) = \operatorname{tr}(\rho E)$. Here $0$ and $\mathbb{1}$ denote
the zero and identity operators on $\mathcal{H}$, respectively. In particular, any measurement
$M = (m_i)_i$, where the $m_i$ are effects on $S_q$, has a corresponding POVM
measurement $(M_i)_i$ such that $M_i \in \mathcal{L}(\mathcal{H})$, $M_i \ge 0$, $\sum_i M_i = \mathbb{1}$ and $m_i(\rho) = \operatorname{tr}(\rho M_i)$.
Notice that the set of all the extreme effects is the set of all the projection
operators $\mathcal{P}(\mathcal{H})$. A POVM measurement $(P_i)_i$ consisting of projection operators
$P_i$ is called a PVM (projection-valued measure) measurement. The following is an
example of a GPT which is neither classical nor quantum.

¹ Although the proof is simple (see for instance [30, 9]), this property is important in applications. For
instance, in the context of key distribution, Alice and Bob can be assured of security if their joint state is pure,
since then their system does not have any correlations with any other system (an eavesdropper).

[Hyper-cuboid systems and the squared system] Let $S_{cb} := \{c \in \mathbb{R}^d \mid 0 \le c_i \le 1\ (i = 1, \ldots, d)\}$.
The pure states are the $2^d$ vertices. We call this the hyper-cuboid
system, and especially the squared system when $d = 2$ [11]. These might be the
easiest examples of GPTs which are neither classical nor quantum. However, one can
construct a classical model such that a suitable restriction of measurements reduces
to the hyper-cuboid system [22].
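For concreteness, the defining conditions of the three example state spaces can be written as simple membership tests. The following Python sketch (assuming NumPy; the function names are ours and purely illustrative) checks whether a given point belongs to $S_c$, $S_q$ or $S_{cb}$.

```python
import numpy as np

def is_classical_state(p, tol=1e-9):
    """Point of the standard simplex S_c: p_i >= 0 and sum_i p_i = 1."""
    p = np.asarray(p, dtype=float)
    return bool(np.all(p >= -tol) and abs(p.sum() - 1.0) < tol)

def is_quantum_state(rho, tol=1e-9):
    """Density operator: Hermitian, positive semidefinite, unit trace."""
    rho = np.asarray(rho, dtype=complex)
    herm = np.allclose(rho, rho.conj().T, atol=tol)
    psd = bool(np.all(np.linalg.eigvalsh(rho) >= -tol))
    return herm and psd and abs(np.trace(rho).real - 1.0) < tol

def is_cuboid_state(c, tol=1e-9):
    """Point of the hyper-cuboid S_cb: 0 <= c_i <= 1 for each i."""
    c = np.asarray(c, dtype=float)
    return bool(np.all(c >= -tol) and np.all(c <= 1.0 + tol))

print(is_classical_state([0.2, 0.3, 0.5]))   # True
print(is_quantum_state(np.eye(2) / 2))        # True (maximally mixed qubit)
print(is_cuboid_state([0.4, 0.9]))            # True: a state of the squared system
```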
Finally, notice that probabilistic theories with state spaces $S_A$ and $S_B$ are
equivalent if they are affinely isomorphic, i.e. there exists a bijective affine map
from $S_A$ to $S_B$. For instance, any GPT which has a simplex state space is affinely
isomorphic to some standard simplex, and therefore can be considered as a classical
system.

3. Distinguishability measures for general probabilistic theories


In this section, we introduce several distinguishability measures (Kolmogorov
distance, fidelity, Shannon distinguishability, etc.) for GPTs. The corresponding
measures for quantum systems have proved useful in quantum information
theory [25]. It is indeed straightforward to generalize them to any GPT using the
notions developed in the preceding section, and some of them have been used in
[10, 12, 13]. Most of the properties in the quantum case can be found in [25]. However,
we think it is useful to sum up these measures, especially the Kolmogorov distance
and the fidelity, systematically for GPTs; all the proofs of this section are collected in
Appendix A for the reader's convenience. A striking point is that none of the results
below resorts to ingredients such as vectors and operators on a Hilbert
space, but only to the analysis of probabilities.
All the measures are based on those for classical systems, optimized over all possible
measurements of observables. In the following, let $S$ be the set of states (the state
space), and let $\mathcal{E} = \mathcal{E}(S)$ and $\mathcal{M} = \mathcal{M}(S)$ be the sets of effects and measurements on $S$.

3.1. Kolmogorov's distance in GPT


The Kolmogorov distance $D_c(p, q)$ is known to serve as a good distinguishability
measure between two probability distributions $p = (p_i)_i$ and $q = (q_i)_i$:
$$D_c(p, q) := \frac{1}{2} \sum_i |p_i - q_i|.$$
Indeed, $D_c$ has the metric property and it holds that $D_c(p, q) = \max_S |p(S) - q(S)|$,
where the maximization is taken over all subsets $S$ of the index set $\{i\}$. Thus
$D_c(p, q)$ can be considered as a metric for two probability distributions with an operational
meaning.
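A minimal numerical sketch of $D_c$ (Python, assuming NumPy; illustration only, not part of the original development):

```python
import numpy as np

def kolmogorov_distance(p, q):
    """Classical Kolmogorov (total variation) distance D_c(p, q) = (1/2) sum_i |p_i - q_i|."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return 0.5 * np.abs(p - q).sum()

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.4, 0.4])
print(kolmogorov_distance(p, q))   # 0.3, which also equals max_S |p(S) - q(S)|
```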
In any GPT, one can define [13] the Kolmogorov distance between two states
$s_1, s_2 \in S$ by
$$D(s_1, s_2) := \max_{M \in \mathcal{M}} D_c(p_1(M), p_2(M)), \qquad (1)$$
where $p_1(M) := (m_i(s_1))_i$ and $p_2(M) := (m_i(s_2))_i$ are the probability distributions of
the $i$th outcome of the measurement $M$ in the states $s_1$ and $s_2$, respectively. The maximization
in (1) is always attained by some measurement, which we call an optimal
measurement, due to the compactness of the effect set [13]. Notice that $D(s_1, s_2)$
is a metric on $S$, i.e. (i) $D(s_1, s_2) \ge 0$, with equality iff $s_1 = s_2$; (ii) $D(s_1, s_2) = D(s_2, s_1)$;
and (iii) $D(s_1, s_3) \le D(s_1, s_2) + D(s_2, s_3)$; and it is bounded above by 1, i.e.
$0 \le D(s_1, s_2) \le 1$. These follow from the metric property of $D_c$ and the separation
property of states. The operational meaning of $D_c$ mentioned above also gives an
operational meaning to $D(s_1, s_2)$: it is the maximum difference of probabilities
over all events $S$ and all measurements. For quantum systems, $D(\rho_1, \rho_2)$ is
the trace distance between the density operators $\rho_1, \rho_2$: $D(\rho_1, \rho_2) = \frac{1}{2}\operatorname{tr}|\rho_1 - \rho_2|$ [26],
where $|A| := \sqrt{A^\dagger A}$.
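For density matrices, the trace distance can be computed from the eigenvalues of the Hermitian difference $\rho_1 - \rho_2$; a small Python sketch (assuming NumPy; illustration only):

```python
import numpy as np

def trace_distance(rho1, rho2):
    """Quantum trace distance (1/2) tr|rho1 - rho2| via the eigenvalues of the Hermitian difference."""
    eigvals = np.linalg.eigvalsh(rho1 - rho2)
    return 0.5 * np.abs(eigvals).sum()

rho1 = np.array([[1.0, 0.0], [0.0, 0.0]])   # pure state |0><0|
rho2 = np.array([[0.5, 0.5], [0.5, 0.5]])   # pure state |+><+|
print(trace_distance(rho1, rho2))           # 1/sqrt(2) ~ 0.7071
```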
For any measurement $M = (m_i)_i$ and states $s_1, s_2 \in S$, one can consider a two-valued
measurement $M_2 = (m_+, m_-)$, where $m_+ := \sum_{i \in X_+} m_i$ and $m_- := \sum_{i \in X_-} m_i$
with $X_+ := \{i \mid m_i(s_1) - m_i(s_2) \ge 0\}$ and $X_- := \{i \mid m_i(s_1) - m_i(s_2) < 0\}$. Using
this, one has another characterization of the Kolmogorov distance,
$$D(s_1, s_2) = \max_{e \in \mathcal{E}} \big(e(s_1) - e(s_2)\big). \qquad (2)$$

The quantity on the right-hand side is a metric used in [10].


Let $P_s(s_1, s_2)$ be the maximal success probability of distinguishing two states $s_1$
and $s_2$ in a single measurement under the uniform prior distribution. Without loss
of generality, it is enough to consider a two-valued measurement $(m_1, m_2) \in \mathcal{M}$ for
the discrimination problem of two states $s_1$ and $s_2$, guessing $s_1$ (or $s_2$) when
observing the first (or second) outcome. Thus, we have
$$P_s(s_1, s_2) := \max_{(m_1, m_2) \in \mathcal{M}} \left( \frac{1}{2} m_1(s_1) + \frac{1}{2} m_2(s_2) \right)
= \frac{1}{2} \left( 1 + \max_{e \in \mathcal{E}} \big(e(s_1) - e(s_2)\big) \right). \qquad (3)$$

From (2) and (3), we have another operational meaning of the Kolmogorov distance.

PROPOSITION 1. For any states $s_1, s_2 \in S$ in a GPT,
$$D(s_1, s_2) = 2 P_s(s_1, s_2) - 1.$$

Note that $D(s_1, s_2)$ takes the maximum 1 iff $P_s(s_1, s_2) = 1$, i.e. when $s_1$ and $s_2$
are completely distinguishable in a single measurement. On the other hand, $D(s_1, s_2)$
takes the minimum 0 (thus $s_1 = s_2$) iff $P_s(s_1, s_2) = 1/2$, i.e. $s_1$ and $s_2$ are completely
indistinguishable (and indeed such states should be identified due to the separation
property of states).
In the following, we show the monotonicity, strong convexity, joint convexity,
and convexity of the Kolmogorov distance in any GPT.

PROPOSITION 2 (Monotonicity). For any states $s_1, s_2 \in S$ and time evolution
(map) $\Lambda \in \mathcal{A}(S, S')$, we have
$$D(s_1, s_2) \ge D(\Lambda(s_1), \Lambda(s_2)).$$
This implies that the distinguishability between $s_1$ and $s_2$ cannot be increased by
any physical means. Notice that it is well known that the trace distance in quantum
systems has the monotonicity property under any trace-preserving completely positive
map [26]. Proposition 2 generalizes this to any trace-preserving positive map.

PROPOSITION 3 (Strong convexity). Let $p = (p_i)_i$ and $q = (q_i)_i$ be probability
distributions over the same index set, and let $s_i, t_i \in S$ be states of a GPT with the same
index set. Then it follows that
$$D\Big(\sum_i p_i s_i, \sum_i q_i t_i\Big) \le D_c(p, q) + \sum_i p_i D(s_i, t_i).$$

As corollaries, we have the following.

COROLLARY 4 (Joint convexity).
$$D\Big(\sum_i p_i s_i, \sum_i p_i t_i\Big) \le \sum_i p_i D(s_i, t_i).$$
(The special case $p_i = q_i$ of the strong convexity.)

COROLLARY 5 (Convexity).
$$D\Big(\sum_i p_i s_i, t\Big) \le \sum_i p_i D(s_i, t).$$
(The special case $t_i = t$ of the joint convexity.)

3.2. Fidelity in GPT


The Bhattacharyya coefficient (the classical fidelity) between two probability
distributions $p = (p_i)_i$ and $q = (q_i)_i$ is defined by
$$F_c(p, q) := \sum_i \sqrt{p_i q_i}. \qquad (4)$$
Note that (i) $0 \le F_c(p, q) \le 1$, where $F_c(p, q) = 1$ iff $p = q$; (ii) $F_c(p, q) = F_c(q, p)$.
We say that two probability distributions $p, q$ are orthogonal iff $F_c(p, q) = 0$.
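A short Python sketch of the Bhattacharyya coefficient (assuming NumPy; illustration only):

```python
import numpy as np

def classical_fidelity(p, q):
    """Bhattacharyya coefficient F_c(p, q) = sum_i sqrt(p_i q_i)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return np.sqrt(p * q).sum()

print(classical_fidelity([0.5, 0.5], [0.5, 0.5]))   # 1.0 (identical distributions)
print(classical_fidelity([1.0, 0.0], [0.0, 1.0]))   # 0.0 (orthogonal distributions)
```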
In any GPT, one can also define the fidelity [3, 12] between two states $s_1, s_2 \in S$
as
$$F(s_1, s_2) := \min_{M = (m_i)_i \in \mathcal{M}} F_c(p_1(M), p_2(M)), \qquad (5)$$
where $p_1(M) := (m_i(s_1))_i$ and $p_2(M) := (m_i(s_2))_i$. Note that the minimization in (5)
is always attained by some measurement, which we call an optimal measurement,
again due to the compactness of the effect set [13]. In quantum mechanics, one
has the formula $F(\rho_1, \rho_2) = \operatorname{tr}\big|\rho_1^{1/2} \rho_2^{1/2}\big| = \operatorname{tr}\big[(\rho_1^{1/2} \rho_2\, \rho_1^{1/2})^{1/2}\big]$ between two density
operators $\rho_1, \rho_2$ [26].
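The quantum formula can be evaluated via spectral decompositions; the following Python sketch (assuming NumPy; the helper `_sqrtm_psd` is ours and purely illustrative) computes $\operatorname{tr}[(\rho_1^{1/2} \rho_2 \rho_1^{1/2})^{1/2}]$:

```python
import numpy as np

def _sqrtm_psd(a):
    """Square root of a positive semidefinite Hermitian matrix via its spectral decomposition."""
    w, v = np.linalg.eigh(a)
    w = np.clip(w, 0.0, None)
    return (v * np.sqrt(w)) @ v.conj().T

def fidelity(rho1, rho2):
    """F(rho1, rho2) = tr[(rho1^{1/2} rho2 rho1^{1/2})^{1/2}]."""
    s = _sqrtm_psd(rho1)
    return np.trace(_sqrtm_psd(s @ rho2 @ s)).real

rho1 = np.array([[1.0, 0.0], [0.0, 0.0]])
rho2 = np.array([[0.5, 0.5], [0.5, 0.5]])
print(fidelity(rho1, rho2))    # 1/sqrt(2) ~ 0.7071
```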
From the properties of the Bhattacharyya coefficient and the separation property
of states, it follows that (i) $0 \le F(s_1, s_2) \le 1$, where $F(s_1, s_2) = 1$ iff $s_1 = s_2$;
(ii) $F(s_1, s_2) = F(s_2, s_1)$. We say that states $s_1$ and $s_2$ are orthogonal ($s_1 \perp s_2$) iff
$F(s_1, s_2) = 0$.
PROPOSITION 6 (Monotonicity). For any states $s_1, s_2 \in S$ and time evolution
$\Lambda \in \mathcal{A}(S, S')$, it follows that
$$F(\Lambda(s_1), \Lambda(s_2)) \ge F(s_1, s_2).$$

PROPOSITION 7 (Strong concavity [12]). Let $p = (p_i)_i$ and $q = (q_i)_i$ be
probability distributions over the same index set, and let $s_i, t_i \in S$ be states of a GPT
with the same index set. Then
$$F\Big(\sum_i p_i s_i, \sum_i q_i t_i\Big) \ge \sum_i \sqrt{p_i q_i}\, F(s_i, t_i).$$

As corollaries, one gets the following.

COROLLARY 8 (Joint concavity and concavity).
$$F\Big(\sum_i p_i s_i, \sum_i p_i t_i\Big) \ge \sum_i p_i F(s_i, t_i), \qquad
F\Big(\sum_i p_i s_i, t\Big) \ge \sum_i p_i F(s_i, t).$$

PROPOSITION 9. In a bipartite system $A + B$, we have the following:
(i) $F(s_1 \otimes t, s_2 \otimes t) = F(s_1, s_2)$ for any $s_1, s_2 \in S_A$, $t \in S_B$.
(ii) $F(s_A, t_A) \ge F(s, t)$ for any $s, t \in S_A \otimes S_B$, where $s_A$ and $t_A$ are the reduced
states on the system $A$.
(iii) $F(s_1, s_2)\, F(t_1, t_2) \ge F(s_1 \otimes t_1, s_2 \otimes t_2)$ for any $s_1, s_2 \in S_A$, $t_1, t_2 \in S_B$.
In particular, it follows that
$$F(s, t)^2 \ge F(s \otimes s, t \otimes t). \qquad (6)$$
Note that the generalization of the properties of Proposition 9 to a multipartite system
is straightforward.
However, contrary to the Kolmogorov distance, it is difficult to give an operational
meaning for the fidelity, since there is no known operational meaning of the
Bhattacharyya coefficient. In using the fidelity, it is important to know the relation
with another operational measure like the Kolmogorov distance.

3.3. Relation between the Kolmogorov distance and the fidelity


PROPOSITION 10. For any states $s, t \in S$, it follows that
$$1 - F(s, t) \le D(s, t) \le \sqrt{1 - F(s, t)^2}. \qquad (7)$$

This relation is famous for quantum systems [25, 26], but Proposition 10 shows
that it holds in any GPT.
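As a numerical illustration of (7) for the quantum case (not a proof), one can sample random qubit density matrices and check both inequalities; a Python sketch assuming NumPy, with the sampling routine chosen only for illustration:

```python
import numpy as np

def rand_qubit_state(rng):
    """Random 2x2 density matrix (Hermitian, PSD, unit trace)."""
    a = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def sqrtm_psd(a):
    w, v = np.linalg.eigh(a)
    return (v * np.sqrt(np.clip(w, 0, None))) @ v.conj().T

rng = np.random.default_rng(0)
for _ in range(1000):
    r1, r2 = rand_qubit_state(rng), rand_qubit_state(rng)
    D = 0.5 * np.abs(np.linalg.eigvalsh(r1 - r2)).sum()      # trace distance
    s = sqrtm_psd(r1)
    F = np.trace(sqrtm_psd(s @ r2 @ s)).real                 # fidelity
    assert 1 - F <= D + 1e-9 and D <= np.sqrt(1 - F**2) + 1e-9
print("inequalities (7) hold on all samples")
```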
From (7), we have the following corollary.

COROLLARY 11. (i) $D(s, t) = 0$ iff $F(s, t) = 1$, and (ii) $D(s, t) = 1$ iff $F(s, t) = 0$.

In particular, the orthogonality of states turns out to be equivalent to the complete
distinguishability of states ($P_s = 1$). In this sense, the Kolmogorov distance and the
fidelity are equivalent.
Similarly, it is straightforward to introduce other measures used in
quantum information theory. For instance, one can define the Shannon distinguishability
and show the same relations (see for instance Theorem 1 in [25]).

4. Applications
In this section, we give simple proofs, using the fidelity, of the no-cloning
theorem [9] and the information-disturbance theorem [13, 12] in any GPT.
THEOREM 12 (No-cloning). In any GPT, two states $s_1, s_2 \in S$ are jointly clonable
iff $s_1 = s_2$ or $s_1$ and $s_2$ are completely distinguishable.

Proof: Let the states $s_1, s_2 \in S$ be jointly clonable. Namely, there exists a time
evolution map (a cloning machine) $\Lambda \in \mathcal{A}(S, S \otimes S)$ satisfying
$$\Lambda(s_1) = s_1 \otimes s_1, \qquad \Lambda(s_2) = s_2 \otimes s_2. \qquad (8)$$
From (6), we have
$$F(\Lambda(s_1), \Lambda(s_2)) = F(s_1 \otimes s_1, s_2 \otimes s_2) \le F(s_1, s_2)^2.$$
From the monotonicity of $F$, it follows that $F(s_1, s_2) \le F(\Lambda(s_1), \Lambda(s_2)) \le F(s_1, s_2)^2$,
which implies that $F(s_1, s_2) = 0$ or $1$. In other words, $s_1 = s_2$ or $s_1$ and $s_2$ are
completely distinguishable (cf. Corollary 11).
Suppose that $s_1 = s_2$; then one has a time evolution $\Lambda \in \mathcal{A}(S, S \otimes S)$ defined by
$\Lambda(s) := s \otimes s_1$. (Physically, this is nothing but the preparation of a fixed state $s_1$.) It is
obvious that this jointly clones $s_1$ and $s_2$. Next, suppose that $s_1$ and $s_2$ are completely
distinguishable. Namely, there exists a measurement $M = (m_1, m_2) \in \mathcal{M}(S)$ such
that $m_1(s_1) = 1$, $m_1(s_2) = 0$ (and thus $m_2(s_1) = 0$, $m_2(s_2) = 1$). Then $\Lambda(s) :=
m_1(s)\, s_1 \otimes s_1 + m_2(s)\, s_2 \otimes s_2$ for any $s \in S$ defines a time evolution $\Lambda \in \mathcal{A}(S, S \otimes S)$
satisfying the cloning condition (8). (Notice that $m_1(s), m_2(s) \ge 0$, $m_1(s) + m_2(s) = 1$,
and thus $m_1(s)\, s_1 \otimes s_1 + m_2(s)\, s_2 \otimes s_2 \in S \otimes S$ from the convexity of $S \otimes S$. The
affinity of $\Lambda$ follows from the affinity of the $m_i$.) □
LEMMA 13. In any GPT with at least two distinct states, there exist two distinct
states which are not completely distinguishable.

Proof: Let $s_1 \ne s_2 \in S$. Assume that any two distinct states are completely
distinguishable. Then we have $F(s_1, s_2) = 0$. From the convexity of $S$, there
exists a state $s := \frac{1}{2} s_1 + \frac{1}{2} s_2 \ne s_1$. From the concavity of $F$, we have
$F(s_1, s) \ge \frac{1}{2} F(s_1, s_1) + \frac{1}{2} F(s_1, s_2) = \frac{1}{2}$. Therefore, $s_1$ and $s$ are distinct states which are not
completely distinguishable. □

A physical process which clones any unknown state is called a universal cloning
machine.

PROPOSITION 14 (No-cloning). In any GPT with at least two distinct states,
there is no universal cloning machine.

Proof: This follows from Theorem 12 and Lemma 13. □


In usual applications, cloning is often considered only for pure states. A physical
process which clones any unknown pure state is called a universal cloning machine
for pure states. However, such cloning is possible if and only if the GPT is classical.

PROPOSITION 15. A GPT is classical iff there is a universal cloning machine for
pure states.

Proof: Notice that classical systems are characterized by the fact that all the
pure states are completely distinguishable [9]. This fact and Theorem 12 complete
the proof. □

THEOREM 16 (Information-disturbance). In any GPT, any attempt to get information
discriminating two pure states which are not completely distinguishable
inevitably causes a disturbance.

Proof: Let $s_1, s_2 \in S_A$ be two pure states which are not completely distinguishable,
i.e. $0 < F(s_1, s_2)$. Assume that there is a physical way to get information
discriminating $s_1, s_2$ without causing any disturbance to the system. This implies that
we have a time evolution map $\Lambda$ on $S_A \otimes S_B$ and an initial state $t_0 \in S_B$ such that
the reduced states on the system $A$ are unchanged,
$$\Lambda(s_1 \otimes t_0)_A = s_1, \qquad \Lambda(s_2 \otimes t_0)_A = s_2.$$
Since $s_1, s_2$ are pure states, there exist no correlations between the systems $A$ and
$B$, and hence one gets
$$\Lambda(s_1 \otimes t_0) = s_1 \otimes t_1, \qquad \Lambda(s_2 \otimes t_0) = s_2 \otimes t_2,$$
for some $t_1, t_2 \in S_B$. From the monotonicity of $F$ and Proposition 9, it follows that
$F(s_1, s_2) = F(s_1 \otimes t_0, s_2 \otimes t_0) \le F(\Lambda(s_1 \otimes t_0), \Lambda(s_2 \otimes t_0)) = F(s_1 \otimes t_1, s_2 \otimes t_2) \le F(s_1, s_2)\, F(t_1, t_2)$.
Since $0 < F(s_1, s_2)$, we have $F(t_1, t_2) = 1$ and thus $t_1 = t_2$, so the state of $B$ carries no
information about which of $s_1, s_2$ was prepared. Therefore, to get information
distinguishing $s_1$ and $s_2$, one inevitably has to disturb at least one of these states. □

No-cloning theorems are discussed in [9] using completely different methods.
In [13], we proved Theorem 16 using the Kolmogorov distance. Essentially
the same proof as above is given in [12].

5. Indecomposable and complete measurements in general probabilistic theories


5.1. Indecomposable effect
In quantum systems, a fundamental POVM element is one with a one-dimensional
range, called a rank-one POVM element. Let us define the corresponding notion
in any GPT, which we are going to call an indecomposable effect.

DEFINITION 17. An effect $e \in \mathcal{E}(S)$ is called indecomposable if (i) $e \ne 0$ and
(ii) for any decomposition $e = e_1 + e_2$ into the sum of two effects $e_1$ and $e_2$, there
exists $c \in \mathbb{R}$ such that $e_1 = c e$. We denote the set of all the indecomposable effects
on $S$ by $\mathcal{E}_{\mathrm{ind}}(S) \subset \mathcal{E}(S)$.
It is easy to see that the above-mentioned $c$ satisfies $0 \le c \le 1$.
Here, we show some general properties of effects and indecomposable effects.
PROPOSITION 18. Let $e$ be a nonzero pure effect on $S$. Then there exists a state
$s$ such that $e(s) = 1$. Since $S$ is compact, such a state can be taken to be a pure
state.

Proof: Suppose that there is no state $s$ such that $e(s) = 1$. Then, from the
compactness of $S$, we have
$$\sup_{s \in S} e(s) = \max_{s \in S} e(s) =: x < 1.$$
From this, $\bar{e} := e/x$ is an effect which is neither $e$ nor the zero effect $0$. Since we
have the identity
$$e = x \bar{e} + (1 - x)\, 0,$$
this contradicts the assumption that $e$ is a pure effect.
Now let $s$ be a state with $e(s) = 1$ and let $s = \sum_i p_i s_i$ ($p_i > 0$, $\sum_i p_i = 1$) be a pure
state decomposition of $s$. Then it is easy to see that $e(s_i) = 1$ for every pure state $s_i$
in the decomposition. Thus, we can take a pure state $s$ such that $e(s) = 1$. □
COROLLARY 19. Let $e$ be a pure effect which is not $u$. Then there exists a state
$s$ such that $e(s) = 0$. Such a state can be taken to be a pure state.

Proof: Since $e\ (\ne u)$ is pure, the effect $\bar{e} := u - e$ is a nonzero pure effect.
From Proposition 18, there exists a pure state $s$ such that $\bar{e}(s) = 1 - e(s) = 1$. Thus
$e(s) = 0$. □
Next, we show that any nonzero effect has a decomposition with respect to
indecomposable effects.

PROPOSITION 20. In any GPT, for every $0 \ne e \in \mathcal{E}(S)$, there exists a finite
collection of indecomposable effects $e_i \in \mathcal{E}(S)$, $1 \le i \le r$, such that $e = \sum_{i=1}^r e_i$. In
particular, in any GPT there exists an indecomposable effect.

(See Appendix A for the proof.) Moreover, we have the following proposition.

PROPOSITION 21. In any GPT, there exists an indecomposable and pure effect.
Proof: To prove this, we use the following lemmata.

LEMMA 22. Let $e \in \mathcal{E}$ be an indecomposable effect and let $q := \max_{s \in S} e(s)$.
(Note that $0 < q \le 1$.) Then $\bar{e} := e/q$ is an indecomposable effect.

LEMMA 23. If $e$ is an indecomposable effect such that there exists a state $s$
satisfying $e(s) = 1$, then $e$ is a pure effect.

(See Appendix A for the proofs.) From Proposition 20, there exists an indecomposable
effect. From Lemma 22, from any indecomposable effect one can construct an
indecomposable effect $\bar{e}$ such that $\bar{e}(s) = 1$ for some state $s$. From Lemma 23, it
is an indecomposable and pure effect. □

In the following, we give a characterization of indecomposable effects for the
classical, quantum and hyper-cuboid systems.
[Classical systems] Let $S_c$ be the state space of a classical system introduced in
Section 2. Recall that any state $p = (p_1, \ldots, p_d) \in S_c$ has the unique decomposition
with respect to pure states, $p = \sum_\mu p_\mu p^{(\mu)}$. Thus, an effect $e$ on $S_c$ is completely
characterized by the $d$ values $x_\mu := e(p^{(\mu)}) \in [0, 1]$ ($\mu = 1, \ldots, d$). Conversely,
for any given $x_\mu \in [0, 1]$ ($\mu = 1, \ldots, d$), there exists an effect $e$ such that $e(p^{(\mu)}) =
x_\mu$. Let $e^{(\mu)} \in \mathcal{E}(S_c)$ ($\mu = 1, \ldots, d$) be the effects defined by $e^{(\mu)}(p^{(\nu)}) = \delta_{\mu\nu}$. For
classical systems, the indecomposable effects are characterized as follows.

PROPOSITION 24. An effect $e \in \mathcal{E}(S_c)$ is indecomposable iff there is exactly one pure
state at which the value of the effect is nonzero. In other words, $\mathcal{E}_{\mathrm{ind}}(S_c)$ is characterized
by $\mathcal{E}_{\mathrm{ind}}(S_c) = \{\lambda e^{(\mu)} \mid \lambda \in (0, 1],\ \mu = 1, \ldots, d\}$.

Proof: First, let $e$ be an effect such that there is exactly one pure state, say $p^{(\mu)}$, at
which the value of the effect is nonzero. Then one has $e \ne 0$ and $e = \lambda e^{(\mu)} \in \mathcal{E}(S_c)$
for some $\lambda \in (0, 1]$. Let $e = f + g$ for $f, g \in \mathcal{E}(S_c)$. Then $f(p^{(\nu)}) = 0$ for any
$\nu \ne \mu$, and it follows that $f = \frac{f(p^{(\mu)})}{e(p^{(\mu)})}\, e$. Therefore, $e$ is indecomposable. Next, let $e$
be an indecomposable effect. Assume that there are at least two pure states,
say $p^{(\mu_0)}, p^{(\mu_1)}$ ($\mu_0 \ne \mu_1$), at which the effect values are nonzero. Let
$x_\mu := e(p^{(\mu)})$. Let $f, g$ be the effects defined by $f(p^{(\mu)}) = x_{\mu_0} \delta_{\mu \mu_0}$ and $g = e - f$.
Obviously $f \ne c e$ for any $c \in \mathbb{R}$, and this contradicts the indecomposability of $e$. Since
$e \ne 0$, there is exactly one pure state at which the value of the effect is nonzero. □
[Quantum systems] Next, we show that indecomposable effects for quantum
systems are characterized by POVM elements with one-dimensional range, i.e. rank-one POVM
elements. Let $\mathcal{H}$ be the $d$-dimensional Hilbert space and let $S_q$ be the set of all
density operators on $\mathcal{H}$. A nonzero POVM element $E$ is called indecomposable iff
the corresponding effect $e(\cdot) := \operatorname{tr}(E\,\cdot)$ is indecomposable. It is easy to see that
a POVM element $E$ is rank-one iff there exist $\lambda \in (0, 1]$ and a unit vector
$\psi \in \mathcal{H}$ such that $E = \lambda |\psi\rangle\langle\psi|$.

PROPOSITION 25. A POVM element $E \in \mathcal{E}_q$ is indecomposable if and only if it is
a rank-one POVM element.

Proof: Let $E = \lambda |\psi\rangle\langle\psi|$ be a rank-one POVM element with a unit vector $\psi \in \mathcal{H}$
and $\lambda \in (0, 1]$. Let
$$E = E_1 + E_2 \qquad (9)$$
for some POVM elements $E_1, E_2$. Let $\{\psi_n\}_n$ be an orthonormal basis of $\mathcal{H}$ such that $\psi_1 = \psi$. Then, from (9), it follows
that $\langle\psi_j|E_1|\psi_j\rangle = \|E_1^{1/2}\psi_j\|^2 = 0$ ($\forall j \ge 2$), and hence $E_1 \psi_j = 0$ ($\forall j \ge 2$). For any
$\xi \in \mathcal{H}$, we have $E_1 \xi = E_1\big(\sum_n \langle\psi_n|\xi\rangle \psi_n\big) = \langle\psi_1|\xi\rangle E_1\psi_1 = |E_1\psi_1\rangle\langle\psi_1|\xi\rangle$. Thus $E_1$ has
the form $|\phi\rangle\langle\psi_1|$ (where $\phi := E_1\psi_1$). Finally, since $E_1$ is Hermitian, it follows that
there exists $c' \in \mathbb{R}$ such that $\phi = c'\psi$, and hence $E_1 = c'|\psi\rangle\langle\psi| = cE$, where $c := c'/\lambda$.
This implies that $E$ is indecomposable. Next, let $E$ be indecomposable. Assume
that $E$ is a rank-$l$ POVM element for some $l \ge 2$, and let $E = \sum_{n=1}^{l} c_n |\psi_n\rangle\langle\psi_n|$
($c_n \in (0, 1]$) be an eigenvalue decomposition of $E$. Let $E_1 := c_1|\psi_1\rangle\langle\psi_1|$ and
$E_2 := \sum_{n=2}^{l} c_n|\psi_n\rangle\langle\psi_n|$. Obviously, they are POVM elements satisfying $E = E_1 + E_2$.
However, for any $c \in \mathbb{R}$ we have $E_1 \ne cE$ (for instance, $E_1\psi_2 = 0$ while
$E\psi_2 = c_2\psi_2 \ne 0$, and $E_1\psi_1 \ne 0$). This contradicts the indecomposability of $E$. Since $E \ne 0$, we conclude
that $E$ is a rank-one POVM element. □
[Hyper-cuboid systems] Finally, let $S_{cb}$ be the state space of the $d$-dimensional
hyper-cuboid system introduced in Section 2. To determine the indecomposable
effects in $S_{cb}$, we present a general lemma which is also useful in later arguments
(see Appendix A for the proof).

LEMMA 26. If the state space $S$ of a GPT contains at least two states, then
for every indecomposable effect $e \in \mathcal{E}(S)$ we have $e(s) = 0$ for some $s \in S$.

By virtue of this lemma, we obtain the following characterization of indecomposable
effects in $S_{cb}$.

PROPOSITION 27. An effect $e \in \mathcal{E}(S_{cb})$ is indecomposable if and only if it is
nonzero and it takes the value 0 on a $(d-1)$-dimensional face (facet) of $S_{cb}$.
Proof: First we consider the 'if' part. Suppose that $e$ is nonzero and $e$ takes 0
on a facet $F$ of $S_{cb}$. Fix a state $s \in S_{cb}$ such that $s \notin F$. Note that $e(s) > 0$ since
$e$ is nonzero. If $e$ decomposes as $e = e_1 + e_2$ with $e_1, e_2 \in \mathcal{E}(S_{cb})$, then $e_1$ (hence
$e_2$) also takes 0 on $F$. This implies that $e_1 = \lambda e$ where $\lambda = e_1(s)/e(s) \in \mathbb{R}$, hence
$e$ is indecomposable.
Second, we consider the 'only if' part. By Lemma 26, an indecomposable effect
$e$ takes 0 at some state, hence at some pure state of $S_{cb}$. By symmetry, we may
assume without loss of generality that $e(s_0) = 0$ where $s_0 = (0, 0, \ldots, 0) \in S_{cb}$. Let
$s^i$ ($i \in \{1, 2, \ldots, d\}$) be the vertex of $S_{cb}$ whose $j$-th component is $\delta_{ij}$. Then
we have $e = \sum_{i=1}^{d} e(s^i)\, e_i$, where $e_i \in \mathcal{E}(S_{cb})$ maps $(c_1, c_2, \ldots, c_d)$ to $c_i$. Since $e$ is
indecomposable, it follows that $e = \lambda e_i$ for some $1 \le i \le d$ and $\lambda \in \mathbb{R}$, therefore $e$
takes 0 on the facet $\{(c_1, \ldots, c_d) \in S_{cb} \mid c_i = 0\}$ of $S_{cb}$. □
For example, the indecomposable effects in the squared system (i.e. when $d = 2$)
are listed in Table 1, where $\alpha_1, \ldots, \alpha_4 \in (0, 1]$ are parameters.

Table 1. Indecomposable effects in $S_{cb}$, $d = 2$ (values at the four pure states; $\alpha_1, \ldots, \alpha_4 \in (0, 1]$).

  effect   (0,0)   (0,1)   (1,0)   (1,1)
  e_1        0       0      α_1     α_1
  e_2       α_2     α_2      0       0
  e_3        0      α_3      0      α_3
  e_4       α_4      0      α_4      0
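Written as affine functions of a state $s = (c_1, c_2)$, the four families in Table 1 correspond to $e_1(s) = \alpha_1 c_1$, $e_2(s) = \alpha_2(1 - c_1)$, $e_3(s) = \alpha_3 c_2$ and $e_4(s) = \alpha_4(1 - c_2)$. The following Python sketch (illustration only, assuming NumPy) evaluates them at the four vertices and checks that, with $\alpha_i = 1$, the pairs $(e_1, e_2)$ and $(e_3, e_4)$ sum to the unit effect, i.e. form two-outcome indecomposable measurements.

```python
import numpy as np

# Indecomposable effects of the squared system as affine functions of s = (c1, c2).
def e1(s, a=1.0): return a * s[0]          # vanishes on the facet c1 = 0
def e2(s, a=1.0): return a * (1 - s[0])    # vanishes on the facet c1 = 1
def e3(s, a=1.0): return a * s[1]          # vanishes on the facet c2 = 0
def e4(s, a=1.0): return a * (1 - s[1])    # vanishes on the facet c2 = 1

vertices = [(0, 0), (0, 1), (1, 0), (1, 1)]
for v in vertices:
    print(v, e1(v), e2(v), e3(v), e4(v))

# With alpha = 1, each pair is a two-outcome measurement: e1 + e2 = e3 + e4 = u.
for s in np.random.default_rng(1).uniform(size=(5, 2)):
    assert abs(e1(s) + e2(s) - 1.0) < 1e-12
    assert abs(e3(s) + e4(s) - 1.0) < 1e-12
```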

5.2. Indecomposable and complete measurements

Using the indecomposable effects defined above, we define an indecomposable
measurement in any GPT as follows.

DEFINITION 28. In a GPT with a state space $S$, we say that a measurement
$M = (m_j)_j$ is indecomposable if all $m_j \in \mathcal{E}(S)$ are indecomposable. The set of all
indecomposable measurements is denoted by $\mathcal{M}_{\mathrm{ind}}(S)$, or simply by $\mathcal{M}_{\mathrm{ind}}$.

From Proposition 25, an indecomposable measurement is a generalization of
a rank-one POVM measurement in quantum systems.

PROPOSITION 29. In any GPT, there exists an indecomposable measurement, i.e.
$\mathcal{M}_{\mathrm{ind}}(S) \ne \emptyset$.
Proof: From Proposition 20, a decomposition of the unit effect $u$ into
indecomposable effects gives an indecomposable measurement. □
In quantum systems, rank-one PVM measurements play a fundamental role in the
foundations of quantum physics, describing measurements of nondegenerate
Hermitian operators. One can also define the corresponding notion in any GPT, which
we call a complete measurement.

DEFINITION 30. In a GPT with a state space $S$, we say that a measurement
$M = (m_j)_j$ is complete if all $m_j \in \mathcal{E}(S)$ are indecomposable and extreme. The set
of all complete measurements is denoted by $\mathcal{M}_{\mathrm{comp}}(S)$, or simply by $\mathcal{M}_{\mathrm{comp}}$.
It is easy to see that the set of extreme effects for classical systems is
characterized by $\mathcal{E}(S_c)_{\mathrm{ex}} = \{e \in \mathcal{E}(S_c) \mid e(p^{(i)}) = 1 \text{ or } 0\ (i = 1, \ldots, d)\}$.²
² Let $e$ be an effect with $e(p^{(i)}) = 1$ or $0$ for every $i$. Assume that there exist $f, g \in \mathcal{E}(S_c)$ and $\lambda \in (0, 1)$
such that $e = \lambda f + (1 - \lambda) g$. Since the $e(p^{(i)})$ are $1$ or $0$ and $\lambda \in (0, 1)$, one can show that the $f(p^{(i)})$ are also
restricted to $1$ or $0$. Therefore $e = f = g$, and $e$ is an extreme point. Next, let $e$ be an effect for which there
exists $i_0$ such that $e(p^{(i_0)}) \in (0, 1)$. Let $i_m \in \{1, \ldots, d\}$ be such that $e(p^{(i_m)})$ is a minimum value among all
$e(p^{(i)}) \ne 0$, i.e. $e(p^{(i)}) \ne 0 \Rightarrow e(p^{(i)}) \ge e(p^{(i_m)})$. Let $f, g$ be the effects defined by $f(p^{(i)}) = \delta_{i i_m}$, i.e. $f = e^{(i_m)}$,
and $g(p^{(i)}) = \frac{x_i - \lambda \delta_{i i_m}}{1 - \lambda}$, where $x_i := e(p^{(i)})$. It is easy to see that $e \ne f, g$ and $e = \lambda f + (1 - \lambda) g$ for $\lambda := e(p^{(i_m)}) \in (0, 1)$.
Therefore, $e$ is not an extreme point, and this completes the proof. □

Hence, from Proposition 24, there is essentially a unique complete measurement in
classical systems, given by $M_{\mathrm{comp}} := (e^{(j)})_j$, where $e^{(j)}(p^{(k)}) := \delta_{jk}$ ($j, k = 1, \ldots, d$).
More precisely, $M = (m_j)_j$ is a complete measurement iff $m_j = e^{(\sigma(j))}$, where
$\sigma$ is a permutation of $(1, \ldots, d)$. On the other hand, in quantum systems,
a complete measurement is given by a rank-one PVM measurement. This follows
from Proposition 25 and the fact that a POVM element is extreme iff it is a projection
operator.
By definition, $\mathcal{M}_{\mathrm{comp}}(S) \subset \mathcal{M}_{\mathrm{ind}}(S)$. However, the existence of complete
measurements does not necessarily hold for every GPT (see Appendix B for a
counterexample).

6. Some quantities related to entropy


In this section, we consider three quantities on $S$ in any GPT which are related to
the notion of entropy. Indeed, all of them coincide with the Shannon entropy $H$ in
classical systems and with the von Neumann entropy $S$ in quantum systems, and therefore
give generalizations of the entropies of classical and quantum systems. However, as will be
shown, they do not coincide in some GPTs, and do not satisfy some of the properties of
entropy. In the following, let $H(p)$, or simply $H(p_i)$, denote the Shannon entropy of
a probability distribution $p = (p_1, \ldots, p_d)$: $H(p) := -\sum_i p_i \log p_i$. We also write
$H(X)$ when dealing with a random variable $X$. The mutual information between
random variables $X$ and $J$ is denoted by $H(X : J) := H(X) + H(J) - H(X, J)$.
In quantum systems, the von Neumann entropy of a density operator $\rho$ on $\mathcal{H}$ is
denoted by $S(\rho) := -\operatorname{tr}(\rho \log \rho)$.
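For reference, a small Python sketch (assuming NumPy and natural logarithms, so values are in nats; illustration only) of the Shannon entropy and the mutual information $H(X : J)$ computed from a joint distribution:

```python
import numpy as np

def shannon_entropy(p):
    """H(p) = -sum_i p_i log p_i (natural log; the base only changes units)."""
    p = np.asarray(p, float)
    p = p[p > 0]
    return -(p * np.log(p)).sum()

def mutual_information(pxy):
    """H(X:J) = H(X) + H(J) - H(X,J) for a joint distribution p(x, j)."""
    pxy = np.asarray(pxy, float)
    px, pj = pxy.sum(axis=1), pxy.sum(axis=0)
    return shannon_entropy(px) + shannon_entropy(pj) - shannon_entropy(pxy.ravel())

pxy = np.array([[0.4, 0.1],
                [0.1, 0.4]])          # a correlated joint distribution p(x, j)
print(shannon_entropy([0.5, 0.5]))     # log 2 ~ 0.693
print(mutual_information(pxy))         # ~ 0.193 nats
```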
Let us consider a general GPT with a state space $S$. For any state $s \in S$, we
denote by $\mathcal{D}(s)$ the set of all ensembles $\{p_x; s_x\}_x$ ($s_x \in S$, $p_x \ge 0$, $\sum_x p_x = 1$)
such that $s = \sum_x p_x s_x$. The set of all ensembles for $s$ with respect to pure states
is denoted by $\mathcal{P}(s) \subset \mathcal{D}(s)$; i.e. $\{p_x; s_x\} \in \mathcal{P}(s) \Leftrightarrow s = \sum_x p_x s_x$, $s_x \in S_{\mathrm{pure}}$. Note
that $H$ and $S$ are concave. Both $H$ and $S$ are positive and take the minimum value
$0$ iff the state is pure. The following upper bound on the von Neumann entropy is also
well known: for a probability distribution $(p_i)_i$ and a set of density operators $\{\rho_i\}_i$,
$$S\Big(\sum_i p_i \rho_i\Big) \le H(p_i) + \sum_i p_i S(\rho_i), \qquad (10)$$
with equality iff the density operators $\rho_i$ are mutually orthogonal. See, for instance,
[26] for the properties of the Shannon and von Neumann entropies.
In any GPT, let us define the following quantities for $s \in S$:
$$S_1(s) := \inf_{M = (m_j)_j \in \mathcal{M}_{\mathrm{ind}}} H(m_j(s)), \qquad (11)$$
$$S_2(s) := \sup_{\{p_x; s_x\} \in \mathcal{P}(s)} \ \sup_{M = (m_j)_j \in \mathcal{M}_{\mathrm{ind}}} H(X : J), \qquad (12)$$
$$S_3(s) := \inf_{\{p_x; s_x\} \in \mathcal{P}(s)} H(\{p_x\}_x). \qquad (13)$$

In $S_2(s)$, $H(X : J)$ is defined via the joint distribution $p(x, j) = p_x m_j(s_x)$ determined
by an ensemble $\{p_x; s_x\} \in \mathcal{P}(s)$ and a measurement $M = (m_j)_j$. From the definition and
the positivity of the Shannon entropy and the mutual information, the positivity of
$S_1, S_2, S_3$ is obvious. It is easy to see that $S_2$ can equivalently be defined with respect
to $\mathcal{D}(s)$ and $\mathcal{M}$.
LEMMA 31. We have
$$S_2(s) = \sup_{\{p_x; s_x\} \in \mathcal{D}(s)} \ \sup_{M = (m_j)_j \in \mathcal{M}} H(X : J).$$

Proof: A straightforward computation shows that, for any $\{p_x; s_x\} \in \mathcal{D}(s)$ and
$M = (m_j)_j \in \mathcal{M}$, the value of $H(X : J)$ is not decreased by replacing $\{p_x; s_x\}$
with the pure state decomposition of $s$ obtained by decomposing every $s_x$ into
pure states, and by replacing $M$ with the indecomposable measurement obtained
by decomposing every $m_j$ into indecomposable effects (cf. Proposition 20). This
implies the desired relation. □
However, note that it is essential to use $\mathcal{M}_{\mathrm{ind}}$ and $\mathcal{P}(s)$ in the definitions of
$S_1$ and $S_3$. Indeed, redefining $S_1$ and $S_3$ with respect to $\mathcal{M}$ and $\mathcal{D}(s)$ gives
trivial quantities: $\inf_{M \in \mathcal{M}} H(m_j(s)) = 0$ and $\inf_{\{p_x; s_x\} \in \mathcal{D}(s)} H(\{p_x\}_x) = 0$.
Notice that all three quantities (11)-(13) are defined in physical language:
$S_1(s)$ measures the minimum measurement uncertainty among indecomposable
measurements in a state $s$; $S_2(s)$ measures the maximum accessible information (by
an optimal measurement) over all preparations of $s$ (see below); finally, $S_3(s)$
measures the minimum uncertainty of a preparation of $s$ with respect to pure states.
Indeed, preparing the states $s_x$ with a prior probability distribution $p_x$, the
accessible information $I(\{p_x, s_x\})$ is defined by $\sup_{M = (m_j)_j \in \mathcal{M}} H(X : J)$, where
the joint probability distribution of $X$ and $J$ (the measurement outcome of
a measurement $M = (m_j)_j$) is given by $p(x, j) := p_x m_j(s_x)$. Therefore, from
Lemma 31, we have $S_2(s) = \sup_{\{p_x, s_x\} \in \mathcal{D}(s)} I(\{p_x, s_x\})$, and thus we obtain the
following result.

PROPOSITION 32. In any GPT, for any preparation of states $\{p_x, s_x\}$, the accessible
information is bounded as
$$I(\{p_x, s_x\}) \le S_2(s), \qquad (14)$$
where $s := \sum_x p_x s_x$.
Notice that, for quantum systems, the Holevo bound [27] gives an upper bound
on the accessible information in terms of the Holevo $\chi$ quantity: for a preparation of density
operators $\rho_x$ with a probability distribution $p_x$,
$$I(\{p_x, \rho_x\}) \le S\Big(\sum_x p_x \rho_x\Big) - \sum_x p_x S(\rho_x) =: \chi(\{p_x, \rho_x\}). \qquad (15)$$
In the following, we see that $S_2$ coincides with the von Neumann entropy for
quantum systems. Thus, (14) gives a weaker bound than the Holevo bound for
quantum systems. (For a pure state ensemble, (14) gives exactly the Holevo bound,
since the von Neumann entropy vanishes on pure states.)
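As a numerical illustration of (15) (assuming NumPy and natural logarithms; the two-state qubit ensemble is chosen only for illustration), the following Python sketch computes the Holevo $\chi$ quantity; since both states are pure, this value also coincides with the bound (14):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = -tr(rho log rho), computed from the eigenvalues (natural log)."""
    w = np.linalg.eigvalsh(rho)
    w = w[w > 1e-12]
    return -(w * np.log(w)).sum()

# Ensemble {p_x, rho_x}: |0><0| and |+><+| with equal prior.
p = [0.5, 0.5]
rhos = [np.array([[1.0, 0.0], [0.0, 0.0]]),
        np.array([[0.5, 0.5], [0.5, 0.5]])]

rho_avg = sum(px * rx for px, rx in zip(p, rhos))
chi = von_neumann_entropy(rho_avg) - sum(px * von_neumann_entropy(rx)
                                         for px, rx in zip(p, rhos))
print(chi)   # ~ 0.417 nats; for this pure-state ensemble it equals the bound (14)
```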

Now we show that all three quantities (11)-(13) are generalizations of the
Shannon and von Neumann entropies of classical and quantum systems.

THEOREM 33. (i) In classical systems, $S_1(s)$, $S_2(s)$, $S_3(s)$ coincide with the Shannon
entropy. (ii) In quantum systems, $S_1(s)$, $S_2(s)$, $S_3(s)$ coincide with the von Neumann entropy.
Proof: (i) Let $S_c$ be the state space of a classical system. From Proposition 24,
any indecomposable measurement in a classical system is of the form $(\lambda_{i,\mu} e^{(\mu)})_{i,\mu}$, where
$\lambda_{i,\mu} \ge 0$ and $\sum_i \lambda_{i,\mu} = 1$ for every $\mu = 1, \ldots, d$. Thus, for a state $p = (p_1, \ldots, p_d) \in S_c$,
the probability distribution given by the indecomposable measurement is $(\lambda_{i,\mu} p_\mu)_{i,\mu}$.
Note that, from the concavity of the function $g(x) := -x \log x$ ($x \in [0, 1]$) with the
convention $g(0) = 0$, it holds that $g(\lambda x) \ge \lambda g(x)$, and thus we have
$$H(\lambda_{i,\mu} p_\mu) = \sum_{i,\mu} g(\lambda_{i,\mu} p_\mu) \ge \sum_\mu \Big(\sum_i \lambda_{i,\mu}\Big) g(p_\mu) = H(p_\mu).$$
Thus, we have $S_1(p) \ge H(p)$. Since $(e^{(\mu)})_\mu$ is an indecomposable measurement
for which the probability distribution is exactly $p$, we have $S_1(p) = H(p)$.
As mentioned before, the state space of a classical system is a simplex, so the pure
state decomposition of $p$ is unique. Thus, we have
$$S_2(p) = \sup_{M \in \mathcal{M}} H(X : J),$$
where the random variable $X$ is described by the probability distribution $p$. Recall
that the mutual information can be written as $H(X : J) = H(X) - H(X|J)$, where
$H(X|J)$ denotes the conditional entropy, and it follows that
$$S_2(p) = \sup_{M \in \mathcal{M}} H(X : J) = H(X) - \inf_{M \in \mathcal{M}} H(X|J).$$
Since there exists a measurement $M = (m_j)_j$ discriminating all pure states in a classical
system, we have $\inf_{M \in \mathcal{M}} H(X|J) = 0$ (i.e. the uncertainty of $X$ conditioned on the
information of $J$ is zero). Therefore, we have $S_2(p) = H(X) = H(p)$.
Again, from the uniqueness of the pure state decomposition, there is a unique ensemble
$\{p_\mu; p^{(\mu)}\}_{\mu=1}^d$ for any state $p = (p_1, \ldots, p_d) \in S_c$. Therefore, we have $S_3(p) = H(p)$.
(ii) Next, we consider a quantum system described by a Hilbert space $\mathcal{H}$. First,
let $f$ be a concave function on $[0, 1]$ such that $f(0) = 0$, and let $\rho$ be a density
operator on $\mathcal{H}$. Then it is easy to show³ that for all vectors $\psi \in \mathcal{H}$ such that
$\|\psi\| \le 1$, we have
$$f(\langle\psi|\rho|\psi\rangle) \ge \langle\psi|f(\rho)|\psi\rangle.$$
³ Let $\rho = \sum_{j=1}^d p_j |\phi_j\rangle\langle\phi_j|$ be an eigenvalue decomposition of $\rho$. Notice that $\{\phi_j\}_{j=1}^d$ is an orthonormal
basis of $\mathcal{H}$ and $\sum_{j=1}^d |\langle\psi|\phi_j\rangle|^2 = \|\psi\|^2 \le 1$. Thus $(q_j)_{j=1}^{d+1}$, where $q_j := |\langle\psi|\phi_j\rangle|^2$ ($j = 1, \ldots, d$) and
$q_{d+1} := 1 - \sum_j |\langle\psi|\phi_j\rangle|^2$, is a probability distribution. From the concavity of $f$, we get
$$f(\langle\psi|\rho|\psi\rangle) = f\Big(\sum_{j=1}^{d+1} p_j q_j\Big) \ge \sum_{j=1}^{d+1} q_j f(p_j) = \langle\psi|f(\rho)|\psi\rangle,$$
where $p_{d+1} := 0$.
Let us fix any indecomposable POVM measurement $(E_j)_j$ on the quantum system
$\mathcal{H}$, i.e. a rank-one POVM measurement. We can write $E_j = |\psi_j\rangle\langle\psi_j|$ with vectors
$\psi_j \in \mathcal{H}$ such that $0 < \|\psi_j\| \le 1$ and $\sum_j |\psi_j\rangle\langle\psi_j| = \mathbb{1}$. Recall that the von Neumann
entropy of $\rho$ can be written as
$$S(\rho) := \operatorname{tr} g(\rho)$$
with the concave function $g(x) := -x \log x$ and the convention $g(0) = 0$. Applying
the above inequality to the concave function $g$, we have
$$H(e_j(\rho)) = \sum_j g(\langle\psi_j|\rho|\psi_j\rangle) \ge \sum_j \langle\psi_j|g(\rho)|\psi_j\rangle
= \operatorname{tr}\Big(g(\rho) \sum_j E_j\Big) = \operatorname{tr} g(\rho) = S(\rho).$$
By considering the indecomposable measurement given by $(|\phi_j\rangle\langle\phi_j|)_j$, where the $\phi_j$
form a complete set of eigenvectors of $\rho$, we obtain $S_1(\rho) = S(\rho)$.
Next, from the Holevo bound (15), we have
$$S_2(\rho) \le S(\rho) - \inf_{\{p_x, \rho_x\} \in \mathcal{D}(\rho)} \Big(\sum_x p_x S(\rho_x)\Big) = S(\rho).$$
The final equality follows from the eigenvalue decomposition $\rho = \sum_x p_x |\phi_x\rangle\langle\phi_x|$ and
$S(|\phi_x\rangle\langle\phi_x|) = 0$. Again, for the decomposition $\{p_x, \rho_x = |\phi_x\rangle\langle\phi_x|\}$ into eigenvalues
and eigenvectors, there exists an optimal measurement $M_j := |\phi_j\rangle\langle\phi_j|$ discriminating the
$\rho_x$, and thus one has $H(X : J) = H(p)$. Since $S(\rho) = H(p)$, we have $S(\rho) =
H(X : J) \le S_2(\rho)$.
Finally, let $\{p_x, \rho_x\} \in \mathcal{P}(\rho)$ be a pure state decomposition of $\rho$. Then, from the
inequality (10) and the fact that $S(\rho_x) = 0$ for pure states $\rho_x$, we have
$S(\rho) \le H(\{p_x\})$, and hence $S(\rho) \le S_3(\rho)$.
Moreover, an eigenvalue decomposition $\rho = \sum_x p_x |\phi_x\rangle\langle\phi_x|$ of $\rho$ gives a pure state
decomposition in which the $\rho_x = |\phi_x\rangle\langle\phi_x|$ are mutually orthogonal, so we have the
equality $S_3(\rho) = S(\rho)$. This completes the proof. □
Notice that the fact that $S_1, S_2, S_3$ coincide with the von Neumann entropy
for quantum systems shows that we have alternative expressions, with operational
meanings, for the von Neumann entropy. The characterization of $S$ by $S_3$ was
noticed by Jaynes [28]. Here, we remark that $S_1$ could be defined as the
infimum of the Shannon entropy not over indecomposable measurements but over complete
measurements. Then it is easy to restate the above proof to show that this $S_1$
also coincides with the Shannon and von Neumann entropies for classical and quantum
systems. However, as we have noticed in Section 5.2, there exist GPTs where no
complete measurements exist. This is the reason why we have defined $S_1$ over
indecomposable measurements.
In order to see the properties of $S_1, S_2, S_3$ in a general GPT, let us again consider
the squared system $S_{sq}$. Let $h(x) := -x \log x - (1 - x) \log(1 - x)$ be the binary
Shannon entropy.
PROPOSITION 34. In the squared system, for $s = (c_1, c_2) \in S_{sq}$, we have
$$S_1(s) = \min[h(c_1), h(c_2)], \qquad (16)$$
(17)

$$S_3(s) = \begin{cases}
k(c_1 + c_2 - 1), & s \in R_{1L},\\
k(c_1), & s \in R_{1R},\\
k(0), & s \in R_{2U},\\
k(c_1), & s \in R_{2B},\\
k(c_2), & s \in R_{3U},\\
k(c_1 + c_2 - 1), & s \in R_{3B},\\
k(c_2), & s \in R_{4L},\\
k(0), & s \in R_{4R},
\end{cases} \qquad (18)$$
where $k(x) := H(\{x,\ c_1 - x,\ c_2 - x,\ 1 + x - c_1 - c_2\})$ and the regions $R_{1L}, \ldots, R_{4R} \subset S_{sq}$
are given in Fig. 1(4).

(See Appendix A for the proof.) See the graphs of $S_1$, $S_2$ and $S_3$ in Fig. 1(1)-(3).
Moreover, in $S_{sq}$, the following relations among $S_1$, $S_2$ and $S_3$ hold.

PROPOSITION 35. For any $s \in S_{sq}$,
$$S_1(s) \le S_2(s) \le S_3(s)$$
(see Appendix A for the proof).

Fig. 1. In the squared system, (1), (2) and (3) show the graphs of $S_1$, $S_2$ and $S_3$; (4) specifies the regions
$R_{1L}, \ldots, R_{4R}$.
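A small Python sketch of Eq. (16) for the squared system (assuming NumPy and natural logarithms; illustration only), which also exhibits boundary states with $S_1 = 0$ that are not pure:

```python
import numpy as np

def binary_entropy(x):
    """h(x) = -x log x - (1-x) log(1-x), with h(0) = h(1) = 0 (natural log)."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * np.log(x) - (1 - x) * np.log(1 - x)

def s1_squared_system(s):
    """S_1 for the squared system, Eq. (16): min[h(c1), h(c2)] for s = (c1, c2)."""
    return min(binary_entropy(s[0]), binary_entropy(s[1]))

print(s1_squared_system((0.5, 0.5)))   # log 2 ~ 0.693: the maximally mixed state
print(s1_squared_system((0.3, 1.0)))   # 0.0: an edge state that is not pure
```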

6.1. Concavity

In this section, we consider the concavity properties of $S_1$, $S_2$ and $S_3$. It turns
out that $S_1$ is concave on $S$ in any GPT, while there exist GPT models where $S_2$
and $S_3$ are not concave.

PROPOSITION 36. In any GPT, $S_1$ is concave on $S$.

Proof: Let $(p_x)_x$ ($x = 1, \ldots, m$) be a probability distribution and let $s_x \in S$ ($x =
1, \ldots, m$). Then, from the affinity of the effects $m_j$ and the concavity of the Shannon
entropy, we have
$$S_1\Big(\sum_x p_x s_x\Big) = \inf_{M=(m_j)_j \in \mathcal{M}_{\mathrm{ind}}} H\Big(m_j\Big(\sum_x p_x s_x\Big)\Big)
\ge \inf_{M \in \mathcal{M}_{\mathrm{ind}}} \sum_x p_x H(m_j(s_x))
\ge \sum_x p_x \inf_{M \in \mathcal{M}_{\mathrm{ind}}} H(m_j(s_x)) = \sum_x p_x S_1(s_x). \qquad (19)$$
□
In contrast to $S_1$, $S_2$ and $S_3$ are not concave in some GPTs. It is easy to give
counterexamples; indeed, it is obvious from Figs. 1(2) and (3) that concavity does not
hold in the squared system.
Instead of concavity, we show that $S_2$ satisfies the following weak concavities.

PROPOSITION 37 (Weak concavity). In any GPT, $S_2$ satisfies the following:
$$S_2\Big(\sum_x p_x s_x\Big) \ge \frac{\sum_x p_x^2 S_2^2(s_x)}{\sum_x p_x S_2(s_x)}, \qquad (20)$$
$$S_2\Big(\sum_x p_x s_x\Big) \ge \sum_x p_x^2 S_2(s_x), \qquad (21)$$
$$S_2\Big(\sum_x p_x s_x\Big) \ge \frac{1}{|X|} \sum_x p_x S_2(s_x), \qquad (22)$$
$$S_2\Big(\sum_x p_x s_x\Big) \ge \max_x p_x S_2(s_x), \qquad (23)$$
for any $\{p_x, s_x\}_{x \in X} \in \mathcal{D}(s)$ (in (20), we interpret the right-hand side as 0 if the $S_2(s_x)$
are all 0).
Proof: To prove this proposition, we use the following lemma (see Appendix A
for the proof).

LEMMA 38. Let $\{p_x, s_x\}_{x \in X} \in \mathcal{D}(s)$, $p_x \ne 0$. Then for any values $\pi_x \ge 0$, $x \in X$,
such that $\sum_x \pi_x = 1$, we have
$$S_2\Big(\sum_x p_x s_x\Big) \ge \sum_x \pi_x p_x S_2(s_x).$$

Using this lemma, (20) is proved by putting $\pi_x = p_x S_2(s_x) / \big(\sum_{x'} p_{x'} S_2(s_{x'})\big)$
(here we may assume that the denominator of $\pi_x$ is nonzero, as otherwise the
claim is obvious); (21) is proved by putting $\pi_x = p_x$; (22) is proved by putting
$\pi_x = 1/|X|$; and (23) is proved by putting $\pi_x = \delta_{x x_0}$, where $x_0 \in X$ is such that
$p_{x_0} S_2(s_{x_0}) = \max_x p_x S_2(s_x)$. □

PROPOSITION 39. In any GPT, $S_3$ satisfies
$$S_3\Big(\sum_x p_x s_x\Big) \le H(p_x) + \sum_x p_x S_3(s_x)$$
for any $\{p_x, s_x\} \in \mathcal{D}\big(\sum_x p_x s_x\big)$.

Proof: First, we consider the special case where, for every state, there exists a pure
state decomposition attaining the infimum in the definition of $S_3$. Let $s := \sum_x p_x s_x$
and let $\{p_y^x; s_y^x\}_y \in \mathcal{P}(s_x)$ be an optimal decomposition of $s_x$, such that
$S_3(s_x) = H(p_y^x)$. Since $\{p_x p_y^x\}_{x,y} \in \mathcal{P}(s)$, we have
$S_3(s) \le H(X, Y) := -\sum_{x,y} p_x p_y^x \log(p_x p_y^x) = -\sum_x p_x \log p_x + \sum_x p_x \big(-\sum_y p_y^x \log p_y^x\big) =
H(p_x) + \sum_x p_x S_3(s_x)$.
In the general case, for any $\varepsilon > 0$ there exists a pure state decomposition such
that $|S_3(s_x) - H(p_y^x)| < \varepsilon$, and the above argument is easily restated; taking $\varepsilon \to 0$
completes the proof. □

Thus, $S_3$ satisfies the same upper bound (10) as the von Neumann entropy in
any GPT.

6.2. Measures of pureness


Since both the Shannon entropy and the von Neumann entropy vanish if and
only if the state is pure, they can be considered as measures of pureness. (Note
also that they take the maximum value iff the state is the maximally mixed state.)
We show that $S_2$ and $S_3$ have this desired property in any GPT, while $S_1$ does
not satisfy it in general.

PROPOSITION 40. In any GPT, (i) $S_2(s) = 0$ if and only if $s$ is pure; (ii)
$S_3(s) = 0$ if and only if $s$ is pure.
Proof: (i) Let $s \in S$ be a pure state. Since $s$ is an extreme point of $S$, $\mathcal{P}(s)$
has essentially a unique (trivial) decomposition $\{1; s\}$, with $H(X) = 0$. Thus, we
have
$$S_2(s) = \sup_{M=(m_j)_j \in \mathcal{M}} H(X : J) = -\inf_{M} H(X|J) \le 0,$$
and hence $S_2(s) = 0$ by positivity. To see the converse, let $S_2(s) = 0$ for $s \in S$ and let
$s = p_1 s_1 + p_2 s_2$, where $p_1 \in (0, 1)$, $p_1 + p_2 = 1$ and $s_1, s_2 \in S$. Since $S_2(s) = 0$ and
$\{p_x; s_x\}_{x=1,2} \in \mathcal{D}(s)$, we have
$$H(X : J) = 0$$
for the random variable $X = 1, 2$ and for any $M = (m_j)_j \in \mathcal{M}$. This implies that
the joint probability $p(x, j) := p_x m_j(s_x)$ is a product distribution, or equivalently, the
conditional probability $p(j|x) := p(x, j)/p_x = m_j(s_x)$ is independent of $x$ (notice
that $p_1, p_2 \ne 0$). In particular, we have $m_j(s_1) = p(j|1) = p(j|2) = m_j(s_2)$. Since
this holds for any effect $m_j$, we have $s_1 = s_2$ from the separating property of states.
Therefore, $s$ has only the trivial decomposition and is a pure state.
(ii) Let $s \in S$ be a pure state; then $\mathcal{P}(s)$ has essentially the unique (trivial)
decomposition $\{1; s\}$, for which $H(X) = 0$. Thus we have $S_3(s) = H(X) = 0$.
Conversely, let $S_3(s) = 0$, so that there is a decomposition $\{p_x; s_x\}_x \in \mathcal{P}(s)$ with $H(X) = 0$.
Assume that $s$ is not a pure state. Then we have $p_{x_1}, p_{x_2} > 0$ for some distinct
$x_1, x_2 \in X$. However, this contradicts $H(X) = 0$. Therefore, $s$ is a pure
state. □
In contrast to $S_2$ and $S_3$, $S_1$ does not have this property. For instance, from
(16), $S_1(s) = 0$ for any state $s$ on the boundary (the four edges) of $S_{sq}$ (see Fig. 1(1);
note that a state on an edge but not at a vertex is not a pure state). For general GPTs, we
show the following.

PROPOSITION 41. In any GPT, $S_1(s) = 0$ implies that $s$ is on the boundary
of $S$.
Proof: It suffices to consider the case when $S$ has at least two states. To prove
this proposition, we use the following two lemmata (see Appendix A for the proofs).

LEMMA 42. Let $k, \ell \ge 1$ be integers. Let $h(x) = -x \log x$. If $x_1, \ldots, x_\ell \in [0, 1/k]$
and $\sum_j x_j = 1$, then $H(x) = \sum_j h(x_j) \ge \log k$.

LEMMA 43. For any $s \in S$, the map $f_s : \mathcal{E}(S) \times S \to \mathbb{R}$, $f_s(e, t) = e(s) - e(t)$,
is continuous.

Let $s \in S$ be such that $S_1(s) = 0$. First we show that $\sup_{(e,t)} f_s(e, t) = 1$. Let
$k \ge 2$ be any integer. Since $S_1(s) = 0$, there is an indecomposable measurement $M =
(m_i)_i \in \mathcal{M}_{\mathrm{ind}}$ such that $H(m_i(s)) < h(1/k)\ (< \log k)$. Then we have $m_i(s) \ge 1 - 1/k$
for some $i$, as otherwise we obtain a contradiction as follows: if $m_{i_0}(s) \in (1/k, 1 - 1/k)$
for some $i_0$, then we have $H(m_i(s)) \ge h(m_{i_0}(s)) > h(1/k)$; while if $m_i(s) \le 1/k$
for all $i$, then we have $H(m_i(s)) \ge \log k$ by Lemma 42. For this $m_i$, Lemma
26 implies that there is a state $t \in S$ such that $m_i(t) = 0$. This implies that
$f_s(m_i, t) \ge 1 - 1/k$. Since $k \ge 2$ is arbitrary, we have $\sup_{(e,t)} f_s(e, t) = 1$. Since
$\mathcal{E}(S) \times S$ is compact, Lemma 43 implies that $f_s(e, t) = 1$ for some $e \in \mathcal{E}(S)$ and
$t \in S$, therefore $e(s) = 1$ and $e(t) = 0$. This implies that $e$ is not constant and $s$ lies
in a supporting hyperplane of $S$, hence $s$ is on the boundary of $S$, as desired. □
Note that, in $S_{sq}$, the converse is also true: all states $s$ on the boundary satisfy
$S_1(s) = 0$. However, this is not the case for every GPT. In particular, one can
construct a GPT where $S_1(s) \ne 0$ even for a pure state $s$. For instance, consider
the GPT introduced in Appendix B with the state space $S \subset \mathbb{R}^2$, which has the
four pure states $(0,0)$, $(1,0)$, $(0,1)$ and $(2,2)$. Then any indecomposable effect
in $S$ is of the form $\lambda e_i$ such that $0 < \lambda \le 1$ and $e_i$ is one of the four effects
in Table 2 in Appendix B. This implies that for any indecomposable measurement
$M = (m_i)_i \in \mathcal{M}_{\mathrm{ind}}$ we have $m_i(0, 0) \le 2/3$ for all $i$, therefore $S_1(0, 0) > 0$ (see
Lemma 42). Thus, for general GPTs, neither direction of "$s$ is pure $\Leftrightarrow S_1(s) = 0$"
holds in general. In the next section, we consider a class of GPTs with a fairly
nice property.

7. Principle of equality for pure states and symmetric GPT


In the last part of the preceding section, we considered a GPT where for some
pure state $s$ it holds that $S_1(s) > 0$ (see the GPT in Appendix B). However, the
structure of its state space is asymmetric, and it might be just a toy model among GPTs.
On the other hand, both classical and quantum systems have a certain symmetric
structure: in particular, there are no special pure states which have
properties different from the other pure states. We call this the principle of equality
for pure states, and it can be formulated as follows.

DEFINITION 44 (Equality for pure states). We say that a GPT satisfies the principle
of equality for pure states if, for any pure states $s_1, s_2 \in S$, there exists a bijective
affine map $f$ on $S$ such that $s_2 = f(s_1)$. A GPT satisfying this property is called
a symmetric GPT.

It is easy to see the following.

PROPOSITION 45. Classical, quantum, and hyper-cuboid systems are all symmetric.

In particular, notice that, in quantum systems, for any pure states $\rho_1 =
|\psi_1\rangle\langle\psi_1|$, $\rho_2 = |\psi_2\rangle\langle\psi_2|$,
there exists a unitary operator $U$ such that $\rho_2 = U \rho_1 U^\dagger$.
We show that $S_1$ vanishes on every pure state in a symmetric GPT. To see this,
we first show the following lemma.

LEMMA 46. In any GPT, there exists a pure state $s$ such that $S_1(s) = 0$.

Proof: Let $e_1$ be an indecomposable and pure effect (see Proposition 21), and let
$u - e_1 = e_2 + \cdots + e_m$ be an indecomposable decomposition of $u - e_1$ (see Proposition
20). Then $M = (e_j)_{j=1}^m$ is an indecomposable measurement. From Proposition 18,
there exists a pure state $s$ such that $e_1(s) = 1$. Thus we have $H(e_j(s)) = 0$, and
$S_1(s) = 0$. □

PROPOSITION 47. Let S be the state space of a symmetric GPT. Then, S_1(s) = 0 for any pure state s.

Proof: From Lemma 46, there exists a pure state s_0 such that S_1(s_0) = 0. For any pure state s, there exists a bijective affine map f such that s_0 = f(s). Let M = (m_j)_j be an indecomposable measurement such that H(m_j(s_0)) = 0. Then, it is easy to see that M̃ := (m̃_j)_j, where m̃_j := m_j ∘ f, is an indecomposable measurement. Therefore, it follows that H(m̃_j(s)) = H(m_j(s_0)) = 0. Thus, we have proved that S_1(s) = 0 for any pure state s. □

Therefore, in a symmetric GPT, S_1 measures the pureness in some sense. However, as the squared GPT shows, the converse of Proposition 47 does not hold in general, even among symmetric GPTs.

8. Concluding remarks
We have discussed some distinguishability measures (especially, the Kolmogorov distance and the fidelity) in a general GPT. As in quantum information theory, these measures should prove convenient in constructing an information theory for GPTs. Indeed, we have reformulated the no-cloning theorem and the information-disturbance theorem using the fidelity.
We have also proposed and investigated three quantities related to entropies in a general GPT. All of them are generalizations of the Shannon and von Neumann entropies for classical and quantum systems, respectively. However, they are in general distinct quantities, as the squared system exemplifies. The concavity of S_1 holds in any GPT, while it fails for S_2 and S_3 in some GPTs. S_2 and S_3 provide a measure of pureness, while S_1 does not. However, in a symmetric GPT, which satisfies the principle of equality for pure states, it follows that S_1(s) = 0 for any pure state s. In the attempt to find principles of our world, which is described by a quantum system at least for the present, we think it is sufficient to consider symmetric GPTs, i.e. to assume the principle of equality for pure states. However, let us remark here that both classical and quantum systems satisfy a stronger principle, which we call strong equality for pure states, or equality for distinguishable pure states, and which can be formulated as follows.

DEFINITION 48. We say that a GPT satisfies the principle of strong equality for pure states if it satisfies the following: let {s_i ∈ S_pure}_{i=1}^m and {t_i ∈ S_pure}_{i=1}^n (with n ≥ m) be two distinguishable sets of pure states, i.e. there exists a measurement M = (m_i)_i (resp. N = (n_i)_i) such that m_i(s_j) = δ_ij (resp. n_i(t_j) = δ_ij). Then, there exists a bijective affine map f on S such that t_i = f(s_i) (i = 1, ..., m).

Notice that the squared GPT is symmetric but does not satisfy this strong equality for pure states. (For instance, consider {(0,0), (0,1)} and {(0,0), (1,1)}; a check of this example is sketched below.) It might be interesting to consider these kinds of stronger conditions which classical and quantum systems satisfy. In particular, we do not know any principle which makes the converse of Proposition 47 true.
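To make the counterexample concrete, the following minimal Python sketch (our own illustration, not part of the original argument) enumerates the affine bijections of the square state space [0,1]² onto itself; assuming these are exactly the eight dihedral symmetries of the square, it checks that none of them maps the distinguishable pair ((0,0), (0,1)) onto ((0,0), (1,1)) in either order.

    # Minimal sketch (illustration only): the affine bijections of the unit
    # square onto itself are assumed to be its 8 dihedral symmetries.
    import itertools

    def symmetries():
        # linear parts of the dihedral group D4, acting on the square [0,1]^2
        mats = [((1,0),(0,1)), ((0,-1),(1,0)), ((-1,0),(0,-1)), ((0,1),(-1,0)),
                ((1,0),(0,-1)), ((-1,0),(0,1)), ((0,1),(1,0)), ((0,-1),(-1,0))]
        for a, b in mats:
            # translate so that the image of the square is again [0,1]^2
            def f(p, a=a, b=b):
                x = a[0]*p[0] + a[1]*p[1]
                y = b[0]*p[0] + b[1]*p[1]
                return (x - min(0, a[0]+a[1]), y - min(0, b[0]+b[1]))
            yield f

    s_pair = ((0, 0), (0, 1))   # adjacent pure states
    t_pair = ((0, 0), (1, 1))   # diagonal pure states
    hit = any(tuple(map(f, s_pair)) == perm
              for f in symmetries()
              for perm in itertools.permutations(t_pair))
    print("some symmetry maps the first pair onto the second:", hit)  # expected: False

The check succeeds because a symmetry of the square preserves adjacency of vertices, while the two pairs above have different adjacency types.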

Acknowledgment
We would like to thank Dr. Imafuku and Dr. Miyadera for useful comments and discussions. This work was partially supported by Grant-in-Aid for Young Scientists (B) (No. 20700017 and No. 22740079) of the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan.

Note added. Related but independent works for entropies in GPT have appeared
recently in [31, 32] while we were completing this paper. We will investigate the
relations between our work and the results there in the near future.

A. Proofs of some propositions


Proof of Proposition 2: Notice that for any measurement M = {m_i} ∈ M(S') and any affine map A ∈ A(S, S'), we have another measurement N = {n_i} ∈ M(S), where n_i := m_i ∘ A. Let M = {m_i} ∈ M(S') be an optimal measurement which attains the maximum,

D(A(s_1), A(s_2)) = (1/2) Σ_i |m_i(A(s_1)) − m_i(A(s_2))|.

Then, we have

D(A(s_1), A(s_2)) = (1/2) Σ_i |n_i(s_1) − n_i(s_2)| ≤ D(s_1, s_2). □

Proof of Proposition 3: Let M = {m_i} ∈ M be a measurement which satisfies

D(Σ_i p_i s_i, Σ_i q_i t_i) = (1/2) Σ_i |Σ_j p_j m_i(s_j) − Σ_j q_j m_i(t_j)|.

Then, we have

D(Σ_i p_i s_i, Σ_i q_i t_i) = (1/2) Σ_i |Σ_j p_j m_i(s_j) − Σ_j q_j m_i(t_j)|
≤ (1/2) Σ_i Σ_j p_j |m_i(s_j) − m_i(t_j)| + (1/2) Σ_i Σ_j |p_j − q_j| m_i(t_j)
≤ Σ_j p_j D(s_j, t_j) + D_C(p_i, q_i),

where we have used (i) the affinity of m_i, (ii) the triangle inequality of |·|, and (iii) Σ_i m_i = u. □
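As a quick sanity check of this bound in the classical case (where states are probability vectors and the optimal measurement is the direct readout of the outcome), the following Python sketch (our own illustration; all names are ours) samples random ensembles and verifies the inequality D(Σ p_i s_i, Σ q_i t_i) ≤ Σ_i p_i D(s_i, t_i) + D_C(p, q) numerically.

    # Sanity check (classical case): D is half the l1-distance between
    # probability vectors; the bound of Proposition 3 should hold numerically.
    import random

    def D(a, b):
        return 0.5 * sum(abs(x - y) for x, y in zip(a, b))

    def rand_dist(n):
        w = [random.random() for _ in range(n)]
        s = sum(w)
        return [x / s for x in w]

    random.seed(0)
    for _ in range(1000):
        m, d = 4, 5                      # ensemble size, outcome-space size
        p, q = rand_dist(m), rand_dist(m)
        s = [rand_dist(d) for _ in range(m)]
        t = [rand_dist(d) for _ in range(m)]
        mix_s = [sum(p[i] * s[i][x] for i in range(m)) for x in range(d)]
        mix_t = [sum(q[i] * t[i][x] for i in range(m)) for x in range(d)]
        lhs = D(mix_s, mix_t)
        rhs = sum(p[i] * D(s[i], t[i]) for i in range(m)) + D(p, q)
        assert lhs <= rhs + 1e-12
    print("Proposition 3 bound verified on random classical ensembles")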
Proof of Proposition 6: The same proof as for Proposition 2 can be used. □
Proof of Proposition 7: Let M = (m_k)_k be an optimal measurement which attains the minimum,

F(Σ_i p_i s_i, Σ_i q_i t_i) = Σ_k √( m_k(Σ_i p_i s_i) m_k(Σ_i q_i t_i) ).

Using the affinity of m_k and the Schwarz inequality between the vectors (√(p_i m_k(s_i)))_i and (√(q_i m_k(t_i)))_i, one gets

F(Σ_i p_i s_i, Σ_i q_i t_i) = Σ_k √( Σ_i p_i m_k(s_i) ) √( Σ_j q_j m_k(t_j) )
≥ Σ_k Σ_i √( p_i m_k(s_i) q_i m_k(t_i) ) ≥ Σ_i √(p_i q_i) F(s_i, t_i). □

Proof of Proposition 9: (i) Let s_1, s_2 ∈ S_A, t ∈ S_B, and let M = (m_i)_i ∈ M(S_A) be an optimal measurement such that F(s_1, s_2) = Σ_i √( m_i(s_1) m_i(s_2) ). Then, f_i = m_i ⊗ u_B gives a measurement F := (f_i)_i ∈ M(S_A ⊗ S_B) and F(s_1, s_2) = Σ_i √( f_i(s_1 ⊗ t) f_i(s_2 ⊗ t) ) ≥ F(s_1 ⊗ t, s_2 ⊗ t). Conversely, let G = (g_i)_i ∈ M(S_A ⊗ S_B) be an optimal measurement such that F(s_1 ⊗ t, s_2 ⊗ t) = Σ_i √( g_i(s_1 ⊗ t) g_i(s_2 ⊗ t) ). Then, m_i(s) := g_i(s ⊗ t), ∀s ∈ S_A, gives a measurement M = (m_i)_i ∈ M(S_A). Thus, F(s_1 ⊗ t, s_2 ⊗ t) = Σ_i √( m_i(s_1) m_i(s_2) ) ≥ F(s_1, s_2).
(ii) Let s, t ∈ S_A ⊗ S_B and let s_A, t_A be the reduced states of s, t on the system A. Let M = (m_i)_i ∈ M(S_A) be an optimal measurement such that F(s_A, t_A) = Σ_i √( m_i(s_A) m_i(t_A) ). By noting that g_i = m_i ⊗ u_B gives a measurement on S_A ⊗ S_B and g_i(s) = m_i(s_A), g_i(t) = m_i(t_A), one has F(s_A, t_A) = Σ_i √( g_i(s) g_i(t) ) ≥ F(s, t).
(iii) Let s_1, s_2 ∈ S_A, t_1, t_2 ∈ S_B, and let M = (m_i)_i ∈ M(S_A) and N = (n_j)_j ∈ M(S_B) be optimal measurements such that F(s_1, s_2) = Σ_i √( m_i(s_1) m_i(s_2) ) and F(t_1, t_2) = Σ_j √( n_j(t_1) n_j(t_2) ), respectively. Then, g_ij = m_i ⊗ n_j ∈ E(S_A ⊗ S_B) gives a measurement G = (g_ij)_ij ∈ M(S_A ⊗ S_B), and

F(s_1, s_2) F(t_1, t_2) = Σ_ij √( g_ij(s_1 ⊗ t_1) g_ij(s_2 ⊗ t_2) ) ≥ F(s_1 ⊗ t_1, s_2 ⊗ t_2). □
Proof of Proposition 10: The proof is essentially the same as in [25]. Let M = {m_i} be an optimal measurement which satisfies F(s, t) = Σ_i √(p_i q_i), where p_i = m_i(s), q_i = m_i(t). It follows that Σ_i (√p_i − √q_i)² = Σ_i p_i + Σ_i q_i − 2 Σ_i √(p_i q_i) = 2(1 − F(s, t)). Noting that |√p_i − √q_i| ≤ |√p_i + √q_i|, we have Σ_i (√p_i − √q_i)² = Σ_i |√p_i − √q_i| |√p_i − √q_i| ≤ Σ_i |p_i − q_i| ≤ 2 D(s, t). Next, let N = {n_i} ∈ M be an optimal measurement which satisfies D(s, t) = (1/2) Σ_i |r_i − s_i|, where r_i = n_i(s), s_i = n_i(t). Then, we have

D(s, t)² = (1/4) (Σ_i |r_i − s_i|)² = (1/4) (Σ_i |√r_i − √s_i| |√r_i + √s_i|)²
≤ (1/4) (Σ_i (√r_i − √s_i)²) (Σ_i (√r_i + √s_i)²)
= (1/4) (Σ_i r_i + Σ_i s_i − 2 Σ_i √(r_i s_i)) (Σ_i r_i + Σ_i s_i + 2 Σ_i √(r_i s_i))
= (1 − Σ_i √(r_i s_i)) (1 + Σ_i √(r_i s_i)) = 1 − (Σ_i √(r_i s_i))² ≤ 1 − F(s, t)²,

where we have used the Schwarz inequality. □
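For intuition, the resulting inequalities 1 − F(s, t) ≤ D(s, t) ≤ √(1 − F(s, t)²) can be checked numerically in the classical case, where D and F reduce to the total-variation distance and the Bhattacharyya coefficient of two probability vectors. The sketch below (our own illustration) does this for random vectors.

    # Numerical check of 1 - F <= D <= sqrt(1 - F^2) for probability vectors
    # (the classical specialization of Proposition 10).
    import math, random

    def D(p, q):
        return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

    def F(p, q):  # Bhattacharyya coefficient
        return sum(math.sqrt(a * b) for a, b in zip(p, q))

    def rand_dist(n):
        w = [random.random() for _ in range(n)]
        s = sum(w)
        return [x / s for x in w]

    random.seed(1)
    for _ in range(1000):
        p, q = rand_dist(6), rand_dist(6)
        d, f = D(p, q), F(p, q)
        assert 1 - f <= d + 1e-12
        assert d <= math.sqrt(max(0.0, 1 - f * f)) + 1e-12
    print("1 - F <= D <= sqrt(1 - F^2) verified on random distributions")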
Proof of Proposition 20: In the proof we use some terminology from convex geometry. We say that a subset C of a finite-dimensional Euclidean space ℝ^N is a cone if v ∈ C and λ ≥ 0 imply λv ∈ C (hence 0 ∈ C). We say that a closed convex cone C is pointed if C ∩ (−C) = {0}. In the proof of Proposition 20, we use the following fact for pointed cones.

LEMMA 49 ([29, Theorem 3.3.15]). A closed convex cone C ⊂ ℝ^N is pointed if and only if there is a linear functional f on ℝ^N such that C' = {v ∈ C | f(v) = 1} is compact and satisfies C = {λv | v ∈ C', λ ≥ 0}.
We continue the proof of Proposition 20. Put N = dim(S) < ∞ and let S ⊂ V = ℝ^N (recall that now S is finite-dimensional). Choose s_1, ..., s_{N+1} ∈ S such that V is the affine hull of these N + 1 points. Then any affine functional on S extends to a unique affine functional on V, therefore the set Aff⁺ of all nonnegative affine functionals f on S can be embedded in V' = ℝ^{N+1}, where the i-th coordinate signifies the value at s_i. Now the embedded image of Aff⁺ in V' is a pointed closed convex cone, where the closedness follows since the elements f of Aff⁺ are characterized by closed relations among the values of f at the points s_i. Thus, by Lemma 49, there exists a linear functional g on V' such that C = {e ∈ Aff⁺ | g(e) = 1} is compact and satisfies Aff⁺ = {λe | e ∈ C, λ ≥ 0}. Note that C is convex by definition.
Let 0 ≠ e ∈ E(S). Then we have λe ∈ C for some λ > 0 by the property of C. Since C is compact and convex, the Krein–Milman theorem implies that λe can be written as a finite convex combination λe = Σ_x p_x e_x of extreme points e_x of C. Since S is compact, by taking a sufficiently small μ > 0 it follows that ê_x := μ e_x ∈ E(S) for every x. Moreover, choose an integer k > 0 such that kλμ ≥ 1. Then we have a decomposition

e = k Σ_x ( p_x / (kλμ) ) ê_x

of e into a finite collection of effects (note that 0 ≤ p_x/(kλμ) ≤ 1).
Our remaining task is to show that each q_x ê_x, where q_x = p_x/(kλμ), is an indecomposable effect provided q_x > 0. Let q_x ê_x = e' + e'' with e', e'' ∈ E(S), e', e'' ≠ 0. Then we have e_x = (q_x μ)^{-1} (e' + e''). By the property of C, there exist ν', ν'' > 0 and ē', ē'' ∈ C such that e' = ν' ē' and e'' = ν'' ē''. We have e_x = η' ē' + η'' ē'', where η' = (q_x μ)^{-1} ν' > 0 and η'' = (q_x μ)^{-1} ν'' > 0. Moreover, by the definition of C, we have

1 = g(e_x) = η' g(ē') + η'' g(ē'') = η' + η''.

Since e_x is an extreme point of C, it follows that e_x = ē' = ē'', therefore e' = ν' e_x = (ν'/(μ q_x)) q_x ê_x. Hence q_x ê_x is indecomposable as desired, concluding the proof of Proposition 20. □
Proof of Proposition 22: It is easy to show that ê is an effect. Let ê = e_1 + e_2 be an effect decomposition of ê. Then, e = q e_1 + q e_2 is an effect decomposition of e since q ≤ 1. Since e is indecomposable, there exists c ∈ ℝ such that q e_1 = c e, i.e. e_1 = c ê. Thus, ê is indecomposable. □
Proof of Proposition 23: Let e = λ e_1 + (1 − λ) e_2 be a convex decomposition of e with λ ∈ (0, 1). It is easy to see that e_1(s) = e_2(s) = 1. Since λ e_1, (1 − λ) e_2 ∈ E and e is indecomposable, we have λ e_1 = c e for some c ∈ ℝ. Applying this to s, we have λ = c, and thus e_1 = e. Therefore, e is a pure effect. □
Proof of Lemma 26: First we show that an indecomposable e is not constant on S. Since S has at least two states, the separation property of states implies that a nonconstant effect f ∈ E(S) exists. If e is constantly equal to c ∈ (0, 1], then the decomposition e = c f + c (u − f) contradicts the indecomposability of e. Hence e is not constant. Second, if e does not take 0 at any state, then we have e(s) ≥ c for some c > 0 and all s ∈ S, since e is continuous and S is compact. Now the decomposition e = c u + (e − c u) contradicts the indecomposability of e. Hence e takes 0 at some state. □
Proof of Proposition 34: First we compute S_1(s) for s = (c_1, c_2) ∈ S_sq. Let M = (m_i)_i be an indecomposable measurement. To compute S_1, it suffices to consider the case when M contains at most one effect m_i of each of the four types listed in Table 1; indeed, if m_{i_1} and m_{i_2} are of the same type (i.e. m_{i_2} = λ m_{i_1} for some λ ∈ ℝ), then by replacing the pair m_{i_1}, m_{i_2} with m_{i_1} + m_{i_2} the value of H(m_i(s)) does not increase. Thus we may assume without loss of generality that M consists of the four effects in Table 1 with parameters a_1 = a_2 = α, a_3 = a_4 = β := 1 − α for some α ∈ [0, 1]. Now, by putting g(x) = −x log x we have

H(m_i(s)) = g(α c_1) + g(α(1 − c_1)) + g(β c_2) + g(β(1 − c_2))
= g(α) + α h(c_1) + g(β) + β h(c_2)
= h(α) + α h(c_1) + (1 − α) h(c_2).

Since the right-hand side is concave in α on [0, 1], it takes its minimum at either α = 0 or α = 1, hence we have S_1(s) = min[h(c_1), h(c_2)], as desired.
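As a quick numerical illustration of this step (ours, not part of the original proof), the sketch below minimizes h(α) + α h(c_1) + (1 − α) h(c_2) over a grid of α and compares the result with min[h(c_1), h(c_2)] for random states of the squared GPT.

    # Numerical illustration: the concave function h(a) + a*h(c1) + (1-a)*h(c2)
    # attains its minimum over a in [0,1] at an endpoint, so S_1 = min[h(c1), h(c2)].
    import math, random

    def h(x):  # binary entropy (natural log), with h(0) = h(1) = 0
        return 0.0 if x in (0.0, 1.0) else -x * math.log(x) - (1 - x) * math.log(1 - x)

    random.seed(2)
    for _ in range(200):
        c1, c2 = random.random(), random.random()
        grid_min = min(h(a) + a * h(c1) + (1 - a) * h(c2)
                       for a in [i / 1000 for i in range(1001)])
        assert abs(grid_min - min(h(c1), h(c2))) < 1e-6
    print("S_1(c1, c2) = min[h(c1), h(c2)] confirmed on a grid")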
Second, we compute S_2(s) for s = (c_1, c_2) ∈ S_sq. Let {p_x, s_x}_{x∈X} ∈ P(s) with s_x = (c_{x,1}, c_{x,2}), and let M = (m_j)_j ∈ M_ind. Again, it suffices to consider the case when M contains at most one effect m_j of each of the four types listed in Table 1; indeed, if m_{j_1} and m_{j_2} are of the same type (in the above sense), then by replacing the pair m_{j_1}, m_{j_2} with m_{j_1} + m_{j_2} the value of H(X : J) does not change. Thus we may assume without loss of generality that M consists of the four effects in Table 1 with parameters a_1 = a_2 = α, a_3 = a_4 = β := 1 − α for some α ∈ [0, 1]. Now a direct calculation implies that

H(X : J) = h(α) + α h(c_1) + β h(c_2) − Σ_{x∈X} p_x ( h(α) + α h(c_{x,1}) + β h(c_{x,2}) )
= α ( h(c_1) − Σ_x p_x h(c_{x,1}) ) + β ( h(c_2) − Σ_x p_x h(c_{x,2}) ).

Since all the pure states (c_{x,1}, c_{x,2}) in S_sq satisfy c_{x,1} ∈ {0, 1} and c_{x,2} ∈ {0, 1}, we have H(X : J) = α h(c_1) + β h(c_2) = α h(c_1) + (1 − α) h(c_2), which is independent of the given decomposition {p_x, s_x}_x of s. This implies that S_2(s) = sup_{X,J} H(X : J) = max[h(c_1), h(c_2)], as desired.
Finally, we compute S_3(s) for s = (c_1, c_2) ∈ S_sq. For a reason similar to the case of S_1, to compute S_3(s) it suffices to consider a decomposition {p_x, s_x}_{x∈X} ∈ P(s) such that all s_x are distinct pure states. Thus we may assume that X = {00, 01, 10, 11}, s_00 = (0, 0), s_01 = (0, 1), s_10 = (1, 0) and s_11 = (1, 1). Now by putting p_11 = p we have

p_10 = c_1 − p,   p_01 = c_2 − p,   p_00 = 1 − c_1 − c_2 + p.

In the above expression, we have p_x ∈ [0, 1] for every x if and only if p_m ≤ p ≤ p_M, where

p_m = max[0, c_1 + c_2 − 1],   p_M = min[c_1, c_2].

Hence we have S_3(s) = inf_{p_m ≤ p ≤ p_M} H(p_x). Now a direct calculation shows that

(d/dp)² H(p_x) = − Σ_{x∈X} 1/p_x < 0

for any p ∈ (p_m, p_M), therefore H(p_x) takes its minimum at either p = p_m or p = p_M: S_3(s) = min[ H(p_x)|_{p=p_m}, H(p_x)|_{p=p_M} ].
First we consider the case that c_1 ≤ c_2 and c_1 + c_2 ≤ 1 (i.e. s ∈ R2U or s ∈ R2B), therefore p_m = 0 and p_M = c_1. If p = p_m then we have (p_x)_x = (1 − c_1 − c_2, c_2, c_1, 0), while if p = p_M then we have (p_x)_x = (1 − c_2, c_2 − c_1, 0, c_1). This implies that

diff_{M−m} := H(p_x)|_{p=p_M} − H(p_x)|_{p=p_m}
= g(c_2 − c_1) + g(1 − c_2) − g(c_2) − g(1 − c_1 − c_2),

where g(a) = −a log a, therefore

(∂/∂c_2) diff_{M−m} = log [ (1 − c_2) c_2 / ( (c_2 − c_1)(1 − c_1 − c_2) ) ],

which is nonnegative by the conditions on c_1 and c_2. Since diff_{M−m} = 0 when c_2 = 1/2, it follows that diff_{M−m} ≤ 0 and S_3(s) = H(p_x)|_{p=p_M} when 0 ≤ c_2 ≤ 1/2 (i.e. s ∈ R2B), and diff_{M−m} ≥ 0 and S_3(s) = H(p_x)|_{p=p_m} when 1/2 ≤ c_2 ≤ 1 (i.e. s ∈ R2U). Hence the expressions for S_3(s) in (18) for s ∈ R2U and s ∈ R2B are proved. The claim for the remaining cases follows by considering suitable symmetries of the state space S_sq. □
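The endpoint argument can again be checked numerically. The following sketch (our own illustration) scans p over [p_m, p_M] and confirms that the minimum of H((p_x)_x) is attained at an endpoint, for random states (c_1, c_2).

    # Numerical check: H((p_x)) is concave in p, so its minimum over [p_m, p_M]
    # is attained at an endpoint; this gives the closed form of S_3 used above.
    import math, random

    def g(x):
        return 0.0 if x <= 0.0 else -x * math.log(x)

    def H4(c1, c2, p):  # entropy of (p00, p01, p10, p11)
        return g(1 - c1 - c2 + p) + g(c2 - p) + g(c1 - p) + g(p)

    random.seed(3)
    for _ in range(200):
        c1, c2 = random.random(), random.random()
        pm, pM = max(0.0, c1 + c2 - 1.0), min(c1, c2)
        grid = [pm + (pM - pm) * i / 1000 for i in range(1001)]
        scan_min = min(H4(c1, c2, p) for p in grid)
        endpoint_min = min(H4(c1, c2, pm), H4(c1, c2, pM))
        assert scan_min >= endpoint_min - 1e-9
    print("S_3 is attained at p = p_m or p = p_M for random states")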

Proof of Proposition 35: The first inequality S_1(s) ≤ S_2(s) is obvious by (16) and (17). For the second inequality S_2(s) ≤ S_3(s), by symmetry, we may assume without loss of generality that s = (c_1, c_2) ∈ R2U, i.e. 1/2 ≤ c_2 ≤ 1 − c_1. This condition implies that h(c_1) ≤ h(c_2), therefore S_2(s) = h(c_2). On the other hand, (18) implies that S_3(s) = g(c_1) + g(c_2) + g(1 − c_1 − c_2), where g(x) = −x log x. Thus we have

S_3(s) − S_2(s) = g(c_1) + g(1 − c_1 − c_2) − g(1 − c_2),

which is a decreasing function of c_2 in this range, while S_3(s) − S_2(s) = 0 when c_2 = 1 − c_1. This implies that S_3(s) ≥ S_2(s) for any s ∈ R2U, hence the claim holds. □
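Combining the closed forms obtained in Proposition 34, the ordering S_1(s) ≤ S_2(s) ≤ S_3(s) on the squared GPT can also be confirmed numerically; the sketch below (ours, for illustration only) does this for random states.

    # Numerical check of S_1 <= S_2 <= S_3 on the squared GPT, using the
    # closed-form expressions derived in Proposition 34.
    import math, random

    def g(x):
        return 0.0 if x <= 0.0 else -x * math.log(x)

    def h(x):
        return g(x) + g(1.0 - x)

    def S3(c1, c2):
        pm, pM = max(0.0, c1 + c2 - 1.0), min(c1, c2)
        ent = lambda p: g(1 - c1 - c2 + p) + g(c2 - p) + g(c1 - p) + g(p)
        return min(ent(pm), ent(pM))

    random.seed(4)
    for _ in range(1000):
        c1, c2 = random.random(), random.random()
        S1, S2 = min(h(c1), h(c2)), max(h(c1), h(c2))
        assert S1 <= S2 + 1e-12 and S2 <= S3(c1, c2) + 1e-9
    print("S_1 <= S_2 <= S_3 verified for random states of the squared GPT")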

Proof of Lemma 38: For each x ∈ X, let {q^x_y, t^x_y}_{y∈Y_x} ∈ D(s_x) and M_x = (m^x_j)_{j∈J_x} ∈ M. Then we have {p_x q^x_y, t^x_y}_{x∈X, y∈Y_x} ∈ D(s) and M' = (π_x m^x_j)_{x∈X, j∈J_x} ∈ M. Let Z = {(x, y) | x ∈ X, y ∈ Y_x} and K = {(x, j) | x ∈ X, j ∈ J_x} denote the index sets of these ensembles, respectively. Then, by putting h(a) = −a log a we have

H(Z : K) = H(K) − H(K | Z)
= Σ_{(x,j)∈K} h(π_x m^x_j(s)) − Σ_{(x,j)∈K} Σ_{(x',y)∈Z} p_{x'} q^{x'}_y h(π_x m^x_j(t^{x'}_y))
= Σ_{(x,j)} m^x_j(s) h(π_x) − Σ_{(x',y),(x,j)} p_{x'} q^{x'}_y m^x_j(t^{x'}_y) h(π_x)
  + Σ_{(x,j)} π_x ( h(m^x_j(s)) − Σ_{(x',y)} p_{x'} q^{x'}_y h(m^x_j(t^{x'}_y)) )
= ( 1 − Σ_{(x',y)} p_{x'} q^{x'}_y ) Σ_x h(π_x) + Σ_{(x,j)} π_x ( h(m^x_j(s)) − Σ_{(x',y)} p_{x'} q^{x'}_y h(m^x_j(t^{x'}_y)) )
= Σ_{(x,j)} π_x ( h(m^x_j(s)) − Σ_{(x',y)} p_{x'} q^{x'}_y h(m^x_j(t^{x'}_y)) ).

Since m^x_j(s) = p_x m^x_j(s_x) + Σ_{(x',y)∈Z; x'≠x} p_{x'} q^{x'}_y m^x_j(t^{x'}_y) and h(a) is concave on a ∈ (0, 1), we have h(m^x_j(s)) ≥ p_x h(m^x_j(s_x)) + Σ_{(x',y); x'≠x} p_{x'} q^{x'}_y h(m^x_j(t^{x'}_y)), therefore

H(Z : K) ≥ Σ_{(x,j)} π_x ( p_x h(m^x_j(s_x)) − Σ_{y∈Y_x} p_x q^x_y h(m^x_j(t^x_y)) )
= Σ_{x∈X} π_x p_x H(Y_x : J_x).

By taking the supremum over all {q^x_y, t^x_y}_{y∈Y_x} and M_x, x ∈ X (see Lemma 31), it follows that

S_2(s) ≥ sup H(Z : K) ≥ Σ_{x∈X} π_x p_x S_2(s_x).

Hence Lemma 38 holds. □
Proof of Lemma 42: We use induction on the number N of indices j such that x_j ∉ {0, 1/k}. The claim is trivial if N = 0, while it cannot happen that N = 1. We assume N ≥ 2, and 0 < x_1 ≤ x_2 < 1/k by symmetry. Now if x_1 + x_2 ≤ 1/k, then we have h(x_1) + h(x_2) ≥ h(x_1 + x_2), therefore H(x) ≥ H(y) where y = (x_1 + x_2, x_3, ..., x_l). On the other hand, if x_1 + x_2 > 1/k, then we have h(x_1) + h(x_2) ≥ h(x_1 + x_2 − 1/k) + h(1/k), therefore H(x) ≥ H(y) where y = (x_1 + x_2 − 1/k, 1/k, x_3, ..., x_l). In any case, we have H(y) ≥ log k by the induction hypothesis. Hence H(x) ≥ log k, as desired. □
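Lemma 42 states, roughly, that a probability vector whose entries are all at most 1/k has Shannon entropy at least log k. A small randomized check of this bound (our own illustration; the choice of k and of the vector length is ours) is given below.

    # Randomized check of the bound in Lemma 42: if a probability vector has all
    # entries <= 1/k, its Shannon entropy (natural log) is at least log k.
    import math, random

    def entropy(x):
        return sum(-p * math.log(p) for p in x if p > 0)

    random.seed(5)
    k, n, checked = 3, 8, 0
    while checked < 500:
        w = [random.random() for _ in range(n)]
        x = [v / sum(w) for v in w]
        if max(x) > 1.0 / k:      # rejection sampling: keep only admissible vectors
            continue
        assert entropy(x) >= math.log(k) - 1e-12
        checked += 1
    print("H(x) >= log k verified for", checked, "random vectors with entries <= 1/k")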
Proof of Lemma 43: Choose t_1, t_2, ..., t_n ∈ S (n = dim S + 1) such that these are affinely independent. Then any element t of S has a unique expression t = λ_1 t_1 + ··· + λ_n t_n with Σ_j λ_j = 1. Let φ : S → ℝ^n denote the map t ↦ (λ_1, ..., λ_n). Since S is a topological subspace of a finite-dimensional Euclidean space, S and φ(S) ⊂ ℝ^n are homeomorphic via φ. By identifying S with φ(S) in this way, the map f_s is written as f_s(e; λ_1, ..., λ_n) = λ_1 e(t_1) + ··· + λ_n e(t_n). This implies that f_s is continuous, since both (e; λ_1, ..., λ_n) ↦ e(t_j) and (e; λ_1, ..., λ_n) ↦ λ_j are continuous. □

B. GPT without complete measurements

Here we give an example of a GPT that has no complete measurements. First note that any nonzero extreme effect e takes 1 at some state, as otherwise we have a nontrivial expression e = c (c^{-1} e) + (1 − c) 0 of e as a convex combination of effects, where c = sup_{s∈S} e(s) ∈ (0, 1).
We consider a GPT with the state space S which is the convex hull of the four points (0, 0), (1, 0), (0, 1), (2, 2) in ℝ². Then, by the above observation and Lemma 26, each indecomposable extreme effect e takes 0 at an edge of S and takes 1 at some state (precisely, at the vertex of S farthest from that edge). Thus there are four indecomposable extreme effects in total, as listed in Table 2. Now it is obvious that no complete measurement exists in this GPT, since the sum of the values of indecomposable extreme effects at the state (0, 0) cannot be equal to 1.

Table 2. Indecomposable extreme effects in Appendix B

               value at
  effect   (0,0)   (1,0)   (0,1)   (2,2)
  e_1        0       0      1/2      1
  e_2        0      1/2      0       1
  e_3       2/3      0       1       0
  e_4       2/3      1       0       0
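To double-check Table 2, one can reconstruct each e_i as an affine functional on ℝ² from three of its vertex values and verify the fourth, and then observe that at (0,0) the four effects take only the values 0 and 2/3, so a sum of such values can never equal 1. A minimal Python sketch of this check (our own illustration, not part of the original appendix) follows.

    # Verify Table 2: each effect e(x, y) = a*x + b*y + c is affine, matches the
    # listed vertex values, and lies in [0, 1] on the state space (checking the
    # extreme points suffices by convexity).  At (0,0) the values are only 0 and
    # 2/3, so sums of such values are multiples of 2/3 and never equal 1.
    from fractions import Fraction as Fr

    vertices = [(0, 0), (1, 0), (0, 1), (2, 2)]
    table = {                      # values from Table 2, in vertex order
        "e1": [Fr(0), Fr(0), Fr(1, 2), Fr(1)],
        "e2": [Fr(0), Fr(1, 2), Fr(0), Fr(1)],
        "e3": [Fr(2, 3), Fr(0), Fr(1), Fr(0)],
        "e4": [Fr(2, 3), Fr(1), Fr(0), Fr(0)],
    }

    for name, vals in table.items():
        c = vals[0]                          # value at (0,0)
        a = vals[1] - c                      # slope in x, from the value at (1,0)
        b = vals[2] - c                      # slope in y, from the value at (0,1)
        e = lambda x, y, a=a, b=b, c=c: a * x + b * y + c
        assert all(e(*v) == w for v, w in zip(vertices, vals))   # affine, consistent
        assert all(0 <= e(*v) <= 1 for v in vertices)            # a valid effect on S
        assert max(e(*v) for v in vertices) == 1                 # takes the value 1
    print("Table 2 is consistent; values at (0,0):", [table[k][0] for k in table])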

REFERENCES

[1] G. Mackey: Mathematical Foundations of Quantum Mechanics, Dover 1963.
[2] H. Araki: Einführung in die axiomatische Quantenfeldtheorie, I, II (lecture notes distributed by Swiss Federal Institute of Technology, 1962); Mathematical Theory of Quantum Fields, Oxford University Press 1999.
[3] S. P. Gudder: Stochastic Methods in Quantum Mechanics, Dover 1979.
[4] M. Ozawa: Rep. Math. Phys. 18 (1980), 11.
[5] A. S. Holevo: Probabilistic and Statistical Aspects of Quantum Theory, North-Holland, Amsterdam 1982.
[6] G. Ludwig: Foundations of Quantum Mechanics I, II, Springer 1983.
[7] J. Barrett: arXiv:quant-ph/0508211.
[8] A. J. Short, S. Popescu and N. Gisin: Phys. Rev. A 73 (2006), 012101.
[9] H. Barnum, J. Barrett, M. Leifer and A. Wilce: Phys. Rev. Lett. 99 (2007), 240501; arXiv:quant-ph/0611295.
[10] G. M. D'Ariano: arXiv:0807.4383, to appear in "Philosophy of Quantum Information and Entanglement" (Cambridge University Press, Cambridge UK); G. Chiribella, G. M. D'Ariano and P. Perinotti: arXiv:0908.1583.
[11] G. Kimura, T. Miyadera and H. Imai: Phys. Rev. A 79 (2009), 062306.
[12] C. Zander and A. R. Plastino: EPL 86 (2009), 18004.
[13] K. Nuida, G. Kimura and T. Miyadera: arXiv:0906.5419.
[14] H. Barnum et al.: arXiv:0805.3553.
[15] E. B. Davies and J. T. Lewis: Commun. Math. Phys. 17 (1970), 239; M. Ozawa: J. Math. Phys. 25 (1984), 79.
[16] M. Jammer: The Philosophy of Quantum Mechanics, Wiley 1974.
[17] G. Birkhoff and J. von Neumann: Ann. Math. 37 (1936), 823; A. M. Gleason: J. Math. Mech. 6 (1957), 885; C. Piron: Helv. Phys. Acta 37 (1964), 439; S. Pulmannová: Int. J. Theor. Phys. 35 (1996), 2309.
[18] P. Jordan et al.: Ann. Math. 35 (1934), 29; I. E. Segal: Ann. Math. 48 (1947), 930.
[19] C. A. Fuchs: arXiv:quant-ph/0205039.
[20] R. Clifton et al.: Found. Phys. 33 (2003), 1561.
[21] J. Barrett, L. Hardy and A. Kent: Phys. Rev. Lett. 95 (2005), 010503.
[22] G. Kimura et al.: in preparation.
[23] H. H. Schaefer: Topological Vector Spaces, 2nd edition, Springer 1999.
[24] S. R. Lay: Convex Sets and Their Applications, Krieger Publishing Company, 1982.
[25] C. A. Fuchs and J. van de Graaf: IEEE Trans. Inf. Theory 45 (1999), 1216.
[26] M. A. Nielsen and I. L. Chuang: Quantum Computation and Quantum Information, Cambridge University Press, Cambridge 2000.
[27] A. S. Kholevo: Probl. Inform. Transm. 9 (1973), 110; H. P. Yuen and M. Ozawa: Phys. Rev. Lett. 70 (1993), 363.
[28] E. T. Jaynes: Phys. Rev. 108 (1957), 171.
[29] J. Borwein and A. S. Lewis: Convex Analysis and Nonlinear Optimization, 2nd edition, Springer 2006.
[30] M. Takesaki: Theory of Operator Algebras I, Springer 1979.
[31] A. J. Short and S. Wehner: arXiv:0909.4801.
[32] H. Barnum et al.: arXiv:0909.5075.
