1. Introduction
Recent developments in quantum information theory have shown that information
processing and computation based on quantum physics can go far beyond what is
possible with classical physics. At its heart, this is because the range of
admissible probabilistic behavior is enlarged in passing from classical theory to quantum theory.
Indeed, quantum theory can be considered as a probabilistic theory which, in some
sense, properly includes classical probability theory (Kolmogorov's probability
theory). However, this does not mean that quantum theory is the most general theory
of probability, even among the possible theories which have operational meanings.
So far, the most general theory of probability with suitable operational meanings
has been developed by many researchers [1-6, 9]. Following the recent trend [9],
we call such theories the general probabilistic theories (or simply GPTs).
Just as quantum information theory has been constructed on the basis of quantum
theory, information theories can be constructed on the basis of any probabilistic
theory [4, 7-14]. There are several motivations for this line of research. First, this
is an attempt to find physical principles (axioms written in physical languages)
for quantum theory [16-18]. Indeed, by considering the general framework which
176 G. KIMURA, K. NUIDA and H. IMAI
encompasses the quantum theory, we look for principles which determine the
position of the quantum theory in this general framework. The development of the
quantum information theory motivates us to find the principles based on information
processings for the theory of quantum physics [19, 20, 10]. Second, the construction
of the information theory based on the most general theory of probability enables
us to understand logical connections among information processings by resorting
to the particular properties of neither classical nor quantum theory, but only to
the essential properties which a suitable probability theory should possess. Third,
this prepares for a possible breakdown of quantum theory. For instance, one
can discuss a secure key distribution in the general framework without assuming
quantum theory itself [21]. Finally, this might provide a classical information theory
under some restrictions of measurements, since any general probabilistic theory has
a classical interpretation based on such restrictions of measurements [5, 22].
In this paper we propose and give systematic discussions of several distinguisha-
bility measures (especially, the Kolmogorov distance and fidelity) and three quantities
related to entropies for general probabilistic theories. The corresponding measures and
entropies in classical and quantum theories have been proved to be useful [25, 26],
and we give generalizations for them in any GPTs and discuss their applications.
In particular, the no-cloning theorem and a simple information-disturbance theorem
in GPTs are reformulated using fidelity, and a bound of the accessible information
is discussed based on one of the "entropies". Finally, we introduce the principle
of "equality of pure states" meaning that there are no special pure states. We call
such a GPT symmetric, and in symmetric GPT the measure of pureness will be
discussed.
probabilistic mixtures. (A5) we introduce a natural topology on the state space which
is the weakest topology such that $s \mapsto p(a|M, s)$ is continuous for any measurement;
finally, we assume (A6) there exists a joint state $\omega$ of the system $A + B$ defining
a joint probability for each pair of measurements $M_A$ and $M_B$ which satisfies the no-signaling
condition, i.e. the marginal probabilities for the outcomes of a measurement on $A$
do not depend on the measurements on $B$, and vice versa. Moreover, the joint state
is determined by joint probabilities for all pairs of measurements of $A$ and $B$.
Based on these assumptions, one can show the following [1, 3, 5, 9]:
(a) There exists a locally convex topological vector space $V$ such that, in a suitable
representation, the state space $S$ is a convex subset of $V$, where $q s_1 + (1 - q) s_2$
corresponds to the state described in (A3) above. An extreme point of $S$ is called
a pure state. Moreover, without loss of generality, one can assume that $S$ is compact
with respect to a natural topology [13]. Notice that by the Krein-Milman theorem (see,
for instance, Theorem 10.4 in [23]) the set of extreme points $S_{pure}$ is nonempty and
$S$ is the closed convex hull of its extreme points. In particular, in finite-dimensional
cases, any state $s \in S$ has a convex decomposition into a finite number of pure states
(hereafter, a pure state decomposition): $s = \sum_x p_x s_x$ where $p_x \ge 0$, $\sum_x p_x = 1$,
$s_x \in S_{pure}$ (see, for instance, Theorem 5.6 in [24]).
A map $f : S \to \mathbb{R}$ is called an affine functional if it satisfies
$f(q s_1 + (1 - q) s_2) = q f(s_1) + (1 - q) f(s_2)$
for any $q \in [0, 1]$, $s_1, s_2 \in S$. In particular, an affine functional $e : S \to \mathbb{R}$ is called
an effect if its range is contained in $[0, 1]$. We denote the sets of all affine
functionals and all effects by $A(S)$ and $\mathcal{E}(S)$, respectively. It is easy to see
that $\mathcal{E}(S)$ is a convex subset of the real vector space $A(S)$. We call an extreme
effect a pure effect. The zero effect $0$ and the unit effect $u$, defined by $0(s) = 0$ and
$u(s) = 1$ for all $s \in S$, are trivially pure effects. It is easy to see that the effect $u - e$ is pure
iff the effect $e$ is pure. Moreover, we can introduce a natural topology on $\mathcal{E}(S)$,
namely the weakest topology such that the map $\mathcal{E}(S) \to \mathbb{R}$, $e \mapsto e(s)$, is
continuous for every $s \in S$. It is shown that $\mathcal{E}(S)$ is compact with respect to this
topology [13].
(b) It is often convenient to characterize a measurement without explicitly specifying
the measurement outcomes. In that case, any measurement $M$ is characterized
by the set of effects $m_i$ such that $p(a_i|M, s) = m_i(s)$ and $\sum_i m_i = u$. In the following,
we occasionally use the notation $M = (m_j)_j$ (implicitly assuming the conditions
$m_j \in \mathcal{E}(S)$ and $\sum_j m_j = u$) to denote the measurement on $S$, meaning that $m_j(s)$ is
the probability of obtaining the $j$th output (say $a_j$) by a measurement $M$ in a state $s$.
(c) Dynamics is described by an affine function on the state space. In general,
the initial state space $S$ and final state space $S'$ might be different. Then, a time
evolution map is given by an affine map $f$ from $S$ to $S'$. We denote by $A(S, S')$
the set of all affine maps from $S$ to $S'$.
(d) Joint systems are described by a convex set in a tensor product of the
corresponding vector spaces. A joint state $\omega$ on $A + B$ with state spaces $S_A$ and $S_B$
is described by a bi-affine map on $\mathcal{E}(S_A) \times \mathcal{E}(S_B)$. In particular, if $\omega$ is a joint state
1 Although the proof is simple (see for instance [30, 9]), this property is important in applications. For
instance, in the context of key distribution, Alice and Bob can be assured of security if their joint state is pure,
since then their system has no correlations with any other system (such as an eavesdropper).
DISTINGUISHABILITY MEASURES AND ENTROPIES 179
[Hyper cuboid systems and squared system] Let
$S_{cb} := \{c \in \mathbb{R}^d \mid 0 \le c_i \le 1 \ (i = 1, \dots, d)\}$.
The pure states are the $2^d$ vertices. We call this the hyper cuboid
system, and in particular the squared system when $d = 2$ [11]. These might be the
easiest examples of GPTs which are neither classical nor quantum. However, one can
construct a classical model such that a suitable restriction of measurements reduces
it to the hyper cuboid system [22].
Finally, notice that the probabilistic theories with state spaces $S_A$ and $S_B$ are
equivalent if they are affine isomorphic, i.e. there exists a bijective affine map
from $S_A$ to $S_B$. For instance, any GPT which has a simplex state space is affine
isomorphic to some standard simplex, and therefore can be considered as a classical
system.
Indeed, $D_c$ is a metric, and it follows that $D_c(p, q) = \max_S |p(S) - q(S)|$,
where $p(S) := \sum_{i \in S} p_i$ and the maximization is taken over all subsets $S$ of the index set $\{i\}$. Thus
$D_c(p, q)$ can be regarded as a metric on probability distributions with an operational
meaning.
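The subset characterization can be checked numerically. The following Python sketch assumes the usual definition $D_c(p, q) = \frac{1}{2} \sum_i |p_i - q_i|$ (an assumption here, since the definition is not restated in this passage) and compares it with a brute-force maximum over all subsets of the index set. The function names are illustrative only.

```python
from itertools import combinations

def kolmogorov_classical(p, q):
    """Half the L1 distance between two probability vectors (assumed definition of D_c)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))

def max_over_subsets(p, q):
    """Brute-force maximum of |p(S) - q(S)| over all subsets S of the index set."""
    n = len(p)
    best = 0.0
    for r in range(n + 1):
        for subset in combinations(range(n), r):
            diff = abs(sum(p[i] for i in subset) - sum(q[i] for i in subset))
            best = max(best, diff)
    return best

p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]
d_half_l1 = kolmogorov_classical(p, q)
d_subsets = max_over_subsets(p, q)
```

For these two distributions both quantities agree, the maximizing subset being the single index on which $p$ exceeds $q$.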
In any GPT, one can define [13] the Kolmogorov distance between two states
$s_1, s_2 \in S$ by
(1)
$P_S(s_1, s_2) := \max_{(m_1, m_2) \in \mathcal{M}} \left( \frac{1}{2} m_1(s_1) + \frac{1}{2} m_2(s_2) \right) = \frac{1}{2} \left( 1 + \max_{e \in \mathcal{E}} \left( e(s_1) - e(s_2) \right) \right)$. (3)
From (2) and (3), we have another operational meaning of the Kolmogorov distance.
Note that $D(s_1, s_2)$ takes the maximum 1 iff $P_S(s_1, s_2) = 1$, i.e. when $s_1$ and $s_2$
are completely distinguishable in a single measurement. On the other hand, $D(s_1, s_2)$
takes the minimum 0 (thus $s_1 = s_2$) iff $P_S(s_1, s_2) = 1/2$, i.e. $s_1$ and $s_2$ are completely
indistinguishable (and indeed such states should be identified due to the separation
property of states).
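As an illustration, the right-hand side of (3) can be evaluated in the squared system $[0, 1]^2$. The Python sketch below rests on two assumptions that are not taken from the text: that $D(s_1, s_2) = \max_e (e(s_1) - e(s_2))$, as (3) suggests, and that it suffices to search the six extreme effects of the square (zero, unit, $c_1$, $1 - c_1$, $c_2$, $1 - c_2$), because an affine objective attains its maximum at an extreme point. All names are hypothetical.

```python
# Extreme effects of the squared system, as functions of a state s = (c1, c2).
# Restricting the search to these six effects is an assumption of this sketch.
EXTREME_EFFECTS = [
    lambda s: 0.0,          # zero effect
    lambda s: 1.0,          # unit effect u
    lambda s: s[0],         # c1
    lambda s: 1.0 - s[0],   # 1 - c1
    lambda s: s[1],         # c2
    lambda s: 1.0 - s[1],   # 1 - c2
]

def kolmogorov_square(s1, s2):
    """max over extreme effects e of e(s1) - e(s2)."""
    return max(e(s1) - e(s2) for e in EXTREME_EFFECTS)

def success_probability(s1, s2):
    """Optimal single-shot discrimination probability, following (3)."""
    return 0.5 * (1.0 + kolmogorov_square(s1, s2))

d_far = kolmogorov_square((0.0, 0.0), (1.0, 0.3))
p_far = success_probability((0.0, 0.0), (1.0, 0.3))
p_same = success_probability((0.4, 0.7), (0.4, 0.7))
```

The first pair differs by 1 in the first coordinate, so the effect $1 - c_1$ distinguishes the two states perfectly, while identical states give the guessing probability 1/2.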
In the following, we show the monotonicity, strong convexity, joint convexity,
and convexity for the Kolmogorov distance for any GPT.
$F_c(p, q) := \sum_i \sqrt{p_i q_i}$. (4)
Note that (i) $0 \le F_c(p, q) \le 1$, where $F_c(p, q) = 1$ iff $p = q$; (ii) $F_c(p, q) = F_c(q, p)$.
We say that two probability distributions $p, q$ are orthogonal iff $F_c(p, q) = 0$.
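The properties of $F_c$ listed above are easy to confirm on small examples. The following Python sketch implements (4) directly; the function name is illustrative only.

```python
import math

def bhattacharyya(p, q):
    """Classical fidelity F_c(p, q) = sum_i sqrt(p_i * q_i), as in (4)."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))

f_equal = bhattacharyya([0.5, 0.5], [0.5, 0.5])   # identical distributions
f_orth = bhattacharyya([1.0, 0.0], [0.0, 1.0])    # disjoint supports
f_mid = bhattacharyya([0.5, 0.5], [1.0, 0.0])     # partial overlap
f_swapped = bhattacharyya([1.0, 0.0], [0.5, 0.5])
```

Identical distributions give 1, distributions with disjoint supports are orthogonal, and swapping the arguments leaves the value unchanged.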
In any GPT, one can also define the fidelity [3, 12] between two states $s_1, s_2 \in S$
as
$F(s_1, s_2) = \min_{M = (m_i)_i \in \mathcal{M}} F_c(p_1(M), p_2(M))$, (5)
where $p_1(M) := (m_i(s_1))_i$ and $p_2(M) := (m_i(s_2))_i$. Note that the minimization in (5)
is always attained by some measurement, which we call an optimal measurement,
again due to the compactness of the effect set [13]. In quantum mechanics, one
has the formula $F(\rho_1, \rho_2) = \mathrm{tr}|\rho_1^{1/2} \rho_2^{1/2}| = \mathrm{tr}[(\rho_1^{1/2} \rho_2 \rho_1^{1/2})^{1/2}]$ between two density
operators $\rho_1, \rho_2$ [26].
From the properties of the Bhattacharyya coefficient and the separation property
of states, it follows that (i) $0 \le F(s_1, s_2) \le 1$ where $F(s_1, s_2) = 1$ iff $s_1 = s_2$;
(ii) $F(s_1, s_2) = F(s_2, s_1)$. We say that states $s_1$ and $s_2$ are orthogonal ($s_1 \perp s_2$) iff
$F(s_1, s_2) = 0$.
PROPOSITION 6 (Monotonicity). For any states $s_1, s_2 \in S$ and any time evolution
$\Lambda \in A(S, S')$, it follows that
$F(\Lambda(s_1), \Lambda(s_2)) \ge F(s_1, s_2)$.
PROPOSITION 7 (Strong concavity [12]). Let $p = (p_i)_i$ and $q = (q_i)_i$ be
probability distributions over the same index set, and let $s_i, t_i \in S$ be states of a GPT
with the same index set. Then
This relation is well known for quantum systems [25, 26], but Proposition 10 shows
that it holds for any GPT.
From (7), we have the following corollary.
COROLLARY 11. (i) $D(s, t) = 0$ iff $F(s, t) = 1$ and (ii) $D(s, t) = 1$ iff $F(s, t) = 0$.
In particular, the orthogonality of states turns out to be equivalent to the complete
distinguishability of states ($P_S = 1$).
In this sense, the Kolmogorov distance and the fidelity are equivalent.
Similarly, it is straightforward to introduce other measures which are used in
quantum information theory. For instance, one can define Shannon's distinguishability
and show the same relations (see for instance Theorem 1 in [25]).
4. Applications
In this section, we give simple proofs using the fidelity for the no-cloning
theorem [9] and the information-disturbance theorem [13, 12] in any GPT.
THEOREM 12 (No-cloning). In any GPT, two states $s_1, s_2 \in S$ are jointly clonable
iff $s_1 = s_2$ or $s_1$ and $s_2$ are completely distinguishable.
Proof: Let the states $s_1, s_2 \in S$ be jointly clonable. Namely, there exists a time
evolution map (a cloning machine) $\Lambda \in A(S, S \otimes S)$ satisfying
$\Lambda(s_1) = s_1 \otimes s_1$, $\Lambda(s_2) = s_2 \otimes s_2$. (8)
From (6), we have
$F(\Lambda(s_1), \Lambda(s_2)) = F(s_1 \otimes s_1, s_2 \otimes s_2) \le F(s_1, s_2)^2$.
From the monotonicity of $F$, it follows that $F(s_1, s_2) \le F(\Lambda(s_1), \Lambda(s_2)) \le F(s_1, s_2)^2$,
which, since $0 \le F(s_1, s_2) \le 1$, implies that $F(s_1, s_2) = 0$ or $1$. In other words, $s_1 = s_2$ or $s_1$ and $s_2$ are
completely distinguishable (cf. Corollary 11).
Conversely, suppose that $s_1 = s_2$. Then one has a time evolution $\Lambda \in A(S, S \otimes S)$ defined by
$\Lambda(s) := s \otimes s_1$. (Physically, this is nothing but a preparation of a fixed state $s_1$.) It is
obvious that this jointly clones $s_1$ and $s_2$. Next, suppose that $s_1$ and $s_2$ are completely
distinguishable. Namely, there exists a measurement $M = (m_1, m_2) \in \mathcal{M}(S)$ such
that $m_1(s_1) = 1$, $m_1(s_2) = 0$ (and thus $m_2(s_1) = 0$, $m_2(s_2) = 1$). Then,
$\Lambda(s) := m_1(s) s_1 \otimes s_1 + m_2(s) s_2 \otimes s_2$ for any $s \in S$ defines a time evolution $\Lambda \in A(S, S \otimes S)$
satisfying the cloning condition (8). (Notice that $m_1(s), m_2(s) \ge 0$, $m_1(s) + m_2(s) = 1$,
and thus $m_1(s) s_1 \otimes s_1 + m_2(s) s_2 \otimes s_2 \in S \otimes S$ from the convexity of $S \otimes S$. The
affinity of $\Lambda$ follows from the affinity of the $m_i$.) □
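The measure-and-prepare construction in the second half of the proof can be made concrete in the squared system. The Python sketch below is only an illustration under stated assumptions: a state $(c_1, c_2)$ is embedded as the vector $(1, c_1, c_2)$, product states are represented by flattened outer products of such vectors (one possible representation, chosen here for convenience), and the distinguishable pair is $s_1 = (0, 0)$, $s_2 = (1, 1)$ with $m_1 = 1 - c_1$, $m_2 = c_1$. All function names are hypothetical.

```python
def embed(s):
    """Represent a state (c1, c2) of the square as the vector (1, c1, c2)."""
    return (1.0, s[0], s[1])

def tensor(s, t):
    """Product state as the flattened outer product of the embedded vectors."""
    return tuple(a * b for a in embed(s) for b in embed(t))

def mix(weights, vectors):
    """Convex combination of equal-length vectors."""
    return tuple(sum(w * v[i] for w, v in zip(weights, vectors))
                 for i in range(len(vectors[0])))

S1, S2 = (0.0, 0.0), (1.0, 1.0)

def m1(s):
    return 1.0 - s[0]   # m1(S1) = 1, m1(S2) = 0

def m2(s):
    return s[0]         # m2 = u - m1

def clone(s):
    """Measure-and-prepare map: m1(s) * (S1 x S1) + m2(s) * (S2 x S2)."""
    return mix((m1(s), m2(s)), (tensor(S1, S1), tensor(S2, S2)))

def close(u, v, tol=1e-12):
    return all(abs(a - b) <= tol for a, b in zip(u, v))

clones_s1 = close(clone(S1), tensor(S1, S1))
clones_s2 = close(clone(S2), tensor(S2, S2))

# Affinity check on a mixture of two arbitrary states.
a, b, q = (0.2, 0.9), (0.7, 0.1), 0.3
mixed = (q * a[0] + (1 - q) * b[0], q * a[1] + (1 - q) * b[1])
is_affine = close(clone(mixed), mix((q, 1 - q), (clone(a), clone(b))))

# A state other than S1 or S2 is not cloned by this map.
clones_other = close(clone(a), tensor(a, a))
```

The map reproduces the cloning condition (8) on the two distinguishable states and is affine, but it does not clone other states, in line with the theorem.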
LEMMA 13. For any GPT with at least two distinct states, there exist two distinct
states which are not completely distinguishable.
Proof: Let $s_1 \ne s_2 \in S$. Assume that any two distinct states are completely
distinguishable. Then, we have $F(s_1, s_2) = 0$. From the convexity of $S$, there
exists a state $s := \frac{1}{2} s_1 + \frac{1}{2} s_2 \ne s_1$. From the concavity of $F$, we have
$F(s_1, s) \ge \frac{1}{2} F(s_1, s_1) + \frac{1}{2} F(s_1, s_2) = \frac{1}{2}$. Therefore, $s_1$ and $s$ are distinct states which are not
completely distinguishable. □
We call a physical process which clones any unknown state a universal cloning
machine.
PROPOSITION 15. A GPT is classical iff there is a universal cloning machine for
pure states.
Proof: Notice that classical systems are characterized by the fact that all the
pure states are completely distinguishable [9]. This fact and Theorem 12 complete
the proof. □
Proof: Let $s_1, s_2 \in S_A$ be two pure states which are not completely distinguishable,
i.e. $0 < F(s_1, s_2)$. Assume that there is a physical way to get information to
discriminate $s_1, s_2$ without causing any disturbance to the system. This implies that
we have a time evolution $\Lambda \in A(S_A, S_A \otimes S_B)$ and an initial state $t_0 \in S_B$ such that
the reduced states of the system $A$ are unchanged,
$\Lambda(s_1 \otimes t_0)_A = s_1$, $\Lambda(s_2 \otimes t_0)_A = s_2$.
Since $s_1, s_2$ are pure states, there exist no correlations between the systems $A$ and
$B$, and hence one gets
$\Lambda(s_1 \otimes t_0) = s_1 \otimes t_1$, $\Lambda(s_2 \otimes t_0) = s_2 \otimes t_2$
for some $t_1, t_2 \in S_B$. From the monotonicity of $F$ and Proposition 9, it follows that
$F(s_1, s_2) = F(s_1 \otimes t_0, s_2 \otimes t_0) \le F(\Lambda(s_1 \otimes t_0), \Lambda(s_2 \otimes t_0)) \le F(s_1, s_2) F(t_1, t_2)$. Since
$0 < F(s_1, s_2)$, we have $F(t_1, t_2) = 1$ and thus $t_1 = t_2$. Therefore, to get information
to distinguish $s_1$ and $s_2$, one inevitably has to disturb at least one of these states. □
From this, $\tilde{e} := e/x$ is an effect which is neither $e$ nor the zero effect $0$. Since we
have the identity
$e = x \tilde{e} + (1 - x) 0$,
this contradicts that $e$ is a pure effect.
Let $s = \sum_i p_i s_i$ ($p_i > 0$, $\sum_i p_i = 1$) be a pure state decomposition of $s$. Then,
it is easy to see that $e(s_i) = 1$ for every pure state $s_i$ in the decomposition. Thus, we can take a pure state $s$
such that $e(s) = 1$. □
COROLLARY 19. Let $e$ be a pure effect which is not $u$. Then there exists a state
$s$ such that $e(s) = 0$. Such a state can be taken to be a pure state.
Proof: Since $e$ ($\ne u$) is pure, the effect $\bar{e} := u - e$ is a nonzero pure effect.
From Proposition 18, there exists a pure state $s$ such that $\bar{e}(s) = 1 - e(s) = 1$. Thus,
$e(s) = 0$. □
Next, we show that any nonzero effect has a decomposition with respect to
indecomposable effects.
PROPOSITION 20. In any GPT, for every $0 \ne e \in \mathcal{E}(S)$, there exists a finite
collection of indecomposable effects $e_i \in \mathcal{E}(S)$, $1 \le i \le r$, such that $e = \sum_{i=1}^r e_i$. In
particular, in any GPT, there exists an indecomposable effect.
(See Appendix A for the proof.) Moreover, we have the proposition.
PROPOSITION 21. In any GPT, there exists an indecomposable and pure effect.
Proof: To prove this, we use the following lemmata.
Proof: First, let $e$ be an effect such that there exists exactly one pure state, say $p^{(\mu)}$, at
which the value of the effect is nonzero. Then, one has $e \ne 0$ and $e = \lambda e^{(\mu)} \in \mathcal{E}(S_c)$
for some $\lambda \in (0, 1]$. Let $e = f + g$ for $f, g \in \mathcal{E}(S_c)$. Then $f(p^{(\nu)}) = 0$ for any
$\nu \ne \mu$, and it follows that $f = \frac{f(p^{(\mu)})}{\lambda} e$. Therefore, $e$ is indecomposable. Next, let $e$
be an indecomposable effect. Assume that there are at least two pure states,
say $p^{(\mu_0)}, p^{(\mu_1)}$ ($\mu_0 \ne \mu_1$; $\mu_0, \mu_1 \in \{1, \dots, d\}$), for which the effect values are nonzero. Let
$x_\mu := e(p^{(\mu)})$. Let $f, g$ be effects defined by $f(p^{(\mu)}) = x_{\mu_0} \delta_{\mu \mu_0}$ and $g = e - f$.
Obviously $e \ne c f$ for any $c \in \mathbb{R}$, which contradicts that $e$ is indecomposable. Since
$e \ne 0$, there is exactly one pure state at which the value of the effect is nonzero. □
[Quantum Systems] Next, we show that indecomposable effects for quantum
systems are characterized by one-dimensional projections, i.e. rank-one POVM
elements. Let $H$ be a $d$-dimensional Hilbert space and let $S_q$ be the set of all
density operators on $H$. We call a nonzero POVM element $E$ indecomposable iff
the corresponding effect $e(\cdot) := \mathrm{tr}(E \, \cdot)$ is indecomposable. It is easy to see that
a POVM element $E$ is one-dimensional iff there exist $\lambda \in (0, 1]$ and a unit vector
$\psi \in H$ such that $E = \lambda |\psi\rangle\langle\psi|$.
PROPOSITION 25. A POVM element $E \in \mathcal{E}_q$ is indecomposable if and only if it is
a rank-one POVM element.
Proof: Let $E = \lambda |\psi\rangle\langle\psi|$ be a rank-one POVM element with a unit vector $\psi \in H$
and $\lambda \in (0, 1]$. Let
$E = E_1 + E_2$ (9)
for some POVM elements $E_1, E_2$.
Let $\{\psi_n\}_n$ be an orthonormal basis of $H$ such that $\psi_1 = \psi$. Then, from (9), it follows
that $\langle\psi_j | E_1 \psi_j\rangle = \|E_1^{1/2} \psi_j\|^2 = 0$ ($\forall j \ge 2$), and hence $E_1 \psi_j = 0$ ($\forall j \ge 2$). For any
$\xi \in H$, we have $E_1 \xi = E_1 \left( \sum_n \langle\psi_n | \xi\rangle \psi_n \right) = \langle\psi | \xi\rangle E_1 \psi = |E_1 \psi\rangle\langle\psi| \xi$. Thus, $E_1$ has
the form $|\phi\rangle\langle\psi|$ (where $\phi := E_1 \psi$). Finally, since $E_1$ is Hermitian, it follows that
there exists $c' \in \mathbb{R}$ such that $\phi = c' \psi$ and hence $E_1 = c' |\psi\rangle\langle\psi| = c E$, where $c := c'/\lambda$.
This implies that $E$ is indecomposable. Next, let $E$ be indecomposable. Assume
that $E$ is a rank-$l$ POVM element for some $l \ge 2$, and let $E = \sum_{n=1}^l c_n |\psi_n\rangle\langle\psi_n|$
($c_n \in (0, 1]$) be an eigenvalue decomposition of $E$. Let $E_1 := c_1 |\psi_1\rangle\langle\psi_1|$ and
$E_2 := \sum_{n=2}^l c_n |\psi_n\rangle\langle\psi_n|$. Obviously, they are POVM elements satisfying $E = E_1 + E_2$.
However, for any $c \in \mathbb{R}$, we have $E \ne c E_1$ (for instance, $E \psi_2 = c_2 \psi_2 \ne 0$ while
$c E_1 \psi_2 = 0$). This contradicts that $E$ is indecomposable. Since $E \ne 0$, we conclude
that $E$ is a rank-one POVM element. □
[Hyper cuboid systems] Finally, let $S_{cb}$ be the state space of a $d$-dimensional
hyper cuboid system introduced in Section 2. To determine the indecomposable
effects in $S_{cb}$, we present a general lemma which is also useful in later arguments
(see Appendix A for the proof).
LEMMA 26. If the state space $S$ of a GPT contains at least two states, then
for every indecomposable effect $e \in \mathcal{E}(S)$ we have $e(s) = 0$ for some $s \in S$.
By virtue of this lemma, we obtain the following characterization of indecomposable
effects in $S_{cb}$.
PROPOSITION 27. An effect $e \in \mathcal{E}(S_{cb})$ is indecomposable if and only if it is
nonzero and it takes 0 on a $(d - 1)$-dimensional face (facet) of $S_{cb}$.
Proof: First we consider the "if" part. Suppose that $e$ is nonzero and $e$ takes 0
on a facet $F$ of $S_{cb}$. Fix a state $s \in S_{cb}$ such that $s \notin F$. Note that $e(s) > 0$ since
$e$ is nonzero. If $e$ decomposes as $e = e_1 + e_2$ with $e_1, e_2 \in \mathcal{E}(S_{cb})$, then $e_1$ (hence
$e_2$) also takes 0 on $F$. This implies that $e_1 = \lambda e$ where $\lambda = e_1(s)/e(s) \in \mathbb{R}$, hence
$e$ is indecomposable.
Second, we consider the "only if" part. By Lemma 26, an indecomposable effect
$e$ takes 0 at some state, hence at some pure state in $S_{cb}$. By symmetry, we may
assume without loss of generality that $e(s_0) = 0$ where $s_0 = (0, 0, \dots, 0) \in S_{cb}$. Let
$s_i$ ($i \in \{1, 2, \dots, d\}$) be the vertex of $S_{cb}$ whose $j$-th component is $\delta_{ij}$. Then
we have $e = \sum_{i=1}^d e(s_i) e_i$ where $e_i \in \mathcal{E}(S_{cb})$ maps $(c_1, c_2, \dots, c_d)$ to $c_i$. Since $e$ is
indecomposable, it follows that $e = \lambda e_i$ for some $1 \le i \le d$ and $\lambda \in \mathbb{R}$, therefore $e$
takes 0 on the facet $\{(c_1, \dots, c_d) \in S_{cb} \mid c_i = 0\}$ of $S_{cb}$. □
For example, the indecomposable effects in the squared system (i.e. when $d = 2$)
are listed in Table 1, where $\alpha_1, \dots, \alpha_4 \in (0, 1]$ are parameters.
          value at
effect    (0,0)        (0,1)        (1,0)        (1,1)
$e_1$     0            0            $\alpha_1$   $\alpha_1$
$e_2$     $\alpha_2$   $\alpha_2$   0            0
$e_3$     0            $\alpha_3$   0            $\alpha_3$
$e_4$     $\alpha_4$   0            $\alpha_4$   0
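The table entries can be checked against Proposition 27. The Python sketch below encodes each effect by its values at the four vertices, verifies that these values are consistent with an affine function on the square, and lists the edges (facets) on which the effect vanishes. The helper names and the sample parameter values are illustrative assumptions.

```python
VERTICES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def table_effects(a1, a2, a3, a4):
    """Values of e1..e4 at the four vertices, in the order of VERTICES."""
    return {
        "e1": (0, 0, a1, a1),   # a1 * c1
        "e2": (a2, a2, 0, 0),   # a2 * (1 - c1)
        "e3": (0, a3, 0, a3),   # a3 * c2
        "e4": (a4, 0, a4, 0),   # a4 * (1 - c2)
    }

def is_affine(values):
    """An affine function on the square satisfies e(0,0) + e(1,1) = e(0,1) + e(1,0)."""
    v00, v01, v10, v11 = values
    return abs((v00 + v11) - (v01 + v10)) < 1e-12

def zero_facets(values):
    """Edges (pairs of adjacent vertices) on which the effect vanishes."""
    edges = [(0, 1), (2, 3), (0, 2), (1, 3)]   # index pairs into VERTICES
    return [(VERTICES[i], VERTICES[j]) for i, j in edges
            if values[i] == 0 and values[j] == 0]

effects = table_effects(0.5, 0.25, 1.0, 0.75)
all_affine = all(is_affine(v) for v in effects.values())
facet_counts = {name: len(zero_facets(v)) for name, v in effects.items()}

# With all parameters equal to 1, the pair (e1, e2) sums to the unit effect.
unit_pair = table_effects(1, 1, 1, 1)
sums_to_unit = all(x + y == 1 for x, y in zip(unit_pair["e1"], unit_pair["e2"]))
```

Each effect vanishes on exactly one edge of the square, and with unit parameters the pairs $(e_1, e_2)$ and $(e_3, e_4)$ each form a two-outcome measurement.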
with equality iff the density operators $\rho_i$ are orthogonal to each other. See, for instance,
[26] for the properties of the Shannon and von Neumann entropies.
In any GPT, let us define the following quantities for $s \in S$:
$S_1(s) := \inf_{M = (m_j)_j \in \mathcal{M}_{ind}} H(m_j(s))$, (11)
(13)
In the following, we see that S2 coincides with the von Neumann entropy for
quantum systems. Thus, (14) gives a weaker bound than the Holevo bound for
quantum systems. (For the pure state ensemble, (14) gives exactly the Holevo bound
since the von Neumann entropy vanishes on pure states.)
Now, we show that all three quantities (11)-(13) are generalizations of the
Shannon and von Neumann entropies for classical and quantum systems.
THEOREM 33. (i) For classical systems, $S_1(s)$, $S_2(s)$, $S_3(s)$ coincide with the Shannon
entropy. (ii) For quantum systems, $S_1(s)$, $S_2(s)$, $S_3(s)$ coincide with the von Neumann entropy.
Proof: (i) Let $S_c$ be the state space of a classical system. From Proposition 24,
any indecomposable measurement in a classical system is given by $(\lambda_{i,\mu} e^{(\mu)})_{i,\mu}$, where
$\lambda_{i,\mu} \ge 0$, $\sum_i \lambda_{i,\mu} = 1$ for any $\mu = 1, \dots, d$. Thus, for a state $p = (p_1, \dots, p_d) \in S_c$,
the probability distribution given by the indecomposable measurement is $(\lambda_{i,\mu} p_\mu)_{i,\mu}$.
Note that from the concavity of the function $g(x) := -x \log x$ ($x \in [0, 1]$) with the
convention $g(0) = 0$, it holds that $g(\lambda x) \ge \lambda g(x)$ for $\lambda \in [0, 1]$, and thus we have
Since there exists a measurement $M = (m_j)_j$ discriminating all pure states in a classical
system, we have $\inf_{M \in \mathcal{M}} H(X|J) = 0$ (i.e. the uncertainty of $X$ conditioned on the
information of $J$ is zero). Therefore, we have $S_2(p) = H(X) = H(p)$.
Again from the unique pure state decomposition, there exists a unique ensemble
$\{p_\mu, p^{(\mu)}\}_{\mu=1}^d$ for any state $p = (p_1, \dots, p_d) \in S_c$. Therefore, we have $S_3(p) = H(p)$.
(ii) Next, we consider a quantum system described by a Hilbert space $H$. First,
let $f$ be a concave function on $[0, 1]$ such that $f(0) = 0$, and let $\rho$ be a density
operator on $H$. Then, it is easy to show$^3$ that for all vectors $\psi \in H$ such that
$\|\psi\| \le 1$, we have
$f(\langle\psi|\rho|\psi\rangle) \ge \langle\psi|f(\rho)|\psi\rangle$.
3 Let $\rho = \sum_{j=1}^d p_j |\phi_j\rangle\langle\phi_j|$ be an eigenvalue decomposition of $\rho$. Notice that $\{\phi_j\}_{j=1}^d$ is an orthonormal
basis of $H$ and $\sum_{j=1}^d |\langle\psi|\phi_j\rangle|^2 = \|\psi\|^2 \le 1$. Thus $(q_j)_{j=1}^{d+1}$, where $q_j := |\langle\psi|\phi_j\rangle|^2$ ($j = 1, \dots, d$) and
$q_{d+1} := 1 - \sum_j |\langle\psi|\phi_j\rangle|^2$, is a probability distribution. From the concavity of $f$, we get
$f(\langle\psi|\rho|\psi\rangle) = f\left(\sum_{j=1}^{d+1} p_j q_j\right) \ge \sum_{j=1}^{d+1} q_j f(p_j) = \langle\psi|f(\rho)|\psi\rangle$,
where $p_{d+1} := 0$.
$H(e_j(\rho)) = \sum_j g(\langle\psi_j|\rho|\psi_j\rangle) \ge \sum_j \langle\psi_j|g(\rho)|\psi_j\rangle = \mathrm{tr}\left(g(\rho) \sum_j E_j\right) = \mathrm{tr}\, g(\rho) = S(\rho)$.
By considering the indecomposable measurement given by $(|\phi_j\rangle\langle\phi_j|)_j$, where the $\phi_j$
form a complete set of eigenvectors of $\rho$, we obtain $S_1(\rho) = S(\rho)$.
Next, from the Holevo bound (15), we have
$S_2(\rho) \le S(\rho) - \inf_{\{p_x, \rho_x\} \in \mathcal{D}(\rho)} \left(\sum_x p_x S(\rho_x)\right) = S(\rho)$.
The final equality follows from the eigenvalue decomposition $\rho = \sum_x p_x |\phi_x\rangle\langle\phi_x|$ and
$S(|\phi_x\rangle\langle\phi_x|) = 0$. Again with the decomposition $\{p_x, \rho_x = |\phi_x\rangle\langle\phi_x|\}$ into eigenvalues
and eigenvectors, there exists an optimal measurement $M_j := |\phi_j\rangle\langle\phi_j|$ to discriminate
the $\rho_x$, and thus one has $H(X : J) = H(p)$. Since $S(\rho) = H(p)$, we have
$S(\rho) = H(X : J) \le S_2(\rho)$.
Finally, let $\{p_x, \rho_x\} \in \mathcal{P}(\rho)$ be a pure state decomposition of $\rho$. Then, from the
inequality (10) and the fact that $S(\rho_x) = 0$ for pure states $\rho_x$, we have
$S(\rho) \le H(\{p_x\})$, and hence $S(\rho) \le S_3(\rho)$.
Moreover, since an eigenvalue decomposition $\rho = \sum_x p_x |\phi_x\rangle\langle\phi_x|$ of $\rho$ gives a pure state
decomposition such that the $\rho_x = |\phi_x\rangle\langle\phi_x|$ are orthogonal to each other, we have the
equality $S_3(\rho) = S(\rho)$. This completes the proof. □
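The inequality $H(e_j(\rho)) \ge S(\rho)$ for rank-one measurements can be illustrated for a single qubit. The Python sketch below assumes $\rho = \mathrm{diag}(p, 1 - p)$ and restricts attention to projective measurements in a real orthonormal basis rotated by an angle $\theta$ from the eigenbasis; both restrictions, the natural logarithm, and the function names are choices made for this example only.

```python
import math

def shannon(probs):
    """Shannon entropy (natural log), with the convention 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def measured_distribution(p, theta):
    """Outcome probabilities when rho = diag(p, 1 - p) is measured in the
    orthonormal basis rotated by angle theta from the eigenbasis."""
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    return (p * c2 + (1 - p) * s2, p * s2 + (1 - p) * c2)

p = 0.9
von_neumann = shannon((p, 1 - p))
entropies = [shannon(measured_distribution(p, k * math.pi / 40)) for k in range(41)]
min_entropy = min(entropies)
```

Over the sampled angles the measured entropy never drops below the von Neumann entropy, and the minimum is reached in the eigenbasis ($\theta = 0$).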
Notice that the fact that $S_1$, $S_2$, $S_3$ coincide with the von Neumann entropy
for quantum systems provides alternative expressions with operational
meanings for the von Neumann entropy. The characterization of $S$ by $S_3$ has
been noticed by Jaynes [28]. Here, we remark that $S_1$ could be defined by the
infimum of the Shannon entropy not over indecomposable measurements but over complete
measurements. Then, it is easy to adapt the above proof to show that $S_1$
coincides with the Shannon and von Neumann entropies for classical and quantum
systems. However, as we have noticed in Section 5.2, there exists a GPT where no
complete measurements exist. This is the reason why we have defined $S_1$ over
indecomposable measurements.
In order to see the properties of $S_1$, $S_2$, $S_3$ in a general GPT, let us again consider
the squared system $S_{sq}$. Let $h(x) := -x \log x - (1 - x) \log(1 - x)$ be the binary
Shannon entropy.
PROPOSITION 34. In the squared system, for $s = (c_1, c_2) \in S_{sq}$, we have
$S_1(s) = \min[h(c_1), h(c_2)]$, (16)
(17)
$S_3(s) = \begin{cases} k(0) & s \in R_{1L}, \\ k(c_1) & s \in R_{2B}, \\ k(c_2) & s \in R_{3U}, \\ k(c_2) & s \in R_{4L}, \\ k(0) & s \in R_{4R}, \end{cases}$ (18)
where $k(x) := H(\{x, c_1 - x, c_2 - x, 1 + x - c_1 - c_2\})$ and the regions $R_{1L}, \dots, R_{4R} \subset S_{sq}$
are given in Fig. 1(4).
(See Appendix A for the proof.) See the graphs of $S_1$, $S_2$ and $S_3$ in Fig. 1(1)-1(3).
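Formula (16) is simple enough to evaluate directly. The following Python sketch computes the binary Shannon entropy $h$ and $S_1$ on the squared system; the natural logarithm and the function names are assumptions of this example.

```python
import math

def h(x):
    """Binary Shannon entropy (natural log), with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def s1_square(s):
    """S_1 on the squared system, using formula (16)."""
    c1, c2 = s
    return min(h(c1), h(c2))

s1_center = s1_square((0.5, 0.5))   # center of the square
s1_edge = s1_square((0.0, 0.3))     # boundary state that is not a vertex
s1_vertex = s1_square((1.0, 1.0))   # pure state
```

The value is $\log 2$ at the center and vanishes on the whole boundary, including edge states that are not pure.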
Moreover, in $S_{sq}$, the following relations among $S_1$, $S_2$ and $S_3$ hold.
PROPOSITION 35. For any $s \in S_{sq}$,
Fig. 1. In the squared system, (1), (2) and (3) show the graphs of $S_1$, $S_2$ and $S_3$. (4) specifies the regions
$R_{1L}, \dots, R_{4R}$.
6.1. Concavity
In this section, we consider the concavity properties of $S_1$, $S_2$ and $S_3$. It turns
out that $S_1$ is concave on $S$ in any GPT, while there exist GPT models where $S_2$
and $S_3$ are not concave.
PROPOSITION 36. In any GPT, $S_1$ is concave on $S$.
Proof: Let $(p_x)_x$ ($x = 1, \dots, m$) be a probability distribution and let $s_x \in S$
($x = 1, \dots, m$). Then, from the affinity of the effects $m_j$ and the concavity of the Shannon
entropy, we have
$S_1\left(\sum_x p_x s_x\right) = \inf_{M = (m_j)_j \in \mathcal{M}_{ind}} H\left(\sum_x p_x m_j(s_x)\right) \ge \inf_{M = (m_j)_j \in \mathcal{M}_{ind}} \sum_x p_x H(m_j(s_x)) \ge \sum_x p_x S_1(s_x)$. (19)
□
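For the squared system, the concavity of $S_1$ can also be probed numerically. The Python sketch below takes the expression $S_1(s) = \min[h(c_1), h(c_2)]$ for the squared system as given and tests the concavity inequality on randomly sampled pairs of states; the sampling scheme, the seed, and the tolerance are arbitrary choices for this example.

```python
import math
import random

def h(x):
    """Binary Shannon entropy (natural log), with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1 - x) * math.log(1 - x)

def s1_square(s):
    """S_1 on the squared system: min of the two binary entropies."""
    return min(h(s[0]), h(s[1]))

random.seed(0)
violations = 0
for _ in range(10000):
    s = (random.random(), random.random())
    t = (random.random(), random.random())
    q = random.random()
    mixed = (q * s[0] + (1 - q) * t[0], q * s[1] + (1 - q) * t[1])
    # Concavity: S_1(q s + (1 - q) t) >= q S_1(s) + (1 - q) S_1(t).
    if s1_square(mixed) < q * s1_square(s) + (1 - q) * s1_square(t) - 1e-12:
        violations += 1
```

No violations are expected, since a minimum of concave functions is itself concave.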
In contrast to $S_1$, $S_2$ and $S_3$ are not concave for some GPTs. Counterexamples are easy to
construct; indeed, Figs. 1(2) and 1(3) show that concavity fails for the squared
system.
Instead of concavity, we show that $S_2$ satisfies the following weak concavity.
PROPOSITION 37 (Weak concavity). In any GPT, $S_2$ satisfies the following:
$S_2\left(\sum_x p_x s_x\right) \ge \frac{\sum_x p_x^2 S_2(s_x)^2}{\sum_x p_x S_2(s_x)}$ (20)
for any $\{p_x, s_x\}_{x \in X} \in \mathcal{D}(s)$ (in (20), we interpret the right-hand side as 0 if the $S_2(s_x)$
are all 0).
Proof: To prove this proposition, we use the following lemma (see Appendix A
for the proof).
LEMMA 38. Let $\{p_x, s_x\}_{x \in X} \in \mathcal{D}(s)$, $p_x \ne 0$. Then for any values $\pi_x \ge 0$, $x \in X$,
such that $\sum_x \pi_x = 1$, we have
To see the converse, let $S_2(s) = 0$ for $s \in S$ and let $s = p_1 s_1 + p_2 s_2$, where
$p_1 \in (0, 1)$, $p_1 + p_2 = 1$ and $s_1, s_2 \in S$. Since $S_2(s) = 0$ and $\{p_x; s_x\}_{x=1,2} \in \mathcal{D}(s)$,
we have
$H(X : J) = 0$
for the random variable $X = 1, 2$ and for any $M = (m_j)_j \in \mathcal{M}$. This implies that
the joint probability $p(x, j) := p_x m_j(s_x)$ is a product distribution, or equivalently, the
conditional probability $p(j|x) := p(x, j)/p_x = m_j(s_x)$ is independent of $x$ (notice
that $p_1, p_2 \ne 0$). In particular, we have $m_j(s_1) = p(j|1) = p(j|2) = m_j(s_2)$. Since
this holds for any effect $m_j$, we have $s_1 = s_2$ from the separating property of states.
Therefore, $s$ has only the trivial decomposition and is a pure state.
(ii) Let $s \in S$ be a pure state, so that $\mathcal{P}(s)$ contains essentially only the unique
(trivial) decomposition $\{1; s\}$, for which $H(X) = 0$. Thus, we have $S_3(s) = H(X) = 0$.
Conversely, let $S_3(s) = 0$. Then, for any $\{p_x; s_x\}_x \in \mathcal{P}(s)$, it follows that $H(X) = 0$.
Assume that $s$ is not a pure state. Then, we have $\{p_x, s_x\}_x \in \mathcal{P}(s)$ where $p_{x_1}, p_{x_2} > 0$
for some $x_1, x_2 \in X$. However, this contradicts $H(X) = 0$. Therefore, $s$ is a pure
state. □
In contrast to $S_2$ and $S_3$, $S_1$ does not have this property. For instance, from
(16), $S_1(s) = 0$ for any state $s$ on the boundary (four edges) of $S_{sq}$ (see Fig. 1(1);
note that a state $s$ on an edge but not at a vertex is not a pure state). For general GPTs, we
show the following.
PROPOSITION 41. In any GPT, $S_1(s) = 0$ implies that $s$ is on the boundary
of $S$.
Proof: It suffices to consider the case when $S$ has at least two states. To prove
this proposition, we use the following two lemmata (see Appendix A for the proofs).
LEMMA 42. Let $k, \ell \ge 1$ be integers. Let $h(x) = -x \log x$. If $x_1, \dots, x_\ell \in [0, 1/k]$
and $\sum_j x_j = 1$, then $H(x) = \sum_j h(x_j) \ge \log k$.
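Lemma 42 can be spot-checked numerically. The Python sketch below samples probability vectors whose entries are all at most $1/k$ by splitting each of $k$ blocks of mass $1/k$ into two parts; this particular sampling family, the natural logarithm, and the function names are assumptions of the example and cover only some of the vectors allowed by the lemma.

```python
import math
import random

def entropy(xs):
    """Shannon entropy (natural log), with the convention 0 log 0 = 0."""
    return -sum(x * math.log(x) for x in xs if x > 0)

def sample_bounded(k, rng):
    """A probability vector of length 2k with every entry in [0, 1/k]:
    each of the k blocks of mass 1/k is split into two parts."""
    xs = []
    for _ in range(k):
        a = rng.uniform(0.0, 1.0 / k)
        xs.extend([a, 1.0 / k - a])
    return xs

rng = random.Random(1)
gaps = []
for k in range(1, 8):
    for _ in range(200):
        xs = sample_bounded(k, rng)
        gaps.append(entropy(xs) - math.log(k))
min_gap = min(gaps)
```

The gap $H(x) - \log k$ stays nonnegative up to rounding, as expected from $-\log x_j \ge \log k$ for every $x_j \le 1/k$.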
LEMMA 43. For any $s \in S$, the map $f_s : \mathcal{E}(S) \times S \to \mathbb{R}$, $f_s(e, t) = e(s) - e(t)$,
is continuous.
Let $s \in S$ be such that $S_1(s) = 0$. First we show that $\sup_{(e,t)} f_s(e, t) = 1$. Let
$k \ge 2$ be any integer. Since $S_1(s) = 0$, there is an indecomposable measurement
$M = (m_i)_i \in \mathcal{M}_{ind}$ such that $H(m_i(s)) < h(1/k)$ ($< \log k$). Then we have $m_i(s) \ge 1 - 1/k$
for some $i$, as otherwise we have a contradiction as follows: if $m_{i_0}(s) \in (1/k, 1 - 1/k)$
for some $i_0$ then we have $H(m_i(s)) \ge h(m_{i_0}(s)) > h(1/k)$; while if $m_i(s) \le 1/k$
for all $i$ then we have $H(m_i(s)) \ge \log k$ by Lemma 42. For this $m_i$, Lemma
26 implies that there is a state $t \in S$ such that $m_i(t) = 0$. This implies that
$f_s(m_i, t) \ge 1 - 1/k$. Since $k \ge 2$ is arbitrary, we have $\sup_{(e,t)} f_s(e, t) = 1$. Since
$\mathcal{E}(S) \times S$ is compact, Lemma 43 implies that $f_s(e, t) = 1$ for some $e \in \mathcal{E}(S)$ and
$t \in S$, therefore $e(s) = 1$ and $e(t) = 0$. This implies that $e$ is not constant and $s$ lies
in a supporting hyperplane of $S$, hence $s$ is on the boundary of $S$, as desired. □
Note that, in $S_{sq}$, the converse is also true: all states $s$ on the boundary satisfy
$S_1(s) = 0$. However, this is not the case for every GPT. In particular, one can
construct a GPT where $S_1(s) \ne 0$ even for a pure state $s$. For instance, consider
the GPT introduced in Appendix B with the state space $S \subset \mathbb{R}^2$, which has the
four pure states $(0, 0)$, $(1, 0)$, $(0, 1)$, and $(2, 2)$. Then any indecomposable effect
in $S$ is of the form $\lambda e_i$ such that $0 < \lambda \le 1$ and $e_i$ is one of the four effects
in Table 2 in Appendix B. This implies that for any indecomposable measurement
$M = (m_i)_i \in \mathcal{M}_{ind}$ we have $m_i(0, 0) \le 2/3$ for all $i$, therefore $S_1(0, 0) > 0$ (see
Lemma 42). Thus, in a general GPT, neither direction of "$s$ is pure $\Leftrightarrow$ $S_1(s) = 0$"
holds. In the next section, we consider a class of GPTs with a rather
nice property.
DEFINITION 44 (Equality for pure states). We say that a GPT satisfies the principle
of equality for pure states if, for any pure states $s_1, s_2 \in S$, there exists a bijective
affine map $f$ on $S$ such that $s_2 = f(s_1)$. We call a GPT satisfying this property
a symmetric GPT.
PROPOSITION 45. Classical, quantum, and hyper cuboid systems are all symmetric.
In particular, notice that for quantum systems, for any pure states
$\rho_1 = |\psi_1\rangle\langle\psi_1|$, $\rho_2 = |\psi_2\rangle\langle\psi_2|$,
there exists a unitary operator $U$ such that $\rho_2 = U \rho_1 U^\dagger$.
We show that $S_1$ vanishes for any pure state in a symmetric GPT. To see this,
we first show the following lemma.
LEMMA 46. For any GPT, there exists a pure state $s$ such that $S_1(s) = 0$.
Proof: Let $e_1$ be an indecomposable and pure effect (see Proposition 21), and let
$u - e_1 = e_2 + \dots + e_m$ be an indecomposable decomposition of $u - e_1$ (see Proposition
20). Then, $M = (e_j)_{j=1}^m$ is an indecomposable measurement. From Proposition 18,
there exists a pure state $s$ such that $e_1(s) = 1$. Thus, we have $H(e_j(s)) = 0$, and
$S_1(s) = 0$. □
PROPOSITION 47. Let $S$ be the state space of a symmetric GPT. Then, $S_1(s) = 0$
for any pure state $s$.
Proof: From Lemma 46, there exists a pure state $s_0$ such that $S_1(s_0) = 0$. For any
pure state $s$, there exists a bijective affine map $f$ such that $s_0 = f(s)$. Let $M = (m_j)_j$ be
an indecomposable measurement such that $H(m_j(s_0)) = 0$. Then, it is easy to see
that $\tilde{M} := (\tilde{m}_j)_j$, where $\tilde{m}_j := m_j \circ f$, is an indecomposable measurement. Therefore,
it follows that $H(\tilde{m}_j(s)) = H(m_j(s_0)) = 0$. Thus, we have proved that $S_1(s) = 0$
for any pure state $s$. □
8. Concluding remarks
We have discussed some distinguishability measures (especially, the Kolmogorov
distance and fidelity) in any GPT. As in quantum information
theory, these measures should be convenient for constructing an information
theory in GPTs. Indeed, we have reformulated the no-cloning theorem and the
information-disturbance theorem using fidelity.
We have also proposed and investigated three quantities related to entropies
in any GPT. All of them are generalizations of the Shannon and von Neumann
entropies for classical and quantum systems, respectively. However, they are in
general distinct quantities, as the example of the squared system shows. The concavity
of $S_1$ holds in any GPT, while it fails for $S_2$ and $S_3$ in some GPTs. $S_2$
and $S_3$ provide a measure of pureness, while $S_1$ does not. However, in a symmetric
GPT, which satisfies the principle of equality for pure states, it follows
that $S_1(s) = 0$ for any pure state $s$. In the attempt to find principles of our
world, which is described by quantum theory at least for the present, we
think that it is sufficient to consider symmetric GPTs, assuming the principle
of equality for pure states. However, let us remark here that both classical and
quantum systems satisfy a stronger principle, which we call strong equality for
pure states or equality for distinguishable pure states, and which can be formulated as
follows.
DEFINITION 48. We say that a GPT satisfies the principle of strong equality
for pure states if it satisfies the following: Let $\{s_i \in S_{\mathrm{pure}}\}_{i=1}^{n}$ and
$\{t_i \in S_{\mathrm{pure}}\}_{i=1}^{m}$ (let $n \ge m$) be two distinguishable sets of pure states, i.e. there
exists a measurement $M = (m_i)_i$ ($N = (n_i)_i$) such that $m_i(s_j) = \delta_{ij}$ ($n_i(t_j) = \delta_{ij}$).
Then, there exists a bijective affine map $f$ on $S$ such that $t_i = f(s_i)$ ($i =
1, \ldots, m$).
Notice that the squared GPT is symmetric but does not satisfy this strong equality
for pure states. (For instance, consider $\{(0, 0), (0, 1)\}$ and $\{(0, 0), (1, 1)\}$.) It might
be interesting to consider these kinds of stronger conditions which classical and
quantum systems satisfy. In particular, we do not know any principles which make
the converse of Proposition 47 true.
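The counterexample for the squared GPT can be checked by brute force. The sketch below rests on one assumption that the text does not spell out: the bijective affine maps of the unit square onto itself are exactly its eight geometric symmetries (an optional swap of the two coordinates followed by optional flips $c \mapsto 1 - c$). The function names are hypothetical.

```python
from itertools import product

# Pure states (vertices) of the squared state space [0, 1]^2.
VERTICES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def square_symmetries():
    """The 8 affine bijections of the unit square, restricted to vertices:
    an optional coordinate swap followed by optional flips c -> 1 - c."""
    maps = []
    for swap, flip1, flip2 in product([False, True], repeat=3):
        def f(s, swap=swap, flip1=flip1, flip2=flip2):
            c1, c2 = (s[1], s[0]) if swap else s
            return (1 - c1 if flip1 else c1, 1 - c2 if flip2 else c2)
        maps.append(f)
    return maps

def exists_map(sources, targets):
    """True if a single symmetry sends sources[i] to targets[i] for all i."""
    return any(all(f(s) == t for s, t in zip(sources, targets))
               for f in square_symmetries())

def is_symmetric():
    """Equality for pure states: every vertex can be mapped to every vertex."""
    return all(exists_map([s], [t]) for s in VERTICES for t in VERTICES)
```

Under this assumption, `is_symmetric()` holds, while `exists_map([(0, 0), (0, 1)], [(0, 0), (1, 1)])` fails: a symmetry fixing $(0, 0)$ is either the identity or the coordinate swap, and neither sends the adjacent vertex $(0, 1)$ to the diagonal vertex $(1, 1)$.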
Acknowledgment
We would like to thank Dr. Imafuku and Dr. Miyadera for useful comments and
discussions. This work was partially supported by a Grant-in-Aid for Young
Scientists (B) (No. 20700017 and No. 22740079) from the Ministry of Education, Culture,
Sports, Science and Technology (MEXT), Japan.
Note added. Related but independent works on entropies in GPTs have appeared
recently in [31, 32] while we were completing this paper. We will investigate the
relations between our work and the results there in the near future.
Then, we have
$$D(\Lambda(s_1), \Lambda(s_2)) = \frac{1}{2} \sum_i |n_i(s_1) - n_i(s_2)| \le D(s_1, s_2).$$ □
Then, we have
where we have used (i) the affinity of $m_i$, (ii) the triangle inequality of $|\cdot|$, and
(iii) $\sum_i m_i = u$. □
Proof of Proposition 6: The same proof as for Proposition 2 can be used. □
Proof of Proposition 7: Let $M = (m_i)_i$ be an optimal measurement which attains
the minimum,
Using the affinity of the effects and the Schwarz inequality between the vectors $(\sqrt{p_i m_k(s_i)})_i$
and $(\sqrt{q_i m_k(t_i)})_i$, one gets
Proof of Proposition 9: (i) Let $s_1, s_2 \in S_A$, $t \in S_B$, and let $M = (m_i)_i \in
\mathcal{M}(S_A)$ be an optimal measurement such that $F(s_1, s_2) = \sum_i \sqrt{m_i(s_1) m_i(s_2)}$. Then,
$f_i = m_i \otimes u_B$ gives a measurement $F := (f_i)_i \in \mathcal{M}(S_A \otimes S_B)$ and $F(s_1, s_2) =
\sum_i \sqrt{f_i(s_1 \otimes t) f_i(s_2 \otimes t)} \ge F(s_1 \otimes t, s_2 \otimes t)$. Conversely, let $G = (g_i)_i \in \mathcal{M}(S_A \otimes S_B)$
be an optimal measurement such that $F(s_1 \otimes t, s_2 \otimes t) = \sum_i \sqrt{g_i(s_1 \otimes t) g_i(s_2 \otimes t)}$.
Then, $m_i(s) := g_i(s \otimes t)$, $\forall s \in S_A$, gives a measurement $M = (m_i)_i \in \mathcal{M}(S_A)$. Thus,
$F(s_1 \otimes t, s_2 \otimes t) = \sum_i \sqrt{m_i(s_1) m_i(s_2)} \ge F(s_1, s_2)$.
(ii) Let $s, t \in S_A \otimes S_B$ and let $s_A, t_A$ be the reduced states of $s, t$ on the
system $A$. Let $M = (m_i)_i \in \mathcal{M}(S_A)$ be an optimal measurement such that $F(s_A, t_A) =
\sum_i \sqrt{m_i(s_A) m_i(t_A)}$. By noting that $g_i = m_i \otimes u_B$ gives a measurement on $S_A \otimes S_B$
and $g_i(s) = m_i(s_A)$, $g_i(t) = m_i(t_A)$, one has $F(s_A, t_A) = \sum_i \sqrt{g_i(s) g_i(t)} \ge F(s, t)$.
(iii) Let $s_1, s_2 \in S_A$, $t_1, t_2 \in S_B$, and let $M = (m_i)_i \in \mathcal{M}(S_A)$ and $N =
(n_j)_j \in \mathcal{M}(S_B)$ be optimal measurements such that $F(s_1, s_2) = \sum_i \sqrt{m_i(s_1) m_i(s_2)}$
and $F(t_1, t_2) = \sum_j \sqrt{n_j(t_1) n_j(t_2)}$, respectively. Then, $g_{ij} = m_i \otimes n_j \in \mathcal{E}(S_A \otimes S_B)$
gives a measurement $G = (g_{ij})_{ij} \in \mathcal{M}(S_A \otimes S_B)$, and
$$e = k \sum_x \frac{p_x}{k\lambda\mu}\, e_x$$
of $e$ into a finite collection of effects (note that $0 \le p_x/(k\lambda\mu) \le 1$).
Our remaining task is to show that each $q_x e_x$, where $q_x = p_x/(k\lambda\mu)$, is an
indecomposable effect provided $q_x > 0$. Let $q_x e_x = e' + e''$ with $e', e'' \in \mathcal{E}(S)$,
$e', e'' \neq 0$. Then we have $e_x = (q_x\mu)^{-1}(e' + e'')$. By the property of $C$, there exist
$\nu', \nu'' > 0$ and $\hat{e}', \hat{e}'' \in C$ such that $e' = \nu'\hat{e}'$ and $e'' = \nu''\hat{e}''$. We have $e_x = \eta'\hat{e}' + \eta''\hat{e}''$,
where $\eta' = (q_x\mu)^{-1}\nu' > 0$ and $\eta'' = (q_x\mu)^{-1}\nu'' > 0$. Moreover, by the definition of
$C$, we have
$$1 = g(e_x) = \eta' g(\hat{e}') + \eta'' g(\hat{e}'') = \eta' + \eta''.$$
Since $e_x$ is an extreme point of $C$, it follows that $e_x = \hat{e}' = \hat{e}''$;
therefore
$e' = \nu' e_x = (\nu'/(\mu q_x))\, q_x e_x$. Hence $q_x e_x$ is indecomposable as desired, concluding
the proof of Proposition 20. □
Proof of Proposition 22: It is easy to show that $\tilde{e}$
is an effect. Let $\tilde{e} = e_1 + e_2$ be
an effect decomposition of $\tilde{e}$. Then, $e = q e_1 + q e_2$ is an effect decomposition of $e$
since $q \le 1$. Since $e$ is indecomposable, there exists $c \in \mathbb{R}$ such that $q e_1 = c e$, or
$e_1 = c\tilde{e}$. Thus, $\tilde{e}$ is indecomposable. □
Proof of Proposition 23: Let $e = \lambda e_1 + (1 - \lambda) e_2$ be a convex decomposition of
$e$ with $\lambda \in (0, 1)$. It is easy to see that $e_1(s) = e_2(s) = 1$. Since $\lambda e_1, (1 - \lambda) e_2 \in \mathcal{E}$
and $e$ is indecomposable, we have $\lambda e_1 = c e$ for some $c \in \mathbb{R}$. Applying this to $s$,
we have $\lambda = c$, and thus $e_1 = e$. Therefore, $e$ is a pure effect. □
Proof of Lemma 26: First we show that an indecomposable $e$ is not constant
on $S$. Since $S$ has at least two states, the separation property of states implies
that a nonconstant effect $f \in \mathcal{E}(S)$ exists. If $e$ takes a constant value $c \in (0, 1]$, then the
decomposition $e = cf + c(u - f)$ contradicts the indecomposability of $e$. Hence $e$ is
not constant. Second, if $e$ does not take the value 0 at any state, then we have $e(s) \ge c$
for some $c > 0$ and all $s \in S$ since $e$ is continuous and $S$ is compact. Now the
Since the right-hand side is concave on $\alpha \in [0, 1]$, it takes the minimum at either
$\alpha = 0$ or $\alpha = 1$, hence we have $S_1(s) = \min[h(c_1), h(c_2)]$, as desired.
Second, we compute $S_2(s)$ for $s = (c_1, c_2) \in S_{\mathrm{sq}}$. Let $\{p_x, s_x\}_x \in \mathcal{D}(s)$ with
$s_x = (c_{x,1}, c_{x,2})$ and $M = (m_j)_j \in \mathcal{M}_{\mathrm{ind}}$. Again, it suffices to consider the case
when $M$ contains at most one effect $m_j$ of each of the four types listed in Table
1; indeed, if $m_{j_1}$ and $m_{j_2}$ are of the same type (in the above sense), then by
replacing the pair of $m_{j_1}$ and $m_{j_2}$ with $m_{j_1} + m_{j_2}$ the value of $H(X : J)$ does not
change. Thus we may assume without loss of generality that $M$ consists of the
four effects in Table 1 with parameters $\alpha_1 = \alpha_2 = \alpha$, $\alpha_3 = \alpha_4 = \beta := 1 - \alpha$ for
some $\alpha \in [0, 1]$. Now a direct calculation implies that
$$H(X : J) = h(\alpha) + \alpha h(c_1) + \beta h(c_2) - \sum_{x \in X} p_x \bigl( h(\alpha) + \alpha h(c_{x,1}) + \beta h(c_{x,2}) \bigr)$$
$$= \alpha h(c_1) + \beta h(c_2) - \alpha \sum_x p_x h(c_{x,1}) - \beta \sum_x p_x h(c_{x,2}).$$
Since all the pure states $(c_{x,1}, c_{x,2})$ in $S_{\mathrm{sq}}$ satisfy $c_{x,1} \in \{0, 1\}$ and $c_{x,2} \in \{0, 1\}$, we
have $H(X : J) = \alpha h(c_1) + \beta h(c_2) = \alpha h(c_1) + (1 - \alpha) h(c_2)$, which is independent
of the given decomposition $\{p_x, s_x\}_x$ of $s$. This implies that $S_2(s) = \sup_{X,J} H(X :
J) = \max[h(c_1), h(c_2)]$, as desired.
Finally, we compute $S_3(s)$ for $s = (c_1, c_2) \in S_{\mathrm{sq}}$. For a reason similar to the case
of $S_1$, to compute $S_3(s)$ it suffices to consider a decomposition $\{p_x, s_x\}_{x \in X} \in \mathcal{D}(s)$ such
that all $s_x$ are different pure states. Thus we may assume that $X = \{00, 01, 10, 11\}$,
$s_{00} = (0, 0)$, $s_{01} = (0, 1)$, $s_{10} = (1, 0)$ and $s_{11} = (1, 1)$. Now by putting $p_{11} = p$ we
have
$$p_{10} = c_1 - p, \quad p_{01} = c_2 - p, \quad p_{00} = 1 - c_1 - c_2 + p.$$
In the above expression, we have $p_x \in [0, 1]$ for every $x$ if and only if $p_m \le p \le p_M$,
where
$$p_m = \max[0, c_1 + c_2 - 1], \quad p_M = \min[c_1, c_2].$$
Hence we have $S_3(s) = \inf_{p_m \le p \le p_M} H(p_x)$. Now a direct calculation shows that
$$\left( \frac{d}{dp} \right)^2 H(p_x) = - \sum_{x \in X} \frac{1}{p_x} < 0$$
for any $p \in (p_m, p_M)$, therefore $H(p_x)$ takes the minimum at either $p = p_m$ or
$p = p_M$: $S_3(s) = \min[H(p_x)|_{p = p_m}, H(p_x)|_{p = p_M}]$.
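The second-derivative computation can be spelled out as follows. This is a sketch under the assumption that $\log$ denotes the natural logarithm; any other base only rescales the result by a positive constant, so the sign is unaffected. Each $p_x$ is affine in $p$ with slope $\pm 1$, which is the only property used.

```latex
\begin{align*}
H(p_x) &= -\sum_{x \in X} p_x \ln p_x,
  \qquad \frac{dp_x}{dp} = \pm 1, \\
\frac{d}{dp}\bigl(-p_x \ln p_x\bigr)
  &= -\bigl(\ln p_x + 1\bigr)\frac{dp_x}{dp}, \\
\frac{d^2}{dp^2}\bigl(-p_x \ln p_x\bigr)
  &= -\frac{1}{p_x}\left(\frac{dp_x}{dp}\right)^{2} = -\frac{1}{p_x}, \\
\left(\frac{d}{dp}\right)^{2} H(p_x)
  &= -\sum_{x \in X} \frac{1}{p_x} < 0
  \qquad \text{for } p \in (p_m, p_M).
\end{align*}
```

Strict concavity on the open interval is what forces the infimum onto one of the two endpoints.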
First we consider the case that $c_1 \le c_2$ and $c_1 + c_2 \le 1$ (i.e. $s \in \mathrm{R2U}$ or $s \in \mathrm{R2B}$),
therefore $p_m = 0$ and $p_M = c_1$. If $p = p_m$ then we have $(p_x)_x = (1 - c_1 - c_2, c_2, c_1, 0)$,
while if $p = p_M$ then we have $(p_x)_x = (1 - c_2, c_2 - c_1, 0, c_1)$. Writing
$\mathrm{diff}_{M-m} := H(p_x)|_{p = p_M} - H(p_x)|_{p = p_m}$, this implies that
$$\frac{\partial}{\partial c_2} \mathrm{diff}_{M-m} = \log \frac{(1 - c_2) c_2}{(c_2 - c_1)(1 - c_1 - c_2)},$$
which is nonnegative by the conditions for $c_1$ and $c_2$. Since $\mathrm{diff}_{M-m} = 0$ when
$c_2 = 1/2$, it follows that $\mathrm{diff}_{M-m} \le 0$ and $S_3(s) = H(p_x)|_{p = p_M}$ when $0 \le c_2 \le 1/2$
(i.e. $s \in \mathrm{R2B}$), and $\mathrm{diff}_{M-m} \ge 0$ and $S_3(s) = H(p_x)|_{p = p_m}$ when $1/2 \le c_2 \le 1$ (i.e.
$s \in \mathrm{R2U}$). Hence the expressions for $S_3(s)$ in (18) for $s \in \mathrm{R2U}$ and $s \in \mathrm{R2B}$ are
proved. The claim for the remaining cases follows by considering suitable symmetry
of the state space $S_{\mathrm{sq}}$. □
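The endpoint formula for $S_3$ on the squared system, together with the closed forms $S_1(s) = \min[h(c_1), h(c_2)]$ and $S_2(s) = \max[h(c_1), h(c_2)]$ obtained above, can be checked numerically. The following is a minimal sketch, assuming base-2 logarithms and the convention $0 \log 0 = 0$. The function names are hypothetical, and `S3_grid` is only a brute-force comparison over the one-parameter family of vertex decompositions, not part of the proof.

```python
import math

def g(x):
    """-x log2 x with the convention g(0) = 0 (tiny negatives from rounding count as 0)."""
    return 0.0 if x <= 0 else -x * math.log2(x)

def h(c):
    """Binary Shannon entropy h(c) = g(c) + g(1 - c)."""
    return g(c) + g(1 - c)

def weights(c1, c2, p):
    """(p00, p01, p10, p11) of the vertex decomposition of (c1, c2) with p11 = p."""
    return (1 - c1 - c2 + p, c2 - p, c1 - p, p)

def H(ws):
    """Shannon entropy of a weight vector."""
    return sum(g(w) for w in ws)

def p_range(c1, c2):
    """Admissible interval [p_m, p_M] for p = p11."""
    return max(0.0, c1 + c2 - 1), min(c1, c2)

def S1(c1, c2):
    return min(h(c1), h(c2))

def S2(c1, c2):
    return max(h(c1), h(c2))

def S3(c1, c2):
    """Endpoint formula: minimum of H at p = p_m and p = p_M."""
    p_m, p_M = p_range(c1, c2)
    return min(H(weights(c1, c2, p_m)), H(weights(c1, c2, p_M)))

def S3_grid(c1, c2, n=2000):
    """Brute-force minimum of H over a grid of p in [p_m, p_M]."""
    p_m, p_M = p_range(c1, c2)
    return min(H(weights(c1, c2, p_m + (p_M - p_m) * k / n)) for k in range(n + 1))
```

On a grid of states one can compare `S3` with `S3_grid` and confirm the ordering `S1(c1, c2) <= S2(c1, c2) <= S3(c1, c2)` up to rounding error.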
Proof of Proposition 35: The first inequality $S_1(s) \le S_2(s)$ is obvious by (16)
and (17). For the second inequality $S_2(s) \le S_3(s)$, by symmetry, we may assume
without loss of generality that $s = (c_1, c_2) \in \mathrm{R2U}$, i.e. $1/2 \le c_2 \le 1 - c_1$. This
condition implies that $h(c_1) \le h(c_2)$, therefore $S_2(s) = h(c_2)$. On the other hand,
(18) implies that $S_3(s) = g(c_1) + g(c_2) + g(1 - c_1 - c_2)$, where $g(x) = -x \log x$.
Thus we have
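Proof of Lemma 38: For each $x \in X$, let $\{q_y^x, t_y^x\}_{y \in Y_x} \in \mathcal{D}(s_x)$ and $M_x = (m_j^x)_{j \in J_x} \in
\mathcal{M}$. Then we have $\{p_x q_y^x, t_y^x\}_{x \in X, y \in Y_x} \in \mathcal{D}(s)$ and $M' = (\pi_x m_j^x)_{x \in X, j \in J_x} \in \mathcal{M}$. Let
$Z = \{(x, y) \mid x \in X, y \in Y_x\}$ and $K = \{(x, j) \mid x \in X, j \in J_x\}$ denote the index sets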
$$= \sum_{(x,j)} \pi_x \Bigl( h(m_j^x(s)) - \sum_{(x',y)} p_{x'} q_y^{x'} h(m_j^x(t_y^{x'})) \Bigr).$$
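$\varphi(S) \subset \mathbb{R}^n$ are homeomorphic via $\varphi$. By identifying $S$ with $\varphi(S)$ in this way, the
map $I_S$ is written as $I_S(e; \lambda_1, \ldots, \lambda_n) = \lambda_1 e(t_1) + \cdots + \lambda_n e(t_n)$. This implies that
$I_S$ is continuous, since both $(e; \lambda_1, \ldots, \lambda_n) \mapsto e(t_j)$ and $(e; \lambda_1, \ldots, \lambda_n) \mapsto \lambda_j$ are
continuous. □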
         value at
$e_1$    0      0      1/2    1
$e_2$    0      1/2    0      1
$e_3$    2/3    0      1      0
$e_4$    2/3    1      0      0
REFERENCES