Ando and Kaufman 1965

Bayesian Analysis of the Independent Multinormal Process. Neither Mean Nor Precision Known
Author(s): Albert Ando and G. M. Kaufman
Source: Journal of the American Statistical Association, Vol. 60, No. 309 (Mar., 1965), pp. 347-358
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2283159
Accessed: 07/04/2014 18:35
BAYESIAN ANALYSIS OF THE INDEPENDENT MULTINORMAL PROCESS - NEITHER MEAN NOR PRECISION KNOWN*

ALBERT ANDO
University of Pennsylvania
AND
G. M. KAUFMAN
Massachusetts Institute of Technology
Under the assumption that neither the mean vector nor the variance-covariance matrix are known with certainty, the natural conjugate family of prior densities for the multivariate Normal process is identified. Prior-posterior and preposterior analysis is done assuming that the prior is in the natural conjugate family. A procedure is presented for obtaining non-degenerate joint posterior and preposterior distributions of all parameters even when the number of objective sample observations is less than the number of parameters of the process.
1. INTRODUCTION

IN THIS paper we develop the distribution theory necessary to carry out Bayesian analysis of the multivariate Normal process as defined in Section 1.1 below when neither the mean vector nor the variance-covariance matrix of the process is known with certainty. The development here generalizes Raiffa's and Schlaifer's treatment of the multivariate Normal process as done in Part B, Chapter 12 of reference [5], in which it is assumed that the variance-covariance matrix is known up to a particular multiplicative constant. We drop this assumption here.
In Section 1 we define the process, identify a class of natural conjugate distributions, and do prior-posterior analysis. The conditional and unconditional sampling distributions of some (sufficient) statistics are presented in Section 2. In particular, we prove that the distribution of the sample mean vector marginal with respect to the sample variance-covariance matrix, to the process mean vector, and to the process variance-covariance matrix is multivariate Student whenever the prior is in the natural conjugate family of distributions. We then use the results of Sections 1 and 2 to do preposterior analysis in Section 3.
We also show in Section 3 that Bayesian joint inference (finding joint posterior and preposterior densities of the mean vector and the variance-covariance matrix) is possible even when classical joint inference is not, i.e. when the number of objective sample observations is less than the number of distinct elements of the mean vector and variance-covariance matrix of the process.
Geisser and Cornfield [3] and Tiao and Zellner [7] analyze the multivariate Normal process and multivariate Normal Regression process respectively under identical assumptions about the state of knowledge of the parameters of the process. Their presentations differ from that given here in three respects:
* The contribution by Ando to this paper is partially supported by a grant from the National Science Foundation. The authors wish to thank the referees for helpful suggestions and Jim Martin for proofreading early drafts, and to gratefully acknowledge valuable comments and criticisms given to them by members of the Decisions Under Uncertainty Seminar, conducted by Professor Howard Raiffa at the Harvard Business School.
first, following the lead of Jeffreys [4], both sets of authors assume that the joint prior on the mean vector and variance-covariance matrix is a special (degenerate) case of a natural conjugate density; second, here we find sampling distributions unconditional as regards the parameters of the process and do preposterior analysis; third, by doing the analysis for the complete natural conjugate family, we are able to provide a procedure for deriving joint posterior and some joint preposterior distributions of all parameters under conditions mentioned in the paragraph immediately above.
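1.1 Definition of the Process
As in Anderson [1], we define an r-dimensional Independent Multinormal process as one that generates independent r × 1 random vectors $\tilde{x}^{(1)}, \ldots, \tilde{x}^{(j)}, \ldots$ with identical densities
$$f_N^{(r)}(x \mid \mu, h) \equiv (2\pi)^{-\frac12 r} \exp[-\tfrac12 (x-\mu)^t h (x-\mu)]\,|h|^{\frac12}, \qquad -\infty < x < \infty,\ \ -\infty < \mu < \infty,\ \ h \text{ is PDS}. \tag{1}$$
Here PDS abbreviates positive definite and symmetric.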
1.2 Likelihood of a Sample
The likelihood that the process will generate n successive values $x^{(1)}, \ldots, x^{(n)}$ is
$$(2\pi)^{-\frac12 rn} \exp\Big[-\tfrac12 \sum_{j=1}^{n} (x^{(j)}-\mu)^t h (x^{(j)}-\mu)\Big]\,|h|^{\frac12 n}. \tag{2}$$
If the stopping process is non-informative, as defined in reference [1], this is the likelihood of a sample consisting of n observations $x^{(1)}, \ldots, x^{(n)}$.
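When neither h nor μ is known, we may compute these statistics:
$$m \equiv \frac{1}{n}\sum_{j=1}^{n} x^{(j)}, \qquad \nu \equiv n - r \ \text{(redundant)}, \tag{3a}$$
and
$$V \equiv \sum_{j=1}^{n} (x^{(j)}-m)(x^{(j)}-m)^t. \tag{3b}$$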
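The statistics in (3a) and (3b) are straightforward to compute. The following is a minimal sketch in plain Python, not part of the original paper; the function name `sample_statistics` and the toy data are illustrative assumptions.

```python
def sample_statistics(xs):
    """Return (m, V, n, nu) as in (3a)-(3b) for a list of r-dimensional observations."""
    n = len(xs)
    r = len(xs[0])
    # Sample mean vector m = (1/n) * sum_j x^(j)
    m = [sum(x[i] for x in xs) / n for i in range(r)]
    # Scatter matrix V = sum_j (x^(j) - m)(x^(j) - m)^t (not divided by n or n - 1)
    V = [[sum((x[i] - m[i]) * (x[k] - m[k]) for x in xs) for k in range(r)]
         for i in range(r)]
    nu = n - r  # "redundant": fully determined by n and r
    return m, V, n, nu


# Hypothetical data: n = 3 observations of dimension r = 2.
xs = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
m, V, n, nu = sample_statistics(xs)
print(m, V, n, nu)
```

Note that V is the raw sum of squares and cross products about the mean, which is the form that enters the likelihood kernels below.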
It is well known¹ that the kernel of the joint likelihood of (m, V) is, provided ν > 0,
$$\exp[-\tfrac12 n(m-\mu)^t h(m-\mu)]\,|h|^{\frac12} \exp[-\tfrac12 \operatorname{tr} hV]\,|h|^{\frac12(\nu+r-1)}, \tag{4a}$$
the kernel of the marginal likelihood of m is, provided n > 0,
$$\exp[-\tfrac12 n(m-\mu)^t h(m-\mu)]\,|h|^{\frac12}, \tag{4b}$$
and the kernel of the marginal likelihood of V is, provided ν > 0 and V is PDS,
$$\exp[-\tfrac12 \operatorname{tr} hV]\,|h|^{\frac12(\nu+r-1)}. \tag{4c}$$
Formula (4c) is the kernel of a Wishart distribution.

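A random matrix $\tilde V$ of dimension (r × r) will be called "Wishart distributed with parameter (h, ν)" if
$$\tilde V \sim f_W^{(r)}(V \mid h, \nu) \equiv \begin{cases} w(r,\nu)\,\exp[-\tfrac12 \operatorname{tr} hV]\,|V|^{\frac12\nu-1}\,|h|^{\frac12(\nu+r-1)} & \text{if } V \text{ is PDS and } \nu > 0, \\ 0 & \text{otherwise,} \end{cases} \tag{4d}$$
where
$$[w(r,\nu)]^{-1} \equiv 2^{\frac12(\nu+r-1)r}\,\pi^{r(r-1)/4} \prod_{i=1}^{r} \Gamma\big(\tfrac12(\nu+r-i)\big). \tag{4e}$$
¹ See for example, Anderson [1], Theorem 3.3.2 and pp. 154-60.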
That (m, V, ν) defines a set of sufficient statistics for (μ, h) is shown in section 3.3.3 of [1].
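We will wish to express (4a) in such a fashion that it automatically reduces to (4b) when only (m, n) is available and to (4c) when only (V, ν) is available. In addition, we will wish to treat the cases that arise when V is singular. Hence we define
$$\phi \equiv \begin{cases} n-1 & \text{if } \nu \le 0, \\ 0 & \text{if } \nu > 0, \end{cases} \tag{5a}$$
$$V^* \equiv \begin{cases} V & \text{if } V \text{ is non-singular,} \\ 0 & \text{otherwise,} \end{cases} \tag{5b}$$
and
$$\delta \equiv \begin{cases} 1 & \text{if } n > 0, \\ 0 & \text{if } n = 0. \end{cases} \tag{5c}$$
In terms of (5a), (5b), and (5c) we may rewrite (4a), (4b), and (4c) as
$$\exp[-\tfrac12 n(m-\mu)^t h(m-\mu)]\,|h|^{\frac12\delta} \exp[-\tfrac12 \operatorname{tr} hV^*]\,|h|^{\frac12(\nu+r-\phi-1)}, \tag{4a'}$$
$$\exp[-\tfrac12 n(m-\mu)^t h(m-\mu)]\,|h|^{\frac12\delta}, \tag{4b'}$$
$$\exp[-\tfrac12 \operatorname{tr} hV^*]\,|h|^{\frac12(\nu+r-\phi-1)}. \tag{4c'}$$
Notice that (4a') is now defined even when ν < 0.
By adopting the convention that
(1) V* = 0 and φ = n − 1 when V is unknown or irrelevant, and
(2) n = 0 when m is unknown or irrelevant,
the kernel (4a') reduces to (4b') in the first case and to (4c') in the second.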
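The bookkeeping behind these conventions can be checked by counting powers of |h|. The short worked equation below is an illustration rather than part of the original text; it writes φ for the quantity defined in (5a), which is assumed here to equal n − 1 when V is unknown or singular and 0 otherwise.

$$\underbrace{\tfrac12 n}_{\text{power of } |h| \text{ in (2)}} = \underbrace{\tfrac12}_{\text{from (4b)}} + \underbrace{\tfrac12(\nu + r - 1)}_{\text{from (4c)}} \quad\text{since } \nu = n - r, \qquad\text{and}\qquad \tfrac12(\nu + r - \phi - 1)\Big|_{\phi = n-1} = \tfrac12\big((n-r) + r - (n-1) - 1\big) = 0.$$

So when V is set aside (V* = 0, φ = n − 1) the second factor of (4a') contributes no power of |h| and only the Normal kernel (4b') remains.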
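1.3 Conjugate Distributions of $(\tilde\mu, \tilde h)$, $\tilde\mu$ and $\tilde h$
When both $\tilde\mu$ and $\tilde h$ are random variables, the natural conjugate of (4a') is the Normal-Wishart distribution $f_{NW}^{(r)}(\mu, h \mid m, V, n, \nu)$ defined as equal to
$$\begin{cases} k(r,\nu)\exp[-\tfrac12 n(\mu-m)^t h(\mu-m)]\,|h|^{\frac12\delta}\exp[-\tfrac12\operatorname{tr} hV^*]\,|V^*|^{\frac12(\nu+r-1)}\,|h|^{\frac12\nu-1} = f_N^{(r)}(\mu \mid m, hn)\,f_W^{(r)}(h \mid V, \nu) & \text{if } n > 0 \text{ and } \nu > 0, \\ 0 & \text{otherwise,} \end{cases} \tag{6a}$$
where V* and δ are defined as in (5), and
$$k(r,\nu) \equiv (2\pi)^{-\frac12 r}\,n^{\frac12 r}\,w(r,\nu). \tag{6b}$$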
If (6a) is to be a proper density function V must be PDS, ν > 0 and n > 0 so that in this case δ = 1 and V* = V is PDS. We write the first expression in (6a) with δ and V* so that formulas for posterior densities will generalize automatically to the case where prior information is such that one or more of these conditions hold: V is singular, n = 0, ν = 0.
We obtain the marginal prior of $\tilde\mu$ by integrating (6a) with respect to h; if ν > 0, n > 0, V is PDS, and we define $H_\nu \equiv \nu n V^{-1}$, then
$$D(\mu \mid m, V, n, \nu) = f_S^{(r)}(\mu \mid m, H_\nu, \nu) \propto [\nu + (\mu-m)^t H_\nu (\mu-m)]^{-\frac12(\nu+r)}. \tag{6c}$$
This distribution of $\tilde\mu$ is the non-degenerate multivariate Student distribution defined in formula (8-26) of [5].²
Proof: We integrate over the region $R_h \equiv \{h \mid h \text{ is PDS}\}$:
$$D(\mu \mid m, V, n, \nu) = \int_{R_h} f_N^{(r)}(\mu \mid m, hn)\,f_W^{(r)}(h \mid V, \nu)\,dh \propto \int_{R_h} \exp[-\tfrac12\operatorname{tr} h\{n(\mu-m)(\mu-m)^t + V\}]\,|h|^{\frac12(\nu+\delta)-1}\,dh.$$
But as the integrand in the integral immediately above is the kernel of a Wishart density with parameter $(n(\mu-m)(\mu-m)^t + V,\ \nu+\delta)$,
$$D(\mu \mid m, V, n, \nu) \propto [1 + (\mu-m)^t (nV^{-1}) (\mu-m)]^{-\frac12(\nu+\delta+r-1)}.$$
Provided that ν > 0, V is PDS and n > 0, $H_\nu \equiv \nu n V^{-1}$ is PDS, δ = 1, and we have (6c).
If ν > 0 but V is singular, $V^{*-1}$ does not exist so neither does the marginal distribution of $\tilde\mu$. And if n = 0 the marginal distribution of $\tilde\mu$ does not exist.
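The step from the Wishart integral to the bracketed Student kernel is not spelled out in the proof. A sketch of the missing algebra, assuming V is PDS and writing a for μ − m (a shorthand introduced only for this illustration), uses the standard rank-one determinant identity:

$$|V + n\,aa^t| = |V|\,\big(1 + n\,a^t V^{-1} a\big), \qquad\text{so}\qquad |V + n\,aa^t|^{-\frac12(\nu+\delta+r-1)} \propto \big[1 + a^t (nV^{-1}) a\big]^{-\frac12(\nu+\delta+r-1)}.$$

With δ = 1 and $H_\nu = \nu n V^{-1}$ this becomes

$$\big[1 + \tfrac{1}{\nu}\,a^t H_\nu a\big]^{-\frac12(\nu+r)} = \nu^{\frac12(\nu+r)}\,\big[\nu + a^t H_\nu a\big]^{-\frac12(\nu+r)},$$

which is the kernel in (6c) up to a constant not involving μ.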
Similarly, we obtain the marginal prior on $\tilde h$ by integrating (6a) with respect to μ. If ν > 0, n ≥ 0 and V is PDS, then
$$D(h \mid m, V, n, \nu) = f_W^{(r)}(h \mid V, \nu) \propto \exp[-\tfrac12\operatorname{tr} hV]\,|h|^{\frac12\nu-1}. \tag{6d}$$
If a Normal-Wishart distribution with parameter (m', V', n', ν') is assigned to $(\tilde\mu, \tilde h)$ and if a sample then yields a statistic (m, V, n, ν) the posterior distribution of $(\tilde\mu, \tilde h)$ will be Normal-Wishart with parameter (m'', V*'', n'', ν'') where
$$n'' = n' + n, \qquad \delta'' = \begin{cases} 1 & \text{if } n'' > 0, \\ 0 & \text{otherwise,} \end{cases} \qquad m'' = n''^{-1}(n'm' + nm), \tag{7a}$$
$$\nu'' = \nu' + \nu + r + \delta' + \delta - \delta'' - \phi - 1, \tag{7b}$$
$$V^{*\prime\prime} = \begin{cases} V' + V + n'm'm'^t + nmm^t - n''m''m''^t \equiv V'' & \text{if } V'' \text{ is PDS,} \\ 0 & \text{otherwise.} \end{cases} \tag{7c}$$
² Cornfield and Geisser [3] prove a similar result: if the prior on $(\tilde\mu, \tilde h)$ has a kernel $|h|^{(1/2)\nu-1}$, ν > 0, and we observe a sample which yields a statistic (m, V, n, ν), ν > 0, then the marginal posterior distribution of $\tilde\mu$ is multivariate Student.
Proof: When V' and V are both PDS, the prior density and the sample likelihood combine to give the posterior density in the usual manner. When either V' or V or both are singular, the prior density (6a) or the sample likelihood or both may not exist. Even in such cases, we wish to allow for the possibility that the posterior density may be well defined. For this purpose, we define the posterior density in terms of V' and V rather than V'* and V*. Thus, multiplying the kernel of the prior density by the kernel of the likelihood, we obtain
$$\exp[-\tfrac12 n'(\mu-m')^t h(\mu-m')]\,|h|^{\frac12\delta'}\exp[-\tfrac12\operatorname{tr} hV']\,|h|^{\frac12\nu'-1} \cdot \exp[-\tfrac12 n(m-\mu)^t h(m-\mu)]\,|h|^{\frac12\delta}\exp[-\tfrac12\operatorname{tr} hV]\,|h|^{\frac12(\nu+r-\phi-1)}$$
$$= \exp[-\tfrac12 S]\,|h|^{\frac12\delta''}\exp[-\tfrac12\operatorname{tr} h(V'+V)]\,|h|^{\frac12(\nu'+\nu+r+\delta'+\delta-\delta''-\phi-1)-1} \tag{7d}$$
where
$$S \equiv n'(\mu-m')^t h(\mu-m') + n(m-\mu)^t h(m-\mu).$$
Since h is symmetric, by using the definitions of (7a), we may write S as
$$(\mu-m'')^t(hn'')(\mu-m'') - m''^t(hn'')m'' + m'^t(hn')m' + m^t(hn)m.$$
Now, since
$$m'^t(hn')m' + m^t(hn)m - m''^t(hn'')m'' = \operatorname{tr} h[n'(m'm'^t) + n(mm^t) - n''(m''m''^t)],$$
by defining ν'' as in (7b), V'' and V*'' as in (7c), we may write the kernel (7d) as
$$\exp[-\tfrac12(\mu-m'')^t(hn'')(\mu-m'')]\,|h|^{\frac12\delta''}\exp[-\tfrac12\operatorname{tr} hV^{*\prime\prime}]\,|h|^{\frac12\nu''-1},$$
which is the kernel of a Normal-Wishart density with parameter (m'', V'', n'', ν'').
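The prior-to-posterior map just proved can be sketched in a few lines of plain Python. This is an illustration only, restricted to the nondegenerate case (n' > 0, n > 0, V' and V both PDS), in which the degree-of-freedom update in (7b) is assumed to collapse to ν'' = ν' + ν + r = ν' + n. The helper names `outer`, `combine` and `posterior` and the numerical inputs are hypothetical.

```python
def outer(a, b):
    """Outer product a b^t of two vectors, returned as a list of rows."""
    return [[ai * bj for bj in b] for ai in a]


def combine(terms):
    """Return the sum of coeff * matrix over a list of (coeff, matrix) pairs."""
    r = len(terms[0][1])
    return [[sum(c * M[i][k] for c, M in terms) for k in range(r)]
            for i in range(r)]


def posterior(m1, V1, n1, nu1, m, V, n, nu):
    """Nondegenerate update: prior (m', V', n', nu') and statistic (m, V, n, nu)."""
    r = len(m)
    n2 = n1 + n                                           # n'' = n' + n
    m2 = [(n1 * a + n * b) / n2 for a, b in zip(m1, m)]   # m'' = (n'm' + nm) / n''
    nu2 = nu1 + nu + r                                    # assumed reduction of (7b)
    # V'' = V' + V + n' m'm'^t + n m m^t - n'' m''m''^t, as in (7c)
    V2 = combine([(1, V1), (1, V), (n1, outer(m1, m1)),
                  (n, outer(m, m)), (-n2, outer(m2, m2))])
    return m2, V2, n2, nu2


# Hypothetical prior and sample statistic with r = 2.
prior = ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 1, 1)
sample = ([2.0, 2.0], [[2.0, -2.0], [-2.0, 8.0]], 3, 1)
m2, V2, n2, nu2 = posterior(*prior, *sample)
print(m2, V2, n2, nu2)
```

In this example the three rank-one terms in V'' sum to (n'n/n'')(m − m')(m − m')^t, so V'' can equally be computed as V' + V plus that single correction term.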
We remark here that a prior of the form (6a) lacks flexibility when ν is small because of the manner in which this functional form interrelates the distribution of $\tilde\mu$ and $\tilde h$. The nature of this interrelationship is currently being examined and will be reported in a later paper.³
2. SAMPLING DISTRIBUTIONS WITH FIXED n

We assume here that a sample of size n is to be drawn from an r-dimensional Independent Multinormal process whose parameter $(\tilde\mu, \tilde h)$ is a random variable having a Normal-Wishart distribution with parameter (m', V', n', ν').
2.1 Conditional Joint Distribution of $(\tilde m, \tilde V \mid \mu, h)$
The conditional joint distribution of the statistic $(\tilde m, \tilde V)$ given that the process parameter has value (μ, h) is, provided ν > 0,
$$D(m, V \mid \mu, h, \nu) = f_N^{(r)}(m \mid \mu, hn)\,f_W^{(r)}(V \mid h, \nu) \tag{8}$$
as shown in section 1.
³ Ando, A., and Kaufman, G., "Extended Natural Conjugate Distributions for the Multinormal Process."
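2.2 Siegel's Generalized Beta Function
Siegel [6] established a class of integral identities with matrix argument that generalize the Beta and Gamma functions. We will use these integral identities in the proofs that the unconditional sampling distributions of $\tilde m$, of $\tilde V$ and of $(\tilde m, \tilde V)$ are as shown in sections 2.3, 2.4, and 2.5. (In fact, the integrand in Siegel's identity for the generalized Gamma function is the kernel of a Wishart density.)
Let X be (r × r) and define
$$\Gamma_r(a) \equiv \pi^{r(r-1)/4}\,\Gamma(a)\,\Gamma(a-\tfrac12)\cdots\Gamma(a-\tfrac12 r+\tfrac12), \tag{9a}$$
$$B_r(a, b) \equiv \frac{\Gamma_r(a)\,\Gamma_r(b)}{\Gamma_r(a+b)}, \tag{9b}$$
where a > (r−1)/2, b > (r−1)/2. Siegel established the following integral identities: letting $R_X \equiv \{X \mid X \text{ is PDS}\}$,
$$\int_{R_X} \frac{|X|^{a-\frac12(r+1)}}{|I+X|^{a+b}}\,dX = B_r(a, b). \tag{9c}$$
Defining $Y = (I+X)^{-1}X$, letting V and B be real symmetric matrices and letting V < Y < B denote the set $\{Y \mid B-Y,\ Y-V \text{ are PDS}\}$; then
$$\int_{R_Y} |Y|^{a-\frac12(r+1)}\,|I-Y|^{b-\frac12(r+1)}\,dY = B_r(a, b), \tag{9d}$$
where the domain of integration $R_Y \equiv \{Y \mid 0 < Y < I\}$.
We shall define the standardized generalized Beta density function as
$$f_\beta^{(r)}(Y \mid a, b) \equiv B_r^{-1}(a, b)\,|Y|^{a-\frac12(r+1)}\,|I-Y|^{b-\frac12(r+1)}, \qquad a > \tfrac12(r-1),\ b > \tfrac12(r-1),\ Y \in R_Y. \tag{9e}$$
Similarly the standardized inverted generalized Beta density function is defined as
$$f_{i\beta}^{(r)}(X \mid a, b) \equiv B_r^{-1}(a, b)\,\frac{|X|^{a-\frac12(r+1)}}{|I+X|^{a+b}}, \qquad a > \tfrac12(r-1),\ b > \tfrac12(r-1),\ X \in R_X. \tag{9f}$$
The non-standardized inverted generalized Beta density function with parameter (a, b, C) is
$$f_{i\beta}^{(r)}(V \mid a, b, C) \equiv B_r^{-1}(a, b)\,\frac{|V|^{a-\frac12(r+1)}\,|C|^{b}}{|V+C|^{a+b}}, \qquad a > \tfrac12(r-1),\ b > \tfrac12(r-1),\ C \text{ is PDS},\ V \in R_V \equiv \{V \mid V \text{ is PDS}\}. \tag{9g}$$
The functions (9e) and (9f) are related via the integrand transform $Y = (I+X)^{-1}X$, which has Jacobian $J(Y, X) = |I-Y|^{-(r+1)}$. The function (9g) may be transformed into (9f) by an integrand transform $TXT^t = V$, where T is a nonsingular upper triangular matrix such that $TT^t = C$.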
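The generalized Gamma and Beta functions of (9a) and (9b) can be evaluated numerically with nothing more than the scalar log-gamma function. The sketch below is illustrative and not part of the paper; it assumes the product form of (9a), Γ_r(a) = π^{r(r−1)/4} Γ(a)Γ(a − ½)···Γ(a − (r−1)/2), and the function names `log_gamma_r` and `beta_r` are hypothetical.

```python
import math


def log_gamma_r(a, r):
    """Logarithm of the generalized Gamma function Gamma_r(a) of (9a)."""
    if a <= (r - 1) / 2:
        raise ValueError("Gamma_r(a) requires a > (r - 1) / 2")
    # pi^{r(r-1)/4} times the product of Gamma(a - i/2) for i = 0, ..., r - 1
    return (r * (r - 1) / 4) * math.log(math.pi) + sum(
        math.lgamma(a - i / 2) for i in range(r))


def beta_r(a, b, r):
    """Generalized Beta function B_r(a, b) = Gamma_r(a) Gamma_r(b) / Gamma_r(a + b)."""
    return math.exp(log_gamma_r(a, r) + log_gamma_r(b, r) - log_gamma_r(a + b, r))


# For r = 1 the definitions reduce to the ordinary Gamma and Beta functions.
print(beta_r(2.0, 3.0, 1))
print(math.exp(log_gamma_r(1.5, 2)))
```

Working on the log scale avoids overflow for the large arguments that arise when ν or r is large.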
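2.3 Unconditional Joint Distribution of $(\tilde m, \tilde V)$
The unconditional (with respect to $(\tilde\mu, \tilde h)$) joint distribution of $(\tilde m, \tilde V)$ has density
$$D(m, V \mid m', V', n', \nu'; n, \nu) = \int_{R_h}\int_{R_\mu} f_N^{(r)}(m \mid \mu, hn)\,f_W^{(r)}(V \mid h, \nu)\,f_{NW}^{(r)}(\mu, h \mid m', V', n', \nu')\,d\mu\,dh \tag{10a}$$
where the domain of integration $R_\mu$ of μ is (−∞, +∞) and $R_h$ of h is $\{h \mid h \text{ is PDS}\}$, and
$$D(m, V \mid m', V', n', \nu'; n, \nu) \propto \frac{|V|^{\frac12\nu-1}}{|V+C|^{\frac12(\nu''+r-1)}} \tag{10b}$$
where
$$n_u \equiv \frac{n'n}{n'+n} \qquad\text{and}\qquad C \equiv n_u(m-m')(m-m')^t + V'. \tag{10c}$$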
Proof: Using (4a') and (8) we may write
$$D(m, V \mid m', V', n', \nu'; n, \nu) = \int_{R_h} f_W^{(r)}(V \mid h, \nu)\,f_W^{(r)}(h \mid V', \nu')\Big[\int_{R_\mu} f_N^{(r)}(m \mid \mu, hn)\,f_N^{(r)}(\mu \mid m', hn')\,d\mu\Big]\,dh.$$
The inner integral is $f_N^{(r)}(m \mid m', hn_u)$. Hence the total integral may be written as
$$\int_{R_h} f_N^{(r)}(m \mid m', hn_u)\,f_W^{(r)}(V \mid h, \nu)\,f_W^{(r)}(h \mid V', \nu')\,dh$$
which upon replacing the $f_N$ and $f_W$'s by their respective formulas and dropping constants becomes
$$|V|^{\frac12\nu-1}\int_{R_h} \exp[-\tfrac12 n_u(m-m')^t h(m-m')]\,\exp[-\tfrac12\operatorname{tr} h(V'+V)]\,|h|^{\frac12(\nu'+\nu+r+\delta'+\delta-\delta''-\phi-1)-1}\,dh.$$
Using the definitions of (7), this equals
$$|V|^{\frac12\nu-1}\int_{R_h} \exp[-\tfrac12\operatorname{tr} h\{n_u(m-m')(m-m')^t + V' + V\}]\,|h|^{\frac12\nu''-1}\,dh.$$
Letting
$$B \equiv n_u(m-m')(m-m')^t + V + V',$$
from (4c) we see that the integrand in the above integral is, aside from the multiplicative constant $w(r, \nu'')\,|B|^{\frac12(\nu''+r-1)}$, a Wishart density with parameter (B, ν''). Hence, apart from a normalizing constant depending on neither V nor B,
$$D(m, V \mid m', V', n', \nu'; n, \nu) \propto \frac{|V|^{\frac12\nu-1}}{|B|^{\frac12(\nu''+r-1)}} = \frac{|V|^{\frac12\nu-1}}{|V + V' + n_u(m-m')(m-m')^t|^{\frac12(\nu''+r-1)}},$$
proving (10b).
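The key step in the proof is that integrating out μ from the product of two Normal densities, one with precision hn and one with precision hn', leaves a Normal density with precision h·n'n/(n'+n). A rough numerical check of that claim in the scalar case r = 1 can be sketched as follows; the function names, the trapezoid rule, the integration limits and the particular numbers are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, precision):
    """Univariate Normal density parameterized by precision (inverse variance)."""
    return math.sqrt(precision / (2 * math.pi)) * math.exp(
        -0.5 * precision * (x - mean) ** 2)


def inner_integral(m, m_prior, h, n, n_prior, lo=-20.0, hi=20.0, steps=40_000):
    """Trapezoid approximation of the integral over mu of
    f_N(m | mu, h n) * f_N(mu | m', h n')."""
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        mu = lo + k * width
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * normal_pdf(m, mu, h * n) * normal_pdf(mu, m_prior, h * n_prior)
    return total * width


h, n, n_prior = 2.0, 3, 1
n_u = n_prior * n / (n_prior + n)          # harmonic-type combination of n' and n
numeric = inner_integral(1.0, 0.0, h, n, n_prior)
closed_form = normal_pdf(1.0, 0.0, h * n_u)
print(numeric, closed_form)
```

The two printed values should agree to many decimal places, reflecting that the variances 1/(hn) and 1/(hn') add to 1/(h n_u).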
2.4 Unconditional Distribution of $\tilde m$
The kernel of the unconditional (with respect to $(\tilde\mu, \tilde h)$) distribution of $\tilde m$ can be found three ways: by utilizing the fact that $\tilde m - \tilde\mu$ and $\tilde\mu$ are conditionally independent given $\tilde h = h$ and then finding the unconditional distribution of $\tilde m$ as regards $\tilde h$; by integrating $D(m, V \mid m', V', n', \nu'; n, \nu)$ over the range of $\tilde V$ when this distribution exists, i.e., over $R_V \equiv \{V \mid V \text{ is PDS}\}$; or by integrating the kernel of the marginal likelihood of $\tilde m$ defined in (4b) multiplied by the kernel of the prior density of $(\tilde\mu, \tilde h)$ over $R_\mu$ and $R_h$. The first and third methods have the merit of in no way depending on whether or not V is singular; that is, even if ν < 0 (n < r), the proofs go through, which of course is not the case if we proceed according to the second method. We show first by the second method that when ν > 0, ν' > 0, and V*' = V' is PDS,
$$D(m \mid m', V', n', \nu'; n, \nu) = \int_{R_V} D(m, V \mid m', V', n', \nu'; n, \nu)\,dV \propto [1 + n_u(m-m')^t V'^{-1}(m-m')]^{-\frac12(\nu'+r)}. \tag{11a}$$
We then show by the first method that this result holds even when ν < 0. If we define $H_u \equiv \nu' n_u V'^{-1}$, then (11a) may be rewritten as
$$[\nu' + (m-m')^t H_u (m-m')]^{-\frac12(\nu'+r)}. \tag{11b}$$
Provided ν' > 0, $n_u$ > 0 and $H_u$ is PDS this is the kernel of the nondegenerate Student density function with parameter (m', $H_u$, ν') as defined in formula (8-26) of [5].
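Proof: In (10b) let $a = (\tfrac12\nu - 1) + \tfrac12(r+1)$ and $b = \tfrac12(\nu''+r-1) - a = \tfrac12(\nu'+r)$. Then the kernel of the unconditional distribution of $\tilde m$ is proportional to
$$\int_{R_V} \frac{|V|^{a-\frac12(r+1)}}{|V+C|^{a+b}}\,dV \tag{12}$$
where $R_V = \{V \mid V \text{ is PDS}\}$. Then by (9g),
$$D(m \mid m', V', n'; n) \propto |n_u(m-m')(m-m')^t + V'|^{-b} \propto \big(1 + (m-m')^t n_u V'^{-1}(m-m')\big)^{-b},$$
establishing (11a).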
Alternate Proof: Another way of establishing (11) is to use the fact that the kernel of the marginal likelihood of $\tilde m$ given the parameter (μ, h) is, by (5) and (4b'), whether or not ν < 0, for n > 0 observations
$$\exp[-\tfrac12 n(m-\mu)^t h(m-\mu)]\,|h|^{\frac12}. \tag{4b}$$
Furthermore, conditional on $\tilde h = h$, $\tilde\mu$ and $\tilde\epsilon \equiv \tilde m - \tilde\mu$ are independent Normal random vectors; and so $\tilde m = \tilde\mu + \tilde\epsilon$ is Normal with mean vector
$$E(\tilde m \mid h) = E(\tilde\mu) + E(\tilde\epsilon) = m' + 0 = m'$$
and variance-covariance matrix
$$V(\tilde m \mid h) = V(\tilde\mu) + V(\tilde\epsilon) = (hn_u)^{-1}.$$
Thus integrating with respect to h,
$$D(m \mid m', V', n', \nu'; n, \nu) \propto \int_{R_h} \exp[-\tfrac12\operatorname{tr} h\{n_u(m-m')(m-m')^t + V'\}]\,|h|^{\frac12(\nu'+1)-1}\,dh.$$
As the integrand in the above integral is the kernel of a Wishart density with parameter $(n_u(m-m')(m-m')^t + V',\ \nu'+1)$,
$$D(m \mid m', V', n', \nu'; n, \nu) \propto |n_u(m-m')(m-m')^t + V'|^{-\frac12(\nu'+r)}$$
and (11) follows directly.

2.5 Unconditional Distribution of $\tilde V$ When ν > 0
The kernel of the unconditional distribution of $\tilde V$ can be found by integrating $D(m, V \mid m', V', n', \nu'; n, \nu)$ over the range of $\tilde m$, or by integrating the product of the kernel of the marginal likelihood (4c) of $\tilde V$ and the kernel of the distribution (6a) of $(\tilde\mu, \tilde h)$ with parameter (m', V', n', ν') over the range of $(\tilde\mu, \tilde h)$. We show by the former method that for ν > 0 and V PDS,
$$D(V \mid m', V', n'; n) \propto \frac{|V|^{a-\frac12(r+1)}}{|V'+V|^{a+b}} \tag{16a}$$
when $a > \tfrac12(r-1)$ and $b > \tfrac12(r-1)$, where $a = \tfrac12(\nu+r-1)$ and $b = \tfrac12(\nu'+r-1)$.
Letting K be a non-singular upper triangular matrix such that $KK^t = V'$ we may make the integrand transform $KZK^t = V$ and write (16a) as
$$D(Z \mid m', V', n'; n) \propto \frac{|Z|^{a-\frac12(r+1)}}{|I+Z|^{a+b}}. \tag{16b}$$
Formula (16b) is the kernel of a standardized inverted generalized Beta function with parameter (a, b).
Proof: From (10b)
$$D(m, V \mid m', V', n'; n, \nu) \propto |V|^{\frac12\nu-1}\,|V' + V + n_u(m-m')(m-m')^t|^{-\frac12(\nu''+r-1)}.$$
Conditional on $\tilde V = V$, the second determinant is the kernel of a Student density of $\tilde m$ with parameter $(m',\ n_u(\nu''-1)[V'+V]^{-1},\ \nu''-1)$. That part of the constant which normalizes this Student density and which involves V is $|V'+V|^{-\frac12(\nu''+r-2)}$; hence (16a) follows.
Now since V' is PDS there is a non-singular triangular matrix K of order r such that $KK^t = V'$. If we make the integrand transform $KZK^t = V$, and let J(Z, V) denote the Jacobian of the transform, the transformed kernel may then be written as
$$\frac{|KZK^t|^{a-\frac12(r+1)}}{|KK^t + KZK^t|^{a+b}}\,J(Z, V) = |KK^t|^{-b-\frac12(r+1)}\,J(Z, V)\,\frac{|Z|^{a-\frac12(r+1)}}{|I+Z|^{a+b}}.$$
Since $J(Z, V) = |K|^{r+1} = |KK^t|^{\frac12(r+1)} = |V'|^{\frac12(r+1)}$, we may write the transformed kernel as shown in (16b).
3. PREPOSTERIOR ANALYSIS WITH FIXED n > 0

We assume that a sample of fixed size n > 0 is to be drawn from an r-dimensional Multinormal process whose mean vector $\tilde\mu$ and matrix precision $\tilde h$ are not known with certainty but are regarded as random variables $(\tilde\mu, \tilde h)$ having a prior Normal-Wishart distribution with parameter (m', V', n', ν') where n' > 0, ν' > 0, but V' may or may not be singular.
3.1 Joint Distribution of $(\tilde m'', \tilde V'')$
The joint density of $(\tilde m'', \tilde V'')$ is, provided n' > 0, ν' > 0, and ν > 0,
$$D(m'', V'' \mid m', V', n', \nu'; n, \nu) \propto \frac{|V'' - V'^{*} - n^*(m''-m')(m''-m')^t|^{\frac12\nu-1}}{|V''|^{\frac12(\nu''+r-1)}} \tag{17a}$$
where
$$n^* = n'n''/n, \tag{17b}$$
and the range of (m'', V'') is
$$R_{(m'', V'')} \equiv \{(m'', V'') \mid -\infty < m'' < +\infty \text{ and } V'' - C \text{ is PDS}\}. \tag{17c}$$
Proof: Following the argument of subsections 12.1.4 and 12.6.1 of [5] we can establish that
$$n_u(m-m')^t h(m-m') = n^*(m''-m')^t h(m''-m') \tag{18a}$$
where n* = n'n''/n. The same line of argument allows us to write
$$V'' = V' + V + n'(m'm'^t) + n(mm^t) - n''(m''m''^t) = V' + V + n^*(m''-m')(m''-m')^t. \tag{18b}$$
From (7) we have
$$(m, V) = \Big(\frac{1}{n}(n''m'' - n'm'),\ \ V'' - V' - n^*(m''-m')(m''-m')^t\Big).$$
Letting J(m'', V''; m, V) denote the Jacobian of the integrand transformation from (m, V) to (m'', V''), we make this transformation in (10b), obtaining (17a). Since J(m'', V''; m, V) = J(m'', m) J(V'', V) and both J(m'', m) and J(V'', V) are constants involving neither m'' nor V'', neither does J(m'', V''; m, V). When ν < 0 and V is singular the numerator in (17a) vanishes, so that the density exists only if ν > 0. However, if ν > 0, the kernel in (17a) exists even if V' is singular (V'* = 0). Hence we write the kernel as shown in (17a).
That the range of (m'', V'') is $R_{(m'', V'')}$ as defined in (17c) follows directly from the definitions of m'' and V'', of $R_m$ and $R_V$, and of m' and V'.
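The identity (18a) is cited from [5] rather than derived. A short worked check, added here as an illustration and assuming the definitions n'' = n' + n, m'' = (n'm' + nm)/n'', n_u = n'n/(n' + n) and n* = n'n''/n used in this paper, runs as follows:

$$m'' - m' = \frac{n'm' + nm}{n''} - m' = \frac{n}{n''}\,(m - m'), \qquad\text{so}\qquad n^*(m''-m')^t h (m''-m') = \frac{n'n''}{n}\cdot\frac{n^2}{n''^2}\,(m-m')^t h (m-m') = \frac{n'n}{n''}\,(m-m')^t h (m-m').$$

Since n'n/n'' = n_u, the right-hand side is n_u(m − m')^t h(m − m'), which is the left-hand side of (18a). The same substitution applied to the outer product (m'' − m')(m'' − m')^t gives the second equality in (18b).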
3.2 Some Distributions of $\tilde m''$
The unconditional distribution of $\tilde m''$ is easily derived from the unconditional distribution (11b) of $\tilde m$: provided n > 0, n' > 0, ν' > 0, and V' is PDS,
$$D(m'' \mid m', V', n', \nu'; n, \nu) = f_S^{(r)}\big(m'' \mid m',\ (n''/n)^2 H_u,\ \nu'\big) \tag{19a}$$
where
$$H_u \equiv \nu' n_u V'^{-1}, \tag{19b}$$
and the conditional distribution of $\tilde m''$ given $\tilde V'' = V''$ is, provided ν > 0, n' > 0, ν' > 0,
$$D(m'' \mid m', V', n', \nu'; n, \nu, V'') = f_{iS}^{(r)}(m'' \mid m', H_*, \nu) \tag{20a}$$
where
$$H_* \equiv \nu n^*(V'' - V')^{-1}. \tag{20b}$$
The right-hand side of (20a) is the inverted Student density function with parameter (m', $H_*$, ν) as defined in formula (8-34) of [5].
Proof: Since $m'' = (n'')^{-1}(n'm' + nm)$ and since from (11b), when n > 0, n' > 0, ν' > 0, V' is PDS,
$$\tilde m \sim f_S^{(r)}(m \mid m', H_u, \nu'),$$
by Theorem 1 of subsection 8.3.2 of [1],
$$\tilde m'' \sim f_S^{(r)}\big(m'' \mid m',\ (n''/n)^2 H_u,\ \nu'\big).$$
To prove (20) observe that the kernel of the conditional distribution of $\tilde m''$ given $\tilde V'' = V''$ is proportional to (17a), and so
$$D(m'' \mid m', V', n', \nu'; n, \nu, V'') \propto |V'' - V' - n^*(m''-m')(m''-m')^t|^{\frac12\nu-1}.$$
Since
$$V'' - V' = V + n^*(m''-m')(m''-m')^t,$$
(V'' − V') will be PDS as long as ν > 0, and so $(V''-V')^{-1}$ is also PDS. Using a well known determinantal identity and letting $H_*$ be as defined in (20b), when V'' − V' is PDS we may write the density of $\tilde m''$ as
$$[1 - n^*(m''-m')^t(V''-V')^{-1}(m''-m')]^{\frac12\nu-1} = \nu^{-\frac12\nu+1}\,[\nu - (m''-m')^t H_*(m''-m')]^{\frac12\nu-1},$$
which, aside from the constant $\nu^{-\frac12\nu+1}$, is the kernel of an inverted Student density function with parameter (m', $H_*$, ν).
3.3 Analysis When n < r
Even when n < r, it is possible to do Bayesian inference on $(\tilde\mu, \tilde h)$ by appropriately structuring the prior so that the posterior of $(\tilde\mu, \tilde h)$ is nondegenerate. For example, if the data generating process is Multinormal, if we assign a prior on $(\tilde\mu, \tilde h)$ with parameter (0, 0, 0, 0), and then observe a sample $x^{(1)}, \ldots, x^{(n)}$ where n < r, then ν < 0 and the posterior parameters defined in (7) assume values
$$n'' = n, \qquad m'' = m, \qquad \nu'' = \nu + r - \phi - 1 = 0, \qquad V^{*\prime\prime} = 0.$$
The posterior distribution of $(\tilde\mu, \tilde h)$ is degenerate under these circumstances.
If, however, we insist on assigning a very diffuse prior on $(\tilde\mu, \tilde h)$, but are willing to introduce just enough prior information to make V'' non-singular and ν'' > 0 then the posterior distribution is non-degenerate; e.g., assign ν' = 1, V' = MI, M > 0, and leave n' = 0 and m' = 0, so that ν'' = 1, V'' = V + V' = V + MI. In this case we have a bona fide non-degenerate Normal-Wishart posterior distribution of $(\tilde\mu, \tilde h)$.
Furthermore, the unconditional distribution of the next sample observation $\tilde x^{(n+1)}$ exists and is, by (11b), multivariate Student with parameter
$$\Big(m,\ \frac{n}{n+1}(V + MI)^{-1},\ 1\Big).$$
In addition, for this example the distribution, unconditional as regards $(\tilde\mu, \tilde h)$, of the mean $\tilde{\bar x}$ of the next $n_0$ observations exists even though n < r and is by (11b) multivariate Student with parameter $(m,\ n_u(V + MI)^{-1},\ 1)$, where $n_u = n_0 n/(n_0 + n)$. This distribution is, in effect, a probabilistic forecast of $\tilde{\bar x}$. From (19a) it also follows that the distribution, unconditional as regards $(\tilde\mu, \tilde h)$, of the posterior mean $\tilde m''$ prior to observing $x^{(n+1)}, \ldots, x^{(n+n_0)}$ is multivariate Student with parameter
$$\Big(m,\ n\Big(1 + \frac{n}{n_0}\Big)(V + MI)^{-1},\ 1\Big).$$
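The n < r example can be traced numerically. The sketch below is an illustration under stated assumptions, not the authors' procedure: it assumes the degree-of-freedom rule ν'' = ν' + ν + r + δ' + δ − δ'' − φ − 1 with δ = 1 if n > 0 and 0 otherwise, and with φ = n − 1 when ν ≤ 0 and 0 otherwise. The function names, the Cholesky-based definiteness test and the data are hypothetical.

```python
import math


def delta(n):
    """Indicator that a mean statistic is available (assumed form of (5c))."""
    return 1 if n > 0 else 0


def phi(n, r):
    """Assumed form of (5a): n - 1 when nu = n - r <= 0, else 0."""
    return 0 if n - r > 0 else n - 1


def posterior_nu(nu_prior, n_prior, n, r):
    """Posterior degrees-of-freedom parameter under the assumed rule (7b)."""
    nu = n - r
    return (nu_prior + nu + r + delta(n_prior) + delta(n)
            - delta(n_prior + n) - phi(n, r) - 1)


def is_pds(A, tol=1e-12):
    """Return True if the symmetric matrix A admits a Cholesky factorization."""
    r = len(A)
    L = [[0.0] * r for _ in range(r)]
    for i in range(r):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= tol:
                    return False
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return True


# n = 2 observations in r = 3 dimensions, so nu = -1 and V is singular.
r, n, M = 3, 2, 0.5
V = [[2.0, 2.0, -2.0], [2.0, 2.0, -2.0], [-2.0, -2.0, 2.0]]
V_post = [[V[i][k] + (M if i == k else 0.0) for k in range(r)] for i in range(r)]

print(posterior_nu(0, 0, n, r), is_pds(V))       # prior (0, 0, 0, 0): degenerate
print(posterior_nu(1, 0, n, r), is_pds(V_post))  # prior nu' = 1, V' = M I
print(n / (n + 1))  # scalar multiplying (V + M I)^{-1} in the one-step forecast
```

With the fully diffuse prior the posterior has ν'' = 0 and a singular V, while the choice ν' = 1, V' = MI gives ν'' = 1 and a positive definite V'' = V + MI, as described in the text.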
REFERENCES

[1] Anderson, T. W., Introduction to Multivariate Statistical Analysis. New York: John Wiley and Sons, Inc., 1958.
[2] Deemer, W. L., and Olkin, I., "The Jacobians of certain matrix transformations useful in multivariate analysis," Biometrika, 40 (1953), 43-6.
[3] Geisser, Samuel, and Cornfield, Jerome, "Posterior distributions for multivariate normal parameters," Journal of the Royal Statistical Society, Series B, 25 (1963), 368-76.
[4] Jeffreys, H., Theory of Probability. London, Oxford University Press, Amen House, 1961.
[5] Raiffa, H., and Schlaifer, R., Applied Statistical Decision Theory. Boston, Massachusetts; Division of Research, Harvard Business School, 1961.
[6] Siegel, C. L., "Über die analytische Theorie der quadratischen Formen," Annals of Mathematics, 36 (1935), 527-606.
[7] Tiao, George C., and Zellner, Arnold, "On the Bayesian estimation of multivariate regression," Journal of the Royal Statistical Society, Series B, 26 (1964), 277-85.

