
Bayesian Analysis of the Independent Multinormal Process. Neither Mean Nor Precision Known
Author(s): Albert Ando and G. M. Kaufman
Source: Journal of the American Statistical Association, Vol. 60, No. 309 (Mar., 1965), pp. 347-358
Published by: American Statistical Association
Stable URL: http://www.jstor.org/stable/2283159 .

This content downloaded from 130.113.126.253 on Mon, 7 Apr 2014 18:35:43 PM


All use subject to JSTOR Terms and Conditions

BAYESIAN ANALYSIS OF THE INDEPENDENT MULTINORMAL PROCESS - NEITHER MEAN NOR PRECISION KNOWN*

ALBERT ANDO
University of Pennsylvania

AND

G. M. KAUFMAN
Massachusetts Institute of Technology
Under the assumption that neither the mean vector nor the variance-covariance matrix is known with certainty, the natural conjugate family of prior densities for the multivariate Normal process is identified. Prior-posterior and preposterior analysis is done assuming that the prior is in the natural conjugate family. A procedure is presented for obtaining non-degenerate joint posterior and preposterior distributions of all parameters even when the number of objective sample observations is less than the number of parameters of the process.

1. INTRODUCTION

In this paper we develop the distribution theory necessary to carry out Bayesian analysis of the multivariate Normal process as defined in Section 1.1 below when neither the mean vector nor the variance-covariance matrix of the process is known with certainty. The development here generalizes Raiffa and Schlaifer's treatment of the multivariate Normal process in Part B, Chapter 12 of reference [5], in which it is assumed that the variance-covariance matrix is known up to a particular multiplicative constant. We drop this assumption here.

In Section 1 we define the process, identify a class of natural conjugate distributions, and do prior-posterior analysis. The conditional and unconditional sampling distributions of some (sufficient) statistics are presented in Section 2. In particular, we prove that the distribution of the sample mean vector, marginal with respect to the sample variance-covariance matrix, to the process mean vector, and to the process variance-covariance matrix, is multivariate Student whenever the prior is in the natural conjugate family of distributions. We then use the results of Sections 1 and 2 to do preposterior analysis in Section 3.

We also show in Section 3 that Bayesian joint inference (finding joint posterior and preposterior densities of the mean vector and the variance-covariance matrix) is possible even when classical joint inference is not, i.e., when the number of objective sample observations is less than the number of distinct elements of the mean vector and variance-covariance matrix of the process.

Geisser and Cornfield [3] and Tiao and Zellner [7] analyze the multivariate Normal process and the multivariate Normal Regression process respectively under identical assumptions about the state of knowledge of the parameters of the process. Their presentations differ from that given here in three respects:
* The contribution by Ando to this paper is partially supported by a grant from the National Science Foundation. The authors wish to thank the referees for helpful suggestions and Jim Martin for proofreading early drafts, and to gratefully acknowledge valuable comments and criticisms given to them by members of the Decisions Under Uncertainty Seminar, conducted by Professor Howard Raiffa at the Harvard Business School.


first, following the lead of Jeffreys [4], both sets of authors assume that the joint prior on the mean vector and variance-covariance matrix is a special (degenerate) case of a natural conjugate density; second, here we find sampling distributions unconditional as regards the parameters of the process and do preposterior analysis; third, by doing the analysis for the complete natural conjugate family, we are able to provide a procedure for deriving joint posterior and some joint preposterior distributions of all parameters under conditions mentioned in the paragraph immediately above.
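1.1 Definition of the Process

As in Anderson [1], we define an r-dimensional Independent Multinormal process as one that generates independent $r \times 1$ random vectors $\tilde{x}^{(1)}, \ldots, \tilde{x}^{(j)}, \ldots$ with identical densities

$$f_N^{(r)}(x \mid \mu, h) \equiv (2\pi)^{-\frac{1}{2}r} \exp\left[-\tfrac{1}{2}(x-\mu)^t h (x-\mu)\right] |h|^{\frac{1}{2}}, \qquad -\infty < x < \infty, \quad -\infty < \mu < \infty, \quad h \text{ is PDS}, \tag{1}$$

where PDS stands for positive definite and symmetric.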
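1.2 Likelihood of a Sample

The likelihood that the process will generate n successive values $x^{(1)}, \ldots, x^{(j)}, \ldots, x^{(n)}$ is

$$(2\pi)^{-\frac{1}{2}rn} \exp\left[-\tfrac{1}{2}\sum_{j=1}^{n} (x^{(j)}-\mu)^t h (x^{(j)}-\mu)\right] |h|^{\frac{1}{2}n}. \tag{2}$$

If the stopping process is non-informative, as defined in reference [1], this is the likelihood of a sample consisting of n observations $x^{(1)}, \ldots, x^{(j)}, \ldots, x^{(n)}$.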
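When neither h nor $\mu$ is known, we may compute these statistics:

$$m \equiv \frac{1}{n}\sum_{j=1}^{n} x^{(j)}, \qquad \nu \equiv n - r \ \text{(redundant)}, \tag{3a}$$

and

$$V \equiv \sum_{j=1}^{n} (x^{(j)}-m)(x^{(j)}-m)^t. \tag{3b}$$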
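As an illustration of (3a) and (3b), the following is a minimal sketch that computes the statistics from a list of observations. The function name `sufficient_statistics` and the sample values are hypothetical and chosen only for this example; vectors and matrices are plain Python lists.

```python
def sufficient_statistics(sample):
    """Return (m, V, n, nu) as in (3a)-(3b) for a list of r-dimensional observations."""
    n = len(sample)
    r = len(sample[0])
    # Sample mean vector m, equation (3a).
    m = [sum(x[i] for x in sample) / n for i in range(r)]
    # Sum of outer products of deviations from m, equation (3b).
    V = [[sum((x[i] - m[i]) * (x[j] - m[j]) for x in sample)
          for j in range(r)] for i in range(r)]
    nu = n - r  # redundant statistic, equation (3a)
    return m, V, n, nu


# Hypothetical sample: n = 3 observations of dimension r = 2.
sample = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
m, V, n, nu = sufficient_statistics(sample)
```

Note that V is a sum of deviation products and is not divided by n or by n - 1.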

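It is well known¹ that the kernel of the joint likelihood of (m, V) is, provided $\nu > 0$,

$$\exp\left[-\tfrac{1}{2}n(m-\mu)^t h (m-\mu)\right] |h|^{\frac{1}{2}} \exp\left[-\tfrac{1}{2}\operatorname{tr} hV\right] |h|^{\frac{1}{2}(\nu+r-1)}, \tag{4a}$$

the kernel of the marginal likelihood of m is, provided $n > 0$,

$$\exp\left[-\tfrac{1}{2}n(m-\mu)^t h (m-\mu)\right] |h|^{\frac{1}{2}}, \tag{4b}$$

and the kernel of the marginal likelihood of V is, provided $\nu > 0$ and V is PDS,

$$\exp\left[-\tfrac{1}{2}\operatorname{tr} hV\right] |h|^{\frac{1}{2}(\nu+r-1)}. \tag{4c}$$

Formula (4c) is the kernel of a Wishart distribution.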


A random matrix $\tilde{V}$ of dimension $(r \times r)$ will be called "Wishart distributed with parameter $(h, \nu)$" if

$$\tilde{V} \sim f_W^{(r)}(V \mid h, \nu) \equiv \begin{cases} w(r,\nu) \exp\left[-\tfrac{1}{2}\operatorname{tr} hV\right] |V|^{\frac{1}{2}\nu-1} |h|^{\frac{1}{2}(\nu+r-1)} & \text{if } V \text{ is PDS and } \nu > 0, \\ 0 & \text{otherwise,} \end{cases} \tag{4d}$$

where

$$[w(r,\nu)]^{-1} \equiv 2^{\frac{1}{2}(\nu+r-1)r}\, \pi^{r(r-1)/4} \prod_{i=1}^{r} \Gamma\left(\tfrac{1}{2}(\nu+r-i)\right). \tag{4e}$$

That $(m, V, \nu)$ defines a set of sufficient statistics for $(\mu, h)$ is shown in section 3.3.3 of [1].

¹ See, for example, Anderson [1], Theorem 3.3.2 and pp. 154-60.
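We will wish to express (4a) in such a fashion that it automatically reduces to (4b) when only (m, n) is available and to (4c) when only $(V, \nu)$ is available. In addition, we will wish to treat the cases that arise when V is singular. Hence we define

$$\Delta \equiv \begin{cases} n - 1 & \text{if } \nu \le 0, \\ 0 & \text{if } \nu > 0, \end{cases} \tag{5a}$$

$$V^* \equiv \begin{cases} V & \text{if } V \text{ is non-singular,} \\ 0 & \text{otherwise,} \end{cases} \tag{5b}$$

and

$$\delta \equiv \begin{cases} 1 & \text{if } n > 0, \\ 0 & \text{if } n = 0. \end{cases} \tag{5c}$$

In terms of (5a), (5b), and (5c) we may rewrite (4a), (4b), and (4c) as

$$\exp\left[-\tfrac{1}{2}n(m-\mu)^t h (m-\mu)\right] |h|^{\frac{1}{2}\delta} \exp\left[-\tfrac{1}{2}\operatorname{tr} hV^*\right] |h|^{\frac{1}{2}(\nu+r-\Delta-1)}, \tag{4a'}$$

$$\exp\left[-\tfrac{1}{2}n(m-\mu)^t h (m-\mu)\right] |h|^{\frac{1}{2}\delta}, \tag{4b'}$$

$$\exp\left[-\tfrac{1}{2}\operatorname{tr} hV^*\right] |h|^{\frac{1}{2}(\nu+r-\Delta-1)}. \tag{4c'}$$

Notice that (4a') is now defined even when $\nu < 0$. By adopting the convention that
(1) $V^* = 0$ and $\Delta = n - 1$ when V is unknown or irrelevant, and
(2) $n = 0$ when m is unknown or irrelevant,
the kernel (4a') reduces to (4b') in the first case and to (4c') in the second.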
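1.3 Conjugate Distributions of $(\tilde{\mu}, \tilde{h})$, $\tilde{\mu}$, and $\tilde{h}$

When both $\tilde{\mu}$ and $\tilde{h}$ are random variables, the natural conjugate of (4a') is the Normal-Wishart distribution $f_{NW}^{(r)}(\mu, h \mid m, V, n, \nu)$ defined as equal to

$$\begin{cases} k(r,\nu) \exp\left[-\tfrac{1}{2}n(\mu-m)^t h (\mu-m)\right] |h|^{\frac{1}{2}\delta} \exp\left[-\tfrac{1}{2}\operatorname{tr} hV^*\right] |V^*|^{\frac{1}{2}(\nu+r-1)} |h|^{\frac{1}{2}\nu-1} \equiv f_N^{(r)}(\mu \mid m, hn)\, f_W^{(r)}(h \mid V, \nu) & \text{if } n > 0 \text{ and } \nu > 0, \\ 0 & \text{otherwise,} \end{cases} \tag{6a}$$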


where $V^*$ and $\delta$ are defined as in (5), and

$$k(r,\nu) \equiv (2\pi)^{-\frac{1}{2}r}\, n^{\frac{1}{2}r}\, w(r,\nu). \tag{6b}$$

If (6a) is to be a proper density function, V must be PDS, $\nu > 0$ and $n > 0$, so that in this case $\delta = 1$ and $V^* = V$ is PDS. We write the first expression in (6a) with $\delta$ and $V^*$ so that formulas for posterior densities will generalize automatically to the case where prior information is such that one or more of these conditions hold: V is singular, $n = 0$, $\nu = 0$.

We obtain the marginal prior of $\tilde{\mu}$ by integrating (6a) with respect to h; if $\nu > 0$, $n > 0$, V is PDS, and we define $H_\nu \equiv \nu n V^{-1}$, then

$$D(\mu \mid m, V, n, \nu) = f_S^{(r)}(\mu \mid m, H_\nu, \nu) \propto \left[\nu + (\mu-m)^t H_\nu (\mu-m)\right]^{-\frac{1}{2}(\nu+r)}. \tag{6c}$$

This distribution of $\tilde{\mu}$ is the non-degenerate multivariate Student distribution defined in formula (8-26) of [5].²
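Proof: We integrate over the region $R_h \equiv \{h \mid h \text{ is PDS}\}$:

$$D(\mu \mid m, V, n, \nu) = \int_{R_h} f_N^{(r)}(\mu \mid m, hn)\, f_W^{(r)}(h \mid V, \nu)\, dh \propto \int_{R_h} \exp\left[-\tfrac{1}{2}\operatorname{tr} h\{n(\mu-m)(\mu-m)^t + V\}\right] |h|^{\frac{1}{2}(\nu+\delta)-1}\, dh.$$

But as the integrand in the integral immediately above is the kernel of a Wishart density with parameter $(n(\mu-m)(\mu-m)^t + V,\ \nu+\delta)$,

$$D(\mu \mid m, V, n, \nu) \propto \left[1 + (\mu-m)^t (nV^{-1}) (\mu-m)\right]^{-\frac{1}{2}(\nu+\delta+r-1)}.$$

Provided that $\nu > 0$, V is PDS and $n > 0$, $H_\nu \equiv \nu n V^{-1}$ is PDS, $\delta = 1$, and we have (6c).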
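The step from the Wishart integral to the bracketed form can be spelled out as follows. This is a sketch of the two standard facts involved, not a quotation from the paper; the symbols $B$ and $\nu^\circ$ are shorthand introduced here for illustration only.

```latex
% Shorthand (not in the original): B and \nu^\circ
\begin{align*}
B &\equiv V + n(\mu-m)(\mu-m)^t, \qquad \nu^\circ \equiv \nu + \delta. \\
% (i) Normalization of the Wishart density (4d), read as a density in h:
\int_{R_h} \exp\!\left[-\tfrac12 \operatorname{tr} hB\right] |h|^{\frac12\nu^\circ - 1}\, dh
  &= \frac{1}{w(r,\nu^\circ)}\, |B|^{-\frac12(\nu^\circ + r - 1)}. \\
% (ii) Determinant of a rank-one update of a PDS matrix V:
|B| &= |V| \left[ 1 + n(\mu-m)^t V^{-1} (\mu-m) \right]. \\
% Combining (i) and (ii) and dropping factors free of \mu:
D(\mu \mid m, V, n, \nu)
  &\propto \left[ 1 + (\mu-m)^t (nV^{-1}) (\mu-m) \right]^{-\frac12(\nu+\delta+r-1)}.
\end{align*}
```

With $\delta = 1$ the exponent becomes $-\frac{1}{2}(\nu+r)$, and multiplying the bracket by $\nu$ gives the form shown in (6c).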
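If $\nu > 0$ but V is singular, $V^{*-1}$ does not exist, so neither does the marginal distribution of $\tilde{\mu}$. And if $n = 0$ the marginal distribution of $\tilde{\mu}$ does not exist.

Similarly, we obtain the marginal prior on $\tilde{h}$ by integrating (6a) with respect to $\mu$. If $\nu > 0$, $n \ge 0$ and V is PDS, then

$$D(h \mid m, V, n, \nu) = f_W^{(r)}(h \mid V, \nu) \propto \exp\left[-\tfrac{1}{2}\operatorname{tr} hV\right] |h|^{\frac{1}{2}\nu-1}. \tag{6d}$$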

If a Normal-Wishart distribution with parameter $(m', V', n', \nu')$ is assigned to $(\tilde{\mu}, \tilde{h})$ and if a sample then yields a statistic $(m, V, n, \nu)$, the posterior distribution of $(\tilde{\mu}, \tilde{h})$ will be Normal-Wishart with parameter $(m'', V^{*\prime\prime}, n'', \nu'')$ where

$$n'' = n' + n, \qquad \delta'' = \begin{cases} 1 & \text{if } n'' > 0, \\ 0 & \text{otherwise,} \end{cases} \qquad m'' = n''^{-1}(n'm' + nm), \tag{7a}$$

$$\nu'' = \nu' + \nu + r + \delta' + \delta - \delta'' - \Delta - 1, \tag{7b}$$

$$V^{*\prime\prime} = \begin{cases} V' + V + n'm'm'^t + nmm^t - n''m''m''^t \equiv V'' & \text{if } V'' \text{ is PDS,} \\ 0 & \text{otherwise.} \end{cases} \tag{7c}$$

² Cornfield and Geisser [3] prove a similar result: if the prior on $(\tilde{\mu}, \tilde{h})$ has a kernel $|h|^{\frac{1}{2}\nu'-1}$, $\nu' > 0$, and we observe a sample which yields a statistic $(m, V, n, \nu)$, $\nu > 0$, then the marginal posterior distribution of $\tilde{\mu}$ is multivariate Student.

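Proof: When V' and V are both PDS, the prior density and the sample likelihood combine to give the posterior density in the usual manner. When either V' or V or both are singular, the prior density (6a) or the sample likelihood or both may not exist. Even in such cases, we wish to allow for the possibility that the posterior density may be well defined. For this purpose, we define the posterior density in terms of V' and V rather than $V'^*$ and $V^*$. Thus, multiplying the kernel of the prior density by the kernel of the likelihood, we obtain

$$\begin{aligned} &\exp\left[-\tfrac{1}{2}n'(\mu-m')^t h (\mu-m')\right] |h|^{\frac{1}{2}\delta'} \exp\left[-\tfrac{1}{2}\operatorname{tr} hV'\right] |h|^{\frac{1}{2}\nu'-1} \\ &\quad\cdot \exp\left[-\tfrac{1}{2}n(m-\mu)^t h (m-\mu)\right] |h|^{\frac{1}{2}\delta} \exp\left[-\tfrac{1}{2}\operatorname{tr} hV\right] |h|^{\frac{1}{2}(\nu+r-\Delta-1)} \\ &= \exp\left[-\tfrac{1}{2}S\right] |h|^{\frac{1}{2}\delta''} \exp\left[-\tfrac{1}{2}\operatorname{tr} h(V'+V)\right] |h|^{\frac{1}{2}(\nu'+\nu+r+\delta'+\delta-\delta''-\Delta-1)-1}, \end{aligned} \tag{7d}$$

where

$$S \equiv n'(\mu-m')^t h (\mu-m') + n(m-\mu)^t h (m-\mu).$$

Since h is symmetric, by using the definitions of (7a), we may write S as

$$(\mu-m'')^t (hn'') (\mu-m'') - m''^t (hn'') m'' + m'^t (hn') m' + m^t (hn) m.$$

Now, since

$$m'^t (hn') m' + m^t (hn) m - m''^t (hn'') m'' = \operatorname{tr} h\left[n'(m'm'^t) + n(mm^t) - n''(m''m''^t)\right],$$

by defining $\nu''$ as in (7b), and $V''$ and $V^{*\prime\prime}$ as in (7c), we may write the kernel (7d) as

$$\exp\left[-\tfrac{1}{2}(\mu-m'')^t (hn'') (\mu-m'')\right] |h|^{\frac{1}{2}\delta''} \exp\left[-\tfrac{1}{2}\operatorname{tr} hV^{*\prime\prime}\right] |h|^{\frac{1}{2}\nu''-1},$$

which is the kernel of a Normal-Wishart density with parameter $(m'', V'', n'', \nu'')$.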
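The updating rules (7a) to (7c) can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not the authors' procedure: the function names and the numerical prior and statistic are hypothetical, matrices are plain nested lists, and the quantity written `big_delta` follows the convention of (5a) that it equals n - 1 when ν is not positive and 0 otherwise.

```python
def outer(a, b):
    """Outer product a b^t of two vectors given as lists."""
    return [[ai * bj for bj in b] for ai in a]


def posterior_parameters(prior, stat):
    """Combine a Normal-Wishart prior (m', V', n', nu') with a sample
    statistic (m, V, n, nu) following (7a)-(7c).

    Returns (m'', V'', n'', nu''). V'' is returned without the PDS check,
    so the caller still has to set V*'' = 0 when V'' is not PDS.
    Requires n' + n > 0.
    """
    m1, V1, n1, nu1 = prior
    m, V, n, nu = stat
    r = len(m)

    n2 = n1 + n                                            # (7a)
    m2 = [(n1 * a + n * b) / n2 for a, b in zip(m1, m)]    # (7a)

    delta1 = 1 if n1 > 0 else 0      # (5c) applied to the prior
    delta = 1 if n > 0 else 0        # (5c) applied to the sample
    delta2 = 1 if n2 > 0 else 0      # (5c) applied to the posterior
    big_delta = n - 1 if nu <= 0 else 0                    # (5a)
    nu2 = nu1 + nu + r + delta1 + delta - delta2 - big_delta - 1  # (7b)

    p1, p, p2 = outer(m1, m1), outer(m, m), outer(m2, m2)
    V2 = [[V1[i][j] + V[i][j] + n1 * p1[i][j] + n * p[i][j] - n2 * p2[i][j]
           for j in range(r)] for i in range(r)]           # (7c)
    return m2, V2, n2, nu2


# Hypothetical prior and sample statistic for r = 2.
prior = ([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], 1, 2)
stat = ([3.0, 5.0], [[8.0, 14.0], [14.0, 26.0]], 3, 1)
m2, V2, n2, nu2 = posterior_parameters(prior, stat)
```

In this example the posterior mean is the weighted average of m' and m with weights n' and n, and V'' equals V' + V plus a rank-one term in (m - m'), which agrees with the form of C in (10c).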

We remark here that a prior of the form (6a) lacks flexibility when $\nu$ is small because of the manner in which this functional form interrelates the distribution of $\tilde{\mu}$ and $\tilde{h}$. The nature of this interrelationship is currently being examined and will be reported in a later paper.³
2. SAMPLING DISTRIBUTIONS WITH FIXED n

We assume here that a sample of size n is to be drawn from an r-dimensional Independent Multinormal process whose parameter $(\tilde{\mu}, \tilde{h})$ is a random variable having a Normal-Wishart distribution with parameter $(m', V', n', \nu')$.

2.1 Conditional Joint Distribution of $(\tilde{m}, \tilde{V} \mid \mu, h)$

The conditional joint distribution of the statistic $(\tilde{m}, \tilde{V})$ given that the process parameter has value $(\mu, h)$ is, provided $\nu > 0$,

$$D(m, V \mid \mu, h, \nu) = f_N^{(r)}(m \mid \mu, hn)\, f_W^{(r)}(V \mid h, \nu) \tag{8}$$

as shown in section 1.

³ Ando, A., and Kaufman, G., "Extended Natural Conjugate Distributions for the Multinormal Process."
2.2 Siegel's Generalized Beta Function

Siegel [6] established a class of integral identities with matrix argument that generalize the Beta and Gamma functions. We will use these integral identities in the proofs that the unconditional sampling distributions of $\tilde{m}$, of $\tilde{V}$ and of $(\tilde{m}, \tilde{V})$ are as shown in sections 2.3, 2.4, and 2.5. (In fact, the integrand in Siegel's identity for the generalized Gamma function is the kernel of a Wishart density.)

Let X be $(r \times r)$ and define

$$\Gamma_r(a) \equiv \pi^{r(r-1)/4}\, \Gamma(a)\, \Gamma(a-\tfrac{1}{2}) \cdots \Gamma(a-\tfrac{1}{2}r+\tfrac{1}{2}), \tag{9a}$$

$$B_r(a, b) \equiv \frac{\Gamma_r(a)\, \Gamma_r(b)}{\Gamma_r(a+b)}, \tag{9b}$$

where $a > (r-1)/2$, $b > (r-1)/2$. Siegel established the following integral identities: letting $R_X \equiv \{X \mid X \text{ is PDS}\}$,

$$\int_{R_X} \frac{|X|^{a-\frac{1}{2}(r+1)}}{|I+X|^{a+b}}\, dX = B_r(a, b). \tag{9c}$$

Defining $Y \equiv (I+X)^{-1}X$, letting V and B be real symmetric matrices and letting $V < Y < B$ denote the set $\{Y \mid B-Y,\ Y-V \text{ are PDS}\}$; then

$$\int_{R_Y} |Y|^{a-\frac{1}{2}(r+1)}\, |I-Y|^{b-\frac{1}{2}(r+1)}\, dY = B_r(a, b), \tag{9d}$$

where the domain of integration is $R_Y \equiv \{Y \mid 0 < Y < I\}$.

We shall define the standardized generalized Beta density function as

$$f_{\beta}^{(r)}(Y \mid a, b) \equiv B_r^{-1}(a, b)\, |Y|^{a-\frac{1}{2}(r+1)}\, |I-Y|^{b-\frac{1}{2}(r+1)}, \qquad a > \tfrac{1}{2}(r-1), \quad b > \tfrac{1}{2}(r-1), \quad Y \in R_Y. \tag{9e}$$

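Similarly the standardized inverted generalized Beta density function is defined as

$$f_{i\beta}^{(r)}(X \mid a, b) \equiv B_r^{-1}(a, b)\, \frac{|X|^{a-\frac{1}{2}(r+1)}}{|I+X|^{a+b}}, \qquad a > \tfrac{1}{2}(r-1), \quad b > \tfrac{1}{2}(r-1), \quad X \in R_X, \tag{9f}$$

and the non-standardized inverted generalized Beta density function as

$$f_{i\beta}^{(r)}(V \mid a, b, C) \equiv B_r^{-1}(a, b)\, \frac{|V|^{a-\frac{1}{2}(r+1)}\, |C|^{b}}{|V+C|^{a+b}}, \qquad a > \tfrac{1}{2}(r-1), \quad b > \tfrac{1}{2}(r-1), \quad C \text{ is PDS}, \quad V \in R_V \equiv \{V \mid V \text{ is PDS}\}. \tag{9g}$$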

The functions (9e) and (9f) are related via the integrand transform $Y = (I+X)^{-1}X$, which has Jacobian $J(Y, X) = |I-Y|^{-(r+1)}$. The function (9g) may be transformed into (9f) by an integrand transform $TXT^t = V$, where T is a nonsingular upper triangular matrix such that $TT^t = C$.
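The generalized Gamma and Beta functions of (9a) and (9b) are easy to evaluate numerically. The sketch below uses only the standard library; the function names `gamma_r` and `beta_r` are hypothetical. For r = 1 the definitions reduce to the ordinary Gamma and Beta functions, which gives a quick sanity check.

```python
import math


def gamma_r(a, r):
    """Generalized Gamma function of (9a):
    pi^{r(r-1)/4} * Gamma(a) * Gamma(a - 1/2) * ... * Gamma(a - r/2 + 1/2)."""
    if a <= (r - 1) / 2:
        raise ValueError("requires a > (r - 1) / 2")
    value = math.pi ** (r * (r - 1) / 4)
    for i in range(r):
        value *= math.gamma(a - i / 2)
    return value


def beta_r(a, b, r):
    """Generalized Beta function of (9b)."""
    return gamma_r(a, r) * gamma_r(b, r) / gamma_r(a + b, r)


# For r = 1 this is the ordinary Beta function: B(2, 3) = 1! * 2! / 4! = 1/12.
b_scalar = beta_r(2.0, 3.0, 1)
# For r = 2 and a = 3/2: sqrt(pi) * Gamma(3/2) * Gamma(1) = pi / 2.
g_two = gamma_r(1.5, 2)
```

For large arguments the products overflow, so a practical implementation would more likely work with `math.lgamma` and sum logarithms instead.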
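2.3 Unconditional Joint Distribution of $(\tilde{m}, \tilde{V})$

The unconditional (with respect to $(\tilde{\mu}, \tilde{h})$) joint distribution of $(\tilde{m}, \tilde{V})$ has density

$$D(m, V \mid m', V', n', \nu'; n, \nu) \equiv \int_{R_\mu}\int_{R_h} f_N^{(r)}(m \mid \mu, hn)\, f_W^{(r)}(V \mid h, \nu)\, f_{NW}^{(r)}(\mu, h \mid m', V', n', \nu')\, d\mu\, dh, \tag{10a}$$

where the domain of integration $R_\mu$ of $\mu$ is $(-\infty, +\infty)$ and $R_h$ of h is $\{h \mid h \text{ is PDS}\}$. Then

$$D(m, V \mid m', V', n', \nu'; n, \nu) \propto \frac{|V|^{\frac{1}{2}\nu-1}}{|V+C|^{\frac{1}{2}(\nu''+r-1)}}, \tag{10b}$$

where

$$n_u \equiv \frac{n'n}{n'+n} \quad \text{and} \quad C \equiv n_u(m-m')(m-m')^t + V'. \tag{10c}$$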

Proof: Using (4a') and (8) we may write

$$D(m, V \mid m', V', n', \nu'; n, \nu) = \int_{R_h}\left[\int_{R_\mu} f_N^{(r)}(m \mid \mu, hn)\, f_N^{(r)}(\mu \mid m', hn')\, d\mu\right] f_W^{(r)}(V \mid h, \nu)\, f_W^{(r)}(h \mid V', \nu')\, dh.$$

The inner integral is $f_N^{(r)}(m \mid m', hn_u)$. Hence the total integral may be written as

$$\int_{R_h} f_N^{(r)}(m \mid m', hn_u)\, f_W^{(r)}(V \mid h, \nu)\, f_W^{(r)}(h \mid V', \nu')\, dh,$$

which upon replacing the $f_N$ and $f_W$'s by their respective formulas and dropping constants becomes

$$|V|^{\frac{1}{2}\nu-1} \int_{R_h} \exp\left[-\tfrac{1}{2}n_u(m-m')^t h (m-m')\right] \exp\left[-\tfrac{1}{2}\operatorname{tr} h(V'+V)\right] |h|^{\frac{1}{2}(\nu'+\nu+r+\delta'+\delta-\delta''-\Delta-1)-1}\, dh.$$

Using the definitions of (7), this equals

$$|V|^{\frac{1}{2}\nu-1} \int_{R_h} \exp\left[-\tfrac{1}{2}\operatorname{tr} h\{n_u(m-m')(m-m')^t + V' + V\}\right] |h|^{\frac{1}{2}\nu''-1}\, dh.$$

Letting $B \equiv n_u(m-m')(m-m')^t + V + V'$, from (4c) we see that the integrand in the above integral is, aside from the multiplicative constant $w(r, \nu'')\,|B|^{\frac{1}{2}(\nu''+r-1)}$, a Wishart density with parameter $(B, \nu'')$. Hence, apart from a normalizing constant depending on neither V nor B,

$$D(m, V \mid m', V', n', \nu'; n, \nu) \propto \frac{|V|^{\frac{1}{2}\nu-1}}{|B|^{\frac{1}{2}(\nu''+r-1)}} = \frac{|V|^{\frac{1}{2}\nu-1}}{|V + V' + n_u(m-m')(m-m')^t|^{\frac{1}{2}(\nu''+r-1)}},$$

proving (10b).

2.4 Unconditional Distribution of $\tilde{m}$

The kernel of the unconditional (with respect to $(\tilde{\mu}, \tilde{h})$) distribution of $\tilde{m}$ can be found three ways: by utilizing the fact that $\tilde{m} - \tilde{\mu}$ and $\tilde{\mu}$ are conditionally independent given $\tilde{h} = h$ and then finding the unconditional distribution of $\tilde{m}$ as regards $\tilde{h}$; by integrating $D(m, V \mid m', V', n', \nu'; n, \nu)$ over the range of $\tilde{V}$ when this distribution exists, i.e., over $R_V \equiv \{V \mid V \text{ is PDS}\}$; or by integrating the kernel of the marginal likelihood of $\tilde{m}$ defined in (4b) multiplied by the kernel of the prior density of $(\tilde{\mu}, \tilde{h})$ over $R_\mu$ and $R_h$. The first and third methods have the merit of in no way depending on whether or not V is singular; that is, even if $\nu < 0$ ($n < r$), the proofs go through, which of course is not the case if we proceed according to the second method. We show first by the second method that when $\nu > 0$, $\nu' > 0$, and $V'^* = V'$ is PDS,

$$D(m \mid m', V', n', \nu'; n, \nu) = \int_{R_V} D(m, V \mid m', V', n', \nu'; n, \nu)\, dV \propto \left[1 + n_u(m-m')^t V'^{-1} (m-m')\right]^{-\frac{1}{2}(\nu'+r)}. \tag{11a}$$

We then show by the first method that this result holds even when $\nu < 0$. If we define $H_u \equiv \nu' n_u V'^{-1}$, then (11a) may be rewritten as

$$\left[\nu' + (m-m')^t H_u (m-m')\right]^{-\frac{1}{2}(\nu'+r)}. \tag{11b}$$

Provided $\nu' > 0$, $n_u > 0$ and $H_u$ is PDS, this is the kernel of the nondegenerate Student density function with parameter $(m', H_u, \nu')$ as defined in formula (8-26) of [5].
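Proof: In (10b) let $a = (\tfrac{1}{2}\nu - 1) + \tfrac{1}{2}(r+1)$ and $b = \tfrac{1}{2}(\nu''+r-1) - a = \tfrac{1}{2}(\nu'+r)$. Then the kernel of the unconditional distribution of $\tilde{m}$ is proportional to

$$\int_{R_V} \frac{|V|^{a-\frac{1}{2}(r+1)}}{|V+C|^{a+b}}\, dV, \tag{12}$$

where $R_V \equiv \{V \mid V \text{ is PDS}\}$. Then by (9g),

$$D(m \mid m', V', n'; n) \propto |n_u(m-m')(m-m')^t + V'|^{-b} \propto \left(1 + (m-m')^t\, n_u V'^{-1} (m-m')\right)^{-b},$$

establishing (11a).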

Alternate Proof: Another way of establishing (11) is to use the fact that the kernel of the marginal likelihood of $\tilde{m}$ given the parameter $(\mu, h)$ for $n > 0$ observations is, by (5) and (4b'), whether or not $\nu < 0$,

$$\exp\left[-\tfrac{1}{2}n(m-\mu)^t h (m-\mu)\right] |h|^{\frac{1}{2}}. \tag{4b}$$

Furthermore, conditional on $\tilde{h} = h$, $\tilde{\mu}$ and $\tilde{\epsilon} \equiv \tilde{m} - \tilde{\mu}$ are independent Normal random vectors; and so $\tilde{m} = \tilde{\mu} + \tilde{\epsilon}$ is Normal with mean vector

$$E(\tilde{m} \mid h) = E(\tilde{\mu}) + E(\tilde{\epsilon}) = m' + 0 = m'$$

and variance-covariance matrix

$$V(\tilde{m} \mid h) = V(\tilde{\mu}) + V(\tilde{\epsilon}) = (hn_u)^{-1}.$$

Thus, integrating with respect to h,

$$D(m \mid V', n', \nu'; n, \nu) \propto \int_{R_h} \exp\left[-\tfrac{1}{2}\operatorname{tr} h\{n_u(m-m')(m-m')^t + V'\}\right] |h|^{\frac{1}{2}(\nu'+1)-1}\, dh.$$

As the integrand in the above integral is the kernel of a Wishart density with parameter $(n_u(m-m')(m-m')^t + V',\ \nu'+1)$,

$$D(m \mid V', n', \nu'; n, \nu) \propto |n_u(m-m')(m-m')^t + V'|^{-\frac{1}{2}(\nu'+r)}$$

and (11) follows directly.
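The variance step in the alternate proof rests on a small piece of arithmetic: conditional on h, the precisions hn' and hn of the two independent components combine so that 1/n' + 1/n = 1/n_u. The sketch below checks this with exact fractions and evaluates H_u for the scalar case r = 1. The function names and the numerical inputs are hypothetical and serve only as an illustration.

```python
from fractions import Fraction


def n_u(n_prior, n):
    """n_u = n' n / (n' + n), as in (10c), for positive integer n' and n."""
    return Fraction(n_prior * n, n_prior + n)


def student_precision(nu_prior, n_prior, n, v_prior):
    """H_u = nu' * n_u * V'^{-1} of (11b), for r = 1 with scalar V' > 0."""
    return nu_prior * n_u(n_prior, n) / Fraction(v_prior)


# Variances add: (h n')^{-1} + (h n)^{-1} = (h n_u)^{-1}, so 1/n' + 1/n = 1/n_u.
combined = n_u(2, 6)
variance_sum = Fraction(1, 2) + Fraction(1, 6)

# Hypothetical prior (nu' = 4, n' = 2, V' = 3) and sample size n = 6.
h_u = student_precision(4, 2, 6, 3)
```

Because n_u is smaller than both n' and n, the predictive distribution of the sample mean is always more dispersed than either the prior on the process mean or the sampling distribution alone would suggest.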


2.5 Unconditional Distribution of $\tilde{V}$ When $\nu > 0$

The kernel of the unconditional distribution of $\tilde{V}$ can be found by integrating $D(m, V \mid m', V', n', \nu'; n, \nu)$ over the range of $\tilde{m}$, or by integrating the product of the kernel of the marginal likelihood (4c) of $\tilde{V}$ and the kernel of the distribution (6a) of $(\tilde{\mu}, \tilde{h})$ with parameter $(m', V', n', \nu')$ over the range of $(\tilde{\mu}, \tilde{h})$.

We show by the former method that for $a > \tfrac{1}{2}(r-1)$ and $b > \tfrac{1}{2}(r-1)$, when $\nu > 0$ and V is PDS,

$$D(V \mid m', V', n'; n) \propto \frac{|V|^{a-\frac{1}{2}(r+1)}}{|V'+V|^{a+b}}, \tag{16a}$$

where $a = \tfrac{1}{2}(\nu+r-1)$ and $b = \tfrac{1}{2}(\nu'+r-1)$. Letting K be a non-singular upper triangular matrix such that $KK^t = V'$, we may make the integrand transform $KZK^t = V$ and write (16a) as

$$D(Z \mid m', V', n'; n) \propto \frac{|Z|^{a-\frac{1}{2}(r+1)}}{|I+Z|^{a+b}}. \tag{16b}$$

Formula (16b) is the kernel of a standardized inverted generalized Beta function with parameter (a, b).
Proof: From (10b)

$$D(m, V \mid m', V', n'; n, \nu) \propto |V|^{\frac{1}{2}\nu-1}\, |V' + V + n_u(m-m')(m-m')^t|^{-\frac{1}{2}(\nu''+r-1)}.$$

Conditional on $\tilde{V} = V$, the second determinant is the kernel of a Student density of $\tilde{m}$ with parameter $(m',\ n_u(\nu''-1)[V'+V]^{-1},\ \nu''-1)$. After integrating over m, the factor that remains from this determinant and from the normalizing constant of the Student density, and which involves V, is $|V'+V|^{-(a+b)}$; hence (16a) follows.

Now since V' is PDS there is a non-singular triangular matrix K of order r such that $KK^t = V'$. If we make the integrand transform $KZK^t = V$, and let $J(Z, V)$ denote the Jacobian of the transform, the transformed kernel may then be written as

$$\frac{|KZK^t|^{a-\frac{1}{2}(r+1)}}{|KK^t + KZK^t|^{a+b}}\, J(Z, V) = |KK^t|^{-b-\frac{1}{2}(r+1)}\, J(Z, V)\, \frac{|Z|^{a-\frac{1}{2}(r+1)}}{|I+Z|^{a+b}}.$$

Since $J(Z, V) = |K|^{r+1} = |KK^t|^{\frac{1}{2}(r+1)} = |V'|^{\frac{1}{2}(r+1)}$, we may write the transformed kernel as shown in (16b).

3. PREPOSTERIOR ANALYSIS WITH FIXED n > 0

We assume that a sample of fixed size $n > 0$ is to be drawn from an r-dimensional Multinormal process whose mean vector $\tilde{\mu}$ and matrix precision $\tilde{h}$ are not known with certainty but are regarded as random variables $(\tilde{\mu}, \tilde{h})$ having a prior Normal-Wishart distribution with parameter $(m', V', n', \nu')$ where $n' > 0$, $\nu' > 0$, but V' may or may not be singular.

3.1 Joint Distribution of $(\tilde{m}'', \tilde{V}'')$

The joint density of $(\tilde{m}'', \tilde{V}'')$ is, provided $n' > 0$, $\nu' > 0$, and $\nu > 0$,

$$D(m'', V'' \mid m', V', n', \nu'; n, \nu) \propto \frac{|V'' - V' - n^*(m''-m')(m''-m')^t|^{\frac{1}{2}\nu-1}}{|V''|^{\frac{1}{2}(\nu''+r-1)}}, \tag{17a}$$

where

$$n^* \equiv n'n''/n, \tag{17b}$$

and the range of $(m'', V'')$ is

$$R_{(m'', V'')} \equiv \{(m'', V'') \mid -\infty < m'' < +\infty \text{ and } V'' - C \text{ is PDS}\}. \tag{17c}$$

Proof: Following the argument of subsections 12.1.4 and 12.6.1 of [5] we can establish that

$$n_u(m-m')^t h (m-m') = n^*(m''-m')^t h (m''-m'), \tag{18a}$$

where $n^* = n'n''/n$. The same line of argument allows us to write

$$V'' = V' + V + n'(m'm'^t) + n(mm^t) - n''(m''m''^t) = V' + V + n^*(m''-m')(m''-m')^t. \tag{18b}$$
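The identities (18a) and (18b) follow from the definitions in (7a) by direct substitution. The short derivation below is a sketch added for illustration and uses only symbols already defined in the text.

```latex
\begin{align*}
% From (7a), with n'' = n' + n:
m'' - m' &= \frac{n'm' + nm}{n''} - m' = \frac{n}{n''}\,(m - m'). \\
% Substituting into the right-hand side of (18a), with n^* = n'n''/n:
n^*(m''-m')^t h\,(m''-m')
  &= \frac{n'n''}{n}\cdot\frac{n^2}{n''^2}\,(m-m')^t h\,(m-m')
   = \frac{n'n}{n''}\,(m-m')^t h\,(m-m')
   = n_u\,(m-m')^t h\,(m-m'). \\
% Expanding n''m''m''^t = (n'm'+nm)(n'm'+nm)^t / n'' gives, for (18b):
n'm'm'^t + nmm^t - n''m''m''^t
  &= \frac{n'n}{n''}\left(m'm'^t + mm^t - m'm^t - mm'^t\right)
   = n_u\,(m-m')(m-m')^t
   = n^*(m''-m')(m''-m')^t.
\end{align*}
```

The last line also shows that $V'' = V + C$, with C as defined in (10c).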

From (7) we have

$$(m, V) = \left(\frac{1}{n}(n''m'' - n'm'),\ \ V'' - V' - n^*(m''-m')(m''-m')^t\right).$$

Letting $J(m'', V''; m, V)$ denote the Jacobian of the integrand transformation from (m, V) to $(m'', V'')$, we make this transformation in (10b), obtaining (17a). Since $J(m'', V''; m, V) = J(m'', m)\, J(V'', V)$ and both $J(m'', m)$ and $J(V'', V)$ are constants involving neither $m''$ nor $V''$, neither does $J(m'', V''; m, V)$. When $\nu < 0$ and V is singular the numerator in (17a) vanishes, so that the density exists only if $\nu > 0$. However, if $\nu > 0$, the kernel in (17a) exists even if V' is singular ($V'^* = 0$). Hence we write the kernel as shown in (17a).

That the range of $(m'', V'')$ is $R_{(m'', V'')}$ as defined in (17c) follows directly from the definitions of $\tilde{m}''$ and $\tilde{V}''$, of $R_m$ and $R_V$, and of m' and V'.
3.2 Some Distributions of $\tilde{m}''$

The unconditional distribution of $\tilde{m}''$ is easily derived from the unconditional distribution (11b) of $\tilde{m}$: provided $n > 0$, $n' > 0$, $\nu' > 0$, and V' is PDS,

$$D(m'' \mid m', V', n', \nu'; n, \nu) = f_S^{(r)}\left(m'' \mid m', (n''/n)^2 H_u, \nu'\right), \tag{19a}$$

where

$$H_u \equiv \nu' n_u V'^{-1}; \tag{19b}$$

and the conditional distribution of $\tilde{m}''$ given $\tilde{V}'' = V''$ is, provided $\nu > 0$, $n' > 0$, $\nu' > 0$,

$$D(m'' \mid m', V', n', \nu'; n, \nu, V'') = f_{iS}^{(r)}(m'' \mid m', H^*, \nu), \tag{20a}$$

where

$$H^* \equiv \nu n^* (V'' - V')^{-1}. \tag{20b}$$

The right-hand side of (20a) is the inverted Student density function with parameter $(m', H^*, \nu)$ as defined in formula (8-34) of [5].
Proof: Since $\tilde{m}'' = (n'')^{-1}(n'm' + n\tilde{m})$ and since from (11b), when $n > 0$, $n' > 0$, $\nu' > 0$, and V' is PDS,

$$\tilde{m} \sim f_S^{(r)}(m \mid m', H_u, \nu'),$$

by Theorem 1 of subsection 8.3.2 of [1],

$$\tilde{m}'' \sim f_S^{(r)}\left(m'' \mid m', (n''/n)^2 H_u, \nu'\right).$$

To prove (20) observe that the kernel of the conditional distribution of $\tilde{m}''$ given $\tilde{V}'' = V''$ is proportional to (17a), and so

$$D(m'' \mid m', V', n', \nu'; n, \nu, V'') \propto |V'' - V' - n^*(m''-m')(m''-m')^t|^{\frac{1}{2}\nu-1}.$$

Since

$$V'' - V' = V + n^*(m''-m')(m''-m')^t,$$

$(V'' - V')$ will be PDS as long as $\nu > 0$, and so $(V'' - V')^{-1}$ is also PDS. Using a well known determinantal identity and letting $H^*$ be as defined in (20b), when $V'' - V'$ is PDS we may write the density of $\tilde{m}''$ as

$$\left[1 - n^*(m''-m')^t (V''-V')^{-1} (m''-m')\right]^{\frac{1}{2}\nu-1} = \nu^{-\frac{1}{2}\nu+1}\left[\nu - (m''-m')^t H^* (m''-m')\right]^{\frac{1}{2}\nu-1},$$

which, aside from the constant $\nu^{-\frac{1}{2}\nu+1}$, is the kernel of an inverted Student density function with parameter $(m', H^*, \nu)$.
3.3 Analysis When n < r

Even when $n < r$, it is possible to do Bayesian inference on $(\tilde{\mu}, \tilde{h})$ by appropriately structuring the prior so that the posterior of $(\tilde{\mu}, \tilde{h})$ is nondegenerate. For example, if the data generating process is Multinormal, if we assign a prior on $(\tilde{\mu}, \tilde{h})$ with parameter (0, 0, 0, 0), and then observe a sample $x^{(1)}, \ldots, x^{(n)}$ where $n < r$, then $\nu < 0$ and the posterior parameters defined in (7) assume values

$$n'' = n, \qquad m'' = m, \qquad \nu'' = \nu + r - \Delta - 1 = 0, \qquad V^{*\prime\prime} = 0.$$

The posterior distribution of $(\tilde{\mu}, \tilde{h})$ is degenerate under these circumstances.

If, however, we insist on assigning a very diffuse prior on $(\tilde{\mu}, \tilde{h})$, but are willing to introduce just enough prior information to make $V''$ non-singular and $\nu'' > 0$, then the posterior distribution is non-degenerate; e.g., assign $\nu' = 1$, $V' = MI$, $M > 0$, and leave $n' = 0$ and $m' = 0$, so that $\nu'' = 1$, $V'' = V + V' = V + MI$. In this case we have a bona fide non-degenerate Normal-Wishart posterior distribution of $(\tilde{\mu}, \tilde{h})$.

Furthermore, the unconditional distribution of the next sample observation $\tilde{x}^{(n+1)}$ exists and is, by (11b), multivariate Student with parameter

$$\left(m,\ \frac{n}{n+1}(V + MI)^{-1},\ 1\right).$$

In addition, for this example the distribution, unconditional as regards $(\tilde{\mu}, \tilde{h})$, of the mean $\bar{x}$ of the next $n^\circ$ observations exists even though $n < r$ and is by (11b) multivariate Student with parameter $(m,\ n_u(V + MI)^{-1},\ 1)$, where $n_u = n^\circ n/(n^\circ + n)$. This distribution is, in effect, a probabilistic forecast of $\bar{x}$. From (19a) it also follows that the distribution, unconditional as regards $(\tilde{\mu}, \tilde{h})$, of the posterior mean $\tilde{m}''$ prior to observing $x^{(n+1)}, \ldots, x^{(n+n^\circ)}$ is multivariate Student with parameter

$$\left(m,\ n\left(1 + \frac{n}{n^\circ}\right)(V + MI)^{-1},\ 1\right).$$
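The example of this section can be traced numerically. The sketch below takes r = 3 and n = 2 hypothetical observations, so that V is singular, and then applies the diffuse prior with ν' = 1, V' = MI, n' = 0, m' = 0 described above. The sample values, the choice M = 0.5, and the helper `det3` are assumptions made only for illustration; the treatment of Δ follows (5a) with Δ = n - 1 when ν is not positive.

```python
def det3(a):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


r = 3
sample = [[1.0, 0.0, 2.0], [3.0, 2.0, 2.0]]   # hypothetical data, n = 2 < r = 3
n = len(sample)

# Sufficient statistics (3a)-(3b); V has rank at most n - 1 and is singular here.
m = [sum(x[i] for x in sample) / n for i in range(r)]
V = [[sum((x[i] - m[i]) * (x[j] - m[j]) for x in sample)
      for j in range(r)] for i in range(r)]
nu = n - r

# Diffuse prior of section 3.3: nu' = 1, V' = M I, n' = 0, m' = 0.
M = 0.5
nu_prior, n_prior = 1, 0
V_prior = [[M if i == j else 0.0 for j in range(r)] for i in range(r)]

# Posterior parameters from (7a)-(7c); with n' = 0 and m' = 0, V'' = V + M I.
big_delta = n - 1 if nu <= 0 else 0
delta_prior = 1 if n_prior > 0 else 0
delta = 1 if n > 0 else 0
delta_post = 1 if n_prior + n > 0 else 0
nu_post = nu_prior + nu + r + delta_prior + delta - delta_post - big_delta - 1
V_post = [[V[i][j] + V_prior[i][j] for j in range(r)] for i in range(r)]

det_V = det3(V)            # zero: the sample alone gives a singular V
det_V_post = det3(V_post)  # positive: the posterior V'' is non-singular
```

A positive determinant alone does not prove that a matrix is PDS in general; here V'' is the sum of a positive semidefinite V and the positive definite MI, which is what guarantees that it is PDS.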

REFERENCES

[1] Anderson, T. W., Introduction to Multivariate Statistical Analysis. New York: John Wiley and Sons, Inc., 1958.
[2] Deemer, W. L., and Olkin, I., "The Jacobians of certain matrix transformations useful in multivariate analysis," Biometrika, 40 (1953), 43-6.
[3] Geisser, Samuel, and Cornfield, Jerome, "Posterior distributions for multivariate normal parameters," Journal of the Royal Statistical Society, Series B, 25 (1963), 368-76.
[4] Jeffreys, H., Theory of Probability. London: Oxford University Press, Amen House, 1961.
[5] Raiffa, H., and Schlaifer, R., Applied Statistical Decision Theory. Boston, Massachusetts: Division of Research, Harvard Business School, 1961.
[6] Siegel, C. L., "Über die analytische Theorie der quadratischen Formen," Annals of Mathematics, 36 (1935), 527-606.
[7] Tiao, George C., and Zellner, Arnold, "On the Bayesian estimation of multivariate regression," Journal of the Royal Statistical Society, Series B, 26 (1964), 277-85.
