
Kafkas Univ Vet Fak Derg

18 (4): 627-632, 2012

RESEARCH ARTICLE

Predicting Breeding Values in Animals by Kalman Filter: Application to Body Condition Scores in Dairy Cattle †

Burak KARACAÖREN *   Luc L.G. JANSS **   Haja N. KADARMIDEEN ***

† Dr Karacaören acknowledges funding from Akdeniz University and Eidgenössische Technische Hochschule
* Department of Animal Science, Faculty of Agriculture, University of Akdeniz, TR-07070 Antalya - TURKEY
** Department of Genetics and Biotechnology, Research Centre Foulum, Blichers Allé 20, University of Aarhus, DENMARK
*** Section of Quantitative and System Genetics, Department of Animal and Veterinary Sciences, Faculty of Life Sciences, University of Copenhagen, DENMARK

Makale Kodu (Article Code): KVFD-2012-6107


Summary
The aim of this study was to investigate the usefulness of the Kalman Filter (KF) Random Walk methodology (KF-RW) for the prediction of breeding values in animals. We used body condition scores (BCS) from dairy cattle to illustrate the use of KF-RW. BCS was measured by the Swiss Holstein Breeding Association 7 times, at approximately monthly intervals during May 2004-March 2005, on dairy cows (n=80) stationed at the Chamau research farm of the Eidgenössische Technische Hochschule (ETH), Switzerland. The benefits of KF were demonstrated using random walk models in simulations. Breeding values for BCS were predicted over days in milk by KF-RW. Variance components were estimated by Gibbs sampling. Locally weighted scatter plot smoothing (LOWESS) and KF-RW were compared under different longitudinal experimental designs, and the results showed that KF-RW gave more reasonable estimates, especially at smaller LOWESS smoother spans. Estimates of variance components were more accurate when the number of observations and the number of subjects increased, and increasing these quantities decreased the standard errors. Fifty subjects with 10 observations each started to give reasonable estimates. Posterior means (with standard errors) of the variance components were 0.03 (0.006) for animal genetic variance, 0.04 (0.007) for permanent environmental variance and 0.21 (0.02) for error variance. Since KF gives online estimation of breeding values and does not need to store or invert matrices, this methodology could be useful in the animal breeding industry for obtaining online estimation of breeding values over days in milk.

Keywords: Kalman filter, Body condition score, Bayesian methods

Prediction of Breeding Values by Kalman Filter: An Application to Body Condition Scores Collected from Dairy Cattle

Summary (translated from the Turkish Özet)
The main aim of this study was to investigate the suitability of using the random walk model with the Kalman filter (KF-RW) to predict time-dependent breeding values in farm animals. The data set was obtained by the Swiss Holstein association from dairy cows at the Chamau institute (n=80). The locally weighted scatter plot smoothing (LOWESS) method and KF-RW were compared by simulation for different experimental designs; although the results were similar, KF-RW gave more suitable results under some conditions. The effects of the numbers of individuals and observations on the estimation of variance components and states were examined, and increasing both was found to give more accurate estimates with smaller standard errors. Variance components were estimated by Gibbs sampling. Posterior means were 0.03 (0.006) for genetic variance, 0.04 (0.007) for permanent environment and 0.21 (0.02) for error variance. Because the Kalman filter gives up-to-date predictions of breeding values and does not require matrices to be inverted, it may be useful for predicting time-dependent breeding values from large data sets.

Keywords (translated): Kalman filter, Body condition scores, Bayesian methods

Correspondence
+90 242 2274560
burakkaracaoren@akdeniz.edu.tr


INTRODUCTION
A filter is a device used, for example, to separate water from particles. By analogy, this idea was extended in engineering to the separation of signals from noise [1]. The Kalman Filter (KF) was given a Bayesian formulation by Harrison and Stevens [2] as the Bayesian Dynamic Linear Model [3], although West and Harrison [4] remarked that saying "Bayesian forecasting is Kalman filtering" is akin to saying that "statistical inference is least squares". Migon et al. [5] described some special characteristics of KF as follows: i) all relevant information is used, including history, factual or subjective experiences, and knowledge of forthcoming events; ii) routine forecasting is produced by a statistical model, and exceptions can be considered either in anticipation or retrospectively; iii) retrospective (what happened) and prospective (what if) analyses are easily accommodated; iv) model decomposition: a full Bayesian forecasting model may be decomposed into independent dynamic linear models, each one describing particular features of the process under analysis. Further details on the Kalman Filter literature can be found in [6].
The KF methodology could be useful in the animal breeding industry for analyzing very large time series data sets, since it does not need to store or invert matrices. In addition, online estimation of breeding values could be useful for selection schemes over time. As shown by Van Bebber et al. [7], KF can also be used to detect false measurements on experimental farms.
We investigated the possible application of KF-RW under field conditions. In this context, we chose body condition score (BCS) in dairy cows [8,9]. Body weight, feed (dry matter) intake and milk production level together form an important cluster of functional traits that determines the amount of fat reserves stored in the body; this is now recorded in many countries as body condition score, or BCS [8,9]. A common body condition scoring system has been developed to estimate the BCS of cows in a herd. This system provides producers with a relative score based on an evaluation of fat deposits in relation to skeletal features. The scoring method involves a manual assessment of the thickness of fat cover and the prominence of bone at the tail head and loin area. The most widely used body condition scoring system for dairy cattle assigns scores from 1 to 9 in North America and from 1 to 5 in most European countries, with the lowest score meaning emaciated and carrying virtually no fat and the highest score meaning excessively fat. Veerkamp and Brotherstone [10] estimated variance components for BCS at calving and for average BCS over the first 26 wk of lactation; they reported heritability estimates for BCS ranging between 0.24 and 0.43. Jones et al. [11] reported moderate heritability estimates of BCS for Holstein Friesian heifers, varying with stage of lactation from 0.23 to 0.28.
Sallas and Harville [12] suggested using KF for the prediction of breeding values for consecutive lactations. Forni et al. [13] used a dynamic linear model via KF to model cattle growth data. The main aim of this study was to provide theoretical developments of KF-RW in the context of adapting it to the analysis of animal breeding data, with an emphasis on the application of KF-RW in predicting breeding values over days in milk (DIM) for BCS. This would be the first use of KF-RW for predicting breeding values over DIM. Simulations were performed to check the validity of the methods and to compare different sub-models.

MATERIAL and METHODS


Statistical Models and Analyses
Random Walk Model: For demonstration purposes we used the random walk model given below.

$$
\begin{aligned}
y_t &= a_t + \varepsilon_t, & \varepsilon_t &\sim N(0, \sigma_e^2) \\
a_{t+1} &= a_t + \eta_t, & \eta_t &\sim N(0, \sigma_n^2)
\end{aligned}
\tag{1}
$$

In (1) the first equation is called the observation equation and the second equation is called the state equation. We assumed that the observations, $y_t$, depend on an unobservable quantity, $a_t$, and our aim was to do statistical inference on $a_t$ (the states). We assumed constant variances, $\sigma_e^2$ and $\sigma_n^2$, for $\varepsilon_t$ and $\eta_t$ respectively, which are independent, identically and normally distributed random variables with zero means.
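The recursive form of the random walk model (1) can be illustrated with a minimal sketch of the standard Kalman filter recursion for this model. The sketch assumes that both variances are known, which is simpler than the Gibbs sampling setting of this study; the function names, the starting level of 200 and the diffuse prior variance of 1e6 are hypothetical choices made for illustration only.

```python
import numpy as np


def simulate_random_walk(n_obs, var_state, var_error, start=200.0, seed=0):
    """Simulate states a_t and observations y_t from the random walk model (1)."""
    rng = np.random.default_rng(seed)
    # State equation: a_{t+1} = a_t + eta_t, eta_t ~ N(0, var_state)
    steps = rng.normal(0.0, np.sqrt(var_state), size=n_obs - 1)
    states = start + np.concatenate(([0.0], np.cumsum(steps)))
    # Observation equation: y_t = a_t + eps_t, eps_t ~ N(0, var_error)
    y = states + rng.normal(0.0, np.sqrt(var_error), size=n_obs)
    return states, y


def kalman_filter_rw(y, var_state, var_error, prior_mean=0.0, prior_var=1e6):
    """Filtered means and variances of a_t given y_1..y_t (variances assumed known)."""
    mean, var = prior_mean, prior_var
    means = np.empty(len(y))
    variances = np.empty(len(y))
    for t, obs in enumerate(y):
        if t > 0:
            var = var + var_state  # prediction: the state takes one random step
        gain = var / (var + var_error)  # Kalman gain
        mean = mean + gain * (obs - mean)  # correct with the one-step forecast error
        var = (1.0 - gain) * var
        means[t] = mean
        variances[t] = var
    return means, variances


if __name__ == "__main__":
    states, y = simulate_random_walk(200, var_state=100.0, var_error=25.0)
    means, variances = kalman_filter_rw(y, var_state=100.0, var_error=25.0)
    print("last filtered state:", round(means[-1], 2))
    print("last filtering variance:", round(variances[-1], 2))
```

Each update uses only the previous mean and variance, so no matrix has to be stored or inverted as new records arrive.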
The LOWESS smoother [14] was used for comparison with the random walk model (1) in recursive form (KF-RW).
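To make the role of the smoother span concrete, a simplified locally weighted regression can be sketched as follows. This is not the R routine used in the study: it is a hypothetical helper, `lowess_simple`, that fits a local straight line with tricube weights at every point and omits the robustness iterations of the full LOWESS procedure.

```python
import numpy as np


def lowess_simple(x, y, span):
    """Local linear fit with tricube weights; span is the fraction of points used."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    k = min(n, max(2, int(np.ceil(span * n))))  # neighbours per local fit
    fitted = np.empty(n)
    for i in range(n):
        dist = np.abs(x - x[i])
        nearest = np.argsort(dist)[:k]
        bandwidth = dist[nearest].max()
        if bandwidth == 0.0:
            bandwidth = 1.0  # all neighbours coincide with x[i]
        weights = (1.0 - (dist[nearest] / bandwidth) ** 3) ** 3  # tricube kernel
        # Weighted least squares for a straight line centred at x[i]
        design = np.column_stack([np.ones(k), x[nearest] - x[i]])
        lhs = design.T @ (design * weights[:, None])
        rhs = design.T @ (weights * y[nearest])
        intercept, _slope = np.linalg.lstsq(lhs, rhs, rcond=None)[0]
        fitted[i] = intercept  # value of the local line at x[i]
    return fitted
```

A small span follows the observations closely, whereas a large span approaches a single straight line, which is why the span acts as a tuning parameter.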
Parameter values assumed were $\sigma_n^2 = 100$ and $\sigma_e^2 = 25$ for simulating observations, y, from the random walk model. A total of 1000 Gibbs cycles, with the first 50 cycles used as the burn-in period, were used to obtain estimates of the states, $a_t$, and of the variance components $\sigma_n^2$ and $\sigma_e^2$. For genetic analyses of traits the following mixed model is normally used in animal breeding:

$$
y = X\beta + Z_a a + Z_p p + e \tag{2}
$$

where y is the vector of observations, $\beta$ is the vector of fixed effects, a is the vector of breeding values, p is the vector of random permanent environmental effects, X, $Z_a$ and $Z_p$ are design matrices and e is the vector of random residual effects.
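As an illustration of how the design matrices in the mixed model (2) link records to effects, the following sketch builds X, $Z_a$ and $Z_p$ for a hypothetical toy data set of 3 cows with 2 records each. All names and values are invented for the example, and the breeding values are drawn as if the animals were unrelated (A = I), which a real analysis would not assume.

```python
import numpy as np


def incidence_matrix(labels, levels):
    """One row per record, one column per level, with a single 1 in each row."""
    column = {level: j for j, level in enumerate(levels)}
    Z = np.zeros((len(labels), len(levels)))
    for i, label in enumerate(labels):
        Z[i, column[label]] = 1.0
    return Z


# Hypothetical toy data: 3 cows with 2 records each, plus 2 parents without records.
pedigree = ["sire", "dam", "cow1", "cow2", "cow3"]
cows = ["cow1", "cow2", "cow3"]
record_cow = ["cow1", "cow1", "cow2", "cow2", "cow3", "cow3"]
record_breed = ["HF", "HF", "BS", "BS", "HF", "HF"]

X = incidence_matrix(record_breed, ["HF", "BS"])  # fixed effects (breed only)
Z_a = incidence_matrix(record_cow, pedigree)      # breeding values of all pedigree animals
Z_p = incidence_matrix(record_cow, cows)          # permanent environment of recorded cows

rng = np.random.default_rng(0)
beta = np.array([3.2, 3.0])                       # illustrative breed means
a = rng.normal(0.0, np.sqrt(0.03), size=len(pedigree))  # assumes A = I for simplicity
p = rng.normal(0.0, np.sqrt(0.04), size=len(cows))
e = rng.normal(0.0, np.sqrt(0.21), size=len(record_cow))
y = X @ beta + Z_a @ a + Z_p @ p + e              # model (2)
```

Animals without records (here the two parents) receive all-zero columns in $Z_a$; in a full analysis they are connected to the recorded cows only through the relationship matrix.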

For the random effects it was assumed that

$$
\begin{bmatrix} a_0 \\ p_0 \\ e \end{bmatrix} \sim N\left( 0;\ \begin{bmatrix} A\sigma_a^2 & 0 & 0 \\ 0 & I\sigma_p^2 & 0 \\ 0 & 0 & I\sigma_e^2 \end{bmatrix} \right),
\qquad
\begin{bmatrix} a_t \\ p_t \end{bmatrix} \sim N\left( \begin{bmatrix} a_{t-1} \\ p_{t-1} \end{bmatrix};\ \begin{bmatrix} A\sigma_a^2 & 0 \\ 0 & I\sigma_p^2 \end{bmatrix} \right)
$$

where $\sigma_a^2$, $\sigma_p^2$ and $\sigma_e^2$ are the genetic, permanent environment and error variances, A is the additive genetic relationship matrix for the animals, and I is an identity matrix.

In the following, we show the general assumptions used in the KF-RW method, based on Bayesian principles. The proportional joint posterior distribution, without constant terms, is given in (3), using (2) and based on the following recursive relationship:

$$
p(\theta_t \mid Y_t) \propto p(y_t \mid \theta_t, Y_{t-1})\, p(\theta_t \mid Y_{t-1})
$$

where $\theta_t$ denotes the unknown states at time t and $Y_{t-1}$ the observations up to time t-1.

$$
\begin{aligned}
f(\beta_t, a_t, p_t, \sigma_a^2, \sigma_p^2, \sigma_e^2 \mid y_t) \propto\
& (\sigma_e^2)^{-N/2} \exp\left\{ -\frac{(y_t - X\beta_t - Z_a a_t - Z_p p_t)^T (y_t - X\beta_t - Z_a a_t - Z_p p_t)}{2\sigma_e^2} \right\} \\
& \times (\sigma_a^2)^{-N_a/2} \exp\left\{ -\frac{(a_t - a_{t-1})^T A^{-1} (a_t - a_{t-1})}{2\sigma_a^2} \right\} \\
& \times (\sigma_p^2)^{-N_p/2} \exp\left\{ -\frac{(p_t - p_{t-1})^T I^{-1} (p_t - p_{t-1})}{2\sigma_p^2} \right\} \\
& \times (\sigma_e^2)^{-\left(\frac{v_e}{2}+1\right)} \exp\left( -\frac{v_e S_e^2}{2\sigma_e^2} \right)
(\sigma_a^2)^{-\left(\frac{v_a}{2}+1\right)} \exp\left( -\frac{v_a S_a^2}{2\sigma_a^2} \right)
(\sigma_p^2)^{-\left(\frac{v_p}{2}+1\right)} \exp\left( -\frac{v_p S_p^2}{2\sigma_p^2} \right)
\end{aligned}
\tag{3}
$$

The last line of (3) is the product of the densities of the scaled inverted chi-square distributions assumed as priors for the variance parameters.

After algebraic manipulations, the conditional distributions can be written as follows:

$$
\begin{aligned}
\beta_t \mid \sigma_a^2, \sigma_p^2, \sigma_e^2, a_t, p_t, y_t &\sim N\left( (X^T X)^{-1} X^T (y_t - Z_a a_t - Z_p p_t),\ (X^T X)^{-1} \sigma_e^2 \right) \\
a_t \mid \sigma_a^2, \sigma_p^2, \sigma_e^2, \beta_t, p_t, y_t &\sim N\left( \left( \sigma_e^{-2} Z_a^T Z_a + \sigma_a^{-2} A^{-1} \right)^{-1} \left( \sigma_e^{-2} Z_a^T (y_t - X\beta_t - Z_p p_t) + \sigma_a^{-2} A^{-1} a_{t-1} \right),\ \left( \sigma_e^{-2} Z_a^T Z_a + \sigma_a^{-2} A^{-1} \right)^{-1} \right) \\
p_t \mid \sigma_a^2, \sigma_p^2, \sigma_e^2, a_t, \beta_t, y_t &\sim N\left( \left( \sigma_e^{-2} Z_p^T Z_p + \sigma_p^{-2} I^{-1} \right)^{-1} \left( \sigma_e^{-2} Z_p^T (y_t - X\beta_t - Z_a a_t) + \sigma_p^{-2} I^{-1} p_{t-1} \right),\ \left( \sigma_e^{-2} Z_p^T Z_p + \sigma_p^{-2} I^{-1} \right)^{-1} \right) \\
\sigma_a^2 \mid \sigma_p^2, \sigma_e^2, a_t, \beta_t, p_t, y_t &\sim \frac{Q_a + v_a S_a^2}{\chi^2_{DF}} \\
\sigma_p^2 \mid \sigma_a^2, \sigma_e^2, a_t, \beta_t, p_t, y_t &\sim \frac{Q_p + v_p S_p^2}{\chi^2_{DF}} \\
\sigma_e^2 \mid \sigma_a^2, \sigma_p^2, a_t, \beta_t, p_t, y_t &\sim \frac{Q_e + v_e S_e^2}{\chi^2_{DF}}
\end{aligned}
$$

where, in the last lines, Q stands for the quadratic form of the respective error terms and DF for the degrees of freedom.

LOWESS: The LOWESS model [12] was used to capture the local variability by weighted least squares regression using different smoother spans.

Implementation

All computations were made in R [14], using the package MASS [15] implemented in R for sampling from multivariate Normal distributions. We compared the KF-RW and LOWESS approaches. We investigated the effect of different numbers of subjects and different numbers of observations per subject on the estimation of both states and variance components. Finally, we applied the theory to animal breeding data to predict breeding values over DIM.

Application of KF-RW Method to BCS Data

BCS was measured by the Swiss Holstein Breeding Association as described by Trimberger [16], using a 1 to 5 scale, 7 times at approximately monthly intervals during May 2004-March 2005 on multiparous dairy cows (n=80) stationed at the Chamau research farm of the Swiss Federal Institute of Technology, Switzerland. The experimental procedures of the farm followed the Swiss Law on Animal Protection and were approved by the Committee for the Permission of Animal Experiments of the Canton of Zug, Switzerland. Results of the 7 BCS sessions were split into 4 periods over DIM to reduce the number of missing values. Missing values were filled in with subject-specific averages. A summary of the dataset used for the BCS analysis is given in Table 1. The pedigree file included 637 animals.
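Filling missing values with subject-specific averages can be sketched as below. The layout (one row per cow, one column per period, missing scores stored as NaN) and the function name are assumptions made for this example; the study does not describe how the step was coded.

```python
import numpy as np


def fill_with_subject_means(scores):
    """Replace NaN entries by the mean of the same row (one row per animal)."""
    scores = np.asarray(scores, dtype=float)
    subject_means = np.nanmean(scores, axis=1)  # ignores missing values
    rows, _cols = np.nonzero(np.isnan(scores))  # row index of every missing cell
    filled = scores.copy()
    filled[np.isnan(scores)] = subject_means[rows]
    return filled


if __name__ == "__main__":
    # Hypothetical BCS records for 2 cows over 4 periods
    bcs = [[3.0, np.nan, 3.5, 3.25],
           [2.5, 2.75, np.nan, np.nan]]
    print(fill_with_subject_means(bcs))
```

Because every imputed value equals the cow's own mean, this step leaves the between-cow differences unchanged but reduces the within-cow variation, which is worth keeping in mind when reading the small variance estimates.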

Table 1. Number of records, means, and standard errors for body condition scores for some selected days of first lactation

             Body Condition Score
DIM          n     Mean    SE
(missing)    80    3.24    0.006
75           80    3.16    0.005
150          80    3.22    0.005
305          80    3.21    0.006

For genetic analyses of BCS the following mixed model was used:

$$
y = X\beta + Z_a a + Z_p p + e \tag{4}
$$

where y is the vector of observations, comprising 320 observations (80 animals with 4 repeated measurements each), $\beta$ is the vector of fixed effects including breed (n=80; b1 = Holstein-Friesian, n1=42; b2 = Brown Swiss, n2=38), age at calving, and year-season interaction, a is the vector of breeding values, p is the vector of random permanent environmental effects, X, $Z_a$ and $Z_p$ are design matrices and e is the vector of random residual effects.

RESULTS

Random Walk Model for Simulated Data Sets

We compared predictions of observations simulated from the random walk model, using different smoother spans for LOWESS, with those of KF-RW (Table 2). Predictions were found to be reasonable for both the KF-RW models and the LOWESS approach.

Table 2. Means of predictions of observations simulated from the random walk model, by different smoother spans for LOWESS and by the Kalman Filter random walk methodology

Smoother Span   Observations     LOWESS           Kalman Filter
f=0.1           201.19 (0.12)    202.96 (0.02)    202.78 (0.08)
f=0.5           199.40 (0.11)    201.32 (0.06)    199.71 (0.07)
f=0.7           261.40 (0.25)    264.42 (0.21)    263.57 (0.23)
f=0.9           157.08 (0.17)    162.35 (0.12)    159.15 (0.15)
f=0.01          203.62 (0.13)    202.93 (0.06)    204.71 (0.09)
f=0.05          160.75 (0.15)    161.41 (0.11)    161.51 (0.13)
f=0.07          184.61 (0.16)    182.30 (0.13)    185.08 (0.14)
f=0.09          116.91 (0.18)    119.25 (0.15)    117.42 (0.17)

We also investigated the effect of different sample sizes on the state and variance component estimates from KF-RW (Table 3). The lowest number of observations (n=5) with the lowest number of subjects (n=50) gave reasonable predictions of the states, and increasing both sample sizes also decreased the standard errors of the state estimates. However, the predictions of the variance components were far from their parameter values with the lowest number of observations (n=5), and this effect gradually decreased as the number of subjects increased with the number of observations held fixed. Standard errors of the predicted variance components decreased as both the number of observations and the number of subjects increased.

Prediction of Breeding Values for Body Condition Scores by Kalman Filter

We analyzed the BCS data for permanent environmental variance without genetic effects in order to decide on the number of Gibbs cycles and the burn-in period. Repeated runs of the same analyses with the same priors showed that Monte Carlo errors were small. Two thousand Gibbs cycles with a burn-in period of 200 were found to be reasonable. However, probably because each cow had only a small number of repeated measurements (n=4), different prior values gave different estimates, especially for smaller values of the scaling factor, s (Table 4).

Increasing the scaling factor, s, gave more stable estimates; hence v=5 with s=0.5 was chosen for the parameters of the scaled inverted chi-square prior distribution for genetic analyses of the BCS dataset. Scatter plots of the Gibbs samples were visually investigated (figures not shown); since no patterns were observed for any of the variance components, it was decided that the sample size was reasonable. Posterior means of the variance components were 0.21 (0.02) for error variance, 0.03 (0.006) for animal genetic variance and 0.04 (0.007) for permanent environmental variance. In this data set, variation was found to be quite small both within and between animals.
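The sensitivity to the prior reported above can be illustrated with a sketch of the variance draw from the scaled inverted chi-square conditional, $(Q + vS^2)/\chi^2_{DF}$. Two points are assumptions of this example rather than statements from the study: DF is taken as the number of terms in the quadratic form plus v, and the scaling factor s is treated as the scale $S^2$. The function names and the numerical values are hypothetical.

```python
import numpy as np


def draw_variance(quadratic_form, n_terms, v, s2, rng, size=1):
    """Draw from (Q + v*S^2) / chi-square(DF), assuming DF = n_terms + v."""
    df = n_terms + v
    return (quadratic_form + v * s2) / rng.chisquare(df, size=size)


def conditional_mean(quadratic_form, n_terms, v, s2):
    """Mean of the conditional distribution above (requires DF > 2)."""
    return (quadratic_form + v * s2) / (n_terms + v - 2.0)


if __name__ == "__main__":
    # Hypothetical quadratic forms with Q / n = 0.2 for a small and a large sample
    for n_terms, q in ((4, 0.8), (400, 80.0)):
        for v in (1, 5):
            mean = conditional_mean(q, n_terms, v, s2=0.5)
            print(f"n={n_terms:3d} v={v} conditional mean={mean:.3f}")
```

With only 4 terms the prior contribution $vS^2$ is of the same order as Q, so the draw depends strongly on v and s, while with 400 terms the data dominate; this is consistent with the unstable estimates observed for n=4 repeated measurements.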

DISCUSSION

In the simulation study, predictions of observations were found to be reasonable for both the KF-RW models and the LOWESS approach. However, the KF-RW estimates were mostly better than the LOWESS estimates (Table 2). It is difficult to compare the two methodologies directly, since each has its own specifications, but the KF-RW model does not need any tuning parameters. However, predicting variance components needs additional computing power in KF-RW.

We also investigated the effect of different sample sizes on the state and variance component estimates from KF-RW (Table 3). Estimates of the states got closer to the observations as the number of observations and the number of subjects increased. It was also observed that increasing both quantities decreased the associated standard errors. Variance component estimates were found to be more accurate when the number of observations and the number of subjects increased. Again, increasing these quantities decreased the standard errors. Fifty subjects with 10 observations each started to give reasonable estimates.

Table 3a. Means of estimates of states and of observations (y). Standard errors are given in brackets

Number of   Number of Observations
Subjects    5                                  10                                 20                                  50
            State            y                 State            y                 State             y                 State             y
50          198.80 (0.04)    198.85 (0.05)     201.08 (0.02)    201.02 (0.02)     200.67 (0.02)     200.67 (0.01)     196.57 (0.009)    196.52 (0.001)
100         200.77 (0.02)    200.87 (0.03)     200.19 (0.01)    200.24 (0.01)     200.02 (0.008)    199.92 (0.009)    197.75 (0.005)    197.55 (0.005)
200         199.58 (0.01)    200.38 (0.02)     199.33 (0.006)   199.58 (0.07)     200.00 (0.004)    200.04 (0.004)    200.72 (0.002)    200.68 (0.002)
500         199.46 (0.004)   199.54 (0.005)    199.66 (0.003)   199.63 (0.002)    199.42 (0.001)    199.35 (0.002)    200.89 (0.0009)   200.98 (0.001)
1000        199.63 (0.002)   199.86 (0.002)    199.95 (0.001)   200.08 (0.001)    199.48 (0.0009)   199.53 (0.0007)   200.05 (0.0005)   200.04 (0.0005)

Table 3b. Means of predictions of variance components ($\sigma_e^2$, $\sigma_n^2$). Standard errors are given in brackets

Number of   Number of Observations
Subjects    5                                  10                                 20                                  50
            $\sigma_n^2$     $\sigma_e^2$      $\sigma_n^2$     $\sigma_e^2$      $\sigma_n^2$      $\sigma_e^2$      $\sigma_n^2$      $\sigma_e^2$
50          71.00 (0.13)     40.52 (0.12)      26.01 (0.03)     102.58 (0.07)     29.43 (0.02)      99.97 (0.01)      26.07 (0.004)     103.99 (0.01)
100         95.58 (0.10)     35.71 (0.11)      25.42 (0.01)     101.56 (0.04)     31.34 (0.01)      93.88 (0.02)      26.39 (0.003)     92.62 (0.007)
200         97.38 (0.05)     33.92 (0.05)      27.91 (0.01)     94.67 (0.02)      23.44 (0.003)     98.43 (0.009)     26.92 (0.001)     102.15 (0.003)
500         100.31 (0.02)    36.02 (0.02)      25.47 (0.003)    99.64 (0.009)     23.18 (0.001)     102.09 (0.001)    25.10 (0.0005)    99.32 (0.001)
1000        97.75 (0.007)    20.94 (0.01)      23.16 (0.001)    101.98 (0.004)    23.71 (0.0007)    100.93 (0.002)    24.47 (0.0003)    100.59 (0.0008)

$\sigma_e^2$: Estimates of error variance; $\sigma_n^2$: Estimates of states variance

In the real biological dataset, unlike the simulated dataset, the number of observations per animal was low (n=4), and this affected the prediction of variance components (Table 4). Different prior values gave very different variance component predictions, especially for smaller values of the scaling factor; however, the predictions stabilized as both the degree of belief and the scaling factor increased. Based on various runs we decided to use v=5, s=0.5 for genetic analyses of the BCS dataset.

Table 4. Estimates of state ($\sigma_n^2$) and error variances ($\sigma_e^2$) for different parameter values of scaled inverted chi-square prior distributions (v: degree of belief, s: scaling factor)

         v=0.01                        v=1                           v=2                           v=3                           v=4                           v=5
s        $\sigma_n^2$  $\sigma_e^2$    $\sigma_n^2$  $\sigma_e^2$    $\sigma_n^2$  $\sigma_e^2$    $\sigma_n^2$  $\sigma_e^2$    $\sigma_n^2$  $\sigma_e^2$    $\sigma_n^2$  $\sigma_e^2$
s=1      0.0005        0.18            0.02          0.18            0.04          0.19            0.04          0.20            0.05          0.21            0.06          0.21
s=0.1    0.00009       0.18            0.04          0.18            0.009         0.17            0.01          0.17            0.02          0.17            0.02          0.18
s=0.5    0.0003        0.18            0.01          0.18            0.03          0.19            0.03          0.19            0.04          0.19            0.04          0.19

Standard errors varied between 0.000001 and 0.14

We used 2000 Gibbs cycles with a burn-in period of 200, based on the results of different runs (results not shown). Since the dataset was obtained on a homogenized, controlled experimental farm, small variances were predicted both within and between animals. Visual inspection showed that the predictions of observations agreed with the actual observations.

We provided a general theoretical framework for the use of KF-RW in the analysis of animal breeding data, with an emphasis on the application of KF-RW in predicting breeding values for BCS measured as a longitudinal trait. Simulations were also performed to check the validity of the methods and to compare different sub-models. Although we used the random walk model, model choice depends on the variability of, and prior information about, the dataset; different models could be more realistic in different applications. Since the random walk model is not stationary, it may not be suitable for animal breeding data under certain conditions. We assumed constant variances for the state and error components over DIM; however, this assumption could be extended to models with time-dependent variance components, which could arguably give more realistic results. Since KF gives online estimation of breeding values and does not need to store or invert matrices, this methodology could be useful in the animal breeding industry for obtaining online estimation of breeding values over DIM.

Acknowledgements
The authors thank the anonymous reviewers for their useful comments. The authors thank all personnel at the research station of the Swiss Federal Institute of Technology in Chamau, Zug, Switzerland for their help in data collection over a number of years. The authors thank the Swiss Holstein Association, Patrick Rüttiman and Dr Silvia Wegmann for providing the BCS data.

REFERENCES
1. Grewal MS, Andrews AP: Kalman Filtering: Theory and Practice Using MATLAB. John Wiley & Sons, 2001.
2. Harrison PJ, Stevens C: A Bayesian approach to short term forecasting. Oper Res Quart, 22, 341-362, 1971.
3. Meinhold RJ, Singpurwalla ND: Understanding the Kalman Filter. Am Stat, 37, 123-127, 1983.
4. West M, Harrison PJ: Bayesian Forecasting and Dynamic Models. Springer, 1997.
5. Migon H, Gamerman D, Lopes H, Ferreira M: Dynamic models. In, Dey D, Rao C (Eds): Handbook of Statistics. p. 25, Elsevier, 2005.
6. Karacaören B: Phenotypic and genetic analysis of functional traits in dairy cattle under experimental management system. PhD Thesis. Swiss Federal Institute of Technology Zurich, 2006.
7. Van Bebber J, Reinsch N, Junge W, Kalm E: Monitoring daily milk yields with a recursive test day repeatability model (Kalman Filter). J Dairy Sci, 82, 2421-2429, 1999.
8. Kadarmideen HN, Wegmann S: Genetic parameters for body condition score and its relationship with type and production traits in Swiss Holsteins. J Dairy Sci, 86, 3685-3693, 2003.
9. Kadarmideen HN: Genetic correlations among body condition score, somatic cell count, production, reproduction, and conformation traits in Swiss Holsteins. Anim Sci, 79, 191-201, 2004.
10. Veerkamp RF, Brotherstone S: Genetic correlations between linear type traits, food intake, live weight and condition score in Holstein Friesian dairy cattle. Anim Sci, 64, 385-392, 1997.
11. Jones HE, White IMS, Brotherstone S: Genetic evaluation of Holstein Friesian sires for daughter condition-score changes using random regression model. Anim Sci, 68, 467-475, 1999.
12. Sallas WM, Harville DA: Best linear recursive estimation for mixed linear models. J Amer Statist Assoc, 76, 860-869, 1981.
13. Forni S, Gianola D, Rosa GJM, de los Campos G: A dynamic linear model for genetic analysis of longitudinal traits. J Anim Sci, 87, 3845-3853, 2009.
14. R Development Core Team: R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria, 2004.
15. Venables WN, Ripley BD: Modern Applied Statistics with S. Fourth ed., Springer, New York, 2002.
16. Trimberger GW: Dairy Cattle Judging Techniques. Prentice-Hall, Prospect Heights, Waveland Press, 1977.
