Results
Mahmoud Abou-Nasr
June 4th, 2008
The Challenge
Data samples from an automotive subsystem
were collected in batches of 500 samples per
diagnostic session.
The objective is a classifier that will determine
whether a certain symptom exists or does not
exist after examining the samples.
To this end, batches of 500 samples were
collected both when the symptom exists and
when it does not exist.
WCCI2008: Ford Classification Challenge
Data Set    Training Set    Validation Set    Testing Set
Ford_A      3271            330               1320
Ford_B      3306            330               810
Evaluation
Entries are evaluated on the accuracy of the
classifier. In case of a tie, the false positive
rate of the classifier is also used.
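The ranking rule can be sketched in a few lines of Python. This is an illustration only, not the organizers' scoring code: the `Entry` type and `rank_entries` function are hypothetical names, and the sketch assumes that a lower false positive rate wins a tie, which is consistent with the order of the two 83.2% entries in the Ford_B results.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    name: str
    accuracy: float             # percent
    false_positive_rate: float  # percent


def rank_entries(entries):
    """Sort by accuracy (descending), then false positive rate (ascending).

    Assumption: a lower false positive rate wins a tie on accuracy.
    """
    return sorted(entries, key=lambda e: (-e.accuracy, e.false_positive_rate))


# Three Ford_B entries from the results, deliberately listed out of order.
entries = [
    Entry("Lv Jun", 83.2, 27.4),
    Entry("Paavo Nieminen", 83.8, 25.9),
    Entry("Joerg D. Wichard", 83.2, 23.4),
]

ranked = rank_entries(entries)
for position, entry in enumerate(ranked, start=1):
    print(position, entry.name, entry.accuracy, entry.false_positive_rate)
```

Sorting on a tuple key keeps the tie-break explicit: the second element is consulted only when the accuracies are equal.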
Definitions
                     Predicted Negative    Predicted Positive
Negative Examples    a                     b
Positive Examples    c                     d

Accuracy = (a + d) / (a + b + c + d)    (1)
False positive rate = b / (a + b)       (2)
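Equations (1) and (2) can be checked with a short Python sketch. It assumes the usual confusion-matrix layout, which matches equation (2): a and b count the negative examples predicted negative and positive, and c and d count the positive examples predicted negative and positive. The function names and the counts in the example are made up for illustration and are not challenge data.

```python
def accuracy(a, b, c, d):
    """Equation (1): fraction of all examples that are classified correctly."""
    total = a + b + c + d
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (a + d) / total


def false_positive_rate(a, b):
    """Equation (2): fraction of negative examples predicted positive."""
    negatives = a + b
    if negatives == 0:
        raise ValueError("no negative examples")
    return b / negatives


# Hypothetical counts, chosen only to exercise the formulas.
a, b, c, d = 50, 10, 5, 35
print("accuracy:", accuracy(a, b, c, d))
print("false positive rate:", false_positive_rate(a, b))
```

Note that the false positive rate depends only on the negative examples, so two classifiers with equal accuracy can still differ on it.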
Ford_A Results
Rank    Name                   Accuracy %    False Positive %
1       Dyakonov Alexander     100           -
2       -                      99.6          -
3       Lv Jun                 97.8          1.8
4       Cristian Grozea        96.7          2.1
5       Teesid Korsrilabutr    96.5          2.9
6       Schuichi Kurogi        96.0          3.7
7       Paulo Adeodato         95.5          4.6
8       Joerg D. Wichard       95.4          5.0
9       Anthony Bagnall        94.9          3.5
10      Gavin Cawley           94.5          6.6
11      Paavo Nieminen         93.9          6.8
12      David Verstraeten      92.5          7.6
13      Bo Jin                 89.6          9.5
14      Dmitry Zhora           85.6          17.2
15      Ukil Abhisek           83.8          15
16      -                      81.6          27.3
17      Paul Chandra           80.7          39
18      Dongrui Wu             76.9          25.8
19      Dymitr Ruta            75.3          40.2
20      -                      65.9          24.2
Ford_B Results
Rank    Name                   Accuracy %    False Positive %
1       Gavin Cawley           86.2          12.7
2       Anthony Bagnall        84.3          23.7
3       Paavo Nieminen         83.8          25.9
4       Joerg D. Wichard       83.2          23.4
5       Lv Jun                 83.2          27.4
6       Cristian Grozea        82.8          27.2
7       David Verstraeten      82.5          24.9
8       Schuichi Kurogi        79.7          33.7
9       Paulo Adeodato         78.9          27.4
10      Dymitr Ruta            68.8          50.4
11      Dyakonov Alexander     68.1          35.9
12      Dongrui Wu             67.7          45.9
13      Bo Jin                 66.7          34.4
14      Dmitry Zhora           64.6          68.8
15      Ukil Abhisek           63.6          75.6
16      Paul Chandra           62.3          47.3
17      -                      61.9          75.8
18      -                      59.3          72.3
19      -                      54.9          49.1
20      Teesid Korsrilabutr    51.2          48.1
Rank    Name                   Accuracy %
1       Lv Jun                 92.2
2       Cristian Grozea        91.4
3       Gavin Cawley           91.3
4       Anthony Bagnall        90.9
5       Joerg D. Wichard       90.8
6       Paavo Nieminen         90.1
7       Schuichi Kurogi        89.8
8       Paulo Adeodato         89.2
9       David Verstraeten      88.7
10      Dyakonov Alexander     87.9
11      -                      84.3
12      Bo Jin                 80.9
13      Teesid Korsrilabutr    79.3
14      Dmitry Zhora           77.6
15      Ukil Abhisek           76.1
16      -                      74.1
17      Paul Chandra           74.0
18      Dongrui Wu             73.4
19      Dymitr Ruta            72.8
20      -                      61.7
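The slides do not say how the accuracies in this last table were computed. The figures appear consistent with an average of the Ford_A and Ford_B accuracies weighted by the testing set sizes (1320 and 810), and the Python sketch below checks that reading for three entries. The weighting scheme is an assumption inferred from the numbers, and `combined_accuracy` is a hypothetical helper name.

```python
# Testing set sizes as listed for the two data sets.
TEST_SIZES = {"Ford_A": 1320, "Ford_B": 810}


def combined_accuracy(acc_a, acc_b):
    """Average of two accuracies (percent), weighted by testing set size.

    Assumption: this weighting is inferred from the reported figures;
    it is not stated in the slides.
    """
    total = TEST_SIZES["Ford_A"] + TEST_SIZES["Ford_B"]
    return (acc_a * TEST_SIZES["Ford_A"] + acc_b * TEST_SIZES["Ford_B"]) / total


# (Ford_A accuracy, Ford_B accuracy, accuracy reported in the last table)
reported = {
    "Lv Jun": (97.8, 83.2, 92.2),
    "Gavin Cawley": (94.5, 86.2, 91.3),
    "Dyakonov Alexander": (100.0, 68.1, 87.9),
}

for name, (acc_a, acc_b, final) in reported.items():
    estimate = combined_accuracy(acc_a, acc_b)
    print(f"{name}: weighted {estimate:.2f}, reported {final}")
```

Because the per-data-set accuracies are themselves rounded to one decimal place, the weighted value can only be expected to match the reported figure to within about 0.1 percentage points.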
In Conclusion
This challenge problem was motivated by a potential
automotive application. Abstractly, this problem
amounts to classification of finite data sequences, in
contrast to the more commonly encountered problem of
classification based on feature vectors.
The length of the sequences reflects the time available
for making the classification decision. Presumably, the
task would be easier if the sequence length were
increased, but this would violate the requirements of the
application.
This problem does not appear to have a simple solution
that emerges from visual inspection of the sequences.
This distinguishes it from others in our experience
where, at least in some range of operation, examples
from opposite classes are readily differentiated.