
Ford Classification Challenge

Results
Mahmoud Abou-Nasr

WCCI2008: Ford Classification Challenge

June 4th, 2008

The Challenge
Data samples from an automotive subsystem
were collected in batches of 500 samples per
diagnostic session.
The objective is a classifier that determines,
after examining the samples of a session,
whether a certain symptom exists.
To this end, batches of 500 samples were
collected both when the symptom was present
and when it was absent.

About the Data


The 500 samples collected in each diagnostic
session represent a set of sequential values of
the measured variable, where sample n+1
occurs after sample n.
The beginning of the sampling process is not
aligned with any external circumstance or any
aspect of the observed pattern.


The Data Sets


Ford_A
Data samples of known and hidden
classification were collected in typical operating
conditions, with minimal noise contamination.
Ford_B
Data samples of known classification were
collected in typical operating conditions, while
data samples of hidden classification were
collected under noisy conditions.

Example of a Symptom Free Pattern


A Pattern that Exhibits the Symptom


What Was Provided to the Contestants?

Data set    Training Set    Validation Set    Testing Set
Ford_A      3271            330               1320
Ford_B      3306            330               810


Evaluation
Classification performance is evaluated by the
accuracy of the classifier. In case of a tie, the
false positive rate of the classifier is also used
to separate the tied entries.
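
The ranking rule above can be sketched as a two-key sort. This is only an illustration, not the organizers' scoring code. The slide does not state the tie-break direction; the sketch assumes the lower false positive rate wins, which matches the order of the two 83.2% entries in the Ford_B results. The `rank` helper is a hypothetical name.

```python
# (name, accuracy %, false positive %) copied from the Ford_B results.
entries = [
    ("Lv Jun", 83.2, 27.4),
    ("Cristian Grozea", 82.8, 27.2),
    ("Joerg D. Wichard", 83.2, 23.4),
]


def rank(entries):
    """Order entries by accuracy (descending), then by false positive
    rate (ascending). The ascending tie-break is an assumption."""
    return sorted(entries, key=lambda e: (-e[1], e[2]))


ranked = rank(entries)
for position, (name, acc, fpr) in enumerate(ranked, start=1):
    print(position, name, acc, fpr)
```

With these three rows, the two entries tied at 83.2% are separated by their false positive rates, and the 82.8% entry follows them.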


Definitions

                     Predicted Negative    Predicted Positive
Negative Examples    a                     b
Positive Examples    c                     d

Accuracy            = (a + d) / (a + b + c + d)    (1)
False positive rate = b / (a + b)                  (2)
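
As a minimal sketch, definitions (1) and (2) can be written as two small functions. It assumes the reading implied by the formulas: a and b count the negative examples predicted negative and positive, and c and d count the positive examples predicted negative and positive. The function names and the example counts are hypothetical and are not taken from the challenge results.

```python
def accuracy(a, b, c, d):
    """Definition (1): correctly classified examples over all examples."""
    total = a + b + c + d
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (a + d) / total


def false_positive_rate(a, b):
    """Definition (2): negative examples predicted positive, over all negatives."""
    negatives = a + b
    if negatives == 0:
        raise ValueError("no negative examples")
    return b / negatives


# Hypothetical counts for illustration only.
a, b, c, d = 90, 10, 5, 95
print(f"accuracy = {accuracy(a, b, c, d):.3f}")
print(f"false positive rate = {false_positive_rate(a, b):.3f}")
```

Note that the false positive rate depends only on the negative examples, so two classifiers with equal accuracy can still differ on it.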

Ford_A Results

Rank  Name                  Accuracy %  False Positive %
1     Dyakonov Alexander    100
2     Eva Alfaro Cid        99.6
3     Lv Jun                97.8        1.8
4     Cristian Grozea       96.7        2.1
5     Teesid Korsrilabutr   96.5        2.9
6     Schuichi Kurogi       96.0        3.7
7     Paulo Adeodato        95.5        4.6
8     Joerg D. Wichard      95.4        5.0
9     Anthony Bagnall       94.9        3.5
10    Gavin Cawley          94.5        6.6


Ford_A Results (continued)

Rank  Name                  Accuracy %  False Positive %
11    Paavo Nieminen        93.9        6.8
12    David Verstraeten     92.5        7.6
13    Bo Jin                89.6        9.5
14    Dmitry Zhora          85.6        17.2
15    Ukil Abhisek          83.8        15
16    Hugo Jair Escalante   81.6        27.3
17    Paul Chandra          80.7        39
18    Dongrui Wu            76.9        25.8
19    Dymitr Ruta           75.3        40.2
20    Cota Flores Suarez    65.9        24.2


Ford_B Results

Rank  Name                  Accuracy %  False Positive %
1     Gavin Cawley          86.2        12.7
2     Anthony Bagnall       84.3        23.7
3     Paavo Nieminen        83.8        25.9
4     Joerg D. Wichard      83.2        23.4
5     Lv Jun                83.2        27.4
6     Cristian Grozea       82.8        27.2
7     David Verstraeten     82.5        24.9
8     Schuichi Kurogi       79.7        33.7
9     Paulo Adeodato        78.9        27.4
10    Dymitr Ruta           68.8        50.4


Ford_B Results (continued)

Rank  Name                  Accuracy %  False Positive %
11    Dyakonov Alexander    68.1        35.9
12    Dongrui Wu            67.7        45.9
13    Bo Jin                66.7        34.4
14    Dmitry Zhora          64.6        68.8
15    Ukil Abhisek          63.6        75.6
16    Paul Chandra          62.3        47.3
17    Hugo Jair Escalante   61.9        75.8
18    Eva Alfaro Cid        59.3        72.3
19    Cota Flores Suarez    54.9        49.1
20    Teesid Korsrilabutr   51.2        48.1


Results for Both Data Sets: Ford_A and Ford_B

Rank  Name                  Accuracy %
1     Lv Jun                92.2
2     Cristian Grozea       91.4
3     Gavin Cawley          91.3
4     Anthony Bagnall       90.9
5     Joerg D. Wichard      90.8
6     Paavo Nieminen        90.1
7     Schuichi Kurogi       89.8
8     Paulo Adeodato        89.2
9     David Verstraeten     88.7
10    Dyakonov Alexander    87.9


Results for Both Data Sets: Ford_A and Ford_B (continued)

Rank  Name                  Accuracy %
11    Eva Alfaro Cid        84.3
12    Bo Jin                80.9
13    Teesid Korsrilabutr   79.3
14    Dmitry Zhora          77.6
15    Ukil Abhisek          76.1
16    Hugo Jair Escalante   74.1
17    Paul Chandra          74.0
18    Dongrui Wu            73.4
19    Dymitr Ruta           72.8
20    Cota Flores Suarez    61.7
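
The slides do not say how the combined accuracy was computed. One reading that appears consistent with the reported figures is an average of the Ford_A and Ford_B accuracies weighted by the testing set sizes (1320 and 810). The sketch below illustrates that assumption only; `pooled_accuracy` is a hypothetical helper, not the organizers' procedure.

```python
# Testing set sizes as given for Ford_A and Ford_B.
N_TEST_A = 1320
N_TEST_B = 810


def pooled_accuracy(acc_a, acc_b, n_a=N_TEST_A, n_b=N_TEST_B):
    """Average of two accuracy percentages, weighted by test-set size.
    Assumed formula; the source does not state it."""
    return (acc_a * n_a + acc_b * n_b) / (n_a + n_b)


# Ford_A and Ford_B accuracies taken from the per-data-set results.
print(round(pooled_accuracy(97.8, 83.2), 1))  # Lv Jun, reported combined: 92.2
print(round(pooled_accuracy(94.5, 86.2), 1))  # Gavin Cawley, reported combined: 91.3
```

Because the per-data-set accuracies are themselves rounded to one decimal, agreement with the reported combined values is only approximate evidence for this weighting.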


In Conclusion
This challenge problem was motivated by a potential
automotive application. Abstractly, this problem
amounts to classification of finite data sequences, in
contrast to the more commonly encountered problem of
classification based on feature vectors.
The length of the sequences reflects the time available
for making the classification decision. Presumably, the
task would be easier if the sequence length were
increased, but this would violate the requirements of the
application.
This problem does not appear to have a simple solution
that emerges from visual inspection of the sequences.
This distinguishes it from others in our experience
where, at least in some range of operation, examples
from opposite classes are readily differentiated.