Introduction
Every person walks in a unique manner, and this uniqueness is what we want to quantify in this project. Nowadays, all smartphones have a range of built-in sensors such as accelerometers, gyroscopes and GPS. We are going to build on the data generated by these sensors to solve the task at hand: can you detect which person is walking around with a phone, based on how the phone behaves in their trouser pocket? For this to be possible, it is necessary to apply machine learning techniques to detect a combination of characteristics that is unique to each person.
Your task is to classify 10 unlabeled walks based on a training set of 58 walks. In total, 30 persons participated in the data collection, but in this project you only have to consider three labels: Wannes, Leander and Other.
Collecting Data
Data is collected using an Android app that samples the accelerometer, gyroscope and magnetometer at 50 Hz (the .json files). Acceleration is measured along the three axes of the phone's coordinate system, with the gravitational component removed. This means, however, that the data depend on how you put the phone in your pocket (e.g., screen facing in- or outward), which is not an ideal representation. The data are therefore preprocessed to express acceleration along the X, Y and Z axes of the geographical coordinate system (the .csv files).
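This rotation from the phone frame to the geographical frame can be expressed with the quaternion delivered by Android's rotation-vector sensor. A minimal pure-Python sketch, assuming unit quaternions in (w, x, y, z) order (the exact conventions of the provided preprocessing are not specified here):

```python
import math

def rotate_by_quaternion(v, q):
    """Rotate vector v = (x, y, z) by unit quaternion q = (w, x, y, z).

    Computes q * v * conj(q), i.e. the phone-frame vector expressed in the
    world (geographical) frame.
    """
    w, qx, qy, qz = q
    vx, vy, vz = v
    # quaternion product t = q * (0, v)
    tw = -qx * vx - qy * vy - qz * vz
    tx = w * vx + qy * vz - qz * vy
    ty = w * vy + qz * vx - qx * vz
    tz = w * vz + qx * vy - qy * vx
    # multiply by the conjugate (w, -qx, -qy, -qz); the scalar part vanishes
    rx = tw * -qx + tx * w + ty * -qz - tz * -qy
    ry = tw * -qy + ty * w + tz * -qx - tx * -qz
    rz = tw * -qz + tz * w + tx * -qy - ty * -qx
    return (rx, ry, rz)

# sanity check: a 90-degree rotation about z maps the x-axis onto the y-axis
q = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
rotated = rotate_by_quaternion((1.0, 0.0, 0.0), q)
```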
[Figures: raw accelerometer traces for different phone placements (top-up and top-down, screen inside), with and without rotation correction; the correction is based on Android's ROTATION_VECTOR sensor, http://developer.android.com/reference/android/hardware/SensorManager.html]

In order to use raw accelerometer data with classification techniques like Naive Bayes, SVMs or decision trees, it is often necessary to first extract features from the data (see [1, 2, 3, 4] for inspiration). Useful features are, for example, simple statistics like mean, variance or kurtosis, the average peak distance or peak height, peaks in the frequency domain, the energy in a wavelet component, an HMM likelihood, and so on.

[Figures: example feature views of a walking signal, including FFT(z); wavelet decompositions DWT(z) and WPD(z) using haar, daubechies{2,3,4,5,6,7,8,9,10,20} and biorthogonal{11,13,15,22,24,26,28,31,33,35,37,39,44,55,68} wavelets; peak statistics (peak mean, peak distance); Hidden Markov Models; and derived signals (gravity, the signal without gravity, the delta rotation of the phone, and the quaternions as given)]
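Several of these features take only a few lines to compute. A minimal pure-Python sketch (the 50 Hz sampling rate matches the data description, but the naive peak definition and the synthetic test signal are illustrative assumptions, not a required approach):

```python
import math

def window_features(signal, fs=50.0):
    """Compute a few simple features for one window of accelerometer samples.

    Returns the mean, the standard deviation, and the average distance (in
    seconds) between successive local maxima above the mean, a rough proxy
    for the step period.
    """
    n = len(signal)
    mean = sum(signal) / n
    var = sum((s - mean) ** 2 for s in signal) / n
    std = math.sqrt(var)
    # local maxima above the mean, as crude step candidates
    peaks = [i for i in range(1, n - 1)
             if signal[i] > mean and signal[i - 1] < signal[i] >= signal[i + 1]]
    if len(peaks) < 2:
        peak_dist = float("nan")
    else:
        gaps = [b - a for a, b in zip(peaks, peaks[1:])]
        peak_dist = (sum(gaps) / len(gaps)) / fs
    return {"mean": mean, "std": std, "peak_dist": peak_dist}

# synthetic 2 Hz "step" signal sampled at 50 Hz: peaks 0.5 s apart
sig = [math.sin(2 * math.pi * 2.0 * t / 50.0) for t in range(250)]
feats = window_features(sig, fs=50.0)
```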
A file is not guaranteed to contain pure walking data: putting the phone in a pocket, turning in the hallway or taking the phone out of a pocket are also part of the recordings. It will therefore be necessary to devise a method for extracting windows of clean walking data. To achieve this you can, for example, train a separate classifier that recognizes walking in general and use this classifier to pre-process the data.
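One possible shape for such a pre-processing step: slide a window over the signal and keep only windows that look like walking. The variance threshold below is a deliberately crude stand-in for the separate walking classifier suggested above, and the window size and synthetic signal are illustrative assumptions:

```python
def sliding_windows(samples, size=128, step=64):
    """Yield overlapping windows of `size` samples, advancing by `step`."""
    for start in range(0, len(samples) - size + 1, step):
        yield samples[start:start + size]

def is_walking(window, min_var=0.5):
    """Crude stand-in for a real walking classifier: keep windows whose
    variance exceeds a threshold (an idle phone gives a near-constant signal).
    """
    n = len(window)
    mean = sum(window) / n
    var = sum((s - mean) ** 2 for s in window) / n
    return var >= min_var

# 5 s of idle signal followed by 5 s of "walking" (alternating +/-2), at 50 Hz
samples = [0.0] * 250 + [2.0 if i % 2 else -2.0 for i in range(250)]
clean = [w for w in sliding_windows(samples, 128, 64) if is_walking(w)]
```

A real implementation would apply the same windowing, but replace `is_walking` with the trained walking-vs-other classifier.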
Additionally, you get a test data set in which the walks are not labeled. Your task is to predict the labels for these walks. Extra points will be awarded to the teams with the highest scores (divided by the number of teams that achieve this score).
Tasks
3.1 Form Groups
By October 24th
You can work on this project in groups of two and divide the work. Form the groups as quickly as possible and email the names of the members of your group to davide.nitti@cs.kuleuven.be. Whether you work on this assignment alone or in a team, do not share your experimental results or implementations with your fellow students.
3.2 Literature Study
Familiarize yourself with the concept of machine learning for accelerometer data and activity recognition. Get a feeling for what has been done before in this context, and what the open questions are. Divide the work between the two people in the team.
3.3 First Report
By November 7th
In the first phase of the project, explain in a brief report (about two pages) how you will address these tasks. Describe what literature you have read and what you have learned from it. Give examples of questions that you want to find answers to, and give an idea of how you intend to find the answers (What data representation, features and machine learning techniques will you use? How will you evaluate your classifier?). Please send your report in PDF format to the aforementioned email address. The quality of this report influences your final score for the project, so do your best to come up with a good plan. After the reports are handed in, you will receive feedback and, where necessary, more concrete guidelines on how to proceed.
3.4 Machine Learning
Use at least two different machine learning techniques to classify the data. Make sure that you
use a correct experimental setup to assess the accuracy of the classifier (e.g., cross-validation,
test/train split).
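As an illustration of a correct setup, the sketch below runs k-fold cross-validation by hand, so every walk is tested exactly once by a model that never saw it during training. The nearest-centroid classifier and the synthetic two-class data are placeholders for your own models and features:

```python
import random

def nearest_centroid_predict(train, x):
    """Toy classifier: assign x to the label whose training centroid
    (per-feature mean) is closest in squared Euclidean distance."""
    by_label = {}
    for xi, yi in train:
        by_label.setdefault(yi, []).append(xi)
    best, best_d = None, float("inf")
    for label, pts in by_label.items():
        centroid = [sum(col) / len(pts) for col in zip(*pts)]
        d = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if d < best_d:
            best, best_d = label, d
    return best

def cross_validate(data, k=5, seed=0):
    """k-fold cross-validation over (features, label) pairs; returns accuracy."""
    data = data[:]
    random.Random(seed).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    correct = total = 0
    for i in range(k):
        # train only on the other folds, test on fold i
        train = [xy for j, f in enumerate(folds) if j != i for xy in f]
        for x, y in folds[i]:
            correct += nearest_centroid_predict(train, x) == y
            total += 1
    return correct / total

# two well-separated synthetic classes of "feature vectors"
rng = random.Random(42)
data = ([([rng.random(), 0.0], "Wannes") for _ in range(20)]
        + [([5.0 + rng.random(), 5.0], "Leander") for _ in range(20)])
acc = cross_validate(data, k=5)
```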
3.5 Analyze results
What is the accuracy you can achieve when applying cross-validation on the training data? Do
you understand why your parameters work? What is the computational cost of pre-processing,
learning and classifying? What is your prediction for the 10 unlabeled walks?
3.6
By December 12th
The default setting is to use the classifier you have found to be the best. Make sure that:
- your code is self-contained and includes all dependencies;
- your code is print-friendly (e.g., max 80 columns);
- it is easy to give as input a path to a new file and let your software print the classification (this will be used to test your software on an unseen walk).
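A possible command-line shape for the last requirement; the CSV layout and the constant "Other" answer are placeholder assumptions, to be replaced by your trained classifier:

```python
import sys

def classify_rows(rows):
    """Placeholder for the trained classifier: `rows` is a list of
    [t, x, y, z] sample strings from one walk. A real implementation would
    extract features and run the best model; this stub always says Other."""
    if not rows:
        raise ValueError("empty walk file")
    return "Other"

def classify_file(path):
    """Read one walk file (CSV, one sample per line) and classify it as
    Wannes, Leander or Other."""
    with open(path) as f:
        rows = [line.strip().split(",") for line in f if line.strip()]
    return classify_rows(rows)

if __name__ == "__main__" and len(sys.argv) > 1:
    # usage: python classify.py path/to/new_walk.csv
    print(classify_file(sys.argv[1]))
```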
3.7 Peer assessment
Send by email a peer assessment of your partner's efforts, on a scale from 0 to 4, where 0 means "I did all the work", 2 means "my partner and I put in about the same effort", and 4 means "my partner did all the work". Add a short motivation to clarify your score. This information is used only by the professor and his assistants and is not communicated further.
3.8 Discussion
By December 19-20
There will be an oral discussion of your project report in week 13, where you will also get the chance to demo your prototype.
Time
The time allocated for completing this project is 60 hours per person. This does not include studying the Machine Learning course material, but it does include the literature study specific to this project's topic.
Questions
Please direct any questions that you may have about the project to the Toledo forum.
Good luck!
References
[1] Mark V. Albert, Konrad Kording, Megan Herrmann, and Arun Jayaraman. Fall Classification by Machine Learning Using Mobile Phones. In: PLoS ONE 7.5 (2012).
[2] Jonathan Lester, Tanzeem Choudhury, Nicky Kern, Gaetano Borriello, and Blake Hannaford. A Hybrid Discriminative/Generative Approach for Modeling Human Activities. In: Proceedings of IJCAI. 2005.
[3] Stephen J. Preece, John Yannis Goulermas, Laurence P.J. Kenney, and David Howard. A comparison of feature extraction methods for the classification of dynamic activities from accelerometer data. In: IEEE Transactions on Biomedical Engineering 56.3 (2009), pp. 871–879.
[4] Nishkam Ravi, Nikhil Dandekar, Preetham Mysore, and Michael L. Littman. Activity recognition from accelerometer data. In: Proceedings of AAAI. 2005.