Criminisi et al.
US 8,954,365 B2
Feb. 10, 2015
(54) … LEARNING
Cambridge (GB); … (US)
(51) Int. Cl.: G06F 17/00 (2006.01); G06K 9/62 (2006.01)
(52) U.S. Cl.: 706/45
(58) Field of Classification Search: 706/12, 45
FIG. 1: schematic of a system 100: unlabelled observations (e.g. images, text documents) with an unknown probability density function; a forest density 106 which is an estimate of the unknown probability density; a sampling engine 110 producing samples from the forest density; and a mapping engine 112 mapping unlabelled observations to a lower dimensional space 116.
FIG. 2: schematic 200 of images 202, 206, 208 grouped into clusters such as: people with beach scene; boat on beach and hut; beach scene; castle and landscape; landscape; beach scene and people; boat; cityscape and water; cityscape.
FIG. 3: schematic including a line 304 in a two-dimensional space.
FIG. 4: schematic of probability distributions associated with leaf nodes.
FIG. 5: schematic of a random decision forest 500 comprising trees (e.g. tree Ψ1, 504).
FIG. 6: flow diagram of a method of training a density forest: select number of trees in decision forest 602; select a tree from forest 604; go to root node of tree 606; select data elements from training set 608; generate random test parameters 610; apply test parameters to data elements at current node to obtain clusters 612; select parameters optimizing unsupervised entropy 614; if gain < threshold or depth = maximum value 616, set node as leaf node 618; otherwise set node as split node 620 and perform binary test on data elements at current node using parameters 622; for each child node, recursively execute blocks 610 to 622 for the subset of data elements directed to it 624; wait for all branches to complete recursion 626; compute partition function 630; if no more trees remain in the forest 632, terminate training 634.
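The training flow above can be sketched in code. This is a minimal illustrative sketch, not the patented implementation: the axis-aligned threshold tests, the Gaussian model stored at each leaf, and the helper names `unsupervised_gain` and `train_tree` are assumptions for illustration.

```python
import numpy as np

def unsupervised_gain(S, S_left, S_right):
    """Unsupervised information gain from the log-determinants of
    Gaussian covariances fitted to the parent and child point sets."""
    def logdet_cov(X):
        cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        return np.linalg.slogdet(cov)[1]
    gain = logdet_cov(S)
    for child in (S_left, S_right):
        gain -= (len(child) / len(S)) * logdet_cov(child)
    return gain

def train_tree(S, depth, max_depth, min_samples=10, n_candidates=50, rng=None):
    rng = rng or np.random.default_rng()
    # Stopping: depth at threshold or too few samples -> leaf node
    # storing a Gaussian fit of the points that reached it.
    if depth >= max_depth or len(S) < min_samples:
        return {"leaf": True, "mean": S.mean(0),
                "cov": np.cov(S, rowvar=False), "n": len(S)}
    # Generate random test parameters (dimension, threshold) and keep
    # the candidate that optimizes the unsupervised gain.
    best = None
    for _ in range(n_candidates):
        dim = rng.integers(S.shape[1])
        thr = rng.choice(S[:, dim])
        left, right = S[S[:, dim] < thr], S[S[:, dim] >= thr]
        if len(left) < 2 or len(right) < 2:
            continue
        g = unsupervised_gain(S, left, right)
        if best is None or g > best[0]:
            best = (g, dim, thr, left, right)
    if best is None:
        return {"leaf": True, "mean": S.mean(0),
                "cov": np.cov(S, rowvar=False), "n": len(S)}
    g, dim, thr, left, right = best
    # Split node: apply the binary test, then recurse on each subset.
    return {"leaf": False, "dim": dim, "thr": thr, "n": len(S),
            "left": train_tree(left, depth + 1, max_depth, min_samples, n_candidates, rng),
            "right": train_tree(right, depth + 1, max_depth, min_samples, n_candidates, rng)}
```

The recursion mirrors blocks 610 to 622: each subset of data elements directed to a child is processed with the same candidate-test loop until a stopping criterion sets a leaf.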
FIG. 7: flow diagram of a method of obtaining a forest density: obtain, for each tree in the forest, the tree density 700; aggregate the tree densities 702.
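The aggregation step 702 is not spelled out in this excerpt; one common choice, assumed here for illustration, is to average the per-tree density estimates:

```python
def forest_density(trees, tree_density, v):
    """Aggregate per-tree densities by averaging (an assumed choice).

    trees        -- the trained trees of the forest
    tree_density -- callable returning the tree density p_t(v) for a tree
    v            -- the data point at which to evaluate the density
    """
    # If each tree density integrates to 1 (e.g. via its partition
    # function), their average also integrates to 1.
    return sum(tree_density(t, v) for t in trees) / len(trees)
```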
FIG. 8: flow diagram of a method of sampling from a density forest (steps 800-808, looping until a leaf is reached).
FIG. 9: flow diagram of another method of sampling from a density forest (steps 900, 902).
FIG. 10: flow diagram of sampling by tree traversal (steps 1000-1006): at each split node, select one of the two children with probability related to the number of training points that passed to it, looping until a leaf is reached.
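The traversal in FIG. 10 can be sketched as follows. The node layout (per-child training-point counts `n`, a Gaussian model at each leaf) is an assumption for illustration, not the patent's data structure:

```python
import numpy as np

def sample_from_tree(node, rng):
    """Draw one sample by root-to-leaf traversal of a density tree."""
    while not node["leaf"]:
        # Select one of the two children with probability related to
        # the number of training points that passed to it.
        n_left, n_right = node["left"]["n"], node["right"]["n"]
        p_left = n_left / (n_left + n_right)
        node = node["left"] if rng.random() < p_left else node["right"]
    # Leaf reached: draw from the density model stored at the leaf.
    return rng.multivariate_normal(node["mean"], node["cov"])
```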
FIG. 11: flow diagram of a method of mapping to a lower dimensional space (steps 1100-1106), using affinity matrices, distances and eigen-maps.
FIG. 12: exemplary computing-based device 1200 comprising a processor, a communication interface 1214 to a network 1202, memory 1212 holding an operating system 1204, mapping engine 1206, training engine 1208 and sampling engine 1210, a data store 1222, an input/output controller 1216, a display device 1218 and a user input device 1220.
BACKGROUND

…

… forest; FIG. 7 is a flow diagram of a method of obtaining a forest density; … FIG. 9 is a flow diagram of another method of sampling from a density forest; … learning.

DETAILED DESCRIPTION
… in the sense that there is no ground truth data available for the … the dotted and dashed lines without the use of any labeled …

A tree whose leaves are denoted by the dotted lines in FIG. 2 yields the partition {{a, b, c, d}, {e, f}, {g, …}}. The overlap …

… on the line 304 are close to geodesic distances along the black curve in the two-dimensional space 302. The density forest …

… associated with a leaf node, one leaf node from each of five leaves in a tree in this example. Other numbers of leaves may be used in a tree; five are shown in this example for clarity of description. The probability distributions are domain bounded in that each is restricted within a specified range of values of variables x1 and x2, so that the specified area of x1, x2 is covered and so that the volume under the surfaces of the probability distributions is normalized to 1.
FIG. 5 is a schematic diagram of a random decision forest … nodes. In FIG. 5 the internal nodes and the root nodes are …

A tree from the forest is selected 604 and the process moves 606 to the root node of that tree. Data elements are selected 608 from the training set. For example, where the training set comprises images, the data elements may be image elements … are applied 612 to the data elements. This acts to cluster the … of the previous question and the training data set, and this relationship is represented graphically as a path through the …

… than a threshold then the node may be set as a leaf node 618. If the depth of the tree is at a threshold then the node may also be set as a leaf node. If there are too few samples then the node may be set as a leaf node. Otherwise the node is set as a split node 620.

Once the current node is set as a split node, the split function and parameters selected at step 614 are applied to the training data elements to give a binary test 622. The process then recursively executes 624 blocks 610 to 622 for each subset of … When all branches of the tree have completed 626, the … each leaf. If there are more trees in the forest, the process …

… probability density function from which such data has been generated … amounts of training data are available. That is, labeled training …

… comprises obtaining 700, for each tree in the forest, the tree density and aggregating 702 the tree densities. Any suitable … clarity.
… being 1.

… with the leaves only. Thus the tree traversal step may be replaced by direct random selection of one of the leaves. The … be calculated as: the log of the determinant of the covariance matrix (represented by symbol Λ) of the set of training points that have reached the jth split node (represented by symbol S_j), minus the sum, over the children of the jth split node, of: the number of training data points that reach the child, divided by the number of training data points that reach the jth split node, multiplied by the log of the determinant of the covariance matrix of the training data points that reach that child. In symbols:

I_j = log|Λ(S_j)| − Σ_{i∈{L,R}} (|S_j^i| / |S_j|) log|Λ(S_j^i)|

As mentioned above, a partition function may be calculated to make sure that the integral of the tree density is 1. As each data point reaches exactly one leaf node, the following partition function may be used: …
… case, these binary signals are smoothed out by the many trees in the forest. The mapping engine 112 computes 1102 an affinity matrix for each tree in the density forest using the leaf node clusters. An affinity matrix is an array of numbers which … points pass down one edge to a child node and the rest of the training points pass down the other edge to a child node (in the case of a binary tree). The number of training points associated with an edge is the number which passed down that edge …
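A sketch of the per-tree affinity computation under the binary model (affinity 1 for points that fall in the same leaf cluster, 0 otherwise); the function names are hypothetical, and averaging over trees is an assumed aggregation:

```python
import numpy as np

def tree_affinity(leaf_ids):
    """Binary affinity matrix for one tree: W[i, j] = 1 iff data points
    i and j reach the same leaf cluster, else 0."""
    ids = np.asarray(leaf_ids)
    return (ids[:, None] == ids[None, :]).astype(float)

def forest_affinity(per_tree_leaf_ids):
    # Averaging the hard 0/1 per-tree affinities over many trees
    # smooths the binary signals into a graded similarity.
    mats = [tree_affinity(ids) for ids in per_tree_leaf_ids]
    return sum(mats) / len(mats)
```

For two points that share a leaf in half of the trees, the averaged affinity is 0.5 rather than a hard 0 or 1, which is the smoothing effect described above.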
… the mapping function and apply that to map the unlabeled observations. In an example, the unlabeled observations in the high dimensional space are magnetic resonance images of … hood size. When using the binary affinity model the point neighborhood remains defined automatically by the forest … elements:

W_ij^t = e^(−d^t(v_i, v_j))

… profile. Symbol t is the index of the tree in the forest, i and j are the indexes of the matrix, and v represents a data point.

… they are in the same leaf cluster and null affinity otherwise. As mentioned above, the affinity matrix of each tree in the … defined as: …

… herein. Despite the lack of labeled training data, the images are found to cluster together into three clusters, one for each of bicycles, aeroplanes and cars. In another example, news reports from the web saved as unstructured text files with no associated labels were used as training data. Each document … constructed as:

K_ii = Σ_j W_ij

… after the mapping, and is also less than k) are used to construct a k×d′ matrix E, with j indexing the eigenvectors. Mapping a …
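The degree matrix K_ii = Σ_j W_ij above feeds an eigen-map over the forest affinities. This sketch assumes the usual unnormalized graph Laplacian L = K − W; the patent's exact normalization is not recoverable from this excerpt:

```python
import numpy as np

def laplacian_eigenmap(W, d_prime):
    """Map k points to d' dimensions from a k x k affinity matrix W."""
    K = np.diag(W.sum(axis=1))      # degree matrix: K_ii = sum_j W_ij
    L = K - W                       # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue ~0); the next d'
    # eigenvectors form the columns of the k x d' embedding matrix E.
    E = vecs[:, 1:1 + d_prime]
    return E
```

A row of E gives the low-dimensional coordinates of the corresponding data point; points with high mutual affinity land close together in the embedded space.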
… that the method steps may be carried out in any suitable order, or simultaneously.

This acknowledges that software can be a valuable, separately tradeable commodity. It is intended to encompass software … desired functions.

Those skilled in the art will realize that storage devices … the local terminal and some at the remote computer (or computer network). Those skilled in the art will also realize that …

… those that solve any or all of the stated problems or those that have any or all of the stated benefits and advantages. It will …