
Learning From Observations

Marco Loog

ai in game programming

it university of copenhagen

Learning from Observations

Idea is that percepts should be used for improving the agent's ability to act in the future, not only for acting per se


Outline
Learning agents
Inductive learning
Decision tree learning


Learning
Learning is essential for unknown environments, i.e., when designer lacks omniscience
Learning is useful as a system construction method, i.e., expose the agent to reality rather than trying to write it down

Learning modifies the agent's decision mechanisms to improve performance



Learning Agent [Revisited]


Four conceptual components
Learning element : responsible for making improvements
Performance element : takes percepts and decides on actions
Critic : provides feedback on how the agent is doing and determines how the performance element should be modified
Problem generator : responsible for suggesting actions leading to new and informative experience


Figure 2.15 [Revisited]


Learning Element
Design of learning element is affected by
Which components of the performance element are to be learned
What feedback is available to learn these components
What representation is used for the components


Agent's Components
Direct mapping from conditions on current state to actions [instructor : brake!]
Means to infer relevant properties about world from percept sequence [learning from images]
Info about evolution of the world and results of possible actions [braking on wet road]
Utility indicating desirability of world state [no tip / component of utility function]
...
Each component can be learned from appropriate feedback

Types of Feedback
Supervised learning : correct answers for each example
Unsupervised learning : correct answers not given
Reinforcement learning : occasional rewards


Inductive Learning
Simplest form : learn a function from examples
I.e. learn the target function f
Examples : input / output pairs (x, f(x))


Inductive Learning
Problem
Find a hypothesis h, such that h ≈ f, based on given training set of examples

= highly simplified model of real learning


Ignores prior knowledge
Assumes examples are given


Hypothesis
A good hypothesis will generalize well, i.e., is able to predict well on unseen examples


Inductive Learning Method

E.g. function fitting


Goal is to estimate real underlying functional relationship from example observations
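A minimal sketch of what function fitting looks like in code (the data, the sinusoidal "true" function, and the use of numpy.polyfit are illustrative assumptions, not part of the slides):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 20)
f = np.sin(2 * np.pi * x)              # assumed "true" underlying function
y = f + rng.normal(0.0, 0.1, x.shape)  # example observations: f(x) plus noise

for degree in (1, 3, 9):
    coeffs = np.polyfit(x, y, degree)  # construct hypothesis h of given degree
    h = np.polyval(coeffs, x)
    training_error = np.mean((h - y) ** 2)
    print(f"degree {degree}: training error {training_error:.4f}")
```

Higher degrees fit the training points more closely, which is exactly the "which fit is best?" question raised below.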


Inductive Learning Method


Construct h to agree with f on training set




Inductive Learning Method


Construct h to agree with f on training set
h is consistent if it agrees with f on all examples
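As a small illustration (the helper name and toy examples are assumptions, not from the slides), checking consistency is just checking agreement on every training pair:

```python
# A sketch of the consistency test: h is consistent if it agrees with the
# observed value f(x) on every training example.
def is_consistent(h, examples):
    """examples : iterable of (x, f_x) pairs; h : candidate hypothesis."""
    return all(h(x) == f_x for x, f_x in examples)

examples = [(0, 0), (1, 1), (2, 4)]
print(is_consistent(lambda x: x * x, examples))  # True: agrees on all examples
print(is_consistent(lambda x: 2 * x, examples))  # False: disagrees at x = 1
```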




So, which Fit is Best?


Ockham's razor : prefer the simplest hypothesis consistent with the data
What's consistent? What's simple?


Hypothesis
A good hypothesis will generalize well, i.e., is able to predict well on unseen examples
Not-exactly-consistent may be preferable over exactly consistent
Nondeterministic behavior
Consistency is not even always possible

Nondeterministic functions : trade-off complexity of hypothesis / degree of fit


Decision Trees
Decision tree induction is one of the simplest, and yet most successful forms of learning algorithm
Good intro to the area of inductive learning


Decision Tree
Input : object or situation described by set of attributes / features
Output [discrete or continuous] : decision / prediction
Continuous -> regression
Discrete -> classification
Boolean classification : output is binary / true or false


Decision Tree
Performs a sequence of tests in order to reach a decision
Tree [as in : graph without closed loops]
Internal node : test of the value of a single property
Branches labeled with possible test outcomes
Leaf node : specifies output value

Resembles a "how to" manual
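A minimal sketch of such a tree in code (the nested-dict representation and the example tree are illustrative assumptions): an internal node stores the attribute it tests and one branch per outcome, a leaf is just the output value.

```python
def classify(tree, example):
    """Perform the sequence of tests: follow branches until a leaf is reached."""
    while isinstance(tree, dict):              # internal node
        value = example[tree["attribute"]]     # test the value of one property
        tree = tree["branches"][value]         # follow the matching branch
    return tree                                # leaf node: the decision

# A tiny hand-built tree (illustrative, not the slides' restaurant tree).
tree = {"attribute": "Patrons",
        "branches": {"None": "No",
                     "Some": "Yes",
                     "Full": {"attribute": "Hungry",
                              "branches": {"Yes": "Yes", "No": "No"}}}}

print(classify(tree, {"Patrons": "Full", "Hungry": "No"}))  # -> "No"
```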


Decide whether to wait for a Table at a Restaurant


Based on the following attributes
Alternate : is there an alternative restaurant nearby?
Bar : is there a comfortable bar area to wait in?
Fri/Sat : is today Friday or Saturday?
Hungry : are we hungry?
Patrons : number of people in the restaurant [None, Some, Full]
Price : price range [$, $$, $$$]
Raining : is it raining outside?
Reservation : have we made a reservation?
Type : kind of restaurant [French, Italian, Thai, Burger]
WaitEstimate : estimated waiting time [0-10, 10-30, 30-60, >60]
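One way to represent a single training example in code (a sketch; the attribute values shown are made up for illustration, not taken from the slides' table):

```python
# One hypothetical example: its attribute values plus the decision to be learned.
example = {
    "Alternate": "Yes", "Bar": "No", "Fri/Sat": "No", "Hungry": "Yes",
    "Patrons": "Some", "Price": "$$$", "Raining": "No", "Reservation": "Yes",
    "Type": "French", "WaitEstimate": "0-10",
}
will_wait = "Yes"   # the target output [Boolean classification]
```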

Attribute-Based Representations
Examples of decisions


Decision Tree
Possible representation for hypotheses
Below is the true tree [note Type? plays no role]


Expressiveness
Decision trees can express any function of the input attributes
E.g., for Boolean functions : truth table row -> path to leaf


Expressiveness
There is a consistent decision tree for any training set with one path to leaf for each example [unless f is nondeterministic in x], but it probably won't generalize to new examples
Prefer to find more compact decision trees [This is Ockham again...]


Attribute-Based Representations
Is simply a lookup table
Cannot generalize to unseen examples



Decision Tree
Applying Ockham's razor : smallest tree consistent with examples
Able to generalize to unseen examples
No need to program everything out / specify everything in detail

true tree = smallest tree?



Decision Tree Learning


Unfortunately, finding the smallest tree is intractable in general
New aim : find a smallish tree consistent with the training examples
Idea : [recursively] choose most significant attribute as root of [sub]tree
Most significant : making the most difference to the classification
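A rough sketch of that recursive idea (names and helper functions are assumptions, not the slides' pseudocode); `importance` stands for any measure of attribute significance, e.g. the information gain defined a few slides below:

```python
from collections import Counter

def plurality(examples):
    """Most common classification among (attributes, label) pairs."""
    return Counter(label for _, label in examples).most_common(1)[0][0]

def learn_tree(examples, attributes, importance, parent_examples=()):
    if not examples:                           # no examples left: fall back
        return plurality(parent_examples)
    labels = {label for _, label in examples}
    if len(labels) == 1:                       # all examples agree: leaf
        return labels.pop()
    if not attributes:                         # no tests left: majority leaf
        return plurality(examples)
    # Choose the most significant attribute as the root of this [sub]tree.
    best = max(attributes, key=lambda a: importance(a, examples))
    branches = {}
    for value in {x[best] for x, _ in examples}:
        subset = [(x, label) for x, label in examples if x[best] == value]
        remaining = [a for a in attributes if a != best]
        branches[value] = learn_tree(subset, remaining, importance, examples)
    return {"attribute": best, "branches": branches}
```

The returned nested dicts have the same shape as the classify sketch shown earlier.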

Choosing an Attribute Test


Idea : a good attribute splits the examples into subsets that are [ideally] all positive or all negative

Patrons? is a better choice



Using Information Theory


Information content [entropy] :
$I(P(v_1), \ldots, P(v_n)) = \sum_{i=1}^{n} -P(v_i)\log_2 P(v_i)$

For a training set containing p positive examples and n negative examples :

$I\!\left(\frac{p}{p+n}, \frac{n}{p+n}\right) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$

Specifies the minimum number of bits of information needed to encode the classification of an arbitrary member
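The entropy formula above translates directly into code (a sketch; the function names are assumptions):

```python
from math import log2

def entropy(probabilities):
    """I(P(v1), ..., P(vn)) = sum over i of -P(vi) * log2 P(vi); zero terms are skipped."""
    return sum(-p * log2(p) for p in probabilities if p > 0)

def boolean_entropy(p, n):
    """Entropy of a training set with p positive and n negative examples."""
    total = p + n
    return entropy((p / total, n / total))

print(boolean_entropy(6, 6))   # 1.0 bit for an evenly split training set
```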

Information Gain
Chosen attribute A divides training set E into subsets E1, ..., Ev according to their values for A, where A has v distinct values

$\mathrm{remainder}(A) = \sum_{i=1}^{v} \frac{p_i + n_i}{p+n}\, I\!\left(\frac{p_i}{p_i+n_i}, \frac{n_i}{p_i+n_i}\right)$

Information gain [IG] : expected reduction in entropy caused by partitioning the examples

$IG(A) = I\!\left(\frac{p}{p+n}, \frac{n}{p+n}\right) - \mathrm{remainder}(A)$
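In code, building on the boolean_entropy helper from the previous sketch (still an illustrative sketch, not a library API): each value of A contributes its subset's entropy, weighted by the fraction of examples it receives.

```python
def remainder(subsets):
    """subsets : list of (p_i, n_i) counts, one pair per value of attribute A."""
    p = sum(p_i for p_i, _ in subsets)
    n = sum(n_i for _, n_i in subsets)
    return sum((p_i + n_i) / (p + n) * boolean_entropy(p_i, n_i)
               for p_i, n_i in subsets)

def information_gain(subsets):
    """IG(A) = entropy before the split minus remainder(A)."""
    p = sum(p_i for p_i, _ in subsets)
    n = sum(n_i for _, n_i in subsets)
    return boolean_entropy(p, n) - remainder(subsets)
```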

Information Gain
Information gain [IG] : expected reduction in entropy caused by partitioning the examples
$IG(A) = I\!\left(\frac{p}{p+n}, \frac{n}{p+n}\right) - \mathrm{remainder}(A)$

Choose the attribute with the largest IG


[Wanna know more : Google it...]


Information Gain [E.g.]


For the training set : p = n = 6, so $I(6/12, 6/12) = 1$ bit
Consider Patrons? and Type? [and others]

$IG(\mathit{Patrons}) = 1 - \left[\tfrac{2}{12} I(0,1) + \tfrac{4}{12} I(1,0) + \tfrac{6}{12} I\!\left(\tfrac{2}{6},\tfrac{4}{6}\right)\right] \approx 0.541$ bits
$IG(\mathit{Type}) = 1 - \left[\tfrac{2}{12} I\!\left(\tfrac{1}{2},\tfrac{1}{2}\right) + \tfrac{2}{12} I\!\left(\tfrac{1}{2},\tfrac{1}{2}\right) + \tfrac{4}{12} I\!\left(\tfrac{2}{4},\tfrac{2}{4}\right) + \tfrac{4}{12} I\!\left(\tfrac{2}{4},\tfrac{2}{4}\right)\right] = 0$ bits

Patrons has the highest IG of all attributes and so is chosen as the root
Why is IG of Type? equal to zero?
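Using the hypothetical information_gain sketch from above, the counts in the slide's formulas reproduce these numbers (Patrons splits the 12 examples into subsets with (p, n) = (0, 2), (4, 0), (2, 4); Type into (1, 1), (1, 1), (2, 2), (2, 2)):

```python
print(information_gain([(0, 2), (4, 0), (2, 4)]))           # ~0.541 bits (Patrons)
print(information_gain([(1, 1), (1, 1), (2, 2), (2, 2)]))   # 0.0 bits (Type)
```

Every Type subset is still an even positive/negative mix, so the split leaves the entropy unchanged.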

Decision Tree Learning

Plenty of other measures for choosing the best attribute are possible...



Back to The Example...


Training data


Decision Tree Learned


Based on the 12 examples; substantially simpler solution than the true tree

A more complex hypothesis isn't justified by the small amount of data


Performance Measurement
How do we know that h ≈ f?
Or : how the h*ll do we know that our decision tree performs well?
Most often we don't know... for sure


Performance Measurement
However
Prediction quality can be estimated using theory from computational / statistical learning theory / PAC-learning
Or we could, for example, simply try h on a new test set of examples
The crux being, of course, that there should actually be a new test set...

If no test set is available, several possibilities exist for creating training and test sets from the available data
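One simple possibility is a random holdout split (a sketch; the function and parameter names are assumptions):

```python
import random

def holdout_split(examples, test_fraction=0.25, seed=0):
    """Randomly split labelled examples into a training set and a test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]       # (training set, test set)
```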

Performance Measurement
Learning curve : % correct on test set as function of training set size
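A sketch of how such a curve could be produced (learn and predict stand in for any learner, e.g. the decision tree sketches above; all names are assumptions):

```python
def learning_curve(train, test, learn, predict, steps=10):
    """Train on growing prefixes of the training data; record test-set accuracy."""
    points = []
    for k in range(1, steps + 1):
        subset = train[: max(1, len(train) * k // steps)]
        model = learn(subset)
        correct = sum(predict(model, x) == y for x, y in test)
        points.append((len(subset), correct / len(test)))
    return points   # list of (training set size, fraction correct on test set)
```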


Bad Conduct in AI

Training on the test set!


May happen before you know it
Often very hard to justify... if at all
All I can say is : try to avoid it


Ensemble-Learning-in-1-Slide
Idea : a collection [ensemble] of hypotheses is used / their predictions are combined
Motivation : the hope that the ensemble is much less likely to misclassify [obviously!]
E.g. independence can be exploited

Examples : majority voting / boosting
Ensemble learning simply creates a new, more expressive hypothesis space
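A one-function sketch of the majority-voting case (the toy hypotheses are made up for illustration):

```python
from collections import Counter

def majority_vote(hypotheses, x):
    """Combine an ensemble's predictions by taking the most common one."""
    votes = Counter(h(x) for h in hypotheses)
    return votes.most_common(1)[0][0]

# Three toy classifiers; two of the three vote True for x = 0.5.
hs = [lambda x: x > 0, lambda x: x > 1, lambda x: x > -1]
print(majority_vote(hs, 0.5))   # True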

Summary
In general : learning needed for unknown environments or lazy designers
Learning agent = performance element + learning element [Chapter 2]
Supervised learning : the aim is to find a simple hypothesis [approximately] consistent with training examples
Decision tree learning using IG
Difficult to measure learning performance
Learning curve

Next Week

More...

