OKKAMoids Part II
Delving Deeper: Models, Parameter Estimation, and ML in Practice
George Giannakopoulos
Material
Thanks to:
- Simon Colton [Colton, 2010]
- T. Palpanas
- Wikimedia Commons [wik, 2010]
- S. Theodoridis and K. Koutroumbas [Theodoridis and Koutroumbas, 2003]
Please cite http://www.iit.demokritos.gr/~ggianna if you reuse.
Learning from Experience
Various tasks:
- Differentiation based on input (classification, clustering, ...)
- Differentiation based on inductive bias (strategy)
Several ways to evaluate performance
Purpose
Part II: Delving Deeper
We will touch on the following subjects:
- How can I classify sequences?
- Aspects of modeling and parameter estimation
- Good practices for using machine learning techniques
- Matching problems to algorithms
- Available tools
Pattern Recognition
Pattern Recognition is the scientific discipline whose goal is the classification of objects into a number of categories or classes. [Theodoridis and Koutroumbas, 2003]
Sequence
Strings vs. time series
- abcdacde: a string, or symbol sequence
- (0.2, 0.4, 0.3, 0.1, 2.1): a time series
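The two kinds of sequence call for different notions of distance: edit (Levenshtein) distance suits symbol strings, while a pointwise metric such as Euclidean distance suits equal-length numeric series. A minimal sketch (the function names and comparison values are illustrative, not from the slides):

```python
def edit_distance(a, b):
    """Dynamic-programming Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def euclidean(x, y):
    """Pointwise Euclidean distance between two equal-length time series."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5

string_seq = "abcdacde"
time_series = (0.2, 0.4, 0.3, 0.1, 2.1)

print(edit_distance(string_seq, "abcdacd"))              # one deletion away
print(euclidean(time_series, (0.2, 0.4, 0.3, 0.1, 0.1)))
```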
Pattern determination / mining:
- Modeling
- Measuring the best model for an instance, implying the class
- Deterministic: Grammars and Automata [Cicchello and Kremer, 2003]
- Probabilistic: State Machines and Bayesian Networks [Heckerman, 2008]
- Constraint Satisfaction (e.g. N-gram Graphs [Giannakopoulos, 2009])
- Observations, i.e. what we see: the output
- Optionally, (hidden) states (also termed labels), i.e. what causes the observations
- Parameters, i.e. the details of the cause that determines the output, or of the model that explains it
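These three ingredients can be made concrete with a toy hidden-state model. A minimal sketch for a hypothetical weather HMM (the states, observations, and probability values below are invented for illustration):

```python
hidden_states = ["Rainy", "Sunny"]          # what causes the observations
observations = ["walk", "shop", "clean"]    # what we actually see

parameters = {
    # P(first state)
    "initial": {"Rainy": 0.6, "Sunny": 0.4},
    # P(next state | current state)
    "transition": {
        "Rainy": {"Rainy": 0.7, "Sunny": 0.3},
        "Sunny": {"Rainy": 0.4, "Sunny": 0.6},
    },
    # P(observation | state)
    "emission": {
        "Rainy": {"walk": 0.1, "shop": 0.4, "clean": 0.5},
        "Sunny": {"walk": 0.6, "shop": 0.3, "clean": 0.1},
    },
}

# Sanity check: every probability table sums to 1.
assert abs(sum(parameters["initial"].values()) - 1.0) < 1e-9
for table in ("transition", "emission"):
    for state in hidden_states:
        assert abs(sum(parameters[table][state].values()) - 1.0) < 1e-9
```

Learning such a model amounts to estimating the entries of `parameters` from observation sequences; the hidden states are never observed directly.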
- Discriminative model: conditions on the observations, without modeling how they are generated
- Generative model: models how the observations themselves are generated
- Discriminative models: no assumptions about the observation distribution (e.g. Conditional Random Fields; see [Wallach, 2004] for an introduction)
- Generative models: some assumptions about the observations (e.g. Hidden Markov Models; see [Rabiner, 1989] for an introductory tutorial)
Which is better? See [Long and Servedio, 2006], [Jebara and Meila, 2006].
P(GRASS WET | SPRINKLER, RAIN):

SPRINKLER  RAIN  | GRASS WET = T  GRASS WET = F
F          F     | 0.0            1.0
F          T     | 0.8            0.2
T          F     | 0.9            0.1
T          T     | 0.99           0.01
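The wet-grass table above can be used for inference by enumeration, e.g. to ask how likely rain is once we see the grass is wet. A minimal sketch; note that the priors P(Rain) and P(Sprinkler | Rain) below are assumed values for illustration, not given on the slide:

```python
p_rain = {True: 0.2, False: 0.8}                  # assumed prior P(Rain)
p_sprinkler = {True: {True: 0.01, False: 0.99},   # assumed P(Sprinkler | Rain),
               False: {True: 0.4, False: 0.6}}    # keyed by the rain value
p_wet = {  # P(GrassWet = True | Sprinkler, Rain), from the table above
    (False, False): 0.0, (False, True): 0.8,
    (True, False): 0.9, (True, True): 0.99,
}

def p_rain_given_wet():
    """P(Rain = True | GrassWet = True), summing out the sprinkler."""
    num = 0.0   # joint probability mass with Rain = True
    den = 0.0   # total probability mass with GrassWet = True
    for rain in (True, False):
        for sprinkler in (True, False):
            joint = (p_rain[rain] * p_sprinkler[rain][sprinkler]
                     * p_wet[(sprinkler, rain)])
            den += joint
            if rain:
                num += joint
    return num / den

print(round(p_rain_given_wet(), 4))
```

Seeing wet grass raises the probability of rain well above the assumed prior, even though the sprinkler provides an alternative explanation.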
The Question
- Assumptions (independence, underlying probability distributions, etc.)
- A priori knowledge (previous studies, expert knowledge, etc.)
- Parametric approaches: a fixed set of unknown parameters (e.g. power-law parameters)
- Non-parametric approaches: the set of parameters is determined by the data during learning (e.g. histogram)
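The contrast can be shown on the same sample: a Gaussian fit is summarized by two numbers regardless of sample size, while a histogram's description grows with the chosen resolution. A minimal sketch on synthetic data (the data and bin count are illustrative):

```python
import random

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(1000)]

# Parametric: the model is fully summarized by a fixed set of
# parameters (mu, sigma), no matter how much data we see.
mu = sum(data) / len(data)
sigma = (sum((x - mu) ** 2 for x in data) / len(data)) ** 0.5

# Non-parametric: a histogram density estimate; its "parameters"
# (the bin counts) are determined by the data and the resolution.
def histogram_density(data, n_bins=20):
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for x in data:
        idx = min(int((x - lo) / width), n_bins - 1)  # clamp the max value
        counts[idx] += 1
    # Normalize so the bars integrate to 1.
    return [c / (len(data) * width) for c in counts]

density = histogram_density(data)
```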
Example
For a Gaussian distribution, find the best parameters (μ, σ) that describe the following values:
0.7168090 0.6515225 0.6213850 -0.6626706 -1.1918936 0.7711588 -3.1388009 0.2561228 1.1569174 0.6771980
Best parameters
Mean: -0.01422516, St. Dev.: 1.311977
Usually, we have to search in the parameter space.
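For the Gaussian, no search is needed: the estimates have a closed form. A quick sketch recomputing them from the sample; note that the maximum-likelihood variance divides by n, while the reported standard deviation corresponds to the unbiased n-1 (sample) estimator:

```python
values = [0.7168090, 0.6515225, 0.6213850, -0.6626706, -1.1918936,
          0.7711588, -3.1388009, 0.2561228, 1.1569174, 0.6771980]

n = len(values)
mean = sum(values) / n                      # closed-form estimate of mu
ss = sum((x - mean) ** 2 for x in values)   # sum of squared deviations

sd_mle = (ss / n) ** 0.5          # maximum-likelihood estimate of sigma
sd_sample = (ss / (n - 1)) ** 0.5 # unbiased sample estimate of sigma

print(round(mean, 8), round(sd_sample, 6))
```

For most other models no such closed form exists, which is why the parameter space must be searched.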
Situation → suggested search method:
- Small search space → Brute-force methods
- High-speed approximation → Greedy techniques
- No local maxima → Gradient descent
- Small plateau → Simulated annealing
- Little known → Evolutionary (genetic) algorithms
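One row of this mapping, gradient descent, can be sketched in a few lines. A minimal illustration on a toy objective with a single optimum (the function, learning rate, and step count are my own choices):

```python
def gradient_descent(grad, theta0, lr=0.1, steps=200):
    """Repeatedly step against the gradient of the objective."""
    theta = theta0
    for _ in range(steps):
        theta -= lr * grad(theta)
    return theta

# Minimize f(theta) = (theta - 3)^2, whose only optimum is theta = 3;
# its derivative is 2 * (theta - 3).
grad_f = lambda theta: 2 * (theta - 3)
theta_hat = gradient_descent(grad_f, theta0=0.0)
print(round(theta_hat, 4))
```

With no local maxima in the way, the iterates converge to the global optimum; on multimodal objectives the same procedure can get stuck, which is where simulated annealing or evolutionary search become attractive.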
ML tools
- WEKA: many algorithms
- HMM: JaHMM, Hidden Markov Model Toolbox
- CRF: CRF for Java
- SVM: LibSVM, SVMlight
- Time series: Gnu Regression, Econometrics and Time-series Library (Gretl), Rapid-I (formerly YALE)
- Constraint-based: JINSECT
- MLOSS
Questions
- What data do I have/need?
- What is the data type: sequence or feature vector?
- What do I want to learn?
- What do I know beforehand?
Considerations
- What are the features? What do they mean?
- Is there an obvious connection to the class?
- Feature vector: what does every dimension represent, in simple words?
- Can I describe in a sentence what my instance is?
Recapitulation
Yes, we are done! If you need more, see the references in Section 6
Thank you!
Please check the feedback form (http://tinyurl.com/ycommj3) to help me improve.
References
(2010). Wikimedia Commons.
Cicchello, O. and Kremer, S. C. (2003). Inducing grammars from sparse data sets: a survey of algorithms and results. Journal of Machine Learning Research, 4:603-632.
Colton, S. (March 30, 2010). Artificial Intelligence course v231.
Giannakopoulos, G. (2009). Automatic Summarization from Multiple Documents. PhD thesis, Department of Information and Communication Systems Engineering, University of the Aegean, Samos, Greece. http://www.iit.demokritos.gr/~ggianna/thesis.pdf
Heckerman, D. (2008). A tutorial on learning with Bayesian networks.