Integrated Learners - MLR Tutorial

20/07/2017 Integrated Learners - mlr tutorial
Integrated Learners
This page lists the learning methods already integrated in mlr (http://www.rdocumentation.org/packages/mlr/).
Columns Num., Fac., Ord., NAs, and Weights indicate if a method can cope with numerical, factor, and ordered factor predictors, if it can deal with missing values in a
meaningful way (other than simply removing observations with missing values) and if observation weights are supported.
Column Props shows further properties of the learning methods specific to the type of learning task. See also RLearner
(http://www.rdocumentation.org/packages/mlr/functions/RLearner.html) for details.
Classification (84)
For classification the following additional learner properties are relevant and shown in column Props:
prob: The method can predict probabilities,

oneclass, twoclass, multiclass: One-class, two-class (binary) or multi-class classification problems can be handled,
class.weights: Class weights can be handled.
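The property names above work as filters: a learner suits a task only if its Props entry lists every property the task needs. The Python sketch below illustrates that subset test on a few rows copied from the Props column of the classification table. The names `LEARNER_PROPS` and `learners_with` are hypothetical and not part of mlr; in R, mlr's own `listLearners` function is the tool for this kind of query.

```python
# Props copied from the classification table; only a few rows are included.
LEARNER_PROPS = {
    "classif.ada": {"prob", "twoclass"},
    "classif.C50": {"prob", "twoclass", "multiclass"},
    "classif.knn": {"twoclass", "multiclass"},
    "classif.ksvm": {"prob", "twoclass", "multiclass", "class.weights"},
    "classif.rpart": {"prob", "twoclass", "multiclass", "featimp"},
}


def learners_with(required, table=LEARNER_PROPS):
    """Return the sorted names of learners whose props include every required one."""
    required = set(required)
    return sorted(name for name, props in table.items() if required <= props)


if __name__ == "__main__":
    # Learners that can predict probabilities on multi-class problems.
    print(learners_with(["prob", "multiclass"]))
    # Learners that accept class weights.
    print(learners_with(["class.weights"]))
```

An empty requirement list returns every learner in the dictionary, which matches the idea that each listed property only narrows the selection.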
Class / Short Name / Name Packages Num. Fac. Ord. NAs Weights Props Note
classif.ada ada (http://www.rdocumentation.org/packages/ada/) X X prob xval has been set to 0 by

ada twoclass default for speed.
ada Boosting
classif.bartMachine bartMachine X X X prob use_missing_data has been
bartmachine (http://www.rdocumentation.org/packages/bartMachine/) twoclass set to TRUE by default to allow
missing data support.
Bayesian Additive Regression
Trees
classif.bdk kohonen (http://www.rdocumentation.org/packages/kohonen/) X prob keep.data is set to FALSE to
bdk twoclass reduce memory requirements.
multiclass
Bi-Directional Kohonen map
classif.binomial stats (http://www.rdocumentation.org/packages/stats/) X X X prob Delegates to glm with freely
binomial twoclass choosable binomial link function
via learner parameter link .
Binomial Regression We set 'model' to FALSE by
default to save memory.
classif.blackboost mboost (http://www.rdocumentation.org/packages/mboost/) X X X X prob See ?ctree_control for
blackboost party (http://www.rdocumentation.org/packages/party/) twoclass possible breakage for nominal
features with missingness.
Gradient Boosting With family has been set to
Regression Trees Binomial by default. For
'family' 'AUC' and 'AdaExp'
probabilities cannot be
predicted.
classif.boosting adabag (http://www.rdocumentation.org/packages/adabag/) X X X prob xval has been set to 0 by
adabag rpart (http://www.rdocumentation.org/packages/rpart/) twoclass default for speed.
multiclass
Adabag Boosting featimp
classif.bst bst (http://www.rdocumentation.org/packages/bst/) X twoclass Renamed parameter learner
bst to Learner due to nameclash
with setHyperPars . Default
Gradient Boosting changes: Learner = "ls" ,
xval = 0 , and maxdepth =
1 .
classif.C50 C50 (http://www.rdocumentation.org/packages/C50/) X X X X prob

C50 twoclass
multiclass
C50
classif.cforest party (http://www.rdocumentation.org/packages/party/) X X X X X prob See ?ctree_control for
cforest twoclass possible breakage for nominal
multiclass features with missingness.
Random forest based on featimp
conditional inference trees
classif.clusterSVM SwarmSVM X twoclass centers set to 2 by default.
clusterSVM (http://www.rdocumentation.org/packages/SwarmSVM/)
LiblineaR
Clustered Support Vector (http://www.rdocumentation.org/packages/LiblineaR/)
Machines
https://mlr-org.github.io/mlr-tutorial/release/html/integrated_learners/index.html
classif.ctree party (http://www.rdocumentation.org/packages/party/) X X X X X prob See ?ctree_control for

ctree twoclass possible breakage for nominal
multiclass features with missingness.
Conditional Inference Trees
classif.cvglmnet glmnet (http://www.rdocumentation.org/packages/glmnet/) X X X prob The family parameter is set to
cvglmnet twoclass binomial for two-class
multiclass problems and to multinomial
GLM with Lasso or Elasticnet otherwise. Factors automatically
Regularization (Cross Validated get converted to dummy
Lambda) columns, ordered factors to
integer. glmnet uses a global
control object for its parameters.
mlr resets all control parameters
to their defaults before setting
the specified parameters and
after training. If you are setting
glmnet.control parameters
through glmnet.control, you
need to save and re-set them
after running the glmnet learner.
classif.dbnDNN deepnet (http://www.rdocumentation.org/packages/deepnet/) X prob output set to "softmax" by
dbn.dnn twoclass default.
multiclass
Deep neural network with weights
initialized by DBN
classif.dcSVM SwarmSVM X twoclass
dcSVM (http://www.rdocumentation.org/packages/SwarmSVM/)
Divided-Conquer Support Vector

Machines
classif.earth earth (http://www.rdocumentation.org/packages/earth/) X X X prob This learner performs flexible
fda twoclass discriminant analysis using the
multiclass earth algorithm. na.action is set
Flexible Discriminant Analysis to na.fail and only this is
supported.
classif.evtree evtree (http://www.rdocumentation.org/packages/evtree/) X X X X prob pmutatemajor ,
evtree twoclass pmutateminor ,
multiclass pcrossover , psplit , and
Evolutionary learning of globally pprune , are scaled internally
optimal trees to sum to 100.
classif.extraTrees extraTrees X X prob
extraTrees (http://www.rdocumentation.org/packages/extraTrees/) twoclass
multiclass
Extremely Randomized Trees
classif.featureless mlr (http://www.rdocumentation.org/packages/mlr/) X X X X prob
featureless twoclass
multiclass
Featureless classifier
classif.fnn FNN (http://www.rdocumentation.org/packages/FNN/) X twoclass
fnn multiclass
Fast k-Nearest Neighbour

classif.gamboost mboost (http://www.rdocumentation.org/packages/mboost/) X X X prob family has been set to
gamboost twoclass Binomial() by default. For 'family'
Gradient boosting with smooth 'AUC' and 'AdaExp' probabilities cannot be
components predicted.
classif.gaterSVM SwarmSVM X twoclass m set to 3 and max.iter
gaterSVM (http://www.rdocumentation.org/packages/SwarmSVM/) set to 1 by default.
e1071 (http://www.rdocumentation.org/packages/e1071/)
Mixture of SVMs with Neural
Network Gater Function
classif.gausspr kernlab (http://www.rdocumentation.org/packages/kernlab/) X X prob Kernel parameters have to be
gausspr twoclass passed directly and not by using
multiclass the kpar list in gausspr .
Gaussian Processes Note that fit has been set to
FALSE by default for speed.
classif.gbm gbm (http://www.rdocumentation.org/packages/gbm/) X X X X prob keep.data is set to FALSE to

gbm twoclass reduce memory requirements.
multiclass Note on param 'distribution':
Gradient Boosting Machine featimp gbm will select 'bernoulli' by
default for 2 classes, and
'multinomial' for multiclass
problems. The latter is the only
setting that works for > 2
classes.
classif.geoDA DiscriMiner X twoclass
geoda (http://www.rdocumentation.org/packages/DiscriMiner/) multiclass
Geometric Predictive
Discriminant Analysis
classif.glmboost mboost (http://www.rdocumentation.org/packages/mboost/) X X X prob family has been set to
glmboost twoclass Binomial by default. For 'family'
Boosting for GLMs 'AUC' and 'AdaExp' probabilities cannot be
predicted.
classif.glmnet glmnet (http://www.rdocumentation.org/packages/glmnet/) X X X prob The family parameter is set to
glmnet twoclass binomial for two-class
multiclass problems and to multinomial
GLM with Lasso or Elasticnet otherwise. Factors automatically
Regularization get converted to dummy
columns, ordered factors to
integer. Parameter s (value of
the regularization parameter
used for predictions) is set to
0.1 by default, but needs to
be tuned by the user. glmnet
uses a global control object for
its parameters. mlr resets all
control parameters to their
defaults before setting the
specified parameters and after
training. If you are setting glmnet.control parameters
through glmnet.control, you
need to save and re-set them
after running the glmnet learner.
classif.h2o.deeplearning h2o (http://www.rdocumentation.org/packages/h2o/) X X X prob
h2o.dl twoclass
multiclass
h2o.deeplearning
classif.h2o.gbm h2o (http://www.rdocumentation.org/packages/h2o/) X X prob 'distribution' is set automatically
h2o.gbm twoclass to 'gaussian'.
multiclass
h2o.gbm
classif.h2o.glm h2o (http://www.rdocumentation.org/packages/h2o/) X X X prob 'family' is always set to 'binomial'
h2o.glm twoclass to get a binary classifier.
h2o.glm
classif.h2o.randomForest h2o (http://www.rdocumentation.org/packages/h2o/) X X prob
h2o.rf twoclass
multiclass
h2o.randomForest
classif.hdrda sparsediscrim X prob
hdrda (http://www.rdocumentation.org/packages/sparsediscrim/) twoclass
High-Dimensional Regularized Discriminant Analysis
classif.IBk RWeka (http://www.rdocumentation.org/packages/RWeka/) X X prob
ibk twoclass
multiclass
k-Nearest Neighbours
classif.J48 RWeka (http://www.rdocumentation.org/packages/RWeka/) X X X prob NAs are directly passed to
j48 twoclass WEKA with na.action =
multiclass na.pass .
J48 Decision Trees
classif.JRip RWeka (http://www.rdocumentation.org/packages/RWeka/) X X X prob NAs are directly passed to
jrip twoclass WEKA with na.action = na.pass.
Propositional Rule Learner
classif.kknn kknn (http://www.rdocumentation.org/packages/kknn/) X X prob

kknn twoclass
multiclass
k-Nearest Neighbor
classif.knn class (http://www.rdocumentation.org/packages/class/) X twoclass
knn multiclass
k-Nearest Neighbor
classif.ksvm kernlab (http://www.rdocumentation.org/packages/kernlab/) X X prob Kernel parameters have to be
ksvm twoclass passed directly and not by using
multiclass the kpar list in ksvm . Note
Support Vector Machines class.weights that fit has been set to
FALSE by default for speed.
classif.lda MASS (http://www.rdocumentation.org/packages/MASS/) X X prob Learner parameter

lda twoclass predict.method maps to
multiclass method in predict.lda .
Linear Discriminant Analysis
classif.LiblineaRL1L2SVC LiblineaR X twoclass
liblinl1l2svc (http://www.rdocumentation.org/packages/LiblineaR/) multiclass
class.weights
L1-Regularized L2-Loss Support
Vector Classification
classif.LiblineaRL1LogReg LiblineaR X prob
liblinl1logreg (http://www.rdocumentation.org/packages/LiblineaR/) twoclass
multiclass
L1-Regularized Logistic class.weights
Regression
classif.LiblineaRL2L1SVC LiblineaR X twoclass
liblinl2l1svc (http://www.rdocumentation.org/packages/LiblineaR/) multiclass
class.weights
classif.LiblineaRL2LogReg LiblineaR X prob type = 0 (the default) is
liblinl2logreg (http://www.rdocumentation.org/packages/LiblineaR/) twoclass primal and type = 7 is dual
multiclass problem.
L2-Regularized Logistic class.weights
Regression
classif.LiblineaRL2SVC LiblineaR X twoclass type = 2 (the default) is
liblinl2svc (http://www.rdocumentation.org/packages/LiblineaR/) multiclass primal and type = 1 is dual
class.weights problem.
classif.LiblineaRMultiClassSVC LiblineaR X twoclass
liblinmulticlasssvc (http://www.rdocumentation.org/packages/LiblineaR/) multiclass
class.weights
Support Vector Classification by
Crammer and Singer
classif.linDA DiscriMiner X twoclass Set validation = NULL by
linda (http://www.rdocumentation.org/packages/DiscriMiner/) multiclass default to disable internal test
set validation.
Linear Discriminant Analysis
classif.logreg stats (http://www.rdocumentation.org/packages/stats/) X X X prob Delegates to glm with
logreg twoclass family = binomial(link =
'logit') . We set 'model' to
Logistic Regression FALSE by default to save
memory.
classif.lqa lqa (http://www.rdocumentation.org/packages/lqa/) X prob penalty has been set to
lqa twoclass "lasso" and lambda to
0.1 by default.
Fitting penalized Generalized
Linear Models with the LQA
algorithm
classif.lssvm kernlab (http://www.rdocumentation.org/packages/kernlab/) X X twoclass fitted has been set to
lssvm multiclass FALSE by default for speed.
Least Squares Support Vector

Machine
classif.lvq1 class (http://www.rdocumentation.org/packages/class/) X twoclass
lvq1 multiclass
Learning Vector Quantization
classif.mda mda (http://www.rdocumentation.org/packages/mda/) X X prob keep.fitted has been set to

mda twoclass FALSE by default for speed
multiclass and we use start.method =
Mixture Discriminant Analysis "lvq" for more robust behavior
/ less technical crashes.
classif.mlp RSNNS (http://www.rdocumentation.org/packages/RSNNS/) X prob
mlp twoclass
multiclass
Multi-Layer Perceptron
classif.multinom nnet (http://www.rdocumentation.org/packages/nnet/) X X X prob
multinom twoclass
multiclass
Multinomial Regression
classif.naiveBayes e1071 (http://www.rdocumentation.org/packages/e1071/) X X X prob
nbayes twoclass
multiclass
Naive Bayes
classif.neuralnet neuralnet X prob err.fct has been set to ce
neuralnet (http://www.rdocumentation.org/packages/neuralnet/) twoclass and linear.output to FALSE
to do classification.
Neural Network from neuralnet
classif.nnet nnet (http://www.rdocumentation.org/packages/nnet/) X X X prob size has been set to 3 by
nnet twoclass default.
multiclass
Neural Network
classif.nnTrain deepnet (http://www.rdocumentation.org/packages/deepnet/) X prob output set to softmax by
nn.train twoclass default.
multiclass max.number.of.layers can
Training Neural Network by be set to control and tune the
Backpropagation maximal number of layers
specified via hidden .
classif.nodeHarvest nodeHarvest X X prob
nodeHarvest (http://www.rdocumentation.org/packages/nodeHarvest/) twoclass
Node Harvest
classif.OneR RWeka (http://www.rdocumentation.org/packages/RWeka/) X X X prob NAs are directly passed to
oner twoclass WEKA with na.action = na.pass.
1-R Classifier
classif.pamr pamr (http://www.rdocumentation.org/packages/pamr/) X prob Threshold for prediction
pamr twoclass ( threshold.predict ) has
been set to 1 by default.
Nearest shrunken centroid
classif.PART RWeka (http://www.rdocumentation.org/packages/RWeka/) X X X prob NAs are directly passed to
part twoclass WEKA with na.action = na.pass.
PART Decision Lists
classif.penalized.fusedlasso penalized X X prob trace=FALSE was set by default
fusedlasso (http://www.rdocumentation.org/packages/penalized/) twoclass to disable logging output.
lambda1 and lambda2 have
Logistic Fused Lasso Regression been set to 1 by default, as
fusedlasso needs both
penalizations > 0.
classif.penalized.lasso penalized X X X prob trace=FALSE was set by default
lasso (http://www.rdocumentation.org/packages/penalized/) twoclass to disable logging output.
Logistic Lasso Regression

classif.penalized.ridge penalized X X X prob trace=FALSE was set by default
ridge (http://www.rdocumentation.org/packages/penalized/) twoclass to disable logging output.
Logistic Ridge Regression

classif.plr stepPlr (http://www.rdocumentation.org/packages/stepPlr/) X X X prob AIC and BIC penalty types can
plr twoclass be selected via the new
parameter cp.type .
Logistic Regression with a L2
Penalty
classif.plsdaCaret caret (http://www.rdocumentation.org/packages/caret/) X prob
plsdacaret twoclass
Partial Least Squares (PLS) Discriminant Analysis

classif.probit stats (http://www.rdocumentation.org/packages/stats/) X X X prob Delegates to glm with

probit twoclass family = binomial(link =
'probit') . We set 'model' to
Probit Regression FALSE by default to save
memory.
classif.qda MASS (http://www.rdocumentation.org/packages/MASS/) X X prob Learner parameter
qda twoclass predict.method maps to
multiclass method in predict.qda .
Quadratic Discriminant Analysis
classif.quaDA DiscriMiner X twoclass
quada (http://www.rdocumentation.org/packages/DiscriMiner/) multiclass
Quadratic Discriminant Analysis

classif.randomForest randomForest X X X prob Note that the rf can freeze the R
rf (http://www.rdocumentation.org/packages/randomForest/) twoclass process if trained on a task with
multiclass 1 feature which is constant. This
Random Forest class.weights can happen in feature forward
featimp selection, also due to
oobpreds resampling, and you need to
remove such features with
removeConstantFeatures.
classif.randomForestSRC randomForestSRC X X X X X prob na.action has been set to
rfsrc (http://www.rdocumentation.org/packages/randomForestSRC/) twoclass "na.impute" by default to
multiclass allow missing data support.
Random Forest featimp
oobpreds
classif.ranger ranger (http://www.rdocumentation.org/packages/ranger/) X X X X prob By default, internal
ranger twoclass parallelization is switched off
multiclass ( num.threads = 1 ),
Random Forests featimp verbose output is disabled,
oobpreds respect.unordered.factors
is set to TRUE . All settings are
changeable.
classif.rda klaR (http://www.rdocumentation.org/packages/klaR/) X X prob estimate.error has been
rda twoclass set to FALSE by default for
multiclass speed.
Regularized Discriminant
Analysis
classif.rFerns rFerns (http://www.rdocumentation.org/packages/rFerns/) X X X twoclass
rFerns multiclass
oobpreds
Random ferns
classif.rknn rknn (http://www.rdocumentation.org/packages/rknn/) X X twoclass k restricted to < 99 as the code
rknn multiclass allocates arrays of static size
Random k-Nearest-Neighbors
classif.rotationForest rotationForest X X X prob
rotationForest (http://www.rdocumentation.org/packages/rotationForest/) twoclass
Rotation Forest
classif.rpart rpart (http://www.rdocumentation.org/packages/rpart/) X X X X X prob xval has been set to 0 by
rpart twoclass default for speed.
multiclass
Decision Tree featimp
classif.RRF RRF (http://www.rdocumentation.org/packages/RRF/) X X prob
RRF twoclass
multiclass
Regularized Random Forests featimp
classif.rrlda rrlda (http://www.rdocumentation.org/packages/rrlda/) X twoclass
rrlda multiclass
Robust Regularized Linear Discriminant Analysis

classif.saeDNN deepnet (http://www.rdocumentation.org/packages/deepnet/) X prob output set to "softmax" by
sae.dnn twoclass default.
multiclass
Deep neural network with weights
initialized by Stacked
AutoEncoder
classif.sda sda (http://www.rdocumentation.org/packages/sda/) X prob
sda twoclass
multiclass
Shrinkage Discriminant Analysis
classif.sparseLDA sparseLDA X prob Arguments Q and stop are

sparseLDA (http://www.rdocumentation.org/packages/sparseLDA/) twoclass not yet provided as they depend
MASS (http://www.rdocumentation.org/packages/MASS/) multiclass on the task.
Sparse Discriminant Analysis elasticnet
(http://www.rdocumentation.org/packages/elasticnet/)
classif.svm e1071 (http://www.rdocumentation.org/packages/e1071/) X X prob
svm twoclass
multiclass
Support Vector Machines class.weights
(libsvm)
classif.xgboost xgboost (http://www.rdocumentation.org/packages/xgboost/) X X X prob All settings are passed directly,
xgboost twoclass rather than through xgboost 's
multiclass params argument. nrounds
eXtreme Gradient Boosting featimp has been set to 1 and
verbose to 0 by default.
num_class is set internally, so
do not set this manually.
classif.xyf kohonen (http://www.rdocumentation.org/packages/kohonen/) X prob
xyf twoclass
multiclass
X-Y fused self-organising maps
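The classif.evtree note in the table above says that the five operator probabilities are scaled internally to sum to 100. The Python sketch below shows one way such a rescaling could work, assuming plain proportional scaling; the source does not say how evtree actually performs it. `scale_to_100` is a hypothetical helper and the input values are made up for illustration.

```python
def scale_to_100(weights):
    """Proportionally rescale non-negative weights so that they sum to 100."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return {name: 100.0 * value / total for name, value in weights.items()}


if __name__ == "__main__":
    # Illustrative values only; these are not evtree's defaults.
    raw = {
        "pmutatemajor": 1,
        "pmutateminor": 1,
        "pcrossover": 1,
        "psplit": 1,
        "pprune": 1,
    }
    print(scale_to_100(raw))
```

Under this assumption only the ratios between the five values matter, so passing 1, 1, 1, 1, 1 or 20, 20, 20, 20, 20 gives the same result.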
Regression (64)
Additional learner properties:
se: Standard errors can be predicted.
regr.bartMachine bartMachine X X X use_missing_data has been set

bartmachine (http://www.rdocumentation.org/packages/bartMachine/) to TRUE by default to allow
missing data support.
Bayesian Additive Regression
Trees
regr.bcart tgp (http://www.rdocumentation.org/packages/tgp/) X X se
bcart
Bayesian CART
regr.bdk kohonen (http://www.rdocumentation.org/packages/kohonen/) X keep.data is set to FALSE to
bdk reduce memory requirements.
Bi-Directional Kohonen map

regr.bgp tgp (http://www.rdocumentation.org/packages/tgp/) X se
bgp
Bayesian Gaussian Process

regr.bgpllm tgp (http://www.rdocumentation.org/packages/tgp/) X se
bgpllm
Bayesian Gaussian Process with

jumps to the Limiting Linear
Model
regr.blackboost mboost (http://www.rdocumentation.org/packages/mboost/) X X X X See ?ctree_control for
blackboost party (http://www.rdocumentation.org/packages/party/) possible breakage for nominal
features with missingness.
Gradient Boosting with
Regression Trees
regr.blm tgp (http://www.rdocumentation.org/packages/tgp/) X se
blm
Bayesian Linear Model

regr.brnn brnn (http://www.rdocumentation.org/packages/brnn/) X X
brnn
Bayesian regularization for feed-

forward neural networks
regr.bst bst (http://www.rdocumentation.org/packages/bst/) X Renamed parameter learner to
bst Learner due to nameclash with
setHyperPars . Default changes:
Gradient Boosting Learner = "ls" , xval = 0 ,
and maxdepth = 1 .
regr.btgp tgp (http://www.rdocumentation.org/packages/tgp/) X X se

btgp
Bayesian Treed Gaussian

Process
regr.btgpllm tgp (http://www.rdocumentation.org/packages/tgp/) X X se
btgpllm
Bayesian Treed Gaussian

Process with jumps to the
Limiting Linear Model
regr.btlm tgp (http://www.rdocumentation.org/packages/tgp/) X X se
btlm
Bayesian Treed Linear Model

regr.cforest party (http://www.rdocumentation.org/packages/party/) X X X X X featimp See ?ctree_control for
cforest possible breakage for nominal
features with missingness.
Random Forest Based on Conditional Inference Trees
regr.crs crs (http://www.rdocumentation.org/packages/crs/) X X X se
crs
Regression Splines
regr.ctree party (http://www.rdocumentation.org/packages/party/) X X X X X See ?ctree_control for
ctree possible breakage for nominal
features with missingness.
Conditional Inference Trees
regr.cubist Cubist (http://www.rdocumentation.org/packages/Cubist/) X X X
cubist
Cubist
regr.cvglmnet glmnet (http://www.rdocumentation.org/packages/glmnet/) X X X Factors automatically get
cvglmnet converted to dummy columns,
ordered factors to integer. glmnet
GLM with Lasso or Elasticnet uses a global control object for its
Regularization (Cross Validated parameters. mlr resets all control
Lambda) parameters to their defaults before
setting the specified parameters
and after training. If you are setting
glmnet.control parameters through
glmnet.control, you need to save
and re-set them after running the
glmnet learner.
regr.earth earth (http://www.rdocumentation.org/packages/earth/) X X
earth
Multivariate Adaptive
Regression Splines
regr.elmNN elmNN (http://www.rdocumentation.org/packages/elmNN/) X nhid has been set to 1 and
elmNN actfun has been set to "sig"
by default.
Extreme Learning Machine for
Single Hidden Layer
Feedforward Neural Networks
regr.evtree evtree (http://www.rdocumentation.org/packages/evtree/) X X X X pmutatemajor ,
evtree pmutateminor , pcrossover ,
psplit , and pprune , are
Evolutionary learning of globally scaled internally to sum to 100.
optimal trees
regr.extraTrees extraTrees X X
extraTrees (http://www.rdocumentation.org/packages/extraTrees/)
Extremely Randomized Trees

regr.featureless mlr (http://www.rdocumentation.org/packages/mlr/) X X X X
featureless
Featureless regression
regr.fnn FNN (http://www.rdocumentation.org/packages/FNN/) X
fnn
Fast k-Nearest Neighbor
regr.frbs frbs (http://www.rdocumentation.org/packages/frbs/) X

frbs
Fuzzy Rule-based Systems

regr.gamboost mboost (http://www.rdocumentation.org/packages/mboost/) X X X
gamboost
Gradient Boosting with Smooth

Components
regr.gausspr kernlab (http://www.rdocumentation.org/packages/kernlab/) X X se Kernel parameters have to be
gausspr passed directly and not by using
the kpar list in gausspr . Note
Gaussian Processes that fit has been set to FALSE
by default for speed.
regr.gbm gbm (http://www.rdocumentation.org/packages/gbm/) X X X X featimp keep.data is set to FALSE to
gbm reduce memory requirements,
distribution has been set to
Gradient Boosting Machine "gaussian" by default.
regr.glm stats (http://www.rdocumentation.org/packages/stats/) X X X se 'family' must be a character and

glm every family has its own link, i.e.
family = 'gaussian', link.gaussian =
Generalized Linear Regression 'identity', which is also the default.
We set 'model' to FALSE by
default to save memory.
regr.glmboost mboost (http://www.rdocumentation.org/packages/mboost/) X X X
glmboost
Boosting for GLMs

regr.glmnet glmnet (http://www.rdocumentation.org/packages/glmnet/) X X X X Factors automatically get
glmnet converted to dummy columns,
ordered factors to integer.
GLM with Lasso or Elasticnet Parameter s (value of the
Regularization regularization parameter used for
predictions) is set to 0.1 by
default, but needs to be tuned by
the user. glmnet uses a global
control object for its parameters.
mlr resets all control parameters to
their defaults before setting the
specified parameters and after
training. If you are setting
glmnet.control parameters through
glmnet.control, you need to save
and re-set them after running the
glmnet learner.
regr.GPfit GPfit (http://www.rdocumentation.org/packages/GPfit/) X se (1) As the optimization routine
GPfit assumes that the inputs are scaled
to the unit hypercube [0,1]^d, the
Gaussian Process input gets scaled for each variable
by default. If this is not wanted,
scale = FALSE has to be set. (2)
We replace the GPfit parameter
'corr =
list(type='exponential',power=1.95)'
with the separate parameters 'type'
and 'power'. In the case of corr =
list(type='matern', nu = 0.5), the
separate parameters are 'type' and
'matern_nu_k=0', and nu is
computed by 'nu=
(2*matern_nu_k+1)/2=0.5'
regr.h2o.deeplearning h2o (http://www.rdocumentation.org/packages/h2o/) X X X
h2o.dl
h2o.deeplearning
regr.h2o.gbm h2o (http://www.rdocumentation.org/packages/h2o/) X X 'distribution' is set automatically to
h2o.gbm 'gaussian'.
h2o.gbm
regr.h2o.glm h2o (http://www.rdocumentation.org/packages/h2o/) X X X 'family' is always set to 'gaussian'.
h2o.glm
h2o.glm
regr.h2o.randomForest h2o (http://www.rdocumentation.org/packages/h2o/) X X

h2o.rf
h2o.randomForest
regr.IBk RWeka (http://www.rdocumentation.org/packages/RWeka/) X X
ibk
K-Nearest Neighbours
regr.kknn kknn (http://www.rdocumentation.org/packages/kknn/) X X
kknn
K-Nearest-Neighbor regression
regr.km DiceKriging X se In predict, we currently always use
km (http://www.rdocumentation.org/packages/DiceKriging/) type = "SK" . The extra
parameter jitter (default is
Kriging FALSE ) enables adding a very
small jitter (order 1e-12) to the x-
values before prediction, as
predict.km reproduces the
exact y-values of the training data
points, when you pass them in,
even if the nugget effect is turned
on. We further introduced
nugget.stability which sets
the nugget to
nugget.stability * var(y)
before each training to improve
numerical stability. We recommend
a setting of 10^-8
regr.ksvm kernlab (http://www.rdocumentation.org/packages/kernlab/) X X Kernel parameters have to be
ksvm passed directly and not by using
the kpar list in ksvm . Note that
Support Vector Machines fit has been set to FALSE by
default for speed.
regr.laGP laGP (http://www.rdocumentation.org/packages/laGP/) X se
laGP
Local Approximate Gaussian

Process
regr.LiblineaRL2L1SVR LiblineaR X Parameter svr_eps has been
liblinl2l1svr (http://www.rdocumentation.org/packages/LiblineaR/) set to 0.1 by default.

L2-Regularized L1-Loss Support
Vector Regression
regr.LiblineaRL2L2SVR LiblineaR X type = 11 (the default) is primal
liblinl2l2svr (http://www.rdocumentation.org/packages/LiblineaR/) and type = 12 is dual problem.
Parameter svr_eps has been
L2-Regularized L2-Loss Support set to 0.1 by default.
Vector Regression
regr.lm stats (http://www.rdocumentation.org/packages/stats/) X X X se
lm
Simple Linear Regression

regr.mars mda (http://www.rdocumentation.org/packages/mda/) X
mars
Multivariate Adaptive
Regression Splines
regr.mob party (http://www.rdocumentation.org/packages/party/) X X X
mob
Model-based Recursive
Partitioning Yielding a Tree with
Fitted Models Associated with
each Terminal Node
regr.nnet nnet (http://www.rdocumentation.org/packages/nnet/) X X X size has been set to 3 by
nnet default.
Neural Network
regr.nodeHarvest nodeHarvest X X
nodeHarvest (http://www.rdocumentation.org/packages/nodeHarvest/)
Node Harvest
regr.pcr pls (http://www.rdocumentation.org/packages/pls/) X X

pcr
Principal Component
Regression
regr.penalized.fusedlasso penalized X X trace=FALSE was set by default to
fusedlasso (http://www.rdocumentation.org/packages/penalized/) disable logging output. lambda1
and lambda2 have been set to 1
Fused Lasso Regression by default, as fusedlasso needs
both penalizations > 0.
regr.penalized.lasso penalized X X trace=FALSE was set by default to
lasso (http://www.rdocumentation.org/packages/penalized/) disable logging output.
Lasso Regression
regr.penalized.ridge penalized X X trace=FALSE was set by default to
ridge (http://www.rdocumentation.org/packages/penalized/) disable logging output.
Ridge Regression
regr.plsr pls (http://www.rdocumentation.org/packages/pls/) X X
plsr
Partial Least Squares

Regression
regr.randomForest randomForest X X X featimp See ?regr.randomForest for
rf (http://www.rdocumentation.org/packages/randomForest/) oobpreds information about se estimation.
se Note that the rf can freeze the R
Random Forest process if trained on a task with 1
feature which is constant. This can
happen in feature forward
selection, also due to resampling,
and you need to remove such
features with
removeConstantFeatures.
keep.inbag is NULL by default but
if predict.type = 'se' and se.method
= 'jackknife' (the default) then it is
automatically set to TRUE.
regr.randomForestSRC randomForestSRC X X X X X featimp na.action has been set to
rfsrc (http://www.rdocumentation.org/packages/randomForestSRC/) oobpreds "na.impute" by default to allow
missing data support.
Random Forest
regr.ranger ranger (http://www.rdocumentation.org/packages/ranger/) X X X featimp By default, internal parallelization
ranger oobpreds is switched off ( num.threads =
1 ), verbose output is disabled,
Random Forests respect.unordered.factors is
set to TRUE . All settings are
changeable.
regr.rknn rknn (http://www.rdocumentation.org/packages/rknn/) X X
rknn
Random k-Nearest-Neighbors
regr.rpart rpart (http://www.rdocumentation.org/packages/rpart/) X X X X X featimp xval has been set to 0 by
rpart default for speed.
Decision Tree
regr.RRF RRF (http://www.rdocumentation.org/packages/RRF/) X X X featimp
RRF
Regularized Random Forests

regr.rsm rsm (http://www.rdocumentation.org/packages/rsm/) X You select the order of the
rsm regression by using modelfun =
"FO" (first order), "TWI" (two-
Response Surface Regression way interactions, this is with 1st
order terms!) and "SO" (full
second order).
regr.rvm kernlab (http://www.rdocumentation.org/packages/kernlab/) X X Kernel parameters have to be
rvm passed directly and not by using
the kpar list in rvm . Note that
Relevance Vector Machine fit has been set to FALSE by
default for speed.
regr.slim flare (http://www.rdocumentation.org/packages/flare/) X lambda.idx has been set to 3

slim by default.
Sparse Linear Regression using

Nonsmooth Loss Functions and
L1 Regularization
regr.svm e1071 (http://www.rdocumentation.org/packages/e1071/) X X
svm
Support Vector Machines

(libsvm)
regr.xgboost xgboost (http://www.rdocumentation.org/packages/xgboost/) X X featimp All settings are passed directly,
xgboost rather than through xgboost 's
params argument. nrounds
eXtreme Gradient Boosting has been set to 1 and verbose
to 0 by default.
regr.xyf kohonen (http://www.rdocumentation.org/packages/kohonen/) X
xyf
X-Y fused self-organising maps
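Two notes in the regression table above contain small formulas: regr.GPfit computes nu as (2*matern_nu_k+1)/2, and regr.km sets the nugget to nugget.stability * var(y) before each training. The Python sketch below works through both. The function names are hypothetical and not part of mlr, and the sketch assumes that var(y) means the sample variance (n - 1 denominator), which is what R's `var` computes.

```python
from statistics import variance


def matern_nu(matern_nu_k):
    """nu = (2 * matern_nu_k + 1) / 2, as stated in the regr.GPfit note."""
    return (2 * matern_nu_k + 1) / 2


def stabilised_nugget(y, nugget_stability=1e-8):
    """nugget = nugget.stability * var(y), as stated in the regr.km note.

    statistics.variance is the sample variance (n - 1 denominator).
    The default of 1e-8 is the setting the note recommends.
    """
    return nugget_stability * variance(y)


if __name__ == "__main__":
    # matern_nu_k = 0 reproduces the nu = 0.5 case mentioned in the note.
    print(matern_nu(0))
    # Made-up response values, for illustration only.
    print(stabilised_nugget([2.0, 4.0, 6.0]))
```

Because the nugget is tied to var(y), it scales with the response, so the same nugget.stability setting behaves comparably across differently scaled targets.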
Survival analysis (15)
Additional learner properties:

prob: Probabilities can be predicted,

rcens, lcens, icens: The learner can handle right, left and/or interval censored data.
surv.cforest party (http://www.rdocumentation.org/packages/party/) X X X X X featimp See ?ctree_control for

crf survival (http://www.rdocumentation.org/packages/survival/) rcens possible breakage for nominal
features with missingness.
Random Forest based on conditional inference trees
surv.CoxBoost CoxBoost X X X X rcens Factors automatically get
coxboost (http://www.rdocumentation.org/packages/CoxBoost/) converted to dummy columns,
ordered factors to integer.
Cox Proportional Hazards Model
with Componentwise Likelihood
based Boosting
surv.coxph survival (http://www.rdocumentation.org/packages/survival/) X X X rcens
coxph
Cox Proportional Hazard Model

surv.cv.CoxBoost CoxBoost X X X rcens Factors automatically get
cv.CoxBoost (http://www.rdocumentation.org/packages/CoxBoost/) converted to dummy columns,
ordered factors to integer.
Cox Proportional Hazards Model
with Componentwise Likelihood
based Boosting, tuned for the
optimal number of boosting
steps
surv.cvglmnet glmnet (http://www.rdocumentation.org/packages/glmnet/) X X X X rcens Factors automatically get
cvglmnet converted to dummy columns,
ordered factors to integer.
GLM with Regularization (Cross
Validated Lambda)
surv.gamboost survival (http://www.rdocumentation.org/packages/survival/) X X X X rcens family has been set to
gamboost mboost (http://www.rdocumentation.org/packages/mboost/) CoxPH() by default.
Gradient boosting with smooth

components
surv.gbm gbm (http://www.rdocumentation.org/packages/gbm/) X X X X prob keep.data is set to FALSE to
gbm featimp reduce memory requirements.
rcens
Gradient Boosting Machine
surv.glmboost survival (http://www.rdocumentation.org/packages/survival/) X X X X rcens family has been set to
glmboost mboost (http://www.rdocumentation.org/packages/mboost/) CoxPH() by default.
Gradient Boosting with

Componentwise Linear Models
surv.glmnet glmnet (http://www.rdocumentation.org/packages/glmnet/) X X X X rcens Factors automatically get

glmnet converted to dummy columns,
GLM with Regularization Parameter s (value of the
regularization parameter used for
predictions) is set to 0.1 by
default, but needs to be tuned by
the user. glmnet uses a global
mlr resets all control parameters
to their defaults before setting the
through glmnet.control, you need
to save and re-set them after
running the glmnet learner.
surv.penalized.fusedlasso penalized X X X rcens trace=FALSE was set by default
fusedlasso (http://www.rdocumentation.org/packages/penalized/) to disable logging output.
lambda1 and lambda2 have been
Fused Lasso Regression set to 1 by default, as fusedlasso
needs both penalizations > 0.
surv.penalized.lasso penalized X X X rcens trace=FALSE was set by default
lasso (http://www.rdocumentation.org/packages/penalized/) to disable logging output.
LassoRegression
surv.penalized.ridge penalized X X X rcens trace=FALSE was set by default
ridge (http://www.rdocumentation.org/packages/penalized/) to disable logging output.
Ridge Regression
surv.randomForestSRC survival (http://www.rdocumentation.org/packages/survival/) X X X X X featimp na.action has been set to
rfsrc randomForestSRC oobpreds "na.impute" by default to
(http://www.rdocumentation.org/packages/randomForestSRC/) rcens allow missing data support.
Random Forest
surv.ranger ranger (http://www.rdocumentation.org/packages/ranger/) X X X featimp By default, internal parallelization
ranger rcens is switched off ( num.threads =
1 ), verbose output is
Random Forests disabled,
respect.unordered.factors
is set to TRUE . All settings are
changeable.
surv.rpart rpart (http://www.rdocumentation.org/packages/rpart/) X X X X X featimp xval has been set to 0 by
rpart rcens default for speed.
Survival Tree
Cluster analysis (9)

prob: Probabilities can be predicted.
cluster.cmeans / cmeans / Fuzzy C-Means Clustering
e1071 (http://www.rdocumentation.org/packages/e1071/), clue (http://www.rdocumentation.org/packages/clue/) X prob
Note: The predict method uses cl_predict from the clue package to compute the cluster memberships for new data. The default centers = 2 is added so the method runs without setting parameters, but in practice the user should set it.

cluster.Cobweb / cobweb / Cobweb Clustering Algorithm
RWeka (http://www.rdocumentation.org/packages/RWeka/) X

cluster.dbscan / dbscan / DBScan Clustering
fpc (http://www.rdocumentation.org/packages/fpc/) X
Note: A cluster index of NA indicates noise points. Specify method = "dist" if the data should be interpreted as a dissimilarity matrix or object. Otherwise Euclidean distances will be used.

cluster.EM / em / Expectation-Maximization Clustering
RWeka (http://www.rdocumentation.org/packages/RWeka/) X

cluster.FarthestFirst / farthestfirst / FarthestFirst Clustering Algorithm
RWeka (http://www.rdocumentation.org/packages/RWeka/) X

cluster.kkmeans / kkmeans / Kernel K-Means
kernlab (http://www.rdocumentation.org/packages/kernlab/) X
Note: centers has been set to 2L by default. The nearest center in kernel distance determines cluster assignment of new data points. Kernel parameters have to be passed directly and not by using the kpar list in kkmeans.

cluster.kmeans / kmeans / K-Means
stats (http://www.rdocumentation.org/packages/stats/), clue (http://www.rdocumentation.org/packages/clue/) X prob
Note: The predict method uses cl_predict from the clue package to compute the cluster memberships for new data. The default centers = 2 is added so the method runs without setting parameters, but in practice the user should set it.

cluster.SimpleKMeans / simplekmeans / K-Means Clustering
RWeka (http://www.rdocumentation.org/packages/RWeka/) X

cluster.XMeans / xmeans / XMeans (k-means with automatic determination of k)
RWeka (http://www.rdocumentation.org/packages/RWeka/) X
Note: You may have to install the XMeans Weka package: WPM('install-package', 'XMeans').
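The notes for cluster.cmeans and cluster.kmeans say that the default centers = 2 exists only so the learner runs out of the box. A minimal sketch of overriding it, assuming the mlr and clue packages are installed; the choice of the iris measurements and of three centers is an illustrative assumption, not part of this page.

```r
library(mlr)

# Cluster task on the four numeric iris columns (illustrative data choice).
task = makeClusterTask(data = iris[, 1:4])

# Override the placeholder default centers = 2.
lrn = makeLearner("cluster.kmeans", centers = 3)

set.seed(1)
mod = train(lrn, task)

# Cluster assignment of new observations is computed via clue::cl_predict.
pred = predict(mod, newdata = iris[1:5, 1:4])
print(getPredictionResponse(pred))
```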
Cost-sensitive classification
For ordinary misclassification costs you can use all the standard classification methods listed above.
For example-dependent costs there are several ways to generate cost-sensitive learners from ordinary regression and classification learners. See section cost-sensitive
classification (../cost_sensitive_classif/index.html) and the documentation of makeCostSensClassifWrapper
(http://www.rdocumentation.org/packages/mlr/functions/makeCostSensClassifWrapper.html), makeCostSensRegrWrapper
(http://www.rdocumentation.org/packages/mlr/functions/makeCostSensRegrWrapper.html) and makeCostSensWeightedPairsWrapper
(http://www.rdocumentation.org/packages/mlr/functions/makeCostSensWeightedPairsWrapper.html) for details.
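As a rough illustration of the example-dependent case, the sketch below wraps an ordinary classification learner with makeCostSensWeightedPairsWrapper. It assumes the mlr and nnet packages are installed. The cost matrix is random and purely illustrative: one row per observation, one column per class, with zero cost for the true class.

```r
library(mlr)

set.seed(1)
df = iris

# Illustrative random costs: 150 observations x 3 classes,
# zeroed in the column that matches each observation's true class.
cost = matrix(runif(150 * 3, 0, 2000), 150) * (1 - diag(3))[df$Species, ]
df$Species = NULL

# The cost matrix is passed as the third argument.
costsens.task = makeCostSensTask("iris", df, cost)

# Turn an ordinary classifier into a cost-sensitive learner.
lrn = makeLearner("classif.multinom", trace = FALSE)
lrn = makeCostSensWeightedPairsWrapper(lrn)

mod = train(lrn, costsens.task)
pred = predict(mod, task = costsens.task)
print(head(as.data.frame(pred)))
```

makeCostSensClassifWrapper and makeCostSensRegrWrapper are applied the same way, the latter to a regression learner.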
Multilabel classification (3)

multilabel.cforest / cforest / Random forest based on conditional inference trees
party (http://www.rdocumentation.org/packages/party/) X X X X X prob

multilabel.randomForestSRC / rfsrc / Random Forest
randomForestSRC (http://www.rdocumentation.org/packages/randomForestSRC/) X X X X prob
Note: na.action has been set to na.impute by default to allow missing data support.

multilabel.rFerns / rFerns / Random ferns
rFerns (http://www.rdocumentation.org/packages/rFerns/) X X X
Moreover, you can use the binary relevance method to apply ordinary classification learners to the multilabel problem. See the documentation of function
makeMultilabelBinaryRelevanceWrapper (http://www.rdocumentation.org/packages/mlr/functions/makeMultilabelBinaryRelevanceWrapper.html) and the tutorial section on
multilabel classification (../multilabel/index.html) for details.
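A minimal sketch of the binary relevance approach, assuming the mlr and rpart packages are installed. It uses yeast.task, the example multilabel task shipped with mlr. The choice of classif.rpart as base learner and the 500-row training subset are illustrative assumptions that keep the run short.

```r
library(mlr)

# Ordinary binary classifier used as base learner (illustrative choice).
lrn = makeLearner("classif.rpart", predict.type = "prob")

# Binary relevance: one independent binary model is fitted per label.
lrn.br = makeMultilabelBinaryRelevanceWrapper(lrn)

# Train on a subset of the example task to keep the run short.
mod = train(lrn.br, yeast.task, subset = 1:500)

# Predict on held-out rows of the same task.
pred = predict(mod, task = yeast.task, subset = 501:600)
print(head(as.data.frame(pred)[, 1:4]))
```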
Documentation built with MkDocs (http://www.mkdocs.org/).
