This is an update of the second article - A Second Look at MLR. While the hyper- (or tuning) parameter was either non-existent or given in the previous article, here it is estimated - specifically, the cost of constraint violation (C) of a support vector machine is estimated.
The credit scoring example in Chapter 4 of Applied Predictive Modeling is reimplemented using the mlr package. Details of the German Credit Data used here can be found here.
The topics in bold below are mainly covered.
Imputation, Processing …
The following packages are used.
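A minimal sketch of the packages the article relies on - caret for the data and splitting, mlr for modelling, and kernlab for sigest(). The use of the GermanCredit data shipped with caret is an assumption based on the original example.

```r
library(caret)    # createDataPartition(), data(GermanCredit)
library(mlr)      # tasks, learners, tuning, benchmarking
library(kernlab)  # sigest() for the kernel width estimate

# the German Credit Data as bundled with caret (assumed source)
data(GermanCredit, package = "caret")
```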
mlr has different methods of preprocessing and splitting data from caret's. For a comparison that may be necessary in the future, these steps are performed in the same way as in caret.
80% of the data is taken as the training set.
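Since the splitting is done the caret way, this could look as follows - the seed value and object names are illustrative assumptions.

```r
set.seed(100)  # seed value is an assumption
# stratified 80/20 split on the target, as caret's createDataPartition() does
inTrain <- createDataPartition(GermanCredit$Class, p = 0.8, list = FALSE)
training <- GermanCredit[inTrain, ]
testing  <- GermanCredit[-inTrain, ]
```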
The task is set up using the training data and normalized as in the original example.
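A sketch of this step, assuming the training data frame is named `training` with target column `Class`:

```r
# classification task on the training data
task <- makeClassifTask(id = "GermanCredit", data = training, target = "Class")
# standardize the features, mirroring the original example's preprocessing
task <- normalizeFeatures(task, method = "standardize")
```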
The following two learners are set up for the benchmark: a support vector machine and logistic regression. Note that the development version (v2.3) is necessary to fit logistic regression - see this article for installation information.
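The two learners could be constructed as below; `classif.ksvm` wraps kernlab's SVM, and `classif.logreg` is assumed to be the logistic regression learner available from v2.3.

```r
lrn.svm <- makeLearner("classif.ksvm", predict.type = "prob")
lrn.lr  <- makeLearner("classif.logreg", predict.type = "prob")
```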
Repeated cross-validation is chosen, as in the original example.
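In mlr this is a resample description; the fold and repetition counts below follow the original example's 5 x 10-fold setup and are an assumption.

```r
# 10-fold cross-validation, repeated 5 times
rdesc <- makeResampleDesc("RepCV", folds = 10, reps = 5)
```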
As in the original example, sigma (the inverse kernel width) is estimated first using sigest() in the kernlab package. Then a control grid is made by varying the values of C only.
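A sketch of the estimation step - sigest() returns the 0.1, 0.5 and 0.9 quantiles of a sigma estimate, and the middle value can be kept as the fixed sigma. The seed and `frac` choice are assumptions.

```r
set.seed(231)  # seed value is an assumption
# estimate sigma over the full training set
sigmaEst <- sigest(Class ~ ., data = training, frac = 1)
sigmaEst[2]  # median estimate, kept as the fixed sigma value
```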
In makeParamSet(), sigma and kernel are fixed as discrete parameters while C is varied from lower to upper on the scale determined by the trafo argument. For numeric and integer parameters, the increment can be adjusted via resolution. Note that the above setup can be relaxed, for example, by varying both C and sigma; in that case, it would be more flexible to set sigma as a numeric parameter.
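This could look as follows; the C bounds are an assumption mirroring the original example's grid of 2^-2 to 2^7.

```r
ps <- makeParamSet(
  makeDiscreteParam("sigma", values = sigmaEst[2]),   # fixed at the sigest() estimate
  makeDiscreteParam("kernel", values = "rbfdot"),     # fixed RBF kernel
  # C varied on the log2 scale: trafo maps [-2, 7] to 2^-2 .. 2^7
  makeNumericParam("C", lower = -2, upper = 7, trafo = function(x) 2^x)
)
ctrl <- makeTuneControlGrid(resolution = 10L)
```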
The resulting grid can be checked using generateGridDesign().
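For example, assuming the parameter set above is named `ps`:

```r
# show the candidate points on the transformed (2^x) scale
generateGridDesign(ps, resolution = 10L, trafo = TRUE)
```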
The parameter can be tuned using tuneParams() as shown below.
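A sketch of the tuning call, assuming the task, resample description, parameter set and grid control from the preceding steps are named `task`, `rdesc`, `ps` and `ctrl`; mlr defaults to mean misclassification error (mmce) for classification.

```r
set.seed(1056)  # seed value is an assumption
res <- tuneParams(lrn.svm, task = task, resampling = rdesc,
                  par.set = ps, control = ctrl)
res$x  # best hyperparameter values found
res$y  # corresponding mean misclassification error
```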
Fitting details can be checked as follows.
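For instance, the full grid-search trace is kept in the optimization path of the tune result (here assumed to be named `res`):

```r
# one row per evaluated candidate, with parameter values and performance
as.data.frame(res$opt.path)
```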
Once the hyper- or tuning parameter is determined, the learner can be updated using setHyperPars().
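Assuming the tune result is named `res`, the update is a one-liner:

```r
# fix the tuned values (sigma, kernel, C) on the SVM learner
lrn.svm.tuned <- setHyperPars(lrn.svm, par.vals = res$x)
```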
The tuned SVM learner can be benchmarked against the logistic regression learner. This shows only a marginal difference.
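The benchmark could be run as below, reusing the repeated cross-validation description from above (object names are illustrative):

```r
bmr <- benchmark(learners = list(lrn.svm.tuned, lrn.lr),
                 tasks = task, resamplings = rdesc)
bmr  # aggregated mmce per learner
```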
The tuning section of the mlr tutorial indicates that the above practice, in which optimization is undertaken over the same data used to tune the SVM parameter, might yield an optimistically biased estimate of the performance value. To handle this issue, nested resampling is necessary - a more detailed explanation of nested resampling can be found here. Moreover, this resampling strategy can be applied to feature selection - see the benchmark tutorial and this article. In this regard, the next article will likely cover nested resampling for model selection.
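As a preview, nested resampling in mlr wraps the tuning inside the learner, so the inner loop selects C while the outer loop measures performance on untouched folds. The fold counts below are illustrative assumptions.

```r
# inner resampling: used only to pick the hyperparameters
inner <- makeResampleDesc("CV", iters = 3)
lrn.svm.wrapped <- makeTuneWrapper(lrn.svm, resampling = inner,
                                   par.set = ps, control = ctrl)
# outer resampling: gives the unbiased performance estimate
outer <- makeResampleDesc("CV", iters = 5)
r <- resample(lrn.svm.wrapped, task, resampling = outer,
              extract = getTuneResult)
```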