# Tree-Based Methods Part IV - Packages Comparison

While the last three articles illustrated the CART model for both classification (with equal/unequal costs) and regression tasks, this article is rather technical as it compares three packages: **rpart**, **caret** and **mlr**. For those who are not familiar with the last two, they are wrappers (or frameworks) that implement a range of models (or algorithms) in a unified way. For example, the CART implementation of the **rpart** package can also be run through these packages as an integrated learner. As mentioned in an earlier article (Link), inconsistent APIs can be a drawback of R (as of other open source tools), and it would be quite beneficial if there were a way to implement different models in a standardized way. In line with the earlier articles, the *Carseats* data is used for a classification task.

Before getting started, I should admit that the object names are not defined effectively. I hope the list below is helpful for following the script.

- package: **rpt** - rpart, **crt** - caret, **mlr** - mlr
- model fitting: **ftd** - fit on training data, **ptd** - fit on test data
- parameter selection: **bst** - best *cp* by the *1-SE rule* (recommended by **rpart**), **lst** - best *cp* by highest *accuracy* (**caret**) or lowest *mmce* (**mlr**)
- cost: **eq** - equal cost, **uq** - unequal cost (the *uq* case is not covered in this article)
- etc: **cp** - complexity parameter, **mmce** - mean misclassification error, **acc** - accuracy (**caret**), **cm** - confusion matrix

Also, the data is randomly split into **trainData** and **testData**. In practice the latter is not observed; here it is used for evaluation.

Let's get started.

The following packages are used.
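The original code chunk was not preserved in this post, so the setup below is a sketch of what loading the three packages would look like:

```r
# the three packages compared in this article
library(rpart)  # CART implementation
library(caret)  # wrapper framework
library(mlr)    # wrapper framework
```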

The *Sales* column is converted into a binary variable.
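The conversion itself is one line. The sketch below assumes the *Carseats* data comes from the **ISLR** package and uses a threshold of 8, as in the usual ISLR example; both are assumptions, since the original chunk is not shown:

```r
library(ISLR)  # assumed source of the Carseats data
data(Carseats)
# recode continuous Sales as a binary factor; the threshold of 8
# follows the common ISLR example and is an assumption here
Carseats$Sales <- as.factor(ifelse(Carseats$Sales <= 8, "No", "Yes"))
```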

Balanced splitting of data can be performed in either of the packages as shown below.

As in the previous articles, the split by the **caret** package is taken.
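With **caret**, a stratified split can be sketched as follows; the 80/20 proportion and the seed are assumptions:

```r
set.seed(1237)
# createDataPartition() keeps the class proportions of Sales balanced
trainIndex <- createDataPartition(Carseats$Sales, p = 0.8, list = FALSE)
trainData <- Carseats[trainIndex, ]
testData  <- Carseats[-trainIndex, ]
```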

Note that two custom functions are used: `bestParam()` and `updateCM()`. The former searches the *cp* values by the *1-SE rule* (**bst**) and at the lowest *xerror* (**lst**) from the cp table of a *rpart* object. The latter produces a confusion matrix with the model and use errors added to the last column and row, respectively. Their sources can be seen here.
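As the sources are linked rather than shown, here is a minimal sketch of the 1-SE selection logic only; `selectCp()` is a hypothetical stand-in, not the author's `bestParam()`:

```r
# given the cptable of an rpart object, return the cp at the lowest xerror
# ("lst") and the largest cp whose xerror is within one SE of it ("bst")
selectCp <- function(cptab) {
  lowest <- which.min(cptab[, "xerror"])
  threshold <- cptab[lowest, "xerror"] + cptab[lowest, "xstd"]
  within <- which(cptab[, "xerror"] <= threshold)[1]  # rows are ordered by decreasing cp
  c(lst = cptab[lowest, "CP"], bst = cptab[within, "CP"])
}
```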

At first, the model is fit using the **rpart** package and **bst** and **lst** *cp* values are obtained.
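The fitting step might be sketched as below; growing the tree with `cp = 0` so that the full cp table is available is an assumption:

```r
set.seed(1237)
# grow a full classification tree; cross-validated errors end up in fit$cptable
fit <- rpart(Sales ~ ., data = trainData, method = "class",
             control = rpart.control(cp = 0))
printcp(fit)  # inspect the cp table; plotcp(fit) draws the xerror curve
```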

The selected *cp* values can be checked graphically below.

The original tree is pruned with the two *cp* values, resulting in two separate trees, which are then fit on the training data.
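A sketch of the pruning and the training-data fit, with `cp.bst` and `cp.lst` as hypothetical names for the two selected values:

```r
fit.bst <- prune(fit, cp = cp.bst)  # tree by the 1-SE rule
fit.lst <- prune(fit, cp = cp.lst)  # tree at the lowest xerror
# predicted classes on the training data
pred.bst <- predict(fit.bst, newdata = trainData, type = "class")
pred.lst <- predict(fit.lst, newdata = trainData, type = "class")
```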

Details of the fitting are kept in a list (*mmce*).

- pkg: package name
- isTest: fit on test data?
- isBest: *cp* by the *1-SE rule*?
- isEq: equal cost?
- cp: *cp* value used
- mmce: mean misclassification error

The pruned trees are fit to the test data and the same details are added to the list (*mmce*).
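The test-data fit follows the same pattern; the *mmce* entry is simply the share of misclassified observations:

```r
pred.bst.test <- predict(fit.bst, newdata = testData, type = "class")
# mean misclassification error on the test data
mean(pred.bst.test != testData$Sales)
```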

Secondly the **caret** package is employed to implement the CART model.

Note that the **caret** package selects the best *cp* value as the one with the highest *Accuracy*. The best *cp* by this package is therefore labelled **lst** to stay consistent with the **rpart** package, while the **bst** *cp* is selected by the *1-SE rule*. As the standard error of *Accuracy* is relatively wide, an adjustment is made to select the best *cp* value; this can be checked in the graph below.
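A sketch of the **caret** fit; the cp grid and the 10-fold cross-validation are assumptions:

```r
set.seed(1237)
crt.fit <- train(Sales ~ ., data = trainData, method = "rpart",
                 tuneGrid = data.frame(cp = seq(0, 0.3, 0.01)),
                 trControl = trainControl(method = "cv", number = 10))
crt.fit$bestTune  # cp with the highest cross-validated Accuracy
```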

Similar to above, two trees with the respective *cp* values are fit to the training and test data, and the details are kept in *mmce*. Below is the update from fitting on the training data.

Below is the update from fitting on the test data. The updated fitting details can be checked.

Finally the **mlr** package is employed.

At first, a task and a learner are set up.
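In **mlr**, the two objects might be created as follows:

```r
# classification task on the training data, predicting Sales
task <- makeClassifTask(data = trainData, target = "Sales")
# rpart-based CART learner
learner <- makeLearner("classif.rpart")
```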

Then a grid of *cp* values is generated followed by tuning the parameter. Note that, as the tuning optimization path does not include a *standard-error-like* variable, only the best *cp* values are taken into consideration.
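The tuning step can be sketched as below; the grid values and 10-fold cross-validation are assumptions:

```r
ps <- makeParamSet(makeDiscreteParam("cp", values = seq(0, 0.3, 0.01)))
rdesc <- makeResampleDesc("CV", iters = 10L)
tuned <- tuneParams(learner, task = task, resampling = rdesc,
                    par.set = ps, control = makeTuneControlGrid(),
                    measures = mmce)  # minimise mean misclassification error
```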

Using the best *cp* value, the learner is updated followed by training the model.
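Updating and training might look like this (both **caret** and **mlr** export a `train()`, hence the explicit namespace):

```r
# plug the tuned cp into the learner and train the final model
learner <- setHyperPars(learner, par.vals = tuned$x)
mlr.fit <- mlr::train(learner, task)
```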

Then the model is fit to the training and test data, and the fitting details are updated in *mmce*. The overall fitting results can be checked below.

The *mmce* values turn out to be identical, which seems to be because the model is quite stable with respect to *cp*. This can be checked in the following graph.

An article like this, covering only a single model, may not make a convincing case for using a wrapper. When there are multiple models with a variety of tuning parameters to compare, however, the benefit of having one can be considerable. The following articles take a similar approach of comparing individual packages to the wrappers.