# rpart function: how to know the % of correct classification at every terminal node?

#### randomcat

##### New Member
I have a dataset with 277 observations. The response variable is binary: 0 indicates no disease and 1 indicates disease. I know that 180 of the observations have no disease and 97 have the disease. I built a model and constructed a classification tree to see how well it predicts who has the disease and who doesn't. I used the rpart function to construct the tree and ran summary on it.

Code:
library(rpart)
mytree <- rpart(y ~ x1 + x2 + x3 + x4, method = "class")
summary(mytree)
My question is: how do I know what percentage of the data is classified correctly at each tip? Suppose my output is as follows:

Code:
Node number 1: 277 observations,    complexity param=0.134
predicted class=0  expected loss=0.35  P(node) =1
class counts:   180    97
probabilities: 0.650 0.350
left son=2 (156 obs) right son=3 (121 obs)
Primary splits:
x1     < 1.73 to the left,  improve=17.80, (0 missing)
x3     < 1.44 to the left,  improve=17.80, (0 missing)
x2    < 1.35 to the left,  improve=16.40, (0 missing)
x4 < 3.5  to the left,  improve= 1.36, (0 missing)
Surrogate splits:
x2    < 1.35 to the left,  agree=0.751, adj=0.430, (0 split)
x3     < 1.44 to the left,  agree=0.653, adj=0.207, (0 split)
x4 < 3.5  to the right, agree=0.578, adj=0.033, (0 split)

Node number 2: 156 observations,    complexity param=0.0258
predicted class=0  expected loss=0.192  P(node) =0.563
class counts:   126    30
probabilities: 0.808 0.192
left son=4 (133 obs) right son=5 (23 obs)
Primary splits:
x3     < 1.6  to the left,  improve=4.410, (0 missing)
x2    < 1.83 to the left,  improve=3.990, (0 missing)
x1     < 1.27 to the left,  improve=1.410, (0 missing)
x4 < 4.5  to the left,  improve=0.999, (0 missing)
Node number 4: 133 observations
predicted class=0  expected loss=0.143  P(node) =0.48
class counts:   114    19
probabilities: 0.857 0.143
Note that node number 4 splits into two tips. One of the tips has 114 observations (and this is a terminal tip). It classified 114 of the 133 observations as 0. Now, how can I tell how many of the 114 are CORRECTLY classified as 0? Any insight would be greatly appreciated.

#### consuli

##### Member
Hi,

I do not know whether rpart offers a direct view of the correct/incorrect classifications in a node.

However, it is pretty easy to reconstruct the set of observations in a node by reapplying the split criteria to the dataset.

Consuli
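
The suggestion above can be sketched in R without re-deriving the split rules by hand. This is a minimal sketch, not necessarily the method either poster had in mind. It assumes the default 0/1 loss with no observation weights, and it uses the `kyphosis` data that ships with rpart as a stand-in for the original data, which is not available here. The names `fit` and `leaf_summary` are illustrative.

```r
library(rpart)

# kyphosis ships with rpart; it is a stand-in for the poster's data
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# Terminal nodes are the rows of fit$frame whose split variable is "<leaf>"
leaves <- fit$frame[fit$frame$var == "<leaf>", ]

# With the default 0/1 loss, `dev` is the number of misclassified
# training observations in the node, so n - dev were classified correctly
leaf_summary <- data.frame(
  node          = rownames(leaves),
  n             = leaves$n,
  misclassified = leaves$dev,
  pct_correct   = 100 * (leaves$n - leaves$dev) / leaves$n
)
print(leaf_summary)

# Cross-check: fit$where maps each training row to its row in fit$frame,
# so observations can be grouped by the terminal node they fall into
node_id <- rownames(fit$frame)[fit$where]
correct <- predict(fit, type = "class") == kyphosis$Kyphosis
print(tapply(correct, node_id, mean) * 100)
```

The same information is already visible in the summary output quoted in the question: the class counts are the true classes of the observations in the node, and the expected loss is the misclassified fraction. For node 4, which predicts class 0, 114 of the 133 observations are truly 0 and 19 are truly 1, so 19/133 ≈ 0.143 is the expected loss and 114/133 ≈ 0.857 is the share classified correctly.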