I am trying to tabulate the splits of a decision tree fitted with rpart. The code I am using is below so that it can be copied and pasted.
library(rpart)
ss <- 100
set.seed(123)
x1 <- relevel(as.factor(sample(1:4,ss, replace=TRUE)), ref="4")
x11 <- ifelse(x1==1,1,0)
x12 <- ifelse(x1==2,1,0)
x13 <- ifelse(x1==3,1,0)
x2 <- relevel(as.factor(sample(1:3,ss, replace=TRUE)), ref="3")
x21 <- ifelse(x2==1,1,0)
x22 <- ifelse(x2==2,1,0)
x3 <- relevel(as.factor(sample(1:2,ss, replace=TRUE)), ref="2")
x31<- ifelse(x3==1,1,0)
y <- relevel(as.factor(sample(1:2,ss, replace=TRUE)), ref="2")
y1 <- ifelse(y==1,1,0)
n1 <- relevel(as.factor(sample(1:4,ss, replace=TRUE)), ref="4")
n11 <- ifelse(n1==1,1,0)
n12 <- ifelse(n1==2,1,0)
n13 <- ifelse(n1==3,1,0)
n2 <- relevel(as.factor(sample(1:3,ss, replace=TRUE)), ref="3")
n21 <- ifelse(n2==1,1,0)
n22 <- ifelse(n2==2,1,0)
n3 <- relevel(as.factor(sample(1:2,ss, replace=TRUE)), ref="2")
n31<- ifelse(n3==1,1,0)
xbeta <- -0.667-0.167*x11 + 0.167*x12 + 0.333*x13 + x21 -1.333*x22+ x31 + 0.667*y1 +0*n11+0*n12+0*n13+ 0*n21 + 0*n22 + 0*n31 - 1.333*y1*x21+ y1*x22 -1.333*y1*x31
p <- exp(xbeta)/(1+exp(xbeta))
R<- rbinom(ss,1,p)
fit <- rpart(R ~ x1+x2+x3+n1+n2+n3+y, method="class")
To look at the plotted tree, I use:
plot(fit, uniform=TRUE, main="Classification Tree")
text(fit, use.n=TRUE, all=TRUE, cex=.8)
In my actual code, all of this sits inside a for loop, since I am simulating 100 such datasets. I left the loop out here for simplicity.
printcp(fit) lists the "Variables actually used in tree construction", and I already know how to extract and tabulate them to get the number of times each variable was selected. What I want now is to capture potential interactions between x2 and y, and between x3 and y, and tabulate how often they appear.
Looking at the diagram of the tree (from plot(fit)), every time y is an IMMEDIATE sub-branch of either x2 or x3, I want to record that in a vector. I say immediate because if, hypothetically, x2 splits into n3 and n3 then splits into y, I would not count that as a two-way interaction of x2 and y. If x2 splits directly into y, I do want to count it as a two-way interaction between x2 and y.
I tried path.rpart for this, but it does not seem to help in tracking whether x2 or x3 branches immediately into y. The goal is to tabulate how often x2*y interactions occur and how often x3*y interactions occur.
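One possible way to do the bookkeeping described above, as a minimal sketch and not a tested solution for this simulation: an rpart fit stores its nodes in fit$frame, where the row names are node numbers and the var column holds the split variable ("<leaf>" for terminal nodes). The children of node k are numbered 2k and 2k+1, so the parent of any node is its number integer-divided by 2. The helper name count_immediate is hypothetical and is not part of rpart.

```r
# Hypothetical helper (not an rpart function): count the nodes that split on
# `child` and whose immediate parent node splits on `parent`.
count_immediate <- function(fit, parent, child) {
  fr   <- fit$frame
  node <- as.numeric(rownames(fr))   # rpart node numbers
  v    <- as.character(fr$var)       # split variable, "<leaf>" for leaves

  # Nodes splitting on `child`, excluding the root, which has no parent
  child_nodes <- node[v == child & node > 1]

  # Parent of node k is k %/% 2; look up the parent's split variable
  parent_var <- v[match(child_nodes %/% 2, node)]

  sum(parent_var == parent)
}
```

With the fit from the question, count_immediate(fit, "x2", "y") would give the number of y splits sitting directly under an x2 split, and count_immediate(fit, "x3", "y") the same for x3. Inside the simulation loop, storing count_immediate(fit, "x2", "y") > 0 for each dataset in a logical vector and summing it afterwards would give the number of trees showing the pattern. This sketch only counts y directly below x2 or x3; if the reverse ordering (x2 directly below y) should also count as an interaction, the same helper can be called with the arguments swapped.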