
This is the R code for my logistic regression model:

> hrlogis1 <- glm(Attrition~. -Age -DailyRate -Department -Education
>                 -EducationField -HourlyRate -JobLevel
>                 -JobRole -MonthlyIncome -MonthlyRate
>                 -PercentSalaryHike -PerformanceRating
>                 -StandardHours -StockOptionLevel
>                 , family=binomial(link = "logit"),data=hrtrain)

Here, `Attrition` is the dependent variable and all the remaining columns are the independent variables.

Below is the summary of the model:

Coefficients:

                                Estimate Std. Error z value Pr(>|z|)    
(Intercept)                      1.25573    0.84329   1.489 0.136464    
BusinessTravelTravel_Frequently  1.86022    0.47410   3.924 8.72e-05 ***
BusinessTravelTravel_Rarely      1.28273    0.44368   2.891 0.003839 ** 
DistanceFromHome                 0.03869    0.01138   3.400 0.000673 ***
EnvironmentSatisfaction         -0.36484    0.08714  -4.187 2.83e-05 ***
GenderMale                       0.52556    0.19656   2.674 0.007499 ** 
JobInvolvement                  -0.59407    0.13259  -4.480 7.45e-06 ***
JobSatisfaction                 -0.37315    0.08671  -4.303 1.68e-05 ***
MaritalStatusMarried             0.23408    0.26993   0.867 0.385848    
MaritalStatusSingle              1.37647    0.27511   5.003 5.63e-07 ***
NumCompaniesWorked               0.16439    0.04034   4.075 4.59e-05 ***
OverTimeYes                      1.67531    0.20054   8.354  < 2e-16 ***
RelationshipSatisfaction        -0.23865    0.08726  -2.735 0.006240 ** 
TotalWorkingYears               -0.12385    0.02360  -5.249 1.53e-07 ***
TrainingTimesLastYear           -0.15522    0.07447  -2.084 0.037124 *  
WorkLifeBalance                 -0.30969    0.13025  -2.378 0.017427 *  
YearsAtCompany                   0.06887    0.04169   1.652 0.098513 .  
YearsInCurrentRole              -0.10812    0.04880  -2.216 0.026713 *  
YearsSinceLastPromotion          0.14006    0.04452   3.146 0.001657 ** 
YearsWithCurrManager            -0.09343    0.04984  -1.875 0.060834 .  
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Now I want to remove the terms that are not significant. In this case, "MaritalStatusMarried" is not significant. `MaritalStatus` is a variable (column) with two levels, "Married" and "Single".

Bala
  • What do you mean remove? From the dataframe `hrtrain`? And what have the levels to do with it? – desertnaut Jun 15 '18 at 08:46
  • Possible duplicate of [Drop data frame columns by name](https://stackoverflow.com/questions/4605206/drop-data-frame-columns-by-name) – desertnaut Jun 15 '18 at 09:04
  • I want to exclude only "MaritalStatusMarried" because it is not significant for the model. That's what I mean. – Bala Jun 15 '18 at 09:07
  • So, you are just asking how to remove columns from an R dataframe... Your question is very poorly expressed (and it has nothing to do with logistic regression itself) - see answer in the link above – desertnaut Jun 15 '18 at 09:14
  • It's not just a column. Here is an example: suppose a column "Gender" contains two categories, "male" and "female", and say that male is not significant for the model. Hence I need to exclude only male from the Gender column. – Bala Jun 15 '18 at 09:21
  • So you have a factor with *two* levels and want to drop one of them. And expect the other to be significant? How is that possible? – Rui Barradas Jun 15 '18 at 17:43

1 Answer

How about:

data$MaritalStatus[data[, num] == "Married"] <- NA

(where `num` is the index of the `MaritalStatus` column in `data`)

The "Married" values will be replaced with `NA`, and then you can run the glm model again.
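A minimal sketch of that approach on made-up data may help. The `toy` data frame, its random values, and the third level "Divorced" are assumptions for illustration only; this is not the `hrtrain` data. One thing worth noticing: under R's default `na.action` (`na.omit`), `glm` drops every row that contains an `NA`, so the refit uses fewer observations rather than only losing one coefficient.

```r
# Hypothetical toy data standing in for hrtrain; all values are random.
set.seed(1)
toy <- data.frame(
  Attrition     = factor(sample(c("No", "Yes"), 300, replace = TRUE)),
  MaritalStatus = factor(sample(c("Divorced", "Married", "Single"), 300, replace = TRUE)),
  OverTime      = factor(sample(c("No", "Yes"), 300, replace = TRUE))
)

# Set the "Married" rows to NA, then drop the now-empty factor level.
toy$MaritalStatus[toy$MaritalStatus == "Married"] <- NA
toy$MaritalStatus <- droplevels(toy$MaritalStatus)

# Refit; rows containing NA are omitted under the default na.action (na.omit).
fit <- glm(Attrition ~ ., family = binomial(link = "logit"), data = toy)

n_used     <- nobs(fit)                       # rows actually used in the fit
n_kept     <- sum(!is.na(toy$MaritalStatus))  # rows that were not set to NA
coef_names <- names(coef(fit))                # no "MaritalStatusMarried" term remains
```

Comparing `n_used` with `nrow(toy)` shows how many observations were discarded by this step.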

Érica Wong