Although the ratio of 0s to 1s is 1:9, that does not mean the weights should be 0.1 and 0.9. The weights decide how much emphasis each observation gets relative to the others.
And in your case, since you want to predict something, it is essential to split your data into training and test sets and see what influence the weights have on prediction.
Below, using the Pima Indian diabetes example from MASS, I subsample the Yes type so that the training set has a 1:9 ratio.
set.seed(111)
library(MASS)
# we sample 10 from Yes and 90 from No
idx = unlist(mapply(sample,split(1:nrow(Pima.tr),Pima.tr$type),c(90,10)))
Data = Pima.tr
trn = Data[idx,]
test = Data[-idx,]
table(trn$type)
No Yes
90 10
Let's try fitting the regression with weight 9 for positives and 1 for negatives:
library(caret)
W = 9
lvl = levels(trn$type)
#if positive we give it the defined weight, otherwise set it to 1
fit_wts = ifelse(trn$type==lvl[2],W,1)
fit = glm(type ~ .,data=trn,weights=fit_wts,family=binomial)
# we test it on the test set
pred = ifelse(predict(fit,test,type="response")>0.5,lvl[2],lvl[1])
pred = factor(pred,levels=lvl)
confusionMatrix(pred,test$type,positive=lvl[2])
Confusion Matrix and Statistics
Reference
Prediction No Yes
No 34 26
Yes 8 32
You can see from the above that it's doing OK, but you are missing out on 26 positives (predicted No, actually Yes) and also falsely labeling 8 negatives as positives. Let's say we try W = 3:
W = 3
lvl = levels(trn$type)
fit_wts = ifelse(trn$type==lvl[2],W,1)
fit = glm(type ~ .,data=trn,weights=fit_wts,family=binomial)
pred = ifelse(predict(fit,test,type="response")>0.5,lvl[2],lvl[1])
pred = factor(pred,levels=lvl)
confusionMatrix(pred,test$type,positive=lvl[2])
Confusion Matrix and Statistics
Reference
Prediction No Yes
No 39 30
Yes 3 28
Now almost all of the positive calls we make are correct, but we still miss out on a lot of potential "Yes". The bottom line is that the code above might work, but you need to do some checks to figure out the right weight for your data.
You can also look at the other statistics reported by confusionMatrix in caret to guide your choice.
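One way to do such a check is to sweep over a grid of candidate weights and record sensitivity and specificity for each. The sketch below does this with base R only (a manual 2x2 table instead of caret's confusionMatrix), and for simplicity it fits on the full Pima.tr and evaluates on the Pima.te set that ships with MASS rather than the subsampled trn/test split above; the grid of weights is just an illustrative choice.

```r
library(MASS)   # for the Pima.tr / Pima.te data sets

set.seed(111)
lvl = levels(Pima.tr$type)

# try a grid of candidate weights and record sensitivity / specificity
results = t(sapply(c(1, 3, 5, 9), function(W) {
  wts  = ifelse(Pima.tr$type == lvl[2], W, 1)
  fit  = glm(type ~ ., data = Pima.tr, weights = wts, family = binomial)
  pred = ifelse(predict(fit, Pima.te, type = "response") > 0.5, lvl[2], lvl[1])
  tab  = table(factor(pred, levels = lvl), Pima.te$type)
  c(W = W,
    sensitivity = tab[2, 2] / sum(tab[, 2]),   # true positive rate
    specificity = tab[1, 1] / sum(tab[, 1]))   # true negative rate
}))
results
```

Larger weights on the Yes class should push sensitivity up at the cost of specificity; plotting the two against W makes the trade-off explicit and lets you pick the weight that matches how costly a missed positive is for your problem.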