I apologize if this question is trivial or a duplicate, but I can't find an answer anywhere. I've read the docs and similar questions here, but they don't answer my question. My background is in non-statistical programming in languages like JavaScript and Python, so please explain for that audience.
My question is simple (hopefully):
What is a formula? What is the `~` operator? How are they evaluated and used?
As someone with a programming background, I want to understand formulas the way I understand any other type: what operations can I perform on them, and how can I use them generically in functions?
For example, the mosaic package (from `library(mdsr)`) has the `mean` function:
mean( ~ mpg, data = mtcars)
They are using the `~` operator to grab the column name. I know that the `$` operator, as in `mtcars$mpg`, returns a vector of all the `mpg` values within the data frame `mtcars`. How can I use a formula to make generic functions like that? How can I evaluate the formula? How does that type work?
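To make the question concrete, here is a minimal sketch in base R of the kind of thing I mean. It assumes only the built-in `mtcars` data frame. `eval_rhs` is a hypothetical helper name of my own, and I am not claiming this is how mosaic implements its `mean`; it is just my guess at the general shape.

```r
# Base R only; mosaic is not needed for this part.
f <- ~ mpg        # a one-sided formula; mpg is stored unevaluated
class(f)          # "formula"
all.vars(f)       # "mpg"
f[[2]]            # the right-hand side of the one-sided formula, as a symbol

# Hypothetical helper (not part of mosaic): evaluate the right-hand side
# of a one-sided formula, looking names up in `data` first and then in
# the environment where the formula was created.
eval_rhs <- function(formula, data) {
  eval(formula[[2]], envir = data, enclos = environment(formula))
}

# Should give the same result as mean(mtcars$mpg)
mean(eval_rhs(~ mpg, mtcars))
```

If this is roughly the right mental model (a formula is an unevaluated expression plus an environment), I would like to know what else I can do with one.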