I have a function that behaves incorrectly when called inside mutate() from the dplyr package. The function takes a UK postcode and returns its postal area. It works fine with individual postcodes or vectors of postcodes.
Here is the function:
pArea_parse <- function(x) {
  z <- any(grep('[A-Z][A-Z]', substr(x, 1, 2)))
  y <- any(grep('[A-Z][0-9]', substr(x, 1, 2)))
  if (z) {
    return(substr(x, 1, 2))
  } else if (y) {
    return(substr(x, 1, 1))
  } else if (!y & !z) {
    return(NA)
  }
}
It works:
> x <- "B30 1AA" # plucked randomly from a postcode site
> pArea_parse(x)
[1] "B"
Here is some sample data:
test <- data.frame(id = c(1, 2, 3, 4),
                   post_code = c("B30 1AA", "B30 3FT", "B30 3AZ", "BA1 8TU"))
Here is my dplyr code:
library(dplyr)
test %>% mutate(postal_area = pArea_parse(post_code))
Instead of returning the first letter when there is a letter followed by a number, it returns the letter and the number, even though this doesn't happen with a vector of postcodes or an individual postcode.
id post_code postal_area
 1   B30 1AA          B3
 2   B30 3FT          B3
 3   B30 3AZ          B3
 4   BA1 8TU          BA
How can a function do something it's not programmed to do when used in conjunction with mutate? I am stumped!
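One check that might help narrow this down is to call the function on the whole column at once, without dplyr, and compare that with calling it once per postcode. mutate() evaluates pArea_parse(post_code) a single time on the full column rather than row by row, so the first call below should mirror what happens inside mutate(). This is only a sketch: the names whole_vector and per_element are made up for illustration, and the function is copied from above with its behaviour unchanged.

```r
# Copy of the function from the question, behaviour unchanged.
pArea_parse <- function(x) {
  z <- any(grep('[A-Z][A-Z]', substr(x, 1, 2)))
  y <- any(grep('[A-Z][0-9]', substr(x, 1, 2)))
  if (z) {
    return(substr(x, 1, 2))
  } else if (y) {
    return(substr(x, 1, 1))
  } else if (!y & !z) {
    return(NA)
  }
}

# The same four postcodes as in the sample data frame.
post_code <- c("B30 1AA", "B30 3FT", "B30 3AZ", "BA1 8TU")

# One call on the whole vector: z and y are each a single TRUE/FALSE
# computed across all four postcodes.
whole_vector <- pArea_parse(post_code)

# One call per postcode (hypothetical comparison, not in the original code).
per_element <- vapply(post_code, pArea_parse, character(1), USE.NAMES = FALSE)

print(whole_vector)  # "B3" "B3" "B3" "BA"
print(per_element)   # "B"  "B"  "B"  "BA"
```

If the two results differ like this, the extra digit seems to come from the function seeing all four postcodes in one call (where "BA1 8TU" makes z TRUE for the whole vector), not from anything mutate() adds.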