I've got a list of datasets, and I want to make a few changes to these datasets using R.
First, if a variable "mac_sector" exists, I want to rename it to "sector".
*Edit: It always says mac_sector not found, even though it is in at least one of the datasets. Also, if if(exists()) does not find something, does the script just continue with the rest, or does it terminate?
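To make the exists() question concrete, here is a minimal sketch on a made-up data frame `d` (an assumption for illustration, not one of the real files). As far as I understand base R, exists() expects a quoted name and looks for objects rather than columns, so a bare mac_sector is evaluated first and raises "object not found"; a column check is more commonly written with %in% names(). A condition that is simply FALSE only skips its block, while an uncaught error stops a script run with source() or Rscript.

```r
# Toy data frame standing in for one dataset (values are made up)
d <- data.frame(Country = c("GER", "BELG"), mac_sector = c("F", "L"))

# Column check: is there a column with this name in d?
has_col <- "mac_sector" %in% names(d)

# The bare name mac_sector is evaluated before exists() can do anything.
# No object of that name exists, so R raises an error instead of returning
# FALSE. tryCatch() captures the error message so the script keeps running.
msg <- tryCatch(exists(mac_sector, d), error = function(e) conditionMessage(e))

# A condition that is FALSE just skips the block; nothing terminates.
ran <- FALSE
if ("no_such_column" %in% names(d)) {
  ran <- TRUE
}
```

Here `has_col`, `msg`, and `ran` are only names chosen for the sketch.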
Second, if there is no variable called "mac_sector" or "sector", I want to create a new column called "sector" with "total" as its value in every row.
Lastly, I want to rearrange the columns so that "sector" is the 3rd column in each dataset.
I wrote the script below (some parts are not even valid R), but obviously it's not working, so I'm hoping that some of you may be able to help me with this.
I also want to save these changes back to the respective datasets, but I have no idea how to go about that in this case. (I know of the save() command, but I feel like it wouldn't work here.)
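One possible way to write the changes back, sketched under assumptions: the datasets sit in a named list, and each element is written to a file named after its list name. `save_all` and its `writer` argument are hypothetical names for this sketch. With the readstata13 package installed, `save.dta13(data, file)` could be passed as the writer; the demo below uses base R's saveRDS() and a temporary folder only so that it runs without extra packages.

```r
# Hypothetical helper: write every element of a named list to "<name><ext>".
# `writer` is any function taking (data, file), e.g. readstata13::save.dta13.
save_all <- function(dfs, dir, writer, ext = ".dta") {
  paths <- file.path(dir, paste0(names(dfs), ext))
  for (i in seq_along(dfs)) {
    writer(dfs[[i]], paths[i])
  }
  invisible(paths)
}

# Demo with made-up data and base R's saveRDS() as a stand-in writer
demo <- list(a = data.frame(x = 1:2), b = data.frame(x = 3:4))
out_dir <- tempfile("datasets")
dir.create(out_dir)
written <- save_all(demo, out_dir, saveRDS, ext = ".rds")
```

For the real files the call might look like `save_all(df, getwd(), readstata13::save.dta13)`, which would overwrite the originals, so testing on copies first seems safer.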
library(readstata13)  # provides read.dta13()
setwd("C:\\Users\\files")
mylist = list.files(pattern="\\.dta$")
#Loop through all of the datasets in C:\\Users\\files
#Reading the datasets into R
df <- lapply(mylist, read.dta13)
#Naming the list elements to match the files for convenience
names(df) <- gsub("\\.dta$", "", mylist)
# If column mac_sector exists, rename to sector
if(exists(mac_sector, df)){
df <- rename(df, c(mac_sector="sector"))
}
# If column variable with pattern("sector") does not exist, create variable sector=total
if(does not exist(pattern="sector")){
sector <- c("total")
df$sector <- sector
}
# rearrange variable, sector must be placed 3rd
df <- arrange.vars(df, c("sector" = 3))
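For comparison, the three steps described above could be sketched in base R as one function applied to every list element with lapply(). This is only a sketch under assumptions: `fix_sector` is a hypothetical helper, the toy list `dfs` and its `year` column are made up, and no dataset is assumed to contain both mac_sector and sector at once.

```r
# Hypothetical helper: normalise the sector column of ONE data frame.
fix_sector <- function(d, pos = 3) {
  if ("mac_sector" %in% names(d)) {
    # Step 1: rename mac_sector to sector
    names(d)[names(d) == "mac_sector"] <- "sector"
  } else if (!("sector" %in% names(d))) {
    # Step 2: neither column exists, so add sector = "total" in every row
    d$sector <- "total"
  }
  # Step 3: move sector to position `pos` (or to the end if d is narrower)
  others <- setdiff(names(d), "sector")
  d[append(others, "sector", after = pos - 1)]
}

# Made-up datasets covering the three cases
dfs <- list(
  renamed = data.frame(Country = c("GER", "BELG"), year = 2000,
                       Variable1 = 1:2, mac_sector = c("F", "L")),
  missing = data.frame(Country = c("GER", "BELG"), year = 2000,
                       Variable1 = 1:2),
  already = data.frame(Country = c("GER", "BELG"), sector = c("M", "K"),
                       year = 2000, Variable1 = 1:2)
)

# The if() logic lives inside the function, so it runs once per data frame
dfs <- lapply(dfs, fix_sector)
```

Because lapply() returns a new list, the result has to be assigned back (here to `dfs`) for the changes to stick.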
Edit: I want all datasets to look like this (and some already do):
Country | sector | Variable1 | Variable2 | Variable3 | ....
GER     | M      | value     | value     | value     | ....
BELG    | K      | value     | value     | value     | ....
and so on.
Now some of them look like this:
Country | mac_sector | Variable1 | Variable2 | Variable3 | ....
GER     | F          | value     | value     | value     | ....
BELG    | L          | value     | value     | value     | ....
In which case I want to rename mac_sector to sector.
They can also look like this:
Country | Variable1 | Variable2 | Variable3 | ....
GER     | value     | value     | value     | ....
BELG    | value     | value     | value     | ....
In which case I want to add a variable sector = total:
Country | sector | Variable1 | Variable2 | Variable3 | ....
GER     | total  | value     | value     | value     | ....
BELG    | total  | value     | value     | value     | ....
*I should mention that Variable1, Variable2, Variable3 and so on do not represent the same thing across datasets.