This question is related to this question, but not quite the same.
Say I have this data frame,
df <- data.frame( id = c(1:6), profession = c(1, 5, 4, NA, 0, 5))
and a named vector with human-readable labels for the profession codes. Say,
profession.code <- c( Optometrists=1, Accountants=2, Veterinarians=3, `Financial analysts`=4, Nurses=5)
Now, I'm looking for the easiest way to replace the values in df$profession with the text found in profession.code. Preferably without the use of special libraries, unless it shortens the code significantly.
I would like my end result to be
df <- data.frame( id = c(1:6), profession = c("Optometrists", "Nurses", "Financial analysts", NA, 0, "Nurses"))
Any help would be greatly appreciated.
You can do it this way:
df <- data.frame(id = c(1:6), profession = c(1, 5, 4, NA, 0, 5))
profession.code <- c(`0` = 0, Optometrists=1, Accountants=2, Veterinarians=3, `Financial analysts`=4, Nurses=5)
df$profession.str <- names(profession.code)[match(df$profession, profession.code)]
df
#   id profession     profession.str
# 1  1          1       Optometrists
# 2  2          5             Nurses
# 3  3          4 Financial analysts
# 4  4         NA               <NA>
# 5  5          0                  0
# 6  6          5             Nurses
Note that I had to add a 0 entry in your profession.code vector to account for those zeroes.
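To see why this works, it may help to look at the intermediate index that match() produces. The following is only a small sketch using the same data; the names codes, values, idx and labels are made up for illustration:

```r
# Lookup vector: names are the labels, values are the codes
codes <- c(`0` = 0, Optometrists = 1, Accountants = 2, Veterinarians = 3,
           `Financial analysts` = 4, Nurses = 5)
values <- c(1, 5, 4, NA, 0, 5)

# match() returns, for each value, its position in codes (NA if absent)
idx <- match(values, codes)
idx

# Indexing the names by position turns codes into labels;
# an NA position simply yields an NA label
labels <- names(codes)[idx]
labels
```

Because the lookup is positional, the order of the entries in the lookup vector does not matter, only that each code appears once.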
EDIT: here is an updated solution to account for Eric's comment below that the data may contain any number of profession codes for which there are no corresponding descriptions:
match.idx <- match(df$profession, profession.code)
df$profession.str <- ifelse(is.na(match.idx), df$profession, names(profession.code)[match.idx])
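If this lookup is needed in more than one place, the same idea could be wrapped in a small base R helper. This is just a sketch: decode_codes is a hypothetical name, not an existing function, and it assumes that codes without a description should be kept as their character representation:

```r
# Hypothetical helper: replace codes by their labels, keep unknown codes as-is
decode_codes <- function(values, codes) {
  idx <- match(values, codes)            # position of each value in the lookup
  out <- as.character(values)            # fallback: the code itself, as text
  found <- !is.na(idx)                   # TRUE where a description exists
  out[found] <- names(codes)[idx[found]] # overwrite with the label
  out
}

profession.code <- c(Optometrists = 1, Accountants = 2, Veterinarians = 3,
                     `Financial analysts` = 4, Nurses = 5)
res <- decode_codes(c(1, 5, 4, NA, 0, 5), profession.code)
res
```

Here the lookup vector does not need a 0 entry; the unmatched 0 comes back as the string "0" and the NA stays NA.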
I played around with it and this is my current solution using the recode function from the car package:
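# Build one recode rule per code, e.g. "1='Optometrists';"
# (paste0 avoids the spaces that paste would put inside the quoted label)
pLoop <- function(v) paste0(profession.code[v], "='", names(profession.code[v]), "';")
library(car)
df$profession <- recode(df$profession, paste(sapply(1:5, pLoop), collapse=""))
df
# id profession
# 1 Optometrists
# 2 Nurses
# 3 Financial analysts
# 4 <NA>
# 5 0
# 6 Nurses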
I'm still interested to see if anyone has other suggestions for a solution. I would prefer to do it using only base R functions.
I personally like the way the arules package deals with this problem, using the decode function. From the documentation:
library(arules)
data("Adult")

## Example 1: Manual decoding
## get code
iLabels <- itemLabels(Adult)
head(iLabels)

## get undecoded list and decode in a second step
list <- LIST(Adult[1:5], decode = FALSE)
list
decode(list, itemLabels = iLabels)
An advantage is that the package also offers the functions encode and recode. Their respective purposes are straightforward, I believe.