Nested ifelse statement in R

I'm new here and just starting out with R. I use the latest R 3.0.1 on Windows 7.

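I am still learning how to translate SAS code into R, and I get warnings. I need to understand where I am making mistakes. What I want to do is create a variable that summarizes and distinguishes three statuses of a population: mainland, overseas, foreigner.
I have a database with 2 variables: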

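> id nationality: idnat (french, foreigner),

If idnat is french, then:

> id birthplace: idbp (mainland, colony, overseas)

I want to summarize the information from idnat and idbp into a new variable called idnat2:

> status: k (mainland, overseas, foreigner)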

All these variables use the "character" type.

Expected results in column idnat2:

   idnat     idbp   idnat2
1  french mainland mainland
2  french   colony overseas
3  french overseas overseas
4 foreign  foreign  foreign

Here is the SAS code that I would like to translate into R:

if idnat = "french" then do;
   if idbp in ("overseas","colony") then idnat2 = "overseas";
   else idnat2 = "mainland";
end;
else idnat2 = "foreigner";
run;

Here is my attempt in R:

if(idnat=="french"){
    idnat2 <- "mainland"
} else if(idbp=="overseas"|idbp=="colony"){
    idnat2 <- "overseas"
} else {
    idnat2 <- "foreigner"
}

I receive this warning:

Warning message:
In if (idnat=="french") { :
  the condition has length > 1 and only the first element will be used

I was advised to use a "nested ifelse" instead because it is easier, but I get more warnings:

idnat2 <- ifelse (idnat=="french", "mainland",
        ifelse (idbp=="overseas"|idbp=="colony", "overseas")
      )
            else (idnat2 <- "foreigner")

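According to the warning message, the length is greater than 1, so only what is between the first brackets will be taken into account. Sorry, but I don't understand what this length has to do with my code. Does anybody know where I'm going wrong?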

If you use any spreadsheet application, there is a basic function if() with the syntax:

if(<condition>, <yes>, <no>)

The syntax is exactly the same for ifelse() in R:

ifelse(<condition>, <yes>, <no>)

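The only difference from if() in a spreadsheet application is that R's ifelse() is vectorized (it takes vectors as input and returns a vector as output). Consider the following comparison of formulas in a spreadsheet application and in R, for an example where we want to test whether a > b and return 1 if so and 0 otherwise.

In a spreadsheet: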

  A  B C
1 3  1 =if(A1 > B1, 1, 0)
2 2  2 =if(A2 > B2, 1, 0)
3 1  3 =if(A3 > B3, 1, 0)

In R:

> a <- 3:1; b <- 1:3
> ifelse(a > b, 1, 0)
[1] 1 0 0

ifelse() can be nested in many ways:

ifelse(<condition>, <yes>, ifelse(<condition>, <yes>, <no>))

ifelse(<condition>, ifelse(<condition>, <yes>, <no>), <no>)

ifelse(<condition>, 
       ifelse(<condition>, <yes>, <no>), 
       ifelse(<condition>, <yes>, <no>)
      )

ifelse(<condition>, <yes>, 
       ifelse(<condition>, <yes>, 
              ifelse(<condition>, <yes>, <no>)
             )
       )
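
To make the templates above concrete, here is a minimal sketch of the first form (a nested ifelse() in the <no> position). The vector x and the labels are made up for illustration and are not part of the question.

```r
# Hypothetical example data, not from the original question.
x <- c(-2, 0, 5)

# First form above: the inner ifelse() is only consulted for
# elements where the outer condition is FALSE.
sign_label <- ifelse(x < 0, "negative",
                     ifelse(x == 0, "zero", "positive"))

print(sign_label)
```

Each element of x gets its own label, because both the outer and the inner test are evaluated element by element.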

To calculate column idnat2 you can:

df <- read.table(header=TRUE, text="
idnat idbp idnat2
french mainland mainland
french colony overseas
french overseas overseas
foreign foreign foreign"
)

with(df, 
     ifelse(idnat=="french",
       ifelse(idbp %in% c("overseas","colony"),"overseas","mainland"),"foreign")
     )
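
The call above only prints the result. If you want to keep it, one option is to assign it to a new column and check it against the expected values. This is a sketch under a few assumptions: the expected column is renamed to expected (a name chosen here, not in the question) so it does not clash with the computed idnat2, and stringsAsFactors = FALSE is set so the columns are character on every R version.

```r
# Same data as above; the expected result is kept in a separate,
# hypothetically named column so it can be compared afterwards.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
idnat idbp expected
french mainland mainland
french colony overseas
french overseas overseas
foreign foreign foreign")

# Nested ifelse(): french rows are split by birthplace,
# everything else is labelled "foreign".
df$idnat2 <- with(df,
  ifelse(idnat == "french",
         ifelse(idbp %in% c("overseas", "colony"), "overseas", "mainland"),
         "foreign"))

print(df)

# Stops with an error if the computed column differs from the expected one.
stopifnot(identical(df$idnat2, df$expected))
```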

R Documentation

What does "the condition has length > 1 and only the first element will be used" mean? Let's see:

> # What is first condition really testing?
> with(df, idnat=="french")
[1]  TRUE  TRUE  TRUE FALSE
> # This is result of vectorized function - equality of all elements in idnat and 
> # string "french" is tested.
> # Vector of logical values is returned (has the same length as idnat)
> df$idnat2 <- with(df,
+   if(idnat=="french"){
+   idnat2 <- "xxx"
+   }
+   )
Warning message:
In if (idnat == "french") { :
  the condition has length > 1 and only the first element will be used
> # Note that the first element of comparison is TRUE and that's why we get:
> df
    idnat     idbp idnat2
1  french mainland    xxx
2  french   colony    xxx
3  french overseas    xxx
4 foreign  foreign    xxx
> # There is really logic in it, you have to get used to it
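
The "length" in the message is simply the length of the logical vector produced by the comparison. A small sketch, using a standalone idnat vector instead of the data frame, shows this and how any() or all() collapse it to the single value that if() expects. Note that in R 4.2.0 and later, passing a condition of length > 1 to if() is an error rather than a warning.

```r
idnat <- c("french", "french", "french", "foreign")

cond <- idnat == "french"
print(length(cond))  # one logical value per element of idnat

# if() needs exactly one TRUE/FALSE, so collapse the vector explicitly
# when a single answer for the whole vector is what you want.
if (all(cond)) {
  message("everyone is french")
} else if (any(cond)) {
  message("some are french")
} else {
  message("nobody is french")
}
```

This collapsing is only appropriate when you want one decision for the whole vector; for a per-row result, ifelse() remains the right tool.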

Can I still use if()? Yes, you can, but the syntax is not so cool :)

test <- function(x) {
  if(x=="french") {
    "french"
  } else{
    "not really french"
  }
}

apply(array(df[["idnat"]]),MARGIN=1, FUN=test)
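
As an alternative to the apply(array(...)) call, a scalar function like test can also be mapped over a plain vector with vapply(). This is a sketch of that variant, not the approach from the answer above; the function is redefined here so the snippet runs on its own.

```r
# Scalar function: works on one value at a time, so plain if() is fine.
test <- function(x) {
  if (x == "french") "french" else "not really french"
}

idnat <- c("french", "french", "french", "foreign")

# vapply() calls test() once per element and checks that every
# result is a single character string.
labels <- vapply(idnat, test, FUN.VALUE = character(1), USE.NAMES = FALSE)
print(labels)
```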

If you are familiar with SQL, you can also use the CASE statement in the sqldf package.
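
A rough sketch of what that could look like, assuming the sqldf package is installed (the snippet does nothing if it is not). The query text is my own illustration of the same logic, with "foreign" as the fallback label to match the expected output.

```r
df <- data.frame(
  idnat = c("french", "french", "french", "foreign"),
  idbp  = c("mainland", "colony", "overseas", "foreign"),
  stringsAsFactors = FALSE
)

# Only run the query when sqldf is available.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  # sqldf() looks up the data frame named in FROM in the calling environment.
  res <- sqldf("
    SELECT idnat, idbp,
           CASE
             WHEN idnat = 'french' AND idbp IN ('overseas', 'colony') THEN 'overseas'
             WHEN idnat = 'french' THEN 'mainland'
             ELSE 'foreign'
           END AS idnat2
    FROM df")
  print(res)
}
```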

http://stackoverflow.com/questions/18012222/nested-ifelse-statement-in-r
