R - Assign values/factors to a data.frame column conditional on the values of other columns

This should give:

df$level <- ifelse(df$n < 1 & df$m < 1, "low", ifelse(df$n > 1 & df$m > 1, "high", "medium")

Or, if I want to assign a value to level based on the values in columns l and n (again pseudocode):

df$level
#"low A/B" "high" "high" "high" "high"

I may have misunderstood the question, but once I add the missing closing parenthesis, it seems to work fine. Here is a solution:

> df$level <- ifelse(df$n < 1 & df$m < 1, "low", ifelse(df$n > 1 & df$m > 1, "high", "medium"))
> df
          n          m l  level
1 0.9154139 -0.1078814 A    low
2 1.8404001 -0.1702891 B medium
3 0.5365172 -1.0883317 C    low
4 0.4491650 -3.0110517 D    low
5 1.7360404 -0.5931743 E medium
> df$level
[1] "low"    "medium" "low"    "low"    "medium"
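
If the nested ifelse() becomes hard to extend, the same rule can also be written as a default value followed by logical-index assignments. This is only a minimal sketch: it assumes the five-row df printed above, rebuilt here by hand from that output, and it is an alternative spelling of the same conditions rather than a different method.

```r
# Rebuild the example data from the printed output above
df <- data.frame(
  n = c(0.9154139, 1.8404001, 0.5365172, 0.4491650, 1.7360404),
  m = c(-0.1078814, -0.1702891, -1.0883317, -3.0110517, -0.5931743),
  l = factor(c("A", "B", "C", "D", "E"))
)

# Start from the fallback value, then overwrite the rows matching each condition
df$level <- "medium"
df$level[df$n < 1 & df$m < 1] <- "low"
df$level[df$n > 1 & df$m > 1] <- "high"

df$level
# [1] "low"    "medium" "low"    "low"    "medium"
```

Each extra category then needs one more assignment line instead of one more level of nesting.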

You can also do the following:

df$level1 <- c("low", "medium", "high")[rowMeans(sign(df[c("n", "m")] - 1)) + 2]

df$level2 <- c("high", "low A/B")[(df$n < 1 & df$l %in% c("A", "B")) + 1]

#           n          m l level1  level2
# 1 0.9154139 -0.1078814 A    low low A/B
# 2 1.8404001 -0.1702891 B medium    high
# 3 0.5365172 -1.0883317 C    low    high
# 4 0.4491650 -3.0110517 D    low    high
# 5 1.7360404 -0.5931743 E medium    high
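
Since the c(...)[...] idiom can be hard to read at first, here is a step-by-step sketch of what it does. The names level_names and score are made up for illustration, and only the first two rows of the example data are used.

```r
level_names <- c("low", "medium", "high")

# Indexing a vector with positions returns one element per position
level_names[c(1, 2, 1)]
# [1] "low"    "medium" "low"

# sign(x - 1) is -1 below 1 and +1 above 1; averaging n and m per row
# gives -1 (both below), 0 (mixed) or 1 (both above)
n <- c(0.9154139, 1.8404001)
m <- c(-0.1078814, -0.1702891)
score <- rowMeans(sign(cbind(n, m) - 1))
score
# [1] -1  0

# Adding 2 shifts the scores to the positions 1, 2, 3
level_names[score + 2]
# [1] "low"    "medium"

# A logical condition plus 1 turns FALSE/TRUE into positions 1/2
c("high", "low A/B")[c(TRUE, FALSE) + 1]
# [1] "low A/B" "high"
```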
  • +1
    Counting, per row, how many of the numeric columns are below 1 gives us

    rowSums(df[,-3] <1) #in this example, there are no values equal to 0
    #[1] 2 1 2 2 1
    

    Indexing the label vector with those row sums plus one returns the levels directly:

      c("high", "medium", "low")[rowSums(df[,-3] <1)+1]
      #[1] "low"    "medium" "low"    "low"    "medium"

    Comment: Yes, it works nicely. But I was hoping for a more "generic" way to produce such conditional assignments? What if level1 isn't?

    Comment: Could you explain this syntax (also in Sven's solution)? I'm not sure I understand c(…)[…]. Thanks!

    This is more of an extended comment than an answer, and it may not be exactly what you want.

    Typically, when I need to capture groups of continuous variables and turn them into a single categorical variable, I use clustering and name the clusters according to the values they produce. Here is an example using kmeans:
    
    set.seed(8)
    df <- data.frame(n = rnorm(5000,1), m = rnorm(5000,0), l = factor(LETTERS[1:5]))
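    # Fit once and reuse the result, so the printed cluster means describe
    # the same clusters that are stored in df$Category
    km <- kmeans(df[1:2],7)
    df$Category <- km$cluster
    
    km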
    K-means clustering with 7 clusters of sizes 593, 606, 649, 626, 641, 1219, 666
    
    Cluster means:
               n           m
    1 -0.2097451  0.84837728 # Low-High
    2  1.0977826  1.44383531 # Mid-Upper
    3  2.1682482 -0.70983193 # High-Low
    4 -0.3389432 -0.54514302 # Low-Low
    5  2.3332772  0.67415808 # High-Mid
    6  0.9816709 -0.01549909 # Upper-Mid
    7  0.8859904 -1.46126667 # Mid-Low
    
    # Name the clusters; `labels` maps the cluster numbers 1..7 to names in order
    df$Category <- factor(df$Category, labels = c("Low-High","Mid-Upper","High-Low","Low-Low",...))
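
    When naming clusters this way, it may help to remember that factor() takes the new names through labels, while its second positional argument is levels. A small sketch with made-up cluster numbers (the vector cluster and the three names are hypothetical, not taken from the fit above):

```r
# Hypothetical cluster numbers, in the form kmeans()$cluster returns them
cluster <- c(1, 3, 2, 1)

# `labels` renames the existing values 1, 2, 3 in sorted order
named <- factor(cluster, labels = c("Low", "Mid", "High"))
as.character(named)
# [1] "Low"  "High" "Mid"  "Low"

# Passing the names as `levels` instead looks for those strings in the data,
# finds none, and yields only NA
mismatched <- factor(cluster, levels = c("Low", "Mid", "High"))
as.character(mismatched)
# [1] NA NA NA NA
```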