R: Add a column based on strings in another column


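I want to use the transform function to add a column to my data frame. One of my columns contains strings. I want to look for certain strings in it and add another column based on them.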

UNIT.NO. USAGE..kWh.month.
     A1               863
     A1              1339
     D3              1058
     D1               782
     L1              1339
     L7              1058
     L1               782
I want to add another column that categorizes the data, giving the following result:

UNIT.NO. USAGE..kWh.month.   Category
     A1               863       A
     A1              1339       A
     D3              1058       D
     D1               782       D
     L1              1339       L
     L7              1058       L
     L1               782       L
I used the following code, but it does not work:

dataset.1<-transform(
  dataset.1,
  Category=
    if(grepl("A",dataset.1$UNIT.NO.)==T){
      "A"
    } else 
      if(grepl("D",dataset.1$UNIT.NO.)==T){
        "D"
      } else 
        if(grepl("L",dataset.1$UNIT.NO.)==T){
          "L"
        }else{
              "Other"
            }
)
R warns that the condition has length > 1 and only the first element will be used.

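As a result, all my Category values are now A; the other letters are not filled in according to the unit number. What is the best way to add a column like this?

I need these categories to run a nonparametric analysis. Thanks in advance.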

One option is:

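# Strip the digits so only the letter prefix remains
indx <- gsub("[0-9]", "", df1$UNIT.NO.)
known <- indx %in% c("A","D","L")
df1$Category <- "Other"
# Index both sides with the same logical vector so the lengths match
df1$Category[known] <- indx[known]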
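
To see the digit-stripping idea run end to end, here is a minimal sketch on a made-up data frame. The name df1 and the extra "XX1" row are assumptions added only so that the "Other" branch is exercised; the remaining rows mirror the sample in the question.

```r
# Hypothetical sample data: the question's rows plus one non-matching unit
df1 <- data.frame(
  UNIT.NO. = c("A1", "A1", "D3", "D1", "L1", "L7", "L1", "XX1"),
  USAGE..kWh.month. = c(863, 1339, 1058, 782, 1339, 1058, 782, 782),
  stringsAsFactors = FALSE
)

# Remove every digit, leaving only the letter prefix ("A1" -> "A", "XX1" -> "XX")
indx <- gsub("[0-9]", "", df1$UNIT.NO.)

# Rows whose prefix is one of the wanted categories
known <- indx %in% c("A", "D", "L")

# Default everything to "Other", then overwrite only the known rows
df1$Category <- "Other"
df1$Category[known] <- indx[known]

df1
```

Subsetting both sides of the assignment with the same logical vector keeps the replacement the same length as the rows being replaced, which matters as soon as at least one row falls into "Other".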

Use substr to get the first letter:

dataset.1$Category <- ifelse(substr(dataset.1$"UNIT.NO.",1,1) %in% c("A","D","L"), 
                             substr(dataset.1$"UNIT.NO.",1,1),
                             "other")
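
Since the question asks specifically for transform, the same test can probably be written inside it. The sketch below assumes a small stand-in for dataset.1 (the "XX1" row is hypothetical). The key point is that if() expects a single TRUE or FALSE, while ifelse() tests every row, which is why the original attempt labelled everything "A".

```r
# Hypothetical stand-in for the asker's data frame
dataset.1 <- data.frame(
  UNIT.NO. = c("A1", "D3", "L7", "XX1"),
  USAGE..kWh.month. = c(863, 1058, 1058, 782),
  stringsAsFactors = FALSE
)

# Inside transform(), columns can be referred to by name.
# ifelse() is vectorised, so each row gets its own category.
dataset.1 <- transform(
  dataset.1,
  Category = ifelse(substr(UNIT.NO., 1, 1) %in% c("A", "D", "L"),
                    substr(UNIT.NO., 1, 1),
                    "Other")
)

dataset.1
```

Depending on the R version, transform may store the new column as a factor rather than character; as.character(dataset.1$Category) gives a plain character vector either way.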
There are many ways to do this:

#dummy data
dataset.1 <- read.table(text="
UNIT.NO. USAGE..kWh.month.
A1               863
A1              1339
D3              1058
D1               782
L1              1339
L7              1058
L1               782
XX1               782", header=TRUE)

#using your approach - nested ifelse
dataset.1$CategoryIfElse <-
  ifelse(grepl("A",dataset.1$UNIT.NO.)==T,"A",
         ifelse(grepl("D",dataset.1$UNIT.NO.)==T,"D",
                ifelse(grepl("L",dataset.1$UNIT.NO.)==T,"L","Other")))

#using substr
dataset.1$CategorySusbstr <-
  substr(dataset.1$"UNIT.NO.",1,1)
dataset.1$CategorySusbstr <- 
  factor(dataset.1$CategorySusbstr,levels=c("A","D","L","Other"))
dataset.1$CategorySusbstr[ is.na(dataset.1$CategorySusbstr)] <- "Other"

#result
dataset.1

# UNIT.NO. USAGE..kWh.month. CategoryIfElse CategorySusbstr
# 1       A1               863              A               A
# 2       A1              1339              A               A
# 3       D3              1058              D               D
# 4       D1               782              D               D
# 5       L1              1339              L               L
# 6       L7              1058              L               L
# 7       L1               782              L               L
# 8      XX1               782          Other           Other
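
One caveat worth checking on real data: the two columns above agree here, but the approaches are not equivalent in general. grepl("A", ...) matches an "A" anywhere in the string, while substr(..., 1, 1) only looks at the first character. A short sketch with a made-up unit number "LA2" (not from the question) shows the difference, and how anchoring the pattern with "^" lines the two up:

```r
# "LA2" is a hypothetical unit number used only to show the difference
units <- c("A1", "D3", "LA2")

grepl("A", units)            # TRUE FALSE TRUE: "A" anywhere in the string
grepl("^A", units)           # TRUE FALSE FALSE: "A" only at the start
substr(units, 1, 1) == "A"   # TRUE FALSE FALSE: first character only
```

With the nested ifelse version, "LA2" would be labelled "A" because the "A" test runs first; with substr it would be labelled "L".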
Why not just use

dataset.1$Category <- substr(dataset.1$"UNIT.NO.",1,1)

@zx8754, because the Other case also has to be handled?