R: add a column using the strings in another column
I want to use the transform function to add a column to my data frame. One of my columns contains strings as elements. I want to find certain strings and add another column based on them.
UNIT.NO. USAGE..kWh.month.
A1 863
A1 1339
D3 1058
D1 782
L1 1339
L7 1058
L1 782
I want to add another column that classifies the data into categories, giving the following result:
UNIT.NO. USAGE..kWh.month. Category
A1 863 A
A1 1339 A
D3 1058 D
D1 782 D
L1 1339 L
L7 1058 L
L1 782 L
I used the following code, but it does not work:
dataset.1<-transform(
dataset.1,
Category=
if(grepl("A",dataset.1$UNIT.NO.)==T){
"A"
} else
if(grepl("D",dataset.1$UNIT.NO.)==T){
"D"
} else
if(grepl("L",dataset.1$UNIT.NO.)==T){
"L"
}else{
"Other"
}
)
R warns that the condition has length > 1 and that only the first element will be used.
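As a result, all my Category values are now A; the letter is not set according to each row's unit number. What is the best way to add such a column?
I need these categories to perform a non-parametric analysis.
Thanks in advance.
One option is: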
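# strip the digits, leaving only the letter part of each unit number
indx <- gsub("[0-9]", "", df1$UNIT.NO.)
keep <- indx %in% c("A", "D", "L")
df1$Category <- "Other"
# subset both sides so the lengths match when some rows are "Other"
df1$Category[keep] <- indx[keep]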
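To make the intermediate step visible, here is a minimal sketch of the digit-stripping idea on made-up unit numbers. The data frame contents are an assumption for illustration; the XX1 row is hypothetical and is only there to show the fallback.

```r
# Hypothetical data frame named df1, mirroring the name used in the answer
df1 <- data.frame(UNIT.NO. = c("A1", "D3", "L7", "XX1"),
                  stringsAsFactors = FALSE)

# Remove every digit, leaving only the letter part of each unit number
indx <- gsub("[0-9]", "", df1$UNIT.NO.)

# Default every row to "Other", then overwrite only rows with a known letter
known <- indx %in% c("A", "D", "L")
df1$Category <- "Other"
df1$Category[known] <- indx[known]

df1
```

Indexing both sides with the same logical vector keeps the replacement the same length as the rows being replaced, which matters as soon as one row falls outside A, D and L.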
Use substr to get the first letter:
dataset.1$Category <- ifelse(substr(dataset.1$"UNIT.NO.",1,1) %in% c("A","D","L"),
substr(dataset.1$"UNIT.NO.",1,1),
"other")
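Since the question asked about transform, the same idea can probably be written inside transform(), where columns can be referred to by name. This is only a sketch on a small made-up data frame; the XX1 row is an assumption added to exercise the fallback value.

```r
# Small example data; the XX1 row is hypothetical, added to show the fallback
dataset.1 <- data.frame(
  UNIT.NO. = c("A1", "D3", "L7", "XX1"),
  USAGE..kWh.month. = c(863, 1058, 1058, 782),
  stringsAsFactors = FALSE
)

# ifelse() is vectorised, so it tests every row rather than only the first
dataset.1 <- transform(
  dataset.1,
  Category = ifelse(substr(UNIT.NO., 1, 1) %in% c("A", "D", "L"),
                    substr(UNIT.NO., 1, 1),
                    "Other")
)

dataset.1
```

The original attempt failed because if() expects a single TRUE or FALSE, while ifelse() evaluates the condition element by element.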
There are many ways to do this:
#dummy data
dataset.1 <- read.table(text="
UNIT.NO. USAGE..kWh.month.
A1 863
A1 1339
D3 1058
D1 782
L1 1339
L7 1058
L1 782
XX1 782", header=TRUE)
#using your approach - nested ifelse
dataset.1$CategoryIfElse <-
ifelse(grepl("A",dataset.1$UNIT.NO.)==T,"A",
ifelse(grepl("D",dataset.1$UNIT.NO.)==T,"D",
ifelse(grepl("L",dataset.1$UNIT.NO.)==T,"L","Other")))
#using substr
dataset.1$CategorySusbstr <-
substr(dataset.1$"UNIT.NO.",1,1)
dataset.1$CategorySusbstr <-
factor(dataset.1$CategorySusbstr,levels=c("A","D","L","Other"))
dataset.1$CategorySusbstr[ is.na(dataset.1$CategorySusbstr)] <- "Other"
#result
dataset.1
# UNIT.NO. USAGE..kWh.month. CategoryIfElse CategorySusbstr
# 1 A1 863 A A
# 2 A1 1339 A A
# 3 D3 1058 D D
# 4 D1 782 D D
# 5 L1 1339 L L
# 6 L7 1058 L L
# 7 L1 782 L L
# 8 XX1 782 Other Other
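One thing to watch with the grepl approach: the pattern "A" matches an A anywhere in the string, not only at the start. If unit numbers could contain these letters in other positions (an assumption; the data shown here do not), anchoring the pattern with ^ may be safer. A minimal sketch with made-up values:

```r
# Hypothetical unit numbers; "DA1" contains an A but starts with D
units <- c("A1", "DA1", "L7", "XX1")

# Unanchored: "DA1" is classified as A because grepl("A", ...) matches anywhere
unanchored <- ifelse(grepl("A", units), "A",
              ifelse(grepl("D", units), "D",
              ifelse(grepl("L", units), "L", "Other")))

# Anchored with ^: only the leading letter decides the category
anchored <- ifelse(grepl("^A", units), "A",
            ifelse(grepl("^D", units), "D",
            ifelse(grepl("^L", units), "L", "Other")))

data.frame(units, unanchored, anchored)
```

The substr approach avoids this problem by construction, since it only ever looks at the first character.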
Why not use the following?
dataset.1$Category <- substr(dataset.1$"UNIT.NO.",1,1)
@zx8754 Because he also has to handle the "Other" case.