R: substring on a data.table column
Given this data.table:
library(data.table)
DT = data.table(item=c("item1 - description one", "item2 - description two", "item3 - description three"), sales=1:3)
DT
                        item sales
1:   item1 - description one     1
2:   item2 - description two     2
3: item3 - description three     3
How can I easily get output like the following?
    code sales
1: item1     1
2: item2     2
3: item3     3
It is probably simple, but thanks in advance.

You can do it like this:
DT[, item:=gsub(item, pattern=" - [a-zA-Z ]+", replacement="")]
setnames(DT, "item", "code")
#    code sales
#1: item1     1
#2: item2     2
#3: item3     3
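One caveat that may be worth checking: the character class `[a-zA-Z ]` covers only letters and spaces, so a description containing digits or punctuation would not be stripped. Below is a minimal base-R sketch, assuming the code always ends at the first " - ". The fourth item is a hypothetical value added purely for illustration.

```r
# Item strings from the question, plus one hypothetical value containing a digit
item <- c("item1 - description one", "item2 - description two",
          "item3 - description three", "item4 - 2nd description")

# Pattern from the answer above: the hypothetical fourth value is left unchanged
strict <- gsub(item, pattern = " - [a-zA-Z ]+", replacement = "")

# Drop everything from the first " - " onward, whatever characters follow
loose <- sub(" - .*$", "", item)

print(strict)
print(loose)
```

Inside the data.table, the looser pattern could be applied the same way as the original, for example `DT[, item := sub(" - .*$", "", item)]`.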
Split on whitespace and keep the first value. Something like:
dt1$item_clean ...
Simply:
DT[, item := sub('\\s+.*', '', item)]
data.table(code=substr(DT[[1]], 1, regexpr(' ', DT[[1]])-1), sales=DT[[2]])
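For the split-on-whitespace idea mentioned above, a minimal base-R sketch might look like the following. It is an assumption about what was intended, not the original answerer's code, and it works on a plain character vector so it runs without any packages.

```r
# Same item strings as in the question
item <- c("item1 - description one", "item2 - description two",
          "item3 - description three")

# Split each string on runs of whitespace; the result is a list of character vectors
parts <- strsplit(item, "\\s+")

# Keep the first piece of each split; vapply guarantees a character vector back
code <- vapply(parts, `[`, character(1), 1)
print(code)
```

Within the data.table from the question, the same expression could be used in `j`, for example `DT[, code := vapply(strsplit(item, "\\s+"), `[`, character(1), 1)]`.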
Thanks, everyone, for the quick replies... sorry for the duplicate.
Thanks, Rafael, for the quick reply. Works like a charm.