R: How to prepend the variable (column) name to every row string in a column
I want to prepend the column name to every string in that column. Here is a small data frame:
df <-structure(list(CoA = c("Baton Rouge", "Birmingham", "Chattanooga",
"Columbia", "Houston"), CoB = c("Haddonfield, NJ", "Haddonfield, NJ",
"Philadelphia, PA", "Hackensack, NJ", "Princeton, NJ"), CoC = c("St. Louis, Missouri",
"Kansas City, Missouri", "Jefferson City, Missouri", "Belleville, Illinois",
"Overland Park, Kansas")), .Names = c("CoA", "CoB", "CoC"), row.names = c(NA,
-5L), class = "data.frame")
I tried various regular expressions but got nowhere. How can I get sapply to perform this paste operation on every column?

Here is one possible solution:
mx <- sapply(colnames(df),function(name){ paste(name,df[,name],sep=", ")})
> mx
CoA CoB CoC
[1,] "CoA, Baton Rouge" "CoB, Haddonfield, NJ" "CoC, St. Louis, Missouri"
[2,] "CoA, Birmingham" "CoB, Haddonfield, NJ" "CoC, Kansas City, Missouri"
[3,] "CoA, Chattanooga" "CoB, Philadelphia, PA" "CoC, Jefferson City, Missouri"
[4,] "CoA, Columbia" "CoB, Hackensack, NJ" "CoC, Belleville, Illinois"
[5,] "CoA, Houston" "CoB, Princeton, NJ" "CoC, Overland Park, Kansas"
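The function is called once for each element of colnames(df), and each element is passed as the first argument (that is, the parameter name). Using name (remember, it is a column name) we select one column of df, prepend the column name with paste, and return the resulting character vector. The rest is left to sapply, which automatically binds the result vectors into a matrix (because simplify=TRUE by default; otherwise it would return a list of vectors, as lapply does).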
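To make the explanation above concrete, the anonymous function can be given a name and called by hand on one column before handing it to sapply. This is only a minimal sketch on a reduced data frame; df_small, prefix_col, one_col and mx_small are names introduced here for illustration and do not appear in the original answer.

```r
# Reduced example data (cut down from the question's df, for illustration only)
df_small <- data.frame(
  CoA = c("Baton Rouge", "Birmingham"),
  CoB = c("Haddonfield, NJ", "Philadelphia, PA"),
  stringsAsFactors = FALSE
)

# The same function that the answer passes to sapply, given a name for clarity
prefix_col <- function(name) paste(name, df_small[, name], sep = ", ")

# A single call handles one column and returns a character vector
one_col <- prefix_col("CoA")

# sapply calls it once per column name and binds the vectors into a matrix
mx_small <- sapply(colnames(df_small), prefix_col)
```

Here one_col holds the two prefixed strings for CoA, and mx_small is a character matrix with one column per column name.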
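Edit:
As @hadley rightly pointed out, the result of sapply with simplify=TRUE does not always have the same shape (for example, it can change if you have only one row or one column). So here is a safer solution: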
df2 <- as.data.frame(sapply(colnames(df),
                            function(name){ paste(name,df[,name],sep=", ")},
                            simplify=FALSE))
> df2
CoA CoB CoC
1 CoA, Baton Rouge CoB, Haddonfield, NJ CoC, St. Louis, Missouri
2 CoA, Birmingham CoB, Haddonfield, NJ CoC, Kansas City, Missouri
3 CoA, Chattanooga CoB, Philadelphia, PA CoC, Jefferson City, Missouri
4 CoA, Columbia CoB, Hackensack, NJ CoC, Belleville, Illinois
5 CoA, Houston CoB, Princeton, NJ CoC, Overland Park, Kansas
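The shape problem mentioned in the edit can be checked on a one-row data frame. The following is a sketch under that assumption; df_one, f, simplified and kept are illustrative names, not part of the original answer.

```r
# A one-row data frame (illustrative data taken from the last row of df)
df_one <- data.frame(CoA = "Houston", CoB = "Princeton, NJ",
                     stringsAsFactors = FALSE)

f <- function(name) paste(name, df_one[, name], sep = ", ")

# With the default simplify=TRUE, length-one results collapse to a named
# character vector instead of a matrix
simplified <- sapply(colnames(df_one), f)

# With simplify=FALSE the result stays a named list, so as.data.frame
# yields a data frame with one row and two columns
kept <- as.data.frame(sapply(colnames(df_one), f, simplify = FALSE),
                      stringsAsFactors = FALSE)
```

Comparing is.matrix(simplified) with is.data.frame(kept) shows why the second form is the more predictable one.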
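Great. Would you mind adding a short explanation? I'm not sure about your use of "name".

If you want to keep it as a data frame, it is better to use lapply() rather than sapply().

@lawyeR: added an explanation. @hadley: lapply would return a list of vectors, so we would still need to call do.call(cbind, ...) to create the data.frame. I find the sapply version shorter…

@digEmAll sapply() is dangerous because you never know what the output will be. as.data.frame(lapply(...)) is safer, and it is the same length as the equivalent call using sapply().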
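The lapply() variant suggested in the comments could look roughly like the sketch below. It runs on a reduced data frame; df_small, cols, prefixed and df_prefixed are illustrative names. One assumption worth flagging: lapply() over a plain character vector returns an unnamed list, so setNames() is used here to restore the column names before conversion.

```r
# Reduced example data (cut down from the question's df, for illustration only)
df_small <- data.frame(
  CoA = c("Baton Rouge", "Birmingham"),
  CoB = c("Haddonfield, NJ", "Philadelphia, PA"),
  stringsAsFactors = FALSE
)

cols <- colnames(df_small)

# lapply always returns a list, whatever the number of rows or columns
prefixed <- lapply(cols, function(name) {
  paste(name, df_small[[name]], sep = ", ")
})

# Name the list elements after the columns, then convert to a data frame
df_prefixed <- as.data.frame(setNames(prefixed, cols),
                             stringsAsFactors = FALSE)
```

The result is a data frame with the same column names as the input, which is what the question asked for.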