How to create a new column in an R data frame from the sorted items of several columns

I have a data frame in R that looks like this:
df <-
data.frame(
"first_col" = c("apple", "apple", "banana", "banana", "cacao", "dough"),
"second_col" = c("apple", "apple", "banana", "banana", "apple", "dough"),
"third_col" = c("banana", "apple", "banana", "banana", "banana", "apple"),
stringsAsFactors = FALSE
)
It prints as:

  first_col second_col third_col
1     apple      apple    banana
2     apple      apple     apple
3    banana     banana    banana
4    banana     banana    banana
5     cacao      apple    banana
6     dough      dough     apple

I tried to build the new column from the sorted values of the three columns like this:

df$label <- paste(sort(df$first_col,
                       df$second_col,
                       df$third_col),
                  sep = " - ")
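Clearly I am doing something wrong. Looking at the documentation, sort() seems to expect a single vector, so I tried combining the columns into one vector first: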
df$label <- paste(sort(c(df$first_col,
df$second_col,
df$third_col)),
sep = " - ")
I would like to get something like this:
  first_col second_col third_col                    label
1     apple      apple    banana   apple - apple - banana
2     apple      apple     apple    apple - apple - apple
3    banana     banana    banana banana - banana - banana
4    banana     banana    banana banana - banana - banana
5     cacao      apple    banana   apple - banana - cacao
6     dough      dough     apple    apple - dough - dough
You can tell that the values are sorted by looking at rows 5 and 6.

Using base R:
df$combined <- apply(df, 1, function(x) paste(sort(x), collapse = "-"))
df
  first_col second_col third_col             combined
1     apple      apple    banana   apple-apple-banana
2     apple      apple     apple    apple-apple-apple
3    banana     banana    banana banana-banana-banana
4    banana     banana    banana banana-banana-banana
5     cacao      apple    banana   apple-banana-cacao
6     dough      dough     apple    apple-dough-dough
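As a side note on why the attempts in the question did not work, here is a minimal sketch. It assumes the same six-row data as the question; the variable names (x, with_sep, with_collapse, pooled) are made up for illustration. As far as I can tell, the first attempt fails because the second positional argument of sort() is decreasing, not another vector to sort. The second attempt pools all three columns into one vector and then uses sep, which only separates different arguments of paste().

```r
# One row's worth of values (row 5 of the question's data).
x <- c("cacao", "apple", "banana")

# sep separates *different arguments* of paste(); with a single vector there
# is nothing to separate, so the sorted elements come back one by one.
with_sep <- paste(sort(x), sep = " - ")

# collapse glues the elements of one vector into a single string.
with_collapse <- paste(sort(x), collapse = " - ")

# c() on whole columns pools every row into one long vector, so the result
# no longer has one entry per row.
df <- data.frame(
  first_col  = c("apple", "apple", "banana", "banana", "cacao", "dough"),
  second_col = c("apple", "apple", "banana", "banana", "apple", "dough"),
  third_col  = c("banana", "apple", "banana", "banana", "banana", "apple"),
  stringsAsFactors = FALSE
)
pooled <- c(df$first_col, df$second_col, df$third_col)

print(with_sep)        # three separate strings
print(with_collapse)   # one joined string
print(length(pooled))  # 18 values for a 6-row data frame
```

So the sorting and the paste(..., collapse = ) have to happen once per row, which is what apply(df, 1, ...) does.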
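Can you add your expected output?

@NelsonGon I have updated the question; I hope it is clearer now.

Thanks, that does it. What if I only want to use two of my three columns to build the new column? Your solution uses all columns. I could select the columns first and then apply the method, but that does not seem like the best way to me.

If you only want to use columns 1 and 2, you can select them like this: df[c(1,2)]. Note that the answer names df as df3, because I have some other df objects that are used for other answers.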
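To make the column selection from the comments concrete, here is a small sketch under the assumption that only first_col and second_col should feed the new column. The names cols and label_two are hypothetical. Selecting by name also keeps a previously added result column (such as combined) out of the sort if the line is run a second time.

```r
df <- data.frame(
  first_col  = c("apple", "apple", "banana", "banana", "cacao", "dough"),
  second_col = c("apple", "apple", "banana", "banana", "apple", "dough"),
  third_col  = c("banana", "apple", "banana", "banana", "banana", "apple"),
  stringsAsFactors = FALSE
)

# Columns that should feed the new column (hypothetical choice).
cols <- c("first_col", "second_col")

# apply() sees only the selected columns, one row at a time.
df$label_two <- apply(df[cols], 1, function(x) paste(sort(x), collapse = "-"))

print(df)
```

Row 5 is the one to look at: cacao and apple should come out as apple-cacao.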
df <- structure(list(first_col = c("apple", "apple", "banana", "banana",
"cacao", "dough"), second_col = c("apple", "apple", "banana",
"banana", "apple", "dough"), third_col = c("banana", "apple",
"banana", "banana", "banana", "apple"), sorted = c("apple-apple-banana",
"apple-apple-apple", "banana-banana-banana", "banana-banana-banana",
"apple-banana-cacao", "apple-dough-dough")), row.names = c(NA,
-6L), class = "data.frame")
Using dplyr mutate() and purrr pmap() (here its character variant, pmap_chr()):

library(dplyr)
library(purrr)

df <-
  data.frame(
    "first_col" = c("apple", "apple", "banana", "banana", "cacao", "dough"),
    "second_col" = c("apple", "apple", "banana", "banana", "apple", "dough"),
    "third_col" = c("banana", "apple", "banana", "banana", "banana", "apple"),
    stringsAsFactors = FALSE
  )

# pmap_chr() walks the three columns in parallel, one row at a time, and
# returns a character vector; plain pmap() would produce a list column.
df %>%
  mutate(label = pmap_chr(
    list(first_col, second_col, third_col),
    function(x, y, z) paste(sort(c(x, y, z)), collapse = " - ")
  ))
#   first_col second_col third_col                    label
# 1     apple      apple    banana   apple - apple - banana
# 2     apple      apple     apple    apple - apple - apple
# 3    banana     banana    banana banana - banana - banana
# 4    banana     banana    banana banana - banana - banana
# 5     cacao      apple    banana   apple - banana - cacao
# 6     dough      dough     apple    apple - dough - dough
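For readers without the tidyverse installed, the same row-wise idea can be sketched in base R with mapply(), which, like pmap(), walks several vectors in parallel. This is only an illustration of the technique, not part of the answer above; the helper name sort_and_join is made up.

```r
df <- data.frame(
  first_col  = c("apple", "apple", "banana", "banana", "cacao", "dough"),
  second_col = c("apple", "apple", "banana", "banana", "apple", "dough"),
  third_col  = c("banana", "apple", "banana", "banana", "banana", "apple"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: sort the values of one row and join them.
sort_and_join <- function(x, y, z) paste(sort(c(x, y, z)), collapse = " - ")

# mapply() calls the helper once per row, taking one element from each column.
# USE.NAMES = FALSE stops the values of first_col being used as names.
df$label <- mapply(sort_and_join,
                   df$first_col, df$second_col, df$third_col,
                   USE.NAMES = FALSE)

print(df)
```

Because the columns are passed explicitly, this also covers the case where only some of the columns should be used: drop an argument from both the helper and the call.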