R: merge duplicate rows by appending columns (R, Duplicates)


I have a large dataset containing text comments and their ratings on different variables, like this:

df <- data.frame(
  comment = c("commentA","commentB","commentB","commentA","commentA","commentC"),
  sentiment = c(1,2,1,4,1,2),
  tone = c(1,5,3,2,6,1)
)

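I tried reshape() with idvar = "comment", timevar = c("sentiment","tone") and direction = "wide", but this produces every possible combination of sentiment and tone rather than simply repeating sentiment and tone independently for each comment.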

I also tried using gather, for example df %>% gather(key, value, -comment), but that only gets me halfway there.

Can anyone point me in the right direction?

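You need to create a variable that supplies the number used in the new column names: rowid(comment).

In dcast, the row identifier goes on the left-hand side of the ~ and the column identifier on the right-hand side. value.var is then a character vector of all the columns you want to include in the long-to-wide conversion.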
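To make the role of rowid(comment) concrete, here is a minimal sketch that only computes the counter for the comment values from the question. The variable name occurrence is purely illustrative and not part of the original answer.

```r
library(data.table)

# Same comment values as in the question's data frame
comment <- c("commentA", "commentB", "commentB",
             "commentA", "commentA", "commentC")

# rowid() numbers each repeat of a value: 1 for its first
# appearance, 2 for its second, and so on
occurrence <- rowid(comment)

print(data.frame(comment, occurrence))
# occurrence is 1 1 2 2 3 1
```

These counters are what end up as the _1, _2, _3 suffixes in the wide column names.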

library(data.table)
setDT(df)

dcast(df, comment ~ rowid(comment), value.var = c('sentiment', 'tone'))

#     comment sentiment_1 sentiment_2 sentiment_3 tone_1 tone_2 tone_3
# 1: commentA           1           4           1      1      2      6
# 2: commentB           2           1          NA      5      3     NA
# 3: commentC           2          NA          NA      1     NA     NA
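Since the question started from tidyr, the same idea can probably be expressed there as well. The sketch below is an alternative to the data.table answer, not the answerer's method: it assumes dplyr and tidyr are installed, and the helper column name occurrence is made up for illustration.

```r
library(dplyr)
library(tidyr)

df <- data.frame(
  comment   = c("commentA", "commentB", "commentB",
                "commentA", "commentA", "commentC"),
  sentiment = c(1, 2, 1, 4, 1, 2),
  tone      = c(1, 5, 3, 2, 6, 1)
)

wide <- df %>%
  group_by(comment) %>%
  mutate(occurrence = row_number()) %>%  # counter within each comment
  ungroup() %>%
  # one column per (value column, occurrence) pair, e.g. sentiment_1
  pivot_wider(names_from = occurrence,
              values_from = c(sentiment, tone))

print(wide)
```

As with dcast, comments that have fewer repeats get NA in the columns they cannot fill.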

The reshape() attempt referred to in the question:

reshape(df, 
  idvar = "comment",
  timevar = c("sentiment","tone"), 
  direction = "wide"
)
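The reshape() call above uses the value columns themselves as timevar, which is why every sentiment/tone combination turns up. A minimal base R sketch of the same fix as in the answer is to add a per-comment counter and use that as timevar instead. The column name id is an assumption chosen for illustration.

```r
df <- data.frame(
  comment   = c("commentA", "commentB", "commentB",
                "commentA", "commentA", "commentC"),
  sentiment = c(1, 2, 1, 4, 1, 2),
  tone      = c(1, 5, 3, 2, 6, 1)
)

# Number the repeats of each comment: 1, 2, 3, ...
df$id <- ave(seq_along(df$comment), df$comment, FUN = seq_along)

# The counter, not the value columns, is the "time" variable
wide <- reshape(df,
  idvar = "comment",
  timevar = "id",
  direction = "wide"
)

print(wide)
```

Base reshape() separates name and counter with a dot (sentiment.1, tone.1, ...) rather than the underscore that dcast uses.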
