R: How to reduce a data frame to a single row using vectors
I have this DF:
email date user_ipaddress other data
1 x@bla.com 2020-03-24 177.95.75.230 xxxx
2 x@bla.com 2020-04-02 177.139.49.93 yyyy
3 x@bla.com 2020-04-02 177.139.49.93 zzzz
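I want to reshape this data into the form in which it will be stored.
The full problem involves a large data frame containing many different emails, and I want to reduce all the data for each email to a single row, like this: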
email date user_ipaddress other data
1 x@bla.com 2020-04-02 c('177.95.75.230','177.139.49.93') c('xxxx','yyyy','zzzz')
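In fact, if someone could help me with the case of a single email address, that alone would save my life, but feel free to help with the whole problem.
Using
ipadreessVec<-Reduce(append,x =df$network_userid)
I get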
Error in `$<-.data.frame`(`*tmp*`, network_userid, value = c("20562206-f557-48a3-861b-5d1e18524bbb", :
replacement has 3 rows, data has 1
We can create list columns by grouping by 'email' and 'date':
library(dplyr)
DF %>%
group_by(email, date) %>%
summarise_all(list)
# A tibble: 2 x 4
# Groups: email [1]
# email date user_ipaddress otherdata
# <chr> <chr> <list> <list>
#1 x@bla.com 2020-03-24 <chr [1]> <chr [1]>
#2 x@bla.com 2020-04-02 <chr [2]> <chr [2]>
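The question asks for one row per email with only the distinct values kept. A minimal sketch of that variant, assuming dplyr 1.0.0 or later (where across() is available) and assuming the last column is named other_data; keeping the latest date per email is also an assumption based on the desired output shown in the question:

```r
library(dplyr)

# Sample data from the question; the column name other_data is an assumption.
DF <- data.frame(
  email = c("x@bla.com", "x@bla.com", "x@bla.com"),
  date = c("2020-03-24", "2020-04-02", "2020-04-02"),
  user_ipaddress = c("177.95.75.230", "177.139.49.93", "177.139.49.93"),
  other_data = c("xxxx", "yyyy", "zzzz"),
  stringsAsFactors = FALSE
)

out <- DF %>%
  group_by(email) %>%
  summarise(
    # ISO-formatted date strings sort correctly, so max() picks the latest one
    date = max(date),
    # wrap the distinct values of each column in a list: one list element per email
    across(c(user_ipaddress, other_data), ~ list(unique(.x))),
    .groups = "drop"
  )

# Each list column holds a character vector per row
out$user_ipaddress[[1]]
```

Here out has a single row for x@bla.com, and out$user_ipaddress[[1]] is a character vector of the two distinct addresses.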
I may have misunderstood you, and you more likely want something like what @akrun shows, but taken literally, you may want to use dput:
as.data.frame(lapply(df, function(x) capture.output(dput(unique(x)))))
#>         email                         date                        user_ipaddress
#> 1 "x@bla.com" c("2020-03-24", "2020-04-02") c("177.95.75.230", "177.139.49.93")
#>                       other
#> 1 c("xxxx", "yyyy", "zzzz")
By email and date:
library(data.table)
setDT(df)[, .(user_ipaddress = paste0(user_ipaddress, collapse = ","),
other = paste0(other_data, collapse = ",")), by = .(email, date)]
# email date user_ipaddress other
# 1: x@bla.com 2020-03-24 177.95.75.230 xxxx
# 2: x@bla.com 2020-04-02 177.139.49.93,177.139.49.93 yyyy,zzzz
By email only:
setDT(df)[, .(date = paste0(date, collapse = ","),
user_ipaddress = paste0(user_ipaddress, collapse = ","),
other = paste0(other_data, collapse = ",")), by = .(email)]
# email date user_ipaddress other
# 1: x@bla.com 2020-03-24,2020-04-02,2020-04-02 177.95.75.230,177.139.49.93,177.139.49.93 xxxx,yyyy,zzzz
Data:
df <- read.table(text='email date user_ipaddress other_data
1 x@bla.com 2020-03-24 177.95.75.230 xxxx
2 x@bla.com 2020-04-02 177.139.49.93 yyyy
3 x@bla.com 2020-04-02 177.139.49.93 zzzz', header = TRUE, stringsAsFactors = FALSE)
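The pasted strings above repeat 177.139.49.93, while the desired output in the question lists each value once. A minimal sketch of a de-duplicated variant, assuming that wrapping each column in unique() before pasting is what is wanted (the names dt and res are only illustrative):

```r
library(data.table)

# Same sample data as above
df <- data.frame(
  email = c("x@bla.com", "x@bla.com", "x@bla.com"),
  date = c("2020-03-24", "2020-04-02", "2020-04-02"),
  user_ipaddress = c("177.95.75.230", "177.139.49.93", "177.139.49.93"),
  other_data = c("xxxx", "yyyy", "zzzz"),
  stringsAsFactors = FALSE
)

dt <- as.data.table(df)  # work on a copy instead of converting df in place

# unique() drops repeated values before they are collapsed into one string
res <- dt[, .(date = paste(unique(date), collapse = ","),
              user_ipaddress = paste(unique(user_ipaddress), collapse = ","),
              other = paste(unique(other_data), collapse = ",")),
          by = email]

res$user_ipaddress
# "177.95.75.230,177.139.49.93"
```

Replacing paste(unique(x), collapse = ",") with list(unique(x)) would give list columns instead of comma-separated strings, which may be easier to work with later.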
Maybe you can try aggregate in base R:
# as.character() lets this work for both character and factor columns
dfout <- aggregate(. ~ email, df, FUN = function(x) list(unique(as.character(x))))
Data
df <- structure(list(email = c("x@bla.com", "x@bla.com", "x@bla.com"
), date = c("2020-03-24", "2020-04-02", "2020-04-02"), user_ipaddress = c("177.95.75.230",
"177.139.49.93", "177.139.49.93"), `other data` = c("xxxx", "yyyy",
"zzzz")), class = "data.frame", row.names = c(NA, -3L))
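> df %>%
+ group_by(email, date) %>%
+ summarise(across(everything(), list))
Error in across(everything(), list) : could not find function "across"
> df %>%
+ group_by(email, date) %>%
+ summarise_all(list)
Error: expecting a one-sided formula, a function, or a function name. Call rlang::last_error() to see a backtrace
@filscapo It is in the devel version.
@filscapo This is a perfect question. I'm not sure it isn't a hard one, given that I got 4 wonderful answers in less than an hour! Or maybe those who ran into this before suffered so much that they will always remember how to solve it.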