R: How do I calculate the length of the concatenated strings in one column based on the IDs in another column?
I have the following function:
set.seed(1984)
test <- function(paths){
  # Matrix with one row per path; the Count column stays NA
  x <- matrix(rep(NA, paths*3), ncol = 3,
              dimnames = list(c(), c("Cookie", "Site", "Count")))
  for(i in 1:paths){
    # Random non-negative integer used as the cookie ID
    x[i, 1] <- round(sqrt(rnorm(1,50,100)^2))
    n <- function(){sample(1:10, size = 1)}
    draws <- function(){sample(LETTERS[1:5], n(), replace = TRUE)}
    # Join 1 to 10 random letters from A-E with dashes
    x[i, 2] <- paste(draws(), collapse = '-')
  }
  return(x)
}
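For each unique cookie ID in the Cookie column, I want Count to hold the length of that ID's Site strings concatenated together. Cookie contains duplicate values, so the Count value for an ID will be repeated on every row that shares it. Any ideas?

This groups the matrix by Cookie and returns the total number of characters in the Site column, which equals the length of the concatenation: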
test.df <- test(91)
library(dplyr)
test.df %>%
as.data.frame(., stringsAsFactors = FALSE) %>%
group_by(Cookie) %>%
mutate(Count = sum(nchar(Site)))
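To make the grouping step concrete, here is a minimal sketch on a small hand-made data frame. It uses base R's ave() in place of dplyr so that it runs without extra packages; the data frame toy and its values are illustrative assumptions, not output of test().

```r
# Toy data (assumed values for illustration): Cookie "26" appears twice.
toy <- data.frame(
  Cookie = c("26", "26", "43"),
  Site   = c("D-D-E-E-C", "E-E-A-C-A", "D-D-A"),
  stringsAsFactors = FALSE
)

# nchar() counts every character, dashes included.
# ave() sums those counts within each Cookie and repeats the
# group total on every row of that group.
toy$Count <- ave(nchar(toy$Site), toy$Cookie, FUN = sum)

print(toy)
```

Both rows for Cookie 26 receive the same total, which is the behaviour that group_by() followed by mutate() produces as well.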
If you want Count to exclude the - characters, just replace Site with gsub("-", "", Site, fixed = TRUE).
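As a rough sketch of that substitution, the example below strips the dashes before counting. It again relies on base R's ave() rather than dplyr so it is self-contained, and the toy data frame is an assumption made up for illustration.

```r
# Toy data (assumed values for illustration).
toy <- data.frame(
  Cookie = c("26", "26", "43"),
  Site   = c("D-D-E-E-C", "E-E-A-C-A", "D-D-A"),
  stringsAsFactors = FALSE
)

# fixed = TRUE treats "-" as a literal string, not a regular expression.
sites_no_dash <- gsub("-", "", toy$Site, fixed = TRUE)

# Sum the dash-free lengths per Cookie; Site itself is left untouched,
# so the dashes remain available for later use.
toy$Count <- ave(nchar(sites_no_dash), toy$Cookie, FUN = sum)

print(toy)
```

Because only a temporary copy is modified, the Site column keeps its dashes while Count reflects the letters alone.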
With data.table, we can do this:
library(data.table)
dt <- as.data.table(test(91))[, Count := as.character(sum(nchar(gsub("-", "", Site)))) ,
by = Cookie][]
dt[, Full_path := gsub("-", ", ", toString(Site)), by = Cookie]
head(dt)
# Cookie Site Count Full_path
#1: 258 A 1 A
#2: 26 D-D-E-E-C 10 D, D, E, E, C, E, E, A, C, A
#3: 43 D-D-A 3 D, D, A
#4: 171 C-C-E-A-B-D-E 7 C, C, E, A, B, D, E
#5: 57 A-D-D-C 4 A, D, D, C
#6: 156 A-D 2 A, D
Is there any way to make it not count the dashes? I know that's a pain, but I need to keep the dashes in the object for some d3.js I have.
@D8Amonk Updated the post.
Sorry for the follow-up, but what if I want to mutate a column with the full concatenation of the Site fragments? I tried (within mutate) Full_Path = paste(Site, collapse = '-'), but it doesn't work… When I wrap Site in strsplit inside mutate (strsplit(Site)) and run test(100), it says it can't find…
@D8Amonk If it is by Cookie, try the updated code.
How do I collapse the commas in the middle of Full_path?
dt[, Full_path := paste(Site, collapse="-"), by = Cookie]
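For readers without data.table, the same per-Cookie join can be sketched in base R. This is an illustrative alternative under assumed toy data, not the answer's own method; ave() is swapped in for the := ... by = Cookie idiom.

```r
# Toy data (assumed values for illustration).
toy <- data.frame(
  Cookie = c("26", "26", "43"),
  Site   = c("D-D-E-E-C", "E-E-A-C-A", "D-D-A"),
  stringsAsFactors = FALSE
)

# Join all Site strings that share a Cookie with "-" and repeat the
# joined string on every row of that group.
toy$Full_path <- ave(toy$Site, toy$Cookie,
                     FUN = function(s) paste(s, collapse = "-"))

print(toy)
```

Using "-" as the collapse string keeps the dash-separated format throughout, so no commas appear in Full_path.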