How to rewrite my dataframe transformation using data.table
I have a data frame:
ID value
1 he following object is masked from ‘package:purrr’. R is free software and comes with ABSOLUTELY NO WARRANTY
2 Attaching package: ‘magrittr’. Natural language support but running in an English locale
2 Attaching package: ‘DT’. Natural language support but running in an English locale
2 Attaching package: ‘anytime’. Natural language support but running in an English locale
3 package ‘ggplot2’ was built under R version 3.6.2. Type 'contributors()' for more information
4 Warning messages: type 'demo()' for some demos, 'help()' for on-line help
4 Warning messages: 'help.start()' for an HTML browser interface to help
How to create it:
library(data.table)

ID <- c(1,2,2,2,3,4,4)
value <- c("he following object is masked from ‘package:purrr’. R is free software and comes with ABSOLUTELY NO WARRANTY",
"Attaching package: ‘magrittr’. Natural language support but running in an English locale",
"Attaching package: ‘DT’. Natural language support but running in an English locale",
"Attaching package: ‘anytime’. Natural language support but running in an English locale",
"package ‘ggplot2’ was built under R version 3.6.2. Type 'contributors()' for more information",
"Warning messages:type 'demo()' for some demos, 'help()' for on-line help",
"Warning messages:'help.start()' for an HTML browser interface to help")
df <- data.table(ID, value)
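And here is the desired output. As you can see, I created a new column, pattern, and grouped the data table by it. I also added a column, example, that holds one example value for each pattern.
How can I rewrite this transformation using data.table? I would like to use data.table functions rather than mutate and similar functions, but I am not good at it. I tried the following, but I don't know what to do next: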
df_patterns <- df[, c("pattern", "id_type") := list(
pattern = coalesce(stringr::str_extract(pattern= stringr::str_extract(value, "\\S+\\s+\\S+\\s+\\S+"), "^Attaching package:|Warning messages:"),pattern= stringr::str_extract(value, "\\S+\\s+\\S+\\s+\\S+")),
case_when(ID %in% c(1, 5) ~ "extra_type")), by = ID, pattern]
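This removes all dependencies other than data.table.
The following should match your expected output (the example column will, of course, differ between runs unless you set a seed):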
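Could you explain the expected output?
@zx8754 I added some explanation.
Could not find function "fcase". I don't have that function in my data.table package.
I added the column id_type to the desired result. I forgot it at the beginning yesterday evening.
@french_fries You have an old version of data.table. To install the latest version, use install.packages('data.table'). The solution has been updated to include id_type.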
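For readers who hit the "could not find function fcase" error and cannot upgrade right away, the pattern column can also be built with plain sub-assignment by reference, which does not rely on fcase. This is only a sketch under stated assumptions: the toy table `dt` and its shortened strings are made up for illustration, and upgrading data.table remains the simpler fix.

```r
library(data.table)

# Toy data, shortened from the question (hypothetical values for illustration)
dt <- data.table(
  ID = c(1, 2, 2, 4),
  value = c(
    "he following object is masked",
    "Attaching package: 'DT'. Natural language support",
    "Attaching package: 'anytime'. Natural language support",
    "Warning messages:type 'demo()' for some demos"
  )
)

# Show which data.table version is installed
print(packageVersion("data.table"))

# Default: keep the first three whitespace-separated tokens of each value
dt[, pattern := sub("((\\S+\\s+){2}\\S+).+", "\\1", value)]

# Overwrite the rows that start with one of the known prefixes
for (prefix in c("Attaching package:", "Warning messages:")) {
  dt[startsWith(value, prefix), pattern := prefix]
}

print(dt[, .(ID, pattern)])
```

The default is assigned first and the special cases second, so the later assignments win, which mirrors the "first matching condition" order of the fcase call in the answer.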
Desired output:
ID pattern example id_type
1 he following object he following object is masked from ‘package:purrr’. R is free software and comes with ABSOLUTELY NO WARRANTY extra_type
2 Attaching package: Attaching package: ‘anytime’. Natural language support but running in an English locale NA
3 package ‘ggplot2’ was package ‘ggplot2’ was built under R version 3.6.2. Type 'contributors()' for more information NA
4 Warning messages: Warning messages:'help.start()' for an HTML browser interface to help NA
# Build the pattern column, then keep one random example per (ID, pattern) group
df_patterns <-
  copy(df)[, pattern := fcase(
    startsWith(value, "Attaching package:"), "Attaching package:",
    startsWith(value, "Warning messages:"), "Warning messages:",
    # fallback: the first three whitespace-separated tokens
    rep(TRUE, nrow(df)), sub("((\\S+\\s+){2}\\S+).+", "\\1", value)
  )][,
    .(
      example = sample(value, 1),
      id_type = fifelse(ID %in% c(1, 5), "extra_type", NA_character_)
    ),
    by = .(ID, pattern)]
ID pattern example id_type
1: 1 he following object he following object is masked from ‘package:purrr’. R is free software and comes with ABSOLUTELY NO WARRANTY extra_type
2: 2 Attaching package: Attaching package: ‘DT’. Natural language support but running in an English locale <NA>
3: 3 package ‘ggplot2’ was package ‘ggplot2’ was built under R version 3.6.2. Type 'contributors()' for more information <NA>
4: 4 Warning messages: Warning messages:type 'demo()' for some demos, 'help()' for on-line help <NA>
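Because the example column is drawn with sample(), the rows shown for ID 2 and ID 4 can change from run to run. A minimal sketch of making the draw repeatable with set.seed() follows; the table `dt`, the helper `pick_examples`, and the seed value 123 are all hypothetical names chosen for illustration, not part of the original answer.

```r
library(data.table)

# Toy data (hypothetical, shortened from the question)
dt <- data.table(
  ID = c(2, 2, 2, 4, 4),
  pattern = c(rep("Attaching package:", 3), rep("Warning messages:", 2)),
  value = c("magrittr", "DT", "anytime", "demo", "help.start")
)

# Hypothetical helper: fix the RNG state, then draw one example per group
pick_examples <- function(seed) {
  set.seed(seed)
  dt[, .(example = sample(value, 1)), by = .(ID, pattern)]
}

first <- pick_examples(123)
second <- pick_examples(123)

# The same seed gives the same examples on both runs
print(identical(first$example, second$example))
```

If a random example is not actually required, taking the first value of each group (for instance `value[1L]`) would avoid the dependence on the random number generator altogether.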