Regex efficient coding - R regex: duplicate a row for each match
I have been doing some data wrangling on a column that contains free text. I want to identify a specific set of strings in this text, create a column that records each match, and then duplicate the row whenever a single field contains more than one match. This is as far as I have got (apologies to anyone not feeling festive):
# Example data frame
require(stringr)
dats ...
Try this:
library(quanteda)
s <- "rudolph the red nosed reindeer"
words <- strsplit(s, " ")[[1]]
do.call(rbind, lapply(words, kwic, x = s))
Try cSplit from the splitstackshape package:
library(splitstackshape)
dats$value <- lapply(str_extract_all(dats$text, reg.patt), toString)
cSplit(dats, 'value', direction="long")
# ID text value
# 1: 1 rudolph rudolph
# 2: 2 rudolph the rudolph
# 3: 2 rudolph the the
# 4: 3 rudolph the red rudolph
# 5: 3 rudolph the red the
# 6: 3 rudolph the red red
# 7: 4 rudolph the red nosed rudolph
# 8: 4 rudolph the red nosed the
# 9: 4 rudolph the red nosed red
# 10: 4 rudolph the red nosed nosed
# 11: 5 rudolph the red nosed reindeer rudolph
# 12: 5 rudolph the red nosed reindeer the
# 13: 5 rudolph the red nosed reindeer red
# 14: 5 rudolph the red nosed reindeer nosed
# 15: 5 rudolph the red nosed reindeer reindeer
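For comparison, the same reshaping can probably be done in base R without extra packages. The sketch below is only an illustration under assumptions: the `dats` data frame is reconstructed from the printed output above, and `reg.patt` is a guessed pattern, since the original definition of neither is shown in the question.

```r
# Hypothetical reconstruction of the example data, inferred from the output above
dats <- data.frame(
  ID = 1:5,
  text = c("rudolph",
           "rudolph the",
           "rudolph the red",
           "rudolph the red nosed",
           "rudolph the red nosed reindeer"),
  stringsAsFactors = FALSE
)

# Assumed pattern; the original reg.patt is not shown in the question
reg.patt <- "\\b(rudolph|the|red|nosed|reindeer)\\b"

# One character vector of matches per row of dats
matches <- regmatches(dats$text, gregexpr(reg.patt, dats$text, perl = TRUE))

# Repeat each row once per match, then attach the matches as a new column
n_matches <- lengths(matches)
out <- dats[rep(seq_len(nrow(dats)), n_matches), , drop = FALSE]
out$value <- unlist(matches)
rownames(out) <- NULL

print(out)
```

Note that a row with no match at all is repeated zero times here, so it drops out of the result; if such rows should be kept, they would need to be handled separately.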
Output of the kwic call:
contextPre keyword contextPost
[text1, 1] [ rudolph ] the red nosed reindeer
[text1, 2] rudolph [ the ] red nosed reindeer
[text1, 3] rudolph the [ red ] nosed reindeer
[text1, 4] rudolph the red [ nosed ] reindeer
[text1, 5] rudolph the red nosed [ reindeer ]