R: selecting sentences that contain specific words

In quanteda, is there a way to select a sentence only when two words both occur in it? I found how to reshape a text corpus into sentences. Using kwic and tokens_select seems to show that they implement a logical OR for the two terms, not an AND. I can do it with stringr, but I want to make sure I am not missing something. Example with stringr:





library(tidyverse)

myStr <- c("soil carbon is the best",
           "biodiversity is key",
           "soil carbon is biodiversity by nature")

keyw <- c("soil", "biodiversity")

# TRUE only for sentences in which every keyword is detected (logical AND)
tibble(sentences = myStr,
       hit_soil_carbon_biodiversity = map_lgl(myStr, ~ all(str_detect(.x, keyw))))
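One caveat with the stringr approach above: str_detect() matches substrings, so "soil" would also hit "soils". A minimal sketch of a whole-word variant follows, assuming the keywords contain no regex metacharacters. The example sentences and the names patterns and has_all_words are made up for illustration.

```r
library(stringr)

# hypothetical example sentences, chosen to show the substring problem
myStr <- c("soils are rich",
           "soil and biodiversity",
           "biodiversity is key")
keyw <- c("soil", "biodiversity")

# wrap each keyword in word boundaries so "soil" does not match "soils"
patterns <- str_c("\\b", keyw, "\\b")

# TRUE only where every keyword occurs as a whole word (logical AND)
has_all_words <- vapply(myStr,
                        function(s) all(str_detect(s, patterns)),
                        logical(1),
                        USE.NAMES = FALSE)

myStr[has_all_words]
```

Only the second sentence should survive here, since the first contains "soils" rather than "soil" and the third lacks "soil" altogether.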




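library(quanteda)

# reformat the corpus as sentences
sentcorp <- corpus_reshape(data_corpus_inaugural, to = "sentences")
# show the last few sentences (older quanteda versions used texts() here)
tail(as.character(sentcorp))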
#                                           2017-Trump.83 
#          "Together, we will make America strong again." 
#                                           2017-Trump.84 
#                   "We will make America wealthy again." 
#                                           2017-Trump.85 
#                     "We will make America proud again." 
#                                           2017-Trump.86 
#                      "We will make America safe again." 
#                                           2017-Trump.87 
# "And, yes, together, we will make America great again." 
#                                           2017-Trump.88 
#      "Thank you, God bless you, and God bless America." 

# illustrate the selection
kwic(sentcorp, phrase("nuclear w*"), window = 3)
# [1977-Carter.47, 18:19]  elimination of all | nuclear weapons | from this Earth
# [1985-Reagan.88, 12:13] further increase of | nuclear weapons | .              
#  [1985-Reagan.90, 9:10]          one day of | nuclear weapons | from the face  
# [1985-Reagan.91, 27:28]          the use of | nuclear weapons | , the other    
#   [1985-Reagan.96, 4:5]     It would render | nuclear weapons | obsolete.  

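# now pipe the longer kwic results back into a corpus
# (a window of 1000 tokens is wide enough to capture each whole sentence)
newsentcorp <-
    kwic(sentcorp, phrase("nuclear w*"), window = 1000) %>%
    corpus(split_context = FALSE) %>%
    as.character()  # extract the texts; older quanteda versions used texts()
newsentcorp[-4]  # because 4 is really long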
#                                                                                                   1977-Carter.47.L18 
# "And we will move this year a step toward ultimate goal - - the elimination of all nuclear weapons from this Earth." 
#                                                                                                   1985-Reagan.88.L12 
#                                        "We are not just discussing limits on a further increase of nuclear weapons." 
#                                                                                                    1985-Reagan.90.L9 
#                               "We seek the total elimination one day of nuclear weapons from the face of the Earth." 
#                                                                                                    1985-Reagan.96.L4 
#                                                                          "It would render nuclear weapons obsolete."
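
The kwic route above selects sentences containing one phrase. For the AND case in the question (both words anywhere in the same sentence), one possible sketch is to test each sentence's tokens directly. It reuses the question's example strings; has_all and selected are illustrative names, and this is one way to do it under those assumptions rather than a dedicated quanteda feature.

```r
library(quanteda)

myStr <- c("soil carbon is the best",
           "biodiversity is key",
           "soil carbon is biodiversity by nature")
keyw <- c("soil", "biodiversity")

# one document per sentence, then lower-cased tokens
sentcorp <- corpus_reshape(corpus(myStr), to = "sentences")
toks <- tokens_tolower(tokens(sentcorp))

# TRUE where every keyword occurs as a token in the sentence (logical AND)
has_all <- vapply(as.list(toks),
                  function(x) all(keyw %in% x),
                  logical(1))

# keep only the sentences that contain all keywords
selected <- sentcorp[unname(has_all)]
as.character(selected)
```

Because the comparison is on tokens, this matches whole words only; glob patterns such as "nuclear w*" would need a different test than %in%.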