How to get the term closest to a reference term in purrr


My list is as follows:

list(c("\n", "\n", "oesophagus graded  and fine\n", 
"\n", "\n", "\n", "stomach and  antrum  altough with some rfa response rfa\n", 
"\n", "mucosa washed a lot\n", "\n", "treated with halo rfa ultra \n", 
"\n", "total of 100 times\n", "\n", "duodenum looks ok"))

I want to extract, from one list of terms, the term that is closest to a term from another list.

My expected output is

antrum:rfa

My first list (the events) is:

EventList<-c("rfa", "apc", "dilat", "emr", "clip", "grasp", "probe", "iodine", 
"acetic", "nac", "peg", "botox")

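My current code gives me the event (in this case rfa), but instead of assigning it to antrum it assigns it to oesophagus. In other words, it assigns the event to the first term found in the tofind list rather than to the term closest to the event.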

I suspect the lines

`[[`(1) %>%
 .[length(.)]

are the culprit, but I don't know how to change them so that I get the closest term rather than the first one.
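
One way to make "closest" concrete is to measure character distance within a single line. The sketch below is not the code from the question. It assumes a shortened, hypothetical tofind pattern (the real one is not shown here) and a hypothetical helper closest_term(), and it uses stringr's str_locate() and str_locate_all() to pick the term whose match starts nearest to the first occurrence of the event.

```r
library(stringr)

# Hypothetical, shortened pattern: the question's real tofind is not shown.
tofind <- "oesophagus|stomach|antrum|duodenum"

# Hypothetical helper: return the tofind term whose match starts closest
# (in characters) to the first occurrence of `event` in `text`.
closest_term <- function(text, event, pattern = tofind) {
  event_pos <- str_locate(text, fixed(event))[1, "start"]
  if (is.na(event_pos)) return(NA_character_)      # event not in this text

  term_pos <- str_locate_all(text, regex(pattern, ignore_case = TRUE))[[1]]
  if (nrow(term_pos) == 0) return(NA_character_)   # no candidate term found

  nearest <- which.min(abs(term_pos[, "start"] - event_pos))
  str_sub(text, term_pos[nearest, "start"], term_pos[nearest, "end"])
}

line <- "stomach and  antrum  altough with some rfa response rfa"
closest_term(line, "rfa")
# [1] "antrum"
```

Here stomach starts 39 characters before the first rfa and antrum 26 characters before it, so antrum wins. Whether distance should be counted in characters, words, or lines is a design choice the question leaves open.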

The code below gives you the last term matched in tofind for each matching element of EventList:

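library(tidyverse)  # purrr, stringr, and dplyr::last()

# words and tofind are the objects from the question (not shown here)
map(EventList, 
    function(event) {
      # for each vector in words, the positions of the lines that mention the event
      indices <- map(words, str_which, pattern = event)
      map(indices, function(i) 
        # for each hit: join the lines up to the hit, extract every tofind
        # match, and keep the last one (the term nearest before the event)
        map2_chr(words, i, ~ .x[seq_len(.y)] %>% 
               str_c(collapse = ' ') %>% 
               str_extract_all(regex(tofind, ignore_case = TRUE), simplify = TRUE) %>% 
               last()) %>%
          map_if(is_empty, ~ NA_character_)
        ) %>% 
        unlist() %>% 
        paste0(':', event)
    })  %>%
  unlist() %>%
  # keep only results that have a term before the colon
  str_subset('.+:')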

# [1] "antrum:rfa"     "oesophagus:rfa"
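
To see what the core step of that answer does, here is a minimal sketch for a single event, using only stringr and base R. It assumes a trimmed copy of the question's text (empty lines dropped) and a shortened, hypothetical tofind pattern, since the original pattern is not shown here. It takes the first line that mentions the event, joins every line up to and including it, and keeps the last tofind match.

```r
library(stringr)

# Trimmed copy of the question's text (empty "\n" entries dropped).
words <- c("oesophagus graded  and fine\n",
           "stomach and  antrum  altough with some rfa response rfa\n",
           "mucosa washed a lot\n",
           "duodenum looks ok")

# Hypothetical, shortened pattern: the question's real tofind is not shown.
tofind <- "oesophagus|stomach|antrum|duodenum"

# Index of the first line that mentions the event.
hit <- str_which(words, "rfa")[1]

# Everything up to and including that line, as one string.
context <- str_c(words[seq_len(hit)], collapse = " ")

# All anatomical terms seen so far, in order of appearance.
terms <- str_extract_all(context, regex(tofind, ignore_case = TRUE))[[1]]

# The last one is the term mentioned most recently before the event.
terms[length(terms)]
# [1] "antrum"
```

Note that this treats "closest" as "most recently mentioned at or before the event's line", so a term that appears later on the same line as the event would also be picked up.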

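I think the problem starts at the line words %>% str_…(paste0("^.*", .x)).

Is your words list bigger than what you showed? I mean, are there several vectors like the one you showed, or is it really a list with just one vector?

That's great. Thanks.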