R: Remove rows with empty values and remove stop words after running unnest_tokens?

This is my df:

df <- structure(list(id = 1:50, strain_id = c(6L, 6L, 7L, 12L, 19L, 
35L, 81L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 100L, 
100L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 
123L, 202L, 202L, 202L, 202L, 202L, 202L, 202L, 202L, 202L, 202L, 
202L, 246L, 246L, 246L, 246L, 246L, 246L, 246L, 246L, 246L, 246L, 
246L), name = c("Anorexia and Cachexia", "Autoimmune Diseases and Inflammation", 
"Psychiatric Symptoms", "Autoimmune Diseases and Inflammation", 
"Pain", "Autoimmune Diseases and Inflammation", "Dependency and Withdrawal", 
"Anorexia and Cachexia", "Spasticity", "Movement Disorders", 
"Pain", "Glaucoma", "Epilepsy", "Asthma", "Dependency and Withdrawal", 
"Psychiatric Symptoms", "Autoimmune Diseases and Inflammation", 
"Nausea and Vomiting", "Anorexia and Cachexia", "Spasticity", 
"Movement Disorders", "Pain", "Glaucoma", "Epilepsy", "Asthma", 
"Dependency and Withdrawal", "Psychiatric Symptoms", "Autoimmune Diseases and Inflammation", 
"Nausea and Vomiting", "Anorexia and Cachexia", "Spasticity", 
"Movement Disorders", "Pain", "Glaucoma", "Epilepsy", "Asthma", 
"Dependency and Withdrawal", "Psychiatric Symptoms", "Autoimmune Diseases and Inflammation", 
"Nausea and Vomiting", "Anorexia and Cachexia", "Spasticity", 
"Movement Disorders", "Pain", "Glaucoma", "Epilepsy", "Asthma", 
"Dependency and Withdrawal", "Psychiatric Symptoms", "Autoimmune Diseases and Inflammation"
), rating = c(4, 4, 5, 5, 4, 5, 5, 5, 4, 5, 5, 4, 4, 3, 5, 5, 
5, 3, 3, 5, 5, 4, 3, 4, 4, 4, 3, 4, 3, 3, 2, 3, 4, 4, 3, 2, 5, 
3, 3, 3, 3, 4, 4, 3, 5, 3, 1, 3, 4, 3), dose = c(3, 3, 3, 3, 
3, 3, 1, 3, 2, 1, 2, 2, 2, 3, 2, 2, 2, 2, 2, 3, 3, 2, 2, 2, 3, 
3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 1, 2, 2, 1, 3, 2, 
3, 2, 2, 3), info = c("Affects / helps even in small doses very well at / against Anorexia and Cachexia.", 
"Affects / helps even in small doses very well at / against Autoimmune Diseases and Inflammation.", 
"Affects / helps even in small doses extremly well at / against Psychiatric Symptoms.", 
"Affects / helps even in small doses extremly well at / against Autoimmune Diseases and Inflammation.", 
"Affects / helps even in small doses very well at / against Pain.", 
"Affects / helps even in small doses extremly well at / against Autoimmune Diseases and Inflammation.", 
"Affects / helps only in heavy doses extremly well at / against Dependency and Withdrawal.", 
"Affects / helps even in small doses extremly well at / against Anorexia and Cachexia.", 
"Affects / helps in average doses very well at / against Spasticity.", 
"Affects / helps only in heavy doses extremly well at / against Movement Disorders.", 
"Affects / helps in average doses extremly well at / against Pain.", 
"Affects / helps in average doses very well at / against Glaucoma.", 
"Affects / helps in average doses very well at / against Epilepsy.", 
"Affects / helps even in small doses well at / against Asthma.", 
"Affects / helps in average doses extremly well at / against Dependency and Withdrawal.", 
"Affects / helps in average doses extremly well at / against Psychiatric Symptoms.", 
"Affects / helps in average doses extremly well at / against Autoimmune Diseases and Inflammation.", 
"Affects / helps in average doses well at / against Nausea and Vomiting.", 
"Affects / helps in average doses well at / against Anorexia and Cachexia.", 
"Affects / helps even in small doses extremly well at / against Spasticity.", 
"Affects / helps even in small doses extremly well at / against Movement Disorders.", 
"Affects / helps in average doses very well at / against Pain.", 
"Affects / helps in average doses well at / against Glaucoma.", 
"Affects / helps in average doses very well at / against Epilepsy.", 
"Affects / helps even in small doses very well at / against Asthma.", 
"Affects / helps even in small doses very well at / against Dependency and Withdrawal.", 
"Affects / helps in average doses well at / against Psychiatric Symptoms.", 
"Affects / helps in average doses very well at / against Autoimmune Diseases and Inflammation.", 
"Affects / helps in average doses well at / against Nausea and Vomiting.", 
"Affects / helps in average doses well at / against Anorexia and Cachexia.", 
"Affects / helps in average doses low at / against Spasticity.", 
"Affects / helps in average doses well at / against Movement Disorders.", 
"Affects / helps in average doses very well at / against Pain.", 
"Affects / helps in average doses very well at / against Glaucoma.", 
"Affects / helps in average doses well at / against Epilepsy.", 
"Affects / helps even in small doses low at / against Asthma.", 
"Affects / helps in average doses extremly well at / against Dependency and Withdrawal.", 
"Affects / helps in average doses well at / against Psychiatric Symptoms.", 
"Affects / helps in average doses well at / against Autoimmune Diseases and Inflammation.", 
"Affects / helps in average doses well at / against Nausea and Vomiting.", 
"Affects / helps only in heavy doses well at / against Anorexia and Cachexia.", 
"Affects / helps in average doses very well at / against Spasticity.", 
"Affects / helps in average doses very well at / against Movement Disorders.", 
"Affects / helps only in heavy doses well at / against Pain.", 
"Affects / helps even in small doses extremly well at / against Glaucoma.", 
"Affects / helps in average doses well at / against Epilepsy.", 
"Affects / helps even in small doses very low at / against Asthma.", 
"Affects / helps in average doses well at / against Dependency and Withdrawal.", 
"Affects / helps in average doses very well at / against Psychiatric Symptoms.", 
"Affects / helps even in small doses well at / against Autoimmune Diseases and Inflammation."
), votes = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 1L, 1L)), row.names = c(NA, 50L), class = "data.frame")
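Please tell me which function or setting I should use so that I can skip the filtering step (dplyr::filter(!word == "")) and still remove the rows with empty values.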


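In other words, I want my code to automatically (through a setting or a function) filter out the rows that have an empty value in a specific column.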

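I can recreate the result using only functions from tidytext. The functions from tm are not needed, because tidytext's unnest_tokens already takes care of removing punctuation and whitespace (unless specified otherwise). You can use dplyr's anti_join together with stop_words from tidytext to remove the unwanted stop words.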

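library(dplyr)  # provides %>% and anti_join()

df %>%
  tidytext::unnest_tokens(input = name,      # column to tokenize
                          output = word,     # new column, one token per row
                          token = "words", 
                          format = "text", 
                          drop = TRUE,       # drop the original name column
                          to_lower = TRUE) %>%
  # drop tokens that appear in tidytext's stop word list
  anti_join(tidytext::stop_words, by = "word")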
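
To check whether a separate filter() step is needed at all, the same pipeline can be run on a small example. This is only a sketch: it assumes dplyr and tidytext are installed, and toy and tokens are hypothetical names that do not appear in the question. It tokenizes two values taken from the name column and then tests whether any empty or missing tokens are left.

```r
library(dplyr)
library(tidytext)

# Hypothetical toy data mirroring the `name` column of df
toy <- data.frame(id = 1:2,
                  name = c("Anorexia and Cachexia", "Pain"),
                  stringsAsFactors = FALSE)

tokens <- toy %>%
  unnest_tokens(output = word, input = name) %>%  # one lowercase word per row
  anti_join(stop_words, by = "word")              # drop stop words such as "and"

# TRUE here would mean that some token is empty or missing
any(is.na(tokens$word) | tokens$word == "")
```

If the check returns FALSE on your own data as well, the explicit filter(!word == "") line should not be required.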

Do you mean avoiding the filter() line altogether?

@tmfmnk I thought there was a setting like this.