R: How to detect negated sentences using sentimentr or qdap


I am trying to extract (and eventually classify) sentences that contain negation from medical reports. For example:

samples<-c('There is no evidence of a lump','Neither a contusion nor a scar was seen','No inflammation was evident','We found generalised badness here')

I tried the sentimentr package as follows:

extract_sentiment_terms(myclonda$Endo_ResultText, polarity_dt = lexicon::hash_sentiment_jockers, hyphen = "")

This gives me the neutral, negative and positive words:

   element_id sentence_id     negative positive
1:          1           1                      
2:          2           1         scar         
3:          3           1 inflammation  evident
4:          4           1      badness    found

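But what I am really looking for is the sentences that contain negation, without any interpretation of sentiment, so that the output would be: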

   element_id sentence_id                                negative                          positive
1:          1           1          There is no evidence of a lump
2:          2           1 Neither a contusion nor a scar was seen
3:          3           1             No inflammation was evident
4:          4           1                                         We found generalised badness here

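If I understand correctly, you want to extract the whole sentence whenever one of its words matches a positive or negative entry in lexicon::hash_sentiment_jockers. In that case you can use the code below (the intermediate steps can be adapted with data.table if needed). I hope this is what you are looking for.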

library(lexicon)
library(data.table)
library(stringi)

# check the content of the lexicon
lex <- copy(lexicon::hash_sentiment_jockers)
#                  x     y
#     1:     abandon -0.75
#     2:   abandoned -0.50
#     3:   abandoner -0.25
#     4: abandonment -0.25
#     5:    abandons -1.00
#    ---
# 10735:     zealous  0.40
# 10736:      zenith  0.40
# 10737:        zest  0.50
# 10738:      zombie -0.25
# 10739:     zombies -0.25

# only consider binary positive or negative
pos <- lex[y > 0]
neg <- lex[y < 0]

samples <- c('There is no evidence of a lump',
             'Neither a contusion nor a scar was seen',
             'No inflammation was evident',
             'We found generalised badness here')

# get ids of the samples that include positive/negative terms
samples_pos <- which(stri_detect_regex(samples, paste(pos[, x], collapse = "|")))
samples_neg <- which(stri_detect_regex(samples, paste(neg[, x], collapse = "|")))

# set up data.frames with all positive/negative samples and their ids
df_pos <- data.frame(sentence_id = samples_pos, positive = samples[samples_pos])
df_neg <- data.frame(sentence_id = samples_neg, negative = samples[samples_neg])

# combine the sets
rbindlist(list(df_pos, df_neg), use.names = TRUE, fill = TRUE)
#    sentence_id                          positive                                negative
# 1:           3       No inflammation was evident                                      NA
# 2:           4 We found generalised badness here                                      NA
# 3:           2                                NA Neither a contusion nor a scar was seen
# 4:           3                                NA             No inflammation was evident
# 5:           4                                NA       We found generalised badness here

# the first sentence is missing, since none of its words is included in
# the lexicon; you might use stemming, etc. to increase coverage
any(grepl("evidence", lexicon::hash_sentiment_jockers[, x]))
# [1] FALSE
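One thing to keep in mind with the approach above: pasting all lexicon terms into one regex matches substrings, so a term such as "scar" would also hit "scarcely". A minimal sketch of a whole-word alternative follows, assuming a toy lexicon (`toy_lex`, with made-up scores) in place of lexicon::hash_sentiment_jockers so that it runs with base R only. The helper `has_term` and the fifth sample sentence are hypothetical and added only for illustration.

```r
# Toy lexicon with the same x (term) / y (score) layout as
# lexicon::hash_sentiment_jockers; terms and scores are assumptions.
toy_lex <- data.frame(x = c("scar", "inflammation", "badness", "evident"),
                      y = c(-0.5, -0.5, -0.75, 0.5),
                      stringsAsFactors = FALSE)

samples <- c('There is no evidence of a lump',
             'Neither a contusion nor a scar was seen',
             'No inflammation was evident',
             'We found generalised badness here',
             'The tissue was scarcely visible')  # extra sentence, added for illustration

# Split each lower-cased sentence into word tokens.
tokens <- strsplit(tolower(samples), "[^a-z']+")

# A sentence is flagged when at least one whole token is a lexicon term.
has_term <- function(terms) {
  vapply(tokens, function(tk) any(tk %in% terms), logical(1))
}

neg_hit <- has_term(toy_lex$x[toy_lex$y < 0])
pos_hit <- has_term(toy_lex$x[toy_lex$y > 0])

result <- data.frame(sentence_id = seq_along(samples),
                     negative = neg_hit,
                     positive = pos_hit)
result
```

With this version, "scarcely" is not counted as a hit for "scar", because whole tokens are compared rather than substrings.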

I think you just want to classify each text as positive or negative based on the presence of negators, so extracting the negators from the lexicon should help.

samples <- c('There is no evidence of a lump',
             'Neither a contusion nor a scar was seen',
             'No inflammation was evident',
             'We found generalised badness here')

polarity <- data.frame(text = samples, pol = NA)

polarity$pol <- ifelse(grepl(paste(lexicon::hash_valence_shifters[y==1]$x, collapse = '|'), tolower(samples)), 'Negative', 'Positive')

polarity
                                     text      pol
1          There is no evidence of a lump Negative
2 Neither a contusion nor a scar was seen Negative
3             No inflammation was evident Negative
4       We found generalised badness here Positive

reshape2::dcast(polarity, text~pol)
                                     text Negative Positive
1 Neither a contusion nor a scar was seen Negative     <NA>
2             No inflammation was evident Negative     <NA>
3          There is no evidence of a lump Negative     <NA>
4       We found generalised badness here     <NA> Positive
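A pattern built by joining negators with `|` also matches inside longer words, so "no" would fire on "known" or "normal". The sketch below shows one way to guard against that with `\\b` word boundaries. It assumes a short hand-written `negators` vector instead of the full lexicon::hash_valence_shifters list, and the fifth sample sentence is hypothetical, added only to show the difference.

```r
# Hand-written negator list (an assumption for illustration; the
# negators in lexicon::hash_valence_shifters form a longer list).
negators <- c("no", "not", "neither", "nor", "never", "none", "without")

# \\b anchors the alternation at word boundaries, so "no" does not
# match inside "known" or "normal".
pattern <- paste0("\\b(", paste(negators, collapse = "|"), ")\\b")

samples <- c('There is no evidence of a lump',
             'Neither a contusion nor a scar was seen',
             'No inflammation was evident',
             'We found generalised badness here',
             'Known normal anatomy')  # extra sentence, added for illustration

# Case-insensitive match; TRUE where a negator occurs as a whole word.
has_negation <- grepl(pattern, samples, ignore.case = TRUE, perl = TRUE)

# Keep only the sentences that contain a negator.
negated <- samples[has_negation]
negated
```

Only the first three sentences are kept. Without the boundaries, the last sentence would also be flagged, because "Known" contains "no".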