Python 3.x: How to check whether words from one CSV file exist in a column of another CSV file

python-3.x pandas csv

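I have two CSV files: dictionary.csv, which contains a list of words, and story.csv. story.csv has several columns, and one of them, news_story, contains a lot of text. I want to check whether any of the words in dictionary.csv appear in the news_story column. After that, I want to write every row whose news_story contains a word from dictionary.csv to a new CSV file named new.csv.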

This is the code I have tried so far:

import csv
import pandas as pd

news=pd.read_csv("story.csv")
dictionary=pd.read_csv("dictionary.csv")

pattern = '|'.join(dictionary)

exist=news['news_story'].str.contains(pattern)
for CHECK in exist:
    if not CHECK:
        news['NEWcolumn']='NO'
    else:
        news['NEWcolumn']='YES'

news.to_csv('New.csv')

I keep getting NO, even though some rows should be YES.

story.csv

news_url news_title news_date news_story
goog.com functional 2019      This story is about a functional requirement
live.com pbandJ     2001      I made a sandwich today
key.com  uAndI      1992      A code name of a spy

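First, read the word list into a Series by passing header=None together with squeeze=True, so that the first value is not dropped as a header row. Finally, filter the rows by boolean indexing (full code at the end).

Details of the filtered result: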
print (news[exist])
   news_url  news_title  news_date  \
0  goog.com  functional       2019   

                                     news_story  
0  This story is about a functional requirement  
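To illustrate why header=None matters, here is a minimal sketch that reads the same five dictionary words from an in-memory string (an assumption made so the example runs without files). It also uses DataFrame.squeeze("columns") in place of the squeeze=True keyword, because that keyword of read_csv was deprecated in pandas 1.4 and removed in pandas 2.0.

```python
import io

import pandas as pd

# In-memory stand-in for dictionary.csv: one word per line, no header row.
DICTIONARY_CSV = "red\ntie\nlace\nbooks\nfunctional\n"

# Default read: the first line becomes the column name. Iterating over a
# DataFrame yields its column labels, so only "red" reaches the pattern.
as_frame = pd.read_csv(io.StringIO(DICTIONARY_CSV))
pattern_from_frame = "|".join(as_frame)

# header=None keeps every line as data, and squeeze("columns") turns the
# single-column DataFrame into a Series whose values are the words.
words = pd.read_csv(io.StringIO(DICTIONARY_CSV), header=None).squeeze("columns")
pattern = "|".join(words)

print(pattern_from_frame)
print(pattern)
```

With the default read, the pattern contains only the first word, which would explain why no story matched in the original attempt.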

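Can you create the expected output for the two files?
The pattern is one long string of words and pipe symbols. You won't find such a word in your news stories. The straightforward solution is to loop over the words from the first file and use str.contains in the loop body.
@jezrael You can see the edited sample.
Thank you so much for your help! But what if I also want to save the rows that contain no words from dictionary.csv to another CSV file?
@strawberrylatte - then use news[~exist].to_csv('New_not_exist.csv')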
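The split into matching and non-matching rows suggested in the last comment could look roughly like the sketch below. The DataFrame is rebuilt by hand from the sample rows in the question, and to_csv() is called without a path so it returns the CSV text instead of writing New.csv and New_not_exist.csv; both choices are assumptions made to keep the example self-contained.

```python
import pandas as pd

# Sample rows copied from story.csv in the question.
news = pd.DataFrame({
    "news_url": ["goog.com", "live.com", "key.com"],
    "news_title": ["functional", "pbandJ", "uAndI"],
    "news_date": [2019, 2001, 1992],
    "news_story": [
        "This story is about a functional requirement",
        "I made a sandwich today",
        "A code name of a spy",
    ],
})
words = pd.Series(["red", "tie", "lace", "books", "functional"])

# str.contains treats the pattern as a regular expression by default,
# so "a|b|c" means "a or b or c".
exist = news["news_story"].str.contains("|".join(words))

# Passing a file name instead of nothing would write the files to disk.
matched_csv = news[exist].to_csv(index=False)
unmatched_csv = news[~exist].to_csv(index=False)

print(matched_csv)
print(unmatched_csv)
```

The ~ operator inverts the boolean mask, so the two outputs together cover every row exactly once.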
New.csv
news_url news_title news_date news_story
goog.com functional   2019    This story is about a functional requirement
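If a YES/NO column is still wanted, as in the attempt from the question, note that assigning a scalar such as news['NEWcolumn']='NO' inside a loop overwrites the whole column on every iteration, so every row ends up with the label of the last row. A minimal sketch of a per-row alternative, using hand-built sample data and a hard-coded pattern as assumptions:

```python
import pandas as pd

# Sample rows copied from story.csv in the question.
news = pd.DataFrame({
    "news_title": ["functional", "pbandJ", "uAndI"],
    "news_story": [
        "This story is about a functional requirement",
        "I made a sandwich today",
        "A code name of a spy",
    ],
})
pattern = "red|tie|lace|books|functional"

exist = news["news_story"].str.contains(pattern)

# Map each row's boolean to its own label instead of assigning one
# scalar to the entire column inside a loop.
news["NEWcolumn"] = exist.map({True: "YES", False: "NO"})

print(news[["news_title", "NEWcolumn"]])
```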

dictionary=pd.read_csv("dictionary.csv", header=None, squeeze=True)
print (dictionary)
0           red
1           tie
2          lace
3         books
4    functional
Name: 0, dtype: object

pattern = '|'.join(dictionary)
# to avoid matching substrings, use word boundaries instead:
#pattern = '|'.join(r"\b{}\b".format(x) for x in dictionary)

exist = news['news_story'].str.contains(pattern)
news[exist].to_csv('New.csv')

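The word-boundary variant mentioned in the code comment can be compared with the plain pattern as follows. The second story is a hypothetical sentence, not taken from the question, chosen because "hired" contains "red" as a substring. Using re.escape and a non-capturing group is an additional precaution assumed here, in case a dictionary word contains regex metacharacters.

```python
import re

import pandas as pd

words = ["red", "tie", "lace", "books", "functional"]
stories = pd.Series([
    "This story is about a functional requirement",
    "She hired a new editor",  # hypothetical row: "red" hides inside "hired"
])

# Plain alternation matches the words anywhere, including inside other words.
plain = "|".join(words)

# \b anchors the match to word boundaries; (?:...) groups the alternatives
# without creating a capture group.
bounded = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"

plain_hits = stories.str.contains(plain)
bounded_hits = stories.str.contains(bounded)

print(plain_hits.tolist())
print(bounded_hits.tolist())
```

Only the bounded pattern rejects the second story, since "red" never appears there as a standalone word.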