Python: searching file_2 for the words in file_1

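I need the following behavior: if any word from file_1 is found in file_2, the function should return True; otherwise it should return False. file_1 contains one word per line.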

def search_in_file(filepath_1, filepath_2):
    wordlist_1=[]
    f = open(filepath_1, "r")
    for line in f:
        wordlist_1.append(line)
    print wordlist_1

    wordlist_2=[]
    f = open(filepath_2, "r")
    for line in f:
        wordlist_2.append(line)
    print wordlist_2

    for i in wordlist_1:
        if i in wordlist_2:
            return True
        else:
            return False

I keep getting False, even though some of the words in file_1 do appear in file_2. Can anyone help?

You can use the following approach:

def search_in_file(filepath_1, filepath_2):
    wordlist_1=set(open(filepath_1))
    wordlist_2=set(open(filepath_2))
    return wordlist_1 & wordlist_2 != set() # Check if set intersection is not empty
    # Of course, you could simply return wordlist_1 & wordlist_2,
    # that way you'd know the actual set of all matching words.

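Note that reading a file line by line keeps the line endings. So if the last line of a file has no trailing newline, a match on that line can be missed.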
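To illustrate the line-ending caveat, here is a minimal sketch that strips whitespace from every line before building the sets. The names read_words and search_in_file_stripped are made up for this example, and it assumes one word per line in both files.

```python
def read_words(filepath):
    """Return the set of non-empty lines in filepath, surrounding whitespace removed."""
    with open(filepath, "r") as f:
        # strip() drops the trailing "\n" (or "\r\n"), so "word\n" and "word" compare equal
        return set(line.strip() for line in f if line.strip())


def search_in_file_stripped(filepath_1, filepath_2):
    """Return True if any stripped line of file 1 is also a stripped line of file 2."""
    return bool(read_words(filepath_1) & read_words(filepath_2))
```

With this version a final line without a newline still matches, and blank lines are ignored.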

def search_in_file(filepath_1, filepath_2):
 wordlist_1=[]
 f = open(filepath_1, "r")
 for line in f:
     wordlist_1.append(line)
 print wordlist_1

 wordlist_2=[]
 f = open(filepath_2, "r")
 for line in f:
     wordlist_2.append(line)
 print wordlist_2

 for i in wordlist_1:
     if i in wordlist_2:
         return True
 return False
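The loop above returns True at the first match and returns False only after every word has been checked. The same early-exit logic can be sketched with any(). This is an illustration rather than part of the original answer: the name any_word_in_file is hypothetical, and the sketch additionally strips line endings and keeps the lines of the second file in a set so that each lookup is cheap.

```python
def any_word_in_file(filepath_1, filepath_2):
    """Return True as soon as one line of file 1 is found among the lines of file 2."""
    with open(filepath_2, "r") as f2:
        lines_2 = set(line.strip() for line in f2)
    with open(filepath_1, "r") as f1:
        # any() stops consuming file 1 at the first match; blank lines are skipped
        return any(line.strip() in lines_2 for line in f1 if line.strip())
```

Only the second file is read in full; the first file is read no further than the first matching line.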

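1. Use the with statement and the open and read methods to get the file contents.
2. Use the split() method to turn the file contents into a list of words.
3. Use set to get the values common to both lists.

Input:
File: "/home/vivek/Desktop/input1.txt"
File: "/home/vivek/Desktop/input2.txt"
Code: the searchInFile listing further down.
Output: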
infogrid@infogrid-172:~$ python  workspace/vtestproject/study/test.py
result: ['Good', 'word', 'file', 'I', 'have', 'some', 'second', '5', '7', '6', 'from', 'first']
Words are common in two files.

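Comments:

A function can only return once, so you only ever get the result of checking the first word of wordlist_1.

Do both files contain one word per line, or can file 2 contain several words per line? If so, how should we decide where a word starts and ends? In other words, how are the words separated?

This may not be efficient, but it works, so I don't see why it was downvoted...

This checks line by line, not word by word.

@volcano: Well, he did write that there is one word per line, but he only said so for file 1. We need clarification.

@Tim Pietzcker: then you definitely need to strip() the words from the first file.

Just a thought: any(), applied to the words of file1 against file2, might be more efficient than set in this case. After all, building a set from a large number of strings can be time-consuming.

What happens if the text contains `,`, `?` and so on?

Good point. The content is split on `\n`, `\t`, `\r` and spaces. So if there is a `,`, `?` or similar, it stays attached to the word. I think we need to remove these characters from the file contents before comparing.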
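The punctuation cleanup suggested in the last comment could look roughly like the sketch below, which strips leading and trailing punctuation from each whitespace-separated token before comparing. The function names are hypothetical, and string.punctuation covers ASCII punctuation only.

```python
import string


def words_without_punctuation(text):
    """Split text on whitespace and strip ASCII punctuation from both ends of each token."""
    words = set()
    for token in text.split():
        word = token.strip(string.punctuation)
        if word:  # skip tokens that consisted of punctuation only
            words.add(word)
    return words


def common_words(text_1, text_2):
    """Return the set of cleaned words that occur in both texts."""
    return words_without_punctuation(text_1) & words_without_punctuation(text_2)
```

Punctuation inside a word, such as the apostrophe in don't, is left alone; only the ends of each token are cleaned.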

file second 
I have some word from file first
Good 
5 6 7 8  9 0

def searchInFile(filepath_1, filepath_2):
    with open(filepath_1, "r") as fp:
        wordlist_1 = fp.read().split()

    with open(filepath_2, "r") as fp:
        wordlist_2 = fp.read().split()

    common = set(wordlist_1).intersection(set(wordlist_2))

    return list(common)


filepath_1 = "/home/vivek/Desktop/input1.txt"
filepath_2 = "/home/vivek/Desktop/input2.txt"

result = searchInFile(filepath_1, filepath_2)
print "result:", result
if result:
    print "Words are common in two files."
else:
    print "No Word is common in two files."
