Python: create a dictionary that links each word in a text file to the list of lines where it appears

python python-2.7 dictionary

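I want to build a function that takes a text file as its argument and returns a dictionary in which each word in the text is associated with the list of lines where that word appears. This is what I came up with: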

def dictionary(file):
    in_file=open(file, 'r')
    words=[]
    d={}
    lines=in_file.readlines()

    for line in lines:
        words=words+line.split(' ')

    for j in words:
        for i in range(len(lines)):
            if j in lines[i]:
                d[j]=i
    return d

However, this is not what I want, because it only stores a single line index for each word (rather than a list of them).

Thanks in advance.

Instead of storing a single occurrence for each word in the dictionary, you can store a list. It is then easy to update when another match is found:

def dictionary(file):
    in_file=open(file, 'r')
    words=[]
    d={}
    lines=in_file.readlines()

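    for line in lines:
        # split() with no argument also drops the trailing newline
        words=words+line.split()

    for j in words:
        # scan the file only the first time a word is seen,
        # so its list of line indices is not duplicated
        if (j not in d):
            d[j] = []
            for i in range(len(lines)):
                # compare whole words, not substrings of the line
                if j in lines[i].split():
                    d[j].append(i)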
    return d
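
To see the list-per-key idea in isolation, here is a minimal sketch that works on an in-memory list of lines instead of a file. The helper name index_words and the sample lines are made up for illustration. It relies on dict.setdefault, which returns the list already stored for a key, or stores a new empty list first if the key is missing.

```python
def index_words(lines):
    """Map each word to the list of 0-based indices of lines containing it."""
    d = {}
    for i, line in enumerate(lines):
        # set() so a word repeated within one line is recorded once for that line
        for word in set(line.split()):
            d.setdefault(word, []).append(i)
    return d


sample = ["the rabbit ran", "a fox", "the rabbit hid"]
index = index_words(sample)
print(index["rabbit"])  # [0, 2]
```

Because every line is visited exactly once, no index can be appended twice for the same word, which avoids the repeated-list problem without an extra membership check.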

Here is a function that should do what you are looking for, with comments:

def dictionary(filename):
    # Pass the function a filename (string)

    # set up a dict to hold the results

    result = dict()

    # open the file and pass it to enumerate
    # this combination returns something like a list of
    # (index i.e. line number, line) pairs, which you can 
    # iterate over with the for-loop

    for idx, line in enumerate(open(filename)):

        # now take each line, strip any whitespace (most notably, 
        # the trailing newline character), then split the 
        # remaining line into a list of words contained in that line

        words = line.strip().split()

        # now iterate over the list of words

        for w in words:

            # if this is the first time you encounter this word, 
            # create a list to hold the line numbers within 
            # which this word is found

            if w not in result:
                result[w] = []

            # now add the current line number to the list of results for this word

            result[w].append(idx)

    # after all lines have been processed, return the result
    return result
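
A variant of the same idea, offered as a sketch and not as a replacement for the function above: collections.defaultdict(list) removes the need for the "first time seen" check, and a with block closes the file when the loop finishes. The name word_lines and the sample file contents are assumptions for illustration, and the code is written for Python 3.

```python
import os
import tempfile
from collections import defaultdict


def word_lines(filename):
    """Return {word: [0-based line indices]}, one entry per occurrence."""
    result = defaultdict(list)
    with open(filename) as handle:
        for idx, line in enumerate(handle):
            # split() with no argument ignores the trailing newline
            for w in line.split():
                result[w].append(idx)
    return dict(result)


# Write a small throwaway file to try the function on
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("the rabbit ran\na fox\nthe rabbit saw the fox\n")
    path = tmp.name

index = word_lines(path)
os.remove(path)
print(index["rabbit"])  # [0, 2]
print(index["the"])     # [0, 2, 2]
```

As in the function above, a word that occurs twice on one line has that line index listed twice. Looping over set(line.split()) instead would record each line at most once per word.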

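The relevant functions used above are the built-ins enumerate and open and the string methods strip and split (links do not display properly in comments).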

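What exactly do you want? A dictionary containing all the words and their line numbers?

That is how dictionaries work: one value per key. What kind of output do you expect?

You can make a dict in which each value is a list of numbers, if that is what you want.

OK, maybe I was not clear enough. I want to build a function that takes a text file as its argument and returns a dictionary in which each word in the text is associated with the list of lines where that word appears.

This actually produces very strange output. For example, the key "rabbit" in my text file is linked to the following value: "rabbit": [12, 14, 17, 12, 14, 17]. In other words, the list is repeated many times. Any ideas?

That output comes from multiple occurrences of the same word. Each time the word occurs, the whole file is scanned again, hence the repetition. I have updated my answer.

With almost no changes, it works well. But I am not familiar with this notation, as I am still just a learner. What does for idx, line in enumerate(open(f)) do?

Sure, fair enough. I have just updated the comments. Hope it helps.

@Joe - checking in, did this answer your question?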
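
On the question about for idx, line in enumerate(open(f)): iterating over a file object yields its lines one at a time, and enumerate pairs each item with a running counter that starts at 0. The sketch below uses a plain list in place of the file so it can be run without any input file. The sample lines are made up.

```python
lines = ["first line\n", "second line\n"]

# enumerate yields (index, item) pairs
pairs = list(enumerate(lines))
print(pairs)  # [(0, 'first line\n'), (1, 'second line\n')]

# The for-loop unpacks each (index, line) pair into two names
for idx, line in enumerate(lines):
    print(idx, line.split())

# enumerate accepts a start value, handy for 1-based line numbers
numbered = list(enumerate(lines, 1))
print(numbered[0])  # (1, 'first line\n')
```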