Python: saving unique words to a text file, one word per line

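[Using Python 3.3.3]

I'm trying to analyse text files, clean them up, print the number of unique words, and then save the list of unique words to a text file, one word per line, together with the number of times each unique word appears in the cleaned-up list of words. So what I did was take the text file (a speech from Prime Minister Harper), clean it up by keeping only valid alphabetical characters and single spaces, and count the number of unique words. Now I need to create a saved text file of the unique words, with each unique word on its own line and, beside the word, the number of occurrences of that word in the cleaned-up list. Here's what I have: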

def uniqueFrequency(newWords):
    '''Function returns a list of unique words with amount of occurances of that
word in the text file.'''
    unique = sorted(set(newWords.split()))
    for i in unique:
        unique = str(unique) + i + " " + str(newWords.count(i)) + "\n"
    return unique

def saveUniqueList(uniqueLines, filename):
    '''Function saves result of uniqueFrequency into a text file.'''
    outFile = open(filename, "w")
    outFile.write(uniqueLines)
    outFile.close
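
newWords is the cleaned-up version of the text file: only words and spaces, nothing else. I want every unique word in newWords saved to a text file, one word per line, and beside each word the number of times it occurs in newWords (not in the list of unique words, because there every word would occur exactly once). What is wrong with my functions? Thanks, everyone!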

Based on

unique = sorted(set(newWords.split()))
for i in unique:
    unique = str(unique) + i + " " + str(newWords.count(i)) + "\n"

I'm guessing that newWords is not a list of strings but one long string. If that is the case, newWords.count(i) will return 0 for every i.

Try:

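The line above appends to the existing unique collection itself, so the result starts with the string form of the whole list. If you use a different variable name, such as var, it should return correctly: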

def uniqueFrequency(newWords):
    '''Function returns a list of unique words with amount of occurrences of
    that word in the text file.'''
    var = ""  # accumulate the output here instead of overwriting unique
    unique = sorted(set(newWords.split()))
    for i in unique:
        var = var + i + " " + str(newWords.count(i)) + "\n"
    return var
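
One caveat with the version above: str.count counts substring matches, so a short word is also counted wherever it occurs inside a longer word. A minimal sketch of a whole-word variant follows, assuming newWords is a single cleaned-up string; the names unique_frequency and sample are illustrative and not from the original post.

```python
def unique_frequency(new_words):
    """Return one 'word count' line per unique word, sorted alphabetically."""
    words = new_words.split()  # split once into a list of whole words
    lines = []
    for word in sorted(set(words)):
        # list.count compares whole elements, so "he" is not found inside "the"
        lines.append("{} {}\n".format(word, words.count(word)))
    return "".join(lines)


sample = "the he the other"
result = unique_frequency(sample)
print(result)

# For comparison: the substring count of "he" in the raw string is larger,
# because "the" and "other" both contain "he".
substring_hits = sample.count("he")
```

Calling words.count inside the loop rescans the list for every unique word, which is fine for a single speech but slow for large inputs.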

Try collections.Counter instead. It is designed for exactly this situation.

A demo (see the IPython session at the end of this page):

(Note that this solution is not perfect yet: it does not remove the commas attached to the words. Hint: use …)
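
One possible way to strip the punctuation before counting is sketched below. It assumes ASCII punctuation only and uses str.translate with a deletion table; the lowercasing step and the name count_clean_words are additions for illustration, not part of the original answer.

```python
import string
from collections import Counter


def count_clean_words(text):
    """Count words after deleting ASCII punctuation and lowercasing."""
    # The third argument of str.maketrans lists characters to delete.
    table = str.maketrans("", "", string.punctuation)
    cleaned = text.translate(table).lower()  # lowercasing is an assumption
    return Counter(cleaned.split())


counts = count_clean_words("Words, words. More words!")
print(counts)
```

Apostrophes are punctuation too, so a word such as "I'm" becomes "im" under this approach; adjust the deletion set if that matters.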

Counter is a specialized dict, with the words as keys and the counts as values. So you can use it like this:

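from collections import Counter

cnts = Counter(txt.split())  # count whole words, not single characters
with open('counts.txt', 'w') as outfile:
    for c in cnts:  # iterating over a Counter yields its keys (the words)
        outfile.write("{} {}\n".format(c, cnts[c]))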

Note that this solution uses a couple of nice Python concepts:

  • a
  • iteration over a dict (which this is)
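
Putting the pieces together, here is a minimal sketch of counting and saving in one step, with the words written in alphabetical order as in the question. The function name save_word_counts and the temporary file path are assumptions made for this example. The with block closes the file automatically, whereas outFile.close without parentheses in the question never actually calls the method.

```python
import os
import tempfile
from collections import Counter


def save_word_counts(text, filename):
    """Write 'word count' lines, one unique word per line, sorted by word."""
    counts = Counter(text.split())
    with open(filename, "w") as outfile:  # closed automatically on exit
        for word in sorted(counts):
            outfile.write("{} {}\n".format(word, counts[word]))
    return counts


# Hypothetical output location, used only for this demonstration.
path = os.path.join(tempfile.gettempdir(), "counts_demo.txt")
save_word_counts("b a b", path)

with open(path) as infile:
    saved = infile.read()
print(saved)
```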

How do you know it doesn't work?
This answer is also great. It gave me the correct number of occurrences of each word, so thank you!
In [1]: from collections import Counter

In [2]: txt = """I'm trying to analyse text files, clean them up, print the amount of unique words, then try to save the unique words list to a text file, one word per line with the amount of times each unique word appears in the cleaned up list of words. SO what i did was i took the text file (a speech from prime minister harper), cleaned it up by only counting valid alphabetical characters and single spaces, then i counted the amount of unique words, then i needed to make a saved text file of the unique words, with each unique word being on its own line and beside the word, the number of occurances of that word in the cleaned up list. Here's what i have."""

In [3]: Counter(txt.split())
Out[3]: Counter({'the': 10, 'of': 7, 'unique': 6, 'i': 5, 'to': 4, 'text': 4, 'word': 4, 'then': 3, 'cleaned': 3, 'up': 3, 'amount': 3, 'words,': 3, 'a': 2, 'with': 2, 'file': 2, 'in': 2, 'line': 2, 'list': 2, 'and': 2, 'each': 2, 'what': 2, 'did': 1, 'took': 1, 'from': 1, 'words.': 1, '(a': 1, 'only': 1, 'harper),': 1, 'was': 1, 'analyse': 1, 'one': 1, 'number': 1, 'them': 1, 'appears': 1, 'it': 1, 'have.': 1, 'characters': 1, 'counted': 1, 'list.': 1, 'its': 1, "I'm": 1, 'own': 1, 'by': 1, 'save': 1, 'spaces,': 1, 'being': 1, 'clean': 1, 'occurances': 1, 'alphabetical': 1, 'files,': 1, 'counting': 1, 'needed': 1, 'that': 1, 'make': 1, "Here's": 1, 'times': 1, 'print': 1, 'up,': 1, 'beside': 1, 'trying': 1, 'on': 1, 'try': 1, 'valid': 1, 'per': 1, 'minister': 1, 'file,': 1, 'saved': 1, 'single': 1, 'words': 1, 'SO': 1, 'prime': 1, 'speech': 1, 'word,': 1})