Trying to count specific strings from a .txt file (Python)

I have a file with entries like this:

(observatório=astronómico, de o, universidade=de=coimbra)
(centro=de=astronomia, de o, universidade=do=porto=catarina=lobo)
(núcleo=interactivo=de=astronomia, em o,    centro=de=interpretação=ambiental=da=ponta=do=sal)
(câmara=municipal, de, cascais)
(câmara, de, nova=iorque)
(presidência, de o, pe)
(fortis, em, bruxelas)
(macquarie=futures, de o, eua)
(força=internacional=de=assistência=e=segurança, constituir o, força=de=reacção=rápida=do=comandante)
(forças=nacionais=destacadas, em o, afeganistão)
(nato, em o, afeganistão)
(nato, em o, afeganistão)

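I need to count how many times each string is repeated and write the result to another .txt file. I did it with a dict, but stripping out the special characters is frustrating me.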

#!/usr/bin/python
# -*- coding: utf-8 -*-
from Tkinter import Tk
from tkFileDialog import askopenfilename

Tk().withdraw()  # hide the empty root window
filename = askopenfilename()  # let the user pick the input file
file = open(filename, "r")
wordcount = {}
# split() breaks on whitespace only, so punctuation stays attached to each token
for word in file.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1
file.close()
for k, v in wordcount.iteritems():
    print k, "=", v, "vez(es)"

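Any tips on how to count this correctly, and how to output it so that anyone can read it and see how many times each string (which, given the input format, can be a whole line) appears?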

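Since your text file contains non-word characters, you need to extract the words first. You can do that with a regex, and then use collections.Counter to get a dictionary of word frequencies: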

>>> import re
>>> from collections import Counter
>>> words = re.findall(r'\w+', s)  # s holds the text read from the file
>>> Counter(words)
Counter({'o': 14, 'de': 12, 'em': 5, 'afeganist': 3, 'for': 3, 'do': 3, 'a': 3, 'ncia': 2, 'universidade': 2, 'mara': 2, 'astronomia': 2, 'nato': 2, 'centro': 2, 'c': 2, 'cascais': 1, 'ponta': 1, 'coimbra': 1, 'sal': 1, 'pida': 1, 'observat': 1, 'rio': 1, 'as': 1, 'catarina': 1, 'seguran': 1, 'macquarie': 1, 'nacionais': 1, 'nova': 1, 'eua': 1, 'interpreta': 1, 'internacional': 1, 'constituir': 1, 'pe': 1, 'reac': 1, 'bruxelas': 1, 'lobo': 1, 'assist': 1, 'municipal': 1, 'comandante': 1, 'da': 1, 'mico': 1, 'ambiental': 1, 'astron': 1, 'iorque': 1, 'fortis': 1, 'porto': 1, 'e': 1, 'futures': 1, 'n': 1, 'r': 1, 'interactivo': 1, 'presid': 1, 'destacadas': 1, 'cleo': 1})

\w+ matches any run of one or more word characters.
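
The truncated keys in the output above (for example 'afeganist' instead of afeganistão) suggest the pattern was applied to a byte string, where \w covers only ASCII letters, digits and the underscore. A minimal sketch of the difference, assuming Python 3 (not the Python 2 used in the question), where str patterns are Unicode-aware by default and the re.ASCII flag restores the narrower behaviour:

```python
import re

sample = "(nato, em o, afeganistão)"

# Default str pattern in Python 3: \w also matches accented letters.
unicode_words = re.findall(r"\w+", sample)

# re.ASCII limits \w to [a-zA-Z0-9_], so "ã" splits the word in two.
ascii_words = re.findall(r"\w+", sample, flags=re.ASCII)

print(unicode_words)  # ['nato', 'em', 'o', 'afeganistão']
print(ascii_words)    # ['nato', 'em', 'o', 'afeganist', 'o']
```

If the accented words should stay whole, decoding the file as UTF-8 before matching is probably the simplest route.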

To count a specific word, you can use the list.count() method:

>>> words.count('coimbra')
1
>>> words.count('a')
3
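
As a rough comparison, list.count() rescans the whole list on every call, while a Counter built once can be queried like a dictionary. The sketch below assumes Python 3 and uses a short inline sample in place of the real file:

```python
import re
from collections import Counter

# Inline sample standing in for the file contents.
text = (
    "(nato, em o, afeganistão)\n"
    "(nato, em o, afeganistão)\n"
    "(fortis, em, bruxelas)"
)

words = re.findall(r"\w+", text)
counts = Counter(words)

print(words.count("nato"))  # 2, found by scanning the list
print(counts["nato"])       # 2, found by a dictionary-style lookup
print(counts["coimbra"])    # 0, a Counter returns zero for missing keys
```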

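So what is your question?

Sorry about the bad indentation, I had some trouble with Stack Overflow's text editor; I'm a newbie. I can't actually count it correctly and strip out the special characters. I tried using str.count() but had no success.

Make sure your code is properly indented with spaces in your editor/IDE, copy and paste it into your question, highlight it, and click the {} code format button.

How can I output it? In a "nicely formatted" way? I need to count the number of occurrences of lines/strings.

@GabrielMachado I'm not familiar with TK, but Counter is like a dictionary: you can get its elements just as you would get elements from a dictionary!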
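
Since the asker wants whole entries counted and written out in a readable form, one possible sketch is below. It assumes Python 3; the helper names count_entries and write_report and the file names in the comment are hypothetical, and an in-memory buffer stands in for the output file so the example runs on its own:

```python
import io
from collections import Counter


def count_entries(lines):
    """Count identical entries, ignoring blank lines and spacing differences."""
    normalized = (" ".join(line.split()) for line in lines)
    return Counter(entry for entry in normalized if entry)


def write_report(counts, out):
    """Write one 'entry = N vez(es)' line per entry, most frequent first."""
    for entry, n in counts.most_common():
        out.write(f"{entry} = {n} vez(es)\n")


sample = [
    "(nato, em o, afeganistão)\n",
    "(fortis, em, bruxelas)\n",
    "(nato, em o,   afeganistão)\n",  # extra spaces are normalized away
]
counts = count_entries(sample)

# With real files (hypothetical names), the same helpers could be used as:
# with open("entrada.txt", encoding="utf-8") as src, \
#         open("contagem.txt", "w", encoding="utf-8") as dst:
#     write_report(count_entries(src), dst)

buffer = io.StringIO()
write_report(counts, buffer)
print(buffer.getvalue())
```

Counting whole lines rather than \w+ tokens keeps each (subject, relation, object) entry intact, which seems closer to what the question describes.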