
Python character and word count

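I'm a Python beginner. I'd like to know how to count the characters in two txt files and how to find the 10 most common characters. I'd also like to know how to convert all the characters in the files to lowercase and strip out everything except a-z.

Here is what I tried, without success: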

from string import ascii_lowercase
from collections import Counter

with open ('document1.txt' , 'document2.txt') as f:
    print Counter(letter for line in f
                    for letter in line.lower()
                    if letter in ascii_lowercase)

Here's a simple example. You can adapt this code to fit your needs.

from string import ascii_lowercase
from collections import Counter

with open('file1.txt', 'r') as file1data:  # open and read file one
    file1 = file1data.read().lower()  # convert the entire file contents to lowercase

with open('file2.txt', 'r') as file2data:  # open and read file two
    file2 = file2data.read().lower()

# The contents of files one and two are now stored in the file1 and file2 variables.
# The example below works with one file; repeat it for the second file.
file1_list = []
for ch in file1:
    if ch in ascii_lowercase:  # append only lowercase letters; all non-alphabetic characters are dropped
        file1_list.append(ch)
    elif ch in [" ", ".", ",", "'"]:  # remove this elif block if you just want the letters
        file1_list.append(ch)  # keep basic punctuation

print("".join(file1_list))  # not needed; just shows what the text looks like now
print(Counter(file1_list).most_common(10))  # prints the ten most common characters
print(Counter(file1_list))  # prints every character and how many times it occurs
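Now that you've been through the messy version above and have an idea of what each line does, here is a cleaner version that does what you asked for.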

from string import ascii_lowercase
from collections import Counter

with open('file1.txt', 'r') as file1data: 
    file1 = file1data.read().lower()

with open('file2.txt', 'r') as file2data: 
    file2 = file2data.read().lower() 

file1_list = []
for ch in file1:
    if ch in ascii_lowercase: 
        file1_list.append(ch)

file2_list = []
for ch in file2:
    if ch in ascii_lowercase: 
        file2_list.append(ch)



all_counter = Counter(file1_list + file2_list)
top_ten_counter = all_counter.most_common(10)

print(sorted(all_counter.items()))
print(sorted(top_ten_counter))
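
The filtering and counting above can also be folded into a small helper. The following is a minimal sketch rather than the answerer's code: the names `count_letters` and `read_text` are hypothetical, and the demo runs on stand-in strings so that it works without any files on disk.

```python
from collections import Counter
from string import ascii_lowercase


def count_letters(*texts):
    """Count a-z letters across any number of strings (hypothetical helper)."""
    counter = Counter()
    for text in texts:
        # Lowercase first, then keep only the characters a-z.
        counter.update(ch for ch in text.lower() if ch in ascii_lowercase)
    return counter


def read_text(path):
    """Return the full contents of a text file."""
    with open(path, 'r') as handle:
        return handle.read()


if __name__ == '__main__':
    # Stand-in strings; with real files this would be
    # count_letters(read_text('file1.txt'), read_text('file2.txt')).
    counts = count_letters("Hello, World!", "Python 3")
    print(counts.most_common(10))
```

Feeding the counter with `update` avoids building the intermediate `file1_list` and `file2_list`, which may matter for large inputs.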

Try this:

>>> from collections import Counter
>>> import re
>>> words = re.findall(r'\w+', "{} {}".format(open('your_file1').read().lower(), open('your_file2').read().lower()))
>>> Counter(words).most_common(10)

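Unfortunately, there is no way to insert into the middle of a file without rewriting it. As previous posters have noted, you can append to a file or overwrite part of it using seek, but if you want to add content at the beginning or in the middle, you have to rewrite it.

This is an operating-system thing, not a Python thing. It is the same in all languages.

What I usually do is read from the file, make the modifications, and write the result to a new file called myfile.txt.tmp or something similar. This is better than reading the whole file into memory, because the file may be too large for that. Once the temporary file is complete, I rename it to the same name as the original file.

This is a good, safe way to do it, because if the file write crashes or aborts for any reason, you still have the untouched original file.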
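
The temp-file approach described above might look like the following sketch. The function name `insert_line_at` and the demo file name are hypothetical. It assumes the temporary file sits next to the original, so the final rename stays on one filesystem, and it uses `os.replace` (Python 3.3+) to swap the new file in.

```python
import os
import tempfile


def insert_line_at(path, line_number, new_line):
    """Insert new_line before the 0-based line_number by rewriting the file."""
    tmp_path = path + '.tmp'
    inserted = False
    last_line = ''
    with open(path, 'r') as src, open(tmp_path, 'w') as dst:
        # Stream line by line so the whole file is never held in memory.
        for index, line in enumerate(src):
            if index == line_number:
                dst.write(new_line + '\n')
                inserted = True
            dst.write(line)
            last_line = line
        if not inserted:
            # line_number is at or past the end of the file, so append.
            if last_line and not last_line.endswith('\n'):
                dst.write('\n')
            dst.write(new_line + '\n')
    # Only replace the original once the temporary file is complete.
    os.replace(tmp_path, path)


if __name__ == '__main__':
    demo_path = os.path.join(tempfile.mkdtemp(), 'myfile.txt')
    with open(demo_path, 'w') as handle:
        handle.write('first\nthird\n')
    insert_line_at(demo_path, 1, 'second')
    with open(demo_path) as handle:
        print(handle.read())
```

If the write fails partway through, the exception is raised before `os.replace` runs, so the original file is left untouched.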

To find the most common words across multiple files:

from collections import Counter
import re
with open('document1.txt') as f1, open('document2.txt') as f2:
    words = re.findall(r'\w+', f1.read().lower()) + re.findall(r'\w+', f2.read().lower())

# gives you the 10 most common words
print(Counter(words).most_common(10))

If you want the 10 most common characters
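
One way to get the 10 most common characters is sketched below. This is an illustration under assumptions, not the original answer's code: the function name `top_characters` is hypothetical, and it assumes that only the letters a-z should be counted, as the question asks.

```python
from collections import Counter
from string import ascii_lowercase


def top_characters(paths, n=10):
    """Return the n most common a-z characters across the given files."""
    counter = Counter()
    for path in paths:
        with open(path) as handle:
            # Read line by line to avoid loading a whole file into memory.
            for line in handle:
                counter.update(ch for ch in line.lower() if ch in ascii_lowercase)
    return counter.most_common(n)


# Usage, assuming both files exist in the working directory:
# print(top_characters(['document1.txt', 'document2.txt']))
```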



What error are you getting? Also, your with statement is not formatted correctly. Use with open('file.txt', 'r') as data: You can't open two files with a single open() call like that; you need two with statements.

Thanks. That does work, but it prints out the whole file. How can I make it show only the counter? Also, how can I get it to display like: a 20 b 10 c 14, etc., instead of the current format?

Edited the code above; that should do it for you.
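
For the display question in the comments, one option is to loop over the counter's items and print one letter and count per line. This is only a sketch: the exact layout the asker wanted is an assumption, and `format_counts` is a hypothetical helper.

```python
from collections import Counter


def format_counts(counter):
    """Return 'letter count' strings, sorted alphabetically by letter."""
    return ["{} {}".format(letter, count) for letter, count in sorted(counter.items())]


if __name__ == '__main__':
    sample = Counter("abracadabra")
    # Prints one "letter count" pair per line.
    print("\n".join(format_counts(sample)))
```

To list the most frequent letters first, iterate over `counter.most_common()` instead of `sorted(counter.items())`.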