Python 2.7: counting the number of lines in a text


python-2.7


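I am trying to count how many sentences there are in the abstracts of a set of files. The first part of the code below opens all the folders and files. The second part grabs each file ID and abstract in order to count the sentences. I want a list of results showing how many sentences (here, lines) each file has.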

# open multiple files.
import re, os
topdir = r'E:\Grad\LIS\LIS590 Text mining\Part123\Part1' 
matches = []
for root, dirnames, filenames in os.walk(topdir):
    for filename in filenames:
        if filename.endswith(('.txt','.pdf')):
            matches.append(os.path.join(root, filename))

capturedfiles = []
capturedabstracts = []
Abs=open('countsent.csv','w')
for filepath in matches:
    with open (filepath,'rt') as mytext:
        mytext=mytext.read()

    # code to capture file IDs.
    grabFile=re.findall(r'File\s+\:\s+(\w\d{7})',mytext)
    if len(grabFile) == 0:
        matchFile= "N/A"
    else:
        matchFile = grabFile[0]
    capturedfiles.append(matchFile)
    #print capturedfiles

    # code to capture abstracts
    newtext=re.sub(r'\n',' ',mytext)
    newtext=re.sub(r'\s+',' ',newtext)
    grabAbs=re.findall(r'Abstract\s+\:(\w.+)',newtext)
    if len(grabAbs) == 0:
        matchAbs= "N/A"
    else:
        matchAbs = grabAbs[0]
    capturedabstracts.append(matchAbs)

    lineCount = 0
    lines = matchAbs[0].split('. ')
    for line in lines:
        lineCount +=1
        Abs.write(matchFile + '|' + str(lineCount) + '\n')
Abs.close()

The output is incorrect:


a9000006 | 1
a9000031 | 1
a9000038 | 1
a9000040 | 1
a9000043 | 1

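I need to count the total number of sentences in each abstract. I used lineCount for that, but the result is wrong, and I don't know how to fix it. Here is an example of the result I want:

a9000006 | 4
a9000031 | 11

4 and 11 are the numbers of sentences in those abstracts. Thank you very much.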
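Judging only from the code as posted, one likely cause is that matchAbs is already a string, so matchAbs[0] is just its first character, and splitting a single character on '. ' always yields exactly one piece. The write call also sits inside the for loop, so it would emit one row per sentence rather than one row per file. Below is a minimal sketch of a fix, assuming the same '. ' splitting rule as the question; count_sentences is a hypothetical helper name that does not appear in the original code.

```python
def count_sentences(abstract):
    """Count sentences using the question's rule: split on '. '."""
    if abstract == "N/A":
        # No abstract was captured for this file.
        return 0
    # Split the whole string, not abstract[0] (which is one character).
    parts = [p for p in abstract.split('. ') if p.strip()]
    return len(parts)


sample = "First sentence. Second sentence. Third one."
print(sample[0].split('. '))    # what the posted code effectively splits: ['F']
print(count_sentences(sample))  # 3
```

In the original loop this would amount to computing the count once per file and calling Abs.write a single time after it. Note that splitting on '. ' overcounts abbreviations such as "e.g. " and misses sentences that end in "?" or "!", so the numbers are only approximate.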

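What is the question in "how to output SentenceNumber + matchFile"? There is no SentenceNumber in your code. What do you expect to happen (input/output), and what happens instead? Please provide both.

I have updated my code. The sentence number is lineCount. But the count in the output is not the correct value; I want it to be the total number of sentences for each file.

1. Remove the irrelevant code and get it working for a single file first. 2. Your code already outputs lineCount + matchFile. 3. Describe in plain English what you want your code to do and what happens instead.

Did you see the result? It is not what I want. I have already explained that I need to count the total number of sentences in each file. I used lineCount for that, but the result is wrong, and I don't know how to fix it. Here is an example of the result I want: a9000006 | 4 a9000031 | 11. 4 and 11 are the numbers of sentences in those abstracts.

Don't put relevant information in comments; update your question instead. Your code suggests that you are not using the conventional definition of "line/sentence count", and the code is broken. So you should describe what you want in plain English.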
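Following the suggestion to get a single file working first, here is a rough sketch that handles the text of one file and produces one row for it. It reuses the regular expressions from the question, except that the abstract pattern is loosened with \s* after the colon (an assumption about the file layout) and sentences are split after '.', '!' or '?' followed by whitespace (an assumed definition of "sentence"). The function name summarize and the sample text are hypothetical.

```python
import re


def summarize(text):
    """Return 'fileID|sentenceCount' for the text of a single file."""
    # File ID: same pattern as in the question.
    ids = re.findall(r'File\s+\:\s+(\w\d{7})', text)
    file_id = ids[0] if ids else "N/A"

    # Collapse all whitespace so the abstract sits on one line.
    flat = re.sub(r'\s+', ' ', text)
    # Assumption: allow optional whitespace after the colon.
    abstracts = re.findall(r'Abstract\s+\:\s*(\w.+)', flat)
    if not abstracts:
        return file_id + '|0'

    # Split after sentence-ending punctuation; drop empty pieces.
    pieces = re.split(r'(?<=[.!?])\s+', abstracts[0])
    sentences = [s for s in pieces if s.strip()]
    return file_id + '|' + str(len(sentences))


# Made-up sample input, not taken from the real data set.
sample_text = (
    "File        : a9000006\n"
    "Abstract    : First sentence. Second one.\n"
    "  Third sentence here.\n"
)
print(summarize(sample_text))  # a9000006|3
```

Once this gives the expected count for one file, the same function could be called once per path inside the os.walk loop, writing each returned row followed by a newline.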