Python: counting lines in multiples of 64


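I can't quite work out how to search for this properly. I'm trying to iterate over a list containing the lines of an input file. To log errors, I keep track of the line number of each line.

I want to write the results of the loop to an output file. I already put the newline in the list.append call, and that works fine for spotting whether a line in the file has a problem. After each iteration it writes one newline.

At every block of 64 I want to write two newlines, so that the blocks are distinguishable in the output file. This is what I have so far: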

import sys

fname = sys.argv[1]
list = []
output = "hashes.txt"
with open(fname) as f:
    content = f.readlines()
    num_line = 0
    for line in content:
        if line:    
            num_line += 1
            line = line.split(',')
            try:
                //if num_line == 64??? Not Sure how to iterate in blocks of 64\\
                list.append(line[1] + "\n\n")
            except Exception, ex:
                print("Problem on line", line, num_line)

with open(output, 'w') as w:
    w.writelines(list)

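At this spot:

//if num_line == 64??? Not Sure how to iterate in blocks of 64\\

you are looking for:

if not num_line % 64:

When the remainder of the line number divided by 64 is zero, execution enters the `if` block.

Oh, and Python comments start with `#`, not `//`.

As Cyphase mentioned, you need `if line.strip():` rather than `if line:`, because a line that looks blank still contains its newline character, so it is never an empty string.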
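To make the two conditions above concrete, here is a minimal sketch. The helper name `is_block_end` and the variable `blank_line` are hypothetical and only serve the illustration; they are not part of the original script.

```python
def is_block_end(num_line, block_size=64):
    """Return True when num_line is a multiple of block_size (hypothetical helper)."""
    # A multiple of block_size leaves remainder 0, which is falsy,
    # so `not` turns it into True.
    return not num_line % block_size


# A "blank" line read from a file still carries its newline character.
blank_line = "\n"

print(is_block_end(64), is_block_end(65), is_block_end(128))
print(bool(blank_line), bool(blank_line.strip()))
```

The first `print` shows that only 64 and 128 pass the modulo test; the second shows that `"\n"` is truthy until it is stripped.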

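Unless you are going to process the lines later, you can read and write at the same time without storing them. Also, `list` is a poor choice of variable name because it shadows the built-in `list()`.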
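A small sketch of why shadowing `list` can bite later. The function `shadowing_demo` is a hypothetical name used only for this illustration; the shadowing is kept inside the function so the built-in stays usable elsewhere.

```python
def shadowing_demo():
    """Show what happens when a variable named `list` hides the built-in."""
    list = []  # from here on, `list` refers to this empty list object
    try:
        list("abc")  # no longer the built-in constructor
    except TypeError as exc:
        return str(exc)


message = shadowing_demo()
print(message)

# Outside the function the built-in is untouched.
print(list("ab"))
```

A name such as `hashes` or `output_lines` avoids the problem entirely.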

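Your try/except is also broader than it needs to be: the only error that indexing `line[1]` can raise here is an IndexError, so catch that one specifically. Try this version of your code: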

import sys

fname = sys.argv[1]
# list = [] -- not needed
output = "hashes.txt"
with open(fname) as f, open(output, 'w') as out:
    num_line = 0
    for line in f:
        if line.strip():    
            num_line += 1
            bits = line.strip().split(',')
            try:
                output_line = bits[1]
            except IndexError:
                print("Problem on line", line, num_line)
                continue # skip the rest of the loop,
                         # go to the next line
            if not num_line % 64:
                out.write('{}\n\n'.format(output_line))
            else:
                out.write('{}\n'.format(output_line))

This script should do the same thing you want yours to do, only much more cleanly:

import sys

from itertools import zip_longest


# From itertools recipes:
# https://docs.python.org/3/library/itertools.html
def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


def main(outfile_path, infile_path, group_size):
    with open(infile_path) as infile, open(outfile_path, 'w') as outfile:
        # Filter out lines with zero non-whitespace characters
        nonempty_lines = (line for line in infile if line.strip())

        # Filter out lines that don't have a second value
        splittable_lines = (line for line in nonempty_lines if ',' in line)

        # Get second values from lines that have one, without the trailing
        # newline or surrounding whitespace
        all_values = (line.split(',')[1].strip() for line in splittable_lines)

        # Filter out empty second values
        nonempty_values = (value for value in all_values if value)

        # Create output lines
        output_lines = ('{}\n'.format(value) for value in nonempty_values)

        # Pad the last group with '' rather than None so writelines() accepts it
        for group_of_output_lines in grouper(output_lines, group_size,
                                             fillvalue=''):
            outfile.writelines(group_of_output_lines)
            outfile.write('\n')


if __name__ == '__main__':
    main(outfile_path='hashes.txt', infile_path=sys.argv[1], group_size=64)

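`grouper()` returns an iterator that yields tuples of `n` items from `iterable`; we use it to collect the output lines into groups of 64.

`main()` is well commented, so I won't explain it here unless someone finds something unclear.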
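As a rough illustration of how `grouper()` behaves, the sketch below runs it on the example from its own comment and then on a short list of output lines. An in-memory `io.StringIO` stands in for the output file, and the group size of 2 is an assumption chosen to keep the output small. Note that with the default `fillvalue=None` the last group is padded with `None`, which `writelines()` rejects, so the sketch passes an empty string instead.

```python
import io
from itertools import zip_longest


def grouper(iterable, n, fillvalue=None):
    """Collect data into fixed-length chunks (itertools recipe)."""
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)


# The last tuple is padded with the fill value.
groups = list(grouper('ABCDEFG', 3, 'x'))
print(groups)

# Five output lines written in groups of 2, one blank line after each group.
lines = ['{}\n'.format(i) for i in range(5)]
buffer = io.StringIO()  # stand-in for the real output file
for group in grouper(lines, 2, fillvalue=''):
    buffer.writelines(group)
    buffer.write('\n')

result = buffer.getvalue()
print(repr(result))
```

The separator is also written after a final, partial group, so the output always ends with a blank line.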

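You can use the modulo operator, but there is a better way; give me a minute.

Do you only want the line number to increase when the line isn't blank?

Use modulo, or reset the counter whenever it reaches 64; either works.

So you want to write the second comma-separated item of each line on its own line in the output file, plus an extra newline every 64 lines so the blocks are separated in the output?

It's not that complicated. I just want the loop to know that when the counter reaches a multiple of 64, it should append two newlines instead of one.

The predicate with the modulo needs a `not` (or swap the bodies of the `if..elif`).

What if the line doesn't have a comma?

@BurhanKhalid, `'foo,'.split(',')` gives `['foo', '']`. Those empty values are then filtered out.

Ah, yes, but that is a lot of complexity just to avoid an exception.

Seems simple enough to me :).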
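The counter-reset alternative mentioned in the comments can be sketched as follows, together with the `split()` behaviour under discussion. The function name `write_in_blocks`, the sample values, and the block size of 2 are assumptions made for the illustration; an `io.StringIO` stands in for the output file.

```python
import io


def write_in_blocks(values, out, block_size=64):
    """Write one value per line, adding a blank line after every full block."""
    count = 0
    for value in values:
        out.write('{}\n'.format(value))
        count += 1
        if count == block_size:
            out.write('\n')  # extra newline separates the blocks
            count = 0        # reset instead of using the modulo operator


# split() never fails; it is the later indexing with [1] that can.
parts_with_comma = 'foo,'.split(',')
parts_without_comma = 'foo'.split(',')
print(parts_with_comma, parts_without_comma)

buffer = io.StringIO()  # stand-in for the real output file
write_in_blocks(['a', 'b', 'c', 'd', 'e'], buffer, block_size=2)
result = buffer.getvalue()
print(repr(result))
```

Unlike the `grouper()` version, this variant writes no separator after a trailing partial block.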