How do I delete part of a file in Python?

python file

I have a file named a.txt that looks like this:

I am the first line
I am the second line.
There may be more lines here

I am below an empty line.
I am a line.
There are more lines here

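Now I want to delete everything above the empty line, including the empty line itself.

How can I do this in a Pythonic way?

A simple approach is to iterate over the lines of the file one by one, from top to bottom: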

#!/usr/bin/env python

with open("4692065.txt", 'r') as src, open("4692065.cut.txt", "w") as dest:
    keep = False  # becomes True once the first blank line has been seen
    for line in src:
        if keep:
            dest.write(line)
        elif line.strip() == '':
            keep = True

Basically, you can't delete anything from the beginning of a file, so you have to write a new file.

I think the Pythonic way looks like this:

# get an iterator over the lines in the file:
with open("input.txt", 'rt') as lines:
    # drop lines until the first empty one (the empty line is dropped too)
    for line in lines:
        if not line.strip():
            break

    # now lines is at the point after the first paragraph
    # so write out everything from here
    with open("output.txt", 'wt') as out:
        out.writelines(lines)
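
If the trimmed content should end up under the original file name, one option is to write to a temporary file and then swap it into place. The following is only a sketch of that idea: the function name `drop_first_paragraph` is hypothetical, and it assumes Python 3 and that a file with no blank line should end up empty, as it would with the loop above.

```python
import os
import tempfile


def drop_first_paragraph(path):
    """Rewrite `path` so that it starts right after its first blank line."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file next to the original so the final
    # os.replace() does not have to cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        # newline="" keeps the original line endings untouched.
        with os.fdopen(fd, "w", newline="") as dest, \
                open(path, "r", newline="") as src:
            for line in src:
                if not line.strip():
                    break  # first blank line found: stop skipping
            dest.writelines(src)  # copy whatever is left in the iterator
        os.replace(tmp_path, path)
    except BaseException:
        os.remove(tmp_path)
        raise
```

Calling `drop_first_paragraph("a.txt")` on the file from the question would leave only the three lines below the empty line.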

Here are some simpler versions that do not use the with statement, for older Python versions:

lines = open("input.txt", 'rt')
for line in lines:
    if not line.strip():
        break
open("output.txt", 'wt').writelines(lines)

Here is a very straightforward version that simply splits the file at the empty line:

# first, read everything from the old file
text = open("input.txt", 'rt').read()

# split it at the first empty line ("\n\n")
first, rest = text.split('\n\n',1)

# make a new file and write the rest
open("output.txt", 'wt').write(rest)

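Note that this can be quite fragile: for example, Windows often uses \r\n as the newline, so an empty line would be \r\n\r\n. But usually you know that the file's format uses only one kind of newline, so this may be fine.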
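
To make the split less dependent on the newline convention, the separator can be matched with a regular expression instead of a literal "\n\n". This is a sketch rather than part of the original answer: the helper name `text_after_first_blank_line` is made up, and it assumes that a whitespace-only line counts as empty and that text without any blank line should yield an empty result. In Python 3, text mode with the default `newline` setting already translates \r\n to \n on reading, so this mainly matters for text read with `newline=""` or decoded from bytes.

```python
import re

# A blank line: either the very first line, or a line break followed by a
# line that holds nothing but spaces or tabs. Accepts "\n" and "\r\n".
_BLANK_LINE = re.compile(r"(?:\A|\r?\n)[ \t]*\r?\n")


def text_after_first_blank_line(text):
    """Return everything after the first blank line in `text`."""
    parts = _BLANK_LINE.split(text, maxsplit=1)
    # No blank line found: there is nothing "after" it.
    return parts[1] if len(parts) == 2 else ""
```

Unlike `text.split('\n\n', 1)`, this does not raise `ValueError` when the separator is missing.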

Do you know how big the file is going to be?

You can read the file into memory:

f = open('your_file', 'r')
lines = f.readlines()

This reads the file line by line and stores the lines in a list.

Then close the file and reopen it with 'w':

f.close()
f = open('your_file', 'w')
for line in lines:
    if your_if_here:  # placeholder: your condition for keeping this line
        f.write(line)
f.close()

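This overwrites the current file, and you choose which lines from the list to write back. It may not be a good idea if the file gets large, because the whole file has to reside in memory. On the other hand, it does not require you to create a second file to dump the output into.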
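
For the file in the question, the keep-or-drop condition depends on whether the blank line has already been passed, so a flag can stand in for the placeholder condition. A minimal sketch under that assumption (the function name `rewrite_after_blank` is hypothetical, and `with` blocks are used so the file is closed even on errors):

```python
def rewrite_after_blank(path):
    """Overwrite `path`, keeping only the lines below its first blank line."""
    # Read everything into memory first; opening with 'w' truncates the file.
    with open(path, "r") as f:
        lines = f.readlines()

    seen_blank = False
    with open(path, "w") as f:
        for line in lines:
            if seen_blank:
                f.write(line)  # already past the blank line: keep
            elif not line.strip():
                seen_blank = True  # the blank line itself is dropped
```

Later blank lines are kept, since only the first one acts as the cut point.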

from itertools import dropwhile, islice

def content_after_emptyline(file_object):
    # skip lines up to the first blank one, then skip the blank line itself
    return islice(dropwhile(lambda line: line.strip(), file_object), 1, None)

with open("filename") as f:
    for line in content_after_emptyline(f):
        print(line, end="")

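This module (from the standard library) is handy for this kind of thing. It sets things up so that you can act as though you are editing the file "in place".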
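
The module's name is not given above. Assuming it is `fileinput`, whose `inplace=True` mode matches that description, a sketch could look like the following. The function name `drop_until_blank_inplace` is hypothetical, and Python 3 is assumed.

```python
import fileinput


def drop_until_blank_inplace(path):
    """Remove everything up to and including the first blank line of `path`."""
    seen_blank = False
    # With inplace=True, fileinput moves the original aside and redirects
    # standard output into a new file with the original name.
    with fileinput.input(files=[path], inplace=True) as stream:
        for line in stream:
            if seen_blank:
                print(line, end="")  # goes into the file, not the console
            elif not line.strip():
                seen_blank = True
```

Behind the scenes a second file still exists for a moment, which fits the earlier point that content cannot be removed from the start of a file directly.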

You can do something like this:

with open('a.txt', 'r') as file:
    lines = file.readlines()

blank_line = lines.index('\n')  # index of the first blank line
lines = lines[blank_line+1:]  # keep only the lines below the blank line

with open('a.txt', 'w') as file:
    file.writelines(lines)  # each line already ends with '\n'

This makes the job simpler.

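Corrected so that the lines kept are the ones below the blank line.

Instead of '\n', you can use the value that stores the correct line separator for the current platform.

@ikostia: That's silly. The file you are processing can use different newlines from the operating system's default.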