Small bug in code written to format a text file (incorrect spacing) (Python 3)

python python-3.x

Apologies if this is a silly question.

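I have some text that I am trying to format to make it easier to read, so I tried writing a short Python program to do it for me. I first used Find and Replace in MS Word to remove the extra paragraph breaks. The input text looks like this: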

This is a sentence. So is this one. And this.
(empty line)
This is the next line
(empty line)
and some lines are like this.

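I want to remove all the empty lines so that there are no gaps between lines, and make sure no sentence is left hanging in the middle the way "This is the next line" is in the input. Every new line should start with two spaces, represented by the $ symbols below. After formatting, it should look like this: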

$$This is a sentence. So is this one. And this.
$$This is the next line and some lines are like this.

I wrote the following script:

import os

directory = "C:/Users/DELL/Desktop/"
filename = "test.txt"
path = os.path.join(directory, filename)
with open(path,"r") as f_in, open(directory+"output.txt","w+") as f_out:
    temp = "  "
    for line in f_in:
        curr_line = line.strip()
        temp += curr_line
        #print("Current line:\n%s\n\ntemp line: %s" % (curr_line, temp))
        if curr_line:
            if temp[-1]==".": #check if sentence is complete
                f_out.write(temp)
                temp = "\n  " #two blank spaces here

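It removes all the empty lines, indents each new line by two spaces, and joins the hanging sentences, but it does not insert the necessary space between the joined pieces. As a result, the output is currently missing the space between the words "line" and "and".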
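The missing space can be reproduced in isolation. The sketch below is not part of the original script: it feeds the same loop body a hard-coded list of lines (an assumption made for illustration, standing in for the file) to show that `strip()` removes the line break and `+=` then glues the two pieces together with nothing in between.

```python
# Stand-in for the file contents: a hanging sentence split by an empty line.
lines = ["This is the next line\n", "\n", "and some lines are like this.\n"]

temp = "  "
for line in lines:
    curr_line = line.strip()   # drops the trailing "\n" and any padding
    temp += curr_line          # nothing is put back between the pieces

print(repr(temp))
# '  This is the next lineand some lines are like this.'
```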

I tried to fix this by changing the corresponding lines of code to the following:

temp += " " + curr_line
temp = "\n " #one space instead of two

This doesn't work, and I don't know why. It may be a problem with the text itself, which I will check.

Any advice would be appreciated, and if there is a better way to do what I am trying to do than the mess I have written, I would like to know that too.

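Edit: I seem to have fixed it. My text is very long, so at first I did not notice that two of the lines were separated by two empty lines rather than one, which is why my attempted fix did not work. I moved the concatenation down one line, inside the check for a non-empty line, to produce the following loop, and that seems to have fixed it: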

for line in f_in:
    curr_line = line.strip()
    #print("Current line:\n%s\n\ntemp line: %s" % (curr_line, temp))
    if curr_line:  # skip empty and whitespace-only lines
        temp += " " + curr_line
        if temp[-1]==".": #check if sentence is complete
            f_out.write(temp)
            temp = "\n "  # one space here; the " " added above supplies the second
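Why the position of the concatenation matters may be easier to see side by side. The following is a minimal sketch, not the asker's script: the helper `join_lines`, its `check_first` flag, and the sample input are all made up for illustration. When the space is added before the blank-line check, every empty or whitespace-only line still contributes a space of its own.

```python
def join_lines(lines, check_first):
    """Join stripped lines with spaces; optionally skip blank lines first."""
    temp = ""
    for line in lines:
        curr_line = line.strip()
        if check_first and not curr_line:
            continue               # blank line: leave temp untouched
        temp += " " + curr_line    # without the check, a blank line adds " "
    return temp

# Hypothetical input: two blank lines (one holding only spaces) between the halves.
lines = ["This is the next line\n", "\n", "   \n", "and some lines are like this.\n"]

print(repr(join_lines(lines, check_first=False)))
# ' This is the next line   and some lines are like this.'
print(repr(join_lines(lines, check_first=True)))
# ' This is the next line and some lines are like this.'
```

With the check first, the number of blank lines between two pieces of text no longer affects the spacing.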

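I also saw that one of the answers below originally included some regular expressions; I suppose I will have to learn about those at some point.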
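For reference, a regex-based version might look roughly like the sketch below. It is not the code from the answer mentioned above, which is not shown here; the function name `reformat` and the sample string are assumptions made for illustration. It reads the whole text at once, so it suits files that fit comfortably in memory.

```python
import re

def reformat(text):
    """Join broken lines and indent each resulting line by two spaces."""
    text = text.strip()
    # A line break (with any blank lines around it) that does NOT follow a
    # period is a hanging sentence: replace the whole gap with one space.
    text = re.sub(r"(?<=[^.\s])\s*\n\s*", " ", text)
    # A line break that follows a period ends a line: keep a single newline
    # and start the next line with two spaces.
    text = re.sub(r"(?<=\.)\s*\n\s*", "\n  ", text)
    return "  " + text + "\n"

sample = (
    "This is a sentence. So is this one. And this.\n"
    "\n"
    "This is the next line\n"
    "\n"
    "and some lines are like this.\n"
)
print(reformat(sample), end="")
```

Unlike the loop in the question, this version also keeps trailing text that does not end in a period.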

Thanks, everyone, for your help.

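This should work. It is effectively the same as yours, but a little more efficient. Instead of building the string with + and +=, which can be slow when repeated many times, it keeps the incomplete lines in a list. Once a line ends with a period, it writes two spaces, then the collected pieces joined by single spaces, then a newline. Writing only when a line is complete keeps the logic simple.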

temp = []
with open(path_in, "r") as f_in, open(path_out, "w") as f_out:
    for line in f_in:
        curr_line = line.strip()
        if curr_line:
            temp.append(curr_line)
            if curr_line.endswith('.'):  # write our line
                f_out.write('  ')
                f_out.write(' '.join(temp))
                f_out.write('\n')
                temp.clear()  # reset temp

Output

  This is a sentence. So is this one. And this.
  This is the next line and some lines are like this.
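One thing the loop above does not handle is text at the end of the file that never reaches a period: it stays in `temp` and is never written. The sketch below is one possible way to cover that case. The function name `reflow`, its `indent` parameter, and the in-memory `io.StringIO` objects standing in for the files are all assumptions made for illustration.

```python
import io

def reflow(f_in, f_out, indent="  "):
    """Copy f_in to f_out, joining lines until one ends with a period."""
    parts = []
    for line in f_in:
        curr_line = line.strip()
        if not curr_line:
            continue                      # skip blank / whitespace-only lines
        parts.append(curr_line)
        if curr_line.endswith("."):       # sentence complete: emit one line
            f_out.write(indent + " ".join(parts) + "\n")
            parts.clear()
    if parts:                             # trailing text with no final period
        f_out.write(indent + " ".join(parts) + "\n")

# In-memory stand-ins for the input and output files.
source = io.StringIO("First part\n\nsecond part.\n\nNo period at the end\n")
result = io.StringIO()
reflow(source, result)
print(result.getvalue(), end="")
```

Because `reflow` accepts any iterable of lines and any object with a `write` method, it can be called with the two file objects from a `with open(...)` block in the same way.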

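An explanation would be nice, since the OP is new and is asking how to do something, not just saying "give me the code".

I managed to solve my problem myself. I was silly not to check my input text, where two lines of text were separated by two empty lines, one of which contained whitespace. This is very helpful too, though. I did not know that the startswith and endswith methods existed, or that string concatenation is slow. I will study your solution. Thanks @FHTMitchell.

Does temp += curr_line compile? You probably need an operator to combine the space and curr_line, such as +.

An addition operator was missing; it was copied over from old code. Thanks @统一过程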
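On the endswith method mentioned in the comments: it also accepts a tuple of suffixes, which could be useful if sentences in the text end in something other than a period. The sketch below assumes that "!" and "?" should count as sentence endings as well, which the question does not say; `SENTENCE_ENDINGS` and `is_complete` are names made up for illustration.

```python
# Assumed set of terminators; the question itself only checks for ".".
SENTENCE_ENDINGS = (".", "!", "?")

def is_complete(line):
    """Return True if the stripped line ends with one of the terminators."""
    return line.strip().endswith(SENTENCE_ENDINGS)

for sample in ["This is the next line", "Is this one done?", "And this.  "]:
    print(repr(sample), is_complete(sample))
```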