Python file read/write adds an extra last number


python python-2.7


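I wrote a quick and dirty Python script for my dad that reads the text files in a given folder and replaces the top lines with a specific format. Apologies for the mix of plus signs (+) and commas (,). The goal is to replace something like this: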

Sounding: BASF CPT-1          
   Depth:   1.05 meter(s)

with something like this:

Tempo(ms); Amplitude(cm/s)      Valores provisorios da Sismica; Profundidade[m] =  1.05

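I thought I had everything worked out until my dad mentioned that every text file now has its last number repeated on a new line. Here are some sample outputs: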

- Not enough reputation to post more than 2 links, sorry

Here is my code:

TIME    AMPLITUDE  
(ms)


#imports
import glob, inspect, os, re
from sys import argv

#work
is_correct = False
succeeded = 0
failed = 0

while not is_correct:
    print "Please type the folder name: "
    folder_name = raw_input()
    full_path = os.path.dirname(os.path.abspath(__file__)) + "\\" + folder_name + "\\"
    print "---------Looking in the following folder: " + full_path
    print "Is this correct? (Y/N)"
    confirm_answer = raw_input()

    if confirm_answer == 'Y':
        is_correct = True
    else:
        is_correct = False

files_list = glob.glob(full_path + "\*.txt")
print "Files found: ", files_list

for file_name in files_list:
    new_header = "Tempo(ms); Amplitude(cm/s)      Valores provisorios da Sismica; Profundidade[m] ="
    current_file = open(file_name, "r+")
    print "---------Looking at: " + current_file.name
    file_data = current_file.read()
    current_file.close()

    match = re.search("Depth:\W(.+)\Wmeter", file_data)
    if match:
        new_header = new_header + str(match.groups(1)[0]) + "\n"
        print "Depth captured: ", match.groups()
        print "New header to be added: ", new_header
    else:
        print "Match failed!"

    match_replace = re.search("(Sounding.+\s+Depth:.+\s+TIME\s+AMPLITUDE\s+.+\s+)   \d", file_data)
    if match_replace:
        print "Replacing text ..."
        text_to_replace = match_replace.group(1)
        print "SANITY CHECK - Text found: ", text_to_replace
        new_data = file_data.replace(text_to_replace, new_header)
        current_file = open(file_name, "r+")
        current_file.write(new_data)
        current_file.close()
        succeeded = succeeded + 1
    else:
        print "Text not found!"
        failed = failed + 1

    # this was added after I noticed the mysterious repeated number (quick fix)
    # why do I need this?
    lines = file(file_name, 'r').readlines() 
    del lines[-1] 
    file(file_name, 'w').writelines(lines) 

print "--------------------------------"
print "RESULTS"
print "--------------------------------"
print "Succeeded: " , succeeded
print "Failed: ", failed
    #template -- new_data = file_data.replace("Sounding: BASF CPT-1\nDepth:  29.92 meter(s)\nTIME    AMPLITUDE  \n(ms)\n\n")

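What exactly am I doing wrong? I don't know why the extra number gets added at the end (as you can see in the "Modified text file - broken" link above). I'm sure it is something simple, but I don't see it. To reproduce the broken output, just comment out the following lines: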

    lines = file(file_name, 'r').readlines() 
    del lines[-1] 
    file(file_name, 'w').writelines(lines)

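The problem is that when you go to write the new data to the file, you open the file in mode "r+", which means "open the file for reading and writing, starting at the beginning". Your code then writes the data into the file from the beginning. However, your new data is shorter than the data already in the file, and because the file is never truncated, the extra bytes of the old data are left over at the end of the file.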
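The leftover data is easy to reproduce in isolation. The following is a minimal sketch, not the original script: the file contents and the variable names (old_text, new_text, result) are made up for illustration, and it is written to run on both Python 2.7 and Python 3.

```python
import os
import tempfile

# Hypothetical file contents: a long header followed by three data lines.
old_text = "a fairly long original header line\n1\n2\n3\n"
# Replacement contents: same data lines, but a shorter header.
new_text = "short header\n1\n2\n3\n"

# Create a scratch file holding the old contents.
fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
with open(path, "w") as handle:
    handle.write(old_text)

# "r+" positions the file at the start but does NOT truncate it,
# so writing shorter data only overwrites the first part of the file.
with open(path, "r+") as handle:
    handle.write(new_text)

with open(path, "r") as handle:
    result = handle.read()
os.remove(path)

# The file starts with the new data, followed by a stale tail of the old data.
print(repr(result))
```

The printed value begins with the new text and then continues with whatever part of the old text was not overwritten, which is how a trailing number can end up duplicated.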

Quick fix: in the if match_replace: section, change this line:

current_file = open(file_name, "r+")

to this:

current_file = open(file_name, "w")

This opens the file in write mode, which truncates the file before anything is written. I just tested it and it works fine.
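As a rough sketch of the fix, here are two ways to make sure no stale data survives. The helper names rewrite_with_w and rewrite_in_place are hypothetical and do not appear in the original script; the second variant is an alternative (not the change suggested above) that keeps "r+" but calls truncate() after writing. The code should run on Python 2.7 and Python 3.

```python
import os
import tempfile


def rewrite_with_w(path, new_text):
    """Replace the whole file; mode "w" truncates the file when it is opened."""
    with open(path, "w") as handle:
        handle.write(new_text)


def rewrite_in_place(path, transform):
    """Read, transform, and write back through a single "r+" handle."""
    with open(path, "r+") as handle:
        data = handle.read()
        handle.seek(0)                 # go back to the start before writing
        handle.write(transform(data))
        handle.truncate()              # drop anything past the current position


# Usage example on a scratch file with made-up contents.
fd, demo_path = tempfile.mkstemp(suffix=".txt")
os.close(fd)
rewrite_with_w(demo_path, "a fairly long original header line\n1\n2\n3\n")
rewrite_in_place(
    demo_path,
    lambda text: text.replace("a fairly long original header line", "short"),
)
with open(demo_path, "r") as handle:
    demo_result = handle.read()
os.remove(demo_path)
```

Either approach also makes the del lines[-1] workaround at the end of the loop unnecessary, since there is no longer a stray last line to delete.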

Oh, nicely explained. I deleted my answer; I was on completely the wrong track.

@aaron_world_traveler: That solved the problem! Thank you.

@ragnarswanson - No problem, my pleasure!