Weird/novel Python behavior: printing a string gives different output than saving it to a file

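I have some code. It is meant to iterate over a folder, find two points in each file, and remove everything that is not between those two points, for example: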

<head>
   <title>This is bad HTML</title>
</head>
<body>
  <h1> Remove me</h1>
  <div class="title">
    <h1> This is the good data, keep me</h1>

    <p> Keep this text </p>

  </div>
  <div class="footer">
    <h1> Remove me, I am pointless</h1>
  </div>
</body>
This is what gets saved to my HTML file, with _mod appended to the name:

<div class="title">
<h1>Title</h1>
</div>
</div>
<div class="footer">

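Nothing else is written, even though there is a lot of text between the title and footer divs, and even though it prints correctly in the console.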


Am I overlooking something? It looks like a very novel bug.

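An aside:
files = [fn for fn in os.listdir("C:/Users/FOLDER") if fn.endswith('.html')]
Another aside: str(start) is preferable to (more Pythonic than) start.__str__().
This comment,
#newContent = content[start:end-1]
leads me to believe you are modifying content somewhere but not showing it in the posted code.
@StevenRumbalski I have made both of those changes, thanks. As for the commented-out code, I did not change anything. What you see is what you get: all of the code is posted. I commented out the variable assignment while troubleshooting! :)
Your code runs fine on my machine. Are you sure you checked the correct file?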
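One way to act on the "did you check the correct file?" comment is to read the output file back right after writing it and compare it with the slice that was printed. The following is only a sketch in Python 3 (the question's code is Python 2). The helper name write_and_read_back, the sample string, and the temporary output path are all assumptions made for illustration and are not part of the posted code.

```python
import os
import tempfile


def write_and_read_back(path, text):
    """Write text to path, then reopen the file and return what is on disk."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)
    with open(path, encoding="utf-8") as handle:
        return handle.read()


# Hypothetical sample standing in for one of the real HTML files.
content = (
    '<h1> Remove me</h1>\n'
    '<div class="title">\n<p> Keep this text </p>\n</div>\n'
    '<div class="footer">\n</div>\n'
)
start = content.find('<div class="title">')
end = content.find('<div class="footer">')
expected = content[start:end]

# A temporary folder is used here so the sketch does not touch real data.
out_path = os.path.join(tempfile.mkdtemp(), "sample_mod.html")
on_disk = write_and_read_back(out_path, expected)

print(os.path.abspath(out_path))  # the exact file that was written
print(on_disk == expected)        # whether the file matches the slice
```

Printing the absolute path makes it easier to confirm that the file being inspected in the editor is the one the script just wrote.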
start=<div class="title">
end=<div class="footer">
import os

dir = os.listdir("C:/Users/FOLDER")

files = []

for file in dir:
    if file[-5:] == '.html':
        files.insert(0, file)

for fileName in files:
    file = open("C:/Users/FOLDER/" + fileName)
    content = file.read()
    file.close()

    start = content.find('<div class="title">')
    end = content.find('<div class="footer">')

    print "Start => " + start.__str__() #This is -1 if nothing found
    print "End =>" + end.__str__()      #Same

    start = start if (start != -1) else 0 #Removing -1 if found
    end = end if (end != -1) else len(content) #Same

    print "Edited start => " + start.__str__() #Verifying -1 changed
    print "Edited end -> " + end.__str__()     #Same

    print "CONTENTS=========>>>>>>>>>>>>>" + content[start:end-1] #prints perfectly fine


    #newContent = content[start:end-1]


    file = open("C:/Users/FOLDER/" + fileName[0:-5] + "_mod" + ".html", 'w')
    file.write(content[start:end]) #Writes only the start and end nodes to file, example below
    file.close()
<div class="title">
<h1>Title</h1>
</div>
</div>
<div class="footer">
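For reference, the same logic can be sketched in Python 3 with the two suggestions from the comments applied (a list comprehension with endswith, and str() instead of __str__()). This is a hedged rewrite, not the asker's code: the function names extract_between and process_folder are hypothetical, the with blocks replace the explicit close() calls, and skipping files that already end in _mod.html is an added assumption so that a second run does not reprocess earlier output.

```python
import os

START_MARKER = '<div class="title">'
END_MARKER = '<div class="footer">'


def extract_between(content, start_marker=START_MARKER, end_marker=END_MARKER):
    """Return the text from start_marker up to, but not including, end_marker."""
    start = content.find(start_marker)  # -1 if the marker is missing
    end = content.find(end_marker)      # -1 if the marker is missing
    start = start if start != -1 else 0        # fall back to the start of the text
    end = end if end != -1 else len(content)   # fall back to the end of the text
    return content[start:end]


def process_folder(folder):
    """Write a *_mod.html copy of every .html file found in folder."""
    # Assumption (not in the original): skip output files from earlier runs.
    files = [fn for fn in os.listdir(folder)
             if fn.endswith('.html') and not fn.endswith('_mod.html')]
    for file_name in files:
        with open(os.path.join(folder, file_name)) as source:
            content = source.read()
        print("Start => " + str(content.find(START_MARKER)))
        print("End => " + str(content.find(END_MARKER)))
        target_name = file_name[:-5] + "_mod.html"
        with open(os.path.join(folder, target_name), 'w') as target:
            # The same slice is used for both printing and writing.
            target.write(extract_between(content))
```

Calling process_folder("C:/Users/FOLDER") would mirror the loop in the question. Keeping the slice in one function means the printed text and the written text cannot differ by an off-by-one such as end-1 versus end.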