Why does my Python script have to be run twice?


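I wrote this Python script to scrape data from the web and write the output to a separate file. The file "refID.txt" contains a list of IDs, and for each ID the data has to be extracted from the site. The output is written to the file "output.txt". Here is my code: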

import urllib
import re

referencefile = open("refID.txt")

IDlist = referencefile.read()

refIDlist = IDlist.split("\n")

f = open("output.txt", 'w')

i=0
while i<len(refIDlist):
  url = "http://www.ncbi.nlm.nih.gov/clinvar/variation/"+refIDlist[i]
  htmlfile = urllib.urlopen(url)
  htmltext = htmlfile.read()
  regex = '<dt>Variant type:</dt><dd>(.+?)</dd>'
  pattern = re.compile(regex)
  Vtype = re.findall(pattern,htmltext)
  vt = Vtype[0]
  printing = "Variation type of " + refIDlist[i] + " is " + str(vt)
  print >> f, printing
  i+=1

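My problem is that the code has to be run twice before anything is written to "output.txt". If the script is run once, the file stays empty. If it is run a second time, the output appears. How can I get the output written when the code is run only once?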

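Try using

with open('output.txt', 'w') as f:

followed by the code that should run on the open file. The file is then closed automatically when the block ends.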

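Whenever you work with files, always remember to close them. Closing ensures that the data is read and written correctly and that the resources are released.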
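To illustrate why closing matters here, the following is a minimal sketch in Python 3 syntax (the question itself uses Python 2). It assumes CPython's default buffering for text files, and the scratch file name and sample text are made up for the demonstration. A short write typically stays in an in-memory buffer until the file is flushed or closed, so the file on disk can still be empty while the handle is open:

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "demo_output.txt")

f = open(path, "w")
f.write("sample line\n")  # made-up text, small enough to stay in the buffer

# With default buffering the text has not reached the disk yet.
size_before_close = os.path.getsize(path)

f.close()  # closing flushes the buffer to disk

size_after_close = os.path.getsize(path)

print(size_before_close, size_after_close)
```

Under these assumptions the first size is zero and the second is not, which matches the symptom in the question: the script finished writing, but the data was never flushed because the file was never closed.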

import urllib
import re

with open("refID.txt", 'r') as referencefile:
    IDlist = referencefile.read()

refIDlist = IDlist.split("\n")

with open("output.txt", 'w') as f:
    i=0
    while i<len(refIDlist):
        url = "http://www.ncbi.nlm.nih.gov/clinvar/variation/"+refIDlist[i]
        htmlfile = urllib.urlopen(url)
        htmltext = htmlfile.read()
        regex = '<dt>Variant type:</dt><dd>(.+?)</dd>'
        pattern = re.compile(regex)
        Vtype = re.findall(pattern,htmltext)
        vt = Vtype[0]
        printing = "Variation type of " + refIDlist[i] + " is " + str(vt)
        print >> f, printing
        i+=1

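Instead of calling f.close() and referencefile.close(), I opened both files with the with statement. This is best practice when working with files, because each file is closed automatically as soon as its with block ends, even if an exception is raised inside it.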
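For readers on Python 3, the same idea might look roughly like the sketch below. This is not the answerer's code: the names BASE_URL, PATTERN, fetch_html and write_report are hypothetical, the UTF-8 decoding and the "not found" fallback are assumptions, and blank IDs are skipped because split("\n") leaves an empty string after a trailing newline. The fetch step is passed in as a parameter so the file handling can be tried without network access.

```python
import re
import urllib.request

# URL prefix and regex taken from the question; the constant names are made up.
BASE_URL = "http://www.ncbi.nlm.nih.gov/clinvar/variation/"
PATTERN = re.compile(r"<dt>Variant type:</dt><dd>(.+?)</dd>")


def fetch_html(ref_id):
    """Download the page for one ID (hypothetical helper; needs network access)."""
    with urllib.request.urlopen(BASE_URL + ref_id) as response:
        # Assumption: the page can be decoded as UTF-8.
        return response.read().decode("utf-8", errors="replace")


def write_report(ref_ids, out_path, fetch=fetch_html):
    """Write one line per non-empty ID and return how many lines were written."""
    written = 0
    # The with block closes (and therefore flushes) the file when it exits.
    with open(out_path, "w") as f:
        for ref_id in ref_ids:
            ref_id = ref_id.strip()
            if not ref_id:
                continue  # skip the empty entry left by a trailing newline
            match = PATTERN.search(fetch(ref_id))
            # Assumption: record a placeholder instead of failing on a missing match.
            vtype = match.group(1) if match else "not found"
            print("Variation type of " + ref_id + " is " + vtype, file=f)
            written += 1
    return written
```

With the IDs read from "refID.txt" as in the question, a call such as write_report(IDlist.split("\n"), "output.txt") would produce the output file in a single run, since the file is closed before the function returns.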

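Are you calling f.close()? I'm not sure about this, but I know XlsxWriter doesn't write data to the file until the stream is closed. Maybe your data is held in memory until open() is called again?

I am not. Do I have to? I'll give it a try.

I'm not sure what the print is doing there, but have you tried replacing it with write(printing)?

Writing to the file once is more efficient than writing on every loop iteration.

@Will - I added f.close() at the end. And it works! Thanks.

I'll try this. Thanks a lot.

This doesn't work. I get the following error message: "Warning: 'with' will become a reserved keyword in Python 2.6"

Which version of Python are you using? The with statement is standard in Python 2.7, so upgrade to Python 2.7 or later. If you move to Python 3 or later, you must change the print statement to the print function, e.g. print().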
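The two fixes raised in the comments, closing the file explicitly and moving from the print statement to the print function, can be sketched together for Python 3. This is only an illustration: the scratch path and the sample lines are placeholders, not the question's data.

```python
import os
import tempfile

# Placeholder path for the demonstration; the question writes to "output.txt".
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")

f = open(out_path, "w")
try:
    # Python 3 form of the Python 2 statement: print >> f, printing
    print("first line", file=f)
    # Equivalent call using write(); the newline has to be added by hand.
    f.write("second line\n")
finally:
    f.close()  # runs even if the code above raises, so buffered data is flushed

# Read the file back to confirm both lines reached the disk.
with open(out_path) as check:
    lines = check.read().splitlines()
```

The try/finally form does by hand what the with statement does automatically, which is why the with version is usually preferred when the Python version supports it.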