Python: saving URL titles in a text file

python scrapy

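Hello, I have the scraping code below. I want to save the titles of all the provided URLs in one file, but it only saves the title of the last URL (url3).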

    from scrapy.spider import BaseSpider
    from scrapy.selector import Selector
    from scrapy.http import HtmlResponse
    from kirt.items import KirtItem 

    class KirtSpider(BaseSpider):

        name = "spider-name"

        allowed_domains = ["url1","url2","url3"]

        start_urls = ["url1","url2","url3"]


        def parse(self, response):

            sel = Selector(response)
            title = str(sel.xpath('//title/text()').extract())

            with open('alltitles.txt','w') as f:
                f.seek(0)
                f.write(title)

The problem is in this block, in two different ways:

    with open('alltitles.txt','w') as f:
        f.seek(0)
        f.write(title)

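Opening a file in 'w' mode does not just open it: if a file with that name already exists, its contents are deleted first. Because parse() runs once per URL, each call wipes out the title written by the previous one. You should open the file in 'a' (append) mode instead, which adds new lines to the end of the existing file if there is one.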

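However, after that you also call f.seek(0), which rewinds the file's write pointer to the beginning of the file, so whatever you write next starts over the top of the current contents. Drop that call. The code should look more like: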

    with open('alltitles.txt','a') as f:
        # write out the title and add a newline.
        f.write(title + "\n")
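To see the difference between the two modes without running a spider, here is a minimal sketch. It assumes that parse() is called once per response, and it simulates that with a plain loop. The save_title and read_lines helpers, the file names and the three titles are all made up for illustration and are not part of the original code.

```python
import os
import tempfile


def save_title(path, title, mode):
    """Write one title per call, the way parse() runs once per response."""
    with open(path, mode) as f:
        f.write(title + "\n")


def read_lines(path):
    """Return the file's contents as a list of lines without newlines."""
    with open(path) as f:
        return f.read().splitlines()


# Hypothetical titles standing in for the three scraped pages.
titles = ["Title of url1", "Title of url2", "Title of url3"]

workdir = tempfile.mkdtemp()
w_path = os.path.join(workdir, "titles_w.txt")
a_path = os.path.join(workdir, "titles_a.txt")

for title in titles:
    save_title(w_path, title, "w")  # truncates the file on every call
    save_title(a_path, title, "a")  # keeps earlier lines, adds at the end

w_lines = read_lines(w_path)
a_lines = read_lines(a_path)
print(w_lines)  # ['Title of url3']
print(a_lines)  # ['Title of url1', 'Title of url2', 'Title of url3']
```

One thing to keep in mind with 'a' mode: the file also survives between runs, so titles from an earlier crawl stay in it unless you delete the file before starting the spider again.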


Check your indentation. I'm guessing you want the with ... clause indented one stop, yes?

Sorry, I didn't follow you.

Oh, got you. Let me check.

@bgporter I just mistyped. My source file is fine, and it is now corrected here too. Do you know how I can save all the titles instead of just one? I have a feeling it saves the title each time and then overwrites it.

Bingo, thanks. One thing I forgot to mention at first: we also have to convert the title to a string.