Python Scrapy: fetching data from MongoDB inside a spider
I have created a spider that scrapes a site's products from its listing pages. Is there any way to connect to MongoDB from within my spider, fetch the list of stored URLs, and scrape those URLs? Thanks.

You can load the URLs from MongoDB inside the spider itself:
from pymongo import MongoClient
import scrapy

class MySpider(scrapy.Spider):
    name = "myspider"

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        # MongoClient() connects to localhost:27017 by default;
        # pass a host and port to override
        self.db = MongoClient()
        # adjust the query to match how the data is stored in the collection
        self.urls = self.db.db_name.collection.find()

    def parse(self, response):
        # other processing here
        for doc in self.urls:  # documents fetched from the database
            # do operations with the URLs, e.g. yield new requests
            pass
You can load the URLs from the database and use them in your spider.