Python scrapy can't import a module at run time; it's on my PYTHONPATH

Tags: python, import, scrapy, pythonpath

I had a working scrapy project, and then I decided to clean it up. To do that, I moved my database module out of the scrapy part of my project, and now I can no longer import it. The project now looks like this:

myProject/
    database/
        __init__.py
        model.py
        databaseFactory.py
    myScrapy/
        __init__.py
        settings.py
        myScrapy/
            __init__.py
            pipeline.py
        spiders/
            spiderA.py
            spiderB.py
    api/
        __init__.py
    config/
        __init__.py
(only the files relevant to my question are shown) I want to use databaseFactory from within scrapy.
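
For context, the import that fails is in the pipeline module. Here is a minimal sketch of what myScrapy/pipeline.py presumably looks like; the QueueExportPipe name comes from the traceback below, and the class body is a hypothetical stand-in:

# myScrapy/pipeline.py - hypothetical sketch; the top-level import is the
# line that fails when the project root is not on sys.path at crawl time.
import database.databaseFactory as databaseFactory

class QueueExportPipe(object):
    def process_item(self, item, spider):
        # Stand-in body: the real pipeline would hand items to the
        # database layer obtained from databaseFactory.
        return item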

I added the following lines to my .bashrc:

PYTHONPATH=$PYTHONPATH:my/path/to/my/project
export PYTHONPATH
So, when I launch ipython, I can do the following:

In [1]: import database.databaseFactory as databaseFactory

In [2]: databaseFactory
Out[2]: <module 'database.databaseFactory' from '/my/path/to/my/project/database/databaseFactory.pyc'>
But when I launch scrapy, I get to enjoy the following message:

Traceback (most recent call last):
  File "/usr/local/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/commands/crawl.py", line 60, in run
    self.crawler_process.start()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 92, in start
    if self.start_crawling():
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 124, in start_crawling
    return self._start_crawler() is not None
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 139, in _start_crawler
    crawler.configure()
  File "/usr/local/lib/python2.7/dist-packages/scrapy/crawler.py", line 47, in configure
    self.engine = ExecutionEngine(self, self._spider_closed)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/engine.py", line 65, in __init__
    self.scraper = Scraper(crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/core/scraper.py", line 66, in __init__
    self.itemproc = itemproc_cls.from_crawler(crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 50, in from_crawler
    return cls.from_settings(crawler.settings, crawler)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/middleware.py", line 29, in from_settings
    mwcls = load_object(clspath)
  File "/usr/local/lib/python2.7/dist-packages/scrapy/utils/misc.py", line 42, in load_object
    raise ImportError("Error loading object '%s': %s" % (path, e))
ImportError: Error loading object 'myScrapy.pipelines.QueueExportPipe': No module named database.databaseFactory

Why does scrapy ignore my PYTHONPATH? What can I do now? I really don't want to use sys.path.append() in my code.

You have to tell Python about your PYTHONPATH:

export PYTHONPATH=/path/to/myProject/
then run scrapy:

sudo scrapy crawl spiderName 2> error.log
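
You can first confirm that sudo is the culprit by comparing your environment with and without it. PYTHONPATH will typically be missing from the second listing, because sudo resets the environment:

env | grep PYTHONPATH
sudo env | grep PYTHONPATH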

By default, a command launched with sudo does not run in your normal environment, so your PYTHONPATH is forgotten. To make PYTHONPATH survive sudo, take these steps (a concrete sudoers line is sketched after the list):

  • Add PYTHONPATH to the Defaults env_keep += "ENV1 ENV2 ..." line in the sudoers file
  • Remove the Defaults env_reset line from the sudoers file, if it exists
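
Concretely, the first bullet means adding a line like the one below to the sudoers file (always edit it with visudo); listing only PYTHONPATH here is an assumption, keep whatever variables you already preserve:

Defaults env_keep += "PYTHONPATH"

Alternatively, if your sudo policy permits it, sudo -E preserves your environment for a single invocation: sudo -E scrapy crawl spiderName 2> error.log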
What's wrong with using sys.path.append()? I tried many other approaches and concluded that scrapy does not honor $PYTHONPATH for user-defined packages; I suspect the directories are loaded only after the framework has gone through its lookup phase. But I tried sys.path.append() and it works, as sketched below.
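
For reference, the workaround can be as small as the snippet below, placed at the top of myScrapy/settings.py. The path arithmetic is an assumption based on the layout shown earlier, where settings.py sits one directory below the project root:

# Top of settings.py - hypothetical sketch of the sys.path.append() workaround.
# It puts the project root (the directory containing database/) on sys.path
# before Scrapy resolves ITEM_PIPELINES.
import os
import sys
sys.path.append(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))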



I already do this in my .bashrc file. In case you were right and I had accidentally only done it in the console before launching scrapy, I tried it that way too. Of course it doesn't work, since it changes nothing.