
How to run Scrapy/Portia on an Azure Web App


I am trying to run Scrapy or Portia on a Microsoft Azure Web App. I installed Scrapy by creating a virtual environment:

D:\Python27\Scripts\virtualenv.exe D:\home\Python
and then installing Scrapy into it:

D:\home\Python\Scripts\pip install Scrapy
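As an aside, on Python 3 the equivalent of the virtualenv step can be done with the standard-library `venv` module alone, which avoids needing a separate `virtualenv.exe`. A minimal sketch (the paths here are placeholders, not the Azure `D:\home` layout from the question):

```python
# Sketch: creating a virtual environment with the Python 3 stdlib "venv" module.
# Paths are placeholders, not the Azure D:\home layout.
import os
import tempfile
import venv

target = os.path.join(tempfile.mkdtemp(), "env")
venv.create(target, with_pip=False)  # with_pip=True would also bootstrap pip

# The pyvenv.cfg file marks the directory as a virtual environment.
print(os.path.isfile(os.path.join(target, "pyvenv.cfg")))
```

With `with_pip=True`, the environment's own `Scripts\pip` (on Windows) could then install Scrapy just as in the commands above.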
The installation appears to have worked. But running a spider produces the following output:

D:\home\Python\Scripts\tutorial>d:\home\python\scripts\scrapy.exe crawl example
2015-09-13 23:09:31 [scrapy] INFO: Scrapy 1.0.3 started (bot: tutorial)
2015-09-13 23:09:31 [scrapy] INFO: Optional features available: ssl, http11
2015-09-13 23:09:31 [scrapy] INFO: Overridden settings: {'NEWSPIDER_MODULE': 'tutorial.spiders', 'SPIDER_MODULES': ['tutorial.spiders'], 'BOT_NAME': 'tutorial'}
2015-09-13 23:09:34 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, LogStats, CoreStats, SpiderState
Unhandled error in Deferred:
2015-09-13 23:09:35 [twisted] CRITICAL: Unhandled error in Deferred:
Traceback (most recent call last):
  File "D:\home\Python\lib\site-packages\scrapy\cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "D:\home\Python\lib\site-packages\scrapy\commands\crawl.py", line 57, in run
    self.crawler_process.crawl(spname, **opts.spargs)
  File "D:\home\Python\lib\site-packages\scrapy\crawler.py", line 153, in crawl
    d = crawler.crawl(*args, **kwargs)
  File "D:\home\Python\lib\site-packages\twisted\internet\defer.py", line 1274, in unwindGenerator
    return _inlineCallbacks(None, gen, Deferred())
--- <exception caught here> ---
  File "D:\home\Python\lib\site-packages\twisted\internet\defer.py", line 1128, in _inlineCallbacks
    result = g.send(result)
  File "D:\home\Python\lib\site-packages\scrapy\crawler.py", line 71, in crawl
    self.engine = self._create_engine()
  File "D:\home\Python\lib\site-packages\scrapy\crawler.py", line 83, in _create_engine
    return ExecutionEngine(self, lambda _: self.stop())
  File "D:\home\Python\lib\site-packages\scrapy\core\engine.py", line 66, in __init__
    self.downloader = downloader_cls(crawler)
  File "D:\home\Python\lib\site-packages\scrapy\core\downloader\__init__.py", line 65, in __init__
    self.handlers = DownloadHandlers(crawler)
  File "D:\home\Python\lib\site-packages\scrapy\core\downloader\handlers\__init__.py", line 23, in __init__
    cls = load_object(clspath)
  File "D:\home\Python\lib\site-packages\scrapy\utils\misc.py", line 44, in load_object
    mod = import_module(module)
  File "D:\Python27\Lib\importlib\__init__.py", line 37, in import_module
    __import__(name)
  File "D:\home\Python\lib\site-packages\scrapy\core\downloader\handlers\s3.py", line 6, in <module>
    from .http import HTTPDownloadHandler
  File "D:\home\Python\lib\site-packages\scrapy\core\downloader\handlers\http.py", line 5, in <module>
    from .http11 import HTTP11DownloadHandler as HTTPDownloadHandler
  File "D:\home\Python\lib\site-packages\scrapy\core\downloader\handlers\http11.py", line 15, in <module>
    from scrapy.xlib.tx import Agent, ProxyAgent, ResponseDone, \
  File "D:\home\Python\lib\site-packages\scrapy\xlib\tx\__init__.py", line 3, in <module>
    from twisted.web import client
  File "D:\home\Python\lib\site-packages\twisted\web\client.py", line 42, in <module>
    from twisted.internet.endpoints import TCP4ClientEndpoint, SSL4ClientEndpoint
  File "D:\home\Python\lib\site-packages\twisted\internet\endpoints.py", line 34, in <module>
    from twisted.internet.stdio import StandardIO, PipeAddress
  File "D:\home\Python\lib\site-packages\twisted\internet\stdio.py", line 30, in <module>
    from twisted.internet import _win32stdio
  File "D:\home\Python\lib\site-packages\twisted\internet\_win32stdio.py", line 7, in <module>
    import win32api
exceptions.ImportError: No module named win32api

2015-09-13 23:09:35 [twisted] CRITICAL:
The documentation says I have to install pywin32. I don't know how to download/install it from the command line, since I am inside the Web App environment.
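One quick way to confirm that the missing `win32api` extension really is the blocker (before Twisted hits it mid-crawl) is to probe for it without importing it. A minimal standard-library sketch:

```python
# Sketch: probe for the "win32api" extension module (part of pywin32)
# without importing it. Twisted's Windows stdio support requires it.
import importlib.util

def has_pywin32():
    return importlib.util.find_spec("win32api") is not None

print(has_pywin32())
```

If this prints `False`, the crawl will fail with the same `ImportError` shown above.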

Is it even possible to run Scrapy or Portia on an Azure Web App, or do I have to use a full-fledged virtual machine on Azure?


Thanks, everyone!

You cannot "run" a general-purpose Windows application on an Azure Web App. Anything that runs as a web app on Azure has to be built specifically for that. So yes, you would have to use a full-fledged virtual machine on Azure.

That said, Azure Web Apps can apparently run some Python applications if they are built on certain frameworks:

Note that you can also run your spiders from a hosted service (there is a free plan; disclaimer: I work there). You can then retrieve the data through the API or a direct dump.
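For the "direct dump" route, Scrapy can export scraped items as JSON lines (e.g. `scrapy crawl example -o items.jl`), and consuming such a file is straightforward. A sketch with invented sample data:

```python
# Sketch: parsing a Scrapy JSON-lines export (one JSON object per line).
# The sample data here is invented for illustration.
import io
import json

dump = io.StringIO('{"title": "Page A"}\n{"title": "Page B"}\n')
items = [json.loads(line) for line in dump if line.strip()]

print(len(items))         # → 2
print(items[0]["title"])  # → Page A
```

The same loop works line by line over a real `open("items.jl")` handle, which is why the JSON-lines format is convenient for large crawls.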