Python: Scrapy only crawls, it does not scrape

I am practicing on a website with Python and Scrapy, but it gives this error:

 DEBUG: Crawled (200) <GET http://careers.kfc.com.au/apply/?postcode=2000> (referer: None)
Here is the log file:

2017-04-30 14:15:02 [scrapy.utils.log] INFO: Scrapy 1.3.3 started (bot: scrapybot)
2017-04-30 14:15:02 [scrapy.utils.log] INFO: Overridden settings: {'SPIDER_LOADER_WARN_ONLY': True, 'LOG_FILE': 'kukur.txt'}
2017-04-30 14:15:02 [scrapy.middleware] INFO: Enabled extensions:
['scrapy.extensions.logstats.LogStats',
 'scrapy.extensions.telnet.TelnetConsole',
 'scrapy.extensions.corestats.CoreStats']
2017-04-30 14:15:02 [scrapy.middleware] INFO: Enabled downloader middlewares:
['scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
 'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
 'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
 'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
 'scrapy.downloadermiddlewares.retry.RetryMiddleware',
 'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
 'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
 'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
 'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
 'scrapy.downloadermiddlewares.stats.DownloaderStats']
2017-04-30 14:15:02 [scrapy.middleware] INFO: Enabled spider middlewares:
['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
 'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
 'scrapy.spidermiddlewares.referer.RefererMiddleware',
 'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
 'scrapy.spidermiddlewares.depth.DepthMiddleware']
2017-04-30 14:15:02 [scrapy.middleware] INFO: Enabled item pipelines:
[]
2017-04-30 14:15:02 [scrapy.core.engine] INFO: Spider opened
2017-04-30 14:15:02 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-04-30 14:15:02 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2017-04-30 14:15:04 [scrapy.core.engine] DEBUG: Crawled (200) <GET http://careers.kfc.com.au/apply/?postcode=2000> (referer: None)
2017-04-30 14:15:04 [scrapy.core.engine] INFO: Closing spider (finished)
2017-04-30 14:15:04 [scrapy.statscollectors] INFO: Dumping Scrapy stats:
{'downloader/request_bytes': 236,
 'downloader/request_count': 1,
 'downloader/request_method_count/GET': 1,
 'downloader/response_bytes': 6478,
 'downloader/response_count': 1,
 'downloader/response_status_count/200': 1,
 'finish_reason': 'finished',
 'finish_time': datetime.datetime(2017, 4, 30, 8, 45, 4, 704154),
 'log_count/DEBUG': 2,
 'log_count/INFO': 7,
 'response_received_count': 1,
 'scheduler/dequeued': 1,
 'scheduler/dequeued/memory': 1,
 'scheduler/enqueued': 1,
 'scheduler/enqueued/memory': 1,
 'start_time': datetime.datetime(2017, 4, 30, 8, 45, 2, 192149)}
2017-04-30 14:15:04 [scrapy.core.engine] INFO: Spider closed (finished)
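
The stats dump in this log can be read as "one page fetched, nothing scraped": the response came back with status 200, but the dump has no item_scraped_count key, which Scrapy's CoreStats extension only records once a spider yields an item. The following is a minimal sketch of that reading, not part of the original post; the helper name summarize is hypothetical, and the dictionary is a subset of the values shown in the log.

```python
# Subset of the stats dictionary copied from the log above.
stats = {
    "downloader/request_count": 1,
    "downloader/response_count": 1,
    "downloader/response_status_count/200": 1,
    "finish_reason": "finished",
    "log_count/DEBUG": 2,
    "log_count/INFO": 7,
    "response_received_count": 1,
}


def summarize(stats):
    """Return (pages_fetched, items_scraped) from a Scrapy stats dict.

    Hypothetical helper for illustration only.
    """
    pages = stats.get("response_received_count", 0)
    # The key is absent until at least one item has been scraped.
    items = stats.get("item_scraped_count", 0)
    return pages, items


pages, items = summarize(stats)
print(f"pages fetched: {pages}, items scraped: {items}")
if pages and not items:
    print("Pages were downloaded but no items were yielded; "
          "the selectors used in parse() are a likely place to look.")
```

In other words, the DEBUG line is not an error by itself. It only shows that the crawl succeeded while the parsing step produced no items.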

There is a problem with the declarations of SET_SELECTOR, jobdes and joblink.

Here is the correct way to initialize them:

SET_SELECTOR = 'div.jobs-in-your-area'
jobdes = 'div.job-description div#description p ::text'
joblink = 'div.job-description div.apply-now a.button ::attr(href)'
Below is your spider run in scrapy shell, along with a sample output:

>>> # SET_SELECTOR modified
>>> SET_SELECTOR = 'div.jobs-in-your-area'
>>> 
>>> for attr in response.css(SET_SELECTOR):
...     suberbname = 'a.accordion-title.location-title ::text'
...     
...     for nextattr in attr.css('ul.accordion li.accordion-item'):
...         jobdestitle = 'a.accordion-title.job-title ::text'
...         # Jobdes and joblink modified
...         jobdes = 'div.job-description div#description p ::text'
...         joblink = 'div.job-description div.apply-now a.button ::attr(href)'
...         
...         print('SUBERB_NAME: ',attr.css(suberbname).extract_first())
...         print('JOBTITLE: ', nextattr.css(jobdestitle).extract_first())
...         print('JOB_DESCRIP: ', nextattr.css(jobdes).extract())
...         print('JOB_DESCRIP_LINK: ', nextattr.css(joblink).extract_first())
... 
SUBERB_NAME:  Artarmon
JOBTITLE:  Customer Service Team Member
JOB_DESCRIP:  ['Company Information', 'KFC', " is the world's most popular chicken restaurant chain,\xa0specializing in our famous Original Recipe® fried chicken. It all started with one cook who created a finger lickin' good recipe more than ", '75', ' years ago, a list of secret herbs and spices scratched out on the back of the door to his kitchen. That cook was\xa0', 'Colonel Harland Sanders', ", of course, and today we still follow his formula for success, with real cooks breading and freshly preparing our delicious chicken by hand. Our aim is to put a smile on people's faces around the world and give every customer a special experience on each occasion. Our vision is that our jobs will be the best in the world for those committed to serving great food and looking after customers better than anyone else.", 'The Role', 'Customer Service Team Members are responsible for ensuring the provision of fresh, quality products, friendly and efficient service and maintaining clean and well-presented facilities for our valued customers!', 'Requirements/ key selection criteria', 'Experience', 'No experience necessary as full Training will be provided to all employees. Retail Traineeships are also available for employees who meet the required criteria.', 'Benefits:', "Working with KFC will give you financial independence, you'll receive recognition for your efforts and gain skills to set you on your career path. KFC is a place where good things happen as soon as you walk through the door.", 'Company Information', 'KFC', " is the world's most popular chicken restaurant chain,\xa0specializing in our famous Original Recipe® fried chicken. It all started with one cook who created a finger lickin' good recipe more than ", '75', ' years ago, a list of secret herbs and spices scratched out on the back of the door to his kitchen. That cook was\xa0', 'Colonel Harland Sanders', ", of course, and today we still follow his formula for success, with real cooks breading and freshly preparing our delicious chicken by hand. Our aim is to put a smile on people's faces around the world and give every customer a special experience on each occasion. Our vision is that our jobs will be the best in the world for those committed to serving great food and looking after customers better than anyone else.", 'The Role', 'Food Service Team Members consistently prepare high quality food products that create irresistible tastes for our customers whilst maintaining clean and well-presented facilities.', 'Requirements/ key selection criteria', 'Experience', 'No experience necessary as full training will be provided to all employees. Retail Traineeships are also available for employees who meet the required criteria.', 'Benefits:', "Working with KFC will give you financial independence, you'll receive recognition for your efforts and gain skills to set you on your career path. KFC is a place where good things happen as soon as you walk through the door."]
JOB_DESCRIP_LINK:  http://applynow.net.au/jobs/KFC553-customer-service-team-member
Note: testing selectors in scrapy shell while scraping makes debugging much faster.

