org.apache.solr.common.SolrException: Bad Request Bad Request: http://localhost:8080/solr/update?wt=javabin&version=2
Please help me, guys. I am trying to crawl a site using Nutch, but it gives me the error "java.io.IOException: Job failed!"

I am running this command: "bin/nutch solrindex http://:8080/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*". I am using Nutch 1.5.1 and Solr 3.6.1 with JDK java-7-openjdk-i386 on Ubuntu 12.04.

The hadoop.log file in the NUTCH/log folder shows the following:
2012-09-13 12:56:10,524 INFO solr.SolrIndexer - SolrIndexer: starting at 2012-09-13 12:56:10
2012-09-13 12:56:10,604 INFO indexer.IndexerMapReduce - IndexerMapReduce: crawldb: crawl/crawldb
2012-09-13 12:56:10,604 INFO indexer.IndexerMapReduce - IndexerMapReduce: linkdb: crawl/linkdb
2012-09-13 12:56:10,604 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20120910160403
2012-09-13 12:56:10,711 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20120910160448
2012-09-13 12:56:10,715 INFO indexer.IndexerMapReduce - IndexerMapReduces: adding segment: crawl/segments/20120910160631
2012-09-13 12:56:10,760 WARN util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2012-09-13 12:56:11,212 INFO plugin.PluginRepository - Plugins: looking in: /home/zapbuild/Nutch/plugins
2012-09-13 12:56:11,310 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2012-09-13 12:56:11,310 INFO plugin.PluginRepository - Registered Plugins:
2012-09-13 12:56:11,310 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2012-09-13 12:56:11,310 INFO plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
2012-09-13 12:56:11,310 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2012-09-13 12:56:11,310 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2012-09-13 12:56:11,310 INFO plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
2012-09-13 12:56:11,310 INFO plugin.PluginRepository - Tika Parser Plug-in (parse-tika)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Anchor Indexing Filter (index-anchor)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Registered Extension-Points:
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2012-09-13 12:56:11,311 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2012-09-13 12:56:11,313 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:11,314 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:11,314 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:14,104 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:14,104 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:14,104 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:17,135 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:17,136 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:17,136 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:20,204 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:20,205 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:20,205 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:23,297 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:23,297 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:23,297 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:26,232 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:26,232 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:26,233 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:29,252 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:29,252 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:29,252 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:32,284 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:32,284 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:32,284 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:35,258 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:35,258 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:35,258 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:38,283 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:38,284 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:38,284 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:41,278 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:41,278 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:41,278 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:44,334 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:44,334 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:44,334 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:47,338 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:47,338 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:47,338 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:50,360 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:50,360 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:50,360 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:53,309 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
2012-09-13 12:56:53,310 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
2012-09-13 12:56:53,310 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: content dest: content
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: title dest: title
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: host dest: host
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: segment dest: segment
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: boost dest: boost
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: digest dest: digest
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: url dest: id
2012-09-13 12:56:53,357 INFO solr.SolrMappingReader - source: url dest: url
2012-09-13 12:56:53,409 INFO solr.SolrWriter - Indexing 18 documents
2012-09-13 12:56:53,604 WARN mapred.LocalJobRunner - job_local_0001
org.apache.solr.common.SolrException: Missing solr core name in path
Missing solr core name in path
request: http://<host name>:8983/solr/update?wt=javabin&version=2
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.nutch.indexer.solr.SolrWriter.close(SolrWriter.java:142)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:48)
at org.apache.hadoop.mapred.ReduceTask$OldTrackingRecordWriter.close(ReduceTask.java:466)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:530)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:420)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:260)
2012-09-13 12:56:53,981 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
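
In a log this long, most lines are routine INFO output, and the entries that matter are the WARN and ERROR lines plus the exception text that follows them. The following is a minimal sketch for filtering those out, assuming the log4j-style line layout seen in the excerpt above. The function name `problem_entries` is illustrative and not part of Nutch or Hadoop.

```python
import re

# Start of a log entry as it appears in the excerpt, for example:
# "2012-09-13 12:56:53,604 WARN mapred.LocalJobRunner - job_local_0001"
ENTRY = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) (\w+)\s+(\S+) - (.*)$"
)


def problem_entries(log_text, levels=("WARN", "ERROR")):
    """Return a list of (timestamp, level, logger, message, extra_lines).

    extra_lines holds the non-blank lines that follow an entry without a
    timestamp of their own, such as an exception message and stack trace.
    """
    entries = []
    current = None
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match:
            # A new timestamped entry ends any previous continuation block.
            current = None
            if match.group(2) in levels:
                current = (match.group(1), match.group(2),
                           match.group(3), match.group(4), [])
                entries.append(current)
        elif current is not None and line.strip():
            # Untimestamped line: attach it to the preceding WARN/ERROR entry.
            current[4].append(line.strip())
    return entries
```

Applied to the excerpt above, this would return the NativeCodeLoader warning, the LocalJobRunner warning with the SolrException message and stack trace attached, and the final "Job failed!" error.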