Nutch 2.3.1 on Cassandra fails to start (Hadoop, Cassandra, Nutch)


I'm trying to run Nutch 2.3.1 with Cassandra and followed the referenced setup steps. Finally, when I try to start Nutch with the command:

bin/crawl urls/ test http://localhost:8983/solr/ 2
I get the following exception:

GeneratorJob: starting
GeneratorJob: filtering: false
GeneratorJob: normalizing: false
GeneratorJob: topN: 50000
GeneratorJob: java.lang.RuntimeException: job failed: name=[test]generate: 1454483370-31180, jobid=job_local1380148534_0001
    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:120)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:227)
    at org.apache.nutch.crawl.GeneratorJob.generate(GeneratorJob.java:256)
    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:322)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.nutch.crawl.GeneratorJob.main(GeneratorJob.java:330)

Error running:
  /home/user/apache-nutch-2.3.1/runtime/local/bin/nutch generate -D mapred.reduce.tasks=2 -D mapred.child.java.opts=-Xmx1000m -D mapred.reduce.tasks.speculative.execution=false -D mapred.map.tasks.speculative.execution=false -D mapred.compress.map.output=true -topN 50000 -noNorm -noFilter -adddays 0 -crawlId webmd -batchId 1454483370-31180
Failed with exit value 255.
When I check logs/hadoop.log, I see the following error messages:

2016-02-03 15:18:14,741 ERROR connection.HConnectionManager - Could not start connection pool for host localhost(127.0.0.1):9160
...
2016-02-03 15:18:15,185 ERROR store.CassandraStore - All host pools marked down. Retry burden pushed out to client.
me.prettyprint.hector.api.exceptions.HectorException: All host pools marked down. Retry burden pushed out to client.
    at me.prettyprint.cassandra.connection.HConnectionManager.getClientFromLBPolicy(HConnectionManager.java:390)
But my Cassandra server is up:

runtime/local$ netstat -l |grep 9160
tcp        0      0 172.16.230.130:9160     *:*                     LISTEN 

Can anyone help with this issue? Thanks.

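Cassandra is listening on 172.16.230.130, not on localhost: the netstat output shows port 9160 bound to 172.16.230.130, while the log shows Nutch trying to connect to localhost(127.0.0.1):9160. That is why Nutch cannot connect to the Cassandra store.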
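
To confirm the address mismatch before changing any configuration, a small TCP probe can help. The following is only a sketch: the helper name can_connect is hypothetical, and the two addresses and port 9160 are taken from the log and netstat output in the question.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or no route.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 127.0.0.1 is what Nutch tried (per hadoop.log);
    # 172.16.230.130 is where netstat shows the listener.
    for host in ("127.0.0.1", "172.16.230.130"):
        status = "reachable" if can_connect(host, 9160) else "NOT reachable"
        print(f"{host}:9160 is {status}")
```

If only 172.16.230.130 is reachable, one likely fix is to point Nutch at that address, for example through the gora.cassandrastore.servers entry in conf/gora.properties (assuming the setup uses that property), or alternatively to make Cassandra listen on localhost as well.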

Hope this helps.

勒库克多