NullPointerException when running Nutch on Hadoop with Cassandra

I am running Nutch on a Hadoop cluster, and the crawled data is stored in a Cassandra cluster. When I run the Nutch job, I get the following error:

java.lang.NullPointerException
    at org.apache.avro.util.Utf8.<init>(Utf8.java:38)
    at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
    at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
    at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
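
The top two frames show the NullPointerException being thrown inside the Utf8 constructor, called from GeneratorReducer.setup. One plausible reading is that setup wraps a job configuration value that was never set, so the constructor receives null. The sketch below only illustrates that failure mode under this assumption: `Utf8Like`, `batchIdBytes`, and the key `example.batch.id` are hypothetical stand-ins, not Nutch or Avro code.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for this sketch; not Nutch or Avro code.
class NullConfigSketch {

    // Mimics a string wrapper that encodes its argument immediately,
    // so a null argument fails inside the constructor.
    static final class Utf8Like {
        final byte[] bytes;

        Utf8Like(String s) {
            // Throws NullPointerException when s is null.
            this.bytes = s.getBytes(StandardCharsets.UTF_8);
        }
    }

    // Mimics a setup step that reads a job property and wraps it.
    static byte[] batchIdBytes(Map<String, String> conf) {
        // Map.get returns null when the key was never set.
        String value = conf.get("example.batch.id");
        return new Utf8Like(value).bytes;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        try {
            batchIdBytes(conf);
        } catch (NullPointerException e) {
            System.out.println("NPE: property was never set");
        }
        // Once the property is present, the same call succeeds.
        conf.put("example.batch.id", "b1");
        System.out.println(batchIdBytes(conf).length);
    }
}
```

If this reading is right, the error would come from a job step running without the property that an earlier step or entry point normally sets, rather than from Cassandra or Hadoop themselves.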

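The exception you are seeing is related to security permissions. Check the permissions on the folders involved, for example.

I just found out that I was running "Crawl", which is deprecated. Running "Crawler" fixed it!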
$HADOOP_HOME/bin/hadoop jar /nutch/apache-nutch-2.2.1.job org.apache.nutch.crawl.Crawl urls -dir crawl -depth 3 -topN 5