
Hadoop Mahout: error when running trainnb


Using the Mahout seq2sparse command, I successfully created the following folders in HDFS:

df-count
dictionary.file-0
frequency.file-0
tf-vectors
tfidf-vectors
tokenized-documents
wordcount
After that, I ran the trainnb command with the following syntax:

mahout trainnb -i tweet-vectors -el -li labelindex -o model -ow -c
I get the following error. Does anyone know a solution?

Exception in thread "main" java.lang.IllegalStateException: hdfs://machineinfo:8020/user/hhhh/tweetvectors/df-count
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:115)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:106)
        at com.google.common.collect.Iterators$8.transform(Iterators.java:860)
        at com.google.common.collect.TransformedIterator.next(TransformedIterator.java:48)
        at com.google.common.collect.Iterators$5.hasNext(Iterators.java:597)
        at com.google.common.collect.ForwardingIterator.hasNext(ForwardingIterator.java:43)
        at org.apache.mahout.classifier.naivebayes.BayesUtils.writeLabelIndex(BayesUtils.java:122)
        at org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.createLabelIndex(TrainNaiveBayesJob.java:180)
        at org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.run(TrainNaiveBayesJob.java:94)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.main(TrainNaiveBayesJob.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:194)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:616)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:160)
Caused by: java.io.FileNotFoundException: File does not exist: /user/hhhh/tweet-vectors/df-count
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:2006)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1975)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1967)
        at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:735)
        at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
        at org.apache.hadoop.io.SequenceFile$Reader.openFile(SequenceFile.java:1499)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1486)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1479)
        at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1474)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileIterator.<init>(SequenceFileIterator.java:63)
        at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirIterator$1.apply(SequenceFileDirIterator.java:110)
        ... 22 more
It looks like Mahout cannot see this file in HDFS:
/user/hhhh/tweet-vectors/df-count

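First, try
hadoop dfs -ls /user/hhhh/tweet-vectors/df-count
to verify that the file exists. If it does not exist, that is your problem. If it does exist, check whether it is a file or a directory. Mahout appears to be looking for a file, not a directory.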

If it exists and is a file, verify that Mahout is connecting to the same Hadoop namenode instance that stores the file.
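The checks above can be scripted. The following is a minimal sketch, not a definitive diagnostic: the function name check_hdfs_path is hypothetical, and it assumes that your Hadoop version's `hadoop fs -test` supports the `-e` (exists) and `-d` (is a directory) flags and reports the result through its exit status.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify an HDFS path as "missing", "directory", or "file".
# Assumes `hadoop fs -test -e` / `-d` exit with status 0 when the test is true.
check_hdfs_path() {
  if ! hadoop fs -test -e "$1"; then
    echo "missing"        # the path does not exist on this namenode
  elif hadoop fs -test -d "$1"; then
    echo "directory"      # exists, but is a directory rather than a file
  else
    echo "file"           # exists and is not a directory
  fi
}

# Example calls (paths taken from the stack trace in the question):
#   check_hdfs_path /user/hhhh/tweet-vectors/df-count
#   check_hdfs_path hdfs://machineinfo:8020/user/hhhh/tweet-vectors/df-count
```

Passing a fully qualified hdfs:// URI, as in the second example call, runs the check against that specific namenode instead of the client's configured default. If the two calls give different results, the client is probably pointed at a different namenode than the one holding the data.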