Java Cascading tutorial word count example error

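I am currently learning Cascading. I am working through the second tutorial on the official website, the word count example. I copied the code from it and tried to run it, but it always gives me the following error: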

Exception in thread "main" cascading.flow.planner.PlannerException: could not build flow from assembly: [[token][com.starscriber.cascadingtest.Main.main(Main.java:44)] 
unable to resolve argument selector: [{1}:'text'], with incoming: [{1}:'doc01        A rain shadow is a dry area on the lee back side of a mountainous area.']] at cascading.flow.planner.FlowPlanner.handleExceptionDuringPlanning(FlowPlanner.java:576)
at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:263)
at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:80)
at cascading.flow.FlowConnector.connect(FlowConnector.java:459)
at com.starscriber.cascadingtest.Main.main(Main.java:58)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

Caused by: cascading.pipe.OperatorException: [token][com.starscriber.cascadingtest.Main.main(Main.java:44)] 
unable to resolve argument selector: [{1}:'text'], with incoming: [{1}:'doc01        A rain shadow is a dry area on the lee back side of a mountainous area.']
at cascading.pipe.Operator.resolveArgumentSelector(Operator.java:345)
at cascading.pipe.Each.outgoingScopeFor(Each.java:368)
at cascading.flow.planner.ElementGraph.resolveFields(ElementGraph.java:628)
at cascading.flow.planner.ElementGraph.resolveFields(ElementGraph.java:610)
at cascading.flow.hadoop.planner.HadoopPlanner.buildFlow(HadoopPlanner.java:248)
... 8 more

Caused by: cascading.tuple.FieldsResolverException: 
could not select fields: [{1}:'text'], from: [{1}:'doc01        A rain shadow is a dry area on the lee back side of a mountainous area.']
at cascading.tuple.Fields.indexOf(Fields.java:1008)
at cascading.tuple.Fields.select(Fields.java:1064)
at cascading.pipe.Operator.resolveArgumentSelector(Operator.java:341)
... 12 more
How can this happen? I copied exactly the same code as the official GitHub repository, without any changes:

String docPath = args[0];
String wcPath = args[1];

Properties properties = new Properties();          
AppProps.setApplicationJarClass(properties, Main.class);
HadoopFlowConnector flowConnector = new HadoopFlowConnector(properties);

// create source and sink taps
Tap docTap = new Hfs(new TextDelimited(true, "\t"), docPath);
Tap wcTap = new Hfs(new TextDelimited(true, "\t"), wcPath);

// specify a regex operation to split the "document" text lines into a token stream
Fields token = new Fields("token");
Fields text = new Fields("text");
RegexSplitGenerator splitter = new RegexSplitGenerator(token, "[ \\[\\]\\(\\),.]");
// only returns "token"
Pipe docPipe = new Each("token", text, splitter, Fields.RESULTS);

// determine the word counts
Pipe wcPipe = new Pipe("wc", docPipe);
wcPipe = new GroupBy(wcPipe, token);
wcPipe = new Every(wcPipe, Fields.ALL, new Count(), Fields.ALL);

// connect the taps, pipes, etc., into a flow
FlowDef flowDef = FlowDef.flowDef()
            .setName("wc")
            .addSource(docPipe, docTap)
            .addTailSink(wcPipe, wcTap);

// write a DOT file and run the flow
Flow wcFlow = flowConnector.connect(flowDef);
wcFlow.writeDOT("dot/wc.dot");
wcFlow.complete();
Where is the problem?

Here is the input file:

doc01        A rain shadow is a dry area on the lee back side of a mountainous area.
doc02        This sinking, dry air produces a rain shadow, or area in the lee of a mountain with less rain and cloudcover.
doc03        A rain shadow is an area of dry land that lies on the leeward (or downwind) side of a mountain.
doc04        This is known as the rain shadow effect and is the primary cause of leeward deserts of mountain ranges, such as California's Death Valley.
doc05        Two Women. Secrets. A Broken Land. [DVD Australia]

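Check whether there is a tab between the two fields, doc_id and text, in your input file. The program expects two tab-separated fields, but in your case it reads the whole line into a single field. That is why the error message shows the entire first line as the only incoming field.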
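One way to confirm this before running the flow is to count the tab-separated fields on each line. The following is only a minimal sketch in plain Java; `TabFieldCheck` and its methods are hypothetical helpers written for this illustration and are not part of Cascading.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Cascading): reports lines that do not
// split into the expected number of tab-separated fields.
class TabFieldCheck {

    // Number of fields a line yields when split on a tab character.
    // The -1 limit keeps trailing empty fields instead of dropping them.
    static int countFields(String line) {
        return line.split("\t", -1).length;
    }

    // Returns the 1-based numbers of the lines whose field count differs
    // from the expected count.
    static List<Integer> badLines(List<String> lines, int expected) {
        List<Integer> bad = new ArrayList<Integer>();
        for (int i = 0; i < lines.size(); i++) {
            if (countFields(lines.get(i)) != expected) {
                bad.add(i + 1);
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to the input file (the same path passed as docPath).
        List<String> lines = Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8);
        System.out.println("Lines without exactly 2 tab-separated fields: " + badLines(lines, 2));
    }
}
```

For the input file shown in the question, where the document id and the text are separated by spaces, every line would be reported, because each line splits into a single field.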

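As others have already mentioned, your input file needs the same header line that the example expects. Instead of copying the code, try cloning the repository; that way you will not run into errors caused by the file format.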
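If cloning the repository is not convenient, an existing space-separated file could probably be repaired instead. The sketch below is plain Java and assumes that the document id contains no whitespace and that the header names are `doc_id` and `text`; `InputFileRepair` is a hypothetical helper, not something shipped with the tutorial.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the tutorial): turns lines such as
// "doc01        A rain shadow ..." into "doc01<TAB>A rain shadow ..."
// and makes sure the header line comes first.
class InputFileRepair {

    // Assumed header of the tutorial's input file: two names separated by a tab.
    static final String HEADER = "doc_id\ttext";

    // Replaces only the first run of whitespace with a single tab,
    // so spaces inside the text field are left untouched.
    static String toTabDelimited(String line) {
        return line.replaceFirst("\\s+", "\t");
    }

    // Converts every line and prepends the header if it is missing.
    static List<String> repair(List<String> lines) {
        List<String> out = new ArrayList<String>();
        for (String line : lines) {
            out.add(toTabDelimited(line));
        }
        if (out.isEmpty() || !out.get(0).equals(HEADER)) {
            out.add(0, HEADER);
        }
        return out;
    }
}
```

Running `repair` on an already correct file leaves it unchanged, since the existing header is detected and a single tab is replaced by a single tab.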

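Please check your input file. Mine has a header in the first line, with the two field names doc_id and text separated by a tab. I believe TextDelimited requires this header because its first argument is true.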
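To illustrate why the missing header leads to "could not select fields", here is a simplified model in plain Java. It is not Cascading's implementation; it only assumes that, when the header flag is true, the first line split on the delimiter supplies the field names. `HeaderFieldNames` and its methods are hypothetical names used for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Simplified model (not Cascading's code): the first line of the file,
// split on the delimiter, is treated as the list of field names.
class HeaderFieldNames {

    // Splits the first line on the literal delimiter to obtain field names.
    static List<String> fieldNames(String firstLine, String delimiter) {
        return Arrays.asList(firstLine.split(Pattern.quote(delimiter), -1));
    }

    // True if a field with the wanted name exists among the header names.
    static boolean canSelect(String wanted, String firstLine) {
        return fieldNames(firstLine, "\t").contains(wanted);
    }

    public static void main(String[] args) {
        String tutorialHeader = "doc_id\ttext";
        String askerFirstLine =
            "doc01        A rain shadow is a dry area on the lee back side of a mountainous area.";

        System.out.println(fieldNames(tutorialHeader, "\t"));   // prints [doc_id, text]
        System.out.println(canSelect("text", tutorialHeader));  // prints true
        System.out.println(canSelect("text", askerFirstLine));  // prints false
    }
}
```

Under this model, the file from the question yields a single field whose name is the whole first line, which matches the `incoming: [{1}:'doc01        A rain shadow ...']` part of the exception, so a selector for `text` has nothing to match.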