Error in map-side join in Hadoop Hive
I found that to run only the map phase in a join (that is, with reduce=0), the following property has to be set. When I set it, I get the error shown below. If I set the property to false, the reduce phase runs and the join succeeds.
hive> set hive.auto.convert.join=true;
hive> set hive.mapjoin.smalltable.filesize=(default it will be 25MB);
Query returned non-zero code: 1, cause: 'SET hive.mapjoin.smalltable.filesize=(default it will be 25MB)' FAILED because hive.mapjoin.smalltable.filesize expects LONG type value.
hive> SELECT /*+ MAPJOIN(expense) */ c.ID, c.NAME, o.AMOUNT, o.DATE FROM emp c CROSS JOIN expense o ON (c.ID = o.emp_ID);
Query ID = acadgild_20161226234949_6ede202c-7f91-42ac-a0c9-3b2617fad0ae
Total jobs = 1
java.io.IOException: Cannot run program "/home/acadgild/hadoop-2.6.0/bin/hadoop" (in directory "/home/acadgild"): error=2, No such file or directory
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at java.lang.Runtime.exec(Runtime.java:620)
at java.lang.Runtime.exec(Runtime.java:450)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.executeInChildVM(MapredLocalTask.java:289)
at org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask.execute(MapredLocalTask.java:137)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:160)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1604)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1364)
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1177)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1004)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:994)
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:247)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:199)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:410)
at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:783)
at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:677)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:616)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.io.IOException: error=2, No such file or directory
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:248)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 23 more
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.mr.MapredLocalTask
hive> set hive.auto.convert.join=false;
hive>
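A map-side join in Hive can be performed in two ways:
1. By specifying the hint "/*+ MAPJOIN(b) */" in the join statement.
2. By setting the following property to true: hive.auto.convert.join=true
I think you are trying to combine both approaches. Please check this blog for more information.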
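The two approaches can be sketched separately as follows. This assumes the `emp` and `expense` tables from the question, with `expense` as the small table. The size threshold has to be a plain number of bytes (25000000 is the documented default), which is why the `SET` in the question was rejected with "expects LONG type value". Naming the alias `o` in the hint and using a plain `JOIN` instead of `CROSS JOIN` are my assumptions about what the query intended. On Hive 0.11 and later, `hive.ignore.mapjoin.hint` defaults to true, so the hint is ignored unless that property is set to false.

```sql
-- Approach 1: let Hive convert the join to a map join automatically.
set hive.auto.convert.join=true;
-- Small-table threshold in bytes; the value must be a LONG, not descriptive text.
set hive.mapjoin.smalltable.filesize=25000000;
SELECT c.ID, c.NAME, o.AMOUNT, o.DATE
FROM emp c JOIN expense o ON (c.ID = o.emp_ID);

-- Approach 2: request the map join with a hint instead.
set hive.auto.convert.join=false;
-- Without this, the MAPJOIN hint may be silently ignored.
set hive.ignore.mapjoin.hint=false;
SELECT /*+ MAPJOIN(o) */ c.ID, c.NAME, o.AMOUNT, o.DATE
FROM emp c JOIN expense o ON (c.ID = o.emp_ID);
```

Either way, Hive builds the hash table for the small table in a local task, so the `MapredLocalTask` failure in the question would most likely still have to be fixed first.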
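Hope this helps.

As the error shows, Hive cannot find the hadoop executable under /home/acadgild/ (it tries to run /home/acadgild/hadoop-2.6.0/bin/hadoop). What you can do is try running the following command, assuming your Hadoop installation is in /usr/local/hadoop-2.6.0: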
cp -r /usr/local/hadoop-2.6.0/ /home/acadgild/
Now start Hive and run the query again; it should work.
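Before copying anything, it may be worth confirming which path Hive is actually using. The following is a minimal sketch to run inside the Hive CLI. It assumes the launcher location comes from the `HADOOP_HOME` environment variable and is exposed through the `hadoop.bin.path` property; treat that property name as an assumption to verify on your version. The path in the last command is taken from the stack trace in the question.

```sql
-- Hadoop home that the Hive session inherited from the environment.
set env:HADOOP_HOME;
-- Launcher path Hive will try to execute (property name is an assumption).
set hadoop.bin.path;
-- Check whether the file named in the stack trace actually exists.
!ls -l /home/acadgild/hadoop-2.6.0/bin/hadoop;
```

If the reported path is wrong, pointing `HADOOP_HOME` at the real installation before starting Hive may be an alternative to copying the directory.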