Hadoop: How to fix intermittent FileNotFoundException errors in Hive when using the Tez engine


When I run queries in Hive using the Tez engine, I get intermittent FileNotFoundException errors:

ERROR : Status: Failed
ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1508808910527_45616_1_00, diagnostics=[Task failed, taskId=task_1508808910527_45616_1_00_000066, diagnostics=[TaskAttempt 0 failed, info=[Container container_e09_1508808910527_45616_01_000033 finished with diagnostics set to [Container failed, exitCode=-1000. File does not exist: hdfs://server02.corp.company.com:8020/tmp/hive/username/_tez_session_dir/b65ddde9-110e-47fc-ae1c-33a1f754f839/nzcodec.jar
java.io.FileNotFoundException: File does not exist: hdfs://server02.corp.company.com:8020/tmp/hive/username/_tez_session_dir/b65ddde9-110e-47fc-ae1c-33a1f754f839/nzcodec.jar
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
        at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
        at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
        at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
        at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:359)
        at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
The query selects data from a staging table, repartitions it, and writes it to a reporting table:

INSERT OVERWRITE TABLE ${reporting_table} PARTITION (day, app_name) select <all the fields> from ${staging_table} where day = '${day}'
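For context, since the target partition spec (day, app_name) takes app_name from the data, this statement relies on dynamic partitioning being enabled. A minimal sketch of the session-level settings it assumes (not part of the original question; the field list stays elided as in the post):

```sql
-- Dynamic-partition settings this INSERT depends on (a sketch, not from the
-- original post). `day` is supplied per run; `app_name` is resolved from the data.
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE ${reporting_table} PARTITION (day, app_name)
SELECT <all the fields>
FROM ${staging_table}
WHERE day = '${day}';
```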
I have run the same query multiple times against the same set of data, and the failure is intermittent.

My YARN settings look like this:

yarn.nodemanager.resource.memory-mb     83968
yarn.scheduler.minimum-allocation-mb    2048
My Tez settings in the query look like this:

SET hive.execution.engine=tez;
SET tez.am.resource.memory.mb=2048;
SET hive.tez.container.size=2048;

SET hive.merge.tezfiles=true;
SET hive.merge.smallfiles.avgsize=128000000;
SET hive.merge.size.per.task=128000000;
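For what it's worth, the stack trace shows a container failing to localize nzcodec.jar from the shared _tez_session_dir staging directory, which suggests the Tez session's scratch dir was cleaned up (session expiry, or a /tmp cleaner) while tasks still referenced it. One hedged workaround, assuming the jar was registered per-session with ADD JAR, is to register it permanently and give the session AM a longer idle timeout. These are standard Hive/Tez properties, but whether they fix this particular cluster is an assumption, and the jar path below is hypothetical:

```sql
-- Settings sometimes suggested for FileNotFoundException on _tez_session_dir
-- resources; a sketch, not a confirmed fix for this cluster.

-- Register the codec jar as a permanent auxiliary jar (normally configured in
-- hive-site.xml before the session starts) instead of relying on the
-- per-session HDFS staging copy. Path is hypothetical.
SET hive.aux.jars.path=/opt/hive/aux/nzcodec.jar;

-- Give the Tez AM longer before an idle session (and its session dir) is torn down.
SET tez.session.am.dag.submit.timeout.secs=600;
```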
I have gone through the usual suggestions carefully, but I still see the problem; adjusting the container size does not seem to help.

Is there another set of settings I can change to prevent this from happening?

Did you ever manage to resolve this?