Hadoop spark-submit: "Client cannot authenticate via:[TOKEN, KERBEROS]"

I set up a Hadoop cluster with Kerberos, but when I run spark-submit it throws this exception:

17/10/19 08:46:53 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, 192.168.92.4, executor 1): java.io.IOException: Failed on local exception: java.io.IOException: org.apache.hadoop.security.AccessControlException: Client cannot authenticate via:[TOKEN, KERBEROS]; Host Details : local host is: "slave2/192.168.92.4"; destination host is: "master.hadoop":9000; 
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
    at org.apache.hadoop.ipc.Client.call(Client.java:1479)
    at org.apache.hadoop.ipc.Client.call(Client.java:1412)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
    at com.sun.proxy.$Proxy15.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getBlockLocations(ClientNamenodeProtocolTranslatorPB.java:255)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy16.getBlockLocations(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1226)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1213)
    at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1201)
    at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:306)
    at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:272)
    at org.apache.hadoop.hdfs.DFSInputStream.<init>(DFSInputStream.java:264)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:1526)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:304)
    at org.apache.hadoop.hdfs.DistributedFileSystem$3.doCall(DistributedFileSystem.java:299)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:312)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:769)
    at org.apache.hadoop.mapred.LineRecordReader.<init>(LineRecordReader.java:109)
    at org.apache.hadoop.mapred.TextInputFormat.getRecordReader(TextInputFormat.java:67)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.liftedTree1$1(HadoopRDD.scala:246)
    at org.apache.spark.rdd.HadoopRDD$$anon$1.<init>(HadoopRDD.scala:245)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:203)
    at org.apache.spark.rdd.HadoopRDD.compute(HadoopRDD.scala:94)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
I looked at the Kerberos logs: while the Spark application runs, only the master sends authentication requests to the KDC. The slave nodes never send any authentication requests to the KDC.
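One quick sanity check is to compare the credentials visible on the master and on a worker such as slave2. kinit and klist are the standard MIT Kerberos tools; the keytab path below is an illustrative assumption, and $your_principal stands in for whatever principal the cluster actually uses:

    # list any cached Kerberos tickets for the current user on this host
    klist
    # if the cache is empty or expired, obtain a ticket from a keytab
    kinit -kt /etc/krb5.keytab $your_principal
    klist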

Maybe you can use:

    spark-submit --master yarn --deploy-mode client --keytab $keytab_file_path --principal $your_principal

This command seems to work only in yarn-client mode.
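As a rough sketch, a complete client-mode submission might look like the following; the keytab path, principal, application class, and jar are illustrative assumptions, and the HDFS URI reuses the destination host from the error message above:

    spark-submit \
      --master yarn \
      --deploy-mode client \
      --keytab /etc/security/keytabs/spark.keytab \
      --principal spark/master.hadoop@EXAMPLE.COM \
      --class com.example.WordCount \
      wordcount.jar hdfs://master.hadoop:9000/user/spark/input.txt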

How did you set up Kerberos: via kinit, or with a keytab file passed to spark-submit? Which Spark execution mode are you trying: local, standalone, yarn-client, yarn-cluster? Which version of Spark, and from which distribution? Which version of Hadoop, and from which distribution?

@ThiagoBaldim @SamsonScharfrichter Thank you very much. The exception occurred when I used "client" as the deploy-mode parameter. I have solved the problem by changing the spark-submit arguments as follows:

    --master yarn --deploy-mode cluster --keytab /etc/krb5.keytab --principal root/bigdataserver03@EXAMPLE.COM
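This fix is consistent with the KDC log observation above: with --keytab and --principal, Spark logs in from the keytab, obtains HDFS delegation tokens, and hands those tokens to the executors, so worker nodes such as slave2 never have to authenticate against the KDC themselves. A complete cluster-mode invocation would look roughly like this (the application class and jar are illustrative placeholders):

    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --keytab /etc/krb5.keytab \
      --principal root/bigdataserver03@EXAMPLE.COM \
      --class com.example.YourApp \
      your-app.jar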