Apache Spark: How to show log4j logger output on the console when running in cluster mode?

Tags: apache-spark, apache-spark-sql

I have a properties file that I am passing with --files to spark-submit in YARN cluster mode:

[myserviceuser@XXX.XXX.XXX.XXX]$ cat testprop.prop
name:aiman
country:india
I intend to read the property values from this file and display them on screen using the log4j logger.
I am submitting the job with --files as follows:

spark-submit \
--class org.main.ReadLocalFile \
--master yarn \
--deploy-mode cluster \
--files testprop.prop#testprop.prop \
spark_cluster_file_read-0.0.1.jar
The job runs to completion and shows a success message, but I cannot see the output on the console.
When running in client mode I am able to read the testprop.prop file and display the output, but in cluster mode I cannot. My guess is that logging to the console does not work in cluster mode. How, then, should I log to the console?
Here is the code I am using:

package org.main;

import java.io.InputStream;
import java.util.Properties;

import org.apache.log4j.ConsoleAppender;
import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.apache.spark.sql.SparkSession;
import org.xml.sax.InputSource;
import scala.xml.Source;

public class ReadLocalFile {

    public static void main(String args[]) throws Exception {
        final Logger log = LogManager.getLogger(ReadLocalFile.class);
        ConsoleAppender logConsole = new ConsoleAppender();
        log.addAppender(logConsole);

        SparkSession spark = SparkSession.builder()
                .master("yarn")
                .config("spark.submit.deployMode", "cluster")
                .getOrCreate();

        Properties prop = new Properties();
        InputStream in = null;
        try {
            // scala.xml.Source.fromFile returns an org.xml.sax.InputSource
            InputSource propFile = Source.fromFile("testprop.prop");
            in = propFile.getByteStream();
            prop.load(in);
        }
        catch (Exception e) {
            e.printStackTrace();
            log.error("============= Exception thrown =============");
            System.exit(1);
        }

        log.info("============= Value: " + prop.getProperty("name"));
        spark.close();
    }
}
The logs are as follows:

SPARK_MAJOR_VERSION is set to 2, using Spark2
19/07/25 07:59:50 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
19/07/25 07:59:51 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
19/07/25 07:59:51 INFO O: Set a new configuration for the first time.
19/07/25 07:59:51 INFO d: Method not implemented in this version of Hadoop: org.apache.hadoop.fs.FileSystem$Statistics.getBytesReadLocalHost
19/07/25 07:59:51 INFO deprecation: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
19/07/25 07:59:51 INFO u: Scheduling statistics report every 2000 millisecs
19/07/25 07:59:52 INFO RequestHedgingRMFailoverProxyProvider: Looking for the active RM in [rm1, rm2]...
19/07/25 07:59:52 INFO RequestHedgingRMFailoverProxyProvider: Found active RM [rm2]
19/07/25 07:59:52 INFO Client: Requesting a new application from cluster with 24 NodeManagers
19/07/25 07:59:52 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (102400 MB per container)
19/07/25 07:59:52 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
19/07/25 07:59:52 INFO Client: Setting up container launch context for our AM
19/07/25 07:59:52 INFO Client: Setting up the launch environment for our AM container
19/07/25 07:59:52 INFO Client: Preparing resources for our AM container
19/07/25 07:59:52 INFO HadoopFSCredentialProvider: getting token for: hdfs://meldstg/user/myserviceuser
19/07/25 07:59:52 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 7451415 for myserviceuser on ha-hdfs:meldstg
19/07/25 07:59:54 INFO metastore: Trying to connect to metastore with URI thrift://XXX.XXX.XXX:9083
19/07/25 07:59:54 INFO metastore: Connected to metastore.
19/07/25 07:59:55 INFO HiveCredentialProvider: Get Token from hive metastore: Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 1a 65 62 64 70 62 75 73 73 40 43 41 42 4c 45 2e 43 4f 4d 43 41 53 54 2e 43 4f 4d 04 68 69 76 65 00 8a 01 6c 28 24 c8 e0 8a 01 6c 4c 31 4c e0 8e 82 98 8e 03 08
19/07/25 07:59:55 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs://meldstg/hdp/apps/2.6.3.20-2/spark2/spark2-hdp-yarn-archive.tar.gz
19/07/25 07:59:55 INFO Client: Source and destination file systems are the same. Not copying hdfs://meldstg/hdp/apps/2.6.3.20-2/spark2/spark2-hdp-yarn-archive.tar.gz
19/07/25 07:59:55 INFO Client: Uploading resource file:/home/myserviceuser/aiman/spark_cluster_file_read-0.0.1-SNAPSHOT-jar-with-dependencies.jar -> hdfs://meldstg/user/myserviceuser/.sparkStaging/application_1563540853319_78111/spark_cluster_file_read-0.0.1-SNAPSHOT-jar-with-dependencies.jar
19/07/25 07:59:56 INFO Client: Uploading resource file:/home/myserviceuser/aiman/testprop.prop#testprop.prop -> hdfs://meldstg/user/myserviceuser/.sparkStaging/application_1563540853319_78111/testprop.prop
19/07/25 07:59:56 INFO Client: Uploading resource file:/tmp/spark-bcf53d4d-1bac-47f4-87d6-2e35c0e8b501/__spark_conf__7386751978371777143.zip -> hdfs://meldstg/user/myserviceuser/.sparkStaging/application_1563540853319_78111/__spark_conf__.zip
19/07/25 07:59:56 INFO SecurityManager: Changing view acls to: myserviceuser
19/07/25 07:59:56 INFO SecurityManager: Changing modify acls to: myserviceuser
19/07/25 07:59:56 INFO SecurityManager: Changing view acls groups to:
19/07/25 07:59:56 INFO SecurityManager: Changing modify acls groups to:
19/07/25 07:59:56 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(myserviceuser); groups with view permissions: Set(); users  with modify permissions: Set(myserviceuser); groups with modify permissions: Set()
19/07/25 07:59:56 INFO Client: Submitting application application_1563540853319_78111 to ResourceManager
19/07/25 07:59:56 INFO YarnClientImpl: Submitted application application_1563540853319_78111
19/07/25 07:59:57 INFO Client: Application report for application_1563540853319_78111 (state: ACCEPTED)
19/07/25 07:59:57 INFO Client:
         client token: Token { kind: YARN_CLIENT_TOKEN, service:  }
         diagnostics: AM container is launched, waiting for AM container to Register with RM
         ApplicationMaster host: N/A
         ApplicationMaster RPC port: -1
         queue: orion
         start time: 1564041596720
         final status: UNDEFINED
         tracking URL: http://XXXX.XXXX.XXX/proxy/application_1563540853319_78111/
         user: myserviceuser
19/07/25 07:59:58 INFO Client: Application report for application_1563540853319_78111 (state: ACCEPTED)
19/07/25 07:59:59 INFO Client: Application report for application_1563540853319_78111 (state: ACCEPTED)
19/07/25 08:00:00 INFO Client: Application report for application_1563540853319_78111 (state: ACCEPTED)
19/07/25 08:00:01 INFO Client: Application report for application_1563540853319_78111 (state: ACCEPTED)
19/07/25 08:00:02 INFO Client: Application report for application_1563540853319_78111 (state: ACCEPTED)
19/07/25 08:00:03 INFO Client: Application report for application_1563540853319_78111 (state: ACCEPTED)
19/07/25 08:00:04 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:04 INFO Client:
         client token: Token { kind: YARN_CLIENT_TOKEN, service:  }
         diagnostics: N/A
         ApplicationMaster host: XXX.XXX.XXX.XXX
         ApplicationMaster RPC port: 0
         queue: orion
         start time: 1564041596720
         final status: UNDEFINED
         tracking URL: http://XXXX.XXXX.XXX/proxy/application_1563540853319_78111/
         user: myserviceuser
19/07/25 08:00:05 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:06 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:07 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:08 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:09 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:10 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:11 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:12 INFO Client: Application report for application_1563540853319_78111 (state: RUNNING)
19/07/25 08:00:13 INFO Client: Application report for application_1563540853319_78111 (state: FINISHED)
19/07/25 08:00:13 INFO Client:
         client token: Token { kind: YARN_CLIENT_TOKEN, service:  }
         diagnostics: N/A
         ApplicationMaster host: XXX.XXX.XXX.XXX
         ApplicationMaster RPC port: 0
         queue: orion
         start time: 1564041596720
         final status: SUCCEEDED
         tracking URL: http://XXXX.XXXX.XXX/proxy/application_1563540853319_78111/
         user: myserviceuser
19/07/25 08:00:14 INFO ShutdownHookManager: Shutdown hook called
19/07/25 08:00:14 INFO ShutdownHookManager: Deleting directory /tmp/spark-bcf53d4d-1bac-47f4-87d6-2e35c0e8b501

Where am I going wrong?

You cannot print to the console in cluster mode, because the driver may not be running on the same node from which the application was launched. You have to check the logs in the YARN / ResourceManager history.
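
Assuming log aggregation is enabled on the cluster, the driver container's stdout, including the log4j output, can be pulled back to the submitting machine with the YARN CLI, using the application ID printed in the log above:

yarn logs -applicationId application_1563540853319_78111 > app_logs.txt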

Have you tried reading the file with SparkFiles.get("testprop.prop")? It gives you the path to the file.

I can read the file using Source.fromFile(); the problem is that I cannot display its contents on the console. Let me revise the title of this post.

If you are running in cluster mode, go to the Spark History Server and check the stdout there; it will be printed along with any other println statements. I have yet to see a Spark application print logs to the console in cluster mode.

Okkzzz... so if I need to get the execution logs (along with the log.info() output from my code) back to the client machine that launched the job, what should I do?

Start from the YARN CLI docs; you can download the logs from there.
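
For illustration, a minimal sketch of the SparkFiles.get() approach suggested in the comments; the class name ReadWithSparkFiles is hypothetical, and the file name is the one shipped with --files in the question:

import java.io.FileInputStream;
import java.io.InputStream;
import java.util.Properties;

import org.apache.log4j.LogManager;
import org.apache.log4j.Logger;
import org.apache.spark.SparkFiles;
import org.apache.spark.sql.SparkSession;

public class ReadWithSparkFiles {
    public static void main(String[] args) throws Exception {
        final Logger log = LogManager.getLogger(ReadWithSparkFiles.class);
        SparkSession spark = SparkSession.builder().getOrCreate();

        // SparkFiles.get() resolves the local path of a file distributed
        // with --files on whichever node this code happens to run.
        Properties prop = new Properties();
        try (InputStream in = new FileInputStream(SparkFiles.get("testprop.prop"))) {
            prop.load(in);
        }

        // In cluster mode this lands in the driver container's stdout,
        // not on the submitting console; retrieve it with `yarn logs`.
        log.info("Value: " + prop.getProperty("name"));
        spark.close();
    }
}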