Apache Spark cannot find file; works fine in local mode but fails in cluster mode


Hi all, I have a simple Spark application with a few Spring context and rule XML files. All of these files are part of the project and sit under the resources folder (resource\db\rule\rule2.xml), and the application works fine in Spark local mode. When I run the same application in YARN cluster mode, it complains that the file rule2.xml is not found, even though it is part of the Maven-built jar. Do the files need to be specified in a different format for cluster mode? Are any changes required to make the application work in cluster mode? Any help would be greatly appreciated.

Here is the code where I read the XML file:

 JaxbUtils.unmarshalRule(
            ByteStreams.toByteArray(
            Resources.getResource(String.format("db/rule/rule%d.xml", id)).openStream()));

Here is the error log

/24 15:57:07 INFO storage.BlockManager: Registering executor with local external shuffle service.
15/09/24 15:57:07 INFO util.AkkaUtils: Connecting to HeartbeatReceiver: akka.tcp://sparkDriver@bdaolc011node08.sabre.com:40589/user/HeartbeatReceiver
15/09/24 15:57:09 INFO executor.CoarseGrainedExecutorBackend: Got assigned task 0
15/09/24 15:57:09 INFO executor.Executor: Running task 0.0 in stage 0.0 (TID 0)
15/09/24 15:57:09 INFO broadcast.TorrentBroadcast: Started reading broadcast variable 0
15/09/24 15:57:09 INFO storage.MemoryStore: ensureFreeSpace(3132) called with curMem=0, maxMem=555755765
15/09/24 15:57:09 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.1 KB, free 530.0 MB)
15/09/24 15:57:09 INFO storage.BlockManagerMaster: Updated info of block broadcast_0_piece0
15/09/24 15:57:09 INFO broadcast.TorrentBroadcast: Reading broadcast variable 0 took 134 ms
15/09/24 15:57:09 INFO storage.MemoryStore: ensureFreeSpace(6144) called with curMem=3132, maxMem=555755765
15/09/24 15:57:09 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 6.0 KB, free 530.0 MB)
15/09/24 15:57:12 INFO support.ClassPathXmlApplicationContext: Refreshing org.springframework.context.support.ClassPathXmlApplicationContext@3c6db742: startup date [Thu Sep 24 15:57:12 CDT 2015]; root of context hierarchy
15/09/24 15:57:12 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [spring/rules-engine-spring.xml]
15/09/24 15:57:13 INFO xml.XmlBeanDefinitionReader: Loading XML bean definitions from class path resource [spring/ere-spring.xml]
15/09/24 15:57:13 INFO support.DefaultListableBeanFactory: Overriding bean definition for bean 'nativeRuleBuilder': replacing [Generic bean: class [com.sabre.sp.ere.core.loader.DroolsNativeRuleBuilder]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/ere-spring.xml]] with [Generic bean: class [com.sabre.sp.ere.core.loader.DroolsNativeRuleBuilder]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/rules-engine-spring.xml]]
15/09/24 15:57:13 INFO support.DefaultListableBeanFactory: Overriding bean definition for bean 'rulesExecutor': replacing [Generic bean: class [com.sabre.sp.ere.core.executor.DroolsRulesExecutor]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/ere-spring.xml]] with [Generic bean: class [com.sabre.sp.ere.core.executor.DroolsRulesExecutor]; scope=; abstract=false; lazyInit=false; autowireMode=0; dependencyCheck=0; autowireCandidate=true; primary=false; factoryBeanName=null; factoryMethodName=null; initMethodName=null; destroyMethodName=null; defined in class path resource [spring/rules-engine-spring.xml]]
15/09/24 15:57:13 INFO support.PropertySourcesPlaceholderConfigurer: Loading properties file from class path resource [spring/ere-test.properties]
15/09/24 15:57:13 WARN support.PropertySourcesPlaceholderConfigurer: Could not load properties from class path resource [spring/ere-test.properties]: class path resource [spring/ere-test.properties] cannot be opened because it does not exist
15/09/24 15:57:13 INFO support.PropertySourcesPlaceholderConfigurer: Loading properties file from class path resource [spring/ere-spring.properties]
15/09/24 15:57:13 INFO annotation.AutowiredAnnotationBeanPostProcessor: JSR-330 'javax.inject.Inject' annotation found and supported for autowiring
15/09/24 15:57:13 INFO jdbc.JDBCRDD: closed connection
java.lang.IllegalArgumentException: resource spring/rule2.xml not found.
at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
at com.google.common.io.Resources.getResource(Resources.java:152)
at com.sabre.rules.AppRuleExecutor.rule(AppRuleExecutor.java:50)
at com.sabre.rules.AppRuleExecutor.executeRules(AppRuleExecutor.java:39)
at com.sabre.rules.RuleComponent.executeRules(RuleComponent.java:43)
at com.sabre.rules.SMAAlertImpl$1.call(SMAAlertImpl.java:60)
at com.sabre.rules.SMAAlertImpl$1.call(SMAAlertImpl.java:37)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:143)
at org.apache.spark.api.java.JavaRDDLike$$anonfun$fn$4$1.apply(JavaRDDLike.scala:143)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.RDD$$anonfun$14.apply(RDD.scala:634)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:277)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:244)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:61)
at org.apache.spark.scheduler.Task.run(Task.scala:64)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:203)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Assuming your code is inside main() like this:

JaxbUtils.unmarshalRule(ByteStreams.toByteArray(Resources.getResource(String.format(args[0], id)).openStream()));

then you need to invoke it as follows:

spark-submit --master yarn --deploy-mode cluster --files db/rule/rule2.xml mySample.jar rule2.xml


What does this do? It tells YARN to deploy the code/jar in cluster mode and to keep the files listed in `--files` available in the container. In cluster mode the container can be created on any node in the cluster, but once a file is specified in `--files` you only need to refer to it by name, not by its fully qualified path.
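To make that concrete, here is a minimal sketch (class and method names are made up for illustration, not from the question) of reading a file shipped with `--files` by its bare name from the container's working directory:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RuleFileLoader {

    // A file shipped with `spark-submit --files db/rule/rule2.xml` is
    // localized by YARN into each container's working directory, so the
    // executor can open it by bare name instead of the original path.
    public static byte[] readShippedFile(String fileName) throws IOException {
        Path localized = Paths.get(fileName); // e.g. "rule2.xml"
        return Files.readAllBytes(localized);
    }

    public static void main(String[] args) throws IOException {
        String name = args.length > 0 ? args[0] : "rule2.xml";
        byte[] xml = readShippedFile(name);
        System.out.println("read " + xml.length + " bytes from " + name);
    }
}
```

The bytes returned here would then be handed to the unmarshalling call from the question in place of the `Resources.getResource(...)` stream.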


The input file must be accessible on all worker nodes. That means you either have to copy the file to the worker nodes (see, for example, the `--files` option of `spark-submit`) or use a distributed file system. When running on YARN, a local file must be specified as a `file://` URI, e.g. `file:///tmp/foo.txt`.

Tried executing the following code and got the same file-not-found error: `JaxbUtils.unmarshalRule(ByteStreams.toByteArray(Resources.getResource(String.format("file:///db/rule/rule%d.xml", id)).openStream()));` Do I need to keep the same file in HDFS? Is there any update on this issue? I am stuck on it and would appreciate any help.
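Since rule2.xml is already bundled inside the application jar, another option worth trying is to load it through the context classloader rather than a `file://` path. This is only a sketch under that assumption (the helper class name is hypothetical), not the asker's confirmed fix:

```java
import java.io.IOException;
import java.io.InputStream;

public class ClasspathRuleLoader {

    // Reads a resource packaged inside the application jar (e.g.
    // "db/rule/rule2.xml") via the context classloader. This resolves
    // the same way in local and cluster mode, as long as the jar itself
    // is on the executor classpath.
    public static byte[] readClasspathResource(String resourcePath) throws IOException {
        try (InputStream in = Thread.currentThread()
                .getContextClassLoader()
                .getResourceAsStream(resourcePath)) {
            if (in == null) {
                throw new IOException("resource " + resourcePath + " not found on classpath");
            }
            return in.readAllBytes();
        }
    }
}
```

Note the resource path must match the layout inside the jar (here `db/rule/rule2.xml`, with no leading slash); the `spring/rule2.xml` path in the stack trace suggests the lookup path and the packaged path may have diverged.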