
Python Spark: Dropping SparkListenerEvent because no remaining room in event queue


I am running my Spark jobs with PySpark, and when I try to execute the following Python script I get the errors below.

from pyspark.sql.functions import count, sum, stddev_pop, mean

comp_df = sqlContext.sql('SELECT * FROM default.cdr_data')
# Weekly aggregates per subscriber number and call type
df1 = comp_df.groupBy('number', 'type', 'week').agg(sum('callduration').alias('call_sum'), count('callduration').alias('call_count'), sum('iscompethot').alias('call_count_comp'))
# Roll up across weeks; the competitor count must be referenced by the alias 'call_count_comp' defined above
df2 = df1.groupBy('number', 'type').agg((stddev_pop('call_sum') / mean('call_sum')).alias('coefficiant_of_variance'), sum('call_count').alias('call_count'), sum('call_count_comp').alias('call_count_comp'))
Errors and warnings:

ERROR scheduler.LiveListenerBus: Dropping SparkListenerEvent because no remaining room in event queue. This likely means one of the SparkListeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler.
WARN scheduler.LiveListenerBus: Dropped 1 SparkListenerEvents since Thu Jan 01 05:30:00 IST 1970
This was followed by a series of further errors:

ERROR cluster.YarnScheduler: Lost executor 149 on slv1.cdh-prod.com: Slave lost
WARN scheduler.TaskSetManager: Lost task 12.0 in stage 1.0 (TID 88088, slv1.cdh-prod.com, executor 149): ExecutorLostFailure (executor 149 exited caused by one of the running tasks) Reason: Slave lost
ERROR client.TransportClient: Failed to send RPC 5181111296686128218 to slv2.cdh-prod.com/192.168.x.xx:57156: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(83360,88076,Map(slv3.cdh-prod.com -> 88076, slv1.cdh-prod.com -> 88076, slv2.cdh-prod.com -> 88076),Set()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 5181111296686128218 to slv2.cdh-prod.com/192.168.x.xx:57156: java.nio.channels.ClosedChannelException
        at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:239)
        at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:226)
        at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
        at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:567)
        at io.netty.util.concurrent.DefaultPromise.tryFailure(DefaultPromise.java:424)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetFailure(AbstractChannel.java:801)
        at io.netty.channel.AbstractChannel$AbstractUnsafe.write(AbstractChannel.java:699)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.write(DefaultChannelPipeline.java:1122)
        at io.netty.channel.AbstractChannelHandlerContext.invokeWrite(AbstractChannelHandlerContext.java:633)
        at io.netty.channel.AbstractChannelHandlerContext.access$1900(AbstractChannelHandlerContext.java:32)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.write(AbstractChannelHandlerContext.java:908)
        at io.netty.channel.AbstractChannelHandlerContext$WriteAndFlushTask.write(AbstractChannelHandlerContext.java:960)
        at io.netty.channel.AbstractChannelHandlerContext$AbstractWriteTask.run(AbstractChannelHandlerContext.java:893)
        at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
        at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.nio.channels.ClosedChannelException
I have tried a few solutions: I increased the value of spark.scheduler.listenerbus.eventqueue.size, but it had no effect. I even tried with a small dataset, and the error still appears.
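For reference, a minimal sketch of how that property can be set when the context is built (the application name and the value 100000 are purely illustrative, and in Spark 2.3+ the property was renamed to spark.scheduler.listenerbus.eventqueue.capacity):

from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Illustrative values only: raise the listener-bus event queue size (default 10000)
conf = (SparkConf()
        .setAppName('cdr_aggregation')  # hypothetical app name
        .set('spark.scheduler.listenerbus.eventqueue.size', '100000'))
sc = SparkContext(conf=conf)
sqlContext = SQLContext(sc)

The same setting can equally be passed on the command line with spark-submit --conf spark.scheduler.listenerbus.eventqueue.size=100000.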

What is your Spark version? Did you notice the year in "since Thu Jan 01 05:30:00 IST 1970"?

Possible duplicate: please also check that question and upvote it! This is an actual duplicate of a known Spark bug; you need a newer release, Spark 2.3.0.

Besides increasing the spark.scheduler.listenerbus.eventqueue.size value, you can also try reducing the number of partitions to avoid a huge shuffle: use coalesce to bring it down to a smaller value, e.g. val new_comp_df = comp_df.coalesce(10). This effectively reduces the massive shuffle, and you should be able to avoid this warning.
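Following up on that coalesce suggestion, a minimal PySpark sketch (the partition count 10 is illustrative and should be tuned to the data size and cluster; comp_df is the DataFrame from the question):

from pyspark.sql.functions import count, sum, stddev_pop, mean

# Coalesce to fewer partitions before the wide aggregation so that far fewer
# tasks (and hence far fewer scheduler events) are generated.
new_comp_df = comp_df.coalesce(10)
df1 = new_comp_df.groupBy('number', 'type', 'week').agg(
    sum('callduration').alias('call_sum'),
    count('callduration').alias('call_count'),
    sum('iscompethot').alias('call_count_comp'))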