Warning: file_get_contents(/data/phpspider/zhask/data//catemap/2/image-processing/2.json): failed to open stream: No such file or directory in /data/phpspider/zhask/libs/function.php on line 167

Warning: Invalid argument supplied for foreach() in /data/phpspider/zhask/libs/tag.function.php on line 1116

Notice: Undefined index: in /data/phpspider/zhask/libs/function.php on line 180

Warning: array_chunk() expects parameter 1 to be array, null given in /data/phpspider/zhask/libs/function.php on line 181
Python py4j.protocol.Py4JError:调用o112.save时出错_Python_Apache Spark_Pyspark_Apache Spark Sql - Fatal编程技术网

Python py4j.protocol.Py4JError:调用o112.save时出错

Python py4j.protocol.Py4JError:调用o112.save时出错,python,apache-spark,pyspark,apache-spark-sql,Python,Apache Spark,Pyspark,Apache Spark Sql,我正在大学服务器上运行pyspark作业提交: 我的配置是: --主纱线--部署模式群集--num executors 150--executor cores 4--executor memory 28g--driver memory 28g 我的前几个步骤运行正常: df = spark.read.format('csv') \ .option('header',True) \ .option('multiLine

我正在大学服务器上运行pyspark作业提交:

我的配置是:
--主纱线--部署模式群集--num executors 150--executor cores 4--executor memory 28g--driver memory 28g

我的前几个步骤运行正常:

df = spark.read.format('csv') \
                    .option('header',True) \
                    .option('multiLine', True) \
                    .load(data_file)
df.show()

udf_function = udf(stamp, StringType())
new_df = df.withColumn("column_a", udf_function(struct([df[x] for x in df.columns])))
new_df.show()   
当我尝试分别运行以下命令时,会出现两个非常相似的错误:

命令1:

new_df.select("column_a").distinct().show(100)
错误:

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/hadoop4/yarn/nm/usercache/apps/appcache/application_1593105789029_2249545/container_e01_1593105789029_2249545_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/hadoop4/yarn/nm/usercache/apps/appcache/application_1593105789029_2249545/container_e01_1593105789029_2249545_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/hadoop4/yarn/nm/usercache/apps/appcache/application_1593105789029_2249545/container_e01_1593105789029_2249545_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
  File "python_stamp.py", line 93, in <module>
    main()
  File "python_stamp.py", line 82, in main
    new_df.select("planning_cluster_id").distinct().show(100)
  File "/hadoop4/yarn/nm/usercache/apps/appcache/application_1593105789029_2249545/container_e01_1593105789029_2249545_02_000002/pyspark.zip/pyspark/sql/dataframe.py", line 380, in show
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
  File "python_stamp.py", line 91, in <module>
    main()
  File "python_stamp.py", line 83, in main
    new_df.write.mode("overwrite").format("csv").option("delimiter", ",").option("header", "true").save(save_path)
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/pyspark.zip/pyspark/sql/readwriter.py", line 738, in save
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/protocol.py", line 336, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o112.save
错误:

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/hadoop4/yarn/nm/usercache/apps/appcache/application_1593105789029_2249545/container_e01_1593105789029_2249545_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/hadoop4/yarn/nm/usercache/apps/appcache/application_1593105789029_2249545/container_e01_1593105789029_2249545_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/hadoop4/yarn/nm/usercache/apps/appcache/application_1593105789029_2249545/container_e01_1593105789029_2249545_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
  File "python_stamp.py", line 93, in <module>
    main()
  File "python_stamp.py", line 82, in main
    new_df.select("planning_cluster_id").distinct().show(100)
  File "/hadoop4/yarn/nm/usercache/apps/appcache/application_1593105789029_2249545/container_e01_1593105789029_2249545_02_000002/pyspark.zip/pyspark/sql/dataframe.py", line 380, in show
ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
  File "python_stamp.py", line 91, in <module>
    main()
  File "python_stamp.py", line 83, in main
    new_df.write.mode("overwrite").format("csv").option("delimiter", ",").option("header", "true").save(save_path)
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/pyspark.zip/pyspark/sql/readwriter.py", line 738, in save
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/hadoop1/yarn/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/protocol.py", line 336, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o112.save
错误:root:发送命令时异常。
回溯(最近一次呼叫最后一次):
文件“/hadoop1/thread/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py”,第1159行,在send_命令中
raise Py4JNetworkError(“来自Java端的答案为空”)
py4j.protocol.Py4JNetworkError:来自Java端的答案为空
在处理上述异常期间,发生了另一个异常:
回溯(最近一次呼叫最后一次):
文件“/hadoop1/thread/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py”,第985行,在send_命令中
响应=连接。发送命令(命令)
文件“/hadoop1/thread/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py”,第1164行,在send_命令中
“接收时出错”,e,接收时出现协议错误)
py4j.protocol.Py4JNetworkError:接收时出错
回溯(最近一次呼叫最后一次):
文件“python_stamp.py”,第91行,在
main()
文件“python_stamp.py”,第83行,在main中
新的_-df.write.mode(“覆盖”).format(“csv”).option(“分隔符”、“,”).option(“标题”、“真”).save(保存路径)
文件“/hadoop1/thread/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/pyspark.zip/pyspark/sql/readwriter.py”,第738行,保存
文件“/hadoop1/thread/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/java_gateway.py”,第1257行,在__
文件“/hadoop1/thread/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/pyspark.zip/pyspark/sql/utils.py”,第63行,装饰图
文件“/hadoop1/thread/nm/usercache/apps/appcache/application_1593105789029_2249417/container_e01_1593105789029_2249417_02_000002/py4j-0.10.7-src.zip/py4j/protocol.py”,第336行,在get_return_值中
py4j.protocol.Py4JError:调用o112.save时出错
有人知道背后的原因吗?我非常确信这不是因为内存错误,因为前面显示表、加载表的步骤都正常运行


其他信息:当我在pyspark shell上运行所有这些命令时,它们运行得非常好。

来自Java端的回答为空
表示JVM崩溃了。您有什么可能的补救措施吗?我的内存用完了吗?