Amazon Web Services: Spark error when saving to Parquet: IndexError: pop from an empty deque

Tags: amazon-web-services, apache-spark, pyspark, amazon-emr

When running a Spark cluster on YARN with the AWS EMR service, I get this error:

ERROR:root:Exception while sending command.
Traceback (most recent call last):
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1159, in send_command
    raise Py4JNetworkError("Answer from Java side is empty")
py4j.protocol.Py4JNetworkError: Answer from Java side is empty

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 985, in send_command
    response = connection.send_command(command)
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1164, in send_command
    "Error while receiving", e, proto.ERROR_ON_RECEIVE)
py4j.protocol.Py4JNetworkError: Error while receiving
Traceback (most recent call last):
  File "process_ecommerce.py", line 131, in <module>
    cfg["partitions"]["info"]
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/__pyfiles__/spark_utils.py", line 10, in save_dataframe
    .parquet(path)
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/pyspark.zip/pyspark/sql/readwriter.py", line 844, in parquet
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/pyspark.zip/pyspark/sql/utils.py", line 63, in deco
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/py4j-0.10.7-src.zip/py4j/protocol.py", line 336, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling o343.parquet
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:36063)
Traceback (most recent call last):
  File "/mnt/yarn/usercache/hadoop/appcache/application_1594292341949_0004/container_1594292341949_0004_01_000001/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 929, in _get_connection
    connection = self.deque.pop()
IndexError: pop from an empty deque
The cluster I am running has 1 master node and 20 worker nodes of type r5.2xlarge. Each of them has 8 CPUs and 64 GB of memory. The Spark configuration is:

  • 20 GB of executor memory
  • 30 GB of executor memory overhead
  • 8 cores per executor
  • 1 CPU per task
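
For reference, the settings listed above can be written out as Spark properties. This is a minimal sketch, not the asker's actual code: the property names are standard Spark configuration keys, but the mapping from the list to those keys and the helper `gib` are assumptions made for illustration. It also works out what each executor requests from YARN (heap plus overhead) and how much heap each concurrently running task gets.

```python
# Assumed mapping of the settings listed above to Spark property names.
# On Spark versions before 2.3 the overhead key was
# spark.yarn.executor.memoryOverhead instead.
spark_conf = {
    "spark.executor.memory": "20g",
    "spark.executor.memoryOverhead": "30g",
    "spark.executor.cores": "8",
    "spark.task.cpus": "1",
}


def gib(value):
    """Parse a Spark size string such as '20g' into a whole number of GiB."""
    return int(value.rstrip("g"))


heap = gib(spark_conf["spark.executor.memory"])
overhead = gib(spark_conf["spark.executor.memoryOverhead"])

# YARN sizes the executor container as heap + overhead.
container = heap + overhead

# Tasks that run at the same time inside one executor.
tasks = int(spark_conf["spark.executor.cores"]) // int(spark_conf["spark.task.cpus"])

# Heap that each of those tasks can count on, on average.
heap_per_task = heap / tasks

print(f"container request per executor: {container} GiB")
print(f"concurrent tasks per executor: {tasks}")
print(f"heap per concurrent task: {heap_per_task} GiB")
```

With these numbers each executor asks for 50 GiB on a 64 GB node, while the eight concurrent tasks share only the 20 GiB heap.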

How can I solve this error?

How do you trigger this error?

Nothing special: the process handles a sample fine, but fails with this error when it processes the whole dataset.

@Shadowtrooper: were you able to resolve this issue? I am hitting a similar error when writing large files to S3 with PySpark. Thanks in advance.

It was an executor problem. Try setting more memory for each executor.
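
One way to try the suggestion of giving each executor more memory is to pass larger values at submit time. The sketch below assumes the job is launched with `spark-submit` on YARN. The helper `submit_command`, the `36g` and `8g` values, and the 64 GiB node budget are illustrative assumptions, not settings confirmed anywhere in this thread. YARN usually offers less than a node's physical RAM, so use the limit from your own cluster's YARN configuration.

```python
import shlex


def submit_command(script, executor_memory, memory_overhead, node_memory_gib):
    """Build a spark-submit command line, rejecting settings that cannot fit on one node."""
    requested = int(executor_memory.rstrip("g")) + int(memory_overhead.rstrip("g"))
    if requested > node_memory_gib:
        raise ValueError(
            f"{requested} GiB per executor exceeds the {node_memory_gib} GiB node budget"
        )
    args = [
        "spark-submit",
        "--master", "yarn",
        "--conf", f"spark.executor.memory={executor_memory}",
        "--conf", f"spark.executor.memoryOverhead={memory_overhead}",
        script,
    ]
    # Quote each argument so the result can be pasted into a shell.
    return " ".join(shlex.quote(arg) for arg in args)


# Illustrative values only: more heap than the original 20g, less overhead than 30g.
command = submit_command("process_ecommerce.py", "36g", "8g", node_memory_gib=64)
print(command)
```

The check is there because raising the heap without lowering the overhead (for example 40g plus the original 30g) would request more than a single node can provide, and YARN would not be able to schedule the executor.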