PySpark and pandas read this column differently


I have a dataframe, shown below, that pandas reads correctly.

The code I'm using is very simple:

import pandas as pd
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Read the same partition directory with both engines
hede = spark.read.parquet(r"C:/users/batuhan.engin/desktop/date=2021-04-01")
he = pd.read_parquet(path=r"C:/users/batuhan.engin/desktop/date=2021-04-01", engine='pyarrow')
The problem is that PySpark reads the sales_revenue column incorrectly, and I don't understand why.

he[he.product_id == 2461]:
    sales_type sales_channel  sales_quantity  sales_revenue  tax_amount currency  product_id  store_id
27     Regular     Wholesale             6.0     818.500000         NaN     None        2461       300
110    Regular     Wholesale             2.0     272.829987         NaN     None        2461        42
132    Regular     Wholesale            18.0    2475.540039         NaN     None        2461       314
But when I read it with PySpark, the sales_revenue column is wrong. In fact, I can't even imagine how PySpark arrives at those values in the sales_revenue column:

hede.filter("product_id == 2461").show()
+----------+-------------+--------------+-------------+----------+--------+----------+--------+
|sales_type|sales_channel|sales_quantity|sales_revenue|tax_amount|currency|product_id|store_id|
+----------+-------------+--------------+-------------+----------+--------+----------+--------+
|   Regular|    Wholesale|           6.0|   -186969.84|      null|    null|      2461|     300|
|   Regular|    Wholesale|           2.0|       -444.8|      null|    null|      2461|      42|
|   Regular|    Wholesale|          18.0|      6553.92|      null|    null|      2461|     314|
+----------+-------------+--------------+-------------+----------+--------+----------+--------+

Any ideas? Could it be a version or package issue?
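One way to narrow this down is to compare the schema recorded in the parquet files themselves with the schema Spark infers. The snippet below is a minimal diagnostic sketch (it only reuses the path from above); if pyarrow reports sales_revenue as one type (say, a DECIMAL) while Spark decodes it as another, that mismatch would be one plausible explanation for the garbage values:

import pyarrow.parquet as pq
from pyspark.sql import SparkSession

path = r"C:/users/batuhan.engin/desktop/date=2021-04-01"

# Schema as recorded in the parquet footers
print(pq.ParquetDataset(path).schema)

# Schema as Spark interprets it
spark = SparkSession.builder.getOrCreate()
spark.read.parquet(path).printSchema()

If the two schemas disagree on sales_revenue, the type interpretation, not the data itself, is the likely culprit.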

Welcome to Stack Overflow. Could you at least add how you are loading the data into the pandas and PySpark dataframes? As far as I know, there is no obvious reason for this to happen when reading from the same source. Could you share the actual data file (just a sample)?
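For extracting a shareable sample, a minimal sketch building on the pandas read above (sample.parquet is a hypothetical output name):

import pandas as pd

# Re-read the partition and write a few of the problematic rows to a standalone file
he = pd.read_parquet(r"C:/users/batuhan.engin/desktop/date=2021-04-01", engine='pyarrow')
he[he.product_id == 2461].to_parquet("sample.parquet", engine='pyarrow')

Note that rewriting through pandas/pyarrow may re-encode the column and hide whatever is triggering the PySpark misread, so attaching one of the original part files would be the safer option.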