PySpark: writing a CSV while dropping the scale from decimal columns

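I want to write a DataFrame to a CSV file and drop the scale (the fractional digits) while writing. If the scale is 00, it should be removed; otherwise the value should be written together with its scale.

For example, suppose I have the following DataFrame: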

+--------------+--------+--------+
|id            |date    |amount  |
+--------------+--------+--------+
|12345678911.00|11/24/17|  203.02|
|12345678911.00|11/24/17|   13.52|
|12345678912.00|11/24/17|   28.00|
|12345678913.00|11/24/17|  105.71|
|12345678914.00|11/24/17|    7.68|
|12345678915.00|11/24/17|    0.18|
|12345678916.00|11/24/17|  530.38|
|12345678917.00|11/24/17|    1.79|
|12345678918.00|11/24/17|   35.00|
|12345678923.00|11/24/17|    8.84|
|12345678922.00|11/24/17|  150.47|
|12345678922.00|11/24/17|   39.00|
|12345678925.00|11/24/17|    6.46|
|12345678927.00|11/24/17|    2.59|
|12345678928.00|11/24/17|   25.00|
|12345678929.00|11/24/17|   44.04|
|12345678930.00|11/24/17|    3.90|
|12345678933.00|11/24/17|   50.00|
|12345678932.00|11/24/17|    6.26|
|12345678931.00|11/24/17|   10.90|
+--------------+--------+--------+
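When we write to the CSV file, the output should have .00 removed from every row, for all columns of decimal type. We read the values from a CSV by passing a schema, and several columns in that schema are defined as DecimalType.

The expected output would be: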

+--------------+--------+--------+
|id            |date    |amount  |
+--------------+--------+--------+
|12345678911   |11/24/17|  203.02|
|12345678911   |11/24/17|   13.52|
|12345678912   |11/24/17|      28|
|12345678913   |11/24/17|  105.71|
|12345678914   |11/24/17|    7.68|
|12345678915   |11/24/17|    0.18|
|12345678916   |11/24/17|  530.38|
|12345678917   |11/24/17|    1.79|
|12345678918   |11/24/17|      35|
|12345678923   |11/24/17|    8.84|
|12345678922   |11/24/17|  150.47|
|12345678922   |11/24/17|      39|
|12345678925   |11/24/17|    6.46|
|12345678927   |11/24/17|    2.59|
|12345678928   |11/24/17|      25|
|12345678929   |11/24/17|   44.04|
|12345678930   |11/24/17|    3.90|
|12345678933   |11/24/17|      50|
|12345678932   |11/24/17|    6.26|
|12345678931   |11/24/17|   10.90|
+--------------+--------+--------+ 
I tried casting the columns to StringType, but that did not help much.
Any suggestions are welcome.

Replacing the result with regexp_replace and writing the output as a string is the closest you can get:

from pyspark.sql.functions import col, regexp_replace

df = spark.createDataFrame(
    [("203.02", ), ("0.18", ), ("3", ), ("10.90", )],
    ("amount", )
).withColumn("amount", col("amount").cast("decimal(38, 2)"))

df.select(regexp_replace("amount", "\\.00$", "")).show()
#+-------------------------------+
#|regexp_replace(amount, \.00$, )|
#+-------------------------------+
#|                         203.02|
#|                           0.18|
#|                              3|
#|                          10.90|
#+-------------------------------+
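The same idea can be extended to every decimal column before writing. The following is a minimal sketch, not a definitive implementation: the names ZERO_SCALE, strip_zero_scale and strip_zero_scale_columns are hypothetical, the pattern is widened from \.00$ to \.0+$ so it covers any all-zero scale, and column names are assumed to contain no dots or other characters that need quoting. The pure-Python helper only mirrors the regex so its behaviour can be checked without a Spark session.

```python
import re

# Matches a fractional part made only of zeros at the end of the string,
# e.g. ".00" in "28.00". This is a widened form of the answer's "\.00$".
ZERO_SCALE = r"\.0+$"


def strip_zero_scale(text):
    """Pure-Python mirror of the regex, handy for checking the pattern."""
    return re.sub(ZERO_SCALE, "", text)


def strip_zero_scale_columns(df):
    """Return a DataFrame whose DecimalType columns are strings without '.00'.

    Non-decimal columns are passed through unchanged.
    """
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql.functions import col, regexp_replace
    from pyspark.sql.types import DecimalType

    decimal_cols = {
        f.name for f in df.schema.fields if isinstance(f.dataType, DecimalType)
    }
    return df.select([
        regexp_replace(col(c).cast("string"), ZERO_SCALE, "").alias(c)
        if c in decimal_cols
        else col(c)
        for c in df.columns
    ])
```

Under these assumptions, strip_zero_scale_columns(df).write.csv(path, header=True) would write the converted columns, where path is a placeholder for your own output location. Because the decimal columns become strings, a reader of the file has to parse them back into numbers.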

Is it also possible to strip the extra trailing zero, i.e. to save the value as 10.9 rather than 10.90?
"(\\.00$)(?
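One possible way to handle the follow-up question is sketched below. It is an assumption on my part rather than the approach the truncated reply had in mind: it strips trailing zeros in two passes (first the zeros after the decimal point, then a dangling dot). The names drop_trailing_zeros and drop_trailing_zeros_col are hypothetical. The Spark version uses $1 for the group reference, since regexp_replace follows Java regex replacement syntax, while the Python mirror uses \1. Note that this goes further than the expected output in the question, which keeps 3.90 and 10.90 as they are.

```python
import re


def drop_trailing_zeros(text):
    """Pure-Python mirror: '10.90' -> '10.9', '28.00' -> '28'."""
    # Pass 1: remove zeros at the end of the fractional part, keeping the
    # dot and any significant digits captured in group 1.
    text = re.sub(r"(\.\d*?)0+$", r"\1", text)
    # Pass 2: remove a dot left dangling when the whole fraction was zeros.
    return re.sub(r"\.$", "", text)


def drop_trailing_zeros_col(column_name):
    """Build a Spark Column applying the same two passes to a decimal column."""
    # Imported here so the helper above can be used without Spark installed.
    from pyspark.sql.functions import col, regexp_replace

    as_text = col(column_name).cast("string")
    no_zeros = regexp_replace(as_text, r"(\.\d*?)0+$", "$1")
    return regexp_replace(no_zeros, r"\.$", "").alias(column_name)
```

With the example DataFrame from the answer, df.select(drop_trailing_zeros_col("amount")) would be the corresponding call. Values without a decimal point, such as "100", are left untouched because both patterns require a dot.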