Python 3.x: Python UTC to ISO format not working - cast error
I receive a datetime in UTC format from Data Factory in Databricks. I am trying to convert it to a Databricks timestamp and insert it into the database.

Tags: python-3.x, pyspark, apache-spark-sql, azure-databricks
Format I am receiving: 2020-11-02T01:00:00Z
Convert into: 2020-11-02T01:00:00.000+0000 (ISO format)
I tried converting the string with isoformat() and then:
spark.sql("INSERT INTO test VALUES (1, 1, 'IMPORT','"+ slice_date_time.isoformat() +"','deltaload',0, '0')")
But when I try to insert it, I get the errors:
- Cannot safely cast 'start_time': string to timestamp
- Cannot safely cast 'end_time': string to timestamp
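The cast error usually comes from Spark's store-assignment policy (ANSI by default since Spark 3.0), which refuses to implicitly coerce a string into a timestamp column on INSERT. One workaround, sketched below under the table layout assumed from the question, is to let Spark parse the string explicitly with to_timestamp() inside the SQL statement itself:

```python
# Build the INSERT so Spark parses the UTC string into a timestamp itself,
# instead of relying on an implicit string-to-timestamp cast.
slice_ts = "2020-11-02T01:00:00Z"  # value as received from Data Factory

sql = (
    "INSERT INTO test VALUES (1, 1, 'IMPORT', "
    f"to_timestamp('{slice_ts}', \"yyyy-MM-dd'T'HH:mm:ssXXX\"), "
    "'deltaload', 0, '0')"
)
print(sql)
# spark.sql(sql)  # run this on the Databricks cluster
```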
I also tried creating a timestamp, but still got the same error.

Use the Spark DataFrame API:
# create dataframe
list_data = [(1, '2020-11-02T01:00:00Z'), (2, '2020-11-03T01:00:00Z'), (3, '2020-11-04T01:00:00Z')]
df = spark.createDataFrame(list_data, ['id', 'utc_time'])
# make sure to set your timezone in spark conf
from pyspark.sql.functions import to_timestamp, date_format
spark.conf.set('spark.sql.session.timeZone', 'UTC')
df.select('utc_time').withColumn(
    'iso_time',
    date_format(to_timestamp(df.utc_time, "yyyy-MM-dd'T'HH:mm:ssXXX"),
                "yyyy-MM-dd'T'HH:mm:ss.SSSZ")
).show(10, False)
+--------------------+----------------------------+
|utc_time |iso_time |
+--------------------+----------------------------+
|2020-11-02T01:00:00Z|2020-11-02T01:00:00.000+0000|
|2020-11-03T01:00:00Z|2020-11-03T01:00:00.000+0000|
|2020-11-04T01:00:00Z|2020-11-04T01:00:00.000+0000|
+--------------------+----------------------------+
Try storing iso_time directly into the database. If your database supports a different datetime format, try adjusting the pattern
yyyy-MM-dd'T'HH:mm:ss.SSSZ
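If you would rather do the conversion in plain Python before building the INSERT statement, the standard-library datetime module can produce the same layout (a sketch; requires Python 3.7+, where %z accepts the trailing Z):

```python
from datetime import datetime

raw = "2020-11-02T01:00:00Z"
dt = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S%z")  # %z parses 'Z' as UTC on 3.7+
# %f gives microseconds, so drop the last three digits to get milliseconds
iso_time = dt.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + dt.strftime("%z")
print(iso_time)  # 2020-11-02T01:00:00.000+0000
```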