
Pyspark cannot resolve a column name even though the column exists


I have some Pyspark code that was working against a sample CSV blob, and I then decided to point it at a larger dataset. This line:

df = df.withColumn("TransactionDate", df["TransactionDate"].cast(TimestampType()))

now throws this error:

AnalysisException: u'Cannot resolve column name "TransactionDate" among ("TransactionDate","Country ...

TransactionDate clearly exists as a column in the dataset, so why has this suddenly stopped working?

Ah, OK, I figured it out. If you hit this problem, check your delimiter. In my new dataset it was "," whereas in my small sample it was "|".

df = spark.read.format(file_type).options(header='true', quote='"', delimiter=",", ignoreLeadingWhiteSpace='true', inferSchema='true').load(file_location)
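
When the delimiter is wrong, the reader treats the entire header row as a single column name, which is why the error lists TransactionDate inside one long quoted name. A quick way to spot this is to look at the column names the reader produced. The sketch below is only an illustration: delimiters_inside_names is a hypothetical helper, not part of Pyspark, and the two-column header is a shortened stand-in for the real one.

```python
# Hypothetical helper (not part of Pyspark): reports delimiter characters
# that are hiding inside column names, a sign the file was split wrongly.
CANDIDATE_DELIMITERS = [",", "|", ";", "\t"]


def delimiters_inside_names(columns, candidates=CANDIDATE_DELIMITERS):
    """Return the candidate delimiters that appear inside any column name."""
    return [d for d in candidates if any(d in name for name in columns)]


# Assumed example: header of a comma-separated file read with delimiter="|".
# The whole header row comes back as one column name.
misparsed = ['"TransactionDate","Country"']

# The same header read with delimiter=",".
parsed = ["TransactionDate", "Country"]

print(delimiters_inside_names(misparsed))  # [',']
print(delimiters_inside_names(parsed))     # []
```

With a real DataFrame, passing df.columns to the helper right after the load gives the same check. A non-empty result suggests the delimiter option does not match the file.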