Python: strange ValueError: invalid literal for int() with base 10: '"634”'

Tags: python, apache-spark, pyspark

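An "invalid literal" error usually occurs when a string cannot be parsed as an integer. But this one is strange. I got it while using pyspark. The code is shown below.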

%pyspark
from pyspark import SQLContext
from pyspark.sql.types import StructField
from pyspark.sql.types import StructType
from pyspark.sql.types import *
from datetime import datetime
lines = sc.textFile("s3://citibike-nyc/nyc-bike-data/FinalMerge.csv")

def mk_int(s):
    s = s.strip()
    return int(s) if s else 0

# Each line is converted to a tuple.
header = lines.first()
data = lines.filter(lambda l: l != header)
data = data.map(lambda l: l.split(","))
trip_data = data.map(lambda p: (
    mk_int(p[0]),
    datetime.strptime(p[1], '%Y-%m-%d %H:%M'),
    datetime.strptime(p[2], '%Y-%m-%d %H:%M'),
    mk_int(p[3]), p[4].strip(), float(p[5]), float(p[6]),
    mk_int(p[7]), p[8].strip(), float(p[9]), float(p[10]),
    int(p[11]), p[12].strip(), mk_int(p[13]), mk_int(p[14]),
))
trip_data.take(1)

For this I get ValueError: invalid literal for int() with base 10: '"634”'
What is wrong here?

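It looks like your string contains quotation marks, one straight and one curly.

Oh, you are right. Thank you very much. So I should use replace to get rid of them, I believe?

That should work.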
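
A minimal sketch of the fix suggested in the comments, assuming the fields in the CSV are wrapped in straight or curly quotation marks. The `QUOTE_CHARS` constant is a name introduced here for illustration, not something from the original code. It uses `str.strip` with a character set rather than `replace`, so only quotes at the ends of the field are removed:

```python
# Hypothetical helper constant: straight and curly quote characters
# that may surround a field in the CSV.
QUOTE_CHARS = '"\'\u201c\u201d\u2018\u2019'


def mk_int(s):
    """Parse a CSV field as an int, tolerating whitespace and surrounding quotes."""
    # Strip whitespace, then any surrounding quote characters,
    # then whitespace that may have been inside the quotes.
    s = s.strip().strip(QUOTE_CHARS).strip()
    # Empty fields are treated as 0, matching the original mk_int.
    return int(s) if s else 0
```

With this version, `mk_int('"634\u201d')` should parse the field from the error message as 634. Note that line 42 of the question also calls `int(p[11])` and `float(...)` directly, so those fields may need the same cleanup if they are quoted too.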