Python: does pandas read_csv not parse the comma decimal separator when dtype is used?

Tags: python, pandas, csv, data-analysis

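I want to open a csv file that uses a comma as the decimal separator, so I pass decimal=",". However, I always get an error saying that the float value in the last line of the csv file could not be parsed (I tried deleting some lines from the csv file, and the error always occurs on the last line...). The float values are also quoted in the csv, so I pass quoting=csv.QUOTE_ALL.

Does anyone know what is going on here? If I remove the dtype option it works, but then the amount column is a string.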

Here is the code I used (reformatted for readability; the original call is visible on line 5 of the traceback below):

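I thought this was a pandas behaviour I did not understand, but the problem turned out to be in the data being read. The float column also uses a thousands separator, as in "1.345,50", and those rows caused the trouble. The solution is to pass the read_csv option thousands='.' alongside decimal=",".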
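
As a minimal sketch of that fix, the snippet below reads two quoted rows from an in-memory string instead of a file. The sample rows are made up for illustration (only the six-column layout and the values "1.345,50" and "0,99" come from this thread), and date parsing is left out to keep the example short.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data: six quoted columns, no header row,
# comma as decimal separator and dot as thousands separator.
sample = (
    '"01/02/2018","rent","1.345,50","EUR","flat","home,monthly"\n'
    '"02/02/2018","food","0,99","EUR","snack","misc"\n'
)

expenses = pd.read_csv(
    io.StringIO(sample),
    quoting=csv.QUOTE_ALL,
    header=None,
    names=["date", "category", "amount", "currency", "note", "tags"],
    decimal=",",      # "0,99" is read as 0.99
    thousands=".",    # "1.345,50" is read as 1345.50
    dtype={"amount": float},
)

print(expenses["amount"])
```

With both options set, the amount column should come back as float64 without needing to drop the dtype argument.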

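The error message is confusing, though, because the value it reports as not convertible to float is taken from the last line of the csv file, not from the line that actually causes the problem.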
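
Since the reported value may not point at the real culprit, one way to find the problematic rows is to read every column as a string and attempt the conversion by hand. This is only a sketch under assumed sample data (the rows are invented, and the names raw, as_number and bad_rows are chosen for illustration); it is not what pandas does internally.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data; the middle row carries a thousands separator.
sample = (
    '"01/02/2018","food","12,30","EUR","lunch","misc"\n'
    '"02/02/2018","rent","1.345,50","EUR","flat","home"\n'
    '"03/02/2018","food","0,99","EUR","snack","misc"\n'
)

# Read everything as text so that no conversion can fail during parsing.
raw = pd.read_csv(
    io.StringIO(sample),
    quoting=csv.QUOTE_ALL,
    header=None,
    names=["date", "category", "amount", "currency", "note", "tags"],
    dtype=str,
)

# Swap only the decimal comma for a dot, then coerce to numbers:
# values that still cannot be converted become NaN.
as_number = pd.to_numeric(
    raw["amount"].str.replace(",", ".", regex=False),
    errors="coerce",
)

# Rows whose amount could not be converted with decimal="," alone.
bad_rows = raw[as_number.isna()]
print(bad_rows)
```

Here the row containing "1.345,50" is the one flagged, which matches the explanation that the thousands separator, not the last line, was the actual problem.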

For reference, here is the error message:

    ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()

TypeError: Cannot cast array from dtype('O') to dtype('float64') according to the rule 'safe'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-109-e77d6530ab30> in <module>()
      3 import csv
      4 dateparse = lambda x: pd.datetime.strptime(x, '%d/%m/%Y')
----> 5 expenses = pd.read_csv("/home/jovyan/work/test.csv", quoting=csv.QUOTE_ALL, decimal=",", header=None, na_values=[""],names=['date','category','amount','currency','note','tags'],parse_dates=["date"], date_parser=dateparse,  dtype={"date": str,"category": str, "amount":float, "currency": str, "note": str, "tags": str})
      6 expenses['tags']=expenses['tags'].str.split(",")
      7 #,  dtype={"date": str,"category": str, "amount":float, "currency": str, "note": str, "tags": str}

/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py in parser_f(filepath_or_buffer, sep, delimiter, header, names, index_col, usecols, squeeze, prefix, mangle_dupe_cols, dtype, engine, converters, true_values, false_values, skipinitialspace, skiprows, nrows, na_values, keep_default_na, na_filter, verbose, skip_blank_lines, parse_dates, infer_datetime_format, keep_date_col, date_parser, dayfirst, iterator, chunksize, compression, thousands, decimal, lineterminator, quotechar, quoting, escapechar, comment, encoding, dialect, tupleize_cols, error_bad_lines, warn_bad_lines, skipfooter, doublequote, delim_whitespace, low_memory, memory_map, float_precision)
    676                     skip_blank_lines=skip_blank_lines)
    677 
--> 678         return _read(filepath_or_buffer, kwds)
    679 
    680     parser_f.__name__ = name

/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py in _read(filepath_or_buffer, kwds)
    444 
    445     try:
--> 446         data = parser.read(nrows)
    447     finally:
    448         parser.close()

/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py in read(self, nrows)
   1034                 raise ValueError('skipfooter not supported for iteration')
   1035 
-> 1036         ret = self._engine.read(nrows)
   1037 
   1038         # May alter columns / col_dict

/opt/conda/lib/python3.6/site-packages/pandas/io/parsers.py in read(self, nrows)
   1846     def read(self, nrows=None):
   1847         try:
-> 1848             data = self._reader.read(nrows)
   1849         except StopIteration:
   1850             if self._first_chunk:

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader.read()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_low_memory()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._read_rows()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_column_data()

pandas/_libs/parsers.pyx in pandas._libs.parsers.TextReader._convert_tokens()

ValueError: could not convert string to float: '0,99'

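It is hard to answer this without a data sample. Ideally, share a sample of the file via gdocs, dropbox or similar. Is that possible? Otherwise, just check the last line locally to see what data is actually there.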