Python: Saving a SparseDataFrame to CSV raises an IndexError (question updated with an example)

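I tried to reproduce my problem. It turns out that it has nothing to do with the size of my dataset. Here is a minimal example that reproduces it: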

>>> import pandas as pd
>>> data = pd.SparseDataFrame({ 'user': ['a', 'b', 'c', 'd'], 'week': [4, 3, 2, 1] }, default_fill_value=0)
>>> data.info()
<class 'pandas.sparse.frame.SparseDataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 2 columns):
user    4 non-null object
week    4 non-null int64
dtypes: int64(1), object(1)
memory usage: 144.0+ bytes
>>> data.to_csv('error.csv', index=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/site-packages/pandas/core/frame.py", line 1383, in to_csv
    formatter.save()
  File "/usr/local/lib/python3.6/site-packages/pandas/formats/format.py", line 1475, in save
    self._save()
  File "/usr/local/lib/python3.6/site-packages/pandas/formats/format.py", line 1576, in _save
    self._save_chunk(start_i, end_i)
  File "/usr/local/lib/python3.6/site-packages/pandas/formats/format.py", line 1590, in _save_chunk
    quoting=self.quoting)
  File "/usr/local/lib/python3.6/site-packages/pandas/core/internals.py", line 596, in to_native_types
    values = values[:, slicer]
  File "/usr/local/lib/python3.6/site-packages/pandas/sparse/array.py", line 401, in __getitem__
    data_slice = self.values[key]
IndexError: too many indices for array
When I try to save it to a CSV file, it raises an IndexError. Is this because the data is too large? Specifying chunksize does not solve the problem.

>>> data.to_csv('../data/hashtags_binarized.csv', index=False)

---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-58-550cc98888dc> in <module>()
----> 1 get_ipython().run_cell_magic('time', '', "data.to_csv('../data/hashtags_binarized.csv', index=False)")

/usr/local/lib/python3.6/site-packages/IPython/core/interactiveshell.py in run_cell_magic(self, magic_name, line, cell)
   2113             magic_arg_s = self.var_expand(line, stack_depth)
   2114             with self.builtin_trap:
-> 2115                 result = fn(magic_arg_s, cell)
   2116             return result
   2117 

<decorator-gen-59> in time(self, line, cell, local_ns)

/usr/local/lib/python3.6/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
    186     # but it's overkill for just that one bit of state.
    187     def magic_deco(arg):
--> 188         call = lambda f, *a, **k: f(*a, **k)
    189 
    190         if callable(arg):

/usr/local/lib/python3.6/site-packages/IPython/core/magics/execution.py in time(self, line, cell, local_ns)
   1179         if mode=='eval':
   1180             st = clock2()
-> 1181             out = eval(code, glob, local_ns)
   1182             end = clock2()
   1183         else:

<timed eval> in <module>()

/usr/local/lib/python3.6/site-packages/pandas/core/frame.py in to_csv(self, path_or_buf, sep, na_rep, float_format, columns, header, index, index_label, mode, encoding, compression, quoting, quotechar, line_terminator, chunksize, tupleize_cols, date_format, doublequote, escapechar, decimal)
   1381                                      doublequote=doublequote,
   1382                                      escapechar=escapechar, decimal=decimal)
-> 1383         formatter.save()
   1384 
   1385         if path_or_buf is None:

/usr/local/lib/python3.6/site-packages/pandas/formats/format.py in save(self)
   1473                 self.writer = csv.writer(f, **writer_kwargs)
   1474 
-> 1475             self._save()
   1476 
   1477         finally:

/usr/local/lib/python3.6/site-packages/pandas/formats/format.py in _save(self)
   1574                 break
   1575 
-> 1576             self._save_chunk(start_i, end_i)
   1577 
   1578     def _save_chunk(self, start_i, end_i):

/usr/local/lib/python3.6/site-packages/pandas/formats/format.py in _save_chunk(self, start_i, end_i)
   1588                                   decimal=self.decimal,
   1589                                   date_format=self.date_format,
-> 1590                                   quoting=self.quoting)
   1591 
   1592             for col_loc, col in zip(b.mgr_locs, d):

/usr/local/lib/python3.6/site-packages/pandas/core/internals.py in to_native_types(self, slicer, na_rep, quoting, **kwargs)
    594         values = self.values
    595         if slicer is not None:
--> 596             values = values[:, slicer]
    597         mask = isnull(values)
    598 

/usr/local/lib/python3.6/site-packages/pandas/sparse/array.py in __getitem__(self, key)
    399             return self._get_val_at(key)
    400         elif isinstance(key, tuple):
--> 401             data_slice = self.values[key]
    402         else:
    403             if isinstance(key, SparseArray):

IndexError: too many indices for array

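Another option for creating the CSV: if you call toCSV("name.csv"), you get the error 'SparseDataFrame' object has no attribute 'toCSV'. So convert to a dense frame first and use .to_dense().to_csv('name.csv') instead.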


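Hey, can you pull out a piece of the data so we can recreate the error?
I added a minimal example. Actually, to_pickle works, so this may just be a bug...
Nice find. Yes, I tried to see whether SparseDataFrame has a to_csv function but could not find anything. That does not mean you cannot do it, only that the pandas documentation is sometimes a bit poor.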
df.to_dense().to_csv("name.csv", index = False, sep=',', encoding='utf-8')
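
The question targets an old pandas release; SparseDataFrame was removed in pandas 1.0, so the line above cannot be run as written on current versions. As a rough sketch of the same "densify, then write" idea on a newer pandas, assuming a regular DataFrame whose sparse data lives in sparse-dtype columns: the helper name densify and the sample values below are hypothetical, and the CSV is written to an in-memory buffer rather than to name.csv. Newer pandas versions may well write sparse columns directly, so the explicit conversion is a conservative step, not a confirmed requirement.

```python
import io

import pandas as pd


def densify(frame):
    """Return a copy of frame with every sparse column converted to dense.

    Hypothetical helper, not part of pandas.
    """
    out = frame.copy()
    for name in out.columns:
        # Only columns with a sparse dtype expose the .sparse accessor.
        if isinstance(out[name].dtype, pd.SparseDtype):
            out[name] = out[name].sparse.to_dense()
    return out


# Sample data modeled on the question's minimal example (values are made up).
df = pd.DataFrame({
    "user": ["a", "b", "c", "d"],
    "week": pd.arrays.SparseArray([4, 0, 0, 1], fill_value=0),
})

dense = densify(df)

# Write to an in-memory buffer; pass a file path instead to write to disk.
buf = io.StringIO()
dense.to_csv(buf, index=False, sep=",")
csv_text = buf.getvalue()
print(csv_text)
```

Converting column by column keeps non-sparse columns such as user untouched, which matters because the frame-level .sparse accessor only works when every column is sparse.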