Python: duplicate DatetimeIndex entries lead to an odd exception


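Consider the following contrived example, in which I create a DataFrame and then build a DatetimeIndex from a column that contains duplicates. I then put this DataFrame into a Panel and try to iterate over the major axis: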

import pandas as pd
import datetime as dt
from collections import OrderedDict  # needed for OrderedDict() below
a = [1371215933513120, 1371215933513121, 1371215933513122, 1371215933513122]
b = [1,2,3,4]
df = pd.DataFrame({'a':a, 'b':b, 'c':[dt.datetime.fromtimestamp(t/1000000.) for t in a]})
df.index=pd.DatetimeIndex(df['c'])

d = OrderedDict()
d['x'] = df
p = pd.Panel(d)

for y in p.major_axis:
    print y
    print p.major_xs(y)
This produces the following output:

2013-06-14 15:18:53.513120
                            x
a            1371215933513120
b                           1
c  2013-06-14 15:18:53.513120
2013-06-14 15:18:53.513121
                            x
a            1371215933513121
b                           2
c  2013-06-14 15:18:53.513121
2013-06-14 15:18:53.513122
followed by a (to me) rather cryptic error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-35-045aaae5a074> in <module>()
     13 for y in p.major_axis:
     14     print y
---> 15     print p.major_xs(y)

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/frame.py in __str__(self)
    667         if py3compat.PY3:
    668             return self.__unicode__()
--> 669         return self.__bytes__()
    670 
    671     def __bytes__(self):

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/frame.py in __bytes__(self)
    677         """
    678         encoding = com.get_option("display.encoding")
--> 679         return self.__unicode__().encode(encoding, 'replace')
    680 
    681     def __unicode__(self):

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/frame.py in __unicode__(self)
    692             # This needs to compute the entire repr
    693             # so don't do it unless rownum is bounded
--> 694             fits_horizontal = self._repr_fits_horizontal_()
    695 
    696         if fits_vertical and fits_horizontal:

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/frame.py in _repr_fits_horizontal_(self)
    652             d=d.iloc[:min(max_rows, height,len(d))]
    653 
--> 654         d.to_string(buf=buf)
    655         value = buf.getvalue()
    656         repr_width = max([len(l) for l in value.split('\n')])

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/frame.py in to_string(self, buf, columns, col_space, colSpace, header, index, na_rep, formatters, float_format, sparsify, nanRep, index_names, justify, force_unicode, line_width)
   1489                                            header=header, index=index,
   1490                                            line_width=line_width)
-> 1491         formatter.to_string()
   1492 
   1493         if buf is None:

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/format.py in to_string(self, force_unicode)
    312             text = info_line
    313         else:
--> 314             strcols = self._to_str_columns()
    315             if self.line_width is None:
    316                 text = adjoin(1, *strcols)

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/format.py in _to_str_columns(self)
    265         for i, c in enumerate(self.columns):
    266             if self.header:
--> 267                 fmt_values = self._format_col(i)
    268                 cheader = str_columns[i]
    269 

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/format.py in _format_col(self, i)
    403                             float_format=self.float_format,
    404                             na_rep=self.na_rep,
--> 405                             space=self.col_space)
    406 
    407     def to_html(self, classes=None):

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/format.py in format_array(values, formatter, float_format, na_rep, digits, space, justify)
   1319                         justify=justify)
   1320 
-> 1321     return fmt_obj.get_result()
   1322 
   1323 

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/format.py in get_result(self)
   1335 
   1336     def get_result(self):
-> 1337         fmt_values = self._format_strings()
   1338         return _make_fixed_width(fmt_values, self.justify)
   1339 

/usr/local/lib/python2.7/dist-packages/pandas-0.11.0-py2.7-linux-x86_64.egg/pandas/core/format.py in _format_strings(self)
   1362 
   1363         print "vals:", vals
-> 1364         is_float = lib.map_infer(vals, com.is_float) & notnull(vals)
   1365         leading_space = is_float.any()
   1366 

ValueError: operands could not be broadcast together with shapes (2) (2,3)
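Now, having explained that I am creating an index with duplicate entries, the source of the error is clear. Without knowing that, however, it would be much harder (for a novice like me) to work out why this exception occurs.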
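To see how a duplicated label can change the shape of a lookup result, here is a minimal sketch. It assumes a recent pandas release, where `Panel` is no longer available, so it uses plain `DataFrame.loc` instead of `major_xs`. It is not the code path from the traceback, only an illustration of the same ambiguity.

```python
import pandas as pd

# Same microsecond timestamps as in the question; the last two are identical.
a = [1371215933513120, 1371215933513121, 1371215933513122, 1371215933513122]
df = pd.DataFrame({'a': a, 'b': [1, 2, 3, 4]},
                  index=pd.to_datetime(a, unit='us'))

unique_label = df.index[0]
duplicate_label = df.index[2]

# A label that occurs once selects a single row, returned as a Series.
print(type(df.loc[unique_label]).__name__)

# A label that occurs twice selects both rows, returned as a DataFrame.
print(df.loc[duplicate_label].shape)
```

Code that expects one row per label therefore receives a two-dimensional result for the repeated timestamp, which is consistent with a shape mismatch appearing only on the third label.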

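This raises a few questions:

  • Is this really the expected behaviour of pandas? Is creating an index that contains duplicates forbidden, or only iterating over one?
  • If creating such an index is forbidden, shouldn't an exception be raised when it is first created?
  • If iterating is what is incorrect, shouldn't the error be more informative?
  • Am I doing something wrong?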
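On the question of failing early: the sketch below, written against a recent pandas release (an assumption, since the question uses 0.11), shows two ways to surface duplicates at index-creation time. `Index.is_unique` and `Index.duplicated()` report them, and `set_index(..., verify_integrity=True)` raises straight away. The variable names are illustrative only.

```python
import pandas as pd

a = [1371215933513120, 1371215933513121, 1371215933513122, 1371215933513122]
df = pd.DataFrame({'a': a, 'b': [1, 2, 3, 4]})
df['c'] = pd.to_datetime(df['a'], unit='us')

# Building the index succeeds; duplicates are only reported if we ask.
indexed = df.set_index('c')
print(indexed.index.is_unique)

# Boolean mask marking every repeat after the first occurrence.
dup_mask = indexed.index.duplicated()
print(indexed[dup_mask])

# verify_integrity=True turns a duplicate index into an immediate error.
try:
    df.set_index('c', verify_integrity=True)
    raised = False
except ValueError as exc:
    raised = True
    print("rejected:", exc)
```

By default pandas accepts the duplicate index without complaint, so a check like this has to be requested explicitly.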

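  • The second part of the post (starting with the key), what code generated that?
  • Oh, sorry. I had put some debugging statements into the pandas code to try to work out what was going on inside pandas, before I realised the problem was caused by the duplicate entries. I have changed it to include the stock output.
  • Well... dup column support in 0.12 would be better (but this is still wrong).
  • Duplicates are supported in a lot of cases. In general, using duplicates in a single-level index is not a good idea; use a MultiIndex. See here: . I am marking this as a bug. Can you state here what your goal is? I can help you create a structure to achieve it.
  • Thanks Jess. To be honest, this workflow is actually a bug in my code as well (I shouldn't have duplicate timestamps). The issue is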
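The MultiIndex suggestion in the comments could look roughly like the sketch below. This is one possible reading of that advice, not the commenter's own code, and it assumes a recent pandas release. The level names `timestamp` and `occurrence` are made up for illustration.

```python
import pandas as pd

a = [1371215933513120, 1371215933513121, 1371215933513122, 1371215933513122]
df = pd.DataFrame({'a': a, 'b': [1, 2, 3, 4]},
                  index=pd.to_datetime(a, unit='us'))

# Number the rows within each timestamp: 0 for the first, 1 for a repeat, ...
occurrence = df.groupby(level=0).cumcount()

# 'timestamp' and 'occurrence' are illustrative level names.
df.index = pd.MultiIndex.from_arrays(
    [df.index, occurrence.to_numpy()],
    names=['timestamp', 'occurrence'],
)

print(df.index.is_unique)
for key in df.index:
    row = df.loc[key]  # a full (timestamp, occurrence) key selects one row
    print(key[1], row['b'])
```

With the second level in place, every key is unique, so each lookup returns a single row even where the timestamps repeat.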
    