Python multi-index columns (Python, pandas, Jupyter Notebook)


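First of all, I am using Python 3.50 in a Jupyter notebook.

I want to create a DataFrame to display some data in a report. I would like it to have two index columns (sorry if that is not the correct term; I am not used to working with pandas).

I have the following example code: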

import numpy as np
import pandas as pd

frame = pd.DataFrame(np.arange(12).reshape((4, 3)),
                     index=[['a', 'a', 'b', 'b'], [1, 2, 1, 2]],
                     columns=[['Ohio', 'Ohio', 'Ohio'], ['Green', 'Red', 'Green']])

But when I try to adapt it to my case, it gives me an error:

cell_rise_Inv= pd.DataFrame([[0.00483211, 0.00511619, 0.00891821, 0.0449637, 0.205753], 
                             [0.00520049, 0.00561577, 0.010993, 0.0468998, 0.207461],
                             [0.00357213, 0.00429087, 0.0132186, 0.0536389, 0.21384],
                             [-0.0021868, -0.0011312, 0.0120546, 0.0647213, 0.224749],
                             [-0.0725403, -0.0700884, -0.0382486, 0.0899121, 0.313639]], 
                            index =[['transition [ns]','transition [ns]','transition [ns]','transition [ns]','transition [ns]'],
                                   [0.0005, 0.001, 0.01, 0.1, 0.5]],
                            columns =[[0.01, 0.02, 0.05, 0.1, 0.5],['capacitance [pF]','capacitance [pF]','capacitance [pF]','capacitance [pF]','capacitance [pF]']])
cell_rise_Inv

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-89-180a1ad88403> in <module>()
      6                             index =[['transition [ns]','transition [ns]','transition [ns]','transition [ns]','transition [ns]'],
      7                                    [0.0005, 0.001, 0.01, 0.1, 0.5]],
----> 8                             columns =[[0.01, 0.02, 0.05, 0.1, 0.5],['capacitance [pF]','capacitance [pF]','capacitance [pF]','capacitance [pF]','capacitance [pF]']])
      9 cell_rise_Inv

C:\Users\Josele\Anaconda3\lib\site-packages\pandas\core\frame.py in __init__(self, data, index, columns, dtype, copy)
    261                     if com.is_named_tuple(data[0]) and columns is None:
    262                         columns = data[0]._fields
--> 263                     arrays, columns = _to_arrays(data, columns, dtype=dtype)
    264                     columns = _ensure_index(columns)
    265 

C:\Users\Josele\Anaconda3\lib\site-packages\pandas\core\frame.py in _to_arrays(data, columns, coerce_float, dtype)
   5350     if isinstance(data[0], (list, tuple)):
   5351         return _list_to_arrays(data, columns, coerce_float=coerce_float,
-> 5352                                dtype=dtype)
   5353     elif isinstance(data[0], collections.Mapping):
   5354         return _list_of_dict_to_arrays(data, columns,

C:\Users\Josele\Anaconda3\lib\site-packages\pandas\core\frame.py in _list_to_arrays(data, columns, coerce_float, dtype)
   5429         content = list(lib.to_object_array(data).T)
   5430     return _convert_object_array(content, columns, dtype=dtype,
-> 5431                                  coerce_float=coerce_float)
   5432 
   5433 

C:\Users\Josele\Anaconda3\lib\site-packages\pandas\core\frame.py in _convert_object_array(content, columns, coerce_float, dtype)
   5487             # caller's responsibility to check for this...
   5488             raise AssertionError('%d columns passed, passed data had %s '
-> 5489                                  'columns' % (len(columns), len(content)))
   5490 
   5491     # provide soft conversion of object dtypes

AssertionError: 2 columns passed, passed data had 5 columns

Any ideas? I don't understand why the example works and mine doesn't.

Thanks in advance :)

It does look inconsistent. I would use the pd.MultiIndex.from_arrays constructor.
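A minimal sketch of that suggestion, using the data from the question. Building both MultiIndex objects explicitly means the DataFrame constructor does not have to guess whether a list of two lists is meant as two levels. The helper names `transitions` and `capacitances` are introduced here for illustration only, and the level order follows the other answer (label first, then number).

```python
import pandas as pd

# Data from the question, kept as a plain nested list (no np.array wrapper).
data = [[0.00483211, 0.00511619, 0.00891821, 0.0449637, 0.205753],
        [0.00520049, 0.00561577, 0.010993, 0.0468998, 0.207461],
        [0.00357213, 0.00429087, 0.0132186, 0.0536389, 0.21384],
        [-0.0021868, -0.0011312, 0.0120546, 0.0647213, 0.224749],
        [-0.0725403, -0.0700884, -0.0382486, 0.0899121, 0.313639]]

# Illustrative names for the second level of each axis.
transitions = [0.0005, 0.001, 0.01, 0.1, 0.5]
capacitances = [0.01, 0.02, 0.05, 0.1, 0.5]

# Build each two-level index explicitly from its arrays.
index = pd.MultiIndex.from_arrays([['transition [ns]'] * 5, transitions])
columns = pd.MultiIndex.from_arrays([['capacitance [pF]'] * 5, capacitances])

cell_rise_Inv = pd.DataFrame(data, index=index, columns=columns)
print(cell_rise_Inv)
```

Because `index` and `columns` are already MultiIndex objects, their length (5) is what gets compared against the data, not the length of the outer list (2).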

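There is one major difference between your code and the example: the example passes a numpy array as input rather than a nested list. In fact, adding np.array(...) around your list works just fine: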

cell_rise_Inv = pd.DataFrame(
    np.array([[0.00483211, 0.00511619, 0.00891821, 0.0449637, 0.205753],
              [0.00520049, 0.00561577, 0.010993, 0.0468998, 0.207461],
              [0.00357213, 0.00429087, 0.0132186, 0.0536389, 0.21384],
              [-0.0021868, -0.0011312, 0.0120546, 0.0647213, 0.224749],
              [-0.0725403, -0.0700884, -0.0382486, 0.0899121, 0.313639]]),
    index=[['transition [ns]'] * 5, [0.0005, 0.001, 0.01, 0.1, 0.5]],
    columns=[['capacitance [pF]'] * 5, [0.01, 0.02, 0.05, 0.1, 0.5]])

I shortened the repeated strings in the index and swapped the order of the column levels, but neither of these is a significant change.

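Edit: I did a little digging. If you pass in the nested list without the np.array call, the call works fine without columns, and even when columns is a 1D list. For some reason, a nested list of two elements is not interpreted as a MultiIndex unless the input data is an ndarray.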
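A small sketch for checking this on whatever pandas version is installed. The helper `can_build` and the toy data are made up for illustration. The nested-list case with two-level columns is the one that raised the AssertionError in the question; other pandas versions may accept it, so the script only reports what happens rather than assuming a result.

```python
import numpy as np
import pandas as pd


def can_build(data, columns):
    """Return True if pd.DataFrame accepts this data/columns pair."""
    try:
        pd.DataFrame(data, columns=columns)
    except (AssertionError, ValueError):
        # The question's pandas raised AssertionError for the mismatch;
        # ValueError is caught too in case another version reports it that way.
        return False
    return True


nested = [[1, 2, 3], [4, 5, 6]]
flat_columns = ['x', 'y', 'z']
two_level_columns = [['A', 'A', 'A'], ['x', 'y', 'z']]

results = {
    'list, no columns': can_build(nested, None),
    'list, 1D columns': can_build(nested, flat_columns),
    'list, two-level columns': can_build(nested, two_level_columns),
    'ndarray, two-level columns': can_build(np.array(nested), two_level_columns),
}

for case, ok in results.items():
    print(case, '->', 'ok' if ok else 'raises')
```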

I filed an issue with pandas based on this question.

The error says that you are not passing data whose shape matches the index: AssertionError: 2 columns passed, passed data had 5 columns. It looks like your index repeats 'capacitance [pF]' five times while the data only has two columns... Also, you may want to swap the order of the 'capacitance [pF]' label and the numbers in the MultiIndex.

What do you mean? Can you show the actual lines you are running, with all the input?

Sorry, I misread your output, since all of the output is part of the error. I didn't realize that your input was at the top.

I feel like an idiot, my bad. I didn't read the OP's output fully. Downvote removed.

"It's been a slow day and I didn't mean to take it out on you."

Physicist, kudos for commenting on the downvote and correcting the mistake. I recently closed someone's post by mistake because I was too tired... it happens.