Python pandas MultiIndex DataFrame: data inserted into a new column and sub-row is only visible in the column view?


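I am trying to insert a list of data into a multi-level DataFrame. It seems to work fine, but when I view the whole DataFrame, the new sub-row is not there. Here is an example: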

Create an empty MultiIndex DataFrame:

import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_product([['A','B','C'], ['a', 'b','c']]) #set up index 
df = pd.DataFrame(columns=['col1'], index=ind)                    #create empty df with multi-level nested index
print(df)

Inserting a new column works fine:

newcol = 'col2'      #new column name
df[newcol] = np.nan  #fill new column with nans
print(df)

Inserting data into an existing sub-row works for point data, but not for a list:

df[newcol]['A','a'] = 1        #works with point data but not with list
print(df)

When viewing just the one column, inserting a new sub-row looks fine:

df[newcol]['A','d'] = [1,2,3]  #insert into new sub-row 'd'
print(df[newcol])              #view just new column

But it is not visible when viewing the whole DataFrame. Why?

print(df)

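Also, when I try different ways of inserting the data, I run into the following problems. Using df.loc[] works perfectly for a single data point, but not for a list: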

df.loc[('A','f'),  newcol] = 1          #create new row at [(row,sub-row),column] & insert point data
print(df)                               #works fine

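The same method, but inserting a list, returns an error:

df.loc[('A','f'),  newcol] = [1,2,3]    #create new row at [(row,sub-row),column] & insert list data

TypeError: object of type 'numpy.float64' has no len()

Using df.at[] returns an error for both point and list data:

df.at[('A','f'), newcol] = [1,2,3]      #insert into existing sub-row 'f'

KeyError: ('A', 'f')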

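When you do

df[newcol]['A','d'] = [1,2,3]

it is a chained-indexing assignment, so the result is unpredictable. Pandas does not guarantee correct behavior when you assign through chained indexing. When you run that command, pandas issues a warning, and that warning even includes a link to the full explanation. I won't go into the details here, because the page linked from the warning explains chained indexing well.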
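To make the "only visible in the column view" symptom concrete, here is a minimal sketch that splits the chained assignment into its two steps. The scalar value 5.0 and the variable names (col, series_has_row, and so on) are illustrative assumptions, not part of the original question, and the exact warnings emitted may differ between pandas versions.

```python
import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_product([['A', 'B', 'C'], ['a', 'b', 'c']])
df = pd.DataFrame(columns=['col1'], index=ind)
df['col2'] = np.nan

# Step 1 of the chain: df['col2'] hands back a Series object.
col = df['col2']
# Step 2: assigning to a label that does not exist enlarges that Series only.
col.loc[('A', 'd')] = 5.0

series_has_row = ('A', 'd') in col.index   # the Series gained the new row
frame_has_row = ('A', 'd') in df.index     # the DataFrame index did not change
print(series_has_row, frame_has_row)

# Single-step alternative: one .loc call on the frame itself enlarges df.
df.loc[('A', 'd'), 'col2'] = 5.0
frame_has_row_after_loc = ('A', 'd') in df.index
print(frame_has_row_after_loc)
```

The new row lives on the intermediate Series rather than on df, which is why a single .loc call on the DataFrame is the safer form.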

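Assigning a list to a cell is always a pain. However, it is doable. I guess the problem with df.loc[('A','f'), newcol] = [1,2,3] is that, because col2 has dtype float, pandas does not consider [1,2,3] to be a single list object. It treats [1,2,3] as a list of multiple numeric values, so the assignment fails. I don't know whether this is a bug or intentional.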
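The dtype diagnosis can be checked directly. The sketch below rebuilds the DataFrame from the question and inspects the column dtype before and after a cast; the variable names float_dtype and object_dtype are illustrative only.

```python
import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_product([['A', 'B', 'C'], ['a', 'b', 'c']])
df = pd.DataFrame(columns=['col1'], index=ind)
df['col2'] = np.nan

# Filling a column with np.nan produces a float column.
float_dtype = df['col2'].dtype
# Casting to 'O' yields an object column, which can hold arbitrary Python objects.
object_dtype = df['col2'].astype('O').dtype
print(float_dtype, object_dtype)
```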

To fix the problem with .loc, convert col2 to dtype object and then do the assignment:

df['col2'] = df['col2'].astype('O')
df.loc[('A','f'),  'col2'] = [1,2,3]

print(df)

Out[1911]:
    col1       col2
A a  NaN        NaN
  b  NaN        NaN
  c  NaN        NaN
B a  NaN        NaN
  b  NaN        NaN
  c  NaN        NaN
C a  NaN        NaN
  b  NaN        NaN
  c  NaN        NaN
A f  NaN  [1, 2, 3]

print(df['col2'])

Out[1912]:
A  a          NaN
   b          NaN
   c          NaN
B  a          NaN
   b          NaN
   c          NaN
C  a          NaN
   b          NaN
   c          NaN
A  f    [1, 2, 3]
Name: col2, dtype: object
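How .loc handles a list assigned to a single cell may vary across pandas versions, so the following is a hedged variant rather than the method given above. It assumes the row can first be created with a scalar (as the question reports for .loc) and that .at stores the list as one object once the column has dtype object. The variable name cell is illustrative.

```python
import numpy as np
import pandas as pd

ind = pd.MultiIndex.from_product([['A', 'B', 'C'], ['a', 'b', 'c']])
df = pd.DataFrame(columns=['col1'], index=ind)
df['col2'] = np.nan

# Create the new row with a scalar first; scalar enlargement through .loc
# is the case the question reports as working.
df.loc[('A', 'f'), 'col2'] = 1

# Cast to object so a cell may hold a Python list.
df['col2'] = df['col2'].astype('O')

# .at addresses exactly one cell, so the list is stored as a single object.
df.at[('A', 'f'), 'col2'] = [1, 2, 3]

cell = df.at[('A', 'f'), 'col2']
print(cell)
```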

