Pivoting a DataFrame in Python pandas?

python pandas dataframe

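Hi,

Is it possible to turn this table

people1 trait1 YES
people1 trait2 YES
people1 trait3 NO
people1 trait4 RED
people2 trait1 NO
people2 trait2 YES
people2 trait4 BLACK

into something like this?

        trait1, trait2, trait3, trait4 ...
people1  YES     YES     NO      RED
people2  NO      YES     -       BLACK
people3  -        -      YES     BLUE

The file is too big to do this in Excel. I tried to do it in pandas, but could not find help for this case. I found the pd.pivot_table function, but I cannot produce working code. I tried and got all sorts of errors (99% of them my own fault).

Can someone explain how to use it in my case? Or is there perhaps a better option than pandas?

Edit

I rebuilt my frame:

1      'interpretation'     'trait'
p1           YES               t1
p1           BLACK             t2
p1           NO                t3
p2           NO                t1
p2           RED               t2
p2           NO                t3

I use:

data1.pivot_table(index=1, columns="name", values='trait', aggfunc=','.join, fill_value='-')

and I get:

TypeError: sequence item 0: expected str instance, float found

If I change it to

data1.pivot_table(index=1, columns="trait", values='value', aggfunc=','.join, fill_value='-')

I get a table in the wrong order, but no error:

     p1      p2    p3    p4
YES  trait1  t1
YES  t1      t2 etc.
NO
RED
No
...

So I think the first option is the correct one, but I cannot fix this error. When I check the dtypes of df, it returns object (O) for all columns.

I think the problem is missing values in the trait column, which make the join function fail. A possible solution is therefore to replace the missing values with empty strings before pivoting.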
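To see why the join fails, the sketch below reproduces the error outside of pivot_table. The three-row frame is a hypothetical miniature of the question's data with one missing trait; only the column labels (1, name, trait) are taken from the thread.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame; one trait is missing (NaN).
data1 = pd.DataFrame({
    1: ["p1", "p1", "p2"],
    "name": ["YES", "NO", "NO"],
    "trait": [np.nan, "t3", "t1"],
})

# str.join only accepts strings; NaN is a float, so joining the raw column fails.
try:
    ",".join(data1["trait"])
    join_error = None
except TypeError as exc:
    join_error = str(exc)
print(join_error)

# After filling the gaps with empty strings the same join succeeds.
filled = data1["trait"].fillna("")
joined = ",".join(filled)
print(repr(joined))
```

pivot_table calls the aggregation function once per group of cells, so a single NaN in any group is enough to trigger the same TypeError there.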

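Can you add your code and the error?

Usually: ValueError: Index contains duplicate entries, cannot reshape, even when I remove duplicates from df. I think I am just doing something wrong. I saw the post you marked, but I cannot solve my problem based on it. My code currently looks like this:

data1.pivot(index=1, columns=2, values=3).drop_duplicates()

I mean the pivot part, but I am still trying; this is the simplest check. I have tried all sorts of things with it.

I suggest using

df = df.pivot_table(index='col1', columns='col2', values='col3', aggfunc=','.join, fill_value='-')

Thanks for the answer. It does create a table. What I need is completely different, but I know I mismatched the columns with the arguments. I will try.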
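The ValueError mentioned in the comments can be reproduced with a small frame. This is only a sketch: the column names col1, col2 and col3 follow the suggestion above, and the rows are made up for illustration. It shows that pivot refuses repeated (index, columns) pairs, while pivot_table aggregates them.

```python
import pandas as pd

# Hypothetical data: the pair (p2, NO) occurs twice with different values,
# so the rows are not exact duplicates and drop_duplicates would keep both.
df = pd.DataFrame({
    "col1": ["p1", "p2", "p2"],
    "col2": ["NO", "NO", "NO"],
    "col3": ["t3", "t1", "t3"],
})

# pivot needs every (index, columns) pair to be unique, so this raises.
try:
    df.pivot(index="col1", columns="col2", values="col3")
    pivot_error = None
except ValueError as exc:
    pivot_error = str(exc)
print(pivot_error)

# pivot_table resolves the clash by aggregating; here the strings are joined.
wide = df.pivot_table(index="col1", columns="col2", values="col3",
                      aggfunc=",".join)
print(wide)
```

Calling drop_duplicates after pivot, as in the comment's code, comes too late to help, because the exception is raised inside pivot itself.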

print (data1)
    1   name trait
0  p1    YES   NaN <- missing value
1  p1  BLACK    t2
2  p1     NO    t3
3  p2     NO    t1
4  p2    RED    t2
5  p2     NO    t3

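# Replace missing traits with empty strings so that ','.join only receives str values
data1['trait'] = data1['trait'].fillna('')
# 'name' becomes the row index and column 1 the columns, matching the output below;
# traits that share a cell are joined with commas, empty cells are shown as '-'
df = data1.pivot_table(index="name",
                       columns=1,
                       values='trait',
                       aggfunc=','.join,
                       fill_value='-')
print(df)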
1      p1     p2
name            
BLACK  t2      -
NO     t3  t1,t3
RED     -     t2
YES            -

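# Same pivot as above, then turn the index into a regular column
# and drop the leftover name of the columns axis
data1['trait'] = data1['trait'].fillna('')
df = (data1.pivot_table(index="name",
                        columns=1,
                        values='trait',
                        aggfunc=','.join,
                        fill_value='-')
           .reset_index()
           .rename_axis(None, axis=1))
print(df)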
    name  p1     p2
0  BLACK  t2      -
1     NO  t3  t1,t3
2    RED   -     t2
3    YES          -
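For the layout originally asked for (one row per person, one column per trait), the same pivot_table call can be pointed at the first sample table from the question. This is a minimal sketch: the column names person, trait and value are assumptions, since the original file has no header.

```python
import pandas as pd

# Sample rows copied from the question; the column names are assumed.
long_df = pd.DataFrame(
    [
        ("people1", "trait1", "YES"),
        ("people1", "trait2", "YES"),
        ("people1", "trait3", "NO"),
        ("people1", "trait4", "RED"),
        ("people2", "trait1", "NO"),
        ("people2", "trait2", "YES"),
        ("people2", "trait4", "BLACK"),
    ],
    columns=["person", "trait", "value"],
)

# One row per person, one column per trait; '-' marks missing combinations.
wide = (
    long_df.pivot_table(index="person", columns="trait", values="value",
                        aggfunc=",".join, fill_value="-")
    .rename_axis(None, axis=1)  # drop the 'trait' label above the columns
)
print(wide)
```

If the real file contains missing values in the value column, they would need the same fillna('') treatment first.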