Python: create a new column and fill it with conditional values in a DataFrame

python pandas

I have a DataFrame:

 sepallength sepalwidth petallength petalwidth        class   cluster
0         5.1        3.5         1.4        0.2  Iris-setosa  cluster1
1         4.9          3         1.4        0.2  Iris-setosa  cluster1
2         4.7        3.2         1.3        0.2  Iris-setosa  cluster1
3         4.6        3.1         1.5        0.2  Iris-setosa  cluster1
4           5        3.6         1.4        0.2  Iris-setosa  cluster1
5         5.4        3.9         1.7        0.4  Iris-setosa  cluster1
6         4.6        3.4         1.4        0.3  Iris-setosa  cluster1
7           5        3.4         1.5        0.2  Iris-setosa  cluster1
8         4.4        2.9         1.4        0.2  Iris-setosa  cluster1
9         4.9        3.1         1.5        0.1  Iris-setosa  cluster1

And a dictionary:

{'cluster2': 'Iris-virginica', 'cluster0': 'Iris-versicolor', 'cluster1': 'Iris-setosa'}

I need to add another column and fill it with the dictionary value whose key matches df['cluster'].

I tried using np.where:

def countTruth(df):
    # dictionary mapping cluster to most frequent class

    clustersClass = df.groupby(['cluster'])['class'].agg(lambda x:x.value_counts().index[0]).to_dict()
    for eachKey in clustersClass:
        newv = clustersClass[eachKey]
        print df
        df['new'] = np.where(df['cluster']==eachKey , newv)

This crashes with an error saying that either both or neither of x and y should be given.
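For context on that error: np.where(condition) with a single argument only returns indices, so to pick values it needs both x (used where the condition is True) and y (used where it is False). The sketch below shows one way the loop could be made to work. The toy DataFrame and the name `mapping` are assumptions for illustration, not the asker's data, and passing the existing column as y is just one choice that keeps earlier iterations from being overwritten.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the question's DataFrame.
df = pd.DataFrame({"cluster": ["cluster1", "cluster0", "cluster2", "cluster1"]})
mapping = {
    "cluster2": "Iris-virginica",
    "cluster0": "Iris-versicolor",
    "cluster1": "Iris-setosa",
}

# Start with an empty column so there is something to fall back on.
df["new"] = None
for key, label in mapping.items():
    # x = label where the cluster matches, y = the current value otherwise,
    # so rows filled in by earlier iterations are preserved.
    df["new"] = np.where(df["cluster"] == key, label, df["new"])

print(df)
```

This works, but it does one full pass over the column per dictionary key, which is more work than a single lookup per row.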

My end goal is to compute true positives, true negatives, FP, and FN from the cluster and class labels. This is a step toward..

Call map and pass the dict:

In [326]:

d={'cluster2': 'Iris-virginica', 'cluster0': 'Iris-versicolor', 'cluster1': 'Iris-setosa'}
df['key'] = df['cluster'].map(d)
df
Out[326]:
   sepallength  sepalwidth  petallength  petalwidth        class   cluster  \
0          5.1         3.5          1.4         0.2  Iris-setosa  cluster1   
1          4.9         3.0          1.4         0.2  Iris-setosa  cluster1   
2          4.7         3.2          1.3         0.2  Iris-setosa  cluster1   
3          4.6         3.1          1.5         0.2  Iris-setosa  cluster1   
4          5.0         3.6          1.4         0.2  Iris-setosa  cluster1   
5          5.4         3.9          1.7         0.4  Iris-setosa  cluster1   
6          4.6         3.4          1.4         0.3  Iris-setosa  cluster1   
7          5.0         3.4          1.5         0.2  Iris-setosa  cluster1   
8          4.4         2.9          1.4         0.2  Iris-setosa  cluster1   
9          4.9         3.1          1.5         0.1  Iris-setosa  cluster1   

           key  
0  Iris-setosa  
1  Iris-setosa  
2  Iris-setosa  
3  Iris-setosa  
4  Iris-setosa  
5  Iris-setosa  
6  Iris-setosa  
7  Iris-setosa  
8  Iris-setosa  
9  Iris-setosa
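As a rough sketch of how this could feed into the stated end goal, the example below derives the dict from the data (as the question's groupby does), maps it onto a 'key' column, and counts TP/FP/FN/TN for one class at a time. The toy data and the helper name `confusion_counts` are hypothetical, and treating each class one-vs-rest is an assumption about what the asker wants.

```python
import pandas as pd

# Hypothetical toy data: true labels and cluster assignments.
df = pd.DataFrame({
    "class": ["Iris-setosa", "Iris-setosa", "Iris-versicolor",
              "Iris-versicolor", "Iris-virginica", "Iris-virginica"],
    "cluster": ["cluster1", "cluster1", "cluster0",
                "cluster2", "cluster2", "cluster2"],
})

# Most frequent class per cluster, as in the question's groupby.
cluster_to_class = (
    df.groupby("cluster")["class"]
      .agg(lambda s: s.value_counts().index[0])
      .to_dict()
)

# One vectorised lookup per row, as in the answer.
df["key"] = df["cluster"].map(cluster_to_class)


def confusion_counts(frame, label):
    """One-vs-rest counts for `label` (hypothetical helper)."""
    actual = frame["class"] == label
    predicted = frame["key"] == label
    return {
        "TP": int((actual & predicted).sum()),
        "FP": int((~actual & predicted).sum()),
        "FN": int((actual & ~predicted).sum()),
        "TN": int((~actual & ~predicted).sum()),
    }


print(confusion_counts(df, "Iris-virginica"))
```

Note that map yields NaN for any cluster value that is not a key in the dict, so such rows would count as not predicted for every class.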

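Great, @EdChum, you always have the answer and you understand the problem: +1. I usually consult the pandas documentation, but I can't find exactly what I need and end up with fairly convoluted approaches. Are there any books or references that could help me get better at pandas data manipulation?

There are books, and there are online resources too. Honestly, as with anything, it comes down to trying things out. I don't use pandas professionally, but I try to answer questions, and I've learned a lot from other pandas users and developers.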