Python: Remove distinct pairs from a DataFrame (pandas)

I have a pandas DataFrame with two columns containing text values:

import pandas as pd

df = pd.DataFrame({"text": ["how are you", "this is an apple", "how are you", "hello my friend", "how are you", "this is an apple", "are you ok", "are you ok"],
                  "type": ["question", "statement", "question", "statement", "statement", "question", "question", "question"]})

print(df)

               text       type
0       how are you   question
1  this is an apple  statement
2       how are you   question
3   hello my friend  statement
4       how are you  statement
5  this is an apple   question
6        are you ok   question
7        are you ok   question

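I want to find the pairs (2 or more rows sharing the same "text" value) that have different "type" column values, and remove them. For example, you can see that the value "how are you" has both "question" and "statement". So my result should be: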

               text       type
3   hello my friend  statement
6        are you ok   question
7        are you ok   question

because the text values "are you ok" and "hello my friend" each have a single unique value for "type".

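I tried drop_duplicates(), but it did not give the result I wanted. I am thinking of grouping by the "text" column, but I don't know how to check whether a group has different (non-unique) "type" column values.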

This is a job for groupby().nunique():
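A minimal sketch of how groupby().nunique() might be applied here, assuming the goal is to keep only the rows whose "text" value occurs with exactly one distinct "type". The use of transform and the variable names n_types and result are assumptions for illustration, not necessarily the answerer's exact code:

```python
import pandas as pd

df = pd.DataFrame({"text": ["how are you", "this is an apple", "how are you", "hello my friend",
                            "how are you", "this is an apple", "are you ok", "are you ok"],
                   "type": ["question", "statement", "question", "statement",
                            "statement", "question", "question", "question"]})

# For every row, count the distinct "type" values within its "text" group.
# transform returns a Series aligned with df, so it can be used as a row mask.
n_types = df.groupby("text")["type"].transform("nunique")

# Keep only the rows whose text maps to a single type.
result = df[n_types == 1]
print(result)
```

Because transform keeps the original index, the mask can be applied directly without merging the group counts back onto the frame.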

Output:

              text       type
3  hello my friend  statement
6       are you ok   question
7       are you ok   question

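A different approach, using pd.crosstab:

# Cross-tabulate text against type, then keep the texts that do not occur with every type
s = ~pd.crosstab(df["text"], df["type"]).ne(0).all(axis=1)
df.loc[df["text"].isin(s.index[s])]

              text       type
3  hello my friend  statement
6       are you ok   question
7       are you ok   question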
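Note that the crosstab expression above keeps any text that is missing at least one type, which coincides with "has a single type" only when there are exactly two distinct "type" values, as in this data. A hedged variant that counts the nonzero columns per text instead (the names counts and single_type are illustrative):

```python
import pandas as pd

df = pd.DataFrame({"text": ["how are you", "this is an apple", "how are you", "hello my friend",
                            "how are you", "this is an apple", "are you ok", "are you ok"],
                   "type": ["question", "statement", "question", "statement",
                            "statement", "question", "question", "question"]})

# Rows are texts, columns are types, cells are occurrence counts.
counts = pd.crosstab(df["text"], df["type"])

# A text has a single type when exactly one of its cells is nonzero.
single_type = counts.ne(0).sum(axis=1).eq(1)

# Select the original rows whose text passed the check.
result = df.loc[df["text"].isin(single_type.index[single_type])]
print(result)
```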
