Python: a question about data manipulation (Python, Pandas) - Fatal编程技术网


I want to remove duplicate combinations of fruit and color from my observations, dropping the rows where response = "wrong".

You can use drop_duplicates.

Ex:

import pandas as pd

# Sample data: a (fruit, color) pair can appear with more than one response
data = [{'fruit': 'apple', 'color': 'red', 'response': 'right'},
        {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
        {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
        {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
        {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'}]

df = pd.DataFrame(data)

# Keep the first row of each (fruit, color) combination
print(df.drop_duplicates(['fruit', 'color']))

Output:

    color      fruit response
0     red      apple    right
2   green  pineapple     True
4  orange     orange    wrong
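
Note that drop_duplicates keeps the first occurrence of each key by default, so which response survives depends on row order. Here is a minimal sketch of that behavior. The reordered rows are an assumption made for illustration and are not the data from the question.

```python
import pandas as pd

# Hypothetical ordering: the 'wrong' row for apple comes first
rows = [
    {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
    {'fruit': 'apple', 'color': 'red', 'response': 'right'},
    {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'},
]
df = pd.DataFrame(rows)

# keep='first' is the default; keep='last' keeps the final row of each group
first = df.drop_duplicates(['fruit', 'color'])
last = df.drop_duplicates(['fruit', 'color'], keep='last')

print(first['response'].tolist())  # ['wrong', 'wrong']
print(last['response'].tolist())   # ['right', 'wrong']
```

Neither setting of keep looks at the value of response, so on its own drop_duplicates cannot prefer "right" over "wrong".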

First, sort by the "response" column:

df.sort_values(['response'], inplace=True)

Output:

    color      fruit response
2   green  pineapple     True
0     red      apple    right
1     red      apple    wrong
3   green  pineapple    wrong
4  orange     orange    wrong

Then you can use drop_duplicates to remove the duplicate values:

df.drop_duplicates(['color','fruit'], inplace=True)

Output:

    color      fruit response
2   green  pineapple     True
0     red      apple    right
4  orange     orange    wrong

You can use:

df.sort_values(['response'], inplace=True)         # 'True' and 'right' sort before 'wrong'
df.drop_duplicates(['color','fruit'], inplace=True)  # keep the first row of each pair
df.sort_index(axis=0, inplace=True)                # restore the original row order

This will give you the desired output.
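
For reference, the three steps can also be written as one chain without inplace. This is only a sketch of the same idea, and it carries the same caveat: it works because the strings 'True' and 'right' happen to sort before 'wrong'.

```python
import pandas as pd

df = pd.DataFrame([
    {'fruit': 'apple', 'color': 'red', 'response': 'right'},
    {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
    {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'},
])

# Plain string ordering: uppercase 'T' sorts before lowercase 'r' and 'w'
print(sorted(['wrong', 'right', 'True']))  # ['True', 'right', 'wrong']

result = (
    df.sort_values('response')            # 'wrong' rows go last
      .drop_duplicates(['fruit', 'color'])  # keep the first row per pair
      .sort_index()                       # back to the original row order
)
print(result)
```

If the unwanted label sorted before the others, this chain would keep the wrong rows.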

Expected result:

df = [{'fruit': 'apple', 'color': 'red', 'response': 'right'},
      {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
      {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'}]

If I change the order of the observations, the syntax above removes 'right' and keeps 'wrong':

df = [{'fruit': 'apple', 'color': 'red', 'response': 'right'},
      {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
      {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
      {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
      {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'},
      {'fruit': 'orange', 'color': 'orange', 'response': 'right'}]

Expected result:

df = [{'fruit': 'apple', 'color': 'red', 'response': 'right'},
      {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
      {'fruit': 'orange', 'color': 'orange', 'response': 'right'}]

I tried:

df.drop_duplicates(['fruit','color'], keep='first')
df.drop_duplicates(['fruit','color'], keep='last')

after sorting, but it didn't help. It seems I would have to hard-code the alphabetical order of the labels. I want to avoid that in the real dataset, because the labels there have no order.
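
One way to avoid depending on label order might be to drop a 'wrong' row only when its (fruit, color) group also contains a row that is not 'wrong'. This is a sketch of that idea and not something from the answers above. The helper column name ok is made up for illustration, and 'wrong' is assumed to be the only label that should lose.

```python
import pandas as pd

df = pd.DataFrame([
    {'fruit': 'apple', 'color': 'red', 'response': 'right'},
    {'fruit': 'apple', 'color': 'red', 'response': 'wrong'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'True'},
    {'fruit': 'pineapple', 'color': 'green', 'response': 'wrong'},
    {'fruit': 'orange', 'color': 'orange', 'response': 'wrong'},
    {'fruit': 'orange', 'color': 'orange', 'response': 'right'},
])

keys = ['fruit', 'color']
is_wrong = df['response'].eq('wrong')

# For every row: does its (fruit, color) group contain any non-'wrong' row?
# 'ok' is a hypothetical helper column that exists only for this groupby.
group_has_other = df.assign(ok=~is_wrong).groupby(keys)['ok'].transform('any')

# Drop 'wrong' rows that have a better alternative, then de-duplicate what is left
result = df[~(is_wrong & group_has_other)].drop_duplicates(keys)
print(result)
```

On this data the rows with index 0, 2 and 5 remain, regardless of where the 'wrong' rows sit. A group that contains only 'wrong' rows still keeps one of them. Only the single label 'wrong' is named, and no ordering among the other labels is needed.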