Python: filter and drop rows based on a condition on a list column in a DataFrame
Below is a sample of a larger DataFrame I am working with:
import pandas as pd
data = {"Trial": ['Trial_1', 'Trial_2', 'Trial 3', 'Trial 4'], "Results" : [[['a', 11.0, 1, 1.0], ['b', 12.0, 0, 6.0], ['c', 2.6, 0, 3.0]], [['d', 7.3, 1, 8.0], ['e', 13.0, 0, 5.0], ['f', 8.6, 0, 3.0]],
[['g', 9.1, 1, 1.0], ['h', 10.0, 0, 7.0], ['i', 95.6, 0, 5.0]], [['j', 6.6, 1, 1.0], ['k', 14.0, 0, 3.0], ['l', 8.1, 0, 9.0]]]}
df = pd.DataFrame(data)
2 queries:
Index Trial Results
1 Trial_2 [[d, 7.3, 1, 8.0], [e, 13.0, 0, 5.0], [f, 8.6, 0, 3.0]]
[place for places in df['Results'] for place in places if place[2] == 1 and place[3] != 1]
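The function below collects the index of every row that meets the condition. You can then use that list of indices either to select the rows that match or to get the DataFrame with the matching rows dropped. It uses apply() on each row and iterates over the list of lists. If the first sublist already meets the condition, you could break out of the for loop instead of checking the remaining sublists, but I did not go that far in this exercise.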
idxs = []  # for collecting indices
def loop_results(x):
    for res in x['Results']:
        if res[2] == 1 and res[3] != 1:
            idxs.append(x.name)  # here, .name is the index value
df_temp = df.apply(loop_results, axis=1)  # apply the function to each row
idxs = list(set(idxs))  # if there are duplicates, set() will remove them
df_match = df.loc[idxs]  # matched criteria
df_unmatched = df.drop(idxs, axis=0)  # drops rows matching criteria
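The early exit mentioned in this answer can be sketched with any(), which stops at the first sublist that satisfies the condition. This is only an illustration, not the answerer's code: the names has_match and mask are made up here, and it assumes every sublist has at least four elements. It builds a boolean mask rather than a shared list of indices.

```python
import pandas as pd

# Sample data copied from the question.
data = {
    "Trial": ['Trial_1', 'Trial_2', 'Trial 3', 'Trial 4'],
    "Results": [
        [['a', 11.0, 1, 1.0], ['b', 12.0, 0, 6.0], ['c', 2.6, 0, 3.0]],
        [['d', 7.3, 1, 8.0], ['e', 13.0, 0, 5.0], ['f', 8.6, 0, 3.0]],
        [['g', 9.1, 1, 1.0], ['h', 10.0, 0, 7.0], ['i', 95.6, 0, 5.0]],
        [['j', 6.6, 1, 1.0], ['k', 14.0, 0, 3.0], ['l', 8.1, 0, 9.0]],
    ],
}
df = pd.DataFrame(data)

def has_match(results):
    # Hypothetical helper: True if any sublist has 1 at position 2 and a
    # value other than 1 at position 3. any() short-circuits on the first hit.
    return any(res[2] == 1 and res[3] != 1 for res in results)

mask = df["Results"].apply(has_match)  # boolean Series aligned with df's index
df_match = df[mask]                    # rows meeting the condition
df_unmatched = df[~mask]               # rows with the matches dropped
```

On the sample data only Trial_2 meets the condition, so df_match holds that single row and df_unmatched holds the other three.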
You can use query_check with apply on Results, and modify it further to follow any change in the filtering logic.
import pandas as pd
data = {"Trial": ['Trial_1', 'Trial_2', 'Trial 3', 'Trial 4'], "Results" : [[['a', 11.0, 1, 1.0], ['b', 12.0, 0, 6.0], ['c', 2.6, 0, 3.0]], [['d', 7.3, 1, 8.0], ['e', 13.0, 0, 5.0], ['f', 8.6, 0, 3.0]],
[['g', 9.1, 1, 1.0], ['h', 10.0, 0, 7.0], ['i', 95.6, 0, 5.0]], [['j', 6.6, 1, 1.0], ['k', 14.0, 0, 3.0], ['l', 8.1, 0, 9.0]]]}
df = pd.DataFrame(data)
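def query_check(inp):
    # inp is one row of df[['Results']], so inp.values holds a single cell:
    # the list of sublists for that trial
    for lst in inp.values:
        if isinstance(lst, list):
            # check every sublist, not only the first one
            if any(res[2] == 1 and res[3] != 1 for res in lst):
                return True
    return False
df['Flag'] = df[['Results']].apply(query_check, axis=1)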
Once the Flag column is created, you can filter further:
Query 1
Query 2
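The code for the two queries did not survive on this page. As a guess at what they did, the sketch below uses the Flag column as a boolean mask, once to keep the flagged rows and once to drop them. The names matched and remaining are hypothetical, and Flag is computed inline with a lambda only so that the example runs on its own.

```python
import pandas as pd

# Sample data copied from the question.
data = {
    "Trial": ['Trial_1', 'Trial_2', 'Trial 3', 'Trial 4'],
    "Results": [
        [['a', 11.0, 1, 1.0], ['b', 12.0, 0, 6.0], ['c', 2.6, 0, 3.0]],
        [['d', 7.3, 1, 8.0], ['e', 13.0, 0, 5.0], ['f', 8.6, 0, 3.0]],
        [['g', 9.1, 1, 1.0], ['h', 10.0, 0, 7.0], ['i', 95.6, 0, 5.0]],
        [['j', 6.6, 1, 1.0], ['k', 14.0, 0, 3.0], ['l', 8.1, 0, 9.0]],
    ],
}
df = pd.DataFrame(data)

# Stand-in for the Flag column built by query_check (same condition).
df["Flag"] = df["Results"].apply(
    lambda results: any(res[2] == 1 and res[3] != 1 for res in results)
)

# Assumed query 1: keep only the rows that meet the condition.
matched = df[df["Flag"]].drop(columns="Flag")

# Assumed query 2: drop the rows that meet the condition.
remaining = df[~df["Flag"]].drop(columns="Flag")
```

Dropping the helper column afterwards is optional; it is done here only so that both results have the same columns as the original DataFrame.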
Thank you very much, especially for including the comments.
Thank you very much, this works well too.