Python: intersection of two DataFrames of unequal length
I'm trying to get the intersection of the "game" and "sample" DataFrames wherever rows match. The DataFrames are of unequal size, and I don't want a row to be counted twice in the intersection.
For example, the sample DataFrame has the rows
[0,1,1],[1,1,0],[1,0,1],[0,1,1]
The game DataFrame has the rows [1,1,0],[1,1,0],[1,0,1],[1,1,1],[1,0,1]
Now the intersection DataFrame should have the rows [1,1,0],[1,0,1]
import pandas as pd
import numpy as np
import random
trials = 1000
games = 3
data = pd.DataFrame()
for i in range(trials):
    for j in range(games):
        data.loc[i,j] = random.choice([0,1])
sample = pd.DataFrame()
for i in range(trials):
    for j in range(games):
        if ((data.loc[i,:]).sum()) >= 2:
            sample.loc[i,j] = data.loc[i,j]
game = pd.DataFrame()
for i in range(trials):
    for j in range(games):
        if (data.loc[i,0]) == 1:
            game.loc[i,j] = data.loc[i,j]
intersection = pd.DataFrame()
for i in range(len(sample)):
    if np.all(sample.iloc[i,:] == game.iloc[i,:]):
        for j in range(games):
            intersection.loc[i,j] = sample.loc[i,j]
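For reference, the same setup can be sketched without element-wise assignment. This is only a sketch under assumptions: it draws the outcomes with NumPy's `default_rng` (the code above uses `random.choice`), uses a fixed seed, and reads "intersection" as the trials that appear in both frames, so each trial is counted once.

```python
import numpy as np
import pandas as pd

trials = 1000
games = 3

rng = np.random.default_rng(0)  # assumption: fixed seed so the run is reproducible
# One row per trial, one 0/1 outcome per game.
data = pd.DataFrame(rng.integers(0, 2, size=(trials, games)))

# Rows with at least two 1s, and rows whose first game is 1.
sample = data[data.sum(axis=1) >= 2]
game = data[data[0] == 1]

# Both frames keep the row labels of `data`, so the trials that belong
# to both can be selected through the labels they share.
common = sample.index.intersection(game.index)
intersection = data.loc[common]
```

Since every trial label occurs at most once in `common`, no row of `data` can enter the result twice.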
You can try a condition that checks for the same rows in the second DataFrame
df1 = pd.DataFrame([[0,1,1],[1,1,0],[1,0,1],[0,1,1]])
df2 = pd.DataFrame([[1,1,0],[1,1,0],[1,0,1],[1,1,1],[1,0,1]])
df1[df1.isin(df2).all(axis=1)]
Output:
   0  1  2
1  1  1  0
2  1  0  1
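One caveat: when `isin` is given a DataFrame, it compares cells at matching index and column labels, so the filter above depends on where the rows sit in each frame. If rows should match by value regardless of position, one alternative (not the approach shown in the answer) is an inner merge on all columns after dropping duplicate rows. A minimal sketch; the helper name `row_intersection` is hypothetical:

```python
import pandas as pd

df1 = pd.DataFrame([[0, 1, 1], [1, 1, 0], [1, 0, 1], [0, 1, 1]])
df2 = pd.DataFrame([[1, 1, 0], [1, 1, 0], [1, 0, 1], [1, 1, 1], [1, 0, 1]])


def row_intersection(left, right):
    """Return the distinct rows that occur in both frames, ignoring index labels."""
    # Drop repeated rows first so that no row is counted twice.
    left_unique = left.drop_duplicates()
    right_unique = right.drop_duplicates()
    # With no `on` argument, merge joins on all columns the frames share.
    return left_unique.merge(right_unique, how="inner")


result = row_intersection(df1, df2)
print(result)
```

For the two frames above this yields the rows [1,1,0] and [1,0,1], each exactly once. The result gets a fresh index, so the original row labels are not preserved.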