Python: common elements for every two elements of the first column of a DataFrame


python pandas dataframe


I am new to pandas. I have the following DataFrame:

Group  Type
G1     a1
G1     a2
G1     a3
G2     a2
G2     a1
G3     a1
G4     a1
G5     a4
G5     a1

Given a DataFrame:
import pandas as pd
df = pd.DataFrame([['G1', 'G1', 'G2', 'G2'], ['a1', 'a2', 'a1', 'a3']]).T
df.columns = ['group', 'type']

Then there are two options:
df.groupby('type').count()

Or, if you want to see the groups explicitly:
df.groupby(['type', 'group']).count()

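So, after assigning the groupby result to df1, you can do, for example:
df1 = df.groupby(['type', 'group']).count()
df1.loc['a1']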

Output:
group
G1
G2
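
As a rough illustration of the grouping idea on the question's full data, the sketch below lists, for each type, the groups that contain it. The lowercase column names follow this answer; the `agg(list)` step and the name `groups_per_type` are additions for illustration, not part of the original answer.

```python
import pandas as pd

# The nine rows from the question, using this answer's lowercase column names.
df = pd.DataFrame({
    'group': ['G1', 'G1', 'G1', 'G2', 'G2', 'G3', 'G4', 'G5', 'G5'],
    'type':  ['a1', 'a2', 'a3', 'a2', 'a1', 'a1', 'a1', 'a4', 'a1'],
})

# For each type, collect the groups in which it appears.
groups_per_type = df.groupby('type')['group'].agg(list)
print(groups_per_type)
```

Any type whose list holds two or more groups is shared by every pair of groups in that list.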

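I think you need:
count(G1, G2, 2)
? Do you want to write a function?
We assign the result of groupby to df1 in the code.
If the output can't be produced in this simple way :). You may come up with a better solution using itertools.product :)
Yes, I was thinking about itertools.product, but combinations is better.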
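import itertools

import numpy as np
import pandas as pd

# df is the question's DataFrame, with columns 'Group' and 'Type'

# get all pairwise combinations of the distinct Group values
# (a set is unordered, so the order of the pairs can vary between runs)
c = list(itertools.combinations(list(set(df['Group'])), 2))

df = df.set_index('Group')

# create a list of tuples: (group a, group b, number of common types, common types)
L = []
for a, b in c:
    d = np.intersect1d(df.loc[a], df.loc[b]).tolist()
    L.append((a, b, len(d), d))

# new DataFrame
df = pd.DataFrame(L, columns=['a', 'b', 'lens', 'common'])
print(df)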
    a   b  lens    common
0  G2  G4     1      [a1]
1  G2  G1     2  [a1, a2]
2  G2  G3     1      [a1]
3  G2  G5     1      [a1]
4  G4  G1     1      [a1]
5  G4  G3     1      [a1]
6  G4  G5     1      [a1]
7  G1  G3     1      [a1]
8  G1  G5     1      [a1]
9  G3  G5     1      [a1]
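
For comparison, here is a minimal sketch of the same pairwise intersection using plain Python sets instead of `np.intersect1d`. It assumes the question's data with columns `Group` and `Type`; the names `types_by_group`, `rows` and `result` are hypothetical, and the groups are sorted so the pair order is stable between runs. This is one possible variant, not the answer's own method.

```python
from itertools import combinations

import pandas as pd

# The nine rows from the question.
df = pd.DataFrame({
    'Group': ['G1', 'G1', 'G1', 'G2', 'G2', 'G3', 'G4', 'G5', 'G5'],
    'Type':  ['a1', 'a2', 'a3', 'a2', 'a1', 'a1', 'a1', 'a4', 'a1'],
})

# One set of types per group.
types_by_group = df.groupby('Group')['Type'].apply(set)

# Intersect the sets of every pair of groups.
rows = []
for a, b in combinations(sorted(types_by_group.index), 2):
    common = sorted(types_by_group.loc[a] & types_by_group.loc[b])
    rows.append((a, b, len(common), common))

result = pd.DataFrame(rows, columns=['a', 'b', 'lens', 'common'])
print(result)
```

Building the sets once up front avoids repeated `df.loc` lookups inside the loop, which may matter when there are many groups.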