Python: apply an operation to all dictionary keys

Tags: python, pandas, dataframe

Given the DataFrame df:

           paper    reference   count
9384155    p25      r50         1
7434371    p98      r9          78
7433400    p7       r27         5
7431765    p101     r91         501
7422256    p22      r5          91
...
I created a dictionary that splits df into subsets by count:

df_dict={key:df[df['count']==key] for key in df['count'].unique()}
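As a side note, the same dictionary can also be built with groupby, which yields one (key, sub-DataFrame) pair per distinct count value. This is a minimal sketch using only the five rows shown in the question; the real df is assumed to be larger.

```python
import pandas as pd

# Sample rows copied from the question; the real df is assumed to be larger.
df = pd.DataFrame(
    {
        "paper": ["p25", "p98", "p7", "p101", "p22"],
        "reference": ["r50", "r9", "r27", "r91", "r5"],
        "count": [1, 78, 5, 501, 91],
    },
    index=[9384155, 7434371, 7433400, 7431765, 7422256],
)

# Iterating over a groupby object yields (key, sub-DataFrame) pairs,
# one per distinct value in the "count" column.
df_dict = {key: sub for key, sub in df.groupby("count")}

print(sorted(df_dict))
```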
For each sub-DataFrame in df_dict, I want to apply the following operations:

df_dict[i] = df_dict[i].drop(['count'], axis=1)

pairs = df_dict[i].merge(df_dict[i], on=["reference"])
pairs = pairs[pairs["paper_x"] < pairs["paper_y"]]
pairs = pairs.groupby(["paper_x", "paper_y"]).count().reset_index()
pairs.columns = ["paper1", "paper2", "common"]

refs = df_dict[i].groupby(["paper"]).count().reset_index()
refs.columns = ["paper", "freq"]

result = pairs.merge(refs, how="left", left_on="paper1", right_on="paper")
result = result.merge(refs, how="left", left_on="paper2", right_on="paper")
result = result[["paper1", "freq_x", "paper2", "freq_y", "common"]]
result.columns = ["paper1", "freq1", "paper2", "freq2", "common"]
However, this returns one large DataFrame, and the values in it are incorrect.


Ideally, I would like a set of multiple result DataFrames (separated by count, just as they are in df_dict).

"I tried running a loop for i in range(df['count'].max())": please show this as part of the code.
Sorry, I thought it would be redundant and verbose. Edit applied.
for i in range(df['count'].max()):
  try:
    c = df_dict[i]
    c = c.drop(['count'], axis=1)
    pairs = c.merge(c,on=['reference'])
    pairs = pairs[pairs["paper_x"] < pairs["paper_y"]]
    pairs = pairs.groupby(["paper_x", "paper_y"]).count().reset_index() 
    pairs.columns = ["paper1", "paper2", "common"]    
    refs = c.groupby(["paper"]).count().reset_index()
    refs.columns = ["paper", "freq"]
    result = pairs.merge(refs, how="left", left_on="paper1", right_on="paper")
    result = result.merge(refs, how="left", left_on="paper2", right_on="paper")
    result = result[["paper1", "freq_x", "paper2", "freq_y", "common"]]
    result.columns = ["paper1", "freq1", "paper2", "freq2", "common"]
  except KeyError:
    continue
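One possible way to get a separate result per count is to wrap the pipeline in a function and build a second dictionary keyed like df_dict. The sketch below assumes that is the goal; the function name pair_stats and the sample rows are hypothetical, not taken from the question. Iterating over df_dict.items() also sidesteps two issues in the loop above: range(df['count'].max()) stops one short of the maximum, and result is overwritten on every pass.

```python
import pandas as pd

def pair_stats(sub):
    """Apply the question's pipeline to one sub-DataFrame (hypothetical helper)."""
    c = sub.drop(columns=["count"])
    # Self-merge on reference to find pairs of papers citing the same reference.
    pairs = c.merge(c, on="reference")
    pairs = pairs[pairs["paper_x"] < pairs["paper_y"]]
    pairs = pairs.groupby(["paper_x", "paper_y"]).count().reset_index()
    pairs.columns = ["paper1", "paper2", "common"]
    # Number of references per paper within this subset.
    refs = c.groupby("paper").count().reset_index()
    refs.columns = ["paper", "freq"]
    result = pairs.merge(refs, how="left", left_on="paper1", right_on="paper")
    result = result.merge(refs, how="left", left_on="paper2", right_on="paper")
    result = result[["paper1", "freq_x", "paper2", "freq_y", "common"]]
    result.columns = ["paper1", "freq1", "paper2", "freq2", "common"]
    return result

# Made-up sample data: two count groups, each with one pair of papers
# that share at least one reference.
df = pd.DataFrame(
    {
        "paper": ["p1", "p1", "p2", "p2", "p2", "p3", "p4"],
        "reference": ["r1", "r2", "r1", "r2", "r3", "r4", "r4"],
        "count": [1, 1, 1, 1, 1, 2, 2],
    }
)
df_dict = {key: df[df["count"] == key] for key in df["count"].unique()}

# One result DataFrame per count value, stored under the same key as in df_dict.
results = {key: pair_stats(sub) for key, sub in df_dict.items()}

for key, res in results.items():
    print(key)
    print(res)
```

Each entry of results can then be looked up by its count value, for example results[1], in the same way as df_dict.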