Python: create two descending bubble charts and connect them based on values found in the same dataset
I have the following data, from which I would like to create a side-by-side comparison of animals and foods using bubble charts (larger values get larger bubbles). Each side should be sorted in descending order, with lines connecting each animal to its foods.
Data table format:
Animal Food
Cat Hard_Food
Cat Hard_Food
Cat Hard_Food
Cat Hard_Food
Cat Soft_Food
Cat Soft_Food
Cat Soft_Food
Cat Mouse
Cat Soft_Food
Cat Soft_Food
Dog Hard_Food
Dog Hard_Food
Dog Hard_Food
Dog Hard_Food
Dog Soft_Food
Dog Soft_Food
Dog Soft_Food
Dog Soft_Food
Dog Meat
Snake Mouse
Snake Meat
Snake Meat
Snake Meat
Summary table format:
            Hard_Food  Meat  Soft_Food  Mouse  Grand Total
Cat                 4     0          5      1           10
Dog                 4     1          4      0            9
Snake               0     3          0      1            4
Grand Total         8     4          9      2           23
Python DataFrame:
import pandas as pd
ani_foo = {'Animal': ['Cat','Cat','Cat','Cat','Cat','Cat','Cat','Cat','Cat','Cat','Dog','Dog','Dog','Dog','Dog','Dog','Dog','Dog','Dog','Snake','Snake','Snake','Snake'],
           'Food': ['Hard_Food','Hard_Food','Hard_Food','Hard_Food','Soft_Food','Soft_Food','Soft_Food','Mouse','Soft_Food','Soft_Food','Hard_Food','Hard_Food','Hard_Food','Hard_Food','Soft_Food','Soft_Food','Soft_Food','Soft_Food','Meat','Mouse','Meat','Meat','Meat']
}
df = pd.DataFrame(ani_foo, columns = ['Animal', 'Food'])
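For reference, the summary table and the descending order wanted for each column of bubbles can be derived directly from the raw rows. This is only a sketch: it rebuilds the same 23 rows in a more compact way, and the names `summary`, `animal_counts`, and `food_counts` are illustrative, not part of the original question.

```python
import pandas as pd

# Same 23 rows as the data table, built compactly
animals = ['Cat'] * 10 + ['Dog'] * 9 + ['Snake'] * 4
foods = (['Hard_Food'] * 4 + ['Soft_Food'] * 3 + ['Mouse'] + ['Soft_Food'] * 2  # Cat
         + ['Hard_Food'] * 4 + ['Soft_Food'] * 4 + ['Meat']                     # Dog
         + ['Mouse'] + ['Meat'] * 3)                                            # Snake
df = pd.DataFrame({'Animal': animals, 'Food': foods})

# Animal x Food counts, with row and column totals
summary = pd.crosstab(df['Animal'], df['Food'],
                      margins=True, margins_name='Grand Total')
print(summary)

# value_counts() sorts in descending order by default, which gives
# the top-to-bottom bubble order for each side of the chart
animal_counts = df['Animal'].value_counts()
food_counts = df['Food'].value_counts()
print(animal_counts)
print(food_counts)
```

The two count series could then drive both the bubble sizes and the vertical positions of the bubbles.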
Desired output (created manually in Excel):
You could try using the networkx library to build this as a bipartite network:
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt
ani_foo = {'Animal': ['Cat','Cat','Cat','Cat','Cat','Cat','Cat','Cat','Cat','Cat','Dog','Dog','Dog','Dog','Dog','Dog','Dog','Dog','Dog','Snake','Snake','Snake','Snake'],
'Food': ['Hard_Food','Hard_Food','Hard_Food','Hard_Food','Soft_Food','Soft_Food','Soft_Food','Mouse','Soft_Food','Soft_Food','Hard_Food','Hard_Food','Hard_Food','Hard_Food','Soft_Food','Soft_Food','Soft_Food','Soft_Food','Meat','Mouse','Meat','Meat','Meat']
}
df = pd.DataFrame(ani_foo, columns = ['Animal', 'Food'])
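fig, ax = plt.subplots(figsize=(15,8))
# Build a graph with one edge per distinct (Animal, Food) pair
G = nx.from_pandas_edgelist(df, 'Animal', 'Food')
# Tag each node with the side of the bipartite graph it belongs to
G.add_nodes_from(df['Animal'], bipartite=0)
G.add_nodes_from(df['Food'], bipartite=1)
# Count how often each animal and each food occurs (used for bubble sizes and labels)
s = df.stack().value_counts()
s1 = s.index +'\n'+ s.astype(str)
# Animals go in the left column (x=0), foods in the right column (x=1)
pos = {node:[0, i] for i, node in enumerate(df['Animal'])}
pos.update({node:[1,i] for i, node in enumerate(df['Food'])})
# Color each edge by the animal it touches
color_dict = {'Cat':'g', 'Dog':'b', 'Snake':'y'}
ec = [color_dict[c] for i in G.edges for c in i if c in color_dict.keys()]
nx.draw_networkx(G,
                 node_size=[s[i]*250 for i in G.nodes],
                 pos=pos,
                 labels = s1.to_dict(),
                 node_color='lightblue',
                 edge_color=ec)
plt.axis('off')
plt.show()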
Output:
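This looks like a fairly manual but straightforward process. Thanks for sharing! Is there a way to adjust the plot so that the labels are not cut off? I tried various options I found on Stack Overflow, but I couldn't get them to work, and I thought you might have a different approach. For example, if you change the first 'Cat' in "ani_foo" to 'Cattttttttttttttttt', the "C" is no longer visible in the chart.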
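One possible way to deal with the clipped labels, sketched here under some assumptions: the labels appear to be cut off because the axes limits are fitted tightly around the node positions, so text that extends past x=0 or x=1 falls outside the drawing area. Widening the limits explicitly after drawing leaves room for long labels. The shortened dataset, the `pad` value, and the de-duplicated `animals`/`foods` lists below are illustrative choices, not part of the original answer.

```python
import pandas as pd
import networkx as nx
import matplotlib.pyplot as plt

# Small illustrative dataset with one deliberately long label
df = pd.DataFrame({
    'Animal': ['Cattttttttttttttttt'] * 2 + ['Dog'] * 2 + ['Snake'],
    'Food': ['Hard_Food', 'Soft_Food', 'Hard_Food', 'Meat', 'Mouse'],
})

G = nx.from_pandas_edgelist(df, 'Animal', 'Food')
counts = df.stack().value_counts()
labels = {node: f'{node}\n{counts[node]}' for node in G.nodes}

# Unique names in order of first appearance; animals at x=0, foods at x=1
animals = list(dict.fromkeys(df['Animal']))
foods = list(dict.fromkeys(df['Food']))
pos = {node: (0, i) for i, node in enumerate(animals)}
pos.update({node: (1, i) for i, node in enumerate(foods)})

fig, ax = plt.subplots(figsize=(15, 8))
nx.draw_networkx(G, pos=pos, ax=ax, labels=labels,
                 node_size=[counts[n] * 250 for n in G.nodes],
                 node_color='lightblue')

# Assumed padding in data units; increase it if labels are still clipped
pad = 0.5
ax.set_xlim(-pad, 1 + pad)
ax.set_ylim(-pad, max(len(animals), len(foods)) - 1 + pad)
ax.axis('off')
```

Calling `plt.show()` or `fig.savefig(...)` afterwards renders the figure. `ax.margins(...)` may be an alternative to setting the limits by hand, but explicit limits make the amount of extra space easier to reason about.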