Python: creating a graph with networkx based on DataFrame column values
I have the following DataFrame:
import pandas as pd

df = pd.DataFrame({'id_emp': [1, 2, 3, 4, 1],
                   'name_emp': ['x', 'y', 'z', 'w', 'x'],
                   'donnated_value': [1100, 11000, 500, 300, 1000],
                   'refound_value': [22000, 22000, 50000, 450, 90]
                   })

# return as a percentage of the donated value
df['return_percentagem'] = 100 * df['refound_value'] / df['donnated_value']
df['classification_roi'] = ''

def comunidade(i):
    # classify a return percentage into one of three groups
    if i < 50:
        return 'Bad Investment'
    elif i >= 50 and i < 100:
        return 'Median Investment'
    elif i >= 100:
        return 'Good Investment'

df['classification_roi'] = df['return_percentagem'].map(comunidade)
df
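As a side note, the same three-way classification can probably be expressed with `pd.cut` instead of a hand-written function. The following is only a sketch under that assumption; the names `df_demo`, `bins` and `labels` are introduced here for illustration and are not part of the question.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question (df_demo is a hypothetical name).
df_demo = pd.DataFrame({'id_emp': [1, 2, 3, 4, 1],
                        'name_emp': ['x', 'y', 'z', 'w', 'x'],
                        'donnated_value': [1100, 11000, 500, 300, 1000],
                        'refound_value': [22000, 22000, 50000, 450, 90]})
df_demo['return_percentagem'] = (
    100 * df_demo['refound_value'] / df_demo['donnated_value'])

# right=False makes the intervals [lower, upper), which matches the
# question's rules: < 50, 50 <= x < 100, >= 100.
bins = [-np.inf, 50, 100, np.inf]
labels = ['Bad Investment', 'Median Investment', 'Good Investment']
df_demo['classification_roi'] = pd.cut(
    df_demo['return_percentagem'], bins=bins, labels=labels, right=False
).astype(str)

print(df_demo[['id_emp', 'return_percentagem', 'classification_roi']])
```

With the sample data, rows 0 to 3 fall in the "Good Investment" range and row 4 in the "Bad Investment" range, the same result the `comunidade` function gives.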
Any help is welcome.

Here I did not build the graph from an edge list. Instead, I used a list comprehension and for loops:
import matplotlib.pyplot as plt
import networkx as nx
import itertools

G = nx.Graph()
# use index to name nodes, rather than id_emp, otherwise
# multiple nodes would end up having the same name
G.add_nodes_from([a for a in df.index])

# create edges:
# same employee edges
# (note: product(indices, indices) also yields (i, i) pairs, i.e. self-loops)
for ie in set(df['id_emp']):
    indices = df[df['id_emp'] == ie].index
    G.add_edges_from(itertools.product(indices, indices))

# same classification edges
for cr in set(df['classification_roi']):
    indices = df[df['classification_roi'] == cr].index
    G.add_edges_from(itertools.product(indices, indices))

nx.draw(G)
plt.show()
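If the self-loops produced by `itertools.product(indices, indices)` are not wanted, one possible variant uses `itertools.combinations`, which only yields pairs of distinct rows. This is a sketch, not the answer's original method; `df_demo` and `G_demo` are hypothetical names, and the classification labels are written out by hand from the values the question's data produces.

```python
import itertools

import networkx as nx
import pandas as pd

# Labels derived from the question's data: return percentages of
# 2000, 200, 10000, 150 and 9 give four "Good" rows and one "Bad" row.
df_demo = pd.DataFrame({
    'id_emp': [1, 2, 3, 4, 1],
    'classification_roi': ['Good Investment'] * 4 + ['Bad Investment'],
})

G_demo = nx.Graph()
# One node per row, named by the DataFrame index.
G_demo.add_nodes_from(df_demo.index)

# Connect every pair of distinct rows sharing the same value,
# first by employee id, then by classification.
for column in ['id_emp', 'classification_roi']:
    for _, rows in df_demo.groupby(column):
        G_demo.add_edges_from(itertools.combinations(rows.index, 2))

print(sorted(G_demo.edges()))
print(nx.number_of_selfloops(G_demo))
```

The resulting graph has the same connections between different rows as the loop-based version, only without an edge from each node to itself.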
Optional: colouring, to distinguish the nodes
plt.subplot(121)
plt.title('coloured by id_emp')
nx.draw(G, node_color=df['id_emp'], cmap='viridis')

plt.subplot(122)
color_mapping = {
    'Bad Investment': 0,
    'Median Investment': 1,
    'Good Investment': 2}
plt.title('coloured by classification_roi')
# map each label to a number so it can be used with a colormap
nx.draw(G, node_color=df['classification_roi'].map(color_mapping), cmap='RdYlBu')
plt.show()
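Comments:

This concerns the values in the 'return_percentagem' column. I want the 'id_emp' rows with values below 50 in the 'return_percentagem' column to be connected to each other, and likewise for the other groups.

From this data, what should the network look like? Why are there two id_emp entries with the value 1? Are they two nodes or one node?

They are two nodes, grouped according to the range of the 'return_percentagem' value. I want to create communities based on the values of this column. For example: a community for 'return_percentagem' values above 100, and a community for values between 51 and 100.

@ScottBoston I think this is a better explanation.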