Python: plotting with a for loop using data from different datasets


python pandas numpy for-loop plot


I have two datasets:

df1=

df2=

    id_now  cluster_group
0   403     0
1   1249    1
2   1531    3
3   14318   3

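I can't work out how to write a loop (or something else) that does the following:

In `df2`, the value `403` belongs to exactly one cluster group (`0`). Go to `df1` and find all the points associated with `403` (latitude: 2 points, longitude: 2 points), then plot them.
Repeat this for all of `df1` and `df2`, but draw everything in a single figure, with a different color for each cluster. I can actually manage this part, but any suggestion is welcome.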

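P.S. Some ids in `df2` belong to the same cluster (in the table above, 1531 and 14318 are both in cluster group 3). In any case, I want to plot their points in one color (or on one map).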

Attempt:

Each color represents a `cluster_group`.

Here is how to do this using `pandas` and `matplotlib.pyplot`:

import pandas as pd
import matplotlib.pyplot as plt

# Here the DataFrames are read from files; load them however you prefer.
# Raw strings keep '\s+' from being treated as an invalid escape sequence.
df1 = pd.read_csv('data.txt', sep=r'\s+')
df2 = pd.read_csv('data2.txt', sep=r'\s+')

# The important part: one plot call per cluster group
for g, gdf in df2.groupby('cluster_group'):
    # Keep the rows of df1 whose id_first is among this group's id_now values
    df1_to_plot = df1.loc[df1['id_first'].isin(gdf['id_now'])]
    plt.plot(df1_to_plot['latitude'], df1_to_plot['longitude'], label='Cluster {:d}'.format(g))

plt.legend()
plt.show()

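If you are not familiar with `groupby` and `isin`:

`df2.groupby('cluster_group')` returns an iterator over subsets of `df2`. Each subset is built by grouping all the rows that share the same value in the `'cluster_group'` column.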
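To make the grouping step concrete, here is a minimal sketch that rebuilds `df2` from the table in the question and collects the `id_now` values of each group. The `groups` dictionary is only an illustration and is not part of the answer's code.

```python
import pandas as pd

# df2 as shown in the question
df2 = pd.DataFrame({
    "id_now": [403, 1249, 1531, 14318],
    "cluster_group": [0, 1, 3, 3],
})

# Each iteration yields the group key and the sub-DataFrame for that key
groups = {int(g): gdf["id_now"].tolist() for g, gdf in df2.groupby("cluster_group")}
print(groups)  # {0: [403], 1: [1249], 3: [1531, 14318]}
```

Both ids in cluster group 3 end up in the same subset, which is why they are later drawn with a single color.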

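For each of these subsets `gdf`, I select the rows of `df1` whose value in the `'id_first'` column appears in `gdf`. This is done with the `isin` method. The selection is stored in the DataFrame `df1_to_plot`, which holds the data to be plotted.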
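The row selection can be tried in isolation. The question does not show `df1`, so the sketch below assumes a small hypothetical `df1`: the column names (`id_first`, `latitude`, `longitude`) come from the answer's code, but every value is made up for illustration.

```python
import pandas as pd

# Hypothetical df1: column names follow the answer's code, values are invented
df1 = pd.DataFrame({
    "id_first": [403, 403, 1249, 1531, 14318, 9999],
    "latitude": [10.0, 10.5, 20.0, 30.0, 30.5, 40.0],
    "longitude": [50.0, 50.5, 60.0, 70.0, 70.5, 80.0],
})

# The ids of one cluster (cluster group 3 in the question's df2)
ids_in_cluster = pd.Series([1531, 14318], name="id_now")

# Boolean mask: True where id_first matches any id in the cluster
mask = df1["id_first"].isin(ids_in_cluster)
df1_to_plot = df1.loc[mask]
print(df1_to_plot)
```

Rows whose `id_first` does not appear in the cluster (here the hypothetical id 9999) are simply left out.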

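Now I can use `plt.plot` to actually draw the data. Matplotlib takes care of the colors by itself. When the legend is created, the `legend` method uses the `label` argument.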
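The link between `label`, the automatic colors, and the legend can be checked with a small sketch. It assumes made-up coordinates per cluster, uses the `Agg` backend so it runs without a display, and calls `ax.plot` on an explicit axes object, which here behaves like the `plt.plot` call in the answer.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical coordinates per cluster, invented for illustration
clusters = {
    0: ([10.0, 10.5], [50.0, 50.5]),
    3: ([30.0, 30.5, 31.0], [70.0, 70.5, 71.0]),
}

fig, ax = plt.subplots()
for g, (lat, lon) in clusters.items():
    # No color argument: each call takes the next color from the default cycle
    ax.plot(lat, lon, marker="o", label="Cluster {:d}".format(g))

ax.set_xlabel("latitude")
ax.set_ylabel("longitude")
ax.legend()  # builds the legend entries from the label of each plotted line

# Inspect what the legend is built from
handles, labels = ax.get_legend_handles_labels()
colors = [h.get_color() for h in handles]
```

Because each cluster is drawn by its own plot call, it gets its own legend entry and its own color without any manual color handling.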

With the simple data you provided, this code produces the following image (latitude on the x axis, longitude on the y axis):

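If you already have a solution... what is your current solution? Why are you looking for an alternative? If you explain these points, you can hope for a better answer.

@Valentino I don't have a solution. This is a random picture from the internet.

Ah, OK. Since you said "I can manage this", I thought you had a solution.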

n_clusters = 46

for k in range(0, n_clusters):
     ....
