Python: selecting values depending on the value in another column

python matplotlib


I have a data file that looks like this:

3       24.5
3       23.7
3       21.87
3       24.3
3       10.45
6       11.2
6       22.5
6       20.95

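I want to use the data in the second column, but only where the value in the first column is 3. My code currently takes every number in the second column, whereas I only want the numbers that have a corresponding "3" next to them. What should I add to my code to make that distinction? Here is my code: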

import numpy as np
from matplotlib import pyplot

# Python 3: input() returns a string, so numeric answers are converted explicitly
filename = input("Enter file name: ") + '.csv'
filepath = '/home/david/Desktop/' + filename

data = np.genfromtxt(filepath, delimiter=',', skip_header=1, dtype=float)

rownum = int(input("Enter row number to use: "))
line = [row[rownum] for row in data]
binw = float(input("Enter bin width: "))
bins = np.arange(int(min(line) - 1), int(max(line) + 1), binw)

pyplot.hist(line, bins=bins, alpha=0.5, color='g')

pyplot.show()

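I am using row 5 as the data that has to be analysed and plotted. However, row 3 holds the "3" and "6" values that I would like Python to filter on for me.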

First of all, don't you actually mean columns rather than rows?

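After reading the data with np.genfromtxt, which returns a numpy array, you can use numpy.where to select only the rows that hold a specific value in a specific column. If column 3 contains the data to filter on, first do the following: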

data = data[np.where(data[:,target_column] == target_value)]

This selects all rows in which the value at position target_column equals target_value. With your values, the code becomes

data = data[np.where(data[:,3] == 3)]

After this, you can select the column containing the data to plot by simply writing

# I'm renaming rownum to colnum
line = data[:,colnum]

This should be a fair starting point.
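As a minimal sketch of the steps above, the snippet below applies the np.where filter to the sample data from the question. It assumes the sample is whitespace-separated with exactly two columns, so the filter column is index 0 and the value column is index 1 (numpy columns are 0-based). In the real file these indices would be whichever columns hold the key and the measurements. The names sample_text, key_col and value_col are made up for illustration.

```python
import io
import numpy as np

# Sample data from the question, assumed to be whitespace-separated.
sample_text = """\
3       24.5
3       23.7
3       21.87
3       24.3
3       10.45
6       11.2
6       22.5
6       20.95
"""

# With no delimiter argument, genfromtxt splits on whitespace.
data = np.genfromtxt(io.StringIO(sample_text), dtype=float)

key_col, value_col = 0, 1   # hypothetical 0-based column indices
target_value = 3

# np.where returns a tuple of index arrays; indexing with it keeps the matching rows.
filtered = data[np.where(data[:, key_col] == target_value)]

# Take the column to plot from the filtered rows only.
line = filtered[:, value_col]

print(line)
```

The resulting line array can be passed to pyplot.hist exactly as in the question's code.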

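If data is a numpy array containing the data, then

data[data[:, 0] == 3][:, 1]

will give you the second-column data where the first column is 3.

Thanks! Yes, I do. Sorry for the mistake.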
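For anyone checking the boolean-mask one-liner against the sample data from the question, here is a small sketch. It assumes a two-column array (key, value). The bin width binw = 5 is a hypothetical choice, and extending the stop of np.arange by one bin width is my own adjustment rather than part of the original code: np.arange excludes its stop value, so without it the largest values can fall outside the last bin. np.histogram is used here only so the counts can be inspected without opening a plot window.

```python
import numpy as np

# Sample data from the question as a two-column array (key, value).
data = np.array([
    [3, 24.5], [3, 23.7], [3, 21.87], [3, 24.3], [3, 10.45],
    [6, 11.2], [6, 22.5], [6, 20.95],
])

mask = data[:, 0] == 3     # boolean array, True where the first column is 3
values = data[mask, 1]     # same result as data[data[:, 0] == 3][:, 1]

binw = 5                   # hypothetical bin width
# Extend the stop by one bin width so the last edge lies above the maximum value.
bins = np.arange(int(values.min() - 1), int(values.max() + 1) + binw, binw)

# Count the filtered values per bin.
counts, edges = np.histogram(values, bins=bins)

print(values)
print(counts, edges)
```

The same values and bins can be handed to pyplot.hist(values, bins=bins) to draw the histogram for the "3" rows only.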