Python: check whether column values fall within the range between other columns' values


python python-3.x pandas matplotlib dataframe


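I'm a newbie, and I'm sure there is a simple way to do this that I just don't know about. Thanks in advance for your help.

I have the historical minimum and maximum daily sales figures, for each day of the year, of the best- and worst-performing employees on the sales team over the past 10 years. I also have the same data for 2016 (sample below).

The end goal is to plot this data in matplotlib, but I only want to keep the values in the 2016_min column that are lower than the value in hist_min, and likewise only the values in the 2016_max column that are greater than the value in hist_max. The data looks like this: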

              hist_min  hist_max   2016_min  2016_max
Day_of_Year
1               1000    10000         898     NULL
2                234      896        NULL     1000
3               1254    23666        1000    24000
4                930    78999        NULL     NULL
5                278    74588        NULL     NULL

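I put 'NULL' in there to represent empty values. NaN might be better, but I don't know whether matplotlib can handle NaN values... That's the next step, and I'll find out soon.

Thanks in advance for your help, Me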

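Use mask with a boolean mask. Where the mask is True, the value is replaced with NaN (the default), or with 'NULL' or None if that is passed as the second argument: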

print (df['2016_min'] > df['hist_min'])
Day_of_Year
1    False
2     True
3    False
4     True
5     True
dtype: bool

df['2016_min'] = df['2016_min'].mask(df['2016_min'] > df['hist_min'])
df['2016_max'] = df['2016_max'].mask(df['2016_max'] < df['hist_max'])
print (df)
             hist_min  hist_max  2016_min  2016_max
Day_of_Year                                        
1                1000     10000     898.0       NaN
2                 234       896       NaN    1000.0
3                1254     23666    1000.0   24000.0
4                 930     78999       NaN       NaN
5                 278     74588       NaN       NaN
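The snippet above assumes df already exists. As a minimal self-contained sketch, the frame can be built from the fully populated sample values that appear later in this thread (constructing it with pd.DataFrame is an assumption made for illustration). It also shows where, the mirror image of mask, which keeps a value where the condition is True instead of blanking it:

```python
import pandas as pd

# Sample data taken from the thread (every 2016 value filled in)
df = pd.DataFrame(
    {
        "hist_min": [1000, 234, 1254, 930, 278],
        "hist_max": [10000, 896, 23666, 78999, 74588],
        "2016_min": [898, 300, 1000, 1000, 300],
        "2016_max": [9000, 1000, 24000, 1050, 5000],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="Day_of_Year"),
)

# mask() replaces a value with NaN where the condition is True ...
masked_min = df["2016_min"].mask(df["2016_min"] > df["hist_min"])
masked_max = df["2016_max"].mask(df["2016_max"] < df["hist_max"])

# ... while where() keeps a value where the condition is True
where_min = df["2016_min"].where(df["2016_min"] <= df["hist_min"])
where_max = df["2016_max"].where(df["2016_max"] >= df["hist_max"])

# Both spellings give the same result on this data
print(masked_min.equals(where_min))
print(masked_max.equals(where_max))
```

Which of the two reads better is mostly a matter of taste: mask states what to throw away, where states what to keep.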

To write the string 'NULL' instead of NaN, pass it as the second argument:

df['2016_min'] = df['2016_min'].mask(df['2016_min'] > df['hist_min'], 'NULL')
df['2016_max'] = df['2016_max'].mask(df['2016_max'] < df['hist_max'], 'NULL')

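On the NULL versus NaN question: a column that mixes numbers with the string 'NULL' is no longer numeric, which tends to get in the way of plotting, whereas NaN keeps the column as floats. The sketch below uses a hypothetical stand-in Series (not the thread's actual frame) to show one way of converting such a column back with pd.to_numeric:

```python
import pandas as pd

# Hypothetical stand-in for a 2016_min column in which missing
# entries were written as the string 'NULL'
raw = pd.Series([898, "NULL", 1000, "NULL", "NULL"], dtype=object, name="2016_min")

# errors="coerce" turns every entry that cannot be parsed as a number into NaN
clean = pd.to_numeric(raw, errors="coerce")

print(raw.dtype)
print(clean.dtype)
print(clean)
```

The converted Series is float-typed, so the missing days are ordinary NaN values and can be dropped with dropna() before plotting if needed.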
You can index the DataFrame on a condition:

df1 = df[df["2016_max"] > df["hist_max"]]

The result can then easily be plotted with matplotlib:

import io
import pandas as pd
import matplotlib.pyplot as plt

# Sample data with every 2016 value filled in
u = """Day_of_Year      hist_min  hist_max   2016_min  2016_max
1               1000    10000         898     9000
2                234      896         300     1000
3               1254    23666        1000    24000
4                930    78999        1000     1050
5                278    74588         300     5000"""

# sep=r"\s+" splits the columns on runs of whitespace
df = pd.read_csv(io.StringIO(u), index_col=0, sep=r"\s+")

# Rows where 2016 exceeded the historical maximum / fell below the historical minimum
df1 = df[df["2016_max"] > df["hist_max"]]
df2 = df[df["2016_min"] < df["hist_min"]]

fig, ax = plt.subplots()
ax.scatter(df1.index, df1["2016_max"], label="max. 2016")
# x and y must both come from df2 so that each point keeps its own day
ax.scatter(df2.index, df2["2016_min"], label="min. 2016")

ax.legend()
plt.show()
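Filtering whole rows gives two frames that each still carry all four columns, so it is easy to pair the index of one frame with a column of the other by accident. A possible variation, sketched here with the same sample values (the names new_max and new_min are made up for illustration), selects just the column of interest with .loc so that each result is a single Series whose index and values belong together:

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame(
    {
        "hist_min": [1000, 234, 1254, 930, 278],
        "hist_max": [10000, 896, 23666, 78999, 74588],
        "2016_min": [898, 300, 1000, 1000, 300],
        "2016_max": [9000, 1000, 24000, 1050, 5000],
    },
    index=pd.Index([1, 2, 3, 4, 5], name="Day_of_Year"),
)

# Boolean row selection plus a column label returns one Series per record type
new_max = df.loc[df["2016_max"] > df["hist_max"], "2016_max"]
new_min = df.loc[df["2016_min"] < df["hist_min"], "2016_min"]

print(new_max)
print(new_min)
```

Each Series could then be passed to the plot as, for example, ax.scatter(new_max.index, new_max, label="max. 2016").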


Passing None as the second argument to mask gave the following output in the original answer:

df['2016_min'] = df['2016_min'].mask(df['2016_min'] > df['hist_min'], None)
df['2016_max'] = df['2016_max'].mask(df['2016_max'] < df['hist_max'], None)
print(df)
             hist_min  hist_max 2016_min 2016_max
Day_of_Year
1                1000     10000      898     None
2                 234       896     None     1000
3                1254     23666     1000    24000
4                 930     78999     None     None
5                 278     74588     None     None
