Python: How to iterate over rows to find constant values of columns in a table

python pandas loops

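I have a time-series DataFrame, and I want to find the rows where the values stay constant, i.e. match the values in the neighboring rows. Suppose this is the DF: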

temp = [27.18, 27.18, 27.18, 27.18, 20.82, 20.82, 20.82, 20.82, 15.18,
       15.18, 15.18, 15.18, 15.24, 15.24, 15.24, 15.24, 20.4 , 20.4 ,
       20.4 , 20.4 , 21.48, 21.48, 21.48, 21.48, 27.66, 27.66, 27.66,
       27.66, 27.9 , 27.9 , 27.9 , 27.9 , 27.9 , 27.9 , 27.9 , 27.9 ,
       27.84, 27.84, 27.84, 27.84, 27.84, 27.84, 27.84, 27.84, 21.72,
       21.72, 21.72, 21.72]
heat = [11.94, 12.  , 10.56,  6.  ,  6.  ,  6.  ,  6.  ,  6.  ,  6.  ,
        6.  ,  6.  ,  6.  ,  6.  ,  6.78,  9.  ,  9.  ,  9.  ,  9.  ,
        9.  ,  9.  ,  9.  , 11.58, 12.  , 11.94, 11.94, 12.  , 12.  ,
       11.94, 11.94, 12.  , 11.94, 12.  , 11.94, 12.  , 12.  , 11.94,
       12.  , 11.94, 11.94, 12.  , 11.94,  9.48,  9.  ,  9.  ,  9.  ,
        9.  ,  8.94,  9.  ]
date = ['2016-01-29 12:00:00', '2016-01-29 12:15:00',
       '2016-01-29 12:30:00', '2016-01-29 12:45:00',
       '2016-01-29 13:00:00', '2016-01-29 13:15:00',
       '2016-01-29 13:30:00', '2016-01-29 13:45:00',
       '2016-01-29 14:00:00', '2016-01-29 14:15:00',
       '2016-01-29 14:30:00', '2016-01-29 14:45:00',
       '2016-01-29 15:00:00', '2016-01-29 15:15:00',
       '2016-01-29 15:30:00', '2016-01-29 15:45:00',
       '2016-01-29 16:00:00', '2016-01-29 16:15:00',
       '2016-01-29 16:30:00', '2016-01-29 16:45:00',
       '2016-01-29 17:00:00', '2016-01-29 17:15:00',
       '2016-01-29 17:30:00', '2016-01-29 17:45:00',
       '2016-01-29 18:00:00', '2016-01-29 18:15:00',
       '2016-01-29 18:30:00', '2016-01-29 18:45:00',
       '2016-01-29 19:00:00', '2016-01-29 19:15:00',
       '2016-01-29 19:30:00', '2016-01-29 19:45:00',
       '2016-01-29 20:00:00', '2016-01-29 20:15:00',
       '2016-01-29 20:30:00', '2016-01-29 20:45:00',
       '2016-01-29 21:00:00', '2016-01-29 21:15:00',
       '2016-01-29 21:30:00', '2016-01-29 21:45:00',
       '2016-01-29 22:00:00', '2016-01-29 22:15:00',
       '2016-01-29 22:30:00', '2016-01-29 22:45:00',
       '2016-01-29 23:00:00', '2016-01-29 23:15:00',
       '2016-01-29 23:30:00', '2016-01-29 23:45:00']

import pandas as pd

# Build the frame from the three lists and use the date strings as the index
df = pd.DataFrame({'date': date, 'temp': temp, 'heat': heat})
df = df.set_index('date')

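The plot looks like this:

I need to find the regions marked between the two yellow lines, where the values are nearly constant and there are no ramp regions. I have been using the shift method, but it is not ideal. Any idea how to achieve this? Thanks in advance. The shift approach I am trying: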


(df.heat != df.heat.shift(1)).cumsum()
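For reference, the shift/cumsum idiom can be sketched on a small made-up series. Comparing each value with its predecessor marks where a new run begins, and the cumulative sum turns those marks into run labels. The toy values and the names `s`, `starts`, `run_id` and `run_len` are assumptions for illustration, not part of the original data.

```python
import pandas as pd

# Toy series (hypothetical values, not the question's data)
s = pd.Series([6.0, 6.0, 6.0, 9.0, 9.0, 12.0, 9.0, 9.0])

# True wherever the value differs from the previous row (the first row is always True)
starts = s != s.shift(1)

# The cumulative sum of the start flags gives one integer label per run
run_id = starts.cumsum()

# Number of rows in each run; runs longer than one row are exactly constant stretches
run_len = s.groupby(run_id).size()

print(run_id.tolist())
print(run_len[run_len > 1].to_dict())
```

Note that this groups only values that are exactly equal, so small fluctuations such as 11.94 versus 12.00 in `heat` would split a visually flat region into many short runs.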

Expected output:

Is this plot mask what you are looking for:

df[df.temp.duplicated() & df.heat.duplicated()].plot()
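One thing to keep in mind with this mask: `duplicated()` flags a value that has appeared anywhere earlier in the column, not only in the row directly before. The small sketch below contrasts the two on made-up values; the names `s`, `dup` and `consecutive` are assumptions for illustration.

```python
import pandas as pd

# Toy series (hypothetical values): 6.0 reappears after a different value
s = pd.Series([6.0, 6.0, 9.0, 6.0])

# True if the value occurred anywhere earlier in the series
dup = s.duplicated()

# True only if the value equals the row directly before it
consecutive = s.eq(s.shift())

print(dup.tolist())
print(consecutive.tolist())
```

The last row is marked by `duplicated()` even though it does not continue a constant run, which may or may not matter depending on the data.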

Second attempt:

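import numpy as np
import pandas as pd

df = pd.DataFrame({"temp": temp, "heat": heat}, index=pd.to_datetime(date))
thtemp = 0.5  # threshold for temp
thheat = 0.5  # threshold for heat

# True where both columns changed by less than their threshold since the previous row
crit = df.temp.diff().abs().lt(thtemp) & df.heat.diff().abs().lt(thheat)

# A row that fails the criterion starts a new segment, labelled by its 1-based position;
# the forward fill carries that label over the constant rows that follow
rng = np.arange(1, len(df) + 1)
df["const"] = np.where(crit.eq(False), rng, np.nan)
df["const"] = df.const.ffill()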

                      temp   heat  const
2016-01-29 12:00:00  27.18  11.94    1.0
2016-01-29 12:15:00  27.18  12.00    1.0
2016-01-29 12:30:00  27.18  10.56    3.0
2016-01-29 12:45:00  27.18   6.00    4.0
2016-01-29 13:00:00  20.82   6.00    5.0
2016-01-29 13:15:00  20.82   6.00    5.0
2016-01-29 13:30:00  20.82   6.00    5.0
2016-01-29 13:45:00  20.82   6.00    5.0
2016-01-29 14:00:00  15.18   6.00    9.0
2016-01-29 14:15:00  15.18   6.00    9.0
2016-01-29 14:30:00  15.18   6.00    9.0
2016-01-29 14:45:00  15.18   6.00    9.0
2016-01-29 15:00:00  15.24   6.00    9.0
                 ...
G = df.groupby(df.const)
for key, grp in G:
    if len(grp) > 1:
        print(f"\t{grp.index[0]}\n\t{grp.index[-1]}\n")

    2016-01-29 12:00:00
    2016-01-29 12:15:00

    2016-01-29 13:00:00
    2016-01-29 13:45:00

    2016-01-29 14:00:00
    2016-01-29 15:00:00

    2016-01-29 15:30:00
    2016-01-29 15:45:00

    2016-01-29 16:00:00
    2016-01-29 16:45:00

    2016-01-29 17:15:00
    2016-01-29 17:45:00

    2016-01-29 18:00:00
    2016-01-29 22:00:00

    2016-01-29 22:15:00
    2016-01-29 22:45:00

    2016-01-29 23:00:00
    2016-01-29 23:45:00
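If a table of segments is more convenient than printed start/end pairs, the same idea can be sketched with a grouped aggregation. This is a minimal sketch on a shortened copy of the first eight rows of the data; the names `toy`, `seg`, `stamps` and `summary` are assumptions for illustration, and the segment label here is a running counter rather than the row position used above.

```python
import pandas as pd

# First eight rows of the question's data, rebuilt so the sketch is self-contained
idx = pd.date_range("2016-01-29 12:00", periods=8, freq="15min")
toy = pd.DataFrame(
    {"temp": [27.18, 27.18, 27.18, 20.82, 20.82, 20.82, 15.18, 15.18],
     "heat": [11.94, 12.0, 10.56, 6.0, 6.0, 6.0, 6.0, 6.78]},
    index=idx,
)

# True where both columns moved by less than 0.5 since the previous row
crit = toy.temp.diff().abs().lt(0.5) & toy.heat.diff().abs().lt(0.5)

# Every row that fails the criterion opens a new segment
seg = (~crit).cumsum()

# One row per segment: first timestamp, last timestamp, number of rows
stamps = toy.index.to_series()
summary = stamps.groupby(seg).agg(start="min", end="max", rows="count")

# Keep only segments that span more than one row
summary = summary[summary.rows > 1]
print(summary)
```

On this shortened sample the result has two segments, 12:00 to 12:15 and 12:45 to 13:15. The second one ends at 13:15 only because the sample is cut there.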

Plotting:

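import matplotlib.pyplot as plt

vrep = 13  # y-level at which the constant segments are drawn
# vrep = (df.temp.mean() + df.heat.mean()) / 2
for key, grp in G:
    if len(grp) > 1:
        # Flat line at vrep over this segment; NaN everywhere else after reindexing
        ser = grp.const.replace(key, vrep).reindex(df.index)
        plt.plot(ser.index, ser, color="orange", linewidth=2)

plt.plot(df.index, df.temp, color="darkgreen", label="temp")
plt.plot(df.index, df.heat, color="darkblue", label="heat")
plt.legend(loc="best")
plt.grid()
plt.show()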

Edit: this was the first solution, but it does not yield all the constant segments:

thtemp=0.5  # threshold
thheat=0.5

crit= df.temp.diff().abs().lt(thtemp) & df.heat.diff().abs().lt(thheat)

df["const"]= crit.astype(int).replace(0,np.nan)

Expanding on the currently accepted answer, create the data frame:

import pandas as pd

temp = [27.18, 27.18, 27.18, 27.18, 20.82, 20.82, 20.82, 20.82, 15.18,
       15.18, 15.18, 15.18, 15.24, 15.24, 15.24, 15.24, 20.4 , 20.4 ,
       20.4 , 20.4 , 21.48, 21.48, 21.48, 21.48, 27.66, 27.66, 27.66,
       27.66, 27.9 , 27.9 , 27.9 , 27.9 , 27.9 , 27.9 , 27.9 , 27.9 ,
       27.84, 27.84, 27.84, 27.84, 27.84, 27.84, 27.84, 27.84, 21.72,
       21.72, 21.72, 21.72]
heat = [11.94, 12.  , 10.56,  6.  ,  6.  ,  6.  ,  6.  ,  6.  ,  6.  ,
        6.  ,  6.  ,  6.  ,  6.  ,  6.78,  9.  ,  9.  ,  9.  ,  9.  ,
        9.  ,  9.  ,  9.  , 11.58, 12.  , 11.94, 11.94, 12.  , 12.  ,
       11.94, 11.94, 12.  , 11.94, 12.  , 11.94, 12.  , 12.  , 11.94,
       12.  , 11.94, 11.94, 12.  , 11.94,  9.48,  9.  ,  9.  ,  9.  ,
        9.  ,  8.94,  9.  ]
date = ['2016-01-29 12:00:00', '2016-01-29 12:15:00',
       '2016-01-29 12:30:00', '2016-01-29 12:45:00',
       '2016-01-29 13:00:00', '2016-01-29 13:15:00',
       '2016-01-29 13:30:00', '2016-01-29 13:45:00',
       '2016-01-29 14:00:00', '2016-01-29 14:15:00',
       '2016-01-29 14:30:00', '2016-01-29 14:45:00',
       '2016-01-29 15:00:00', '2016-01-29 15:15:00',
       '2016-01-29 15:30:00', '2016-01-29 15:45:00',
       '2016-01-29 16:00:00', '2016-01-29 16:15:00',
       '2016-01-29 16:30:00', '2016-01-29 16:45:00',
       '2016-01-29 17:00:00', '2016-01-29 17:15:00',
       '2016-01-29 17:30:00', '2016-01-29 17:45:00',
       '2016-01-29 18:00:00', '2016-01-29 18:15:00',
       '2016-01-29 18:30:00', '2016-01-29 18:45:00',
       '2016-01-29 19:00:00', '2016-01-29 19:15:00',
       '2016-01-29 19:30:00', '2016-01-29 19:45:00',
       '2016-01-29 20:00:00', '2016-01-29 20:15:00',
       '2016-01-29 20:30:00', '2016-01-29 20:45:00',
       '2016-01-29 21:00:00', '2016-01-29 21:15:00',
       '2016-01-29 21:30:00', '2016-01-29 21:45:00',
       '2016-01-29 22:00:00', '2016-01-29 22:15:00',
       '2016-01-29 22:30:00', '2016-01-29 22:45:00',
       '2016-01-29 23:00:00', '2016-01-29 23:15:00',
       '2016-01-29 23:30:00', '2016-01-29 23:45:00']

df = pd.DataFrame({'date': date, 'temp': temp, 'heat': heat})
# infer_datetime_format is deprecated in pandas 2.x; plain to_datetime parses this format
df.index = pd.to_datetime(df['date'])
del df['date']

Create a boolean variable that is True where the values are constant:

thtemp=0.5  # threshold
thheat=0.5

df["const"] = df.temp.diff().abs().lt(thtemp) & df.heat.diff().abs().lt(thheat)
df.head()
                      temp   heat  const
date                                    
2016-01-29 12:00:00  27.18  11.94  False
2016-01-29 12:15:00  27.18  12.00   True
2016-01-29 12:30:00  27.18  10.56  False
2016-01-29 12:45:00  27.18   6.00  False
2016-01-29 13:00:00  20.82   6.00  False
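Because the flag is built from `diff()`, the first row of each constant run compares against the row before it and is marked False, as with 13:00:00 in the output above. A minimal sketch of one way to include that first row is to also check whether the next row is close; the names `toy`, `same_as_prev` and `same_as_next` are assumptions for illustration.

```python
import pandas as pd

# Six rows around the 13:00 run, rebuilt so the sketch is self-contained
idx = pd.date_range("2016-01-29 12:45", periods=6, freq="15min")
toy = pd.DataFrame(
    {"temp": [27.18, 20.82, 20.82, 20.82, 20.82, 15.18],
     "heat": [6.0, 6.0, 6.0, 6.0, 6.0, 6.0]},
    index=idx,
)

# True where both columns are within 0.5 of the previous row
same_as_prev = toy.temp.diff().abs().lt(0.5) & toy.heat.diff().abs().lt(0.5)

# True where the NEXT row is within 0.5 of this one (the last row has no next row)
same_as_next = same_as_prev.shift(-1, fill_value=False)

# A row belongs to a constant run if it matches either neighbor
toy["const"] = same_as_prev | same_as_next
print(toy)
```

With this variant the run is flagged from 13:00:00 through 13:45:00, while the isolated rows at 12:45:00 and 14:00:00 stay False.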

Plot and fill the regions where const == True:

import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots()
ax.plot(df.index, df['temp'])
ax.plot(df.index, df['heat'])

ax.fill_between(df.index, 0, 1, where=df['const'], alpha=0.1, transform=ax.get_xaxis_transform())

plt.gcf().autofmt_xdate()
plt.show()

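Why make the data frame so complicated?

This is a sample of the original data, with some modifications.

I can see four regions where df has constant rows, and the rightmost pair of yellow lines is not one of them.

I need to find the regions where heat and temp are relatively constant.

"Relatively constant" != "constant". You need to be very clear about what you want.

With this function I get a better plot, but there are still ramp regions in it. The plot should be a straight line with no fluctuations, which means both heat and temp have become constant.

Thanks for your answer, it works well. Just one thing here:

2016-01-29 13:15:00 2016-01-29 13:45:00

The constant region starts at 13:00:00. Could I get that row too, if possible?

How did you draw the red constant line?

@Arpit The missing-segment issue has been fixed, see above.

Thank you very much! This is exactly what I was looking for!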
