Python: get the count of all positive values in the last 150 rows, for each row (pandas)


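I have the following dataset, where each row has the columns Date and values. It contains both positive and negative values. For each row, I need the count of all positive values in the last 150 rows. The first 150 rows will therefore be null. Each later row should hold the count of positive values among the last 150 rows, and similarly the negative_values column should be filled with the count of negative values up to that row.

I tried using: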

def get_count_of_all_150_positive_rows_before_this_row(row):
    df1 = row.tail(2)
    df1 = df1.to_frame()
    print(df1.tail())
    # if df1['positive_values'] > 0:
    return (df1['positive_values'].count())


df.apply(get_count_of_all_150_positive_rows_before_this_row, axis=1)

Dataset:

Date        values      positive_values    negative_values
01/01/08    0.12344     
02/01/08    -0.12344        
03/01/08    -0.1234433      
04/01/08    -0.12344        
05/01/08    -0.1234433      
06/01/08    -0.12344        
07/01/08    -0.1234433      
08/01/08    -0.12344        
09/01/08    -0.1234433      
10/01/08    0.12344     
11/01/08    -0.12344        
12/01/08    -0.1234433      
13/01/08    -0.12344        
14/01/08    -0.1234433      
15/01/08    -0.12344        
16/01/08    -0.1234433      
17/01/08    -0.12344        
18/01/08    -0.1234433      
19/01/08    0.12344     

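This might be what you are looking for:

import numpy as np

# Count positive and negative values among the last 5 rows only.
tail = df.tail(5)
pos = (tail['values'] > 0).sum()
neg = (tail['values'] < 0).sum()

# Start with NaN everywhere, then write the counts into the last 5 rows.
df['pos_values'], df['neg_values'] = np.nan, np.nan
df.loc[df.index[-5:], 'pos_values'] = pos
df.loc[df.index[-5:], 'neg_values'] = neg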

#         Date    values  pos_values  neg_values
# 0   01/01/08  0.123440         NaN         NaN
# 1   02/01/08 -0.123440         NaN         NaN
# 2   03/01/08 -0.123443         NaN         NaN
# 3   04/01/08 -0.123440         NaN         NaN
# 4   05/01/08 -0.123443         NaN         NaN
# 5   06/01/08 -0.123440         NaN         NaN
# 6   07/01/08 -0.123443         NaN         NaN
# 7   08/01/08 -0.123440         NaN         NaN
# 8   09/01/08 -0.123443         NaN         NaN
# 9   10/01/08  0.123440         NaN         NaN
# 10  11/01/08 -0.123440         NaN         NaN
# 11  12/01/08 -0.123443         NaN         NaN
# 12  13/01/08 -0.123440         NaN         NaN
# 13  14/01/08 -0.123443         NaN         NaN
# 14  15/01/08 -0.123440         1.0         4.0
# 15  16/01/08 -0.123443         1.0         4.0
# 16  17/01/08 -0.123440         1.0         4.0
# 17  18/01/08 -0.123443         1.0         4.0
# 18  19/01/08  0.123440         1.0         4.0
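
The snippet above assumes that df already exists. As a rough sketch, the truncated sample from the question could be rebuilt as follows; the way the frame is constructed here is an assumption for illustration, since the original post does not show how the data was loaded. It also checks the counts for the last five rows.

```python
import pandas as pd

# Truncated sample from the question: 19 daily rows with dd/mm/yy dates.
values = [
    0.12344, -0.12344, -0.1234433, -0.12344, -0.1234433,
    -0.12344, -0.1234433, -0.12344, -0.1234433, 0.12344,
    -0.12344, -0.1234433, -0.12344, -0.1234433, -0.12344,
    -0.1234433, -0.12344, -0.1234433, 0.12344,
]
dates = pd.date_range("2008-01-01", periods=len(values), freq="D")

# Hypothetical reconstruction of df; column names follow the question.
df = pd.DataFrame({"Date": dates.strftime("%d/%m/%y"), "values": values})

# Filter the tail with a mask built from the tail itself,
# so the mask and the rows share the same index.
tail = df.tail(5)
pos = int((tail["values"] > 0).sum())
neg = int((tail["values"] < 0).sum())

print(pos, neg)  # 1 4
```

Building the mask from tail rather than from the full df avoids relying on pandas to realign a longer boolean Series against the five selected rows.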
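You want to use rolling() (Series.rolling in pandas) to keep a rolling count of the positive and negative values over the previous "period" rows: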

period = 5
df['less_than_zero'] = (df['values']
                        .rolling(window=period, min_periods=period)
                        .agg(lambda x: (x < 0).sum()))

df['greater_than_zero'] = (df['values']
                           .rolling(window=period, min_periods=period)
                           .agg(lambda x: (x > 0).sum()))

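Note: it is worth adding a few zeros to the sample dataset to check how they are attributed. With the strict comparisons above, a zero counts as neither negative nor positive. (The sample has none, but still.)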
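
As a possible alternative to the lambda-based version, the comparison can be done once on the whole column and the resulting 0/1 mask summed over the window. This is only a sketch: the Series s, its values (which deliberately include zeros), and period = 3 are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data with two zeros to show how they are treated.
s = pd.Series([0.5, -0.2, 0.0, 0.3, 0.4, -0.1, 0.0, -0.6], name="values")
period = 3

# Compare once, cast the boolean mask to int, then sum it over each window.
# min_periods=period leaves the first period - 1 rows as NaN.
pos_count = (s > 0).astype(int).rolling(window=period, min_periods=period).sum()
neg_count = (s < 0).astype(int).rolling(window=period, min_periods=period).sum()

result = pd.DataFrame({
    "values": s,
    "greater_than_zero": pos_count,
    "less_than_zero": neg_count,
})
print(result)
# greater_than_zero: NaN, NaN, 1, 1, 2, 2, 1, 0
# less_than_zero:    NaN, NaN, 1, 1, 0, 1, 1, 2
```

In windows that contain a zero, the two counts add up to less than period, because the zero falls into neither column.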

Could you add the expected output for ±5 rows instead of 150? 15-20 rows should also be enough. Thanks.

I have truncated your data. A minimal example does not need all of your rows. We can demonstrate the logic with 5 rows instead of 150.

@Jason, please provide the desired output, assuming the last 5 rows, given the truncated data in your question.