Python: In pandas, how can I count consecutive positives and negatives?
In Python or NumPy, is there a built-in function, or a combination of functions, that can count the number of positive or negative values in a row? This can be thought of as analogous to a roulette wheel producing a run of black or red numbers. Example input series data:
Date
2000-01-07 -3.550049
2000-01-10 28.609863
2000-01-11 -2.189941
2000-01-12 4.419922
2000-01-13 17.690185
2000-01-14 41.219971
2000-01-18 0.000000
2000-01-19 -16.330078
2000-01-20 7.950195
2000-01-21 0.000000
2000-01-24 38.370117
2000-01-25 6.060059
2000-01-26 3.579834
2000-01-27 7.669922
2000-01-28 2.739991
2000-01-31 -8.039795
2000-02-01 10.239990
2000-02-02 -1.580078
2000-02-03 1.669922
2000-02-04 7.440186
2000-02-07 -0.940185
Expected output:
- in a row 5 times
+ in a row 4 times
++ in a row once
++++ in a row once
+++++++ in a row once
You can use the itertools.groupby function:
import itertools

l = [-3.550049, 28.609863, -2.189941, 4.419922, 17.690185, 41.219971, 0.000000, -16.330078, 7.950195, 0.000000, 38.370117, 6.060059, 3.579834, 7.669922, 2.739991, -8.039795, 10.239990, -1.580078, 1.669922, 7.440186, -0.940185]
r_pos = {}  # run length -> number of positive runs of that length
r_neg = {}  # run length -> number of non-positive runs of that length
# groupby yields one group per run of consecutive values with the same key.
for k, v in itertools.groupby(l, lambda e: e > 0):
    count = len(list(v))
    r = r_pos
    if k == False:
        r = r_neg
    if count not in r.keys():
        r[count] = 0
    r[count] += 1

for k, v in sorted(r_neg.items()):
    print('%s in a row %s time(s)' % ('-' * k, v))
for k, v in sorted(r_pos.items()):
    print('%s in a row %s time(s)' % ('+' * k, v))
Output:
- in a row 6 time(s)
+ in a row 2 time(s)
++ in a row 1 time(s)
++++ in a row 1 time(s)
+++++++ in a row 1 time(s)
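Depending on what you consider a positive value, you can change the line lambda e: e > 0. For example, lambda e: e >= 0 counts zeros as positive, which is what the output above reflects.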
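To make the effect of that choice concrete, here is a minimal sketch that wraps the same itertools.groupby idea in a helper and runs it with both predicates. The helper name run_length_counts is my own and not part of any library; the data is the series from the question.

```python
import itertools
from collections import Counter

def run_length_counts(values, key):
    """Hypothetical helper: for each label returned by `key`,
    count how many runs of each length occur in `values`."""
    counts = {}
    for label, group in itertools.groupby(values, key):
        length = sum(1 for _ in group)  # length of this run
        counts.setdefault(label, Counter())[length] += 1
    return counts

values = [-3.550049, 28.609863, -2.189941, 4.419922, 17.690185, 41.219971,
          0.000000, -16.330078, 7.950195, 0.000000, 38.370117, 6.060059,
          3.579834, 7.669922, 2.739991, -8.039795, 10.239990, -1.580078,
          1.669922, 7.440186, -0.940185]

strict = run_length_counts(values, lambda e: e > 0)   # zeros count as negative
loose = run_length_counts(values, lambda e: e >= 0)   # zeros count as positive

print(strict[True])   # positive runs when zeros break a streak
print(loose[True])    # positive runs when zeros extend a streak
```

With e > 0 the zeros split the longer streaks, so the positive runs come out as lengths 1, 3, 5 and 2; with e >= 0 they merge into runs of 4 and 7.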
Non-negatives:
from functools import reduce  # For Python 3.x
ser = df['x'] >= 0
c = ser.expanding().apply(lambda r: reduce(lambda x, y: x + 1 if y else x * y, r))
c[ser & (ser != ser.shift(-1))].value_counts()
Out:
1.0 2
7.0 1
4.0 1
2.0 1
Name: x, dtype: int64
Negatives:
ser = df['x'] < 0
c = ser.expanding().apply(lambda r: reduce(lambda x, y: x + 1 if y else x * y, r))
c[ser & (ser != ser.shift(-1))].value_counts()
Out:
1.0 6
Name: x, dtype: int64
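The same counts can probably be reached without expanding().apply, which calls a Python function once per row. The following is only a sketch under my own assumptions: the data lives in a standalone Series s rather than in df['x'], and run_length_table is a name I made up. It labels each run with a shift/cumsum comparison and then measures the runs with groupby.

```python
import pandas as pd

values = [-3.550049, 28.609863, -2.189941, 4.419922, 17.690185, 41.219971,
          0.000000, -16.330078, 7.950195, 0.000000, 38.370117, 6.060059,
          3.579834, 7.669922, 2.739991, -8.039795, 10.239990, -1.580078,
          1.669922, 7.440186, -0.940185]
s = pd.Series(values)  # assumption: stands in for df['x']

def run_length_table(mask):
    """Hypothetical helper: for a boolean Series, return how many
    runs of consecutive True values exist for each run length."""
    # The id increases by one every time the boolean value changes.
    run_id = (mask != mask.shift()).cumsum()
    # Keep only the True rows and measure the size of each run.
    lengths = mask[mask].groupby(run_id[mask]).size()
    return lengths.value_counts().sort_index()

non_negative_runs = run_length_table(s >= 0)
negative_runs = run_length_table(s < 0)
print(non_negative_runs)
print(negative_runs)
```

On this data the non-negative table has two runs of length 1 and one run each of lengths 2, 4 and 7, and the negative table has six runs of length 1, which agrees with the outputs shown above.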
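To identify the turning points in the code above, the condition is that the current value differs from the next one and is True. If you select those rows, you have the counts.

So far this is what I have come up with. It works and outputs how many times each run of negative, positive and zero values occurs. Maybe someone can make it more concise using some of the suggestions posted above by ayhan and Ghilas.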
from collections import Counter
import numpy as np

ser = [-3.550049, 28.609863, -2.189941, 4.419922, 17.690185, 41.219971, 0.000000, -16.330078, 7.950195, 0.000000, 38.370117, 6.060059, 3.579834, 7.669922, 2.739991, -8.039795, 10.239990, -1.580078, 1.669922, 7.440186, -0.940185]
c = 0  # length of the current run
zeros, neg_counts, pos_counts = [], [], []
for i in range(len(ser)):
    c += 1
    s = np.sign(ser[i])
    try:
        # A run ends when the next value has a different sign.
        if s != np.sign(ser[i + 1]):
            if s == 0:
                zeros.append(c)
            elif s == -1:
                neg_counts.append(c)
            elif s == 1:
                pos_counts.append(c)
            c = 0
    except IndexError:
        # Last element: there is no next value, so close the final run.
        if s == 1:
            pos_counts.append(c)
        elif s == -1:
            neg_counts.append(c)
        else:
            zeros.append(c)
print(Counter(neg_counts))
print(Counter(pos_counts))
print(Counter(zeros))
Output:
Counter({1: 6})
Counter({1: 3, 3: 1, 5: 1, 2: 1})
Counter({1: 2})

Sure, there is the cumcount() function.
What do you mean by consecutive positives and negatives in a row? Can you give us a sample case?
Thanks for your help. I was not able to get your suggestion to work, but by combining your answer with Ghilas's answer below I was able to find something that suits my needs. Thanks again!
Thanks for the suggestion. Combining this with the previous answer, I was able to put together what I needed. Awesome!
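As for making the loop more concise: one possible sketch, not necessarily what the posters had in mind, is to group on the sign of each value with itertools.groupby and tally (sign, run length) pairs in a single Counter. The sign here is computed in plain Python as (e > 0) - (e < 0) instead of np.sign, and the names sign, runs and by_sign are my own.

```python
from collections import Counter
from itertools import groupby

values = [-3.550049, 28.609863, -2.189941, 4.419922, 17.690185, 41.219971,
          0.000000, -16.330078, 7.950195, 0.000000, 38.370117, 6.060059,
          3.579834, 7.669922, 2.739991, -8.039795, 10.239990, -1.580078,
          1.669922, 7.440186, -0.940185]

def sign(e):
    # -1 for negative, 0 for zero, 1 for positive.
    return (e > 0) - (e < 0)

# One entry per (sign, run length) pair, counting how often it occurs.
runs = Counter((s, sum(1 for _ in group)) for s, group in groupby(values, sign))

# Split the tally into one Counter per sign for printing.
by_sign = {s: Counter() for s in (-1, 0, 1)}
for (s, length), n in runs.items():
    by_sign[s][length] = n

print(by_sign[-1])  # negative runs
print(by_sign[1])   # positive runs
print(by_sign[0])   # zero runs
```

Zeros form their own runs here, so they neither extend nor join a positive or negative streak, which matches the three-way split in the loop version.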