Python: alternative to pandas apply due to MemoryError
Tags: python, pandas, dataframe, apply

I have a function that I want to apply to a DataFrame:
def DetermineMid(data, ts):
    if data['U'] == 0 and data['D'] > 0:
        mid = data['C'] + ts / 2
    elif data['U'] > 0 and data['D'] == 0:
        mid = data['C'] - ts / 2
    else:
        diff = data['A'] - data['B']
        if diff == 0:
            mid = data['C'] + 1
        else:
            mid = data['C']
    return mid
My df columns are A, B, C, D, U.
My call is as follows:
df = df.apply(DetermineMid, args=(5, ), axis=1)
On smaller DataFrames this works fine, but for this DataFrame:
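DatetimeIndex: 2561527 entries, 2016-11-30 17:00:01 to 2017-11-29 16:00:00
Data columns (total 6 columns):
Z    float64
A    float64
B    float64
C    float64
U    int64
D    int64
dtypes: float64(5), int64(2)
memory usage: 156.3 MB
None

I get a MemoryError. Am I using apply incorrectly? I assumed apply just iterates over the rows, creates a value mid from each row's values, and then discards the old values, since I no longer care about them.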
Is there a better way?

Use np.select, i.e.:
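import numpy as np
ts = 5  # the value passed via args=(5, ) in the question's apply call
m1 = (df['U'] == 0) & (df['D'] > 0)
m2 = (df['U'] > 0) & (df['D'] == 0)
m3 = (df['A'] - df['B']) == 0
# np.select uses the first condition that is True for each row; rows matching none fall back to df['C']
np.select([m1, m2, m3], [df['C'] + ts / 2, df['C'] - ts / 2, df['C'] + 1], df['C'])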
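
To check that the masks reproduce the row-wise logic, here is a minimal sketch that runs both versions on a small, made-up DataFrame and compares the results. The sample values and the names determine_mid, determine_mid_vectorized and sample are illustrative assumptions, not taken from the question's data.

```python
import numpy as np
import pandas as pd


def determine_mid(row, ts):
    # Row-wise reference logic, same branches as DetermineMid in the question.
    if row['U'] == 0 and row['D'] > 0:
        return row['C'] + ts / 2
    elif row['U'] > 0 and row['D'] == 0:
        return row['C'] - ts / 2
    elif row['A'] - row['B'] == 0:
        return row['C'] + 1
    return row['C']


def determine_mid_vectorized(df, ts):
    # Column-wise version: one boolean mask per branch, evaluated in order.
    m1 = (df['U'] == 0) & (df['D'] > 0)
    m2 = (df['U'] > 0) & (df['D'] == 0)
    m3 = (df['A'] - df['B']) == 0
    values = np.select(
        [m1, m2, m3],
        [df['C'] + ts / 2, df['C'] - ts / 2, df['C'] + 1],
        default=df['C'],
    )
    # np.select returns a plain ndarray, so re-attach the original index.
    return pd.Series(values, index=df.index, name='mid')


# Hypothetical sample: one row per branch of the original function.
sample = pd.DataFrame({
    'A': [1.0, 2.0, 3.0, 4.0],
    'B': [1.0, 2.5, 3.0, 9.0],
    'C': [10.0, 20.0, 30.0, 40.0],
    'U': [0, 3, 1, 2],
    'D': [2, 0, 1, 5],
})

expected = sample.apply(determine_mid, args=(5,), axis=1)
result = determine_mid_vectorized(sample, 5)
print(result.tolist())
print((expected == result).all())
```

The order of the conditions matters here: m3 is only consulted for rows where neither m1 nor m2 holds, which mirrors the else branch of the original function.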
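Great! Thanks very much. Quick follow-ups: 1) How did you know to use numpy instead of pandas? 2) If I want to keep the original values and create a new column of these values, should I do the same thing and then merge the two?

Yes, you can use it to create a new column. It takes time to understand the concept of vectorized solutions; give it some time.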
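
On the second follow-up, a minimal sketch of keeping the original columns and adding the result as a new one. It assumes the new column is called mid (a name chosen here for illustration) and uses made-up sample values. Assigning the np.select result directly should be enough, so no separate merge step appears to be needed.

```python
import numpy as np
import pandas as pd

ts = 5  # same value as args=(5, ) in the question

# Hypothetical sample data with the question's column names.
df = pd.DataFrame({
    'A': [1.0, 2.0, 5.0],
    'B': [1.0, 4.0, 5.0],
    'C': [100.0, 200.0, 300.0],
    'U': [0, 1, 2],
    'D': [3, 1, 0],
})

m1 = (df['U'] == 0) & (df['D'] > 0)
m2 = (df['U'] > 0) & (df['D'] == 0)
m3 = (df['A'] - df['B']) == 0

# The array from np.select has one entry per row, in row order,
# so it can be assigned as a new column next to the existing ones.
df['mid'] = np.select(
    [m1, m2, m3],
    [df['C'] + ts / 2, df['C'] - ts / 2, df['C'] + 1],
    default=df['C'],
)
print(df)
```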