A fast way to replace values in a sparse DataFrame with values from a dictionary (Python, pandas)


I have a very sparse DataFrame df that looks like this:

       Apples Bananas Pineapple Mango
Mary   Apples     NaN       NaN   NaN
Jane      NaN Bananas       NaN   NaN
Diego     NaN     NaN       NaN Mango
Guido     NaN     NaN Pineapple   NaN
I would like to apply a dictionary d such as

d = {'Apples':3, 'Bananas':1, 'Pineapple':2, 'Mango': 15}
to obtain

       Apples Bananas Pineapple Mango
Mary        3     NaN       NaN   NaN
Jane      NaN       1       NaN   NaN
Diego     NaN     NaN       NaN    15
Guido     NaN     NaN         2   NaN
I can do

df.to_sparse().replace(d)

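but it has been running for more than 30 minutes and has produced no output yet. My DataFrame has 10,000 rows by 1,500 columns; it was originally 135 MB and shrank to 850 kB after to_sparse(). Is there a faster way?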

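Edit following the changed question: you can use stack to get a Series with a MultiIndex (NaN values are dropped), then map with the dictionary, and finally unstack to reshape, i.e. df.stack().map(d).unstack().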
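The stack, map, unstack chain described above can be tried on the four-row example from the question. This is a minimal sketch, not a definitive implementation: the explicit dropna() and the final reindex() are assumptions added here, as a safeguard in case the installed pandas version keeps NaN entries when stacking, and to restore the original row and column order.

```python
import numpy as np
import pandas as pd

d = {'Apples': 3, 'Bananas': 1, 'Pineapple': 2, 'Mango': 15}

# The small sparse frame from the question: one label per row, NaN elsewhere.
df = pd.DataFrame(
    {
        'Apples': ['Apples', np.nan, np.nan, np.nan],
        'Bananas': [np.nan, 'Bananas', np.nan, np.nan],
        'Pineapple': [np.nan, np.nan, np.nan, 'Pineapple'],
        'Mango': [np.nan, np.nan, 'Mango', np.nan],
    },
    index=['Mary', 'Jane', 'Diego', 'Guido'],
)

# Long format: a Series indexed by (row, column) holding only the filled cells.
# dropna() is an added safeguard (assumption); it does nothing if stack()
# has already dropped the NaN entries.
stacked = df.stack().dropna()

# Look up each remaining label in the dictionary.
mapped = stacked.map(d)

# Back to wide format; reindex() (also an assumption added here) restores
# the original row and column order.
result = mapped.unstack().reindex(index=df.index, columns=df.columns)
print(result)
```

Only the non-missing cells pass through map, which is why this tends to help when the frame is very sparse.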


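df.replace(d)?
Hi all, apparently replace(d) does not work for me. I will rephrase the question to focus on an alternative solution that is more efficient than replace(d).
@famargar - if it is very sparse, use df = df.stack().map(d).unstack()
@cᴏʟᴅsᴘᴇᴇᴅ - can you reopen this? Or can I reopen it?
@jezrael It's a dupe, so go ahead and answer if you like, with timings too.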
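import numpy as np
import pandas as pd

np.random.seed(1235)
N = 1000
d = {'Apples':3, 'Bananas':1, 'Pineapple':2, 'Mango': 15}
# dtype=object keeps np.nan as a real missing value; a plain list mixing
# strings and np.nan would be coerced to the string 'nan'
choices = np.array(list(d.keys()) + [np.nan], dtype=object)
df = pd.DataFrame(np.random.choice(choices,
                  size=(N, N),
                  p=(0.01,0.02,0.03,0.02,0.92)))
#print (df)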

In [227]: %timeit df.replace(d)
1 loop, best of 3: 661 ms per loop

In [228]: %timeit df.stack().map(d).unstack()
1 loop, best of 3: 381 ms per loop
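The two timed expressions are only interchangeable if they return the same values. One difference to watch for: rows or columns that contain no values at all disappear once the NaN entries are dropped, so the reshaped result can be smaller than the input. The sketch below uses a hypothetical toy frame with an all-NaN row named 'Empty' and restores the shape with reindex(). The dropna() and reindex() calls and the cast to float are assumptions added for the comparison, not part of the timed code.

```python
import numpy as np
import pandas as pd

d = {'Apples': 3, 'Bananas': 1, 'Pineapple': 2, 'Mango': 15}

# Hypothetical toy frame; the row 'Empty' holds no values at all.
df = pd.DataFrame(
    {
        'Apples': ['Apples', np.nan, np.nan],
        'Bananas': [np.nan, 'Bananas', np.nan],
        'Mango': [np.nan, 'Mango', np.nan],
    },
    index=['Mary', 'Jane', 'Empty'],
)

# Reference result: replace every label directly, then cast to float
# so that both results share a dtype.
via_replace = df.replace(d).astype(float)

# stack/map/unstack route: the all-NaN row is missing after reshaping.
short = df.stack().dropna().map(d).unstack()

# Restore the original shape and ordering.
via_stack = short.reindex(index=df.index, columns=df.columns)

same = np.allclose(via_replace.to_numpy(), via_stack.to_numpy(), equal_nan=True)
print(short.shape, via_stack.shape, same)
```

If the real data is guaranteed to have at least one value in every row and column, the reindex() step only fixes the ordering.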