Python: can the callable in pandas mask/where be re-evaluated for each entry?


python pandas


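I am using the pandas mask/where functions with random() as the callable. Is it possible to get this function to compute a different value for each entry, so that I don't get the same random number in every masked cell, or can this only be done with apply? For example, the call df_factor_history.where(df_valid.T.unstack(), random()) generates the same random number everywhere (0.134364 in this case), whereas here I want six different random numbers.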

You can do this by generating a 2d numpy array with the same shape as the DataFrame; here * unpacks the shape tuple into integers:

np.random.seed(1)

arr = np.random.rand(*df_factor_history.shape)
print (arr)
[[4.17022005e-01 7.20324493e-01 1.14374817e-04]
 [3.02332573e-01 1.46755891e-01 9.23385948e-02]
 [1.86260211e-01 3.45560727e-01 3.96767474e-01]
 [5.38816734e-01 4.19194514e-01 6.85219500e-01]
 [2.04452250e-01 8.78117436e-01 2.73875932e-02]
 [6.70467510e-01 4.17304802e-01 5.58689828e-01]]

df = df_factor_history.where(df_valid.T.unstack(), arr)
print (df)
                        observation_1  observation_2  observation_3
as_at_date factor_name                                             
2020-10-01 A                 0.417022       0.720324       0.000114
           B                 1.000000       5.000000       3.000000
2020-11-01 A                 1.000000       5.000000       3.000000
           B                 0.538817       0.419195       0.685220
2020-12-01 A                 1.000000       5.000000       3.000000
           B                 1.000000       5.000000       3.000000
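
The same idea can be reproduced without the original data. The following is a minimal sketch, not the answerer's exact code: the names frame, keep and mask are hypothetical stand-ins (the real df_factor_history is not shown in the question), and it assumes NumPy's Generator API (np.random.default_rng) instead of the legacy np.random.seed. The mask is expanded to a full boolean DataFrame so the sketch does not depend on how a Series condition is broadcast.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_factor_history (the original frame is not shown).
index = pd.MultiIndex.from_product(
    [["2020-10-01", "2020-11-01"], ["A", "B"]],
    names=["as_at_date", "factor_name"],
)
frame = pd.DataFrame(
    [[1.0, 5.0, 3.0]] * 4,
    index=index,
    columns=["observation_1", "observation_2", "observation_3"],
)

# Hypothetical validity flags: True keeps the row, False replaces it.
keep = pd.Series([False, True, True, False], index=index)

# Expand the row flags to one boolean column per data column.
mask = pd.DataFrame({col: keep for col in frame.columns})

# One independent uniform draw per cell, same shape as the frame.
rng = np.random.default_rng(1)
fill = rng.random(frame.shape)

# Cells where mask is False take the value at the same position in fill.
result = frame.where(mask, fill)
print(result)
```

Only the cells where the mask is False read from the random array, so the rest of the draws are simply discarded. For very large frames that is wasted work, but it keeps the operation vectorized and avoids apply.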

OK, so build a full-size random array and let where pick out the parts I need. Sweet! Many thanks @jezrael for the quick and elegant solution.

import pandas as pd

df_valid = pd.DataFrame([
    ['2020-09-01', True, True],
    ['2020-10-01', False, True],
    ['2020-11-01', True, False],
    ['2020-12-01', True, True],
    ['2021-01-01', True, True]], 
    columns = ['as_at_date', 'A', 'B'])

df_valid.set_index(['as_at_date'], inplace=True)

df_valid

from random import random, seed

# seed random number generator
seed(1)

# random() is evaluated once, before where() runs, so every replaced cell gets the same value
df_factor_history.where(df_valid.T.unstack(), random())
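
Note that the call above passes the result of random(), a single float, rather than a callable. Even when a real callable is passed as other, pandas computes it once on the whole DataFrame, not once per cell, so the callable itself has to return a full frame of values. A minimal sketch of that variant, assuming made-up data and a hypothetical helper named random_like:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)


def random_like(frame):
    """Return independent uniform draws, shaped and labelled like `frame`."""
    return pd.DataFrame(
        rng.random(frame.shape), index=frame.index, columns=frame.columns
    )


# Hypothetical data and condition, for illustration only.
data = pd.DataFrame({"x": [1.0, 2.0, 3.0], "y": [4.0, 5.0, 6.0]})
mask = data > 2.5  # keep values above 2.5, replace the rest

# Pass the function itself (no parentheses): pandas calls it once with `data`.
filled = data.where(mask, random_like)
print(filled)
```

This is equivalent to precomputing the array and passing it directly; the callable form is mainly convenient inside method chains where the intermediate frame has no name.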
