Pseudo-randomization in Python - Python, Python 3.x, Pandas, Random, PsychoPy


python python-3.x pandas random


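I currently have a problem pseudo-randomizing my trials. I use a while loop to create 12 files, each containing 38 rows (trials), that satisfy one criterion:

1) color1expl must not be identical in 3 consecutive rows (at most 2 in a row),

where color1expl is a column in my dataframe.

When I only need files with 38 rows, the script below seems to work fine: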

import pandas as pd

n_dataset_int = 0
n_dataset = str(n_dataset_int)

df_all_possible_trials = pd.read_excel('GroupF' + n_dataset + '.xlsx') # this is my dataset with all possible trials


# creating the files
for iterations in range(0,12): #I need 12 files with pseudorandom combinations

    n_dataset_int += 1 #this keeps track of the number of iterations
    n_dataset = str(n_dataset_int) 

    df_experiment = df_all_possible_trials.sample(n=38) #38 is the total number of trials
    df_experiment.reset_index(drop=True, inplace=True)

    # color1expl cannot be identical in 3 consecutive trials (at most 2 consecutive trials)

    randomized = False

    while not randomized:  # each pass of this while loop reshuffles the rows of the dataframe
        experimental_df_2 = df_experiment.sample(frac=1).reset_index(drop=True)
        for i in range(0, len(experimental_df_2)):
            try:
                if i == len(experimental_df_2) - 1:
                    randomized = True
                elif (experimental_df_2['color1expl'][i] != experimental_df_2['color1expl'][i+1]) or (experimental_df_2['color1expl'][i] != experimental_df_2['color1expl'][i+2]):
                    continue
                elif (experimental_df_2['color1expl'][i] == experimental_df_2['color1expl'][i+1]) and (experimental_df_2['color1expl'][i] == experimental_df_2['color1expl'][i+2]):
                    break
            except:
                pass

    #export the excel file
    experimental_df_2.to_excel('GroupF_r' + n_dataset + '.xlsx', index=False) #creates a new

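However, when I run the same procedure with the number increased from n=38 to n=228, the script seems to run indefinitely. It has now been running for more than a day and has not produced a single one of the 12 files, probably because there are too many combinations to try.

Is there a way to improve this script so that it can handle more rows?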
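As a side note, the criterion itself can be checked without the nested try/except. The following is only a minimal sketch: has_run_of_three is a hypothetical helper name, and it assumes the column is passed in as a plain list, for example df_experiment['color1expl'].tolist().

```python
def has_run_of_three(values):
    """Return True if some value occupies 3 (or more) consecutive positions."""
    values = list(values)
    # Compare every element with its two successors; zip stops two items
    # before the end, so no index can run past the list.
    return any(a == b == c for a, b, c in zip(values, values[1:], values[2:]))


print(has_run_of_three(["red", "red", "blue", "red", "red"]))      # False
print(has_run_of_three(["red", "blue", "blue", "blue", "green"]))  # True
```

A shuffle would then be accepted only when has_run_of_three returns False. This tidies up the check but does not by itself make the reshuffle-until-valid loop any faster.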

I think you could change the way the random sample is generated (pseudocode):

n = 38 # or anything else

my_sample = []

my_sample.append( pop_one_random_from(df_all_possible_trials) )
my_sample.append( pop_one_random_from(df_all_possible_trials) )

while len(my_sample) < n:
  next_one = pop_one_random_from(df_all_possible_trials)
  if next_one is equal to my_sample[-1] and my_sample[-2]:
    put next_one back to df_all_possible_trials
    continue
  else:
    my_sample.append( next_one )

If I got this right, all the different samples (the total number of combinations, 'len(df_all_possible_trials) choose n') have the same probability of being selected, which is what you are looking for. And it should work faster.

except: pass seems like a very bad idea. Why are you doing that?

Thanks for the suggestion. Do you think it could be a problem? I am not much of a programmer, so I may have made a mistake, but as I see it the loop keeps running until the 12 files are generated, skipping over any errors. In other words, if the try fails, doesn't the loop simply start again?

The issue is not whether the program works, but that it catches every possible exception. You don't want your program to silently discard exceptions such as MemoryError, OverflowError and OSError, do you?
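The pseudocode from the answer could be turned into runnable Python roughly as follows. This is a sketch under stated assumptions, not the answerer's own implementation: draw_constrained_order, rng and max_restarts are hypothetical names, the input is the color1expl column as a plain list, and instead of popping a row and putting it back, the code picks directly among the rows that would not create a run of three, which amounts to the same draw. It also restarts when it reaches a dead end where only rows of the blocked colour remain, a case the pseudocode does not handle.

```python
import random


def draw_constrained_order(colors, n, rng=None, max_restarts=1000):
    """Pick n distinct positions from `colors` so that no label occurs
    in 3 consecutive picks. Returns the positions in presentation order."""
    if n > len(colors):
        raise ValueError("n is larger than the number of available trials")
    rng = rng or random.Random()

    for _ in range(max_restarts):
        pool = list(range(len(colors)))  # positions still available
        order = []                       # positions chosen so far
        while len(order) < n:
            if len(order) >= 2 and colors[order[-1]] == colors[order[-2]]:
                # The last two picks share a colour, so that colour is blocked.
                blocked = colors[order[-1]]
                allowed = [p for p in pool if colors[p] != blocked]
            else:
                allowed = pool
            if not allowed:
                break  # dead end: only blocked rows are left, start over
            pick = rng.choice(allowed)
            pool.remove(pick)
            order.append(pick)
        if len(order) == n:
            return order

    raise RuntimeError("no valid order found within max_restarts attempts")


# Small demonstration with made-up colour labels.
demo_colors = ["red"] * 10 + ["blue"] * 10 + ["green"] * 10
demo_order = draw_constrained_order(demo_colors, 20)
print([demo_colors[i] for i in demo_order])
```

With the dataframe from the question, the list could be obtained with df_all_possible_trials['color1expl'].tolist(), and the selected rows with df_all_possible_trials.iloc[order].reset_index(drop=True). Because each row is chosen so that the constraint already holds, no whole-file reshuffling is needed, and the cost stays small even for n=228.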