Efficiently iterating over 3.311031748E+12 combinations in Python


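I have collected a large Pokemon dataset, and my goal is to identify the "top 10 teams" according to a ratio I constructed: a Pokemon's BST (base stat total) to its average weakness. For those who care, I compute the average weakness as the sum of a Pokemon's damage multipliers against each type (0.25 against flying + 1 against water + 2 against steel + 4 against fire, and so on), divided by 18 (the total number of types available in the game).
As a quick example, a team made up of the following three Pokemon: Kingler, Mimikyu, Magnezone has a team ratio of 1604.1365384615383.
Since the data will be used for competitive play, I removed all Pokemon that are not fully evolved, as well as legendary/mythical Pokemon. Here is my current process: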

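Create a collection containing every possible team combination of fully evolved Pokemon

Iterate over each combination with a for loop

The first 10 combinations are added to a list automatically

From the 11th combination onward, I add the current team to the list, sort the list in descending order, and then delete the team with the lowest ratio. This ensures that only the top 10 remain after every iteration
Obviously, this process will take a very long time to run. I would like to know whether there is a more efficient way to do this. Finally, please see my code below: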

import itertools
import pandas as pd

df = pd.read_csv("Downloads/pokemon.csv")  # read in csv of fully-evolved Pokemon data
# list(df)  # list of df column names - useful to see what data has been collected

df = df[df["is_legendary"] == 0]  # remove legendary Pokemon (excluded here for competitive play)

df = df[['abilities',  # trim df to contain only the columns we care about
         'against_bug', 'against_dark', 'against_dragon', 'against_electric',
         'against_fairy', 'against_fight', 'against_fire', 'against_flying',
         'against_ghost', 'against_grass', 'against_ground', 'against_ice',
         'against_normal', 'against_poison', 'against_psychic', 'against_rock',
         'against_steel', 'against_water',
         'attack', 'defense', 'hp', 'name',
         'sp_attack', 'sp_defense', 'speed', 'type1', 'type2']]

# calculate BSTs
df["bst"] = (df["hp"] + df["attack"] + df["defense"]
             + df["sp_attack"] + df["sp_defense"] + df["speed"])

# calculates a Pokemon's 'average weakness' to other types
df['average_weakness'] = (df['against_bug'] + df['against_dark'] + df['against_dragon']
                          + df['against_electric'] + df['against_fairy'] + df['against_fight']
                          + df['against_fire'] + df['against_flying'] + df['against_ghost']
                          + df['against_grass'] + df['against_ground'] + df['against_ice']
                          + df['against_normal'] + df['against_poison'] + df['against_psychic']
                          + df['against_rock'] + df['against_steel'] + df['against_water']) / 18

# ratio of BST:avg weakness - the higher the better
df['bst-weakness-ratio'] = df['bst'] / df['average_weakness']

names = df["name"]  # pull out list of all names for creating combinations
combinations = itertools.combinations(names, 6)  # create all possible combinations of 6 pokemon teams

top_10_teams = []  # list for storing top 10 teams
for x in combinations:
    ratio = sum(df.loc[df['name'].isin(x)]['bst-weakness-ratio'])  # pull out sum of team's ratio
    if len(top_10_teams) != 10:
        top_10_teams.append((x, ratio))  # first 10 teams will automatically populate list
    else:
        top_10_teams.append((x, ratio))  # add team to list
        top_10_teams.sort(key=lambda x: x[1], reverse=True)  # sort list by descending ratios
        del top_10_teams[-1]  # drop team with the lowest ratio - only top 10 remain in list

top_10_teams
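One way to cut the per-iteration cost of the loop above, without changing what it computes, is to look ratios up in a plain dict instead of filtering the DataFrame for every team, and to let `heapq.nlargest` keep the running top k instead of re-sorting a list each time. The sketch below uses made-up toy data; the names in `ratios` and the helper `top_teams` are hypothetical and not part of the original code.

```python
import heapq
import itertools

# Hypothetical toy data: name -> bst-weakness-ratio (values are made up)
ratios = {"a": 5.0, "b": 3.0, "c": 8.0, "d": 1.0, "e": 6.0}


def top_teams(ratios, team_size, k):
    """Return the k highest-scoring teams as (score, team) tuples, best first."""
    # Generator: teams are scored lazily, so they are never all held in memory.
    scored = (
        (sum(ratios[name] for name in team), team)
        for team in itertools.combinations(ratios, team_size)
    )
    # nlargest consumes the generator while keeping only k candidates around.
    return heapq.nlargest(k, scored)


best = top_teams(ratios, team_size=2, k=3)
for score, team in best:
    print(score, team)
```

This only lowers the constant factor per team. The number of combinations visited is unchanged, so on the full dataset it would still need to walk all 3.31e12 teams.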

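In your example, every Pokemon has its own bst-weakness ratio, and when computing a team's value you do not account for team members covering each other's weaknesses; you simply add up the ratios of the 6 members? If so, shouldn't the best team simply be the one with the 6 best individual Pokemon? I don't see why you need the combinations at all.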
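To illustrate the point about additive scores: if a team's value really is just the sum of its members' ratios, the best team is the top 6 individuals, and the top k teams can only draw from a small pool of the best individuals. The reasoning (my own, assuming distinct ratios): a team containing a Pokemon ranked below position `team_size + k - 1` uses at most `team_size - 1` members of that pool, so at least k pool members are unused, and swapping any one of them in yields k strictly better teams. The sketch below checks this against brute force on small made-up integer scores; all names and values are hypothetical.

```python
import heapq
import itertools
import random


def top_teams_bruteforce(values, team_size, k):
    """Score every team and return the k best as (score, team) tuples."""
    scored = (
        (sum(values[name] for name in team), team)
        for team in itertools.combinations(sorted(values), team_size)
    )
    return heapq.nlargest(k, scored)


def top_teams_pruned(values, team_size, k):
    """Same result, but only searches the best (team_size + k - 1) individuals."""
    pool_size = team_size + k - 1
    pool = heapq.nlargest(pool_size, values, key=values.get)
    return top_teams_bruteforce({name: values[name] for name in pool}, team_size, k)


# Made-up distinct integer scores, so sums compare exactly
random.seed(0)
scores = random.sample(range(1, 1000), 12)
values = {f"mon_{i}": s for i, s in enumerate(scores)}

full = top_teams_bruteforce(values, team_size=3, k=4)
pruned = top_teams_pruned(values, team_size=3, k=4)
print([s for s, _ in full])
print([s for s, _ in pruned])
```

For teams of 6 and a top 10, the pool would hold 15 Pokemon, which gives 5005 combinations instead of 3.31e12. This shortcut stops being valid as soon as the team score is no longer a plain sum, for example if members are meant to cover each other's weaknesses.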
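That said, I think you can remove a lot of Pokemon from your list before getting into the combinatorics. If you have a boolean array of shape (n_pokemons, n_types) in which True marks each Pokemon's weaknesses, you can check whether there is another Pokemon with the same weaknesses but a better bst value.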

# Loop over all pokemon and check if there are other pokemon
# ... with the exact same weaknesses but better stats
#     -name      -weaknesses            -bst
#     pokemon A  [0, 0, 1, 1, 0, ...],  bst=34.85  -> delete A
#     pokemon B  [0, 0, 1, 1, 0, ...],  bst=43.58
# ... with a subset of the weaknesses and better stats
#     pokemon A  [0, 0, 1, 1, 0, ...],  bst=34.85  -> delete A
#     pokemon B  [0, 0, 1, 0, 0, ...],  bst=43.58
I wrote a small snippet with numpy. The bst values and the weaknesses are chosen at random, as shown below. With my settings

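n_pokemons = 1000
n_types = 18
n_min_weaknesses = 1  # number of minimal and maximal weaknesses for each Pokemon
n_max_weaknesses = 4
only about 30-40 Pokemon remain in the list. I am not sure how realistic this is for "real" Pokemon, but with a number like that the combinatorial search becomes much more feasible.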

import numpy as np

# Generate pokemons
name_arr = np.array(['pikabra_{}'.format(i) for i in range(n_pokemons)])

# Random stats
bst_arr = np.random.random(n_pokemons) * 100

# Random weaknesses
weakness_array = np.zeros((n_pokemons, n_types), dtype=bool)  # bool array indicating the weak types of each pokemon
for i in range(n_pokemons):
    rnd_weaknesses = np.random.choice(np.arange(n_types),
                                      np.random.randint(n_min_weaknesses, n_max_weaknesses+1))
    weakness_array[i, rnd_weaknesses] = True

# Remove unnecessary pokemons
i = 0
while i < n_pokemons:
    j = i + 1
    while j < n_pokemons:
        del_idx = None
        combined_weaknesses = np.logical_or(weakness_array[i], weakness_array[j])
        if np.all(weakness_array[i] == weakness_array[j]):
            # Identical weaknesses: drop the one with the lower bst
            if bst_arr[j] < bst_arr[i]:
                del_idx = j
            else:
                del_idx = i
        elif np.all(combined_weaknesses == weakness_array[i]) and bst_arr[i] < bst_arr[j]:
            # j's weaknesses are a subset of i's and j has better stats: drop i
            del_idx = i
        elif np.all(combined_weaknesses == weakness_array[j]) and bst_arr[j] < bst_arr[i]:
            # i's weaknesses are a subset of j's and i has better stats: drop j
            del_idx = j

        if del_idx is not None:
            name_arr = np.delete(name_arr, del_idx, axis=0)
            bst_arr = np.delete(bst_arr, del_idx, axis=0)
            weakness_array = np.delete(weakness_array, del_idx, axis=0)
            n_pokemons -= 1
            if del_idx == i:
                i -= 1  # row i was removed; re-check the pokemon that moved into its place
                break
            else:
                j -= 1  # row j was removed; re-check the pokemon that moved into its place
        j += 1
    i += 1

print(n_pokemons)

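How long does it take if you restrict it to combinations of 2? Or even 1, if possible? I know very little about Pokemon, but the first thing I would say is that there must be a way to avoid iterating over every possible combination of six Pokemon (for example, a team of six fire Pokemon will surely not make it into the 10 best teams). So you could start by trying to come up with a way to get a subset of the 3.31e12 combinations you currently have! Then I would suggest splitting the possible combinations into smaller groups (to avoid running into memory errors) and trying to use NumPy arrays instead of pandas dataframes to vectorize what you want to do.
How is it going, did you manage to get the perfect team?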
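The suggestion above about splitting the combinations into smaller groups and scoring them with NumPy could look roughly like the sketch below. It is only an illustration under assumptions: the function name `top_k_teams_chunked`, the `chunk_size` default, and the toy ratios are all made up, and the team score is taken to be the plain sum of member ratios as in the question.

```python
import itertools

import numpy as np


def top_k_teams_chunked(ratios, team_size, k, chunk_size=100_000):
    """Score teams chunk by chunk and keep only the k best seen so far."""
    ratios = np.asarray(ratios, dtype=float)
    combos = itertools.combinations(range(len(ratios)), team_size)

    best_scores = np.empty(0)
    best_teams = np.empty((0, team_size), dtype=np.int64)

    while True:
        # Pull the next chunk of index tuples; memory use is bounded by chunk_size.
        chunk = list(itertools.islice(combos, chunk_size))
        if not chunk:
            break
        idx = np.array(chunk, dtype=np.int64)   # shape (n_teams, team_size)
        scores = ratios[idx].sum(axis=1)        # vectorized sum of member ratios

        # Merge with the current best and keep the top k (descending).
        all_scores = np.concatenate([best_scores, scores])
        all_teams = np.concatenate([best_teams, idx])
        keep = np.argsort(all_scores)[::-1][:k]
        best_scores, best_teams = all_scores[keep], all_teams[keep]

    return best_scores, best_teams


# Toy usage with made-up ratios; teams are reported as row indices into `ratios`.
toy_ratios = [5.0, 3.0, 8.0, 1.0, 6.0]
best_scores, best_teams = top_k_teams_chunked(toy_ratios, team_size=2, k=3, chunk_size=4)
print(best_scores)
print(best_teams)
```

Chunking keeps memory bounded and moves the summation into NumPy, but every combination is still generated in Python, so this is most useful after the candidate list has been cut down first.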