Python: How to approach a number guessing game (with a twist) algorithm?
Update (July 2020): This question is now 9 years old, but it is still one that I am deeply interested in. Since then, the rise of machine learning (RNNs, CNNs, GANs, etc.), new approaches, and cheap GPUs have made new methods possible. I thought it would be interesting to revisit this question and see whether there are new approaches.

I am learning programming (Python and algorithms) and am trying to work on a project that I find interesting. I have written a few basic Python scripts, but I am not sure how to approach a solution for a game I am trying to build.

How the game will work:

Users will be given items with a value. For example:
Apple = 1
Pears = 2
Oranges = 3
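They will then get a chance to choose any combination of them they like (e.g. 100 apples, 20 pears, and one orange). The only output the computer gets is the total value (in this example, currently $143). The computer will try to guess what they have. Obviously, it will not be able to guess correctly on the first turn.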
          Value    Quantity (day 1)    Value (day 1)
Apple       1            100               100
Pears       2             20                40
Orange      3              1                 3
Total                    121               143
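On the next turn the user can modify their quantities, but by no more than 5% of the total quantity (or some other percentage we may choose; I will use 5% for the example). The prices of fruit can change (at random), so the total value may change based on that as well (for simplicity I am not changing fruit prices in this example). Using the example above, on day 2 of the game the user returns a value of $152, and on day 3 a value of $164. Here is an example: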
          Quantity (day 2)   % change (day 2)   Value (day 2)   Quantity (day 3)   % change (day 3)   Value (day 3)
Apple          104                                   104              106                                   106
Pears           21                                    42               23                                    46
Orange           2                                     6                4                                    12
Total          127               4.96%               152              133               4.72%               164
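*(I hope the tables show up right. I had to space them manually, so hopefully it is not just working on my screen. If it does not work, let me know and I will try to upload a screenshot.)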
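The percentage figures in the tables can be checked with a few lines of Python. This is only a sketch of the 5% rule as described above; the names `total_quantities`, `MAX_CHANGE`, and `pct_change` are made up for illustration, and the rule is assumed to apply to the total number of fruits rather than to each fruit separately.

```python
# Total fruit counts per day, taken from the "Total" rows of the tables.
total_quantities = [121, 127, 133]
MAX_CHANGE = 0.05  # the 5% limit used in the example (an adjustable assumption)


def pct_change(previous, current):
    """Relative change in total quantity from one day to the next."""
    return abs(current - previous) / previous


# One entry per transition: day 1 -> day 2, day 2 -> day 3.
changes = [pct_change(a, b)
           for a, b in zip(total_quantities, total_quantities[1:])]

for day, change in enumerate(changes, start=2):
    print(f"day {day}: {change:.2%} change, allowed: {change <= MAX_CHANGE}")
```

Both transitions stay under the limit, which is what makes the day 2 and day 3 baskets legal moves.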
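I am trying to see whether I can figure out what the quantities are over time (assuming the user has the patience to keep entering numbers). I know that right now my only restriction is that the total cannot change by more than 5%, so I cannot get within 5% accuracy at the moment, and the user would be entering numbers forever.
What I have done so far
Here is my solution so far (not much). Basically, I take all the values and figure out all the possible combinations of them (I am done with this part). Then I put all the possible combinations into a database as dictionaries (for example, for $143 there could be a dictionary entry {apple: 143, Pears: 0, Oranges: 0}, all the way to {apple: 0, Pears: 1, Oranges: 47}). I do this every time I get a new number, so I have a list of all possibilities.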
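As a rough illustration of the enumeration step described above, the candidates for a total of $143 can be listed with two nested loops, because the apple count is fixed once the pears and oranges are chosen. The function name `enumerate_baskets` and the tuple layout (apples, pears, oranges) are assumptions made for this sketch, not part of the original scripts.

```python
def enumerate_baskets(total, prices):
    """Return every (apples, pears, oranges) tuple whose value equals total."""
    apple_price, pear_price, orange_price = prices
    baskets = []
    for oranges in range(total // orange_price + 1):
        after_oranges = total - oranges * orange_price
        for pears in range(after_oranges // pear_price + 1):
            remaining = after_oranges - pears * pear_price
            # The apple count is forced by the remainder, if it divides evenly.
            if remaining % apple_price == 0:
                baskets.append((remaining // apple_price, pears, oranges))
    return baskets


# Prices from the example: Apple = 1, Pears = 2, Oranges = 3.
candidates = enumerate_baskets(143, (1, 2, 3))
print(len(candidates), "candidate baskets for a total of 143")
```

The two extreme entries mentioned above, (143, 0, 0) and (0, 1, 47), both appear in `candidates`, as does the user's real basket (100, 20, 1).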
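Using the rules above, how can I figure out the best possible solution? I think I need a fitness function that automatically compares the data from two days and eliminates any possibilities that differ by more than 5% from the previous day's data.
Questions:
So my question is: the user has changed the total and I have a list of all the possibilities, so how should I deal with this? What do I need to learn? Are there any algorithms or theories I can use that are applicable? Or, to help me understand my mistake, can you suggest what rules I could add to make this goal feasible (if it is not feasible in its current state; I was thinking of adding more fruits and saying they must pick at least 3, etc.)? Also, I have only a vague understanding of genetic algorithms, but I thought I could use them here, if there is something I can use.
I am very eager to learn, so any advice or tips would be greatly appreciated (just please don't tell me this game is impossible).
Update: I received feedback that this is hard to solve. So I thought I would add another condition to the game that does not interfere with what the player is doing (the game stays the same for them): every day the prices of the fruits change (at random). Would that make the problem easier to solve? Within a 5% movement and a certain range of fruit price changes, only a few combinations remain possible over time.
On day 1 anything is possible, and getting a close enough range is almost impossible. But as the prices of fruit change and the user can only choose a 5% change, shouldn't the range get narrower and narrower over time? In the example above, if prices are volatile enough, I think I could brute-force a solution that gives me a range to guess in, but I am trying to figure out whether there is a more elegant solution, or other solutions, that keep narrowing this range over time.
Update 2: After reading and asking around, I believe this is a hidden Markov/Viterbi problem that tracks the changes in fruit prices as well as the total sum (weighting the last data point the heaviest). I am not sure how to apply the relationship, though. I may be wrong about this, but at the very least I am starting to suspect that this is some type of machine learning problem.
Update 3: I created a test case (with smaller numbers) and a generator to help automate the user-generated data, and I am trying to create a graph from it to see what is more likely.
Here is the code, along with the total values and comments on what the user's actual fruit quantities are.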
#!/usr/bin/env python
import itertools

# Fruit price data
fruitPriceDay1 = {'Apple':1, 'Pears':2, 'Oranges':3}
fruitPriceDay2 = {'Apple':2, 'Pears':3, 'Oranges':4}
fruitPriceDay3 = {'Apple':2, 'Pears':4, 'Oranges':5}

# Generate possibilities for testing (warning...will not scale with large numbers)
def possibilityGenerator(target_sum, apple, pears, oranges):
    allDayPossible = {}
    counter = 1
    # Each range holds the possible money amounts spent on one fruit
    apple_range = range(0, target_sum + 1, apple)
    pears_range = range(0, target_sum + 1, pears)
    oranges_range = range(0, target_sum + 1, oranges)
    for i, j, k in itertools.product(apple_range, pears_range, oranges_range):
        if i + j + k == target_sum:
            currentPossible = {}
            # Integer division turns the amount spent back into a quantity
            currentPossible['apple'] = i // apple
            currentPossible['pears'] = j // pears
            currentPossible['oranges'] = k // oranges
            allDayPossible[counter] = currentPossible
            counter = counter + 1
    return allDayPossible

# Total sum being returned by user for value of fruits
totalSumDay1=26 # Computer does not know this but users quantities are apple: 20, pears 3, oranges 0 at the current prices of the day
totalSumDay2=51 # Computer does not know this but users quantities are apple: 21, pears 3, oranges 0 at the current prices of the day
totalSumDay3=61 # Computer does not know this but users quantities are apple: 20, pears 4, oranges 1 at the current prices of the day

graph = {}
graph['day1'] = possibilityGenerator(totalSumDay1, fruitPriceDay1['Apple'], fruitPriceDay1['Pears'], fruitPriceDay1['Oranges'] )
graph['day2'] = possibilityGenerator(totalSumDay2, fruitPriceDay2['Apple'], fruitPriceDay2['Pears'], fruitPriceDay2['Oranges'] )
graph['day3'] = possibilityGenerator(totalSumDay3, fruitPriceDay3['Apple'], fruitPriceDay3['Pears'], fruitPriceDay3['Oranges'] )

# Sample of dict = 1 : {'oranges': 0, 'apple': 0, 'pears': 0}..70 : {'oranges': 8, 'apple': 26, 'pears': 13}
print(graph)
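We will combine graph theory and probability:

On the first day, build the set of all feasible solutions. Let us denote the solution set as A1 = {A1(1), A1(2), …, A1(n)}.

On the second day, you can again build the solution set A2.

Now, for each element in A2, check whether it can be reached from each element of A1 (given the x% tolerance). If it can, connect A2(n) to A1(m). If it cannot be reached from any node in A1(m), you can delete this node.

Basically, we are building a connected directed acyclic graph.

All paths in the graph are equally likely. An exact solution can be found only when there is a single edge from Am to Am+1 (from a node in Am to a node in Am+1).

Of course, some nodes appear in more paths than other nodes. The probability of each node can be determined directly based on the containing…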
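The layered-graph idea can be sketched roughly as follows, using the three-day test case from the question (totals 26, 51, 61 and the matching prices). This is an illustration under assumptions, not the answerer's code: the helper names `feasible_baskets`, `reachable`, `prune_layers`, and `path_counts` are invented here, and "reachable" is assumed to mean that the total item count changes by at most the tolerance.

```python
from itertools import product


def feasible_baskets(total, prices):
    """All quantity tuples whose value at the given prices equals total."""
    ranges = [range(total // p + 1) for p in prices]
    return [q for q in product(*ranges)
            if sum(n * p for n, p in zip(q, prices)) == total]


def reachable(prev, curr, tolerance):
    """True if the total item count changes by at most `tolerance` (a fraction)."""
    prev_count = sum(prev)
    if prev_count == 0:
        return sum(curr) == 0
    return abs(sum(curr) - prev_count) <= tolerance * prev_count


def prune_layers(layers, tolerance):
    """Drop nodes that have no predecessor or no successor in the adjacent layer."""
    layers = [list(layer) for layer in layers]
    for i in range(1, len(layers)):            # forward pass
        layers[i] = [c for c in layers[i]
                     if any(reachable(p, c, tolerance) for p in layers[i - 1])]
    for i in range(len(layers) - 2, -1, -1):   # backward pass
        layers[i] = [p for p in layers[i]
                     if any(reachable(p, c, tolerance) for c in layers[i + 1])]
    return layers


def path_counts(layers, tolerance):
    """For each node, the number of complete day-1-to-last-day paths through it."""
    forward = [{node: 1 for node in layers[0]}]
    for i in range(1, len(layers)):
        forward.append({c: sum(forward[i - 1][p] for p in layers[i - 1]
                               if reachable(p, c, tolerance))
                        for c in layers[i]})
    backward = [None] * len(layers)
    backward[-1] = {node: 1 for node in layers[-1]}
    for i in range(len(layers) - 2, -1, -1):
        backward[i] = {p: sum(backward[i + 1][c] for c in layers[i + 1]
                              if reachable(p, c, tolerance))
                       for p in layers[i]}
    return [{n: forward[i][n] * backward[i][n] for n in layers[i]}
            for i in range(len(layers))]


totals = [26, 51, 61]
prices = [(1, 2, 3), (2, 3, 4), (2, 4, 5)]
layers = prune_layers([feasible_baskets(t, p) for t, p in zip(totals, prices)], 0.05)
counts = path_counts(layers, 0.05)
for day, layer in enumerate(layers, start=1):
    print("day", day, "candidates left:", len(layer))
```

If all paths are treated as equally likely, dividing a node's path count by the total number of paths gives one possible estimate of how likely that basket is on that day.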
from __future__ import division
import random
import numpy
import scipy.stats
import pylab
# Assume Guesser knows prices and total
# Guesser must determine the quantities
# All of pylab is just for graphing, comment out if undesired
# Graphing only graphs first 2 FRUITS (first 2 dimensions)
NUM_FRUITS = 3
MAX_QUANTITY_CHANGE = .01 # Maximum percentage change that total quantity of fruit can change per iteration
MAX_QUANTITY = 100 # Bound for the sake of instantiating variables
MIN_QUANTITY_TOTAL = 10 # Prevent degenerate conditions where quantities all hit 0
MAX_FRUIT_PRICE = 1000 # Bound for the sake of instantiating variables
NUM_PARTICLES = 5000
NEW_PARTICLES = 500 # Num new particles to introduce each iteration after guessing
NUM_ITERATIONS = 20 # Max iterations to run
CHANGE_QUANTITIES = True
CHANGE_PRICES = True
def updateQuantities(quantities):
    '''
    Change individual fruit quantities for a random amount of time
    Never exceed changing fruit quantity by more than MAX_QUANTITY_CHANGE
    '''
    old_total = max(sum(quantities), MIN_QUANTITY_TOTAL)
    new_total = old_total
    max_change = int(old_total * MAX_QUANTITY_CHANGE)
    while random.random() > .005: # Stop Randomly
        change_index = random.randint(0, len(quantities)-1)
        change_val = random.randint(-1*max_change, max_change)
        if quantities[change_index] + change_val >= 0: # Prevent negative quantities
            quantities[change_index] += change_val
            new_total += change_val
            if abs((new_total / old_total) - 1) > MAX_QUANTITY_CHANGE:
                quantities[change_index] -= change_val # Reverse the change
                new_total -= change_val # Keep the running total in sync with the reversal

def totalPrice(prices, quantities):
    return sum(prices*quantities)

def sampleParticleSet(particles, fruit_prices, current_total, num_to_sample):
    # Assign weight to each particle using observation (observation is current_total)
    # Weight is the probability of that particle (guess) given the current observation
    # Determined by looking up the distance from the hyperplane (line, plane, hyperplane) in a
    # probability density fxn for a normal distribution centered at 0
    variance = 2 # Note: scipy.stats.norm.pdf uses this as the scale (standard deviation)
    distances_to_current_hyperplane = [abs(numpy.dot(particle, fruit_prices)-current_total)/numpy.linalg.norm(fruit_prices) for particle in particles]
    weights = numpy.array([scipy.stats.norm.pdf(distance, 0, variance) for distance in distances_to_current_hyperplane])
    weight_sum = sum(weights) # No need to normalize, as relative weights are fine, so just sample un-normalized

    # Create new particle set weighted by weights
    belief_particles = []
    belief_weights = []
    for p in range(0, num_to_sample):
        sample = random.uniform(0, weight_sum)
        # sum across weights until we exceed our sample, the weight we just summed is the index of the particle we'll use
        p_sum = 0
        p_i = -1
        while p_sum < sample:
            p_i += 1
            p_sum += weights[p_i]
        belief_particles.append(particles[p_i])
        belief_weights.append(weights[p_i])
    return belief_particles, numpy.array(belief_weights)

def generateNewParticles(current_total, fruit_prices, num_to_generate):
    '''
    Generates new particles around the equation of the current prices and total (better particle generation than uniformly random)
    '''
    new_particles = []
    max_values = [int(current_total/fruit_prices[n]) for n in range(0,NUM_FRUITS)]
    for p in range(0, num_to_generate):
        new_particle = numpy.array([random.uniform(1,max_values[n]) for n in range(0,NUM_FRUITS)])
        # Solve for the last quantity so the particle lies on the price/total hyperplane
        new_particle[-1] = (current_total - sum([new_particle[i]*fruit_prices[i] for i in range(0, NUM_FRUITS-1)])) / fruit_prices[-1]
        new_particles.append(new_particle)
    return new_particles
# Initialize our data structures:
# Represents users first round of quantity selection
fruit_prices = numpy.array([random.randint(1,MAX_FRUIT_PRICE) for n in range(0,NUM_FRUITS)])
fruit_quantities = numpy.array([random.randint(1,MAX_QUANTITY) for n in range(0,NUM_FRUITS)])
current_total = totalPrice(fruit_prices, fruit_quantities)
success = False

particles = generateNewParticles(current_total, fruit_prices, NUM_PARTICLES) #[numpy.array([random.randint(1,MAX_QUANTITY) for n in range(0,NUM_FRUITS)]) for p in range(0,NUM_PARTICLES)]
guess = numpy.average(particles, axis=0)
guess = numpy.array([int(round(guess[n])) for n in range(0,NUM_FRUITS)])

print("Truth:", str(fruit_quantities))
print("Guess:", str(guess))

pylab.ion()
pylab.draw()
pylab.scatter([p[0] for p in particles], [p[1] for p in particles])
pylab.scatter([fruit_quantities[0]], [fruit_quantities[1]], s=150, c='g', marker='s')
pylab.scatter([guess[0]], [guess[1]], s=150, c='r', marker='s')
pylab.xlim(0, MAX_QUANTITY)
pylab.ylim(0, MAX_QUANTITY)
pylab.draw()

if not (guess == fruit_quantities).all():
    for i in range(0,NUM_ITERATIONS):
        print("------------------------", i)

        if CHANGE_PRICES:
            fruit_prices = numpy.array([random.randint(1,MAX_FRUIT_PRICE) for n in range(0,NUM_FRUITS)])

        if CHANGE_QUANTITIES:
            updateQuantities(fruit_quantities)
            # Particle Filter Prediction (explicit loop, since map() is lazy in Python 3)
            for particle in particles:
                updateQuantities(particle)

        print("Truth:", str(fruit_quantities))
        current_total = totalPrice(fruit_prices, fruit_quantities)

        # Guesser's Turn - Particle Filter:
        # Prediction done above if CHANGE_QUANTITIES is True
        # Update
        belief_particles, belief_weights = sampleParticleSet(particles, fruit_prices, current_total, NUM_PARTICLES-NEW_PARTICLES)
        new_particles = generateNewParticles(current_total, fruit_prices, NEW_PARTICLES)

        # Make a guess:
        guess = numpy.average(belief_particles, axis=0, weights=belief_weights) # Could optimize here by removing outliers or try using median
        guess = numpy.array([int(round(guess[n])) for n in range(0,NUM_FRUITS)]) # convert to integers
        print("Guess:", str(guess))

        pylab.cla()
        #pylab.scatter([p[0] for p in new_particles], [p[1] for p in new_particles], c='y') # Plot new particles
        pylab.scatter([p[0] for p in belief_particles], [p[1] for p in belief_particles], s=belief_weights*50) # Plot current particles
        pylab.scatter([fruit_quantities[0]], [fruit_quantities[1]], s=150, c='g', marker='s') # Plot truth
        pylab.scatter([guess[0]], [guess[1]], s=150, c='r', marker='s') # Plot current guess
        pylab.xlim(0, MAX_QUANTITY)
        pylab.ylim(0, MAX_QUANTITY)
        pylab.draw()

        if (guess == fruit_quantities).all():
            success = True
            break

        # Attach new particles to existing particles for next run:
        belief_particles.extend(new_particles)
        particles = belief_particles
else:
    success = True

if success:
    print("Correct Quantities guessed")
else:
    print("Unable to get correct answer within", NUM_ITERATIONS, "iterations")

pylab.ioff()
pylab.show()
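The weighting step inside sampleParticleSet can be looked at in isolation. The sketch below uses only the standard library and the day 1 numbers from the question (prices 1, 2, 3 and a total of 143). The three example particles and the helper names `distance_to_hyperplane` and `normal_pdf` are made up for illustration, and the spread of 2 is treated as a standard deviation, which is how the scale argument of `scipy.stats.norm.pdf` behaves.

```python
import math


def distance_to_hyperplane(particle, prices, total):
    """Perpendicular distance of a quantity guess from the plane prices . q = total."""
    value = sum(q * p for q, p in zip(particle, prices))
    norm = math.sqrt(sum(p * p for p in prices))
    return abs(value - total) / norm


def normal_pdf(x, sigma):
    """Density of a zero-mean normal distribution with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


prices = [1, 2, 3]
total = 143
# Hypothetical guesses: the first matches the observed total exactly,
# the other two are progressively further from it.
particles = [[100, 20, 1], [90, 20, 1], [50, 50, 10]]

distances = [distance_to_hyperplane(p, prices, total) for p in particles]
raw_weights = [normal_pdf(d, sigma=2.0) for d in distances]
weights = [w / sum(raw_weights) for w in raw_weights]  # normalized for readability

for particle, weight in zip(particles, weights):
    print(particle, round(weight, 4))
```

Particles whose value matches the observed total get the highest weight, so resampling concentrates the particle set near the current price/total hyperplane. Note that every point on that hyperplane is weighted equally, which is why several days of changing prices are needed to narrow the guess down.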
some_percent = 0.05
Day 1: basket: [3 2] prices: [10 7] total_value: 44
Day 2: basket: [x y] prices: [5 5] total_value: 25
Day 3: basket: [2 3] prices: [9 5] total_value: 33
Possible Solutions Day 2: [2 3], [3 2]
import itertools
import numpy as np
def gen_possible_combination(total, prices):
    """
    Generates all possible combinations of numbers of items for
    given prices constraint by total
    """
    nitems = [range(total//p + 1) for p in prices]
    prices_arr = np.array(prices)
    combo = [x for x in itertools.product(
        *nitems) if np.dot(np.array(x), prices_arr) == total]
    return combo


def reduce(combo1, combo2, pct):
    """
    Filters impossible transitions which are greater than pct
    """
    combo = {}
    for x in combo1:
        for y in combo2:
            if abs(sum(x) - sum(y))/sum(x) <= pct:
                combo[y] = 1
    return list(combo.keys())


def gen_items(n, total):
    """
    Generates a list of items
    """
    nums = [0] * n
    t = 0
    i = 0
    while t < total:
        if i < n - 1:
            n1 = np.random.randint(0, total-t)
            nums[i] = n1
            t += n1
            i += 1
        else:
            nums[i] = total - t
            t = total
    return nums


def main():
    pct = 0.05
    i = 0
    done = False
    n = 3
    total_items = 26  # np.random.randint(26)
    combo = None
    while not done:
        prices = [np.random.randint(1, 10) for _ in range(n)]
        items = gen_items(n, total_items)

        total = np.dot(np.array(prices), np.array(items))
        combo1 = gen_possible_combination(total, prices)

        if combo:
            combo = reduce(combo, combo1, pct)
        else:
            combo = combo1
        i += 1
        print(i, 'Items:', items, 'Prices:', prices, 'Total:',
              total, 'No. Possibilities:', len(combo))

        if len(combo) == 1:
            print('Solution', combo)
            break
        if np.random.random() < 0.5:
            total_items = int(total_items * (1 + np.random.random()*pct))
        else:
            total_items = int(
                np.ceil(total_items * (1 - np.random.random()*pct)))


if __name__ == "__main__":
    main()