Python game image recognition (detecting a scored point or game over in Flappy Bird)

python numpy image-processing

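To get familiar with reinforcement learning, I am implementing basic RL algorithms to play the game. Everything else is set up; the only remaining problem is how to implement the reward function. I want to process the screen and recognize whether a point was scored or the bird has died.

The screen is processed with OpenCV, and the result is returned as an array. The reward function then has to assign a reward to that array, but I don't know how to go about it.

This is what a single processed image looks like:

My idea for the reward function is this: if the background stops moving, the bird is dead. If the bird is in the gap between two pipes, the agent has scored a point. How can I express this idea as a numpy computation?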

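import numpy as np

def _calculate_reward(self, state):
    """
    Calculate the reward of the state. Flappy is dead when the screen has stopped moving, i.e. when
    consecutive frames are equal. A point is scored when an obstacle is above Flappy and before it wasn't.
    An obstacle is above Flappy when there are two white pixels (2 * 255 = 510) in the first 50 pixels
    of the first row.

    :param state: np.array shape = (1, height, width, 4) -> four consecutive processed frames
    :return reward: int representing the reward if a point is scored or if Flappy has died.
    """
    frames = state[0]  # shape (height, width, 4)

    # Dead: the three most recent frames are identical. Comparing the frames directly avoids the
    # wrap-around (uint8) and cancellation (signed types) that summing raw differences can suffer from.
    if np.array_equal(frames[:, :, 3], frames[:, :, 2]) and np.array_equal(frames[:, :, 2], frames[:, :, 1]):
        print("flappy is dead")
        return -1000

    # Sum the first 50 pixels of the top row for each frame. Unlike the builtin sum, ndarray.sum
    # accumulates small integer types in a wider integer, so 255 + 255 cannot overflow.
    top = frames[0, :50, :].sum(axis=0)  # shape (4,)
    if top[3] == 510 and top[2] == 510 and top[1] != 510 and top[0] != 510:
        print("point!")
        return 1000
    return 0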
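
As a way to try the idea out without the game running, here is a minimal standalone sketch of the same two checks, exercised on synthetic frames. It is one possible formulation, not a definitive one. The names `frames_static`, `obstacle_overhead`, `calculate_reward`, the `tol` parameter, and the `WHITE` constant are hypothetical, and the sketch assumes the processed frames are `uint8` images in which obstacle pixels are exactly 255. Instead of checking that the row sums to 510, it counts white pixels, which is equivalent only under that assumption.

```python
import numpy as np

WHITE = 255  # assumption: obstacle pixels in the processed frames are exactly 255


def frames_static(state, tol=0):
    """True when the three most recent frames differ by at most `tol` per pixel."""
    frames = state[0].astype(np.int16)  # signed copy, so uint8 differences cannot wrap around
    diffs = np.abs(np.diff(frames[:, :, 1:], axis=-1))  # |f2 - f1| and |f3 - f2|
    return bool(diffs.max() <= tol)


def obstacle_overhead(state, n_cols=50, n_white=2):
    """Per frame: does the top row hold exactly `n_white` white pixels in its first `n_cols` columns?"""
    top = state[0, 0, :n_cols, :]  # shape (n_cols, 4)
    return (top == WHITE).sum(axis=0) == n_white


def calculate_reward(state, death_penalty=-1000, point_reward=1000):
    if frames_static(state):
        return death_penalty
    overhead = obstacle_overhead(state)
    # Rising edge: obstacle overhead in the two newest frames but not in the two oldest.
    if overhead[2] and overhead[3] and not overhead[0] and not overhead[1]:
        return point_reward
    return 0


# Synthetic check: four 4x60 frames stacked as (1, height, width, 4).
state = np.zeros((1, 4, 60, 4), dtype=np.uint8)
print(calculate_reward(state))  # -1000: all frames identical

for t in range(4):
    state[0, 1, t, t] = WHITE  # make every frame differ below the top row
print(calculate_reward(state))  # 0: screen is moving, nothing overhead

state[0, 0, 10, 2:] = WHITE  # two white top-row pixels in the two newest frames
state[0, 0, 20, 2:] = WHITE
print(calculate_reward(state))  # 1000
```

The `tol` parameter is there in case the captured frames carry slight noise, so that a dead screen is still recognized when consecutive frames are not bit-for-bit identical; with `tol=0` the check matches the strict equality used above.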

If you are using OpenCV, how about trying …?

Thanks, this seems to be a push in the right direction!