Python 2.7: "list index out of range" error when using random.choice

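When I run my program I get the error below, raised from the function defined further down. I think

valid_actions = filter(lambda x: x != random.choice(maxQactions)

is the part that causes the error. Does anyone see the problem, or can anyone suggest how to fix it? Thanks.

Error: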

choose_action
    action = random.choice(valid_actions)
  File "/Users/UserName/anaconda/lib/python2.7/random.py", line 275, in choice
    return seq[int(self.random() * len(seq))]  # raises IndexError if seq is empty
IndexError: list index out of range

Code:

def choose_action(self, state):
    self.state = state
    self.next_waypoint = self.planner.next_waypoint()

    action_selections = self.Q[state]
    maxQ = max(action_selections.items(), key=lambda x: x[1])[1]

    maxQactions = []
    for action, Q in self.Q[state].items():
        if Q == maxQ:
            maxQactions.append(action)

    if self.learning:
        choose_using_epsilon = random.random() < 1 - self.epsilon
        if not choose_using_epsilon:
            valid_actions = filter(lambda x: x != random.choice(maxQactions),
                                   Environment.valid_actions)
            action = random.choice(valid_actions)
        else:
            action = random.choice(maxQactions)  # maxQaction
    else:
        action = random.choice(Environment.valid_actions)
    return action

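Per the documentation for random.choice: if seq is empty, random.choice(seq) raises IndexError. In your case the IndexError occurs at

'action = random.choice(valid_actions)'

so I suspect that valid_actions is empty.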
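
One way valid_actions can come out empty, sketched here under assumptions: the lambda passed to filter calls random.choice(maxQactions) again for every element, so each action is compared against a fresh draw. When several actions tie for the maximum Q-value (for example, a Q-table initialised to all zeros), every action can happen to match its own draw, and the filtered list is then empty. The sketch assumes Environment.valid_actions is [None, 'forward', 'left', 'right']. VALID_ACTIONS, the function names and the rng parameter are made up for illustration and are not part of the original code. It uses list comprehensions instead of filter so that it behaves the same on Python 2.7 and Python 3.

```python
import random

# Assumption: stands in for Environment.valid_actions from the question.
VALID_ACTIONS = [None, 'forward', 'left', 'right']


def filter_like_question(max_q_actions, valid_actions, rng=random):
    """Mirror the question's filter: rng.choice runs once per element x."""
    return [x for x in valid_actions if x != rng.choice(max_q_actions)]


def choose_exploratory_action(max_q_actions, valid_actions, rng=random):
    """One possible fix: draw the greedy action once, then exclude it."""
    greedy = rng.choice(max_q_actions)
    candidates = [x for x in valid_actions if x != greedy]
    # Fall back to the full list rather than calling choice() on an empty one.
    return rng.choice(candidates or list(valid_actions))


if __name__ == "__main__":
    # All four actions tie for max Q, so each trial can come out empty
    # with probability (1/4)**4 = 1/256.
    trials = 10000
    empties = sum(
        1 for _ in range(trials)
        if not filter_like_question(VALID_ACTIONS, VALID_ACTIONS)
    )
    print("empty results: %d of %d" % (empties, trials))
    print("fixed choice: %r" % (choose_exploratory_action(VALID_ACTIONS, VALID_ACTIONS),))
```

Drawing the greedy action once keeps the candidate list non-empty whenever there are at least two distinct valid actions. The `or` fallback covers the remaining case.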


Most likely, maxQactions is empty. Can you check it?
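
A quick way to check this is to build the list in isolation and inspect it. The helper below is a hypothetical stand-in for the loop in choose_action and assumes self.Q[state] is a plain dict mapping actions to numeric Q-values. As far as I can tell, such a list is only empty when the dict itself is empty or its maximum is NaN, so if this check passes, the empty sequence is more likely valid_actions.

```python
def actions_with_max_q(q_for_state):
    """Return every action whose Q-value ties for the maximum."""
    if not q_for_state:
        return []
    max_q = max(q_for_state.values())
    return [action for action, q in q_for_state.items() if q == max_q]


if __name__ == "__main__":
    # Hypothetical Q-values for one state, all initialised to zero.
    q_row = {None: 0.0, 'forward': 0.0, 'left': 0.0, 'right': 0.0}
    best = actions_with_max_q(q_row)
    print("maxQactions has %d entries" % len(best))
```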
