Python: module 'plotting' has no attribute 'EpisodeStats'

I am writing Q-learning RL code for my own custom environment, but I get this error: module 'plotting' has no attribute 'EpisodeStats'. Here is my Q-learning code:

# Run this in a shell, not inside the Python script: pip install plotting
import itertools 
import pandas as pd 
from collections import defaultdict 
import json
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense
from keras.optimizers import sgd
from FooEnv import FooEnv
import random
import sys 
sys.setrecursionlimit(10**6)
import time
import os
import matplotlib
from collections import deque, namedtuple
import plotting 
matplotlib.style.use('ggplot')

real_time_info = [0.0, 0.0, 0.0, 0.0]
start = [0.0, 0.0, 0.0, 0.0, 0.0, 1.0,1.0]
env = FooEnv(start,real_time_info)
num_actions=17
num_episodes=1000

def createEpsilonGreedyPolicy(Q, epsilon, num_actions): 
    """ 
    Creates an epsilon-greedy policy based 
    on a given Q-function and epsilon. 

    Returns a function that takes the state 
    as an input and returns the probabilities 
    for each action in the form of a numpy array  
    of length of the action space(set of possible actions). 
    """
    def policyFunction(state): 

        Action_probabilities = np.ones(num_actions, 
                dtype = float) * epsilon / num_actions 

        best_action = np.argmax(Q[state]) 
        Action_probabilities[best_action] += (1.0 - epsilon) 
        return Action_probabilities 

    return policyFunction


def qLearning(env, num_episodes, discount_factor = 1.0, 
                            alpha = 0.6, epsilon = 0.1): 
    """ 
    Q-Learning algorithm: Off-policy TD control. 
    Finds the optimal greedy policy while
    following an epsilon-greedy policy."""

    # Action value function 
    # A nested dictionary that maps 
    # state -> (action -> action-value). 
    Q = defaultdict(lambda: np.zeros(env.action_space.n)) 

    # Keeps track of useful statistics 
    stats = plotting.EpisodeStats( 
        episode_lengths = np.zeros(num_episodes), 
        episode_rewards = np.zeros(num_episodes))    

    # Create an epsilon greedy policy function 
    # appropriately for environment action space 
    policy = createEpsilonGreedyPolicy(Q, epsilon, env.action_space.n) 

    # For every episode 
    for ith_episode in range(num_episodes): 

        # Reset the environment and pick the first action 
        state = env.reset() 

        for t in itertools.count(): 

            # get probabilities of all actions from current state 
            action_probabilities = policy(state) 

            # choose action according to 
            # the probability distribution 
            action = np.random.choice(np.arange( 
                    len(action_probabilities)), 
                    p = action_probabilities) 

            # take action and get reward, transit to next state 
            next_state, reward, done, _ = env.step(action) 

            # Update statistics 
            stats.episode_rewards[ith_episode] += reward 
            stats.episode_lengths[ith_episode] = t 

            # TD Update 
            best_next_action = np.argmax(Q[next_state])  
            td_target = reward + discount_factor * Q[next_state][best_next_action] 
            td_delta = td_target - Q[state][action] 
            Q[state][action] += alpha * td_delta 

            # done is True if episode terminated 
            if done: 
                break

            state = next_state 

    return Q, stats

This is code I found on the internet, and I think it should work fine. But I get an error on this line:

    AttributeError: module 'plotting' has no attribute 'EpisodeStats'

stats = plotting.EpisodeStats(
    episode_lengths = np.zeros(num_episodes),
    episode_rewards = np.zeros(num_episodes))

I would appreciate any suggestions.

I had the same problem. Then I found this version of plotting.py on GitHub.

After replacing the original plotting.py with that file, the problem was fixed.
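
If swapping the file is not an option, a minimal workaround sketch is to define the missing name yourself. This assumes that EpisodeStats is nothing more than a named container with the two fields the question passes as keyword arguments (episode_lengths and episode_rewards); that is inferred from how it is called above, not confirmed from the replacement file. Plain lists are used so the sketch has no dependencies; the np.zeros(num_episodes) arrays from the question can be passed in the same way.

```python
from collections import namedtuple

# Assumption: a stand-in for the missing plotting.EpisodeStats, modelled only
# on the keyword arguments used in the question's qLearning function.
EpisodeStats = namedtuple("EpisodeStats", ["episode_lengths", "episode_rewards"])

num_episodes = 1000

# One slot per episode, filled in as training runs.
stats = EpisodeStats(
    episode_lengths=[0.0] * num_episodes,
    episode_rewards=[0.0] * num_episodes,
)

# The updates inside the training loop work unchanged, because only the
# list contents are mutated, never the (immutable) tuple fields themselves.
stats.episode_rewards[0] += 1.5
stats.episode_lengths[0] = 3
```

With this in place, the line in qLearning would read stats = EpisodeStats(...) instead of stats = plotting.EpisodeStats(...).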

Is plotting a custom class or part of some standard library? Could you add a link to the file that contains the plotting class?
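
One way to answer that is to ask Python which file it actually imports under the name plotting and whether that module exposes EpisodeStats. The sketch below uses only the standard library; locate_module is a hypothetical helper name introduced here for illustration.

```python
import importlib
import importlib.util


def locate_module(module_name, attribute):
    """Return (file the module is loaded from, whether it has `attribute`).

    Hypothetical helper: returns (None, False) if the module cannot be found.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None, False
    module = importlib.import_module(module_name)
    return spec.origin, hasattr(module, attribute)


origin, has_attr = locate_module("plotting", "EpisodeStats")
print("plotting is loaded from:", origin)
print("has EpisodeStats:", has_attr)
```

If the reported path points into site-packages rather than your project folder and the attribute is missing, the module installed with pip is probably not the plotting.py that the tutorial code expects.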