PyMC3: Hierarchical Rugby Model?

I have just started reading up on PyMC3 (I am more familiar with sklearn) and came across a hierarchical model for rugby scores.

Here is the main PyMC3 model setup:

import pymc3 as pm
import theano.tensor as tt

# Assumes num_teams, home_team, away_team, observed_home_goals and
# observed_away_goals have already been built from the match data.
with pm.Model() as model:
    # Global model parameters
    home = pm.Normal('home', 0, tau=.0001)
    tau_att = pm.Gamma('tau_att', .1, .1)
    tau_def = pm.Gamma('tau_def', .1, .1)
    intercept = pm.Normal('intercept', 0, tau=.0001)

    # Team-specific model parameters
    atts_star = pm.Normal('atts_star', mu=0, tau=tau_att, shape=num_teams)
    defs_star = pm.Normal('defs_star', mu=0, tau=tau_def, shape=num_teams)

    # Centre the team effects so that they sum to zero
    atts = pm.Deterministic('atts', atts_star - tt.mean(atts_star))
    defs = pm.Deterministic('defs', defs_star - tt.mean(defs_star))
    home_theta = tt.exp(intercept + home + atts[home_team] + defs[away_team])
    away_theta = tt.exp(intercept + atts[away_team] + defs[home_team])

    # Likelihood of observed data
    home_points = pm.Poisson('home_points', mu=home_theta, observed=observed_home_goals)
    away_points = pm.Poisson('away_points', mu=away_theta, observed=observed_away_goals)

    start = pm.find_MAP()
    step = pm.NUTS(state=start)
    # Initial values are passed via `start`; `init` names an initialisation method
    trace = pm.sample(20000, step, start=start)
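
Written out, the code above appears to describe the following log-linear Poisson model. The game index $g$ and the lookups $h(g)$ and $a(g)$ (home and away team of game $g$) are notation introduced here for illustration only. `tau` is read as a precision (inverse variance), and `pm.Gamma('tau_att', .1, .1)` as shape 0.1 and rate 0.1.

```latex
\begin{aligned}
\text{home},\ \text{intercept} &\sim \mathcal{N}(0,\ \tau = 10^{-4}) \\
\tau_{att},\ \tau_{def} &\sim \mathrm{Gamma}(0.1,\ 0.1) \\
att^{*}_{t} \sim \mathcal{N}(0,\ \tau_{att}), &\quad def^{*}_{t} \sim \mathcal{N}(0,\ \tau_{def}) \\
att_{t} = att^{*}_{t} - \overline{att^{*}}, &\quad def_{t} = def^{*}_{t} - \overline{def^{*}} \\
\log \theta^{home}_{g} &= \text{intercept} + \text{home} + att_{h(g)} + def_{a(g)} \\
\log \theta^{away}_{g} &= \text{intercept} + att_{a(g)} + def_{h(g)} \\
y^{home}_{g} \sim \mathrm{Poisson}(\theta^{home}_{g}), &\quad y^{away}_{g} \sim \mathrm{Poisson}(\theta^{away}_{g})
\end{aligned}
```
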
I know how to plot the trace:

pm.traceplot(trace[5000:])
This produces the trace plot.

What I am unsure about is how to ask questions of the model and its posterior.

For example, I assume the distribution of scores for the Wales vs Italy match would be:

# Wales vs Italy is the first matchup in our dataset
home_wales = ppc['home_points'][:, 0]
away_italy = ppc['away_points'][:, 0]
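But what about matchups that were not recorded in the original data?

  • If Italy played France at home, what would the distribution of their scores look like?
  • If Italy played France at home, how often would both teams score fewer than 15 points?

Thanks for any help or insight.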
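
The `ppc` object indexed in the question is not defined in the snippets shown. It is presumably the result of posterior predictive sampling (in PyMC3 of that era this would typically be something like `pm.sample_ppc(trace, model=model)`, which is an assumption here, not something the question states). The NumPy sketch below only imitates the layout of such a result, using made-up draws of the scoring rate, to show why `[:, 0]` picks out the first match. The names `theta_draws`, `n_draws` and `n_matches` and all numbers are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

n_draws, n_matches = 1000, 3  # hypothetical sizes, not from the dataset

# Hypothetical posterior draws of the home scoring rate for each match,
# standing in for the `home_theta` values implied by a real trace.
theta_draws = rng.gamma(shape=20.0, scale=1.0, size=(n_draws, n_matches))

# One simulated score per posterior draw and per match, which mirrors the
# (draws, matches) layout of a posterior predictive sample.
ppc_home_points = rng.poisson(theta_draws)

# Column 0 holds every simulated home score for the first match in the data.
first_match = ppc_home_points[:, 0]
print(first_match.shape, first_match.mean())
```

Because each column corresponds to one observed match, this indexing only reaches matchups that are already in the data, which is why an unseen matchup needs a different approach.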

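After reading through the whole thing, I am fairly sure I have figured this out. Answering the questions in order:

  • Yes, that is the score distribution for the Wales vs Italy match (because it is the first match in the observed data).

  • To predict an Italy vs France match (these two teams have not played each other in our original dataset), we need to predict new thetas for that matchup.

  • Here is the code for the updated model: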

    # Setup the model similarly to the previous one...
    with pm.Model() as model:
        # Global model parameters
        home = pm.Normal('home', 0, tau=.0001)
        tau_att = pm.Gamma('tau_att', .1, .1)
        tau_def = pm.Gamma('tau_def', .1, .1)
        intercept = pm.Normal('intercept', 0, tau=.0001)
    
        # Team-specific model parameters
        atts_star = pm.Normal('atts_star', mu=0, tau=tau_att, shape=num_teams)
        defs_star = pm.Normal('defs_star', mu=0, tau=tau_def, shape=num_teams)
    
        atts = pm.Deterministic('atts', atts_star - tt.mean(atts_star))
        defs = pm.Deterministic('defs', defs_star - tt.mean(defs_star))
        home_theta = tt.exp(intercept + home + atts[home_team] + defs[away_team])
        away_theta = tt.exp(intercept + atts[away_team] + defs[home_team])
    
        # Likelihood of observed data
        home_points = pm.Poisson('home_points', mu=home_theta, observed=observed_home_goals)
        away_points = pm.Poisson('away_points', mu=away_theta, observed=observed_away_goals)
    
    # Now for predictions with no games played...
    with model:
        # IDs from `teams` DataFrame
        italy, france = 4, 1
        # New `thetas` for Italy vs France predictions
        pred_home_theta = tt.exp(intercept + home + atts[italy] + defs[france])
        pred_away_theta = tt.exp(intercept + atts[france] + defs[italy])
        pred_home_points = pm.Poisson('pred_home_points', mu=pred_home_theta)
        pred_away_points = pm.Poisson('pred_away_points', mu=pred_away_theta)
    
    # Sample the final model
    with model:
        start = pm.find_MAP()
        step = pm.NUTS(state=start)
        # Initial values are passed via `start`; `init` names an initialisation method
        trace = pm.sample(20000, step, start=start)
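
As a possible alternative (a sketch under stated assumptions, not the method used in this answer), the prediction nodes depend only on quantities that are already in the posterior, so similar predictions can be simulated from the posterior draws after sampling. The `trace` dictionary below is a synthetic stand-in with made-up values, and `predict_scores` is a hypothetical helper. With a real fit, `trace["atts"]` would be expected to have shape `(n_draws, num_teams)`.

```python
import numpy as np

rng = np.random.default_rng(42)
n_draws, num_teams = 2000, 6   # hypothetical sizes
italy, france = 4, 1           # team IDs as used in the answer

# Synthetic stand-in for a fitted trace; the values are invented
# purely so that the example runs on its own.
trace = {
    "intercept": rng.normal(3.0, 0.05, n_draws),
    "home": rng.normal(0.2, 0.05, n_draws),
    "atts": rng.normal(0.0, 0.1, (n_draws, num_teams)),
    "defs": rng.normal(0.0, 0.1, (n_draws, num_teams)),
}


def predict_scores(trace, home_id, away_id, rng):
    """Draw one simulated (home, away) score per posterior draw."""
    home_theta = np.exp(trace["intercept"] + trace["home"]
                        + trace["atts"][:, home_id] + trace["defs"][:, away_id])
    away_theta = np.exp(trace["intercept"]
                        + trace["atts"][:, away_id] + trace["defs"][:, home_id])
    return rng.poisson(home_theta), rng.poisson(away_theta)


italy_pts, france_pts = predict_scores(trace, italy, france, rng)
print(italy_pts.mean(), france_pts.mean())
```

This keeps the fitted model unchanged and allows any matchup to be queried without re-running the sampler.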
    
    Once the trace has finished sampling, we can plot the predictions:

    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    # Use 5,000 as MCMC burn in
    pred = pd.DataFrame({
        "italy": trace["pred_home_points"][5000:],
        "france": trace["pred_away_points"][5000:],
    })
    # Plot the distributions
    sns.kdeplot(pred.italy, shade=True, label="Italy")
    sns.kdeplot(pred.france, shade=True, label="France")
    plt.show()
    

    How often does Italy win at home?

    # 19% of the time
    (pred.italy > pred.france).mean()
    
    How often do both teams score 15 points or fewer?

    # 0.7% of the time
    ((pred.italy <= 15) & (pred.france <= 15)).mean()
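
Both summaries above are the mean of a boolean mask over the predictive draws. The sketch below makes that explicit and adds a rough Monte Carlo standard error, which assumes independent draws (MCMC output only approximates this, so the error is likely understated). The arrays `italy` and `france` are synthetic Poisson draws with made-up rates, not the fitted results, and `event_probability` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for pred.italy and pred.france (rates are invented).
italy = rng.poisson(18.0, 15000)
france = rng.poisson(25.0, 15000)


def event_probability(mask):
    """Return (estimate, naive standard error) for a boolean event mask."""
    mask = np.asarray(mask, dtype=float)
    p = mask.mean()
    se = np.sqrt(p * (1.0 - p) / mask.size)
    return p, se


p_win, se_win = event_probability(italy > france)
p_low, se_low = event_probability((italy <= 15) & (france <= 15))
print(f"P(Italy wins)      ~ {p_win:.3f} +/- {se_win:.3f}")
print(f"P(both 15 or less) ~ {p_low:.3f} +/- {se_low:.3f}")
```

The same expressions work unchanged on the columns of the `pred` DataFrame.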
    
I think this is great. Why not add it to the PyMC3 documentation?