How do I iterate a formula over a Python dictionary and save the results in a DataFrame?

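I have a dictionary called "reviews". It maps each review number to the words in that review, and each word to a pair of values [neg, pos].

For each review in the dictionary (1 and 2 in this example), I need to iterate two formulas over the values of the words. The formulas compute a 'neg_post_prob' and a 'pos_post_prob' for each review.

The formulas are:

  • 'neg_post_prob' = (neg_prior * neg) / (neg_prior * neg + pos_prior * pos)
  • 'pos_post_prob' = (pos_prior * pos) / (neg_prior * neg + pos_prior * pos)

where:

  • 'neg_prior' is the 'neg_post_prob' computed in the previous word's iteration, and
  • 'pos_prior' is the 'pos_post_prob' computed in the previous word's iteration.

For the first word of each review, both priors should equal 0.5.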

Here is my code, shown for review 1:

    #Review 1: 
    
    # the prior before starting the iteration is 0.5
    prior = 0.5
    
    # priors after the first word "like"
    neg_prior_like = (prior*0.0005) / (prior * 0.0005 + prior * 0.0025)
    pos_prior_like = (prior*0.0025) / (prior * 0.0005 + prior * 0.0025)
    
    
    # priors after the second word "the"
    neg_prior_like_the = (neg_prior_like * 0.5) / (neg_prior_like * 0.5 + pos_prior_like * 0.5)
    pos_prior_like_the = (pos_prior_like * 0.5) / (neg_prior_like * 0.5 + pos_prior_like * 0.5)
    
    
    # post_prob after last word "acting"
    neg_post_prob = (neg_prior_like_the * 0.5) / (neg_prior_like_the * 0.5 + pos_prior_like_the * 0.5)
    pos_post_prob = (pos_prior_like_the * 0.5) / (neg_prior_like_the * 0.5 + pos_prior_like_the * 0.5)
    
    
    validation = neg_post_prob + pos_post_prob
    
But the result I expect is:

    import pandas as pd
    
    sentiment = {'review': [1, 2],
        'neg_post_prob': [0.17, 0.94],
        'pos_post_prob': [0.83, 0.06],
        'validation': [1, 1]
        }
    
    sentiment = pd.DataFrame(sentiment, columns = ['review', 'neg_post_prob', 'pos_post_prob', 'validation'])
    
    print (sentiment)
    

Use reduce from the functools module.
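As a brief aside on the mechanics: reduce(f, seq, start) carries a running value through a sequence, calling f(running, item) for each item and feeding the result into the next call. The sketch below writes that out as a plain loop and checks it against reduce. The names step, words, and start are illustrative only, and the values are those of review 1.

```python
from functools import reduce

def step(acc, item):
    """Illustrative update: combine the running (neg, pos) pair with one word's pair."""
    neg, pos = acc
    total = neg * item[0] + pos * item[1]   # shared denominator
    return (neg * item[0] / total, pos * item[1] / total)

words = [[0.0005, 0.0025], [0.5, 0.5], [0.5, 0.5]]   # review 1's [neg, pos] values
start = (0.5, 0.5)                                   # priors before the first word

# What reduce does, written out as an explicit loop
acc = start
for item in words:
    acc = step(acc, item)

via_reduce = reduce(step, words, start)
print(via_reduce == acc)
```

Each word's posterior becomes the next word's prior, which is exactly the chaining the question does by hand.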

Code

    from functools import reduce
    import pandas as pd
    
    def update(priors, values):
        '''
            Return the updated (neg, pos) probabilities for one word,
            given the previous (neg, pos) pair
        '''
        # Previous neg, pos pair
        neg, pos = priors
        
        # New negative and positive (using OP update equation)
        scale = neg * values[0] + pos * values[1]    # denominator
        new_neg = (neg * values[0]) / scale
        new_pos = (pos * values[1]) / scale
        return new_neg, new_pos                      # new update pair
        
    def calc(reviews):
        ''' Main function to perform calculations and 
            produce pandas data frame
        '''
        sentiment = {'review':[],
                     'neg_post_prob': [],
                     'pos_post_prob': [],
                     'validation': []}
        
        for review_id, word_values in reviews.items():
            # word_values maps each word in the review to its [neg, pos] pair
            values = word_values.values()  # view of the [neg, pos] pairs
            
            # Use reduce to apply the update function iteratively over the pairs,
            # starting from priors of 0.5 for both classes
            result = reduce(update, values, [0.5, 0.5])
            neg, pos = result
            validation = neg + pos
            
            # Update results
            sentiment['review'].append(review_id)
            sentiment['neg_post_prob'].append(neg)
            sentiment['pos_post_prob'].append(pos)
            sentiment['validation'].append(validation)
            
        
        return pd.DataFrame(sentiment)
            
    
Test

    reviews= {1: {'like': [0.0005, 0.0025], 'the': [0.5, 0.5], 'acting': [0.5, 0.5]},
              2: {'plot': [0.5, 0.5], 'hate': [0.0029, 0.0002], 'story': [0.5, 0.5]}}
    
    df = calc(reviews)
    
    df

Output

       review  neg_post_prob  pos_post_prob  validation
    0       1       0.166667       0.833333         1.0
    1       2       0.935484       0.064516         1.0
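If the intermediate priors are also of interest (the question's neg_prior_like, neg_prior_like_the, and so on), itertools.accumulate yields every running value instead of only the last one. The following is a sketch under assumptions, not part of the answer above: step and trace are hypothetical helper names, and the initial argument requires Python 3.8 or later.

```python
from itertools import accumulate

def step(acc, item):
    """Illustrative update for one word, following the question's formulas."""
    neg, pos = acc
    total = neg * item[0] + pos * item[1]   # shared denominator
    return (neg * item[0] / total, pos * item[1] / total)

def trace(word_values, prior=0.5):
    """Return {word: (neg, pos)} holding the pair after each word, in order."""
    states = list(accumulate(word_values.values(), step, initial=(prior, prior)))
    return dict(zip(word_values, states[1:]))   # states[0] is the starting prior

review1 = {'like': [0.0005, 0.0025], 'the': [0.5, 0.5], 'acting': [0.5, 0.5]}
review2 = {'plot': [0.5, 0.5], 'hate': [0.0029, 0.0002], 'story': [0.5, 0.5]}

for word, (neg, pos) in trace(review1).items():
    print(word, round(neg, 2), round(pos, 2))
```

Rounded to two decimals, the pair after the last word of review 1 matches the 0.17 and 0.83 in the question's expected table. Words with equal values such as [0.5, 0.5] leave the running pair unchanged, so only 'like' and 'hate' move the result.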