Taking the average score from a given file in Python


python string dictionary


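I'm writing a function that takes (string, dict) as input and returns a float. The function takes the text from the file to be evaluated and a dictionary of individual words as input. It must return the score for the whole text; that is, the score is the average of the scores of the words that occur in it.

I have a .csv file containing a list of words, each with a score and a standard deviation. Each line of the file has the following form: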

word{TAB}score{TAB}standard_deviation

I convert all the letters to lowercase and try to take the average of all the scores.
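As a rough illustration of the line format described above, here is a minimal Python 3 sketch that turns such lines into a dictionary. The sample lines, the helper name parse_line, and the choice to store (score, std) tuples are assumptions made for the example, not part of the original question.

```python
# Hypothetical sample lines in the word<TAB>score<TAB>standard_deviation format.
sample_lines = [
    "Happy\t8.30\t0.99\n",
    "rain\t5.08\t2.51\n",
]

def parse_line(line):
    """Split one tab-separated line into (word, score, std)."""
    word, score, std = line.rstrip("\n").split("\t")
    # Lowercase the word so lookups match lowercased text.
    return word.lower(), float(score), float(std)

scores = {}
for line in sample_lines:
    word, score, std = parse_line(line)
    scores[word] = (score, std)

print(scores)
```

With real data, the list of sample lines would be replaced by iterating over the opened file.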

This is what I have so far, but I can't work out the right way to get the average:

def happiness_score(string , dict):
   sum = 0
   for word in string:
      dict = dict()
      if word in dict:
         sum += word
         word = string.lower()
         word,score,std = line.split()
         d[word]=float(score),float(std)
   return sum/len(dict)

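I'm not sure exactly what math you are trying to do. I'm also not sure whether you are able to read the file.

But hopefully this gives you some guidance: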

# to hold your variables
holder_dict = {}

# read the file:
with open("/path/to/file.csv", 'r') as csv_read:
    for line in csv_read.readlines():
        word, score, std = line.split('\t')
        # float() ignores the trailing newline left on std
        if word in holder_dict:
            holder_dict[word][0] += [float(score)]
            holder_dict[word][1] += [float(std)]
        else:
            holder_dict[word] = [[float(score)], [float(std)]]

# get average score
for word in holder_dict:
    average_score = sum(holder_dict[word][0]) / len(holder_dict[word][0])
    # parenthesised print works on both Python 2 and Python 3
    print("average score for word: %s is %.3f" % (word, average_score))

From what I understand of your explanation, this may be what you need:

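import csv

def happiness_score(string, score_dict):
    total = 0.0
    count = 0
    # lowercase the text and look up each whitespace-separated word
    for word in string.lower().split():
        if word in score_dict:
            total += score_dict[word]
            count += 1
    # no known words: return 0.0 rather than dividing by zero
    if count == 0:
        return 0.0
    return total / count

def compile_score_dict(filename):
    score_dict = {}
    with open(filename) as csvfile:
        reader = csv.reader(csvfile, delimiter='\t')
        for row in reader:
            # column 0 is the word, column 1 its score (the std in column 2 is not needed here)
            score_dict[row[0].lower()] = float(row[1])
    return score_dict

score_dict = compile_score_dict('filename.csv')
happiness_score('String to find score', score_dict)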
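One thing a plain split() does not handle is punctuation: "happy," would not match the key "happy". The following Python 3 sketch shows one possible way to deal with that. The names tokenize and average_score, the made-up scores, and the decision to return 0.0 when no word is known are all assumptions for illustration.

```python
import string

def tokenize(text):
    """Lowercase the text and strip surrounding punctuation from each word."""
    words = (w.strip(string.punctuation) for w in text.lower().split())
    return [w for w in words if w]

def average_score(text, score_dict):
    """Mean score of the words in text that appear in score_dict."""
    found = [score_dict[w] for w in tokenize(text) if w in score_dict]
    if not found:
        return 0.0  # assumption: no known words gives 0.0
    return sum(found) / len(found)

# Made-up scores, for illustration only.
example_scores = {"happy": 8.0, "rain": 5.0}
result = average_score("Happy, happy RAIN!", example_scores)
print(result)  # (8.0 + 8.0 + 5.0) / 3 = 7.0
```

Each occurrence counts separately here, so a word that appears twice contributes twice to the average.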

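Don't use data types as variable names (e.g. dict); it is very confusing. Also, the if always evaluates to False, because you reset dict to an empty dictionary for every word in the string.

Could you post your current code? If you are posting a sample, please reproduce it accurately (e.g. line.split(), dict = dict(), etc.); this cannot possibly run!

You don't need readlines(). You can just do "for line in csv". You can also use a defaultdict to avoid checking whether the word is already in the dictionary before loading it.

@juniper Yes. But readlines(), if I'm not mistaken, already does the .strip() for you. Otherwise you would have to handle the .strip() in an ugly way. As far as I know, there is no defaultdict for a list of lists.

According to the docs, readlines() does not do the strip() for you. You can set up the defaultdict as "a = defaultdict(lambda: [0,0])".

What Python are you using? I have 2.6 :(

I don't understand your happiness score at all. It isn't really doing anything. You are just counting all the scores; it has nothing to do with each word's score.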
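For what it's worth, a defaultdict can hold a list of lists if its factory is a lambda that builds a fresh pair of lists. This is a minimal Python 3 sketch with hypothetical rows standing in for the parsed file; the variable names are illustrative only.

```python
from collections import defaultdict

# Hypothetical rows (word, score, std); the same word may appear more than once.
rows = [("happy", 8.0, 1.0), ("rain", 5.0, 2.5), ("happy", 7.0, 1.2)]

# Each new key starts with two empty lists: [scores, stds].
holder = defaultdict(lambda: [[], []])
for word, score, std in rows:
    holder[word][0].append(score)
    holder[word][1].append(std)

# Average the collected scores per word.
averages = {word: sum(vals[0]) / len(vals[0]) for word, vals in holder.items()}
print(averages)
```

This removes the need for the "if word in holder_dict ... else" branch, since a missing key is created on first access.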