Python - scipy pdist between columns in a DataFrame of dictionaries


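I am working on a program that computes the Euclidean distance between movie reviews. I want to compute the distance between one given reviewer and another given reviewer, and also between one given reviewer and all of the others. I have the data in a DataFrame built from a dictionary of dictionaries, like this: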

{
    'Nancy Pollock': {
        'Lawrence of Arabia': 2.5,
        'Gravity': 3.5,
        'The Godfather': 3.0,
        'Prometheus': 3.5,
        'For a Few Dollars More': 2.5,
        'The Guns of Navarone': 3.0
    },
    'Jack Holmes': {
        'Lawrence of Arabia': 3.0,
        'Gravity': 3.5,
        'The Godfather': 1.5,
        'Prometheus': 5.0,
        'The Guns of Navarone': 3.0,
        'For a Few Dollars More': 3.5
    },
    'Mary Doyle': {
        'Lawrence of Arabia': 2.5,
        'Gravity': 3.0,
        'Prometheus': 3.5,
        'The Guns of Navarone': 4.0
    },
    'Doug Redpath': {
        'Gravity': 3.5,
        'The Godfather': 3.0,
        'The Guns of Navarone': 4.5,
        'Prometheus': 4.0,
        'For a Few Dollars More': 2.5
    },
    'Jill Brown': {
        'Lawrence of Arabia': 3.0,
        'Gravity': 4.0,
        'The Godfather': 2.0,
        'Prometheus': 3.0,
        'The Guns of Navarone': 3.0,
        'For a Few Dollars More': 2.0
    },
    'Trevor Chappell': {
        'Lawrence of Arabia': 3.0,
        'Gravity': 4.0,
        'The Guns of Navarone': 3.0,
        'Prometheus': 5.0,
        'For a Few Dollars More': 3.5
    },
    'Peter': {
        'Gravity': 4.5,
        'For a Few Dollars More': 1.0,
        'Prometheus': 4.0
    }
}
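I'm fairly lost here, but what I'd like to know is how to write a function that turns each dictionary into a format pdist can use. After that I can work out how to iterate over it. The code I have so far is: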

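import ast

import pandas as pd
from scipy.spatial.distance import pdist, squareform

# reviews.txt holds the dictionary literal shown above
with open("reviews.txt") as f:
    d = ast.literal_eval(f.read())  # parses the literal without executing arbitrary code
df = pd.DataFrame(d)  # columns are reviewers, rows are movies
print(df)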
def getSimilarity():
    EcDist = pd.DataFrame(index=df.index) #container for results
    movieArray = df.values
    #some way of turning it into a format pdist can use
    EcDist = pdist#etc
    return EcDist

def getSimilarities():
    EcDist2 = pd.DataFrame(index=df.index)
    movieArrays = df.values
    #some way of turning it into a format pdist can use
    EcDist2 = pdist#etc
    return EcDist2
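
One possible way to fill in the "format pdist can use" step, as a rough sketch rather than a definitive answer: pdist compares the rows of a 2-D array, so the DataFrame is transposed to get one row per reviewer. The sketch assumes that movies not rated by everyone can simply be dropped, because NaN values would otherwise propagate into the distances. The names `ratings`, `get_similarity` and `get_similarities` are illustrative, and only a subset of the data above is inlined to keep it short.

```python
import pandas as pd
from scipy.spatial.distance import pdist, squareform

# Subset of the ratings shown above (illustrative; the real data lives in reviews.txt).
ratings = {
    'Nancy Pollock': {'Lawrence of Arabia': 2.5, 'Gravity': 3.5, 'Prometheus': 3.5},
    'Jack Holmes': {'Lawrence of Arabia': 3.0, 'Gravity': 3.5, 'Prometheus': 5.0},
    'Peter': {'Gravity': 4.5, 'Prometheus': 4.0},
}

# A dict of dicts puts reviewers in columns; pdist compares rows,
# so transpose to get one row per reviewer.
by_reviewer = pd.DataFrame(ratings).T

# Assumption: keep only movies that every reviewer rated, so no NaN remains.
complete = by_reviewer.dropna(axis=1)

# pdist returns a condensed distance vector; squareform expands it
# into a symmetric reviewer-by-reviewer matrix.
dist = pd.DataFrame(
    squareform(pdist(complete.values, metric='euclidean')),
    index=complete.index,
    columns=complete.index,
)


def get_similarity(a, b):
    """Euclidean distance between two named reviewers."""
    return dist.loc[a, b]


def get_similarities(a):
    """Euclidean distance from one reviewer to every other reviewer."""
    return dist.loc[a].drop(a)


print(dist)
print(get_similarities('Peter'))
```

Dropping incomplete movies is the simplest choice but can discard a lot of data: with the full set above, only Gravity and Prometheus are rated by all seven reviewers.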

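Try to tighten this up a bit: for example, give us three reviews and a sample of the output format you want from those three reviews.
OK, I'll take a look when I get home. Many thanks.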
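
For the kind of reduced example the comment asks for, here is a sketch using three of the reviewers above. Note that it swaps pdist for a plain pandas/NumPy computation, under the assumption that each pair should be compared only on the movies both reviewers rated. The helper names `shared_distance` and `distances_from` are hypothetical.

```python
import numpy as np
import pandas as pd

# Three reviewers copied from the data above.
ratings = {
    'Mary Doyle': {'Lawrence of Arabia': 2.5, 'Gravity': 3.0,
                   'Prometheus': 3.5, 'The Guns of Navarone': 4.0},
    'Doug Redpath': {'Gravity': 3.5, 'The Godfather': 3.0,
                     'The Guns of Navarone': 4.5, 'Prometheus': 4.0,
                     'For a Few Dollars More': 2.5},
    'Peter': {'Gravity': 4.5, 'For a Few Dollars More': 1.0, 'Prometheus': 4.0},
}

df = pd.DataFrame(ratings)  # rows: movies, columns: reviewers


def shared_distance(a, b):
    """Euclidean distance over the movies both reviewers rated (assumption)."""
    both = df[[a, b]].dropna()  # keep only rows where both have a rating
    if both.empty:
        return np.nan  # no movies in common
    return float(np.sqrt(((both[a] - both[b]) ** 2).sum()))


def distances_from(a):
    """Distance from one reviewer to each of the others, as a Series."""
    return pd.Series({b: shared_distance(a, b) for b in df.columns if b != a})


print(shared_distance('Mary Doyle', 'Doug Redpath'))
print(distances_from('Peter'))
```

The output format here is a single float for one pair and a Series indexed by reviewer name for one-versus-all. Distances computed over different numbers of shared movies are not directly comparable, so some normalization may be needed depending on the use.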