Python: how to set a MultiIndex inside a function to get the correct output
Hi everyone, I am trying to practice multi-indexing in Python. To that end, I first defined these functions:
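import numpy as np

def update_G(R_, U_, V_):
    # Note: np.nan_to_num returns a new array, so this call has no effect
    # as written; NaNs in R_ still propagate into the residual.
    np.nan_to_num(R_)
    deviation = R_ - V_.dot(np.transpose(U_))
    return deviation

def rmse(X):
    # root-mean-square over the non-NaN entries only
    return np.sqrt(np.nanmean(X**2))

def max_update(X, Y, relative=True):
    # largest absolute (or relative) element-wise change between X and Y
    if relative:
        updates = np.nan_to_num((X - Y) / Y)
    else:
        updates = np.nan_to_num(X - Y)
    return np.linalg.norm(updates.ravel(), np.inf)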
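As a quick sanity check of the two helper functions above, here is a minimal sketch on small hand-made arrays. The arrays `G`, `old`, and `new` are hypothetical toy data, not taken from the question; the point is only to show that `rmse` skips NaN entries and that `max_update` returns the largest element-wise change.

```python
import numpy as np

def rmse(X):
    return np.sqrt(np.nanmean(X**2))

def max_update(X, Y, relative=True):
    if relative:
        updates = np.nan_to_num((X - Y) / Y)
    else:
        updates = np.nan_to_num(X - Y)
    return np.linalg.norm(updates.ravel(), np.inf)

# Toy residual matrix with one missing entry (hypothetical data)
G = np.array([[3.0, np.nan],
              [4.0, 0.0]])
err = rmse(G)  # mean of 9, 16, 0 over the three non-NaN entries

# Toy "before" and "after" factor matrices (hypothetical data)
old = np.array([[1.0, 2.0],
                [4.0, 5.0]])
new = np.array([[1.5, 2.0],
                [4.0, 4.0]])
rel_change = max_update(new, old)                  # largest |new - old| / |old|
abs_change = max_update(new, old, relative=False)  # largest |new - old|

print(err, rel_change, abs_change)
```

Because `rmse` relies on `np.nanmean`, missing ratings are ignored rather than counted as zero error.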
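Now, this is the starting point for my MultiIndex practice. I am trying to define the following function, which uses the functions above: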
import pandas as pd

# update_U and update_V are not shown in the question
def compute_UV(Rdf, K=5, alpha=0.01, max_iteration=5000, diff_thr=1e-3):
    R = Rdf.values
    Rone = Rdf.replace(Rdf, 1)  # keep data frame metadata
    M, I = R.shape  # number of movies and users
    U = np.random.rand(I, K)  # initialize with random numbers
    V = np.random.rand(M, K)  # initialize with random numbers
    G = update_G(R, U, V)  # calculate residual
    track_rmse = []
    track_update = []
    for i in range(0, max_iteration):
        Unew = update_U(G, U, V, alpha)
        Vnew = update_V(G, U, V, alpha)
        # Recompute the residual from the updated factors; calling
        # update_G with the old U and V would leave Gnew identical to G.
        Gnew = update_G(R, Unew, Vnew)
        track_rmse.append({
            'iteration': i,
            'rmse': rmse(Gnew),
            'max residual change': max_update(Gnew, G, relative=False)
        })
        track_update.append({
            'iteration': i,
            'max update': max(max_update(Unew, U), max_update(Vnew, V))
        })
        U = Unew
        V = Vnew
        G = Gnew
        # stop once the largest relative update falls below the threshold
        if track_update[-1]['max update'] < diff_thr:
            break
    track_rmse = pd.DataFrame(track_rmse)
    track_update = pd.DataFrame(track_update)
    kindex = pd.Index(range(0, K), name='k')
    V = pd.DataFrame(V, index=..., columns=...)  # this is what we want to fill in
    U = pd.DataFrame(U, index=..., columns=...)  # this is what we want to fill in
    return {
        'U': U, 'V': V,
        'rmse': track_rmse,
        'update': track_update
    }
The output should have an index like the following:
MultiIndex([('rating',   1),
            ('rating',  85),
            ('rating', 269),
            ('rating', 271),
            ('rating', 301),
            ('rating', 312),
            ('rating', 328),
            ('rating', 339),
            ('rating', 389),
            ('rating', 650),
            ('rating', 716),
            ('rating', 727),
            ('rating', 178),
            ('rating', 299),
            ('rating', 387),
            ('rating', 883)],
           names=[None, 'user id'])
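So far, every array I have passed to the final U and V constructors gives me only a single-level index instead of the desired MultiIndex. Is there a way to do this? Any suggestions are welcome.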
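One possible approach, sketched under assumptions: if the columns of `Rdf` already carry the `('rating', user id)` MultiIndex shown above and its rows are the movies, then the existing index objects can be passed straight to the `pd.DataFrame` constructor. `U` has one row per user, so it would take `Rdf.columns` as its index; `V` has one row per movie, so it would take `Rdf.index`. The toy `Rdf`, the `'movie id'` name, and the `U_df`/`V_df` names below are hypothetical and only there to make the example runnable.

```python
import numpy as np
import pandas as pd

# Hypothetical toy ratings table: rows are movies, columns are a
# two-level MultiIndex ('rating', user id), mirroring the output above.
columns = pd.MultiIndex.from_product(
    [['rating'], [1, 85, 269]], names=[None, 'user id']
)
index = pd.Index([10, 20], name='movie id')  # assumed index name
Rdf = pd.DataFrame(
    np.array([[5.0, np.nan, 3.0],
              [np.nan, 4.0, 1.0]]),
    index=index, columns=columns,
)

K = 2
M, I = Rdf.shape                  # number of movies and users
rng = np.random.default_rng(0)
U = rng.random((I, K))            # stand-in for the fitted user factors
V = rng.random((M, K))            # stand-in for the fitted movie factors

kindex = pd.Index(range(K), name='k')

# Reuse the existing index objects instead of building new arrays:
# a MultiIndex passed as `index` is kept as a MultiIndex.
U_df = pd.DataFrame(U, index=Rdf.columns, columns=kindex)
V_df = pd.DataFrame(V, index=Rdf.index, columns=kindex)

print(U_df.index)
print(V_df.index)
```

Passing a plain list or array of user ids would produce a single-level index, which may be what happened in the attempts described above. If only the raw tuples are available, `pd.MultiIndex.from_tuples` can build the index first.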