Python: how to set a MultiIndex inside a function to get the correct output

Hi everyone, I'm trying to practice working with a MultiIndex in Python (pandas). To do that, I first defined these functions:

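import numpy as np
import pandas as pd

def update_G(R_, U_, V_):
    # np.nan_to_num returns a new array; the result is discarded here, so R_ is unchanged
    np.nan_to_num(R_)
    deviation = R_ - V_.dot(np.transpose(U_))  # residual of R against V U^T
    return deviation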

def rmse(X):

    return np.sqrt(np.nanmean(X**2))

def max_update(X, Y, relative=True):

    if relative:
        updates = np.nan_to_num((X - Y)/Y)
    else:
        updates = np.nan_to_num(X - Y)
            
    return np.linalg.norm(updates.ravel(), np.inf)
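
As a quick sanity check of the helpers above, here is a small sketch on made-up arrays (the values of `R`, `X`, and `Y` are illustrative and not from the question). It also shows that `np.nan_to_num` returns a new array rather than modifying its argument, which is worth knowing given the bare call inside `update_G`.

```python
import numpy as np

def rmse(X):
    # root mean square over the non-NaN entries only
    return np.sqrt(np.nanmean(X**2))

def max_update(X, Y, relative=True):
    if relative:
        updates = np.nan_to_num((X - Y) / Y)
    else:
        updates = np.nan_to_num(X - Y)
    # infinity norm of the flattened array = largest absolute entry
    return np.linalg.norm(updates.ravel(), np.inf)

# Made-up residual matrix with one missing entry
R = np.array([[3.0, np.nan], [4.0, 0.0]])

cleaned = np.nan_to_num(R)   # new array with NaN replaced by 0
print(np.isnan(R).any())     # R itself still contains the NaN
print(rmse(R))               # NaN is ignored by np.nanmean

# Made-up "old" and "new" factor matrices
X = np.array([[1.0, 2.0], [3.0, 4.0]])
Y = np.array([[1.0, 2.5], [3.0, 4.0]])
print(max_update(X, Y, relative=False))  # largest absolute change
print(max_update(X, Y))                  # largest relative change
```

If the missing ratings are meant to be zero-filled inside `update_G`, the return value of `np.nan_to_num` would have to be assigned; whether that is intended is not clear from the question, since `rmse` already skips NaN values.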

Now, this is where my MultiIndex practice starts. I'm trying to define the following function, which uses the functions above:

def compute_UV(Rdf, K=5, alpha=0.01, max_iteration=5000, diff_thr=1e-3):

    R = Rdf.values
    Rone = Rdf.replace(Rdf, 1) # keep data frame metadata

    M, I = R.shape            # number of movies and users
    U = np.random.rand(I, K)  # initialize with random numbers
    V = np.random.rand(M, K)  # initialize with random numbers
    G = update_G(R, U, V)     # calculate residual

    track_rmse = []
    track_update = []
    for i in range(0, max_iteration): 
        
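        Unew = update_U(G, U, V, alpha)   # update_U is not shown in the question
        Gnew = update_G(R, Unew, V)       # residual after the U step

        Vnew = update_V(G, U, V, alpha)   # update_V is not shown in the question
        Gnew = update_G(R, Unew, Vnew)    # residual after both steps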

        track_rmse += [{
            'iteration':i, 
            'rmse': rmse(Gnew),
            'max residual change': max_update(Gnew, G, relative=False)
        }]
        track_update += [{
            'iteration':i, 
            'max update':max(max_update(Unew, U), max_update(Vnew, V))
        }]

        U = Unew
        V = Vnew
        G = Gnew
        
        if track_update[-1]['max update'] < diff_thr:
            break
        
    track_rmse = pd.DataFrame(track_rmse)
    track_update = pd.DataFrame(track_update)
    
    kindex = pd.Index(range(0, K), name='k')

    
    V = pd.DataFrame(V, index=..., columns=...)  # which index/columns to pass here is the open question
    U = pd.DataFrame(U, index=..., columns=...)  # same question for U
    
    return {
        'U':U, 'V':V,
        'rmse': track_rmse,
        'update': track_update
    }
 

The output should have an index like the following:

MultiIndex([('rating',   1),
            ('rating',  85),
            ('rating', 269),
            ('rating', 271),
            ('rating', 301),
            ('rating', 312),
            ('rating', 328),
            ('rating', 339),
            ('rating', 389),
            ('rating', 650),
            ('rating', 716),
            ('rating', 727),
            ('rating', 178),
            ('rating', 299),
            ('rating', 387),
            ('rating', 883)],
           names=[None, 'user id'])

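So far, every array I have passed into those last two `U` and `V` assignments returns only a single-level index instead of the MultiIndex I want. Is there a way to do this? Any suggestions are welcome.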
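
One possible approach, sketched under assumptions: if the MultiIndex shown above is the column index of `Rdf` (the question does not say so explicitly), it can be passed directly as `index=` when building `U`, because `U` has one row per column of `R`. The frame `Rdf`, the movie ids, and the names `U_df` and `V_df` below are hypothetical stand-ins for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical ratings frame: rows are movies, columns are a
# ('rating', user id) MultiIndex like the one in the question.
users = pd.MultiIndex.from_product(
    [['rating'], [1, 85, 269]], names=[None, 'user id']
)
movies = pd.Index([10, 20], name='movie id')  # made-up movie ids
Rdf = pd.DataFrame(
    np.arange(6, dtype=float).reshape(2, 3), index=movies, columns=users
)

K = 2
M, I = Rdf.shape            # number of movies and users
U = np.random.rand(I, K)    # one row per user
V = np.random.rand(M, K)    # one row per movie
kindex = pd.Index(range(K), name='k')

# Reuse the labels of Rdf: its columns index U, its index indexes V.
U_df = pd.DataFrame(U, index=Rdf.columns, columns=kindex)
V_df = pd.DataFrame(V, index=Rdf.index, columns=kindex)

print(U_df.index)   # a MultiIndex with names [None, 'user id']
print(V_df.index)   # a plain Index named 'movie id'
```

A MultiIndex object passed as `index=` is kept as is. If only the user ids are wanted as a flat index, `Rdf.columns.get_level_values('user id')` would give that level on its own.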
