Python: row index exceeds matrix dimensions for scipy csr_matrix


I'm new to Python and pandas, and I have the following problem.

I have a dataset:

df = pd.read_csv('/home/nikoscha/Documents/ThesisR/dataset.csv', names=['response_nn','event','user'])

I'm trying to create a csr_matrix from it with the following code:

# Create lists of all events, users and responses
events = list(np.sort(df.event_id.unique()))
users = list(np.sort(df.user_id.unique()))
responses = list(df.responses)

# Get the rows and columns for our new matrix
rows = df.user_id.astype(float)
cols = df.event_id.astype(float)

# Construct a sparse matrix for our users and items containing number of plays
data_sparse = sp.csr_matrix((responses, (rows, cols)), shape=(len(users), len(events)))

The code above works. But when I take a training subset of the data:

mask = np.random.rand(len(df)) < 0.5
df = df[mask]
df = df.reset_index() 
df = df.drop(['index'], axis=1)

and then try to construct the sparse matrix again, I get the following error:

ValueError: row index exceeds matrix dimensions


Can anyone explain why? Thanks in advance.

As described in the scipy documentation, a csr_matrix can be initialized as:

csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])

In scipy.sparse.csr.py:

csr_matrix((data, (row_ind, col_ind)), [shape=(M, N)])  
        where `data`, `row_ind` and `col_ind` satisfy the  
        relationship `a[row_ind[k], col_ind[k]] = data[k]`.  

When the csr_matrix is initialized, it checks row_ind.max() against M: every row index must be strictly less than M.
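
To see why taking a subset can trip this check, here is a minimal sketch on made-up data. The toy DataFrame and the fixed mask are assumptions for illustration, not the asker's dataset; the column names follow the question's code. After rows are dropped, the number of distinct users shrinks, but the surviving ids keep their original values.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: user ids 0..3, event ids 0..2.
df = pd.DataFrame({
    "user_id":   [0, 1, 1, 2, 3, 3],
    "event_id":  [0, 1, 2, 0, 1, 2],
    "responses": [1, 1, 1, 1, 1, 1],
})

# A fixed mask stands in for the random one; users 0 and 2 drop out.
mask = np.array([False, True, True, False, True, True])
subset = df[mask].reset_index(drop=True)

n_users = subset.user_id.nunique()        # distinct users left in the subset
max_user_id = int(subset.user_id.max())   # largest id is unchanged by sampling

# The constructor requires max index < M; this flag is True when that fails.
row_out_of_range = max_user_id >= n_users
print(n_users, max_user_id, row_out_of_range)
```

Here the subset has 2 distinct users but a largest id of 3, so shape=(len(users), ...) would be too small for the row indices.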

Likewise, in scipy.sparse.coo.py:

if self.row.max() >= self.shape[0]:
    raise ValueError('row index exceeds matrix dimensions')
if self.col.max() >= self.shape[1]:
    raise ValueError('column index exceeds matrix dimensions')
if self.row.min() < 0:
    raise ValueError('negative row index found')
if self.col.min() < 0:
    raise ValueError('negative column index found')

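With the line row[0]=9 commented out in the example at the end of this page, the matrix is built without error; uncommenting it puts row index 9 into a matrix with 8 rows and raises the ValueError. Hope this helps.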
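
The behaviour described above can be reproduced in a few lines. This is a sketch with assumed values (an 8x8 shape, 16 random integer indices, a seeded generator), and it only checks that a ValueError is raised, since the exact message text may differ between scipy versions.

```python
import numpy as np
from scipy.sparse import csr_matrix

n = 8
rng = np.random.default_rng(0)
row = rng.integers(0, n, size=16)   # integer indices in [0, n)
col = rng.integers(0, n, size=16)
data = np.ones(16)

# All indices are within bounds, so construction succeeds.
ok = csr_matrix((data, (row, col)), shape=(n, n))

# One out-of-range row index (9 >= 8) is enough to make it fail.
bad_row = row.copy()
bad_row[0] = 9
try:
    csr_matrix((data, (bad_row, col)), shape=(n, n))
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```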

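You specify len(users) as the matrix's row dimension, but rows evidently contains values at least that large. (This is not what causes the error, but the index arrays should be cast with astype(int), not astype(float).)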
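
One possible way to avoid the mismatch is to remap the ids to contiguous integer codes before building the matrix, so that the largest code is always one less than the number of distinct ids. The sketch below uses pandas.factorize for this; the `train` DataFrame is hypothetical, and this is one approach among several, not necessarily what the answerers had in mind.

```python
import numpy as np
import pandas as pd
import scipy.sparse as sp

# Hypothetical training subset whose ids are not contiguous after sampling.
train = pd.DataFrame({
    "user_id":   [1, 1, 3, 3, 7],
    "event_id":  [10, 20, 10, 40, 20],
    "responses": [1.0, 2.0, 1.0, 3.0, 1.0],
})

# factorize returns integer codes 0..k-1 plus the unique ids they stand for;
# sort=True makes the codes follow the sorted order of the ids.
rows, users = pd.factorize(train.user_id, sort=True)
cols, events = pd.factorize(train.event_id, sort=True)

# Codes are always < number of unique ids, so the shape check passes.
data_sparse = sp.csr_matrix(
    (train.responses.to_numpy(), (rows, cols)),
    shape=(len(users), len(events)),
)
print(data_sparse.toarray())
```

`users[i]` and `events[j]` map row i and column j back to the original ids. If the training and test matrices need the same shape, the mapping would have to be computed on the full dataset before splitting.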
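
import numpy as np
from scipy.sparse import csr_matrix

a = np.random.random((8, 2))  # 8x2 array of floats in [0, 1)
row = np.hstack((a[:, 0], a[:, 1]))
# row[0] = 9  # uncomment to put index 9 into an 8-row matrix and trigger the error
col = np.hstack([a[:, 1], a[:, 0]])
matrix = csr_matrix(([1] * row.shape[0], (row, col)), shape=(a.shape[0], a.shape[0]))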