
Python 3.x: index 5688 is out of bounds for axis 0 with size 3706

Tags: python-3.x, numpy, recommendation-engine

I am trying to build a recommendation system. I took the data from a website.

Number of users = 6040 | Number of movies = 3706

from sklearn import cross_validation as cv
train_data, test_data = cv.train_test_split(df, test_size=0.25)
I am trying to create two user-item matrices, one for training and one for testing:

train_data_matrix = np.zeros(( n_users, n_items))
for line in train_data.itertuples():
    train_data_matrix[line[1]-1, line[2]-1] = line[3]  

test_data_matrix = np.zeros((n_users, n_items))
for line in test_data.itertuples():
    test_data_matrix[line[1]-1, line[2]-1] = line[3]
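I get this error:

IndexError: index 5688 is out of bounds for axis 0 with size 3706
train_data_matrix[line[1]-1, line[2]-1] = line[3]

P.S. The rows printed by train_data.itertuples() are listed at the end of the page.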


The error message tells us that train_data_matrix has shape (3706, n), while line[1]-1 is 5688:

IndexError: index 5688 is out of bounds for axis 0 with size 3706
train_data_matrix[line[1]-1, line[2]-1] = line[3]

So the question is: why is line[1] equal to 5689? Or, in the larger context, why does train_data.itertuples() produce rows with such large values?
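
One way to look for the answer is to compare the matrix shape with the largest IDs in the data. The following is only a minimal sketch: it assumes the columns are named user_id and item_id, as in the rows printed in the question, and both the toy DataFrame and the helper name check_id_bounds are made up for illustration.

```python
import numpy as np
import pandas as pd


def check_id_bounds(df, matrix):
    """Compare the largest indices the loop would use with the matrix shape."""
    max_row = int(df["user_id"].max()) - 1  # same as line[1]-1 in the loop
    max_col = int(df["item_id"].max()) - 1  # same as line[2]-1 in the loop
    return {
        "shape": matrix.shape,
        "max_row": max_row,
        "max_col": max_col,
        "rows_ok": max_row < matrix.shape[0],
        "cols_ok": max_col < matrix.shape[1],
    }


# Toy stand-in for train_data (hypothetical; values copied from the printed rows)
toy = pd.DataFrame({
    "user_id": [2968, 5689, 752],
    "item_id": [2268, 3615, 1147],
    "rating": [5, 3, 5],
})

# A matrix whose axis 0 has size 3706, as the error message reports
report = check_id_bounds(toy, np.zeros((3706, 6040)))
print(report)
```

If rows_ok is False even though there are 6040 users, the two dimensions may have been swapped when n_users and n_items were computed. That is a guess, since the question does not show how they were defined.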

I wonder whether you should use this instead:

train_data_matrix[line[0]-1, line[1]-1]

I'm not familiar with itertuples. What are the elements of line? What is the full shape of train_data?
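
For reference, itertuples puts the index label at position 0 and the columns after it. In this data line[1] is therefore user_id, and line[0]-1 would index the matrix by the DataFrame's row label. A small sketch, rebuilding by hand two of the rows printed in the question:

```python
import pandas as pd

# Two of the rows printed in the question, rebuilt by hand
df = pd.DataFrame(
    {
        "user_id": [2968, 5689],
        "item_id": [2268, 3615],
        "rating": [5, 3],
        "timestamp": [971107926, 963719230],
    },
    index=[483019, 943582],
)

first = next(df.itertuples())
print(first[0], first[1], first[2], first[3])  # index label, user_id, item_id, rating

# index=False drops the label, so the columns start at position 0
plain = next(df.itertuples(index=False))
print(plain[0], plain[1], plain[2])  # user_id, item_id, rating

# Attribute access avoids counting positions at all
print(first.user_id, first.item_id, first.rating)
```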

You can use train_data.pivot instead:

train_data_matrix = train_data.pivot(index='user_id', columns='item_id', values='rating').fillna(0)
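
pivot labels the rows and columns with the IDs themselves, so no position arithmetic is needed. Two caveats: pivot raises an error if a (user_id, item_id) pair occurs more than once, and the result only contains the users and items present in the frame being pivoted. The sketch below uses toy data; the reindex step is an addition of mine, not part of the answer above, and is one possible way to give the train and test matrices the same shape.

```python
import pandas as pd

# Toy ratings (hypothetical values)
ratings = pd.DataFrame({
    "user_id": [1, 1, 2, 5],
    "item_id": [10, 30, 10, 20],
    "rating": [5, 3, 4, 2],
})
train = ratings.iloc[:3]
test = ratings.iloc[3:]

# Labels taken from the full data so both matrices share rows and columns
all_users = sorted(ratings["user_id"].unique())
all_items = sorted(ratings["item_id"].unique())


def to_matrix(part):
    """Pivot one split and pad it to the full set of users and items."""
    wide = part.pivot(index="user_id", columns="item_id", values="rating")
    return wide.reindex(index=all_users, columns=all_items).fillna(0)


train_matrix = to_matrix(train)
test_matrix = to_matrix(test)
print(train_matrix.shape, test_matrix.shape)
```

A rating is then looked up by label, for example train_matrix.loc[1, 30].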

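train_data_matrix is a matrix over the unique user and movie IDs. 5689 is a user ID.

I answered the train_data.head() question in the post, but the rows of the matrix are indexed by row count, not by user ID. I tried train_data.reset_index(drop=True), but it did not help.

Printing each row of train_data.itertuples():

for line in train_data.itertuples():
    print(line)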
Pandas(Index=483019, user_id=2968, item_id=2268, rating=5, timestamp=971107926)
Pandas(Index=943582, user_id=5689, item_id=3615, rating=3, timestamp=963719230)
Pandas(Index=116153, user_id=752, item_id=1147, rating=5, timestamp=975458000)
Pandas(Index=103250, user_id=686, item_id=1704, rating=5, timestamp=975601762)
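
The printed rows show that Index is only the row label of the DataFrame. Resetting it does not change user_id or item_id, which are the values the loop actually uses. If the IDs have gaps (an assumption; the question does not say), one option is to map each ID to a contiguous position before filling the matrix. A sketch using pd.factorize on toy data:

```python
import numpy as np
import pandas as pd

# Toy ratings (hypothetical; IDs borrowed from the printed rows)
ratings = pd.DataFrame({
    "user_id": [2968, 5689, 752, 686, 5689],
    "item_id": [2268, 3615, 1147, 1704, 2268],
    "rating": [5, 3, 5, 5, 4],
})

# Map raw IDs to positions 0..n-1 on the full data, before any train/test split
user_pos, user_ids = pd.factorize(ratings["user_id"], sort=True)
item_pos, item_ids = pd.factorize(ratings["item_id"], sort=True)

# Fill the matrix by position; user_ids[i] is the raw ID stored in row i
matrix = np.zeros((len(user_ids), len(item_ids)))
matrix[user_pos, item_pos] = ratings["rating"].to_numpy()
print(matrix.shape)
```

Computing the positions once on the full data and then splitting keeps the train and test matrices aligned on the same rows and columns.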