Python sparse hstack and a strange dtype conversion error
While working with some text data, I am trying to join an np array (taken from a pandas Series) to a csr matrix. This is what I have done so far:
#create a compatible sparse matrix from my np.array.
#sparse.csr_matrix(X['link'].values) returns array size (1,7395)
#transpose that array for (7395,1)
X = sparse.csr_matrix(X['link'].values.transpose)
#bodies is a sparse.csr_matrix with shape (7395, 20000)
bodies = sparse.hstack((bodies,X))
However, this line gives the error: no supported conversion for types: (dtype('O'),)
I don't know what this means. How can I get around it?
Thanks
import numpy as np
import pandas as pd
from scipy import sparse
d = {
"a": 30,
"b": 20,
"c": 10
}
s = pd.Series(d, index=["c", "b", "a"])
print s
--output:--
c 10
b 20
a 30
dtype: int64
my_ndarray = s.values
print my_ndarray
--output:--
[10 20 30]
X = sparse.csr_matrix(my_ndarray).transpose()
print X.todense()
--output:--
[[10]
[20]
[30]]
bodies = sparse.csr_matrix([
[0, 1],
[1, 0],
[0, 0]
])
print bodies.todense()
--output:--
[[0 1]
[1 0]
[0 0]]
result = sparse.hstack((bodies,X))
print result.todense()
--output:--
[[ 0 1 10]
[ 1 0 20]
[ 0 0 30]]
And writing:
X = sparse.csr_matrix(my_ndarray.transpose())
produces the error:
Traceback (most recent call last):
File "1.py", line 33, in <module>
result = sparse.hstack((bodies,X))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/construct.py", line 417, in hstack
return bmat([blocks], format=format, dtype=dtype)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/construct.py", line 515, in bmat
raise ValueError('blocks[%d,:] has incompatible row dimensions' % i)
ValueError: blocks[0,:] has incompatible row dimensions
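The failure above is about shapes rather than dtypes: transposing a 1-D ndarray leaves its shape unchanged, so csr_matrix still builds a single row. The following is a minimal sketch in Python 3 syntax (the examples above are Python 2); the variable names `row` and `col` are illustrative only.

```python
import numpy as np
from scipy import sparse

my_ndarray = np.array([10, 20, 30])

# Transposing a 1-D array is a no-op: the shape stays (3,).
print(my_ndarray.transpose().shape)

# csr_matrix treats a 1-D array as one row, so this is a (1, 3) matrix
# and cannot be hstack-ed onto a matrix with 3 rows.
row = sparse.csr_matrix(my_ndarray.transpose())
print(row.shape)

# Transposing the sparse matrix itself yields the (3, 1) column hstack needs.
col = sparse.csr_matrix(my_ndarray).transpose()
print(col.shape)
```

Reshaping the array with `my_ndarray.reshape(-1, 1)` before calling csr_matrix should give the same column shape.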
Compare that with:
import numpy as np
import pandas as pd
from scipy import sparse
d = {
"a": "hello",
"b": "world",
"c": "goodbye"
}
s = pd.Series(d, index=["c", "b", "a"])
print s
--output:--
c goodbye
b world
a hello
my_ndarray = s.values
print my_ndarray
--output:--
[goodbye world hello]
X = sparse.csr_matrix(s.values).transpose()
--output:--
Traceback (most recent call last):
File "1.py", line 19, in <module>
X = sparse.csr_matrix(s.values).transpose()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 66, in __init__
self._set_self( self.__class__(coo_matrix(arg1, dtype=dtype)) )
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/compressed.py", line 30, in __init__
arg1 = arg1.asformat(self.format)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/base.py", line 203, in asformat
return getattr(self,'to' + format)()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/coo.py", line 312, in tocsr
data = np.empty(self.nnz, dtype=upcast(self.dtype))
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/scipy/sparse/sputils.py", line 53, in upcast
raise TypeError('no supported conversion for types: %r' % (args,))
TypeError: no supported conversion for types: (dtype('object'),)
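The traceback shows scipy refusing an array whose dtype is object. One way to surface that earlier is to check the dtype before building the sparse column. This is only a sketch in Python 3 syntax: `to_sparse_column` is a hypothetical helper, not part of scipy or pandas, and the sample arrays stand in for `s.values`.

```python
import numpy as np
from scipy import sparse

def to_sparse_column(values):
    """Hypothetical helper: build an (n, 1) CSR column from 1-D numeric data."""
    arr = np.asarray(values)
    # dtype kinds: b=bool, i=int, u=unsigned int, f=float, c=complex
    if arr.dtype.kind not in "biufc":
        raise TypeError("expected a numeric dtype, got %r" % (arr.dtype,))
    return sparse.csr_matrix(arr.reshape(-1, 1))

numbers = np.array([10, 20, 30])
strings = np.array(["goodbye", "world", "hello"], dtype=object)

print(to_sparse_column(numbers).shape)

try:
    to_sparse_column(strings)
except TypeError as exc:
    print("rejected:", exc)
```

A string column would need to be turned into numbers first (for example by a vectorizer or an encoding step) before it can be stacked onto a numeric sparse matrix.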
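You did not provide an example like this yourself, which suggests you did not put enough work into debugging the problem.
Here is Saullo Castro's comment, written up as an answer: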
x = np.arange(12).reshape(1,12) # ndarray
sparse.csr_matrix(x)
Out[14]: <1x12 sparse matrix of type '<type 'numpy.int32'>'
with 11 stored elements in Compressed Sparse Row format>
x.transpose # function, not ndarray
Out[15]: <function transpose>
X = sparse.csr_matrix(x.transpose)
TypeError: no supported conversion for types: (dtype('O'),)
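csr_matrix works fine if it is given the transposed array. When transpose has to be called, call it as .transpose() or use .T; otherwise you are actually passing the function object itself, and the array created from it has object dtype, which is what the error reports.
Again, I am shocked at how bad the numpy docs are. I had no luck creating a sparse array of strings or reproducing your error. According to Google, you are the first person to run into this error.
I am facing the same problem too.
This answer is going off track. You are using transpose(), not the transpose from the question.
@hpaulj, a pandas Series has a values attribute, which returns an ndarray. An ndarray has no transpose attribute, but it does have a transpose() method, and according to the docs sparse.csr_matrix() does not take a method as an argument. However, feel free to post a link that describes a transpose attribute of ndarray, or one that describes how csr_matrix() accepts a method of another class as an argument.
That is the clue the OP left. I don't know whether they were chasing rabbits.
Thank you, I was on my phone and could not post such an explanatory answer!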
# x.transpose() == x.T # ndarray
sparse.csr_matrix(x.transpose())
Out[17]: <12x1 sparse matrix of type '<type 'numpy.int32'>'
with 11 stored elements in Compressed Sparse Row format>
sparse.csr_matrix(x.T)
Out[18]: <12x1 sparse matrix of type '<type 'numpy.int32'>'
with 11 stored elements in Compressed Sparse Row format>
bodies = sparse.rand(12,3,format='csr',density=.1)
X = sparse.csr_matrix(x.T) # rebuild X; the earlier assignment raised TypeError
sparse.hstack((bodies,X))
Out[32]: <12x4 sparse matrix of type '<type 'numpy.float64'>'
with 14 stored elements in COOrdinate format>
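The session above can be condensed into a short script. This is a sketch in Python 3 syntax, assuming the column is numeric; `link` is a stand-in for `X['link'].values` from the question, and the random `bodies` matrix is only a placeholder for the real (7395, 20000) one.

```python
import numpy as np
from scipy import sparse

link = np.arange(12)  # stand-in for X['link'].values (assumed numeric)

# Without parentheses, `transpose` is a bound method, not an array.
print(callable(link.transpose))
# NumPy wraps that method in a 0-d array of object dtype,
# which scipy.sparse cannot convert.
print(np.asarray(link.transpose).dtype)

# Build the (12, 1) column explicitly, then stack it onto a (12, 3) matrix.
X = sparse.csr_matrix(link.reshape(-1, 1))
bodies = sparse.random(12, 3, format="csr", density=0.1)
result = sparse.hstack((bodies, X)).tocsr()
print(result.shape)
```

hstack returns a COO matrix by default, as the last output above shows, so the sketch converts the result back to CSR with `.tocsr()`.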