scikit-learn: sklearn RidgeCV with sample weights
I want to do a weighted ridge regression with sklearn. However, the code breaks when I call the fit method. The exception I get is:
Exception: Data must be 1-dimensional
But I am sure (I checked with print statements) that the data I am passing has the right shape:
print temp1.shape #(781, 21)
print temp2.shape #(781,)
print weights.shape #(781,)
result=RidgeCV(normalize=True).fit(temp1,temp2,sample_weight=weights)
What could be going wrong?
Here is the full output:
---------------------------------------------------------------------------
Exception Traceback (most recent call last)
<ipython-input-65-a5b1eba5d9cf> in <module>()
22
23
---> 24 result=RidgeCV(normalize=True).fit(temp2,temp1, sample_weight=weights)
25
26
/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/ridge.pyc in fit(self, X, y, sample_weight)
868 gcv_mode=self.gcv_mode,
869 store_cv_values=self.store_cv_values)
--> 870 estimator.fit(X, y, sample_weight=sample_weight)
871 self.alpha_ = estimator.alpha_
872 if self.store_cv_values:
/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/ridge.pyc in fit(self, X, y, sample_weight)
793 else alpha)
794 if error:
--> 795 out, c = _errors(weighted_alpha, y, v, Q, QT_y)
796 else:
797 out, c = _values(weighted_alpha, y, v, Q, QT_y)
/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/ridge.pyc in _errors(self, alpha, y, v, Q, QT_y)
685 w = 1.0 / (v + alpha)
686 c = np.dot(Q, self._diag_dot(w, QT_y))
--> 687 G_diag = self._decomp_diag(w, Q)
688 # handle case where y is 2-d
689 if len(y.shape) != 1:
/usr/local/lib/python2.7/dist-packages/sklearn/linear_model/ridge.pyc in _decomp_diag(self, v_prime, Q)
672 def _decomp_diag(self, v_prime, Q):
673 # compute diagonal of the matrix: dot(Q, dot(diag(v_prime), Q^T))
--> 674 return (v_prime * Q ** 2).sum(axis=-1)
675
676 def _diag_dot(self, D, B):
/usr/local/lib/python2.7/dist-packages/pandas/core/ops.pyc in wrapper(left, right, name)
531 return left._constructor(wrap_results(na_op(lvalues, rvalues)),
532 index=left.index, name=left.name,
--> 533 dtype=dtype)
534 return wrapper
535
/usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in __init__(self, data, index, dtype, name, copy, fastpath)
209 else:
210 data = _sanitize_array(data, index, dtype, copy,
--> 211 raise_cast_failure=True)
212
213 data = SingleBlockManager(data, index, fastpath=True)
/usr/local/lib/python2.7/dist-packages/pandas/core/series.pyc in _sanitize_array(data, index, dtype, copy, raise_cast_failure)
2683 elif subarr.ndim > 1:
2684 if isinstance(data, np.ndarray):
-> 2685 raise Exception('Data must be 1-dimensional')
2686 else:
2687 subarr = _asarray_tuplesafe(data, dtype=dtype)
Exception: Data must be 1-dimensional
The error seems to be caused by sample_weight being a pandas Series rather than a numpy array:
import numpy as np
import pandas as pd
from sklearn.linear_model import RidgeCV
temp1 = pd.DataFrame(np.random.rand(781, 21))
temp2 = pd.Series(temp1.sum(1))
weights = pd.Series(1 + 0.1 * np.random.rand(781))
result = RidgeCV(normalize=True).fit(temp1, temp2,
                                     sample_weight=weights)
# Exception: Data must be 1-dimensional
If you pass a numpy array instead, the error goes away:
result = RidgeCV(normalize=True).fit(temp1, temp2,
                                     sample_weight=weights.values)
This looks like a bug; I have opened an issue to report it.
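As a more defensive variant of the workaround, one option is to convert every input to a plain numpy array and check its shape before calling fit. The sketch below is only an illustration: the helper name to_1d_array is hypothetical, the random data stands in for the real temp1, temp2 and weights, and normalize=True is left out on the assumption that a newer scikit-learn release, which no longer accepts that argument, may be installed.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import RidgeCV


def to_1d_array(values, n_samples):
    """Hypothetical helper: coerce an array-like (e.g. a pandas Series)
    to a 1-D float ndarray and verify that it has n_samples entries."""
    arr = np.asarray(values, dtype=float)
    if arr.shape != (n_samples,):
        raise ValueError("expected shape (%d,), got %r" % (n_samples, arr.shape))
    return arr


# Stand-in data with the same shapes as in the question.
rng = np.random.RandomState(0)
temp1 = pd.DataFrame(rng.rand(781, 21))
temp2 = pd.Series(temp1.sum(axis=1))
weights = pd.Series(1 + 0.1 * rng.rand(781))

# Strip the pandas wrappers so only ndarrays reach the estimator.
X = temp1.values
y = to_1d_array(temp2, len(X))
w = to_1d_array(weights, len(X))

result = RidgeCV().fit(X, y, sample_weight=w)
print(result.alpha_, result.coef_.shape)
```

Converting up front keeps pandas operator overloading out of the estimator's internal arithmetic, which is where the traceback above shows the exception being raised.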