Python: building a regressor with a non-numeric target that is converted to numeric internally
I want to build a linear regression model that accepts timestamps as the target and internally uses the number of seconds since 1970-01-01 00:00:00 (pd.Timestamp(0)). predict should return timestamps.
I tried to do this with TransformedTargetRegressor. However, I run into a TypeError: invalid type promotion that I cannot resolve.
Demo code:
import pandas as pd
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import FunctionTransformer


# helper function to convert a 2D numpy array of seconds to a 2D array of timestamps
def _to_timestamp(seconds: np.ndarray):
    return pd.DataFrame(seconds).apply(pd.to_datetime, unit='s').values


# helper function to convert a 2D numpy array of timestamps to a 2D array of seconds
def _to_float(timestamps):
    deltas = pd.DataFrame(timestamps).sub(pd.Timestamp(0))
    return deltas.apply(lambda s: s.dt.total_seconds()).values


# build transformer from helper functions
TimeTransformer = FunctionTransformer(
    func=_to_float,
    inverse_func=_to_timestamp,
    validate=True,
    check_inverse=True
)


# make a LinearRegression chained with a TimeTransformer
def TimeTargetLinearRegression():
    return TransformedTargetRegressor(
        regressor=LinearRegression(),
        transformer=TimeTransformer
    )


# test run
if __name__ == '__main__':
    model = TimeTargetLinearRegression()
    X = np.array([[1], [2], [3]], dtype=float)
    y = pd.date_range(start=0, periods=3, freq='s')
    model.fit(X=X, y=y)  # raises TypeError
Output:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.3.3\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars) # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.3.3\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/actualpanda/.PyCharmCE2019.3/config/scratches/scratch2.py", line 36, in <module>
    model.fit(X=X, y=y) # raises TypeError
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\compose\_target.py", line 185, in fit
    self._fit_transformer(y_2d)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\compose\_target.py", line 139, in _fit_transformer
    self.transformer_.fit(y)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\preprocessing\_function_transformer.py", line 125, in fit
    self._check_inverse_transform(X)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\preprocessing\_function_transformer.py", line 102, in _check_inverse_transform
    if not _allclose_dense_sparse(X[idx_selected], X_round_trip):
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\utils\validation.py", line 1288, in _allclose_dense_sparse
    return np.allclose(x, y, rtol=rtol, atol=atol)
  File "<__array_function__ internals>", line 5, in allclose
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\numpy\core\numeric.py", line 2159, in allclose
    res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
  File "<__array_function__ internals>", line 5, in isclose
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\numpy\core\numeric.py", line 2254, in isclose
    dt = multiarray.result_type(y, 1.)
  File "<__array_function__ internals>", line 5, in result_type
TypeError: invalid type promotion
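I am looking for an answer that explains and resolves the TypeError and, if my approach is flawed, suggests a way to build a regressor that can handle non-numeric targets (given transform and inverse-transform functions).
I know I could apply the transform and inverse transform outside the regressor, but I want to encapsulate the process in a neat, user-friendly model that does not leak its internals.

When you set check_inverse=True, the transformer runs y through the transform function, then through the inverse transform function, and compares the output with the original y. This "round trip" comparison is passed to np.isclose, which raises the error: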
y = pd.date_range(start=0, periods=3, freq='s')
y_ = TimeTransformer.inverse_func(TimeTransformer.func(y))
np.isclose(y, y_)
# raises:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-68-90ab2804af58> in <module>
----> 1 np.isclose(y, y_)
<__array_function__ internals> in isclose(*args, **kwargs)
C:\Anaconda3\lib\site-packages\numpy\core\numeric.py in isclose(a, b, rtol, atol, equal_nan)
2264 # This will cause casting of x later. Also, make sure to allow subclasses
2265 # (e.g., for numpy.ma).
-> 2266 dt = multiarray.result_type(y, 1.)
2267 y = array(y, dtype=dt, copy=False, subok=True)
2268
<__array_function__ internals> in result_type(*args, **kwargs)
TypeError: invalid type promotion
My guess is that this method is not implemented for the np.datetime64 dtype.
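The failing step can be reproduced with NumPy alone. The following is a minimal sketch, not the library's own check: it assumes a datetime64[ns] array built by hand, and the integer comparison at the end is a hypothetical workaround introduced here for illustration. It shows that the dtype promotion requested by np.isclose fails, while exact equality and a comparison of the underlying integer values both work.

```python
import numpy as np

# Three timestamps one second apart, starting at the Unix epoch (illustrative data).
y = np.array(
    ["1970-01-01T00:00:00", "1970-01-01T00:00:01", "1970-01-01T00:00:02"],
    dtype="datetime64[ns]",
)
y_round_trip = y.copy()

# np.isclose asks NumPy for a common dtype of its second argument and the
# float 1.0. datetime64 and float have no common dtype, so this raises.
try:
    np.result_type(y, 1.0)
    promotion_failed = False
except TypeError:
    promotion_failed = True

# Exact equality needs no float promotion and works on datetime64.
exactly_equal = bool((y == y_round_trip).all())

# Hypothetical workaround: compare the int64 nanosecond counts instead.
close_as_int = bool(np.allclose(y.astype("int64"), y_round_trip.astype("int64")))

print(promotion_failed, exactly_equal, close_as_int)
```

Catching TypeError is deliberate: the exact exception message differs between NumPy versions, but the promotion error is a TypeError in each case I am aware of.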
I opened an issue on the GitHub page.
Thanks. I tried setting check_inverse=False; any idea why I still get the same error? Update: check_inverse=False needs to be set in both the FunctionTransformer and the TransformedTargetRegressor calls.
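As a sketch of what the update describes, the model below disables the round-trip check in both places. The helper names (to_seconds, to_timestamps, make_time_target_regressor) are hypothetical, and the helpers use plain NumPy datetime arithmetic instead of the pandas-based functions from the question, so treat it as one possible arrangement under those assumptions.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import FunctionTransformer

EPOCH = np.datetime64(0, "ns")      # 1970-01-01 00:00:00
ONE_SECOND = np.timedelta64(1, "s")


def to_seconds(timestamps):
    """Convert datetime64 values to float seconds since the epoch."""
    return (np.asarray(timestamps, dtype="datetime64[ns]") - EPOCH) / ONE_SECOND


def to_timestamps(seconds):
    """Convert float seconds since the epoch back to datetime64[ns]."""
    nanoseconds = np.round(np.asarray(seconds, dtype=float) * 1e9).astype("int64")
    return nanoseconds.astype("datetime64[ns]")


def make_time_target_regressor():
    # check_inverse=False is set on the transformer AND on the regressor,
    # because each of them runs its own np.allclose round-trip check.
    transformer = FunctionTransformer(
        func=to_seconds,
        inverse_func=to_timestamps,
        validate=True,
        check_inverse=False,
    )
    return TransformedTargetRegressor(
        regressor=LinearRegression(),
        transformer=transformer,
        check_inverse=False,
    )


X = np.array([[1.0], [2.0], [3.0]])
y = np.array(
    ["1970-01-01T00:00:00", "1970-01-01T00:00:01", "1970-01-01T00:00:02"],
    dtype="datetime64[ns]",
)

model = make_time_target_regressor().fit(X, y)
predictions = model.predict(X)  # datetime64[ns] values, not floats
print(predictions)
```

The trade-off is that the inverse function is no longer verified automatically, so it is worth testing the round trip separately, for example with an exact equality check on a few known timestamps.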
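# Element-wise equality on the datetime arrays works:
y2 = np.array(y).reshape(-1, 1)  # match the 2D shape of y_
y_ == y2
# returns:
array([[ True],
       [ True],
       [ True]])
# but np.isclose and the dtype promotion it relies on do not:
np.isclose(y2, y_)
# raises: TypeError: invalid type promotion
np.core.multiarray.result_type(y_, 1.)
# raises: TypeError: invalid type promotion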