
Python: building a regressor with a non-numeric target that is converted to numeric internally


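I want to build a linear regression model that accepts timestamps as the target and internally works with the number of seconds since 1970-01-01 00:00:00 (pd.Timestamp(0)). predict should return timestamps.

I tried to do this with TransformedTargetRegressor. However, I run into a TypeError: invalid type promotion that I cannot resolve.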

Demo code:

import pandas as pd
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import FunctionTransformer

# helper function to convert a 2D numpy array of seconds to a 2D array of timestamps
def _to_timestamp(seconds: np.ndarray):
    return pd.DataFrame(seconds).apply(pd.to_datetime, unit='s').values

# helper function to convert a 2D numpy array of timestamps to a 2D array of seconds
def _to_float(timestamps):
    deltas = pd.DataFrame(timestamps).sub(pd.Timestamp(0))
    return deltas.apply(lambda s: s.dt.total_seconds()).values

# build transformer from helper functions
TimeTransformer = FunctionTransformer(
    func=_to_float,
    inverse_func=_to_timestamp,
    validate=True,
    check_inverse=True
)

# make a LinearRegression chained with a TimeTransformer
def TimeTargetLinearRegression():
    return TransformedTargetRegressor(
        regressor=LinearRegression(),
        transformer=TimeTransformer
    )

# test run
if __name__ == '__main__':
    model = TimeTargetLinearRegression()
    X = np.array([[1], [2], [3]], dtype=float)
    y = pd.date_range(start=0, periods=3, freq='s')
    model.fit(X=X, y=y) # raises TypeError
Output:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.3.3\plugins\python-ce\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm Community Edition 2019.3.3\plugins\python-ce\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/actualpanda/.PyCharmCE2019.3/config/scratches/scratch2.py", line 36, in <module>
    model.fit(X=X, y=y) # raises TypeError
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\compose\_target.py", line 185, in fit
    self._fit_transformer(y_2d)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\compose\_target.py", line 139, in _fit_transformer
    self.transformer_.fit(y)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\preprocessing\_function_transformer.py", line 125, in fit
    self._check_inverse_transform(X)
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\preprocessing\_function_transformer.py", line 102, in _check_inverse_transform
    if not _allclose_dense_sparse(X[idx_selected], X_round_trip):
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\sklearn\utils\validation.py", line 1288, in _allclose_dense_sparse
    return np.allclose(x, y, rtol=rtol, atol=atol)
  File "<__array_function__ internals>", line 5, in allclose
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\numpy\core\numeric.py", line 2159, in allclose
    res = all(isclose(a, b, rtol=rtol, atol=atol, equal_nan=equal_nan))
  File "<__array_function__ internals>", line 5, in isclose
  File "C:\Users\actualpanda\.virtualenvs\SomeProject--3333Ox_\lib\site-packages\numpy\core\numeric.py", line 2254, in isclose
    dt = multiarray.result_type(y, 1.)
  File "<__array_function__ internals>", line 5, in result_type
TypeError: invalid type promotion
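I am looking for an answer that explains or resolves the TypeError and, if my approach is flawed, suggests a way to build a regressor that can handle non-numeric targets (given the transform and inverse-transform functions).

I know I could do the transform and inverse transform outside the regressor, but I want to encapsulate this process in a neat, user-friendly model that does not leak its internals.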

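y is run through the transform function and then through the inverse transform function, and the output is compared with the original y.

This "round-trip" comparison happens when you set check_inverse=True, and the two arrays are passed to np.isclose. That call raises the error: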

y = pd.date_range(start=0, periods=3, freq='s')
y_ = TimeTransformer.inverse_func(TimeTransformer.func(y))

np.isclose(y, y_)
# raises:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-68-90ab2804af58> in <module>
----> 1 np.isclose(y, y_)

<__array_function__ internals> in isclose(*args, **kwargs)

C:\Anaconda3\lib\site-packages\numpy\core\numeric.py in isclose(a, b, rtol, atol, equal_nan)
   2264     # This will cause casting of x later. Also, make sure to allow subclasses
   2265     # (e.g., for numpy.ma).
-> 2266     dt = multiarray.result_type(y, 1.)
   2267     y = array(y, dtype=dt, copy=False, subok=True)
   2268

<__array_function__ internals> in result_type(*args, **kwargs)

TypeError: invalid type promotion
My guess is that this method is not implemented for the np.datetime64 dtype.
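To make that guess concrete, here is a minimal NumPy-only sketch. It assumes the failing step is the promotion of a datetime64 array together with a float (the result_type call shown in the traceback), and it uses a hypothetical y_round_trip array in place of the transformer output. Viewing both sides as int64 nanoseconds is just one possible way to get a tolerance-based comparison; it is not what scikit-learn does internally.

```python
import numpy as np

# Three timestamps, one second apart, starting at the Unix epoch.
y = np.array(
    ["1970-01-01T00:00:00", "1970-01-01T00:00:01", "1970-01-01T00:00:02"],
    dtype="datetime64[ns]",
)
y_round_trip = y.copy()  # stand-in for inverse_func(func(y))

# np.isclose promotes its second argument together with a float.
# For datetime64 this promotion is expected to raise TypeError.
try:
    np.result_type(y, 1.0)
    promotion_failed = False
except TypeError:
    promotion_failed = True

# Exact element-wise equality is defined for datetime64 arrays.
exactly_equal = bool((y == y_round_trip).all())

# A tolerance check works once both sides are plain int64 nanoseconds.
# atol=1_000 means "within one microsecond" (an arbitrary choice here).
close = np.isclose(
    y.astype("int64"), y_round_trip.astype("int64"), rtol=0, atol=1_000
)
```

The point of the sketch is only that equality is available for datetime64 while float-based tolerance checks are not, which matches the behaviour reported in this answer.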


I have opened an issue on the GitHub page.

Thanks. I tried setting check_inverse=False; any idea why I still get the same error? Update: check_inverse=False needs to be set in both the FunctionTransformer call and the TransformedTargetRegressor call.
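The update above can be sketched as follows. This is a hedged illustration rather than a verified fix for every scikit-learn version: the helper names to_seconds and to_timestamps are hypothetical NumPy-only replacements for the helpers in the question, and validate is left at its default instead of True. The essential part is that check_inverse=False appears twice, once on the FunctionTransformer and once on the TransformedTargetRegressor.

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import FunctionTransformer

NS_PER_SECOND = 1_000_000_000


def to_seconds(timestamps):
    """Convert datetime64 values to float seconds since the epoch."""
    ns = np.asarray(timestamps).astype("datetime64[ns]").astype("int64")
    return ns / NS_PER_SECOND


def to_timestamps(seconds):
    """Convert float seconds since the epoch back to datetime64[ns]."""
    ns = np.rint(np.asarray(seconds, dtype=float) * NS_PER_SECOND)
    return ns.astype("int64").astype("datetime64[ns]")


model = TransformedTargetRegressor(
    regressor=LinearRegression(),
    transformer=FunctionTransformer(
        func=to_seconds,
        inverse_func=to_timestamps,
        check_inverse=False,  # skip the round-trip check in the transformer
    ),
    check_inverse=False,  # skip the round-trip check in the wrapper as well
)

X = np.array([[1.0], [2.0], [3.0]])
y = np.array(
    ["1970-01-01T00:00:00", "1970-01-01T00:00:01", "1970-01-01T00:00:02"],
    dtype="datetime64[ns]",
)

model.fit(X, y)
pred = model.predict(X)  # datetime64 values, converted back by to_timestamps
```

Disabling both checks removes the np.isclose call on datetime64 data, at the cost of no longer verifying that the two helper functions are true inverses of each other.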
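Element-wise equality on the round-tripped values works, but np.isclose and the underlying result_type call do not:

y2 = np.array(y)
y_ == y2
# returns:
array([[ True],
       [ True],
       [ True]])

np.isclose(y2, y_)
# raises: TypeError: invalid type promotion

np.core.multiarray.result_type(y_, 1.)
# raises: TypeError: invalid type promotion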