
Python 3.x: scikit-learn predicts the tail instead of the head


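I'm trying to build a simple forecast with linear regression. As I understand it, this should predict future values, but I've clearly gotten something wrong. It seems to take its data from the tail of the DataFrame rather than from the most recent data points. I'm using Google's stock prices and the alpha_vantage API to fetch the stock data.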

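import math

import numpy as np
from alpha_vantage.timeseries import TimeSeries
from sklearn import model_selection, preprocessing
from sklearn.linear_model import LinearRegression

# api_key is assumed to hold your Alpha Vantage API key
ts = TimeSeries(key=api_key, output_format='pandas')
df, df_meta = ts.get_daily(symbol='GOOGL', outputsize='full')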

df = df[['1. open', '2. high', '3. low', '4. close', '5. volume']]
df['HL_PCT'] = (df['2. high'] - df['4. close']) / df['4. close'] * 100.0
df['PCT_change'] = (df['4. close'] - df['1. open']) / df['1. open'] * 100.0
df = df[['4. close', 'HL_PCT', 'PCT_change', '5. volume']]

forecast_col = '4. close'

df.fillna(-99999, inplace=True)

forecast_out = int(math.ceil(0.01*len(df)))
print(forecast_out)
# Moving columns negatively
df['label'] = df[forecast_col].shift(-forecast_out)

# Features
X = np.array(df.drop(columns=['label']))
X = preprocessing.scale(X)
X_lately = X[-forecast_out:]
X = X[:-forecast_out]

df.dropna(inplace=True)
# Labels
y = np.array(df['label'])

X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=0.2)

clf = LinearRegression(n_jobs=-1)
# Fit is synonymous with train
clf.fit(X_train, y_train)
# Score is synonymous with test
accuracy = clf.score(X_test, y_test)

forecast_set = clf.predict(X_lately)

print(forecast_set, accuracy, forecast_out)

It returns values in the 200s, which is clearly not a forecast for 2020.

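I found that the DataFrame in the tutorial I was following was ordered the other way around, so my most recent data points were his earliest ones. This is easily fixed by reversing my own DataFrame with: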

df = df[::-1]
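
To see why the row order matters, here is a minimal sketch on a small synthetic DataFrame. The column name `close`, the dates, and the prices are made up for illustration, and it assumes the API returns rows newest-first, as described above. It also uses `sort_index()` instead of `df[::-1]`, which is a swapped-in alternative that gives oldest-first order regardless of how the data arrives.

```python
import pandas as pd

# Synthetic stand-in for the API result (hypothetical values, not real prices).
dates = pd.date_range("2020-01-01", periods=6, freq="D")
df = pd.DataFrame({"close": [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]}, index=dates)

# Assumed ordering of the downloaded data: newest row first.
newest_first = df.iloc[::-1]

forecast_out = 2

# On newest-first data, shift(-n) pairs each row with the price n days EARLIER,
# and the unlabeled rows at the tail are the OLDEST ones.
wrong = newest_first.assign(label=newest_first["close"].shift(-forecast_out))

# After sorting by date, shift(-n) pairs each row with the price n days LATER,
# and the unlabeled rows at the tail are the most recent ones.
oldest_first = newest_first.sort_index()
right = oldest_first.assign(label=oldest_first["close"].shift(-forecast_out))

print(wrong)
print(right)
```

In the first frame the rows without a label (the ones that end up in `X_lately`) are the oldest dates, which would explain predictions that look like prices from years ago. In the second frame they are the newest dates, which is what the forecast is meant to use.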
